Thursday, March 23, 2017

Evolution Strategies as a Scalable Alternative to Reinforcement Learning - implementation -




We explore the use of Evolution Strategies, a class of black box optimization algorithms, as an alternative to popular RL techniques such as Q-learning and Policy Gradients. Experiments on MuJoCo and Atari show that ES is a viable solution strategy that scales extremely well with the number of CPUs available: By using hundreds to thousands of parallel workers, ES can solve 3D humanoid walking in 10 minutes and obtain competitive results on most Atari games after one hour of training time. In addition, we highlight several advantages of ES as a black box optimization technique: it is invariant to action frequency and delayed rewards, tolerant of extremely long horizons, and does not need temporal discounting or value function approximation.
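The core update is simple enough to sketch in a few lines. Below is a minimal, illustrative NumPy version of the basic ES loop (a population of Gaussian perturbations and a fitness-weighted gradient estimate); the toy quadratic "reward" and all hyper-parameter values are mine, not from the paper.

```python
import numpy as np

def evolution_strategies(f, theta, sigma=0.1, alpha=0.01, pop_size=50, iters=200):
    """Minimal ES loop: estimate a search gradient from random perturbations of theta."""
    for _ in range(iters):
        eps = np.random.randn(pop_size, theta.size)          # Gaussian perturbations
        rewards = np.array([f(theta + sigma * e) for e in eps])
        rewards = (rewards - rewards.mean()) / (rewards.std() + 1e-8)  # normalize fitness
        grad = eps.T @ rewards / (pop_size * sigma)           # gradient estimate
        theta = theta + alpha * grad                          # ascend the estimate
    return theta

# Toy example: maximize a concave quadratic "reward" around a known optimum.
target = np.array([3.0, -1.0])
reward = lambda w: -np.sum((w - target) ** 2)
print(evolution_strategies(reward, np.zeros(2)))              # approaches [3, -1]
```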





Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !

Implementation: Compressed Sensing using Generative Models


Alex just mentioned that the code for the recent paper Compressed Sensing using Generative Models is now available.



It's all here: https://github.com/AshishBora/csgm

Enjoy !



Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

Wednesday, March 22, 2017

Paris Machine Learning Hors Serie #10 : Workshop SPARK (atelier 1)






Leonardo Noleto, data scientist at KPMG, walks us through the process of cleaning and transforming raw data into "clean" data with Apache Spark.

Apache Spark is a general-purpose open-source framework designed for distributed data processing. It extends the MapReduce model, with the advantage of being able to process data in memory and interactively. Spark offers a set of components for data analysis: Spark SQL, Spark Streaming, MLlib (machine learning) and GraphX (graphs).

This workshop focuses on the fundamentals of Spark and its data-processing paradigm through the Python programming interface (more precisely, PySpark).

Installation, configuration, cluster processing, Spark Streaming, MLlib and GraphX will not be covered in this workshop.

The material to install is available here.


Objectives

  • Understand the fundamentals of Spark and situate it within the Big Data ecosystem;
  • Know how it differs from Hadoop MapReduce;
  • Use RDDs (Resilient Distributed Datasets);
  • Use the most common actions and transformations to manipulate and analyze data (see the sketch after this list);
  • Write a data transformation pipeline;
  • Use the PySpark programming API.
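For a rough flavor of the kind of code covered, here is a small, hypothetical PySpark snippet chaining a few common transformations (map, filter) and actions (count, take) on an RDD; the input file and its fields are made up for illustration.

```python
from pyspark import SparkContext

sc = SparkContext("local[*]", "cleaning-demo")

# Load raw text lines, then build a small cleaning pipeline of transformations.
raw = sc.textFile("raw_logs.csv")                        # hypothetical input file
rows = (raw.map(lambda line: line.split(","))
           .filter(lambda cols: len(cols) == 3)           # drop malformed lines
           .map(lambda cols: (cols[0], float(cols[2]))))  # (user, amount)

# Actions trigger the actual distributed computation.
print(rows.count())                                       # number of clean rows
totals = rows.reduceByKey(lambda a, b: a + b)             # total amount per user
print(totals.take(5))

sc.stop()
```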


This workshop is the first in a series of two workshops on Apache Spark. To follow the next workshops, you must have attended the previous ones or be comfortable with the topics already covered.


What are the prerequisites?


  • Know the basics of the Python language (or learn them quickly with this online course: Python Introduction)
  • Have some exposure to data processing with R, Python or Bash (why not?)
  • No prior knowledge of distributed processing or Apache Spark is required. This is an introductory workshop. People who already have some experience with Spark (in Scala, Java or R) may get bored (this is a workshop for beginners).


How should I prepare for this workshop?


  • You will need a reasonably modern laptop with at least 4 GB of memory and a web browser installed. You must be able to connect to the Internet over Wifi.
  • Follow the instructions to prepare for the workshop (installing Docker + the workshop's Docker image).
  • The data to clean is included in the Docker image. The exercises will be provided during the workshop as Jupyter notebooks.

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

Summer School "Structured Regularization for High-Dimensional Data Analysis" - IHP Paris - June 19th to 22nd

Gabriel just sent me the following:

Dear Igor,
In case you find it suitable, could you advertise for this summer school ?
All the best
Gabriel

Sure !
=======
=======
The SMF (French Mathematical Society) and the Institut Henri Poincaré organize a mathematical summer school on "Structured Regularization for High-Dimensional Data Analysis". This summer school will be an opportunity to bring together students, researchers and people working on High-Dimensional Data Analysis around three courses and four talks on new methods in structured regularization. The mathematical foundations of this event lie at the interface of probability, statistics, optimization, and image and signal processing.
More information (including registration, free but mandatory) is available on the webpage: https://regularize-in-paris.github.io/
Courses:
  • Anders Hansen (Cambridge)
  • Andrea Montanari (Stanford)
  • Lorenzo Rosasco (Genova and MIT)
Talks:
  • Francis Bach (INRIA and ENS)
  • Claire Boyer (UPMC)
  • Emilie Chouzenoux (Paris Est)
  • Carlos Fernandez-Granda (NYU)
Organizers:
  • Yohann De Castro (Paris-Sud)
  • Guillaume Lecué (CNRS and ENSAE)
  • Gabriel Peyré (CNRS and ENS)
The program is here: https://regularize-in-paris.github.io/program/


Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

Tuesday, March 21, 2017

Making Backpropagation Plausible

The two papers mentioned on Monday morning are, to a certain extent, opening the door to making backpropagation plausible within the architecture of the human brain, and may also allow for much faster and more scalable ways of learning. Here are four papers investigating this avenue, following on from those two papers ([1][2]).

When training neural networks, the use of Synthetic Gradients (SG) allows layers or modules to be trained without update locking - without waiting for a true error gradient to be backpropagated - resulting in Decoupled Neural Interfaces (DNIs). This unlocked ability of being able to update parts of a neural network asynchronously and with only local information was demonstrated to work empirically in Jaderberg et al (2016). However, there has been very little demonstration of what changes DNIs and SGs impose from a functional, representational, and learning dynamics point of view. In this paper, we study DNIs through the use of synthetic gradients on feed-forward networks to better understand their behaviour and elucidate their effect on optimisation. We show that the incorporation of SGs does not affect the representational strength of the learning system for a neural network, and prove the convergence of the learning system for linear and deep linear models. On practical problems we investigate the mechanism by which synthetic gradient estimators approximate the true loss, and, surprisingly, how that leads to drastically different layer-wise representations. Finally, we also expose the relationship of using synthetic gradients to other error approximation techniques and find a unifying language for discussion and comparison.
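To make the decoupling concrete, here is a tiny NumPy sketch of the idea (my own illustration, not code from the paper): a linear layer updates immediately from a learned linear estimate of its output gradient, and that estimator is itself regressed toward the true gradient once it becomes available.

```python
import numpy as np
rng = np.random.default_rng(0)

d_in, d_h, d_out, lr = 4, 8, 1, 0.01
W1 = rng.normal(0, 0.1, (d_in, d_h))       # first layer (updated with synthetic grads)
W2 = rng.normal(0, 0.1, (d_h, d_out))      # output layer (updated with true grads)
M  = np.zeros((d_h, d_h))                   # synthetic-gradient module: h -> estimate of dL/dh

w_true = rng.normal(size=(d_in, 1))
for _ in range(5000):
    x = rng.normal(size=(1, d_in)); y = x @ w_true
    h = x @ W1                              # forward through layer 1
    sg = h @ M                              # synthetic gradient dL/dh (no waiting)
    W1 -= lr * x.T @ sg                     # layer 1 updates immediately (decoupled)

    y_hat = h @ W2                          # forward through layer 2
    dL_dy = 2 * (y_hat - y)
    W2 -= lr * h.T @ dL_dy
    true_dh = dL_dy @ W2.T                  # true gradient, available "later"
    M -= lr * h.T @ (sg - true_dh)          # regress the estimator toward the truth

print(float(np.mean((x @ W1 @ W2 - y) ** 2)))   # small residual error on the last sample
```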

An ongoing challenge in neuromorphic computing is to devise general and computationally efficient models of inference and learning which are compatible with the spatial and temporal constraints of the brain. One increasingly popular and successful approach is to take inspiration from inference and learning algorithms used in deep neural networks. However, the workhorse of deep learning, the gradient descent Back Propagation (BP) rule, often relies on the immediate availability of network-wide information stored with high-precision memory, and precise operations that are difficult to realize in neuromorphic hardware. Remarkably, recent work showed that exact backpropagated weights are not essential for learning deep representations. Random BP replaces feedback weights with random ones and encourages the network to adjust its feed-forward weights to learn pseudo-inverses of the (random) feedback weights. Building on these results, we demonstrate an event-driven random BP (eRBP) rule that uses an error-modulated synaptic plasticity for learning deep representations in neuromorphic computing hardware. The rule requires only one addition and two comparisons for each synaptic weight using a two-compartment leaky Integrate & Fire (I&F) neuron, making it very suitable for implementation in digital or mixed-signal neuromorphic hardware. Our results show that using eRBP, deep representations are rapidly learned, achieving nearly identical classification accuracies compared to artificial neural network simulations on GPUs, while being robust to neural and synaptic state quantizations during learning.

The back-propagation (BP) algorithm has been considered the de-facto method for training deep neural networks. It back-propagates errors from the output layer to the hidden layers in an exact manner using the transpose of the feedforward weights. However, it has been argued that this is not biologically plausible because back-propagating error signals with the exact incoming weights is not considered possible in biological neural systems. In this work, we propose a biologically plausible paradigm of neural architecture based on related literature in neuroscience and asymmetric BP-like methods. Specifically, we propose two bidirectional learning algorithms with trainable feedforward and feedback weights. The feedforward weights are used to relay activations from the inputs to target outputs. The feedback weights pass the error signals from the output layer to the hidden layers. Different from other asymmetric BP-like methods, the feedback weights are also plastic in our framework and are trained to approximate the forward activations. Preliminary results show that our models outperform other asymmetric BP-like methods on the MNIST and the CIFAR-10 datasets.



Recent studies have shown that synaptic unreliability is a robust and sufficient mechanism for inducing the stochasticity observed in cortex. Here, we introduce Synaptic Sampling Machines (S2Ms), a class of neural network models that uses synaptic stochasticity as a means to Monte Carlo sampling and unsupervised learning. Similar to the original formulation of Boltzmann machines, these models can be viewed as a stochastic counterpart of Hopfield networks, but where stochasticity is induced by a random mask over the connections. Synaptic stochasticity plays the dual role of an efficient mechanism for sampling, and a regularizer during learning akin to DropConnect. A local synaptic plasticity rule implementing an event-driven form of contrastive divergence enables the learning of generative models in an on-line fashion. S2Ms perform equally well using discrete-timed artificial units (as in Hopfield networks) or continuous-timed leaky integrate and fire neurons. The learned representations are remarkably sparse and robust to reductions in bit precision and synapse pruning: removal of more than 75% of the weakest connections followed by cursory re-learning causes a negligible performance loss on benchmark classification tasks. The spiking neuron-based S2Ms outperform existing spike-based unsupervised learners, while potentially offering substantial advantages in terms of power and complexity, and are thus promising models for on-line learning in brain-inspired hardware.

Job: Machine Learning, LightOn, Paris.


LightOn is hiring. We have two openings: one in Machine Learning, the other in Electronics Hardware Design. More info at: http://www.lighton.io/careers

1. Machine Learning Research Engineer
Do you want to contribute to a fast-growing company at the cutting edge of innovation between optics and artificial intelligence? LightOn is looking for a Research Engineer specialized in Machine Learning / Data Science for the development of new optical co-processors for Artificial Intelligence.
Within the R&D team and reporting to the CTO, your main duties will include:
  • the design of statistical learning algorithms that take advantage of LightOn processors,
  • algorithm testing on LightOn’s processors,
  • managing and interacting with industrial partners,
  • interacting with developers of the software layer for network access (API),
  • interfacing with the hardware developing team,
  • carrying out rapid prototyping activities in synchronization with the rest of the team.
REQUIRED PROFILE An Engineering Degree (MSc or PhD) in Machine Learning / Data Science. Industry experience would be a plus.
Technical skills (required): You should
  • Have some theoretical knowledge and hands-on experience in unsupervised or supervised machine learning (e.g. Deep Neural Networks),
  • Have some experience in processing and making sense of very large amounts of data,
  • Be proficient in scientific programming (Python, C++, Matlab, ...),
  • Be a user of one or more Machine Learning/Deep Learning frameworks (Scikit-learn, TensorFlow, Keras, Theano, Torch, etc.)
A significant interest in one or more of the following topics would be a plus:
  • automated search for hyper-parameters,
  • digital electronics or FPGA programming.
In order to work in a small startup such as LightOn, you will also need to be creative and pragmatic, have team spirit, and have good communication skills.
CONDITIONS  This is a full-time position that can start as soon as possible. Salary will be based on technical skills and experience. The candidate must have the right to work in the EU. We cannot pay for relocation costs.
CONTACT
To respond to this offer, please send an e-mail to jobs@LightOn.io with [ML Engineer] in the subject line. Please attach a resume and cover letter both in PDF.
THE COMPANY  Founded in 2016, LightOn (www.LightOn.io) is a technology start-up that develops a new generation of optical co-processors designed to accelerate, at low power, Artificial Intelligence algorithms for massive amounts of data. The technology developed by LightOn originates from the ESPCI and Ecole Normale Supérieure laboratories. In 2016, LightOn won the City of Paris award for best Digital Tech startup. We are located in the center of Paris within the Agoranov incubator.
2. Electronics Hardware Systems engineer
Would you like to contribute to a fast-growing company at the cutting edge of innovation between optics and artificial intelligence? LightOn is looking for a Research Engineer specialized in Electronics and Embedded Systems to develop our new optical co-processors for Artificial Intelligence.
Within the R&D team, reporting to the CTO, your main duties include:
  • system integration with high-throughput opto-electronic components,
  • design of driver software,
  • digital design / guidance of PCB layout,
  • interaction with developers of the software layer for network access (API),
  • rapid prototyping activities with the rest of the team,
  • functional verification,
  • manufacturing / subcontracting support.
REQUIRED PROFILE An Engineering Degree (MSc or PhD) in Electrical Engineering or related field.
Technical skills (required):
  • Relevant experience (ideally 5+ years in industry) designing embedded systems,
  • A successful track record of delivering highly innovative products,
  • Digital logic board design of embedded CPU, RAM, ROM, and FPGA subsystems,
  • Experience with high-speed digital interfaces such as HDMI, DDR, PCIe, USB, Ethernet / GigE.
A significant interest in one or more of the following topics would be a plus:
  • Machine Learning,
  • Cloud-based services.
In order to work in a small startup such as LightOn, you will also need to be creative and pragmatic, have team spirit, and have good communication skills.
CONDITIONS  This is a full-time position that can start as soon as possible. Salary will be based on technical skills and experience. The candidate must have the right to work in the EU. We cannot pay for relocation costs.
CONTACT
To respond to this offer, please send an e-mail to jobs@LightOn.io with [EE Engineer] in the subject line. Please attach a resume and cover letter both in PDF.
THE COMPANY  Founded in 2016, LightOn (www.LightOn.io) is a technology start-up that develops a new generation of optical co-processors designed to accelerate, at low power, Artificial Intelligence algorithms for massive amounts of data. The technology developed by LightOn originates from the ESPCI and Ecole Normale Supérieure laboratories. In 2016, LightOn won the City of Paris award for best Digital Tech startup. We are located in the center of Paris within the Agoranov incubator.






Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

Random Triggering Based Sub-Nyquist Sampling System for Sparse Multiband Signal

Yijiu let me know of his new paper:



Random Triggering Based Sub-Nyquist Sampling System for Sparse Multiband Signal by Yijiu Zhao, Yu Hen Hu, Jingjing Liu

We propose a novel random triggering based modulated wideband compressive sampling (RT-MWCS) method to facilitate efficient realization of sub-Nyquist rate compressive sampling systems for sparse wideband signals. Under the assumption that the signal is repetitively (not necessarily periodically) triggered, RT-MWCS uses random modulation to obtain measurements of the signal at randomly chosen positions. It uses a multiple measurement vector method to estimate the non-zero supports of the signal in the frequency domain. Then, the signal spectrum is solved using least squares estimation. The distinct ability to estimate sparse multiband signals is facilitated by the use of level triggering and time-to-digital converter devices previously used in the random equivalent sampling (RES) scheme. Compared to existing compressive sampling (CS) techniques, such as the modulated wideband converter (MWC), RT-MWCS has a simple system architecture and can be implemented with one channel at the cost of more sampling time. Experimental results indicate that, for a sparse multiband signal with unknown spectral support, RT-MWCS requires a sampling rate much lower than the Nyquist rate, while giving high quality signal reconstruction.
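This is not the RT-MWCS system itself, but a toy NumPy illustration of the final step it relies on: once the occupied frequency bins are known (here assumed given rather than estimated with the multiple-measurement-vector step, and with no random modulation), the spectrum can be recovered by least squares from far fewer samples than the Nyquist rate would suggest. All names and sizes are illustrative.

```python
import numpy as np
rng = np.random.default_rng(1)

N = 512                                     # nominal (Nyquist) number of samples
support = np.array([17, 40, 203])           # occupied frequency bins (assumed known here)
amps = np.array([1.0, 0.5, 0.8])

n = np.arange(N)
x = np.cos(2 * np.pi * np.outer(n, support) / N) @ amps   # sparse multi-tone signal

m = 60                                      # far fewer than N samples
idx = np.sort(rng.choice(N, size=m, replace=False))       # random sampling instants
y = x[idx]

# Partial cosine dictionary restricted to the known support, then least squares.
A = np.cos(2 * np.pi * np.outer(idx, support) / N)
a_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
print(np.round(a_hat, 3))                   # ~ [1.0, 0.5, 0.8]
```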



Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !

Monday, March 20, 2017

Learning in the Machine: Random Backpropagation and the Learning Channel

Carlos Perez's blog entry on Medium entitled Deep Learning: The Unreasonable Effectiveness of Randomness just led me to the following paper I had not read before (probably because it came out during NIPS). I also added the latest version of Arild Nokland's earlier paper on a similar idea that was itself published at NIPS (and featured on Nuit Blanche). 




Random backpropagation (RBP) is a variant of the backpropagation algorithm for training neural networks, where the transposes of the forward matrices are replaced by fixed random matrices in the calculation of the weight updates. It is remarkable both because of its effectiveness, in spite of using random matrices to communicate error information, and because it completely removes the taxing requirement of maintaining symmetric weights in a physical neural system. To better understand random backpropagation, we first connect it to the notions of local learning and the learning channel. Through this connection, we derive several alternatives to RBP, including skipped RBP (SRBP), adaptive RBP (ARBP), sparse RBP, and their combinations (e.g. ASRBP) and analyze their computational complexity. We then study their behavior through simulations using the MNIST and CIFAR-10 benchmark datasets. These simulations show that most of these variants work robustly, almost as well as backpropagation, and that multiplication by the derivatives of the activation functions is important. As a follow-up, we also study the low end of the number of bits required to communicate error information over the learning channel. We then provide partial intuitive explanations for some of the remarkable properties of RBP and its variations. Finally, we prove several mathematical results, including the convergence to fixed points of linear chains of arbitrary length, the convergence to fixed points of linear autoencoders with decorrelated data, the long-term existence of solutions for linear systems with a single hidden layer, and the convergence to fixed points of non-linear chains, when the derivative of the activation functions is included.
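For readers who want to see how small the change is, here is a hedged NumPy sketch of plain RBP (a fixed random matrix B stands in for the transpose of the forward weights in the backward pass, and the derivative of the activation is kept); the toy task and hyper-parameters are mine, not from the paper.

```python
import numpy as np
rng = np.random.default_rng(0)

d_in, d_h, d_out, lr = 8, 32, 2, 0.05
W1 = rng.normal(0, 0.3, (d_in, d_h))
W2 = rng.normal(0, 0.3, (d_h, d_out))
B  = rng.normal(0, 0.3, (d_out, d_h))      # fixed random feedback matrix (replaces W2.T)

def sigmoid(z): return 1.0 / (1.0 + np.exp(-z))

X = rng.normal(size=(200, d_in))
Y = np.stack([(X[:, 0] > 0).astype(float), (X[:, 1] > 0).astype(float)], axis=1)

for _ in range(2000):
    H = sigmoid(X @ W1)
    P = sigmoid(H @ W2)
    dP = P - Y                               # output error
    dH = (dP @ B) * H * (1 - H)              # random feedback instead of dP @ W2.T
    W2 -= lr * H.T @ dP / len(X)
    W1 -= lr * X.T @ dH / len(X)

P = sigmoid(sigmoid(X @ W1) @ W2)
print(np.mean((P > 0.5) == (Y > 0.5)))       # training accuracy: random feedback still learns
```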



Artificial neural networks are most commonly trained with the back-propagation algorithm, where the gradient for learning is provided by back-propagating the error, layer by layer, from the output layer to the hidden layers. A recently discovered method called feedback-alignment shows that the weights used for propagating the error backward don't have to be symmetric with the weights used for propagating the activation forward. In fact, random feedback weights work equally well, because the network learns how to make the feedback useful. In this work, the feedback alignment principle is used for training hidden layers more independently from the rest of the network, and from a zero initial condition. The error is propagated through fixed random feedback connections directly from the output layer to each hidden layer. This simple method is able to achieve zero training error even in convolutional networks and very deep networks, completely without error back-propagation. The method is a step towards biologically plausible machine learning because the error signal is almost local, and no symmetric or reciprocal weights are required. Experiments show that the test performance on MNIST and CIFAR is almost as good as that obtained with back-propagation for fully connected networks. If combined with dropout, the method achieves 1.45% error on the permutation-invariant MNIST task.
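The direct variant described above sends the output error straight to every hidden layer through its own fixed random matrix. A minimal sketch of that change, again on an illustrative toy task of my own:

```python
import numpy as np
rng = np.random.default_rng(0)

d_in, d_h, d_out, lr = 8, 32, 2, 0.05
W1 = rng.normal(0, 0.3, (d_in, d_h))
W2 = rng.normal(0, 0.3, (d_h, d_h))
W3 = rng.normal(0, 0.3, (d_h, d_out))
B1 = rng.normal(0, 0.3, (d_out, d_h))      # output error -> hidden layer 1
B2 = rng.normal(0, 0.3, (d_out, d_h))      # output error -> hidden layer 2

def sigmoid(z): return 1.0 / (1.0 + np.exp(-z))

X = rng.normal(size=(200, d_in))
Y = np.stack([(X[:, 0] > 0).astype(float), (X[:, 1] > 0).astype(float)], axis=1)

for _ in range(3000):
    H1 = sigmoid(X @ W1); H2 = sigmoid(H1 @ W2); P = sigmoid(H2 @ W3)
    e = P - Y
    dH2 = (e @ B2) * H2 * (1 - H2)           # error sent directly, no chain through W3
    dH1 = (e @ B1) * H1 * (1 - H1)           # error sent directly, no chain through W2
    W3 -= lr * H2.T @ e / len(X)
    W2 -= lr * H1.T @ dH2 / len(X)
    W1 -= lr * X.T @ dH1 / len(X)

print(np.mean((sigmoid(sigmoid(sigmoid(X @ W1) @ W2) @ W3) > 0.5) == (Y > 0.5)))
```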

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !

Saturday, March 18, 2017

Six million page views: The Numbers Game in Long Distance Blogging



The Long Distance Blogging continues. Six million page views roughly amounts to about a million page views per year, but those stats don't include the 800+ people receiving this blog as a "newsletter" every day. If you want every new entry directly in your mailbox, enter your email address here:
At NIPS, I was surprised to learn that some readers call the blog a newsletter, as they probably seldom come to the site directly. There is also the Nuit Blanche RSS feed, which has about 1,200 subscribers.

The Nuit Blanche communities where I repost blog entries on other social networks are:
These are two groups on LinkedIn where people can interact. Members receive only a monthly Nuit Blanche in Review, as opposed to a daily blog posting frequency as above:


We are currently in Season 4 of the Paris Machine Learning meetup. Our membership seems to indicate that this is probably the third largest in the world region-wise, after Silicon Valley and New York, so Franck and I decided to invest in a site: MLParis.org. We have had 18 meetups since September. Here are some sites associated with the Paris Machine Learning community:

Over the course of writing Nuit Blanche, there has been a need to make some information available in a more permanent fashion; here are those reference pages:
Here are the historical figures:

Friday, March 17, 2017

The Unreasonable Effectiveness of Random Orthogonal Embeddings



In the series "The Unreasonable effectiveness of", we've had 

Today, we have something that is a subset of random projections: The Unreasonable Effectiveness of Random Orthogonal Embeddings by Krzysztof Choromanski, Mark Rowland, Adrian Weller
We present a general class of embeddings based on structured random matrices with orthogonal rows which can be applied in many machine learning applications including dimensionality reduction, kernel approximation and locality-sensitive hashing. We show that this class yields improvements over previous state-of-the-art methods either in computational efficiency (while providing similar accuracy) or in accuracy, or both. In particular, we propose the Orthogonal Johnson-Lindenstrauss Transform (OJLT) which is as fast as earlier methods yet provably outperforms them in terms of accuracy, leading to a "free lunch" improvement over previous dimensionality reduction mechanisms. We introduce matrices with complex entries that further improve accuracy. Other applications include estimators for certain pointwise nonlinear Gaussian kernels, and speed improvements for approximate nearest-neighbor search in massive datasets with high-dimensional feature vectors.
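The simplest way to get a feel for the orthogonality trick is the dense, unstructured version below (the paper's structured constructions, such as the OJLT, achieve the same effect much faster): orthogonalize a Gaussian matrix with QR, rescale the rows to Gaussian-like lengths, and use the result as a Johnson-Lindenstrauss-style projection. This is only an illustrative sketch; names and dimensions are mine.

```python
import numpy as np
rng = np.random.default_rng(0)

def orthogonal_embedding(d, k):
    """Random projection with exactly orthogonal rows (dense, unstructured version)."""
    G = rng.normal(size=(d, d))
    Q, _ = np.linalg.qr(G)                  # rows of Q are orthonormal
    # Rescale rows so their lengths match those of an iid Gaussian matrix.
    S = np.sqrt(rng.chisquare(d, size=k))
    return S[:, None] * Q[:k, :]

d, k = 256, 32
P = orthogonal_embedding(d, k) / np.sqrt(k)
x, y = rng.normal(size=d), rng.normal(size=d)
# Pairwise distances are approximately preserved (Johnson-Lindenstrauss flavor).
print(np.linalg.norm(x - y), np.linalg.norm(P @ (x - y)))
```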

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
