What is Machine Learning

What is Machine Learning (ML)? Here’s a somewhat unavailing answer:

It’s nothing new
It’s important and timely
It’s both “good” and “evil”
It’s a “killer app” for modern computing accelerators like GPUs (and probably FPGAs soon)
It’s a LOT of Mathematics
It’s also something you can play with without a lot of mathematics

I’ll give you a more meaningful definition now before I elaborate on the nonsense above.

Machine Learning: Machine Learning (ML) is a multidisciplinary field focused on implementing computer algorithms that draw predictive insight from static or dynamic data sources, using analytic or probabilistic models that are refined via training and feedback. It makes use of pattern recognition, artificial intelligence learning methods, and statistical data modeling. Learning is achieved in two primary ways. In supervised learning the desired outcome is known, and an annotated data set or measured values are used to train/fit a model so that it accurately predicts values or labels outside of the training set. This is basically “regression” for values and “classification” for labels. Unsupervised learning uses “unlabeled” data and seeks to find *inherent* partitions or clusters of characteristics present in the data. ML draws from the fields of computer science, statistics, probability, applied mathematics, optimization, information theory, graph and network theory, biology, and neuroscience.

The above is my somewhat naive definition of Machine Learning. Basically I see it as “curve fitting” on LOTS of steroids! It is very interesting and rich with wonderful results and seemingly endless areas of application!
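
To make the “curve fitting” picture a little more concrete, here is a minimal sketch of the two learning modes from the definition, in plain Python. Everything in it is an assumption for illustration: the function names `fit_line` and `kmeans_1d` are made up, the data is invented, and real ML work would use a proper library and far more data. The supervised half fits a straight line by least squares and predicts a value outside the training set; the unsupervised half groups unlabeled numbers into two clusters with a tiny k-means loop.

```python
def fit_line(xs, ys):
    """Supervised learning (regression): least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form least-squares solution for a straight line.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def kmeans_1d(points, k=2, iterations=20):
    """Unsupervised learning (clustering): a tiny 1-D k-means, assuming k >= 2."""
    # Spread the initial centers evenly between the smallest and largest point.
    lo, hi = min(points), max(points)
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda j: abs(p - centers[j])) for p in points]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = sum(members) / len(members)
    return centers, labels


# Supervised: invented (x, y) pairs play the role of the annotated training set.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.1, 4.9, 7.2, 8.8]
slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept  # a value outside the training set
print("slope:", slope, "intercept:", intercept, "prediction at x=5:", prediction)

# Unsupervised: invented unlabeled values that fall into two obvious groups.
data = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
centers, labels = kmeans_1d(data, k=2)
print("centers:", centers, "labels:", labels)
```

The line fit recovers a slope of roughly 1.97 and an intercept of roughly 1.06 for this toy data, and the clustering splits the six values into a low group and a high group without ever being told which is which.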

Nothing New

Machine Learning has gotten a lot of attention lately, but its roots and major mathematical methods go back several decades. I played around with neural networks a bit in the early ’90s when I was a grad student. One of the books I still have on my shelf from that time is “Introduction to the Theory of Neural Computation” by John Hertz, Anders Krogh, and Richard G. Palmer. There is a foreword by Jack Cowan, of the University of Chicago Math Department, that starts,

“The past decade has seen an explosive growth in studies of neural networks. In part this was the result of technological advances in personal and main-frame computing, enabling neural network investigators to simulate and test ideas in ways not readily available before 1980.”

That could easily have been written today! (The reference to “main-frame computing” is a giveaway to the age of the quote, though 🙂) Today it would be written with a reference to the incredible performance achievable using GPU acceleration.

The mathematical foundations of the methods of ML were established years ago. Progress and interest in some of the more inventive methodologies like neural networks were slow and sparse because of a lack of “really adequate” computing power and rich data sources. That is no longer the case! You can do more now with a workstation and a few video cards than folks in 1990 could do with the entire computing resources of the planet! And there is so much data to work with now that worldwide data storage is estimated in zettabytes. 1 ZB = 1 trillion gigabytes = 10^21 bytes. This is probably the start of the golden age of machine learning.

Important and Timely

All of the pieces are in place for ML. There are mountains of data from business records, mountains of data from scientific instruments, and mature, well-annotated databases. There are well-established theoretical and numerical methods and many easy-to-use software libraries and APIs. There are inexpensive computing resources and well-understood programming methods for utilizing add-in accelerators like GPUs. AND, there is money! Money for research, money for startups, and big business budget allocations. “Data Scientist” is the hot new job title.

The dark side

Unfortunately, the biggest driving force for the resurgence of Machine Learning is the effort to sell you stuff! Some of the best young minds are being seduced by high-paying jobs at places like Google, Amazon, Baidu, Facebook, and Microsoft. Brilliant work is being done to improve marketing and advertising. Arrrggg! Another driving force is government's insatiable desire to poke its nose in your business. The shiny friend in your pocket is a great tracking device, there are cameras everywhere, and everything you do online is recorded somewhere. All of that data is collected by someone somewhere, and the government wants to sift through all of it to find out if you are up to no good. Again, Arrrgggg!

The bright side

There is great good to be achieved with ML. Machine learning has huge potential to support scientific discovery. There are very large databases of high-quality scientific information, and modern research instruments can produce terabytes of data per day. Possibilities for discovery in biomedicine, chemistry, physics, sociology, economics, etc. are endless. I’m confident that any researcher could scribble out on a napkin a Machine Learning project in their field after having an espresso and a couple of beers. There is also great potential for good from the role of ML in “future technology” such as robotics, autonomous vehicles, voice recognition and translation, and realizations of various applications of artificial intelligence. Also, not all use of ML in business and government is evil. There are tremendous possibilities for businesses to improve how they operate and better serve their customers in very positive ways. ML can also fuel positive social, economic, and political change. Governments can use their large data resources to better understand the needs of citizens and effect positive global change. … but they will probably just spy on everybody …

Killer App for GPUs and other Accelerators

There is no doubt that the work done at Google in 2012 analyzing YouTube data using a deep neural network classifier with unsupervised learning did a lot to spur popular interest in ML. The feature extraction classifier they built discovered (by itself) that cat videos are popular on YouTube. Surprise! It was actually really interesting work! The two patterns below are what the system decided were the most important things on YouTube 🙂

Their first work was done using a conventional compute cluster for the training calculations. It consisted of 1000 nodes with 16,000 cores! A year later it was shown that a model with the same complexity could be run in the same time using 3 workstations with 4 GPUs in each (and those were 2013 GPUs!). That was exciting, and the excitement was not lost on NVIDIA. This is a great use case for GPU acceleration since single precision is generally adequate for these types of tasks (in fact it’s arguable that single precision is overkill). NVIDIA is really good at developing “ecosystems” for developers and they are working very hard to provide great libraries and general support for the ML community. … and they are succeeding! This is illustrated by their work on the “DIGITS” software stack. It’s an example of their commitment to the community and their effort to lower the barrier to entry for exploration.
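
Here is a rough way to see why single precision tends to be good enough for this kind of arithmetic. It’s only a sketch under my own assumptions: the weights and inputs are invented, the helper `to_float32` is a made-up name, and it only rounds the stored values to single precision (Python still does the multiply and add in double precision), so it is not a faithful simulation of a GPU. It compares one neuron-style weighted sum computed from double-precision values against the same sum computed from values rounded to 32 bits.

```python
import struct


def to_float32(x):
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Invented weights and inputs for a single neuron-style weighted sum.
weights = [0.125, -0.3, 0.77, 1.5]
inputs = [1.0, 2.0, -1.0, 0.5]

# Weighted sum using the full double-precision values.
full = sum(w * v for w, v in zip(weights, inputs))

# Same sum, but with every stored value first rounded to single precision.
single = sum(to_float32(w) * to_float32(v) for w, v in zip(weights, inputs))

relative_error = abs(full - single) / abs(full)
print("double:", full)
print("single:", single)
print("relative error:", relative_error)
```

The relative error comes out far smaller than the noise you would expect in real training data, which is the intuition behind the claim that single precision is adequate here.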

It’s not just GPUs that can be used to accelerate ML computations. Given that seemingly every other booth I walked by at SC15 had FPGAs in it, I wouldn’t be at all surprised to see ML applications in the reconfigurable computing space. And don’t forget about Intel! All of those vector units in the Phi can be put to good use on ML codes, and Intel is working on its own optimized implementation of Python, which is one of the favored languages for ML.

Lots of Mathematics

Machine Learning is an area of focus in applied mathematics. ML draws on many mathematical topics, the major ones being probability and statistics, linear algebra, calculus, and computer programming. Also useful are optimization theory, information theory, graph/network theory, numerical analysis … You could probably come up with a pretty long list of maths that are useful for ML. A level of mathematical exposure equivalent to an undergraduate minor in applied mathematics, together with some computer skills, would get you going in the literature of ML. If you are aspiring to be a “data scientist”, that probably means a degree in math along with some computer science and study in an application area like business or one of the sciences.

You can play with it without a lot of mathematics

Do you really need an extensive mathematics background to “do” machine learning? No, probably not. If you want to have a deep understanding and possibly do research then yes, you need the math. If you have strong curiosity and like exploring, you will find that some of the software being developed is usable by mere mortals. If you start searching online you will discover a large number of free courses and tutorials. You could try to set up NVIDIA’s DIGITS software stack and work through the examples. You may discover that you can do some interesting things without a deep understanding of what’s under the hood. It’s a hot topic and there is lots of discussion and inspiration to be found. Just dive in!

I recommend taking a look at the excellent keynote talks by Jeff Dean (Google) and Andrew Ng (Baidu) given at NVIDIA’s 2015 GTC conference. They will give you a feel for the excitement and possibilities of ML. If you are interested in this stuff you should really consider going to GTC 2016 in April. I expect there to be a very heavy focus on ML. It’s a great conference!

If you are interested in going deeper into the core of ML, I recommend the wonderful book “Pattern Recognition and Machine Learning” by Christopher M. Bishop.

I hope to spend a significant amount of time with Machine Learning myself this year. I’ll keep you posted on the good stuff 🙂

Happy Computing! –dbk