Starting to learn a bit of AI
10 August 2022

The goal is to transform data into knowledge

Over the next couple of years, one of the learning "Adventures" I want to go on is learning more about AI and, specifically, the subfield of Machine Learning. It is not easy to know where to start on such a big topic.

I want to develop an understanding of the Why, not just the How, and this eliminates many of the obvious courses and books. Finding a first-principles book on AI is quite hard, particularly if you want something that retains pragmatic elements.

I picked out an old book from 2015 called "Python Machine Learning" by Sebastian Raschka. The libraries it uses are certainly outdated and many of the techniques have probably been surpassed. It had the advantage that it was on my bookshelf, and it seems to have a nice mix of the why and the how.

A journey starts with the first step and this seems like a reasonable first one to take.

So I read the Introduction chapter

An introduction in a technical book has to cover a number of things:

  - What's included in the book.
  - The best way to read the book.
  - Routes through the book, if there is more than one.
  - Start to introduce terminology.
  - Motivate the student.

I tend to bring my own motivation, and everything on that list apart from the terminology is only interesting if you are actually reading the book.

The very nice, super-high-level one-liner they proposed for machine learning was "The goal is to transform data into knowledge". I really like this concise sentence and it was not one I had heard before.

What followed then was a discussion of the three different types of machine learning.

  1. Supervised
  2. Unsupervised
  3. Reinforcement

Supervised Learning

Let's take a quick look at supervised learning first. This is probably the better-known set of techniques, where you try to make predictions about previously unseen data by learning from large sets of labelled data.

The classic example is deciding whether an email just received is spam or not. It is new data, not previously seen, but if you have a large number of other emails that have been labelled as spam or not then perhaps you can use them to infer whether this new email is spam. Do people actually use email any more outside of a business context?
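
To make the idea concrete for myself, here is a rough sketch of a spam filter. It is my own toy example, not code from the book: the six "emails" are made up, the `train` and `predict` names are mine, and the technique I picked (a naive Bayes word counter with add-one smoothing) is just one simple option among many.

```python
import math
from collections import Counter

# Made-up, hand-labelled training emails (illustrative only).
TRAINING = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("claim your free prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
    ("project agenda and notes", "ham"),
]


def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    return word_counts, label_counts


def predict(text, word_counts, label_counts):
    """Return the label with the highest naive Bayes log score."""
    vocabulary = set()
    for counts in word_counts.values():
        vocabulary.update(counts)
    total_examples = sum(label_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        # Prior: how common is this label in the training data?
        score = math.log(label_counts[label] / total_examples)
        total_words = sum(counts.values())
        for word in text.split():
            # Add-one smoothing so an unseen word does not zero out the score.
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


word_counts, label_counts = train(TRAINING)
print(predict("free prize inside", word_counts, label_counts))
print(predict("agenda for lunch", word_counts, label_counts))
```

The labelled examples are the "supervision": the code never gets told what spam looks like, it only sees which words turned up under which label.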

If, instead of a category, you are trying to predict a continuous outcome, like an exam result based on hours studied, then you tend to use regression. The classic technique here would be linear regression.
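
A minimal sketch of the hours-studied example, assuming a single feature and ordinary least squares. The numbers are invented by me purely for illustration, and `fit_line` is my own helper name rather than anything from the book.

```python
# Made-up data: hours studied and the exam score that followed.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 55.0, 61.0, 64.0, 68.0]


def fit_line(xs, ys):
    """Ordinary least squares for one feature: y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How x and y vary together, and how x varies on its own.
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(hours, scores)
# Predict the score for a student who studies 6 hours (unseen data).
predicted = slope * 6.0 + intercept
print(round(slope, 2), round(intercept, 2), round(predicted, 2))
```

The fitted line is the "knowledge" extracted from the data: a slope saying roughly how many extra marks each extra hour is worth in this invented data set.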

Reinforcement Learning

This is where you use an agent (a system) that operates inside an environment, often including a reward signal. That sentence screams out for a reference to The Matrix. Once you factor in a reward signal it starts to look quite close to supervised learning.

The example they use here is Chess.
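
Chess is far too big for a blog snippet, so here is a sketch on a much smaller environment of my own invention: a five-cell corridor where the agent starts on the left and is rewarded only when it reaches the right-hand end. The learning rule is tabular Q-learning, which is my choice for illustration and not necessarily what the book goes on to use; all names and parameter values are assumptions.

```python
import random

N_STATES = 5              # corridor cells 0..4; cell 4 is the goal
GOAL = N_STATES - 1
ACTIONS = ["left", "right"]


def step(state, action):
    """Environment: move one cell; reward 1.0 only on reaching the goal."""
    if action == "right":
        next_state = min(state + 1, GOAL)
    else:
        next_state = max(state - 1, 0)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def choose_action(q_table, state, epsilon, rng):
    """Epsilon-greedy: mostly exploit, sometimes explore; break ties randomly."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    best = max(q_table[state].values())
    return rng.choice([a for a in ACTIONS if q_table[state][a] == best])


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q_table = [{a: 0.0 for a in ACTIONS} for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            action = choose_action(q_table, state, epsilon, rng)
            next_state, reward = step(state, action)
            # Nudge the estimate towards reward + discounted future value.
            target = reward + gamma * max(q_table[next_state].values())
            q_table[state][action] += alpha * (target - q_table[state][action])
            state = next_state
    return q_table


q_table = train()
# The learned policy: the best-looking action in each non-goal cell.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(GOAL)]
print(policy)
```

Nobody labels the "correct" move in each cell. The agent only ever sees the reward at the end, and the preference for moving right has to propagate backwards through the table.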

Unsupervised Learning

The previous two techniques had some element of knowing where you are going: either you have labelled data or a reward signal to show you the way. Unsupervised learning deals with unlabelled data or data of unknown structure. The goal is to apply techniques that explore the data and extract information from it.

Clustering would be a good example technique.
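
As a sketch of what clustering can look like, here is a bare-bones k-means on six made-up 2D points. This is my own illustration, not the book's code: the `k_means` helper, the data, and the naive choice of using the first two points as starting centroids are all assumptions.

```python
# Made-up, unlabelled 2D points (illustrative only).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and moving centroids."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        centroids = [
            (
                sum(p[0] for p in cluster) / len(cluster),
                sum(p[1] for p in cluster) / len(cluster),
            )
            if cluster
            else c  # keep an empty cluster's centroid where it was
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Naive initialisation: start from the first two points.
centroids, clusters = k_means(points, [points[0], points[1]])
print(centroids)
print(clusters)
```

No labels go in anywhere. The two groups fall out of the distances between the points alone, which is the sense in which the technique "explores" the data.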

The book then mentions dimensionality reduction as a subfield. This is often a pre-processing step to reduce the amount of data given to other techniques. I can't recall the example given. It is also useful if you want to visualise data.
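
Since I can't recall the book's example, here is a small one of my own: squashing made-up 2D points down to a single number each by projecting them onto their main direction of variation. This is the idea behind principal component analysis, done here with a closed-form angle formula that only works for the 2D case; the function names and data are assumptions for illustration.

```python
import math

# Made-up 2D points that lie roughly along a diagonal line.
points = [(2.0, 1.9), (3.0, 3.2), (4.0, 3.9), (5.0, 5.1), (6.0, 5.9)]


def first_principal_axis(points):
    """Return the mean and the unit vector of greatest variance (2D only)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Entries of the 2x2 sample covariance matrix.
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Closed-form orientation of the leading eigenvector of a 2x2 symmetric matrix.
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (mean_x, mean_y), (math.cos(angle), math.sin(angle))


def project(points, mean, axis):
    """Replace each 2D point with its coordinate along the axis."""
    return [(x - mean[0]) * axis[0] + (y - mean[1]) * axis[1] for x, y in points]


mean, axis = first_principal_axis(points)
reduced = project(points, mean, axis)
print(axis)
print(reduced)
```

Each point is now one number instead of two, and because the points sat close to a line very little has been thrown away. With real data and many more dimensions you would reach for a library rather than a hand-rolled formula.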


The final sentence in my notes is that you will often want to reduce any bias you might have about the data by applying multiple different techniques. This feels like a very important sentence to internalise.

I felt the chapter did a good job of introducing the top-level ideas along with some of the whys.

The Approach I am taking

Even though the book is old, I am wary of putting too many details on this blog. The author worked hard on this book and deserves recognition for the text they crafted.

To avoid parroting their text, I am taking notes while reading and using these to write the blog entries. That means any errors are likely to come from this process or from my lack of understanding. Later chapters that get more practical should provide additional distance from the original text. With this workflow I may attribute something to the book that is not there. You have been warned.

I also may not even finish the book, not because of its quality but because I may get pulled away to explore a different machine learning topic and just never return to it. Flexibility in approach is important.