I decided to take the online course “Machine Learning Foundations – A Case Study Approach”, offered on Coursera and taught by Carlos Guestrin and Emily Fox (professors at the University of Washington).
This introductory, intuition-driven course treats the machine learning method as a black box. The idea is to learn ML concepts through case studies, so the course doesn’t go deep into how to formulate an ML model or how to optimize it.
It’s a 6-week course, and here I’ll share the highlights related to my research.
Week 1 – course overview
Machine learning is changing the world: If you look at some of the most successful companies in industry today, the ones that are called disruptive, they are often differentiated by intelligent applications that use machine learning at their core. In its early days, Amazon disrupted the retail market by bringing product recommendations into its website. Google disrupted the advertising market by using machine learning to figure out what people would click on and targeting ads accordingly. Netflix, the movie distribution company, changed how movies are watched: we no longer go to a shop to rent movies, we go to the web and stream them. At its core was a recommender system that helps me find the movies I like out of the many thousands they serve. Pandora provides a music recommendation system that finds music I like, with different streams for the morning when I’m sleepy and for the night when I’m ready to go to bed. Facebook connects me with people I might want to be friends with. And companies like Uber are disrupting the taxi industry by optimizing how drivers are connected with riders in real time. In all these areas, machine learning is one of the core technologies, the one that makes the company’s product special.
The machine learning pipeline: this is the data-to-intelligence pipeline. We start from data and bring in a machine learning method that provides a new kind of analysis of that data. That analysis gives us intelligence, such as which product I am likely to buy right now.
Case study 1: Predicting house prices
Machine learning can be used to predict house values. The intelligence we’re deriving is the value of a house that’s not on the market: we don’t know its value and want to learn it from data. The data, in this case, are the sale prices of other houses, which inform the value of the house we’re interested in. In addition to the sale prices, we look at other features of the houses, such as the number of bedrooms, the number of bathrooms, the square footage, and so on. What the machine learning method does is relate the house attributes to the sale price. If we can learn this model, that is, the relationship from house-level features to the observed sale price, then we can use it to make a prediction for the new house: we take its attributes and predict its sale price. This method is called regression.
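To make the idea concrete, here is a minimal sketch of regression using ordinary least squares with NumPy. The course treats the method as a black box, so this is only one possible way to fit such a model; the houses, prices, and function names below are hypothetical and made up for illustration.

```python
import numpy as np

# Hypothetical training data: [square feet, bedrooms, bathrooms] per sold house.
features = np.array([
    [1000, 2, 1],
    [1500, 3, 2],
    [2000, 3, 2],
    [2500, 4, 3],
    [3000, 5, 3],
], dtype=float)
# Hypothetical observed sale prices in dollars, one per house above.
prices = np.array([200_000, 290_000, 360_000, 450_000, 540_000], dtype=float)


def fit_linear_regression(X, y):
    """Return least-squares weights, with the intercept as the last entry."""
    X_with_bias = np.column_stack([X, np.ones(len(X))])
    weights, *_ = np.linalg.lstsq(X_with_bias, y, rcond=None)
    return weights


def predict_price(weights, house):
    """Predict the sale price of one house from its feature vector."""
    return float(np.dot(np.append(house, 1.0), weights))


# Learn the relationship between attributes and price, then apply it
# to a house that is not on the market.
weights = fit_linear_regression(features, prices)
new_house = np.array([1800, 3, 2], dtype=float)
estimate = predict_price(weights, new_house)
print(f"Estimated price: ${estimate:,.0f}")
```

The learned weights play the role of the “model” in the text: once fitted on the sold houses, they can be applied to any new house described by the same attributes.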
Case study 2: Sentiment analysis
Machine learning can be applied to a sentiment analysis task where the training data are restaurant reviews. A review might say that the sushi was awesome, the drink was awesome, but the service was awful. A possible ML goal in this scenario is to take a single review and classify whether its sentiment is positive or negative: thumbs up for a good review, thumbs down for a bad one. To do so, the ML pipeline analyzes many other reviews (the training data), considering both the text and the rating of each review, in order to learn the relationship between them. For example, the ML model might analyze the text of a review in terms of how many times the word “awesome” is used versus how many times the word “awful” is used. Doing so for all reviews, the model learns, based on the balance of usage of these words, a decision boundary between positive and negative reviews. The model learns this boundary from the ratings associated with each text. This method is called classification.
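As a rough illustration of learning a decision boundary from word counts, the sketch below trains a tiny perceptron on the counts of “awesome” and “awful”. The perceptron is my own choice of a simple linear classifier, not necessarily the method used in the course, and the reviews and labels are hypothetical.

```python
import re

# Hypothetical labeled reviews: +1 for a positive rating, -1 for a negative one.
training_reviews = [
    ("The sushi was awesome and the drinks were awesome", 1),
    ("Awesome food, awesome staff", 1),
    ("The service was awful", -1),
    ("Awful noodles and an awful wait", -1),
    ("The sushi was awesome, the drink was awesome, but the service was awful", 1),
]


def word_counts(text):
    """Return the feature vector (count of 'awesome', count of 'awful')."""
    words = re.findall(r"[a-z]+", text.lower())
    return words.count("awesome"), words.count("awful")


def train_perceptron(examples, epochs=20):
    """Learn the weights of a linear decision boundary with the perceptron rule."""
    w_awesome, w_awful, bias = 0.0, 0.0, 0.0
    for _ in range(epochs):
        for text, label in examples:
            n_awesome, n_awful = word_counts(text)
            score = w_awesome * n_awesome + w_awful * n_awful + bias
            if label * score <= 0:  # misclassified: nudge the boundary
                w_awesome += label * n_awesome
                w_awful += label * n_awful
                bias += label
    return w_awesome, w_awful, bias


def classify(model, text):
    """Return 'positive' if the review falls on the positive side of the boundary."""
    w_awesome, w_awful, bias = model
    n_awesome, n_awful = word_counts(text)
    score = w_awesome * n_awesome + w_awful * n_awful + bias
    return "positive" if score > 0 else "negative"


model = train_perceptron(training_reviews)
print(classify(model, "The ramen was awesome"))
print(classify(model, "Awful service"))
```

On this toy data the model ends up with a positive weight for “awesome” and a negative weight for “awful”, which is exactly the “balance of usage” idea described above.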
Case study 3: Document retrieval
The third case study is about a document retrieval task. Given a huge collection of articles and books (the dataset) that the system could recommend, the challenge is to use machine learning to identify the readings most interesting to a specific person. In this case, the ML model tries to find structure in the dataset in the form of groups of related articles (e.g. sports, world news, entertainment, science, etc.). Once this structure is found and the corpus (the collection of documents) is annotated, the machine can use the labels to build a document retrieval engine. If a reader is currently reading an article about world news and wants to retrieve another one, then, knowing the article’s label, he or she knows which category to keep searching in. This type of approach is called clustering.
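The grouping step can be sketched with k-means, one common clustering algorithm (the text above does not name a specific one, so this is an assumption). The article titles, the four-word vocabulary, and the word counts are hypothetical, and the initial centroids are picked by hand to keep the example deterministic.

```python
import numpy as np

# Hypothetical articles as word-count vectors over the vocabulary
# ("match", "goal", "election", "vote").
titles = ["Cup final", "Derby recap", "Transfer news",
          "Poll results", "Campaign trail", "Debate night"]
vectors = np.array([
    [5, 3, 0, 0],
    [4, 4, 0, 1],
    [3, 2, 1, 0],
    [0, 0, 4, 5],
    [1, 0, 5, 3],
    [0, 1, 3, 4],
], dtype=float)


def kmeans(points, initial_centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and re-averaging."""
    centroids = initial_centroids.copy()
    for _ in range(iterations):
        # Euclidean distance from every point to every centroid.
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        for k in range(len(centroids)):
            members = points[labels == k]
            if len(members) > 0:
                centroids[k] = members.mean(axis=0)
    return labels, centroids


# Seed one centroid with the first article and one with the fourth.
labels, centroids = kmeans(vectors, initial_centroids=vectors[[0, 3]])


def related_articles(index):
    """Retrieve the other articles that share a cluster with the given one."""
    return [titles[i] for i in range(len(titles))
            if labels[i] == labels[index] and i != index]


print(related_articles(3))
```

Note that the clusters carry no names of their own; calling one “sports” and the other “world news” is an annotation added afterwards.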
Case study 4: Product recommendation
The fourth case study addresses an approach called collaborative filtering, which has had a lot of impact in many domains over the last decade. Specifically, the task is to build a product recommendation application, where the ML model takes the customer’s past purchases and uses them to recommend a set of other products the customer might be interested in purchasing. The relation the model tries to capture is between the products the customer bought before and what he or she is likely to buy in the future. To learn this relation, the model looks at the purchase histories of many past customers and possibly at features of those customers (e.g. age, gender, family role, location …).
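A very simple flavor of this idea is to count co-purchases: products bought by customers whose histories overlap with mine get a higher score. The sketch below is a simplified stand-in for collaborative filtering rather than the course’s method; it ignores customer features, and all customers and products are hypothetical.

```python
from collections import Counter

# Hypothetical purchase histories of past customers.
purchase_histories = {
    "customer_a": {"diapers", "baby wipes", "bottle"},
    "customer_b": {"diapers", "baby wipes", "stroller"},
    "customer_c": {"diapers", "bottle"},
    "customer_d": {"laptop", "mouse"},
}


def recommend(purchased, histories, top_n=2):
    """Rank unseen products by how strongly their buyers overlap with `purchased`."""
    scores = Counter()
    for other_items in histories.values():
        overlap = len(purchased & other_items)
        if overlap == 0:
            continue  # this customer shares nothing with us
        for item in other_items - purchased:
            scores[item] += overlap
    # Highest score first; ties broken alphabetically for a stable result.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:top_n]]


print(recommend({"diapers"}, purchase_histories))
```

Real systems typically work with much larger and sparser purchase data, but the principle of learning from other customers’ histories is the same.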
Case study 5: Visual product recommender
The last case study is about a visual product recommender. The idea is much like the previous example. The task is again a recommendation application, but here the ML model learns from the visual features of an image, and the outcome is also an image. The data is an input image (e.g. a black shoe, black boot, high heel, running shoe or some other shoe) chosen by a user in a browser. The goal of the application is to retrieve a set of images of shoes visually similar to the input image. The model does so by learning visual relations between different shoes. Usually, these models are built on a specific kind of architecture called a Convolutional Neural Network (CNN). In a CNN, each successive layer of the network provides more and more descriptive features. The first layer just detects simple features such as different edges. By the second layer, the model begins to detect corners and more complex features. As we go deeper through the layers, more intricate visual features arise.
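Once every image has been turned into a feature vector, the retrieval step can be sketched as a nearest-neighbor search. In the example below, the 4-dimensional vectors are hypothetical placeholders for the features a deep CNN layer would produce, and cosine similarity is my assumed choice of similarity measure; no actual network is run here.

```python
import numpy as np

# Hypothetical feature vectors standing in for the output of a deep CNN layer.
# A real network would produce hundreds or thousands of values per image.
catalog = {
    "black_boot.jpg": np.array([0.9, 0.8, 0.1, 0.0]),
    "black_shoe.jpg": np.array([0.8, 0.9, 0.2, 0.1]),
    "high_heel.jpg": np.array([0.2, 0.1, 0.9, 0.3]),
    "running_shoe.jpg": np.array([0.1, 0.2, 0.3, 0.9]),
}


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 means same direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def most_similar(query_name, features, top_n=2):
    """Return the names of the catalog images closest to the query image."""
    query = features[query_name]
    scored = [(cosine_similarity(query, vector), name)
              for name, vector in features.items() if name != query_name]
    scored.sort(reverse=True)  # most similar first
    return [name for _, name in scored[:top_n]]


print(most_similar("black_boot.jpg", catalog))
```

The quality of such a recommender depends almost entirely on how descriptive the features are, which is why the deeper CNN layers mentioned above matter.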