# Machine Learning Algorithms Explained Clearly

Machine learning is a subfield of artificial intelligence. It is essentially the process of feeding a set of data into an algorithm that analyzes it. The data is organized, segmented, or "parsed" and used to make predictions, reach decisions, or detect patterns.

In this post I will explain the different types of machine learning algorithms, how these algorithms interact with data to create machine learning, and the application of various machine learning algorithms.

**Categories of Algorithms**

Machine learning algorithms fall into three broad categories, each representing a different style of machine learning.

**1. Supervised Machine Learning Algorithms**

Supervised machine learning is used to make useful predictions about real-world scenarios. There are two subcategories of supervised algorithms:

- Regression algorithms are used to predict a numerical value based on one or more input variables. Examples include predicting a person's salary from their level of education, estimating the likelihood of a specific ailment by age, or estimating the value of a house from various factors.
- Classification algorithms are used to predict a discrete category, often a binary one: true/false, 0/1, photo of a duck / not a photo of a duck.

Supervised learning relies on a dataset that includes examples of the "correct" answer. So if you're using a regression algorithm, you would need to provide data pairing salaries with education levels, ages and other factors with a particular ailment, or house values with housing conditions.

For a classification algorithm, the training data would need to include examples labeled true or false. For obvious reasons, the more training examples there are, the more accurate the predictions will be.
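To make the labeled-data idea concrete, here is a toy one-feature classifier (a decision "stump") that learns a cutoff threshold from labeled examples. The feature (count of suspicious words) and all the numbers are invented purely for illustration:

```python
def learn_threshold(values, labels):
    """Pick the cutoff that best separates the labeled examples."""
    best_t, best_correct = None, -1
    for t in sorted(values):
        # predict "spam" whenever the value is at or above the cutoff
        correct = sum((v >= t) == bool(y) for v, y in zip(values, labels))
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t

# hypothetical training data: suspicious-word counts, labeled spam (1) / not spam (0)
words = [0, 1, 2, 8, 9, 12]
spam = [0, 0, 0, 1, 1, 1]
t = learn_threshold(words, spam)
print(t)        # the learned cutoff
print(10 >= t)  # classify a new email with 10 suspicious words
```

With more (and more varied) labeled examples, the learned cutoff becomes a more reliable estimate, which is the point the paragraph above makes.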

**Popular Supervised Machine Learning Algorithms:**

**Classification**

- Logistic Regression – (Classification) this type of algorithm is used when a binary answer to a question must be determined from one or more independent variables. In short, it predicts the likelihood that something will occur. For example, will a person have a heart attack given X conditions? What are the odds a person will quit their job given X conditions? The outcome is binary: yes or no, true or false.

- Naive Bayes – (Classification) this algorithm is used for categorization purposes, such as determining whether an email is spam, categorizing news articles, or even face recognition.

- Support Vector Machines – (Mostly Classification) this algorithm uses hyperplanes to divide data as optimally as possible. Essentially, the technique plots a boundary between two categories of data to separate items into one set or the other with the highest possible accuracy. The process can be repeated with different sets of data to categorize an item correctly. SVMs can be used to classify images, as in facial recognition (where they are a more accurate method), handwriting recognition, and text/article categorization.

- Random Forests – (Classification) this algorithm is called a 'forest' because it combines many decision trees to reach a solution. Each 'tree' is given a random portion of the data, so every tree works on different but similar data. The trees' answers are then compared to find the solution most trees had in common. This algorithm performs especially well on very large datasets.
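As a sketch of the logistic-regression idea described above, the following fits a one-variable model with plain stochastic gradient descent. The scenario (hours of weekly overtime vs. whether the person quit) and every number are fabricated for illustration; real work would typically use a library such as scikit-learn:

```python
import math

def sigmoid(z):
    # squashes any number into a probability between 0 and 1
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(xs, ys, lr=0.1, epochs=2000):
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            p = sigmoid(w * x + b)
            # nudge the weights to reduce the prediction error (p - y)
            w -= lr * (p - y) * x
            b -= lr * (p - y)
    return w, b

# hypothetical data: weekly overtime hours, labeled quit (1) / stayed (0)
xs = [1, 2, 3, 10, 12, 15]
ys = [0, 0, 0, 1, 1, 1]
w, b = train_logistic(xs, ys)
print(sigmoid(w * 2 + b) > 0.5)   # prediction for low overtime
print(sigmoid(w * 14 + b) > 0.5)  # prediction for high overtime
```

The model's output is a probability; thresholding it at 0.5 yields the binary yes/no answer the bullet describes.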

**Regression**

- AdaBoost (Regression or Classification) – short for Adaptive Boosting, this algorithm is used in combination with other, usually simple, algorithms ("weak learners") to obtain more accurate results. The process is iterative: after each model is trained, the examples it got wrong are given more weight, so the next model concentrates on the hard cases, and the models' outputs are combined into a final prediction.

- Decision Trees (Regression or Classification) – this type of algorithm presents solutions based on conditions that 'branch' out from a starting question. Decision trees can be used in targeted marketing, such as determining who should be sent an invitation to apply for a credit card, or a free trial of a product, based on the likelihood that they'll make a purchase afterwards.

- Linear Regression – (Regression) these algorithms use a traditional linear function (y = mx + b) to make a prediction for a specific set of conditions. Examples include the aforementioned education-to-salary and house-value predictions.

- Nearest Neighbor – (Regression or Classification) this algorithm finds the items most similar to a given item through what is technically called a k-NN (k-nearest neighbors) search; a prediction is then made by averaging the neighbors' values (regression) or taking a majority vote among them (classification).
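The classic least-squares fit for y = mx + b can be written in a few lines. The education-years/salary numbers below are fabricated purely for illustration (they happen to lie exactly on a line so the fit is easy to check):

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit: returns slope m and intercept b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    m = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    b = mean_y - m * mean_x
    return m, b

# hypothetical data: years of education vs. salary (in thousands)
xs = [12, 14, 16, 18, 20]
ys = [35, 45, 55, 65, 75]
m, b = fit_line(xs, ys)
print(m, b)
print(m * 17 + b)  # predicted salary for 17 years of education
```

Once m and b are learned from the labeled examples, any new x can be plugged into y = mx + b to get a prediction, exactly as the Linear Regression bullet describes.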

**2. Unsupervised Machine Learning Algorithms**

Unsupervised algorithms are used to detect patterns in data and for descriptive modeling*. Their primary function is organizing data: they provide structure and allow scientists to make sense of unlabeled data. In this way unsupervised algorithms can surface information that a scientist wouldn't have thought to look for. This technique is used when a dataset exists without a precise goal or model; no training data is available or prescribed in unsupervised learning.

*\*Descriptive modeling is a mathematical process that describes real-world events and the relationships between the factors responsible for them.*

There are two subcategories of algorithms for unsupervised machine learning as well:

- Clustering Algorithms – these algorithms separate data into like groups, or 'clusters,' based on similar features and attributes. Data clustered together will have more similarities with each other than with data in other clusters. Clustering algorithms are non-binary in that they can organize data into many clusters, not just divide it in half.
- Association Rule Mining Algorithms – these algorithms are essentially "if/then statements" that find commonalities between pieces of data. They differ from clustering algorithms in that clustering searches for common associations in the data, while association rule mining seeks to identify what causes those associations. Put another way, clustering is the 'what' and association rules are the 'why' that seek to explain the clusters and make predictions.

**Popular Unsupervised Machine Learning Algorithms:**

**Clustering**

- k-means Clustering (Linear Clustering) – this algorithm is the most popular clustering model and divides data into a chosen number of clusters for analysis. Several points on a graph are picked as the center points ('centroids') of the clusters. Each data point is then assigned to its nearest centroid, grouping similar data together, and each centroid is moved to the center of its assigned points. This repeats until the assignments stop changing, and the whole process is usually run multiple times, since the initial centroid locations can affect the results.

- Hierarchical Clustering (Linear Clustering) – these algorithms arrange data in a treelike structure called a dendrogram. Data branches away from the trunk of the tree as the similarities between pieces of data decrease. Dendrograms can be drawn horizontally or vertically across a graph.

- Distribution-Based Clustering – these algorithms model each cluster as a probability distribution (a Gaussian, for example) and assign each data point to the distribution most likely to have generated it.
- Density-Based Algorithms (Non-Linear Clustering) – like k-means, these algorithms group more-similar data closer together. Unlike k-means, however, they do not need a specified number of clusters and can discover clusters automatically; DBSCAN is the best-known example.
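The assign-then-recenter loop of k-means can be sketched in plain Python. This one-dimensional version uses made-up points with two obvious groups, and, as a simplification, initializes the centroids to the first k points rather than picking them at random with restarts:

```python
def kmeans_1d(points, k=2, iters=10):
    # simple sketch: initialize centroids to the first k points
    centers = points[:k]
    for _ in range(iters):
        # 1) assign every point to its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # 2) move each centroid to the mean of its assigned points
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

points = [1.0, 1.5, 2.0, 10.0, 10.5, 11.0]
centers, clusters = kmeans_1d(points)
print(sorted(centers))  # one centroid per group of similar points
```

Because the result depends on where the centroids start, real implementations run the algorithm several times from different initializations, as the bullet above notes.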

**Association**

- Apriori Algorithm (Association Rule) – these algorithms operate on datasets containing a large number of transactions, such as items purchased by customers or medical reactions to a particular medication, and use that information to predict which variables will lead to a given outcome (such as a sale or a side effect).

- Eclat Algorithm (Association Rule) – this algorithm detects direct correlations between items in a transactional context: for example, a person who buys chips is also likely to buy salsa, or two books may be purchased together by a large number of individuals.

- FP-growth Algorithm (Association Rule) – this algorithm is an improvement on Apriori: it identifies frequent patterns with a tree-'growth' technique (the FP-tree) that explores the possibilities more efficiently.
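The first step shared by these association-rule algorithms, counting which items appear together often, can be sketched directly. The shopping baskets below are invented, and a minimum support count stands in for the support threshold the real algorithms use:

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(transactions, min_support=2):
    """Count co-occurring item pairs and keep those meeting the support threshold."""
    counts = Counter()
    for basket in transactions:
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

baskets = [
    ["chips", "salsa", "soda"],
    ["chips", "salsa"],
    ["bread", "milk"],
    ["chips", "soda"],
]
print(frequent_pairs(baskets))
```

From such co-occurrence counts, the algorithms derive "if/then" rules like "if chips, then salsa," which is exactly the chips-and-salsa association the Eclat bullet describes.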

**3. Reinforcement Machine Learning Algorithms**

Reinforcement learning, inspired by behaviorist psychology, is a style of machine learning that works without training data, much like unsupervised learning. Programs learn through decision-making functions: algorithms that describe how the program can and should behave. The program uses this decision-making process to perform an action, and only after the action is taken does it find out whether the decision was a 'good' or 'bad' one; good decisions are reinforced with a reward. This loop runs for as long as the program does. Reinforcement learning is considered the great hope for artificial intelligence because it most closely mimics the way humans learn. Reinforcement learning algorithms can be used for computer-vs.-human strategy games, self-driving cars, robotic hands, and much more.

**Reinforcement Machine Learning Algorithms:**

- Q-Learning – this algorithm learns a table of 'Q-values,' one per state-action pair, each estimating the total reward expected from taking that action in that state; the program then acts by picking the action with the highest Q-value.

- Temporal Difference – this family of methods predicts future quantities, such as the total reward expected over a period of time, by updating its estimates from the difference between successive predictions.
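The reward loop described above can be sketched with a tiny Q-learning example. The environment here is invented for illustration: a 5-state corridor where the only reward is for reaching the rightmost state, so the program should learn that "move right" is the good decision everywhere:

```python
import random

def q_learning(n_states=5, episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.1):
    # states 0..n-1; actions 0 (left) and 1 (right);
    # reward 1.0 only for stepping into the rightmost state
    Q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s != n_states - 1:
            # epsilon-greedy choice (random when tied, so early episodes explore)
            if random.random() < epsilon or Q[s][0] == Q[s][1]:
                a = random.choice([0, 1])
            else:
                a = 0 if Q[s][0] > Q[s][1] else 1
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == n_states - 1 else 0.0
            # Q-learning update: nudge Q toward reward + discounted best future value
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q

random.seed(0)
Q = q_learning()
# after training, moving right should score higher than moving left in every non-terminal state
print(all(q[1] > q[0] for q in Q[:-1]))
```

Notice that the program only learns whether a move was good after making it, via the reward signal; this trial-and-error loop is exactly the reinforcement cycle the section describes.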

