I. Ozkan
Fall 2025
“Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalise to unseen data, and thus perform tasks without explicit instructions. […] The application of ML to business problems is known as predictive analytics.” — Wikipedia
Deals with obtaining the features (inputs) from data
Deals with predictive tasks such as Classification and/or Regression
Other Fields and Machine Learning (ML):
Artificial intelligence (AI): ML is a subset of AI
Data mining: ML and Data Mining overlap significantly
Statistics: Statistics and ML are closely related (but goals are different)
Computers have been used to automate many business decisions
This is [in general] called digitization
Machine learning is central to the fourth industrial revolution, where computers are used to create intelligence
Data Rich Environment
Human expertise that is difficult to explain
Dynamic Systems, Changing with time
Needs for adaptation
Examples of Recent Success stories:
Speech Recognition 
 NLP
 Translation
 Image Processing
Learning:
Supervised Learning
Unsupervised Learning
Semi-Supervised Learning (Out of the scope of this course)
Reinforcement Learning (Out of the scope of this course)
Deep Learning (Neural Networks will be introduced as the foundation of Deep Learning)
etc.
Task and Data:
Regression
Classification
Clustering
etc.
Example: Loan Applications (digitization vs. ML)
If there are certain known rules that loan officers can apply, one could digitize their activities
If rules are not known ML can be used to determine them
ML can also be used to improve upon the rules for loan decisions
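A minimal sketch of the digitization-vs-ML contrast above. The rule, threshold, and data are hypothetical illustrations, not real loan-approval criteria:

```python
# Digitization: a known rule is hard-coded; ML: a rule is learned from data.
# All names, thresholds, and data below are made-up for illustration.

def digitized_rule(income, debt):
    """A known rule a loan officer might apply, written down by a programmer."""
    return income > 50_000 and debt / income < 0.4

def learn_threshold(incomes, approved):
    """Learn an income cut-off from labelled past decisions by picking the
    candidate threshold that misclassifies the fewest cases."""
    best_t, best_err = None, float("inf")
    for t in sorted(set(incomes)):
        err = sum((inc > t) != lab for inc, lab in zip(incomes, approved))
        if err < best_err:
            best_t, best_err = t, err
    return best_t

# Toy history of past decisions: (income, approved?)
incomes  = [30_000, 45_000, 52_000, 70_000, 90_000]
approved = [False, False, True, True, True]

print(digitized_rule(60_000, 12_000))   # rule applied by hand -> True
print(learn_threshold(incomes, approved))  # rule recovered from data -> 45000
```

The second function is the ML view: the decision rule is not written down in advance but extracted from labelled examples, and could be re-learned as new data arrive.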
Huge amounts of data
1000s of covariates
ML becomes more fashionable
ML algorithms become more available
Computers are more powerful
Need to utilize different data types in modelling
The Data \(\to\) Pattern \(\to\) [hopefully] Theory path is promising
Traditional Statistics is helpful (and necessary) but new methods/approaches are essential for business decisions
| Supervised Learning | Unsupervised Learning | Reinforcement Learning |
|---|---|---|
| \(\{Y; X\}\) available | \(\{X\}\) available | Actions in a dynamic environment, e.g. a game |
| \(E[Y \mid X]\) | Pattern inside data | |
| \(P(Y=y \mid X=x)\) | Homogeneous groups | |
| Ex: Regression | Ex: Clustering | |
Supervised Learning: Labelled data
\(Data = Pattern + Error, \quad y = f(X) + \varepsilon\)
Unsupervised Learning: Unlabelled data
\(Data \propto Pattern, \quad X \propto Pattern\)
Reinforcement Learning: An intelligent agent should take actions in a dynamic environment to maximize a reward
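The labelled/unlabelled distinction can be sketched on made-up 1-D data (all numbers below are illustrative assumptions):

```python
# Supervised: {Y; X} available -- recover the pattern y = f(x) + error.
import statistics

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]           # labels: roughly y = 2x + noise
xbar, ybar = statistics.mean(xs), statistics.mean(ys)
slope = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) \
        / sum((x - xbar) ** 2 for x in xs)
print(round(slope, 1))               # pattern recovered from labelled pairs

# Unsupervised: only {X} available -- look for homogeneous groups.
X = [1.0, 1.2, 0.9, 8.0, 8.3, 7.9]
cut = (min(X) + max(X)) / 2          # naive midpoint split into two groups
groups = [x > cut for x in X]
print(groups)                        # [False, False, False, True, True, True]
```

With labels we estimate \(f\); without labels we can only describe structure in \(X\) (here, two homogeneous groups).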
Supervised Learning: main goal is to find:
\(Data=Pattern(s)+Error(s)\)
Example: Standard Regression
\(y=\beta_0+\beta_1 x_1+\beta_2 x_2+ \cdots + \beta_k x_k + \varepsilon\)
for some \(k \gg 2\)
This is equivalent to
\(Pattern=\beta_0+\beta_1 x_1+\beta_2 x_2+ \cdots + \beta_k x_k \quad \text{and} \quad Error=\varepsilon\)
Or put in another form:
\(\mu(x)=E[Y \mid X=x]\), estimated by \(\hat\mu(x)=\hat \beta_0+\hat \beta_1 x_1+\hat \beta_2 x_2+ \cdots +\hat \beta_k x_k\)
given \(E[\varepsilon \mid X]=0\), where \(\hat \beta_i\) are the estimated coefficients.
How to find the parameters, \(\hat \beta_i\): minimize the mean squared error
\(MSE=\frac{1}{N} \sum_{i=1}^{N} (y_i-\hat\mu(x_i))^2=\frac{1}{N} \sum_{i=1}^{N} \hat\varepsilon_i^2\)
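The MSE-minimizing coefficients have a closed form (the least-squares normal equations). A sketch on synthetic data, where the true coefficients and noise level are assumed for illustration:

```python
# Estimate beta by minimizing MSE: solve the normal equations (X'X) b = X'y.
import numpy as np

rng = np.random.default_rng(0)
N, k = 200, 2
beta_true = np.array([1.0, 2.0, -3.0])          # intercept, beta_1, beta_2 (assumed)
X = rng.normal(size=(N, k))
eps = rng.normal(scale=0.1, size=N)             # the error term
y = beta_true[0] + X @ beta_true[1:] + eps

Xd = np.column_stack([np.ones(N), X])           # design matrix with intercept
beta_hat = np.linalg.solve(Xd.T @ Xd, Xd.T @ y) # MSE-minimizing coefficients
mse = np.mean((y - Xd @ beta_hat) ** 2)

print(np.round(beta_hat, 1))                    # close to [ 1.  2. -3.]
print(mse)                                      # close to 0.1**2 = 0.01
```

With \(N \gg P\) and small noise, \(\hat\beta\) lands very near the true coefficients and the MSE approaches the error variance.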
In most cases, the number of observations, \(N\), is much greater than the number of covariates (parameters), \(P\): \(N \gg P\)
If the number of observations is similar to the number of covariates, \(N \sim P\), estimation becomes unreliable because few degrees of freedom remain
If \(N < P\), estimation fails (the normal equations have no unique solution)
\(\implies\) high dimensionality comes with difficulties.
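A small synthetic illustration of why \(N < P\) breaks the estimation: \(X'X\) becomes singular, so the normal equations cannot be solved uniquely.

```python
# Rank of X'X for random Gaussian design matrices of shape (N, P).
import numpy as np

rng = np.random.default_rng(1)

def xtx_rank(N, P):
    X = rng.normal(size=(N, P))
    return np.linalg.matrix_rank(X.T @ X)   # full rank P needed for a unique solution

print(xtx_rank(100, 5))   # N >> P: rank 5, unique least-squares solution
print(xtx_rank(3, 5))     # N < P : rank at most 3 < 5, X'X is singular
```

With \(N < P\), infinitely many coefficient vectors fit the data equally well, which is one concrete face of the difficulties of high dimensionality.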
| Data | Causal | Predictive | 
|---|---|---|
| Observational | Good/Bad | Good/Bad | 
| Experimental | Good/Bad | Good/Bad | 
Let us consider two variables, \(Y\) and \(X\), with a causal structure such that \(X\) causes \(Y\). The possible alternatives are:
Experiment to remove the effect of a potential confounder (a variable that influences both the dependent and the independent variable)
Split the sample randomly
It is then possible to conclude:
\(X \implies Y\)
\(Y\) does not cause \(X\): the sample is split by chance, so chance (not \(Y\)) determines \(X\)
A \(Z\) cannot systematically cause both, since \(X\) is assigned by chance
The result could still be due to chance
It could be due to selection, but selection should be excluded by the experimenter
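The confounder argument above can be checked in a small simulation (all coefficients and distributions are assumed for illustration): when \(Z\) drives both \(X\) and \(Y\), the observational slope of \(Y\) on \(X\) is biased, while randomly assigning \(X\) removes \(Z\)'s influence on it.

```python
# Confounded observational data vs. randomized (experimental) data.
import numpy as np

rng = np.random.default_rng(42)
n = 100_000
true_effect = 1.0                    # assumed causal effect of X on Y

def slope(x, y):
    """Simple regression slope of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x)

Z = rng.normal(size=n)                                  # confounder
# Observational: Z influences both X and Y
X_obs = Z + rng.normal(size=n)
Y_obs = true_effect * X_obs + 2.0 * Z + rng.normal(size=n)
# Experimental: X assigned by chance, independent of Z
X_exp = rng.normal(size=n)
Y_exp = true_effect * X_exp + 2.0 * Z + rng.normal(size=n)

print(round(slope(X_obs, Y_obs), 1))   # biased: ~2.0, not 1.0
print(round(slope(X_exp, Y_exp), 1))   # ~1.0: recovers the causal effect
```

Randomization breaks the \(Z \to X\) link, so the experimental slope estimates the causal effect while the observational slope mixes it with the confounder's influence.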
| Data | Causal | Predictive | 
|---|---|---|
| Observational | Bad | Good | 
| Experimental | Good | Bad | 
In economics, observational data sets are used for causal inference
In the \(Theories \implies Models \implies Validate \: with \: Data\) flow, the causal structure is dictated by \(Theories\). Hence the name Econometrics.
This means:
Correlation vs Causation must be discussed (this one is the main critique)
Error structure is important
Behavioral assessment of the model is crucial
Goodness of fit is not the main focus (though it is important)
Theories (Thought Exercise: Idea)
Lots of assumptions without a plausible way to test them (many of them are unrealistic)
Theories \(\implies\) Models
Estimate Models (OLS, IV Regression, Maximum Likelihood, GMM, etc.)
Conclude with estimated parameters and standard errors