Machine Learning – Business Applications
Course syllabus
Course title: Machine Learning – Business Applications
ECTS: 5
Semester: 4
Location: Universidad Técnica Federico Santa María (Chile)
Compulsory course: Yes – GEME track (Globalisation and Emerging Market Economies)
Lecturer & Contact: Sebastián Azócar M., Master’s in Data Science
Email: hizocar@gmail.com
Prerequisites: Applied Econometrics I and II
Learning outcomes and competences:
The general objective of the course is for students to acquire the knowledge and practice necessary to apply machine learning (ML). ML is a branch of computer science that uses algorithms to mimic the way humans learn. In this course, we will analyze the different techniques and statistical methods used in ML to make predictions in business applications.
Organisation / learning methods:
The course follows a learning-by-doing methodology, so each student's individual work is key (a study load of at least 3 hours per week is assumed). Each student is expected to read the assigned material before each class.
Course contents:
A. Data
- A. Data and Decision Making
- Different Types of Data
- Data manipulation
B. What is Machine Learning?
- B.1 Machine Learning Models
- What is a ML Model?
- Fitting a Model
- KNN
- Polynomial Regression
- Overfitting and Underfitting: Bias versus Variance
- The Cost Function
- The Training Error
- The Test Error
- B.2 The Machine Learning Pipeline
- The Bias-Variance Trade-Off
- Cross-Validation
- Applying the machine learning pipeline
C. Classification
- C.1 Logistic Regression
- What is classification?
- Technique and Methodology
- Measuring the Model Performance
- The ROC Curve
- C.2 Generative Models
- Basic Concepts
- The Naïve Bayesian Classifier
- Text Classification
- NLP Application: Measuring Text Sentiment
D. Trees and Forests
- D.1 Tree-Based Methods
- Structure of decision trees
- Types of tree-based methods
- Loss functions
- Tree pruning
- Regression Trees
- Classification Trees
- D.2 Ensemble Methods
- Basic Concepts
- Techniques and Methodology
- Bagging
- Random Forests
- Boosting
E. Selection
- E.1 Variable Selection
- Applications of Variable Selection
- Techniques and Methodology
- Best subset selection
- Stepwise, Backward and Forward Selection
- E.2 Shrinkage Methods
- Shrinkage versus Selection
- LASSO Regression
- Ridge Regression
- Elastic Net Regression
F. Unlabeled Data
- F.1 Dimension Reduction
- Unlabeled data
- Principal Components
- Application of PCA
- F.2 Clustering
- k-means clustering
- Hierarchical clustering
- Advantages and Limitations
- Practical Application
G. Introduction to Neural Networks
- G.1 Neural Networks
- Basic concepts
- Artificial neural networks (ANNs)
- The simple perceptron
- Structure of the ANN
- Methods
- G.2 Neural Networks Implementation
- Introduction
- Advantages and Limitations
- Training the Model
- Model Optimization
Readings / literature:
A. Readings
- Breiman, L. (2001). Statistical modeling: The two cultures (with comments and a rejoinder by the author). Statistical Science, 16(3), 199-231.
- Burgess, M. (2018). This is how Netflix’s secret recommendation system works. Wired.
- Castañón, J. (2019). 10 Machine Learning Methods that Every Data Scientist Should Know.
- Pant, A. (2019). Introduction to Machine Learning for Beginners. Retrieved in 2021.
- Zhang (2018). Data Types From A Machine Learning Perspective With Examples.
B. Readings
E. Readings
- ISLR sections 6.1 (Subset selection), 6.2 (Shrinkage methods)
- Lesson 4: Variable Selection
- Lesson 5: Shrinkage Methods
- Deol, G. (2019)
C. Readings
- ISLR sections 4.1, 4.2, 4.3, 4.6.2
- Lesson 9.1: Logistic Regression
- Asiri, S. (2018)
D. Readings
- ISLR Chapter 8
- Lesson 11: Tree-based Methods
- [Analytics Vidhya (2016). Tree Based Algorithms: A Complete Tutorial from Scratch (in R & Python)](https://www.analyticsvidhya.com/blog/2016/04/tree-based-algorithms-complete-tutorial-s