Course Description

This course is the first part of a two-semester data analytics sequence and covers the methods needed to extract knowledge from business data. The topics covered are: review of programming software, getting started with data, introduction to data analytics, data-driven thinking, causality, supervised and unsupervised learning, data and models, linear models, logistic regression, regression and classification trees, entropy, information gain, neighbor models, and distance as similarity.

Objectives

By the end of this course, students will be able to:

  1. Identify the appropriate data analytics method for a given problem

  2. Check data for errors and apply an appropriate method to clean the data for analysis

  3. Apply supervised learning methods (decision trees, KNN, regression) for classification

  4. Estimate the probability of an event occurring (e.g., default risk)

  5. Evaluate model performance

Course Schedule

Tentative Course Schedule

Week | Theme | Key Competencies
1 | Introduction | Content of the course
1, 2 | Why, Remember: Installing R, RStudio, Remember: RStudio 101 | Environment setup on Windows, CRAN
3, 4 | Steps, Statistical Learning: an Intro, Data - Model - Analysis | Learning from data, Causality
4, 5 | Data - Model - Analysis, Missing Value Treatment | Data Types, Exploratory Data Analysis, Summaries, Visualization, Missing Values and Treatment
5, 6 | A Brief Intro to Linear Regression Model | Linear Model, Linear Regression, Least Squares Estimation, Maximum Likelihood
7, 8 | Regression, Dependent: Categorical | Linear Regression, Logistic Regression, Probit Regression
9, 10 | Decision Trees | Classification and Regression Trees, Information Gain, Gini Gain
11 | K-Nearest Neighbors | Neighbor Models, KNN
12 | Similarity Measures | Similarity measures
13 | Term Project Presentations |
14 | Term Project Presentations |

Course Materials

Evaluation Criteria

Policies

Academic integrity is fundamental to the academic mission of the university. Acts of academic dishonesty, including but not limited to plagiarism, cheating, fabrication, or unauthorized collaboration, undermine the learning process and violate university policies.

Specific guidelines include:

  1. Plagiarism: Using someone else’s work, ideas, or words without proper attribution is strictly prohibited. This includes copying and pasting from any source, paraphrasing without citation, or submitting another person’s work as your own.

  2. Cheating: Unauthorized use of materials, devices, or information during exams or assignments, including sharing or receiving answers, is not allowed.

  3. Fabrication: Falsifying or inventing data, citations, or research is a breach of academic integrity.

  4. Collaboration: While collaboration on group assignments may be permitted, sharing answers or work on individual tasks is not acceptable unless explicitly authorized.

  5. Consequences: Violations of academic integrity will be addressed following the university’s academic policies, potentially leading to penalties such as assignment failure, course failure, or further disciplinary action.