This course is the second part of a two-semester data analytics sequence (MIS 301). Topics covered: a review of Data Analytics I course content, a review of programming software, cross-validation, dimension reduction approaches, feature selection, similarity measures, and clustering methods.
By the end of this course, students will be able to:
Apply linear regression to analyze relationships between variables.
Implement generative models for classification problems.
Use cross-validation and resampling methods to assess model performance.
Apply dimension reduction techniques.
Understand the basics of feature selection.
Explore and interpret clusters in unlabeled data.
Tentative Course Schedule
Week | Topic | Chapters | Key Activities/Assignments |
---|---|---|---|
1 | Introduction to Statistical Learning | 1, 2 | Overview, R Setup, Basic Commands |
2 | Simple and Multiple Linear Regression: Review | 3 | Lab: Linear Regression |
3 | Evaluation of Linear Regression | 3 | Lab: Linear Regression Evaluations |
4 | Model for Classification: Review | 4 | Lab: Logistic Regression, KNN |
5 | Cross-Validation | 5 | Lab: Cross-Validation Methods |
6 | Model Selection (Left For ML Course) | 6 | Lab: Model Selection Methods |
7 | Dimension Reduction (Left For ML Course) | 6 | Lab: Principal Components |
8 | Review Before Midterm Exam | | |
9 | Midterm Exam | ||
10 | Support Vector Machines | 9 | Lab: SVMs for Classification |
11 | Similarity Measures | 12 | Lab: Similarity Measures |
12 | Clustering Methods | 12 | K-Means, Hierarchical Clustering |
13 | Presentations | ||
14 | Recap: What Have We Learned | | |
Textbook: Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani. An Introduction to Statistical Learning with Applications in R (Second Edition).
Software: R, RStudio (download links and setup instructions will be provided), and relevant R packages.
Attendance: Regular attendance is expected and will be rewarded.
Late Submissions: Assignments submitted late will incur a penalty unless prior approval is granted.
Academic Integrity:
Academic integrity is fundamental to the academic mission of the university. Acts of academic dishonesty, including but not limited to plagiarism, cheating, fabrication, or unauthorized collaboration, undermine the learning process and violate university policies.
Specific guidelines include:
Plagiarism: Using someone else’s work, ideas, or words without proper attribution is strictly prohibited. This includes copying and pasting from any source, paraphrasing without citation, or submitting another person’s work as your own.
Cheating: Unauthorized use of materials, devices, or information during exams or assignments, including sharing or receiving answers, is not allowed.
Fabrication: Falsifying or inventing data, citations, or research is a breach of academic integrity.
Collaboration: While collaboration on group assignments may be permitted, sharing answers or work on individual tasks is not acceptable unless explicitly authorized.
Consequences: Violations of academic integrity will be addressed following the university’s academic policies, potentially leading to penalties such as assignment failure, course failure, or further disciplinary actions.