MPP2023-523

Data Science in Public Policy

Huỳnh Nhật Nam, Vladimir Yapit Mariano & Võ Tuấn Kiệt
Date: 26/10/2022 23:28; Size: 119,150 bytes
Please refer to Microsoft Teams

The first part of the course introduces the fundamental principles and concepts underlying common machine learning algorithms and their applications in business and policy. Students will first be introduced to the bias-variance tradeoff and overfitting, which are arguably the bedrock of developing effective predictive models. Techniques for identifying overfitting (e.g. cross-validation) and reducing it (e.g. regularisation) will be presented. While students may already be familiar with parametric approaches to predictive analytics (e.g. the regression models introduced in 'Quantitative Methods' courses), this course introduces non-parametric approaches, with a focus on classification problems, such as k-nearest neighbours, tree-based methods, and support vector machines. Students will also be introduced to the principles and application of popular unsupervised methods, such as hierarchical and k-means clustering.
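
As an informal illustration of why cross-validation helps identify overfitting, the sketch below fits a k-nearest neighbours classifier to synthetic data and compares its accuracy on the training data with its cross-validated accuracy. It is not taken from the course materials: the data set, the 20% label noise, and all function names (`make_data`, `knn_predict`, `cross_val_accuracy`) are assumptions made for this example, and it uses only the Python standard library.

```python
import random

def make_data(n, seed=0):
    """Synthetic 2-D points; the true class is 1 when x + y > 1 (assumed rule)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x, y = rng.uniform(0, 1), rng.uniform(0, 1)
        label = 1 if x + y > 1 else 0
        if rng.random() < 0.2:  # flip roughly 20% of labels to simulate noise
            label = 1 - label
        data.append(((x, y), label))
    return data

def knn_predict(train, point, k):
    """Majority vote among the k training points closest to `point`."""
    nearest = sorted(
        train,
        key=lambda row: (row[0][0] - point[0]) ** 2 + (row[0][1] - point[1]) ** 2,
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0  # k is odd here, so no ties occur

def accuracy(train, test, k):
    """Share of test points whose predicted label matches the recorded label."""
    hits = sum(knn_predict(train, p, k) == label for p, label in test)
    return hits / len(test)

def cross_val_accuracy(data, k, folds=5):
    """Average accuracy over `folds` held-out subsets (simple interleaved split)."""
    scores = []
    for i in range(folds):
        test = data[i::folds]
        train = [row for j, row in enumerate(data) if j % folds != i]
        scores.append(accuracy(train, test, k))
    return sum(scores) / len(scores)

data = make_data(200)
train_acc_k1 = accuracy(data, data, 1)    # evaluated on the data it was fitted to
cv_acc_k1 = cross_val_accuracy(data, 1)   # evaluated on held-out folds
cv_acc_k15 = cross_val_accuracy(data, 15)

print(f"k=1  training accuracy:        {train_acc_k1:.2f}")
print(f"k=1  cross-validated accuracy: {cv_acc_k1:.2f}")
print(f"k=15 cross-validated accuracy: {cv_acc_k15:.2f}")
```

With k = 1, every training point is its own nearest neighbour, so training accuracy is perfect even though the labels are noisy. The held-out folds give a more honest estimate, and the gap between the two numbers is the symptom of overfitting that cross-validation is meant to expose.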

The second part of the course introduces students to off-the-shelf artificial intelligence libraries for image processing, with a focus on, and hands-on exercises in, satellite image processing, facial recognition and vehicle counting.

The course will consist of lectures and computing exercises designed to help students practise, and thus better understand, the theoretical concepts presented in the lectures. The lectures are not intended to be mathematically intensive; mathematical detail will be provided only to the extent needed to understand the data science concepts and associated techniques. The programming language Python will be used to demonstrate these concepts and techniques. Students will be required to write code in Python for exercises, assignments and the final exam. While a brief introduction to Python for data manipulation will be provided, it is recommended that students have prior programming experience, either in Python or in another language.

The course will be taught in English (without interpretation).