Course Description


Big Data Analytics: Selected Topics in Data Mining and Statistical Methods

This course provides a fast-track, hands-on introduction to big data analytics from the angles of data mining and statistical methods. The focus is on essential concepts and techniques, as well as the fundamental principles of building applications. This is an advanced graduate course. The audience should be familiar with discrete mathematics (including the basics of set theory, abstract algebra, logic, and graph theory), algorithm analysis and design, and basic probability and statistics. Programming experience in C++/Java and R/Matlab is expected.

This is a five-day course, with 3 hours every morning and 2 hours every afternoon. Because it is a condensed version of a full graduate course, the schedule below is tentative and subject to change without notice. The content may also be customized according to the audience's interests.

Course Schedule


6/29 Day 1:

- Introduction (big data, data mining, statistical data analytics, and applications)
- Sampling methods (concepts, basic ideas, Fisher information, simple random sampling, confidence intervals and sample size)

6/30 Day 2:

- Sampling: advanced topics (estimating proportions, sampling with unequal probabilities, stratified and cluster/systematic sampling, multistage design, network sampling)

7/1 Day 3:

- Statistical methods (EM, Monte Carlo)
- Afternoon of July 1: free time for participants to rest and review

7/2 Day 4:

- Statistical methods (continued): Markov chain Monte Carlo
- Nonparametric density estimation

7/3 Day 5:

- Big data analytics application examples (statistical mining in social networks and social media, healthcare informatics, public health and population/organization health)
- Summary: brainstorming for future work