This course offers hands-on experience for incoming and current PhD students interested in data-driven research. It uses a real-world example to show how to collect, clean, and analyze data from beginning to end, and briefly covers machine learning and natural language processing tools that can facilitate data analysis. The goal of this short course is to build familiarity with data-related skills and tools, so that students know where to start and how to proceed in their future data-related research.
There are no prerequisites for this course. Feel free to attend the sessions selectively. Auditing is welcome (please register). There will be practice assignments after each session, but no exam. Please bring your own laptop.
1. R and Python Basics. Version control.
2. Data Collection: Facebook, Twitter, and general Web pages.
3. Data Cleaning: data.table (R).
4. Data Analysis: Data Mining vs. Causal Inference.
6. Advanced Data Analysis: estimating your own model.
7. Big Data Introduction.
8. Natural Language Processing.
Setup: whl files
Time and Location
3:00pm-4:30pm, Monday/Wednesday/Friday, July 31-August 17 (8 sessions), JMHH F55