This course aims to equip incoming and current Wharton doctoral students with the basic technical skills and tools required for empirical research. These include publicly available analysis tools (e.g., R and Python) and Wharton-specific resources (e.g., the Wharton grid computing cluster and WRDS). The course is primarily concerned with acquiring, cleaning, managing, and analyzing real-world datasets. It will also provide hands-on experience with a variety of computing tools, including machine learning and natural language processing techniques. By the end of this short course, students will have a better understanding of which tools are most appropriate for a given data analysis task.
There are no prerequisites for this course. Feel free to attend sessions selectively; auditing is welcome. Each session is roughly a 60-minute lecture followed by a 30-minute lab, during which you are encouraged to work on exercises. There is no exam. Please bring your own laptop.
- Dates: July 30 to August 15 (8 lectures, Monday/Wednesday/Friday)
- Time: 10:00-11:30am
- Location: F50 in JMHH
1. Course Intro, Command Line, R and Python Basics Mon 30 July, F50 in JMHH
2. More Intro to R & Python Wed 1 August, F50 in JMHH
3. Data Acquisition 1: Consuming APIs Fri 3 August, F50 in JMHH
Pre-class: Apply for a Yelp Fusion API key by following the Authentication instructions
Resources: Wharton Research Data Services (WRDS); Data Offered by Lippincott Library; Apply for Data from WCAI; REST API Tutorial; Documentation of Yelp Fusion API; Interfaces of Requests Package in Python; Other API Examples Available on Websites of Past Tech Camps (links at the bottom)
4. Data Acquisition 2: Web Scraping Mon 6 August, F50 in JMHH
5. Data Analysis: Summarization and Visualization, Causation vs. Prediction Wed 8 August, F50 in JMHH
6. Wharton HPCC and Behavioral Lab (Guest Speakers) Fri 10 August, F50 in JMHH
7. Intro to Machine Learning and Foundations of Deep Learning Mon 13 August, F50 in JMHH
Resources: An Introduction to Statistical Learning with Applications in R Book; Mining of Massive Datasets Book; Deep Learning Reading List by Dokyun Lee; Stanford CS231n Convolutional Neural Networks for Visual Recognition; Coursera Deep Learning Course; Deep Learning Textbook; WILDML Blog
8. Intro to Natural Language Processing Wed 15 August, F50 in JMHH
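As a preview of session 3 (Consuming APIs), here is a minimal sketch of a Yelp Fusion business search in Python. It is not taken from the course materials. It assumes your API key is stored in an environment variable named `YELP_API_KEY` (a hypothetical name chosen for this example), and the helper names `build_search_request` and `search_businesses` are likewise illustrative. It uses the standard library's `urllib` so that it runs without extra installs; the Requests package listed in the session resources offers a friendlier interface for the same call.

```python
import json
import os
import urllib.parse
import urllib.request

# Business search endpoint of the Yelp Fusion API (v3).
YELP_SEARCH_URL = "https://api.yelp.com/v3/businesses/search"


def build_search_request(api_key, term, location, limit=5):
    """Build (but do not send) an authenticated GET request for a business search."""
    query = urllib.parse.urlencode(
        {"term": term, "location": location, "limit": limit}
    )
    return urllib.request.Request(
        f"{YELP_SEARCH_URL}?{query}",
        # Yelp Fusion expects the API key as a bearer token.
        headers={"Authorization": f"Bearer {api_key}"},
    )


def search_businesses(api_key, term, location, limit=5):
    """Send the request and return the list of businesses from the JSON response."""
    request = build_search_request(api_key, term, location, limit)
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return payload["businesses"]


if __name__ == "__main__":
    # YELP_API_KEY is an assumed variable name; set it to the key you applied for.
    api_key = os.environ.get("YELP_API_KEY")
    if api_key:
        for business in search_businesses(api_key, "coffee", "Philadelphia, PA"):
            print(business["name"], business.get("rating"))
    else:
        print("Set YELP_API_KEY to run the live example.")
```

Separating request construction from sending makes it possible to inspect the URL and headers before spending any of your API quota.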