- Lecture/Discussion: 3
- Lab: 1.5 (biweekly)
This course reinforces concepts and techniques introduced in DATA100. Students explore the data science lifecycle, including question formulation, data collection and cleaning, exploratory data analysis and visualization, statistical inference and prediction, and decision-making. The course focuses on quantitative critical thinking and on the key principles and techniques needed to carry out this lifecycle. Main topics include languages for transforming, querying and analyzing data; algorithms for machine learning methods, including regression, classification and clustering; principles behind creating informative data visualizations; statistical concepts of measurement error and prediction; and techniques for scalable data processing. Python is the primary programming language used to demonstrate the techniques. Other computational tools may include Excel and R.