
CSC 380 : Intro to Data Science  

Archived  | Online Asynchronous | Summer 2025


Enfa Fane
Instructor


Bennett Brixen
Teaching Assistant

Course Format and Teaching Methods

  • Asynchronous Online Lectures: Pre-recorded lectures will be provided for students to access at their convenience.

  • Discussions on Piazza: Piazza will be used for students to ask questions, engage in discussions, and seek clarifications from the instructor and teaching assistant.

  • Project-Based Homeworks: Assignments will be designed to apply data science techniques to real-world problems.

  • Weekly Check-ins: Regular individual check-ins via a Google form. These will include questions and discussions related to the reading, and will give students a place to raise concerns and provide live feedback on the course.

Course Content

I used materials from Prof. Jason Pacheco's class (Fall 2021) and Prof. John P. Dickerson's lectures in CMSC641, along with a few new topics that previous instructors of the course advised me to include: Prof. Kwang-Sung Jun (Fall & Spring 2022), Prof. Chicheng Zhang and Prof. Kyoungseok Jang (Spring 2023).

Week 1:  Course Introduction and setup.
 

  • Revision of basics needed for the course.

  • Welcome & Introduction

  • Introduction to Data Science

​

Week 2: Applied Probability and Statistics (1/2)
 

  • Random Events and Probability

  • Moments and Independence

  • Statistics & Bayesian Probability

  • Binomial Probability

​

Week 3: Data Collection and Data Processing, Exploratory Analysis

​

  • Introduction to Pandas

  • Data Collection (Part 1 of 2)

  • Data Collection (Part 2 of 2) & Data Processing (Part 1)

  • New topics added in this iteration include data scraping using the requests and BeautifulSoup libraries, querying APIs, and a brief introduction to SQL databases with a focus on joins.
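As a rough illustration of the SQL joins mentioned above, the following minimal sketch uses Python's built-in sqlite3 module. The table names, columns, and rows are hypothetical and are not taken from the course materials; the sketch only contrasts an inner join with a left join.

```python
import sqlite3

# In-memory database; the tables and rows below are made-up illustration data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE grades (student_id INTEGER, homework TEXT, score REAL);
    INSERT INTO students VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Alan');
    INSERT INTO grades VALUES (1, 'HW1', 95.0), (2, 'HW1', 88.0);
""")

# INNER JOIN keeps only students that have a matching row in grades.
inner_rows = conn.execute("""
    SELECT s.name, g.score
    FROM students AS s
    INNER JOIN grades AS g ON g.student_id = s.id
    ORDER BY s.id
""").fetchall()

# LEFT JOIN keeps every student; a missing grade comes back as None (SQL NULL).
left_rows = conn.execute("""
    SELECT s.name, g.score
    FROM students AS s
    LEFT JOIN grades AS g ON g.student_id = s.id
    ORDER BY s.id
""").fetchall()

conn.close()
print(inner_rows)
print(left_rows)
```

The student without a grade row is dropped by the inner join but kept by the left join, which is usually the first thing to check when a join returns fewer rows than expected.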

​

Week 4: Data Visualization, Introduction to ML

 

  • Pandas (continued)

  • Data Preprocessing - Part 1

  • New topics added in this iteration include processing tabular, audio, and image data.


Week 5: Supervised Learning, Model Assessment

​​​

  • Data Preprocessing - Part 2 & Data Visualization

​

Week 6: Unsupervised Learning

​

  • Data Visualization and Introduction to Machine Learning

  • Machine Learning - Key Concepts & Supervised ML: Linear Regression

  • Hands-on Demo

​

Week 7: ML continued

​

  • Data Preprocessing and ML 

  • Evaluation and Machine Learning (continued)

  • Algorithms discussed include Linear Regression, Naive Bayes, K-means, Decision Tree, Logistic Regression, and KNN.

​

Week 8: Applied Probability and Statistics (2/2)

​

  • Useful Discrete Distributions

  • Useful Continuous Distributions and MLE

  • Wrapping up Statistics and Probability


Week 9: Deep Learning
 

  • Neural Networks

* Less focus on lectures in the last week; the focus was on one-on-one attention for each student's final project. *

Graded Work 

Homeworks

Homework 1: Statistics

Introduced fundamental statistical concepts relevant to data science.

​

Homework 2: Data Collection & Pandas

Focused on practical data collection methods. Students collected data from an API and through web scraping, then used the Pandas library to clean and explore the data.

​

Homework 3: Linear Regression Project

An end-to-end data science assignment where students applied linear regression to a dataset. They were responsible for data preparation, model building, evaluation, and interpretation.
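The steps named above (data preparation, model building, evaluation, interpretation) can be sketched in miniature as follows. This is only an illustrative sketch under stated assumptions: the toy data, the helper names `fit_line` and `r_squared`, and the choice of a single feature with a simple hold-out split are all hypothetical and do not reflect the actual assignment or its dataset.

```python
# Hypothetical toy data: hours studied (x) and exam score (y).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [52.0, 55.0, 61.0, 64.0, 70.0, 73.0]


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination of the fitted line on the given points."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(xs, ys))
    ss_tot = sum((b - mean_y) ** 2 for b in ys)
    return 1.0 - ss_res / ss_tot


# Data preparation: hold out the last two points for evaluation.
x_train, y_train = x[:4], y[:4]
x_test, y_test = x[4:], y[4:]

# Model building: fit on the training points only.
slope, intercept = fit_line(x_train, y_train)

# Evaluation: score the fitted line on the held-out points.
test_r2 = r_squared(x_test, y_test, slope, intercept)

# Interpretation: the slope is the predicted change in y per unit of x.
print(f"slope={slope:.2f}, intercept={intercept:.2f}, test R^2={test_r2:.2f}")
```

Evaluating on held-out points rather than on the training points is the design choice worth noting here: a score computed on the data used for fitting tends to overstate how well the model generalizes.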

​

Homework 4: Independent Data Science Project

Each student selected a unique dataset (approved by the TA/instructor) that had minimal existing code or analysis online. Using a detailed prompt and grading rubric, they completed an end-to-end data science project. Students had to adapt the general instructions to fit their dataset and problem domain.

Students met with a TA or instructor for a 30-minute project check-in, during which they received feedback and suggestions.

Final

For the final, students built on Homework 4 by addressing a new set of prompts, with an expanded set of machine learning models to choose from. Other additions included:

  • Explaining model choice and how it works

  • Justifying data cleaning and preprocessing strategies

  • Evaluating model performance

  • Identifying limitations

  • Responding to an ethics-related question

Students submitted both HW4 and the Final as a single Jupyter notebook and were also required to submit a README file for their project repository.

Participation Grade

4 activities worth a total of 5% of the grade
 

  • Easter Egg in Syllabus: a message hidden in the syllabus asking students to send their favorite movie and a meme (1 point)

  • Jupyter Notebook Setup, marked as Homework 0 (1 point)

  • GitHub Setup (1 point)

  • Quiz on neural networks (4 points)

Weekly Check-Ins

Regular individual check-ins via a Google form. These will include questions and discussions related to the reading, and will give students a place to raise concerns and provide live feedback on the course.


Last updated on July 11, 2025

© 2025 Enfa Fane
