CS61061: Data Analytics

Class timing

Autumn Semester 2024
Wednesday: 12:00 - 12:55 PM | Thursday: 11:00 - 11:55 AM | Friday: 9:00 - 10:55 AM

Venue

NR411 (Nalanda Classroom Complex)

MS Teams

Please join the Teams channel through this link.

Reading materials

  • Introduction to Data Mining (2nd Ed.) by Pang-Ning Tan, Michael Steinbach, Anuj Karpatne and Vipin Kumar.
  • Data Mining: The Textbook by Charu Aggarwal.
  • Python for Data Analysis by Wes McKinney.
  • Lecture slides and relevant online tutorials.

Tentative course content

  1. Data Collection and Preprocessing
    • Data collection methods
    • Different types of data
    • Data quality and preprocessing techniques
    • Handling missing data, outliers, and noise
    • Data transformation and normalization
  2. Exploratory Data Analysis
    • Descriptive statistics
    • Data visualization techniques
    • Identifying patterns and trends
    • Tools: Pandas, Matplotlib, Seaborn
  3. Classification Techniques
    • Decision trees
    • k-Nearest Neighbors (k-NN)
    • Naive Bayes
    • Support Vector Machines (SVM)
  4. Clustering Techniques
  5. Association Rule Mining
  6. Data Ethics and Privacy
    • Ethical considerations in data analytics
    • Data privacy
    • Regulations and compliance (GDPR, CCPA)

Pre-requisites for the course

  • Data structures and algorithms.
  • Working knowledge of AI/ML would be a plus.
  • Programming experience (required for the assignments).

Course evaluation

  • Mid-term exam: 35%
  • End-term exam: 40%
  • Assignments: 20%
  • Attendance and class participation: 5%