WPI Worcester Polytechnic Institute

Computer Science Department
------------------------------------------

DS504/CS586 - Big Data Analytics - Spring 2024

------------------------------------------


Class Information:

When/where: Tuesdays, 6:00pm - 8:50pm, in person in Salisbury Laboratories 411.
Web: http://wpi.edu/~yli15/courses/DS504CS586Spring24/

Instructor:

    Prof. Yanhua Li
    Email: yli15 at wpi.edu
    Website: http://wpi.edu/~yli15/
    Office hours:
    Tuesdays 12-1pm in UH384;
    other times by appointment

    TA: Mingzhi Hu
    Email: mhu3@wpi.edu
    Office hours:
    Mondays 10-11am on Zoom (the Zoom link is available on Canvas)
    Wednesdays 12-1pm in UH341
    Other times by appointment

Course Description:

    [Topics.] We are living in the age of big data, where data is measured in terabytes and zettabytes, streamed in real time, and generated at unprecedented speeds in diverse forms. Big data promises to change the world as we know it, from increased productivity at our workplaces to how we live our daily social lives. However, it also presents tremendous challenges as entities ranging from individuals, companies, organizations, and political groups to governments strive to gain insights from vast torrents of complex data. This course covers computational techniques and algorithms for measuring, analyzing, and mining patterns in large-scale datasets. Topics studied may include large-scale data sampling and estimation, data cleaning, management, clustering, etc. Real-world applications of these techniques, for instance urban computing, social media analysis, and recommender systems, are selectively discussed. As part of this course, we will read the literature and try our hand at this technology by conducting course projects.

    [Recommended background.] This is an *advanced* graduate course primarily targeted at second-year (or higher) Ph.D./MS students. Enrollment priority will be given first to CS/DS Ph.D. students who are working in big data analytics and related areas, then to other Ph.D. or MS students who have taken course(s) in databases and/or data mining, or who have equivalent knowledge. Sufficient programming experience and knowledge of data analytics (e.g., data mining, machine learning, optimization, or control theory) are expected, so that you are comfortable undertaking a course project. The course will focus on developing skills to solve real-world big data / data-driven problems, rather than on introducing the basics of data mining and machine learning techniques. If you are in doubt, please talk to the instructor.

    [Course structure.] This is not a lecture-based course. It is organized around four individual projects, each tied to one of the four big data analytics topics we cover, and a final team project.

Textbook:

    The topic is evolving, so no single comprehensive textbook contains the material we will study in this course. Instead, we will use a variety of sources, including publications from the primary literature and book chapters. These manuscripts will be provided to the class and/or linked from our schedule.

Coursework and Evaluation:

    The grading system for this course is A, B, C, NR (without +/-).
    Four Individual Projects: 15%, 20%, 20%, 20%.
    Team Project: 25%.
    Note: For the detailed breakdown of each part, see the Grading page; for the timing of critiques, presentation slides, and projects, see the "Important Dates" on the Projects page.

Course Objectives:

    Gain knowledge in fundamental principles, algorithms and technological advances in the field of big data analytics.
    Develop skills needed to critically read and make use of technical literature.
    Get practice designing a project or research agenda related to big data analytics.
    Learn to identify and acquire new knowledge on a chosen subject of interest.

Learning Outcomes:

    Upon completion of this course, students should be able to:

    Explain challenges and advances in the state of the art in big data analytics.
    Design, develop and fully execute a big data analytics project.
    Demonstrate skills to critically review technical literature and assess technological advances in big data analytics.
    Communicate their ideas effectively in the form of a presentation and written documents to a technical audience.


