WPI Worcester Polytechnic Institute

Computer Science Department
------------------------------------------

DS504/CS586 - Big Data Analytics - Fall 2016

Version: Aug 24th, 2016

------------------------------------------


Tentative Schedule:

Slides will be updated after each lecture.

-0. Week 1 (8/25 R, following Monday schedule): No Class; see link 1 and link 2.

-1. Week 2 (9/1 R):

    Topic 0: Overview of Big Data Analytics (Slides) and Class Logistics (Slides)
    Readings: N/A

-2. Week 3 (9/8 R):

    Topic 1: Big data acquisition and measurement (Slides).
    Readings: [ACM IMC 2011] Counting YouTube Videos via Random Prefix Sampling. (paper)
    Note: Project 1 starts.

-3. Week 4 (9/15 R):

    Topic 2: Big data Preprocessing/Cleaning. (Slides)
    Reading1: [ACM SIGSPATIAL GIS 2009] Map-Matching for Low-Sampling-Rate GPS Trajectories. (paper)
    Reading2: Section 3.5 in [ACM TIST] Trajectory Data Mining: An Overview. (paper)
    Optional: Section 3.1-3.4 in [ACM TIST] Trajectory Data Mining: An Overview.
    Red team: Team 2
    Topic 1: Big data acquisition and measurement. (paper presentation and discussion) (Slides)
    Readings: pp.1-5 Section 1 - Section 3.1 before Remark 1: [TKDE] Efficiently Estimating Statistics of Points of Interests on Maps (paper)
    Optional: pp.1-6 Section I - Section III: [ICDE'14] Region Sampling and Estimation of GeoSocial Data with Dynamic Range Calibration. (paper)
    Presenting Team: Team 1
    Red team: Team 4
    Note: Project 1 proposal is due.

-4. Week 5 (9/22 R):
    Topic 3: Big data Management. (Slides)
    Reading1: Section 4.1 in [ACM TIST] Trajectory Data Mining: An Overview. (paper)
    Reading2: [ACM CIKM 2015] Sampling Big Trajectory Data. (paper)
    Red team: Team 6
    Topic 2: Big data Preprocessing/Cleaning. (paper presentation and discussion) (Slides)
    Readings: [IEEE MDM 2010] An Interactive-Voting Based Map Matching Algorithm. (paper)
    Presenting Team: Team 3
    Red team: Team 1

-5. Week 6 (9/29 R):
    Topic 4: Big Graph Data Mining I (Sampling Large-Scale Networks via Random Walk). (Slides)
    Readings: M. Gjoka, M. Kurant, C. T. Butts, A. Markopoulou, Walking in Facebook: A Case Study of Unbiased Sampling of OSNs, INFOCOM 2010. (paper)
    Readings: Section 0 and Section 1. L. Lovasz, Random Walks on Graphs: A Survey, Combinatorics, Volume 2, 1993. (paper)
    Red team: Team 3
    Topic 3: Big Data Management. (paper presentation and discussion) (Slides)
    Readings: [ACM SIGMOD 2010] Searching Trajectories by Locations: An Efficiency Study. (paper)
    Presenting Team: Team 5
    Red team: Team 2

-6. Week 7 (10/6 R):
    Topic 4: Big Graph Data Mining II (Node Importance Ranking on Large-Scale Graphs/Networks). (Slides)
    Readings: E. Even-Dar and A. Shapira, A Note on Maximizing the Spread of Influence in Social Networks, WINE 2007. (paper)
    Readings: PageRank (Link), Hub and Authority (HITS) (Link)
    Red team: Team 5
    Topic 4: Big Graph Data Mining I. (paper presentation and discussion) (Slides)
    Readings: B. Ribeiro, Estimating and Sampling Graphs with Multidimensional Random Walks, IMC 2010. (paper)
    Presenting Team: Team 2
    Red team: Team 4
    Note: Project 2 starts.

-7. Week 8 (10/13 R):
    Topic 4: Big Graph Data Mining II. (paper presentation and discussion) (Slides)
    Readings: T. H. Haveliwala, Topic-Sensitive PageRank: A Context-Sensitive Ranking Algorithm for Web Search, TKDE Vol 15 Num 4, 2003. (paper)
    Presenting Team: Team 4
    Red team: Team 6
    Topic NA: Project 1 presentation (part 1) (Slides)
    Readings: N/A
    Note: Project 1 is due.

-8. Week 9 (10/20 R): No Class.

-9. Week 10 (10/27 R):
    Topic NA: Project 1 presentation (part 2)
    Readings: N/A
    Note: Project 2 proposal is due on Friday (10/28) at 11:59pm.

-10. Week 11 (11/3 R): Guest Lecture: Prof Xiangnan Kong
    Topic 6: Deep Neural Network I. (Slides)
    Readings: David Silver, et al., Mastering the game of Go with deep neural networks and tree search, Nature 2016. (paper)
    Red team: Team 3
    Topic 5: Big Data Clustering I. (paper presentation and discussion) (Slides)
    Readings: Jae-Gil Lee, Jiawei Han, Kyu-Young Whang, Trajectory Clustering: A Partition-and-Group Framework, SIGMOD 2007. (paper)
    Presenting Team: Team 6
    Red team: Team 5

-11. Week 12 (11/10 R):
    Topic 5: Big Data Clustering II. (Slides)
    Readings: Martin Ester, Hans-Peter Kriegel, Jorg Sander, Xiaowei Xu, A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise, ACM KDD 1996. (paper)
    Optional: Alexander Hinneburg and Hans-Henning Gabriel, DENCLUE 2.0: Fast Clustering based on Kernel Density Estimation, Advances in Intelligent Data Analysis VII, Springer Berlin Heidelberg, 2007, pp. 70-80. (paper)
    Red team: Team 2
    Topic 7: Recommender System I. (Slides)
    Readings: Jie Bao, Yu Zheng, and Mohamed F. Mokbel, Location-based and Preference-Aware Recommendation Using Sparse Geo-Social Networking Data, ACM SIGSPATIAL GIS 2012. (paper)
    Readings: Collaborative filtering (Wikipedia article)
    Red team: Team 4
    Note: Quiz 1 on Data Clustering.

-12. Week 13 (11/17 R): Guest Lecture: Prof Xiangnan Kong
    Topic 6: Deep Neural Network II. (Slides)
    Readings: No required readings.
    Red team: Team 6
    Topic 7: Recommender System. (paper presentation and discussion) (Slides)
    Readings: Jia-Dong Zhang, Chi-Yin Chow, iGSLR: Personalized Geo-Social Location Recommendation - A Kernel Density Estimation Approach, SIGSPATIAL GIS 2013. (paper)
    Optional: Nicholas Jing Yuan, Yu Zheng, Liuhang Zhang, Xing Xie, T-Finder: A Recommender System for Finding Passengers and Vacant Taxis, TKDE 2013. (paper)
    Presenting Team: Team 2
    Red team: Team 1

-13. Week 14 (11/24 R): Thanksgiving. No Class.

-14. Week 15 (12/1 R):
    Topic 8: Big Data Application (I). (Slides)
    Readings: Yu Zheng, Furui Liu, Hsun-Ping Hsieh, U-Air: When Urban Air Quality Inference Meets Big Data, SIGKDD 2013. (paper)
    Red team: Team 3
    Topic NA: Project 2 progress presentation and discussion
    Readings: N/A
    Note: Quiz 2 on Recommender Systems.

-15. Week 16 (12/8 R):
    Topic NA: Class Review and Discussions. (Slides)
    Readings: N/A
    Note: Post your Project 2 final reports in the discussion forum (by 12/13 Tue 11:59pm).
    Note: Submit your self-and-peer evaluation form for Project 2 (by 12/13 Tue 11:59pm).
    Topic 8: Big Data Application. (paper presentation and discussion) (Slides)
    Readings: Yexin Li, Yu Zheng, Huichu Zhang, Lei Chen, Traffic Prediction in a Bike-Sharing System, SIGSPATIAL GIS 2015. (paper)
    Presenting Team: Team 1
    Red team: Team 5

-16. Week 17 (12/15 R):
    Topic NA: Final project presentations, Teams 1-6.
    Readings: N/A



yli15 at wpi.edu