CS 495 - Python & Web Mining
Fall 2012: Wed 7:10pm-9:50pm, Dragas 1117

Instructor

Announcements

  • 08/29/12 - Course Calendar Updated.
  • 08/06/12 - Course Calendar Updated.
  • 07/12/12 - Class registration is open: CRN 31383.
  • 07/11/12 - The class webpage is up. Please check it frequently for updates and announcements.

Class Calendar


    Week 1: Wed Aug 29
    - Introduction and Administrivia
    - Introduction to Python
    Presenter/Slides: slides

    Week 2: Wed Sep 05
    - Python in Depth
    Presenter/Slides: slides

    Week 3: Wed Sep 12, Fri Sep 14
    - Introduction to Collective Intelligence & Machine Learning (Chapter 1)
    - Recommendation Systems (Chapter 2)
    - Submission of Assignment 1
    - Assignment 1 Demos
    - Open codes for peer-improvement phase.
    - Open peer-improvement submission.
    Presenter/Slides: slides, Code

    Week 4: Wed Sep 19, Fri Sep 21
    - Clustering (Chapter 3)
    - Submission of Assignment 2
    - Assignment 2 Demos
    - Peer-Improvement Presentations
    - Open codes for peer-improvement phase.
    - Open peer-improvement submission.
    Presenter/Slides: slides, Code

    Week 6: Wed Sep 26
    - No Class: Instructor at TPDL2012
    - Submission of Assignment 4

    Week 5: Wed Oct 03, Fri Oct 05
    - Document Filtering (Chapter 6)
    - Submission of Assignment 3
    - Assignment 3 Demos
    - Peer-Improvement Presentations
    - Open codes for peer-improvement phase.
    - Open peer-improvement submission.
    Presenter/Slides: slides, Code

    Week 7: Wed Oct 10, Fri Oct 12
    - Crawling, Searching, and Ranking (Chapter 4)
    - Assignment 4 Demos
    - Open codes for peer-improvement phase.
    - Open peer-improvement submission.
    Presenter/Slides: slides, Code

    Week 8: Wed Oct 17, Fri Oct 19
    - Optimization (Chapter 5)
    - Submission of Assignment 5
    - Assignment 5 Demos
    - Peer-Improvement Presentations
    - Open codes for peer-improvement phase.
    - Open peer-improvement submission.
    Presenter/Slides: slides, Code

    Week 9: Wed Oct 24, Fri Oct 26
    - Decision trees (Chapter 7)
    - Submission of Assignment 6
    - Assignment 6 Demos
    - Peer-Improvement Presentations
    - Open codes for peer-improvement phase.
    - Open peer-improvement submission.
    Presenter/Slides: slides

    Week 10: Wed Oct 31, Fri Nov 02
    - K-Nearest Neighbors (Chapter 8)
    - Submission of Assignment 7
    - Assignment 7 Demos
    - Peer-Improvement Presentations
    - Open codes for peer-improvement phase.
    - Open peer-improvement submission.
    Presenter/Slides: slides

    Week 11: Wed Nov 07, Fri Nov 09
    - Advanced Classification: Kernel Methods and SVMs (Chapter 9)
    - Submission of Assignment 8
    - Assignment 8 Demos
    - Peer-Improvement Presentations
    - Open codes for peer-improvement phase.
    - Open peer-improvement submission.
    Presenter/Slides: slides

    Week 12: Wed Nov 14, Fri Nov 16
    - Feature Extraction (Chapter 10)
    - Submission of Assignment 9
    - Assignment 9 Demos
    - Peer-Improvement Presentations
    - Open codes for peer-improvement phase.
    - Open peer-improvement submission.
    Presenter/Slides: slides

    Week 13: Wed Nov 21
    - No Class...Turkey Week!
    - Submission of Assignment 10

    Week 14: Wed Nov 28, Fri Nov 30
    - Evolving Intelligence: Genetic Programming (Chapter 11)
    - Assignment 10 Demos
    - Peer-Improvement Presentations
    - Open codes for peer-improvement phase.
    - Open peer-improvement submission.
    Presenter/Slides: slides

    Week 15: Wed Dec 05, Fri Dec 07
    - Submission of Assignment 11
    - Assignment 11 Demos
    - Peer-Improvement Presentations
    - Open codes for peer-improvement phase.
    - Open peer-improvement submission.

    Week 16: Wed Dec 12
    - No Class: Exams Week

 

Course Description

A survey of web data mining techniques. Students will learn to program in Python, learn the mathematical techniques for mining the web, and interact with the web APIs of popular websites to collect data. The course has two parts: the first covers the Python programming language and how to build applications; the second covers web mining, including recommendation systems, clustering, ranking, optimization, classifiers, decision trees, k-nearest neighbors, kernel methods and support vector machines, feature extraction, and genetic programming. Grades will be based on completing assignments from the textbook and on class participation. Students will learn when and how to apply the various web mining techniques in real applications. Each week throughout the semester there will be extra challenges, which earn extra credit upon successful completion.

 

Course Overview

  • In this course you will learn to program in Python, progressing from novice to expert.
  • You will learn how the web works and how search engines function. You will also learn web mining techniques, including recommendation systems, clustering, ranking, optimization, classifiers, decision trees, k-nearest neighbors, kernel methods and support vector machines, feature extraction, and genetic programming.
  • The hands-on coursework and projects will enable you to apply your Python programming skills, along with web mining techniques, to build real, useful applications.
  • Extra credit will be awarded on peer improvement as described in the slides of lecture 1.
  • Students will work individually on weekly assignments and will be required to present their work on a weekly basis through a speed demo.
  • There will be no exams in this course. Marks will be awarded for weekly assignments, assignment demos, peer-improvement extra credit opportunities, and class participation.
  • Students are encouraged to bring their laptops to class for the code walkthrough sections.

 

CRN Identifier

The CRN identifier for registration is: CRN 31383.

 

Syllabus

You can find a detailed version of the syllabus here: CS495-Syllabus

 

Text

The required text will be:

    Programming Collective Intelligence: Building Smart Web 2.0 Applications By Toby Segaran [$26.39 at Amazon].


Recommended but not required purchases: 

 

Class Mailing List

Students should join this group/mailing list: CS495-fall12