CP421
Data Mining
0.5 Credit

Hours per week:
  • Lecture/Discussion: 3

This course is an entry-level study of information retrieval and data mining techniques: how to find relevant information and then extract meaningful patterns from it. While the basic theories and mathematical models of information retrieval and data mining are covered, the course focuses primarily on practical algorithms for textual document indexing, relevance ranking, web usage mining, and text analytics, as well as their performance evaluation. At the end of the course, students are expected to understand the following:
1. The common algorithms and techniques for information retrieval (document indexing and retrieval, query processing, etc.).
2. The quantitative evaluation methods for IR systems and data mining techniques.
3. The popular probabilistic retrieval methods and ranking principles.
4. The techniques and algorithms used in practical retrieval and data mining systems, such as those in web search engines and recommender systems.
5. The challenges and existing techniques for the emerging topics of MapReduce, portfolio retrieval, and online advertising.

Additional Course Information
Prerequisites
CP312, CP317.