This is the study programme for 2019/2020. It is subject to change.


The course offers an introduction to techniques and methods for processing, mining, and searching in massive text collections. The course considers a broad variety of applications and provides an opportunity for hands-on experimentation with state-of-the-art algorithms using existing software tools and data collections.

Learning outcome

Knowledge:
  • Theory and practice of concepts, methods, and techniques for managing and analyzing large amounts of text data.

Skills:
  • Process and prepare large-scale textual data collections for retrieval and mining.
  • Apply retrieval, classification, and clustering methods to a range of information access problems.
  • Conduct performance evaluation and error analysis.

General competencies:
  • Understanding of the strengths and limitations of modern information retrieval and text mining techniques. Ability to identify promising business applications and to participate in and lead such projects.

Contents

  • Search engine architecture
  • Text preprocessing and indexing
  • Retrieval models (vector-space model, probabilistic models, learning to rank, neural models)
  • Search engine evaluation
  • Query modeling, relevance feedback
  • Web search (crawling, indexing, link analysis)
  • Semantic search (knowledge bases, entity retrieval, entity linking)
  • Text clustering
  • Text categorization
  • Topic analysis (PLSA, LSA)

Required prerequisite knowledge

None.

Exam

Project work and written exam
               Weight   Duration   Marks   Aid
Project work   2/5                 A - F
Written exam   3/5      4 hours    A - F

Course teacher(s)

Course coordinator
Krisztian Balog
Head of Department
Tom Ryen

Method of work

6 hours of lectures/lab exercises each week.

Overlapping courses

Course                                   Reduction (SP)
Web Search and Data Mining (DAT630_1)    5

Open to

Master students at the Faculty of Science and Technology.

Course assessment

Form and/or discussion.

Literature

Text Data Management and Analysis: A Practical Introduction to Information Retrieval and Text Mining (Zhai and Massung), ACM and Morgan & Claypool Publishers, 2016.



Last updated: 15.12.2019