This is the course offering for the academic year 2020-2021.


The course offers an introduction to techniques and methods for processing, mining, and searching in massive text collections. The course considers a broad variety of applications and provides an opportunity for hands-on experimentation with state-of-the-art algorithms using existing software tools and data collections.

Learning outcome

Knowledge:
  • Theory and practice of concepts, methods, and techniques for managing and analyzing large amounts of text data.

Skills:
  • Process and prepare large-scale textual data collections for retrieval and mining.
  • Apply retrieval, classification, and clustering methods to a range of information access problems.
  • Conduct performance evaluation and error analysis.

General competencies:
  • Understanding of the strengths and limitations of modern information retrieval and text mining techniques. Ability to identify promising business applications, and to participate in and lead such projects.

Contents

  • Search engine architecture
  • Text preprocessing and indexing
  • Retrieval models (vector-space model, probabilistic models, learning to rank, neural models)
  • Search engine evaluation
  • Query modeling, relevance feedback
  • Web search (crawling, indexing, link analysis)
  • Semantic search (knowledge bases, entity retrieval, entity linking)
  • Text clustering
  • Text categorization
  • Topic analysis
  • Opinion mining and sentiment analysis

Required prerequisite knowledge

None.

Exam

Project work and written exam
                Weight   Duration   Mark    Supporting materials
Project work    2/5                 A - F
Written exam    3/5      4 hours    A - F
The project is carried out individually or in groups of two or three, as set up by the course instructor. A student who fails the project must retake that part the next time the course is offered.
Permitted aids: all written and printed material, and a basic calculator.

Course teacher(s)

Course coordinator
Krisztian Balog
Head of Department
Tom Ryen

Method of work

6 hours of lectures/lab exercises each week.

Overlapping courses

Course                                   Reduction (credits)
Web Search and Data Mining (DAT630_1)    5

Open to

Admission to Single Courses at the Faculty of Science and Technology
Computer Science - Master's Degree Programme
Industrial Automation and Signal Processing - Master's Degree Programme - 5 year

Course assessment

Form and/or discussion.

Literature


Link to reading list


Last updated: 15.08.2020