- Theory and practice of data preparation, selection and mining.
- Concepts, methods, and techniques to gain insights from large-scale data.
- Frequent itemset mining, association rule mining, clustering, classification, graph and stream mining
- Process and prepare large-scale data for various data mining tasks
- Implement data mining pipelines, evaluate, and tune parameters for various data mining models using state-of-the-art tools
- Identify the theoretical and practical issues behind various data mining techniques; list and describe the strengths, limitations, and trade-offs among them; and choose appropriate techniques for solving data science problems in various applications.
- Data cleansing, transformation and preparation
- Dimensionality reduction
- Recommendation systems
- Graph mining
- Neural networks and deep learning
- Mining frequent patterns, associations and correlations
- Mining data streams
Required prerequisite knowledge
|Written exam|3/5|4 hours|A - F|1)|
|Project report|2/5||A - F||
All programming exercises must be passed in order to sit the written exam and to have the project approved. Mandatory lab assignments must be completed at the times and in the groups assigned. Absence due to illness or other reasons must be reported to the laboratory personnel as soon as possible. Students cannot expect arrangements for completing lab assignments at other times unless this has been agreed with the laboratory personnel in advance. Failure to complete the assigned labs on time, or to have them approved, will result in being barred from taking the course exam.
Method of work
|Web Search and Data Mining (DAT630_1)|5|
Computer Science - Master's Degree Programme
- Data Mining: Practical Machine Learning Tools and Techniques, 3rd edition, by Ian H. Witten, Eibe Frank, Mark A. Hall
- Introduction to Data Mining, 2nd edition, by Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Anuj Karpatne
- For labs: Python Data Science Handbook, by Jake VanderPlas, https://jakevdp.github.io/PythonDataScienceHandbook/ (free ebook available; no need to buy)
Last updated: 25.01.2020