Introduction to data science DAT540

The course provides knowledge of and experience with data engineering tasks and familiarizes students with the data science project lifecycle.


Course description for study year 2022-2023

Facts
Course code

DAT540

Version

1

Credits (ECTS)

10

Semester tuition start

Autumn

Number of semesters

1

Exam semester

Autumn

Language of instruction

English

Content
Creating, managing, and utilizing data has become one of the most important challenges for practitioners in almost all disciplines, sectors, and industries. In this course, students become familiar with the basic tools and processes used in data science. Students work through the whole data lifecycle, from loading, through cleaning and modeling, to storing the data. The work is performed using a Python stack consisting of, among others, IPython, NumPy, Pandas, Matplotlib, and Jupyter Notebooks. Students learn to structure their work using CRISP-DM and the Data Science Process (Ask, Get, Explore, Model, Communicate and Visualize).
Learning outcome

Knowledge:

  • Execute/Develop tools to load, parse, clean, transform, merge, reshape, and store data.
  • Compare regular Python, NumPy, and Pandas data structures and choose one for the given problem.
  • Use the IPython shell and Jupyter notebook for exploratory computing.
  • Execute/Develop simple machine learning or data mining algorithms.

Skills:

  • Organize data analysis following CRISP-DM and the Data Science Process
  • Build engaging visualizations of data analysis using matplotlib
  • Optimize data analysis by applying available structures and methods
  • Evaluate, communicate and defend results of data analysis

General qualifications:

  • Solve real-world data analysis problems following a well-structured process
Required prerequisite knowledge
10 credits in courses related to programming, databases, or software engineering.
Recommended prerequisites
DAT120 Introduction to Programming, STA500 Probability and Statistics 2
Exam

Project work and written exam

Form of assessment              Weight  Duration  Marks          Aid
Project work in groups          3/5     -         Letter grades  -
Written exam (Multiple Choice)  2/5     3 Hours   Letter grades  -

Project work in groups
The project is completed in groups. Project work is to be performed at the times and in the groups that are assigned and published. Absence due to illness or for other reasons must be communicated to the lecturer as soon as possible. A project report, including source code, contributes to the grade. Students must present the project work orally in order to receive a grade in the course. If a student fails the project work, he/she has to take this part again the next time the course is taught.

Course teacher(s)
Course coordinator: Antorweep Chakravorty
Head of Department: Tom Ryen
Method of work
The work consists of 6 hours per week of lectures, scheduled laboratory sessions, and supervised group work. Students are expected to spend an additional 6-8 hours a week on self-study, group discussions, and development work.
Open for
Applied Data Science, Master of Science Degree Programme
Computer Science, Master of Science Degree Programme
Exchange programme at Faculty of Science and Technology
Literature
The syllabus can be found in Leganto