Big Data Analysis for Social Sciences BST240

Big Data Analysis is about using computer-assisted methods to analyze large amounts of data. The course explores methods such as topic modelling and sentiment analysis using the R statistical programming language. Because the course does not require any prior programming knowledge, the general form and function of (R) code will also be discussed. Although most of the methods covered apply to many kinds of data, the course focuses specifically on text data, such as newspaper articles.


Course description for study year 2020-2021

Facts
Course code: BST240
Version: 1
Credits (ECTS): 10
Semester tuition start: Spring
Number of semesters: 1
Exam semester: Spring
Language of instruction: English
Offered by: Faculty of Social Sciences, Department of Media and Social Sciences

Learning outcome
After completing this course, students will be able to understand and explain the main concepts related to machine learning and automated text analysis. They will also be able to apply this knowledge to develop research questions and designs that are suitable for use with machine learning methods. Finally, students will be able to write code in R/RStudio to manage their data and carry out simple machine learning tasks.
Content

The first part of the course consists of a general introduction to (coding with) R and a theoretical introduction to the different kinds of machine learning methods available. At the end of the first part, students will write a (graded) research proposal to demonstrate their understanding of the main concepts in big data analysis and to relate these concepts to a subject of personal interest.

During the second part of the course, students will be placed into small groups based on their interests and will merge their separate proposals into a single research proposal. They will then use the methods learned in the first part to answer the research question(s) in this proposal and write a report, which will be graded.

The course will consist of the following subjects:

  • Data structures (csv, json, databases)
  • Introduction to R
  • Introduction to RStudio
  • Preprocessing
  • Supervised learning
  • Unsupervised learning

Note that participation in this course is limited to 12 students. Students must also have successfully completed the Quantitative Methods course before taking this course. Eligible students are admitted in order of registration (first come, first served).

A well-functioning laptop is required. The minimum system requirements are an Intel Core i3 processor or equivalent and at least 4 GB of RAM. An Intel Core i5 or equivalent and at least 6 GB of RAM are recommended. When in doubt, please contact the course coordinator.

Required prerequisite knowledge
BSS300 Quantitative research methods
Exam / assessment

Individual research proposal and Joint research report

Form of assessment | Weight | Duration | Grade | Aids
Individual research proposal | 3/5 (60%) | | A - F |
Joint research report | 2/5 (40%) | | A - F |

Course teacher(s)
Course coordinator: Erik de Vries
Method of work
  • Lectures
  • Group work
  • Individual work
  • Supervisory group meetings

Literature
Search for literature in Leganto