# Applied Data Analytics and Statistics for Spatial and Temporal Modeling MOD550

Course description for study year 2021-2022

Facts

Course code: MOD550
Version: 1
Credits (ECTS): 10
Semester teaching starts: Spring
Number of semesters: 1
Language of instruction: English
Offered by: Faculty of Science and Technology, Department of Energy Resources
Learning outcome

Knowledge:

• Basic understanding of data, statistics, and probability

• Exploratory data analysis for univariate, bivariate, and multivariate data

• Probability distributions and models thereof

• Fitting distribution to data

• Predictive modeling

• Regression modeling and analysis

• Multivariate data analysis including principal component, cluster, and discriminant analysis

• Applying machine learning techniques (e.g., random forest, gradient boosting machine, support vector regression, and kriging model) for predictive modeling

• Translating model input uncertainty into uncertainty in model predictions using Monte Carlo simulation

• Sensitivity analysis and the information it provides

• Visualization and reporting that convey information to decision-makers as input for a decision

Skills:

• Have the skills needed to build a good spatial or temporal model and to use it in generating powerful insights into the decision situation

• Have the skills needed to implement the basic statistical methods to analyze data and to estimate and simulate spatial and temporal properties conditioned on data by using Python and other programming tools

General qualifications:

• Students should understand fundamental logical principles and analyses and be able to communicate their choices and recommendations clearly.

Content

Statistics as a traditional science has considerable limitations when applied in spatial or temporal contexts. Key to these fields are the spatial and temporal aspects of the data. For example, a spatial sample or measurement is often attached to a spatial coordinate (x, y, z) describing where the sample was taken. Traditional statistics very often neglects this spatial context and simply works with the data as they are. However, from our own experience (e.g., geological or environmental), we know that samples located close together are more "related" to each other, and this relationship may be useful when interpreting our data. In this course, we deal explicitly with data distributed in space or time and aim to model the spatial or temporal relationship between data.
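The idea that nearby samples are more "related" can be made concrete with an empirical semivariogram, which averages half the squared difference between sample values as a function of their separation distance. The following is a minimal sketch only, assuming NumPy is available; the function name `empirical_semivariogram`, the bin edges, and the synthetic data are illustrative assumptions and are not taken from the course material.

```python
import numpy as np

def empirical_semivariogram(coords, values, bin_edges):
    """Mean semivariance 0.5 * (z_i - z_j)**2 of all sample pairs, per distance bin."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    # Index every unordered pair of samples exactly once.
    i, j = np.triu_indices(len(values), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    semivar = 0.5 * (values[i] - values[j]) ** 2
    gamma = np.full(len(bin_edges) - 1, np.nan)  # NaN marks an empty bin
    for b in range(len(bin_edges) - 1):
        in_bin = (dist >= bin_edges[b]) & (dist < bin_edges[b + 1])
        if in_bin.any():
            gamma[b] = semivar[in_bin].mean()
    return gamma

# Synthetic (hypothetical) data: a smooth trend in x plus a little noise.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 100, size=(200, 2))
values = np.sin(coords[:, 0] / 20.0) + 0.1 * rng.standard_normal(200)

gamma = empirical_semivariogram(coords, values, np.linspace(0, 50, 6))
print(gamma)  # semivariance per distance bin, shortest distances first
```

For spatially correlated data such as this synthetic example, the semivariance in the shortest-distance bin is smaller than in the longest-distance bin, which is the numerical counterpart of "closer samples are more alike".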

The focus of this course is the basic science, technology and related assumptions involved in applying statistics in spatial or temporal contexts. The emphasis is on providing students with knowledge of the fundamentals of statistics most relevant for spatial or temporal data.

The core of the course is data analysis and the construction of static spatial and temporal models. Data sources, quality, relevance, and choice of modeling techniques will be covered. This is followed by classical gridding, mapping, and contouring. Kriging is introduced as a data-driven (variogram-based) form of classical mapping (estimation) and as a means of data integration. Simulation techniques are introduced as a means of modeling heterogeneity and uncertainty. Machine learning techniques, such as regression modeling and analysis as well as multivariate data analysis, will be introduced and applied. Python and other programming tools will be used for modeling, preparing spatial and temporal data, scripting statistical workflows, and constructing visualizations to communicate model and analysis results.
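As an illustration of how a variogram drives kriging, the sketch below solves the ordinary kriging system for a single target location. It is a minimal example under stated assumptions, not the course's implementation: the exponential variogram model, its parameters (`sill`, `range_`, `nugget`), the function names, and the four sample points are all hypothetical choices made for illustration.

```python
import numpy as np

def exponential_variogram(h, sill=1.0, range_=30.0, nugget=0.0):
    """Exponential variogram model (assumed here); gamma(0) is defined as 0."""
    h = np.asarray(h, dtype=float)
    return np.where(h > 0, nugget + sill * (1.0 - np.exp(-3.0 * h / range_)), 0.0)

def ordinary_kriging(coords, values, target, variogram):
    """Ordinary kriging estimate, kriging variance, and weights at one location."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    target = np.asarray(target, dtype=float)
    n = len(values)
    # Pairwise distances between all data locations.
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    # Kriging matrix: variogram values bordered by ones (unbiasedness constraint).
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = variogram(d)
    A[n, n] = 0.0
    # Right-hand side: variogram between each data location and the target.
    b = np.ones(n + 1)
    b[:n] = variogram(np.linalg.norm(coords - target, axis=1))
    solution = np.linalg.solve(A, b)
    weights, lagrange = solution[:n], solution[n]
    estimate = weights @ values
    variance = weights @ b[:n] + lagrange
    return estimate, variance, weights

# Hypothetical samples at the corners of a 10 x 10 square.
coords = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
values = np.array([1.0, 2.0, 3.0, 4.0])

estimate, variance, weights = ordinary_kriging(
    coords, values, [5.0, 5.0], exponential_variogram
)
print(estimate, variance, weights)
```

Because the target sits at the center of the square, symmetry gives each sample a weight of 0.25 and an estimate of about 2.5. Kriging at one of the data locations returns that sample's value with zero kriging variance, since the nugget is assumed to be zero.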

What are the benefits of building and using spatial and temporal models, as opposed to relying on mental models or just "gut feel"? The primary purpose of modeling is to generate decision insight, by which we mean an improved understanding of the decision situation at hand. While mathematical models consist of numbers and symbols, the real benefit of using them is to make better decisions. Better decisions result from improved understanding, not just from the numbers themselves.

Required prerequisite knowledge
None
Exam / assessment

Written exam, portfolio evaluation and project report

The overall course grade is based on continuous evaluation comprising a final written exam, a modeling project, and a portfolio. Each element is graded as a percentage, while the overall course grade is a letter grade. The final grade is made up of:
• 30% Written exam
• 40% Project report*
• 30% Portfolio, which consists of 4 written assignments

*The project is an extended analysis that must be presented in a written report of no more than 20 pages.

A resit exam is offered for students who do not pass the written exam. Students who do not pass the project report or portfolio, or who want to improve their grade in them, must retake these assessment parts when the course is offered again.

Coursework requirements
Lectures and compulsory exercises.
Course teacher(s)
Course coordinator: Reidar Brumer Bratvold
Course coordinator: Aojie Hong
Head of Department: Alejandro Escalona Varela
Method of work
The work consists of 6 hours of lectures and scheduled tutorials per week. Students are expected to spend an additional 6-8 hours per week on self-study, assignments, and the project.
Literature
The syllabus can be found in Leganto