Statistics: Analysis of Textual Data
Start
Spring 2026
Level
Master's
Language
English
Place of study
Lund
Course code
STAN49
The course provides an introduction to statistical analysis of text. You will study both methods based on classic statistical approaches (including Bayesian models) and modern approaches such as deep learning and large language models. Topics covered include:
- Different ways to represent text to facilitate statistical analysis
- Techniques for classification of text
- Text clustering
- Techniques for identifying different topics/themes in text (topic modelling)
- Techniques for identifying the emotions or sentiments in text (sentiment analysis)
- Methods for text summarisation.
Course literature
The course literature listed may be updated up to eight weeks before the course begins.
Course literature STAN49 (PDF)
Teaching consists of lectures and exercises. You will be assessed through quizzes, where you demonstrate your theoretical knowledge, and through written assignments, where you have the opportunity to show your practical skills.
The course is designed to be taken in parallel with STAN47 Statistics: Deep Learning and Artificial Intelligence Methods.
Prerequisites
STAN48 Statistics: Advanced Statistical Programming and STAN52 Statistics: Advanced Machine Learning, or the equivalent.
Selection criteria
Seats are allocated according to: ECTS (HPAV): 100 %.
Tuition fees for non-EU/EEA citizens
Citizens of countries outside:
- The European Union (EU)
- The European Economic Area (EEA) and
- Switzerland
are required to pay tuition fees. You pay an instalment of the tuition fee in advance of each semester.
Tuition fees, payments and exemptions
Full programme/course tuition fee: SEK 16,875
First payment: SEK 16,875
Note that you may also need to pay an application fee, or provide proof of exemption.
No tuition fees for citizens of the EU, EEA and Switzerland
There are no tuition fees for citizens of the European Union (EU), the European Economic Area (EEA) and Switzerland.