From Sound to Structure : Robust Localization of Sources, Sensors, and Surroundings
Från ljud till struktur : Robust lokalisering av ljudkällor, sensorer och omgivande miljö
Author
Summary, in English
When the sound source is uncontrolled, one of the main ways of extracting geometric information is to measure the difference between the times at which a sound arrives at two microphones. This measurement is called the Time-Difference-of-Arrival (TDOA), and it defines a hyperboloid, with the two microphones as its foci, on which the sound source must lie. Classical correlation-based techniques can compute the TDOA from two recordings, but they typically struggle in reverberant environments, where the two signals are not simply shifted, noisy versions of each other. One result of this thesis is that a learning-based approach gives better time-delay estimates. The main obstacle to learning-based methods in this domain is a lack of data. The thesis demonstrates that this can be overcome by simulating sound propagation to create synthetic data, which is then used to train an energy-based model that outperforms classical methods on real data.
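As a point of reference for the classical correlation-based techniques mentioned above, the following is a minimal sketch of time-delay estimation by cross-correlation, with optional PHAT weighting (GCC-PHAT). It is not the learning-based method of the thesis. The function name `estimate_tdoa`, its parameters, and the choice of PHAT weighting are assumptions made for illustration only.

```python
import numpy as np

def estimate_tdoa(x1, x2, fs, phat=True):
    """Estimate the delay of x2 relative to x1, in seconds.

    Illustrative sketch only. A positive value means the sound
    reached microphone 2 later than microphone 1.
    """
    n = len(x1) + len(x2)              # zero-pad to avoid circular wrap-around
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X2 * np.conj(X1)           # cross-power spectrum
    if phat:
        # PHAT weighting: discard magnitude, keep phase only.
        cross = cross / (np.abs(cross) + 1e-12)
    cc = np.fft.irfft(cross, n=n)      # cross-correlation over all lags
    k = int(np.argmax(cc))             # index of the correlation peak
    if k > n // 2:                     # upper half of the array holds negative lags
        k -= n
    return k / fs

# Synthetic check: the second signal is the first one delayed by 25 samples.
rng = np.random.default_rng(0)
signal = rng.standard_normal(4096)
delayed = np.concatenate((np.zeros(25), signal))
tdoa = estimate_tdoa(signal, delayed, fs=16000)
```

Multiplying the estimated TDOA by the speed of sound gives the difference in distance from the source to the two microphones, which is the quantity that fixes the hyperboloid. In a reverberant recording the correlation has several competing peaks, which is the failure mode the abstract describes.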
Once primitive geometric relationships have been computed from the sensor data, the goal is to convert them into more useful higher-level information, such as the locations of microphones and sound sources. The main difficulty is that a fraction of the measurements are outliers, so robust estimation methods such as RANSAC (a hypothesis-and-test framework) are needed. Since the speed of hypothesis generation is key in RANSAC, this thesis shows how to construct new minimal solvers for several problems. For example, it shows that sensor network self-calibration in the presence of a reverberant plane admits minimal problems with fewer microphones than in the echo-free case.
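The hypothesis-and-test idea behind RANSAC can be sketched on a much simpler problem than those treated in the thesis: locating one source in 2D from distances to microphones at known positions, where some distances are outliers. Everything below is an assumption made for illustration, including the function names `trilaterate` and `ransac_localize`, the threshold, and the linear three-distance solver, which is a small-sample solver rather than one of the thesis's minimal solvers.

```python
import numpy as np

def trilaterate(anchors, dists):
    """Linear solver: 2D position from distances to three known points.

    Subtracting the first circle equation from the other two removes the
    quadratic term and leaves a 2x2 linear system in the position.
    """
    a0, d0 = anchors[0], dists[0]
    A = 2.0 * (anchors[1:] - a0)
    b = (np.sum(anchors[1:] ** 2, axis=1) - np.sum(a0 ** 2)
         - dists[1:] ** 2 + d0 ** 2)
    return np.linalg.solve(A, b)

def ransac_localize(anchors, dists, threshold=0.05, iterations=200, seed=0):
    """Hypothesis-and-test loop: keep the position with the most inliers."""
    rng = np.random.default_rng(seed)
    best_x = None
    best_inliers = np.zeros(len(dists), dtype=bool)
    for _ in range(iterations):
        # Hypothesis: solve from a small random sample of measurements.
        idx = rng.choice(len(dists), size=3, replace=False)
        try:
            x = trilaterate(anchors[idx], dists[idx])
        except np.linalg.LinAlgError:
            continue                   # degenerate sample (collinear points)
        # Test: count measurements consistent with the hypothesis.
        residuals = np.abs(np.linalg.norm(anchors - x, axis=1) - dists)
        inliers = residuals < threshold
        if inliers.sum() > best_inliers.sum():
            best_x, best_inliers = x, inliers
    return best_x, best_inliers

# Synthetic example: ten microphones, three corrupted distance measurements.
mics = np.array([[0, 0], [5, 0], [5, 4], [0, 4], [2.5, 0],
                 [5, 2], [2.5, 4], [0, 2], [1, 1], [4, 3]], dtype=float)
source = np.array([2.0, 3.0])
measured = np.linalg.norm(mics - source, axis=1)
measured[[1, 6, 8]] += 3.0             # simulated outliers
estimate, inlier_mask = ransac_localize(mics, measured)
```

Each iteration costs one solver call plus one pass over the measurements, so the solver's speed and the number of measurements it needs per hypothesis largely determine the total running time. A smaller sample is also less likely to contain an outlier, which is one reason minimal problems are attractive in this setting.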
Department/s
Publishing year
2025-12-04
Language
English
Full text
- Available as PDF - 33 MB
Document type
Dissertation
Publisher
Centre for Mathematical Sciences, Lund University
Topic
- Computer Vision and learning System
Status
Published
Research group
- Computer Vision and Machine Learning
ISBN/ISSN/Other
- ISSN: 1404-0034
- ISBN: 978-91-8104-764-6
- ISBN: 978-91-8104-763-9
Defence date
16 January 2026
Defence time
13:15
Defence place
Lecture Hall MH:Hörmander, Centre for Mathematical Sciences, Märkesbacken 4, Faculty of Engineering LTH, Lund University, Lund.
Opponent
- Tuomas Virtanen (Prof.)