Burcas: A Simple Concatenation-based MIDI-to-Singing Voice Synthesis System for Swedish


  • Marcus Uneson

Summary, in English

After a brief outlook on the field of concatenative synthesis of singing, with emphasis on how it differs from synthesis of speech, the present paper gives an overview of a simple system for singing synthesis in Swedish based on concatenation of diphones. The system, called Burcas, accepts as input a text file with lyrics, from which it extracts a target phoneme sequence using basic letter-to-sound conversion, and a MIDI file (possibly holding multiple parts), from which it extracts melodic information, i.e. note duration and frequency. After a syllable (or part of a syllable) has been associated with each note, a simple model of segment durations is used to calculate the duration of each segment of the syllable. Finally, the segment data are used as control parameters (allophone, duration, frequency) for the MBROLA speech generator, which, given a suitable diphone database, outputs sound files in a standard format. A concluding section considers the far more sophisticated corpus-based approach to concatenative synthesis of singing.
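The last two steps of the pipeline described above (segment durations, then MBROLA control parameters) can be illustrated with a minimal sketch. It is not taken from the paper: the duration rule (a fixed consonant length, with the vowel absorbing the rest of the note), the constant `CONSONANT_MS`, the toy phoneme symbols and the function names are all assumptions made for illustration. What is standard is the equal-tempered MIDI-to-frequency formula and the general shape of an MBROLA `.pho` line: a phoneme symbol, a duration in milliseconds, then optional pairs of position (percent of the segment) and pitch (Hz).

```python
# Hypothetical sketch: one syllable on one MIDI note -> MBROLA .pho lines.
# The duration model and all names below are illustrative assumptions,
# not the method used by Burcas.

CONSONANT_MS = 60                    # assumed fixed consonant duration
VOWELS = {"a", "e", "i", "o", "u"}   # toy symbols, not a real phoneme set


def midi_to_hz(note: int) -> float:
    """Equal-tempered frequency of a MIDI note number (note 69 = A4 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12)


def syllable_to_pho(segments, note, note_ms):
    """Spread one syllable over one note and format it as .pho lines.

    segments: phoneme symbols of the syllable, e.g. ["l", "a"]
    note:     MIDI note number giving the target pitch
    note_ms:  duration of the note in milliseconds
    """
    n_cons = sum(1 for s in segments if s not in VOWELS)
    n_vowels = len(segments) - n_cons
    if n_vowels == 0:
        raise ValueError("this toy model needs at least one vowel per syllable")

    # Consonants get a short fixed duration, capped so that together they
    # never use more than half of the note; vowels share what remains.
    cons_ms = min(CONSONANT_MS, note_ms // (2 * n_cons)) if n_cons else 0
    vowel_ms = (note_ms - cons_ms * n_cons) // n_vowels

    hz = round(midi_to_hz(note))
    lines = []
    for seg in segments:
        dur = vowel_ms if seg in VOWELS else cons_ms
        # Two pitch points (at 0% and 100% of the segment) hold the pitch flat.
        lines.append(f"{seg} {dur} 0 {hz} 100 {hz}")
    return lines


if __name__ == "__main__":
    # "la" sung on A4 for half a second
    print("\n".join(syllable_to_pho(["l", "a"], 69, 500)))
```

Letting the vowel absorb most of the note is a common simplification in singing synthesis, since sustained notes are carried by vowels; a real system would also need rules for syllables spread over several notes and for pitch transitions between notes.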

Publishing year




Document type

Student publication for Master's degree (one year)


  • Languages and Literatures


  • Synthetic singing based on the Swedish language
  • Singing synthesis
  • Burcas
  • Letter - sound - song
  • The singing voice
  • Phonetics, phonology


  • Joost van de Weijer