The Center for Forensic Phonetics and Acoustics brings together expertise spanning phonetics, neuroscience, computer science, cognitive science, psychology, medicine, engineering, and physics, bridging fundamental research on voice identity with applied work in forensic and industrial contexts. By integrating experimental, computational, and analytical approaches, we investigate how voices are perceived, recognized, compared, and manipulated. Our research addresses questions across the following domains:

Voice Identity & Perception

  • How distinctive and individual are human voices?

  • How stable is a voice across contexts, speakers, and recording conditions?

  • How do linguistic content and context shape voice identity?

Recognition & Technology

  • How do humans and machines recognise and compare voices?

  • What biases affect voice perception and automatic recognition?

  • How reliable are automatic voice recognition systems in real-world settings?
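The reliability of automatic speaker recognition systems is commonly summarised with the equal error rate (EER): the operating point at which the system accepts impostors and rejects genuine speakers equally often. The following is a minimal illustrative sketch of how an EER is computed from trial scores; the function name and all scores are invented for illustration and are not tied to any system mentioned on this page.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Return (EER, threshold) for the threshold where the
    false-acceptance and false-rejection rates are closest.

    Genuine scores come from same-speaker trials, impostor scores from
    different-speaker trials; higher scores mean "more likely same speaker".
    """
    best_gap, eer, threshold = float("inf"), 1.0, None
    for t in sorted(set(genuine_scores + impostor_scores)):
        # False acceptance rate: impostors scoring at or above the threshold.
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        # False rejection rate: genuine speakers scoring below the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        if abs(far - frr) < best_gap:
            best_gap, eer, threshold = abs(far - frr), (far + frr) / 2, t
    return eer, threshold

# Invented scores from eight verification trials:
eer, thr = equal_error_rate([0.9, 0.8, 0.5, 0.7], [0.6, 0.4, 0.3, 0.2])
# eer == 0.25 at threshold 0.6: a quarter of impostors are accepted and
# a quarter of genuine speakers are rejected.
```

In evaluation campaigns such as the NIST Speaker Recognition Evaluations, metrics of this kind are computed over many thousands of trials and across varied recording conditions.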

Forensic Inference & Evidence

  • How strong is the evidence that two voice samples come from the same speaker?

  • How do recording conditions and transmission channels affect voice comparison?
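In forensic voice comparison, evidential strength is commonly expressed as a likelihood ratio: how probable is the observed acoustic evidence under the hypothesis that both samples come from the same speaker, versus the hypothesis that they come from different speakers? The sketch below illustrates that framing with hypothetical univariate Gaussian models over a single acoustic feature; all names and numbers are invented, and real casework relies on multivariate models and calibrated systems rather than anything this simple.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def likelihood_ratio(feature, suspect_mean, suspect_var, pop_mean, pop_var):
    """LR = P(evidence | same speaker) / P(evidence | different speakers).

    The suspect and reference-population models are illustrative
    univariate Gaussians over one acoustic measurement (e.g. mean F0 in Hz).
    """
    p_same = gaussian_pdf(feature, suspect_mean, suspect_var)
    p_diff = gaussian_pdf(feature, pop_mean, pop_var)
    return p_same / p_diff

# Questioned sample measures 118 Hz; hypothetical suspect model N(120, 25)
# and reference-population model N(130, 100).
lr = likelihood_ratio(118.0, 120.0, 25.0, 130.0, 100.0)
# lr > 1: for this feature, the evidence supports the same-speaker hypothesis.
```

An LR above 1 supports the same-speaker hypothesis and an LR below 1 the different-speaker hypothesis; the further from 1, the stronger the support.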

Synthesis & Security

  • How can voices be synthesised and manipulated?

  • How can synthetic voices (deepfakes) be detected and analysed?

CFPA’s mission is to foster research and education in forensic phonetics for forensic practitioners and the police, for students pursuing related careers, and for PhD candidates and postdoctoral researchers. CFPA members serve as principal investigators in the VoCS project (www.vocs.eu.com), a flagship European Doctoral Network training the next generation of 20+ researchers on some of the questions outlined above.

What We Offer

  • Research: We collaborate on academic and industry-funded research projects worldwide across a wide range of topics outlined above. Our work combines phonetics, computational modelling, and forensic applications. If you are interested in collaborating with us, we welcome enquiries.
  • Education: We provide teaching and training across all academic levels, from BA/MA and BSc/MSc programmes to PhD and postdoctoral training. In addition, we offer tailored courses and workshops on forensic voice analysis for professionals, including police, prosecutors, defence lawyers, and other stakeholders. See below for a list of relevant university courses taught in our department. 
  • Expertise: Together with our partner institutions (JP French International, Zurich Forensic Science Institute), we provide forensic voice analysis services, including speaker profiling, speaker comparison, transcription, analysis of disputed utterances, authentication, and audio enhancement.

Executive Board


First point of contact: PD Dr. Elisa Pellegrino (elisa.pellegrino@uzh.ch)

Partners

For our mission in research, education and forensic casework, we work together with the following partners: 

  • Zurich Forensic Science Institute (FOR)

  • JP French International

  • Phonexia (phonexia.com)

We are closely co-operating with the following networks and associations: 

  • VoCSA.org

  • vocs.eu.com

  • voice-id.org

Education

CFPA fosters education in forensic phonetics for forensic practitioners and the police, for students pursuing related careers, and for PhD candidates and postdoctoral researchers.

University Courses:

  • Automatic Speaker and Language Recognition (Dr. Srikanth Madikeri)
  • Computational Processing of Speech Rhythm for Language and Speaker Classification (PD Dr. Elisa Pellegrino)
  • Experiments with Speech (Prof. Volker Dellwo)
  • Fundamentals of Speech Sciences and Signal Processing (Prof. Volker Dellwo)
  • Introduction to Forensic Speech Sciences (PD Dr. Elisa Pellegrino)
  • Speech Technology (Dr. Srikanth Madikeri)
  • Voice Analysis (Prof. Volker Dellwo)

 

Selected Publications

2026

  • Farhadipour, A., Marquenie, J., Madikeri, S., & Chodroff, E. (2026). TidyVoice: A curated multilingual dataset for speaker verification derived from Common Voice. arXiv. https://doi.org/10.48550/arXiv.2601.16358 
  • Farhadipour, A., Marquenie, J., Madikeri, S., Vukovic, T., Dellwo, V., Reid, K., Tyers, F. M., Siegert, I., & Chodroff, E. (2026). TidyVoice 2026 challenge evaluation plan. arXiv. https://doi.org/10.48550/arXiv.2601.21960
  • Fröhlich, A., Ramon, M., French, P., & Dellwo, V. (2026). Implicit voice learning through discrimination outperforms explicit listen-and-memorize tasks. Scientific Reports. https://doi.org/10.1038/s41598-026-41541-z

2025

  • Bradshaw, L., Chodroff, E., & Dellwo, V. (2025). The role of phonetic overlap for speaker discrimination. Journal of the Acoustical Society of America, 157, 3572–3589. https://doi.org/10.1121/10.0036562
  • Chapariniya, M., Vuković, T., Ebling, S., & Dellwo, V. (2025). Beyond appearance: Transformer-based person identification from conversational dynamics. In 2025 15th International Conference on Computer and Knowledge Engineering (ICCKE) (pp. 1–6). Mashhad, Iran. https://doi.org/10.1109/iccke68588.2025.11273850
  • Farhadipour, A., Liu, S., Chapariniya, M., Vyshnevetska, V., Madikeri, S., Vukovic, T., & Dellwo, V. (2025). CL-UZH submission to the NIST SRE 2024 speaker recognition evaluation. arXiv. https://doi.org/10.48550/arXiv.2510.00952
  • He, L. (2025). Time-integrated curvature of fundamental frequency as a novel approach to pitch complexity: Insights from first-language pitch development. Journal of Speech, Language, and Hearing Research, 68, 3171–3182.
  • He, L. (2025). Mouth rhythm as a “packaging mechanism” of information in speech: A proof of concept. Journal of the Acoustical Society of America, 157(3), 1612–1617.
  • Madikeri, S., Motlicek, P., Sanchez-Cortes, D., Rangappa, P., Hughes, J., Tkaczuk, J., Sanchez Lara, A., Khalil, D., Rohdin, J., Zhu, D., Krishnan, A., Klakow, D., Ahmadi, Z., Kováč, M., Boboš, D., Kalogiros, C., Alexopoulos, A., & Marraud, D. (2025). Autocrime—Open multimodal platform for combating organized crime. Forensic Science International: Digital Investigation, 54, Article 301937.  https://doi.org/10.1016/j.fsidi.2025.301937 
  • Vyshnevetska, V., Giroud, N., Ramon, M., & Dellwo, V. (2025). Listeners are biased towards voices of young speakers and female speakers when discriminating voices. Cognitive Research: Principles and Implications, 10(1), 28. https://doi.org/10.1186/s41235-025-00636-3 

2024

  • Farhadipour, A., Chapariniya, M., Vukovic, T., & Dellwo, V. (2024, October). Comparative analysis of modality fusion approaches for audio-visual person identification and verification. In M. Abbas & A. A. Freihat (Eds.), Proceedings of the 7th International Conference on Natural Language and Speech Processing (ICNLSP 2024) (pp. 168–177). Association for Computational Linguistics. https://aclanthology.org/2024.icnlsp-1.19/ 
  • Iob, N. A., He, L., Ternström, S., Cai, H., & Brockmann-Bauser, M. (2024). Effects of speech characteristics on electroglottographic and instrumental acoustic voice analysis metrics in women with structural dysphonia before and after treatment. Journal of Speech, Language, and Hearing Research, 67(6), 1660–1681.
  • Pellegrino, E., Dellwo, V., Möbius, B., & Pardo, J. S. (2024). Forms, factors and functions of phonetic convergence: Editorial. Speech Communication, 165, 103142. https://doi.org/10.1016/j.specom.2024.103142

2023

  • Lins Machado, C., & He, L. (2023) Beyond the average: embracing speaker individuality in the dynamic modeling of the acoustic-articulatory relationship. Loquens 10(1-2), e103.
  • Pellegrino, E., & Dellwo, V. (2023). Speakers are more cooperative and less individual when interacting in larger group sizes. Frontiers in Psychology, 14. https://doi.org/10.3389/fpsyg.2023.1145572
  • Perepelytsia, V., Bradshaw, L., & Dellwo, V. (2023). IDEAR: A speech database of identity-marked, clear, and read speech. In R. Skarnitzl & J. Volín (Eds.), Proceedings of the 20th International Congress of Phonetic Sciences (pp. 3216–3220). Guarant International.

  • Perepelytsia, V., & Dellwo, V. (2023). Acoustic compression in Zoom audio does not compromise voice recognition performance. Scientific Reports, 13(1), 18742.  https://doi.org/10.1038/s41598-023-45971-x

2022

  • He, L. (2022) Characterizing first and second language rhythm in English using spectral coherence between temporal envelope and mouth opening-closing movements. Journal of the Acoustical Society of America 152(1): 567–579.
  • Pellegrino, E., Kathiresan, T., & Dellwo, V. (2022). Vowel convergence does not affect auditory speaker discriminability in humans and machine in a case study on Swiss German dialects. The International Journal of Speech Language and the Law.

2021

  • Bernardasci, C., Dipino, D., Garassino, D., Negrinelli, S., Pellegrino, E., & Schmid, S. (Eds.). (2021). L’individualità del parlante nelle scienze fonetiche: applicazioni tecnologiche e forensi [The individuality of the speaker in the phonetic sciences: Technological and forensic applications]. Milano: Officinaventuno. ISSN: 2612-226X.
  • Pellegrino, E., He, L. & Dellwo, V. (2021). Age-related rhythmic variations. The role of syllable intensity variability. TRANEL, 84, 167-185. https://doi.org/10.26034/tranel.2021.2924

2020

  • He, L., & Zhang, Y. (2020) Characterizing speech rhythm using spectral coherence between jaw displacement and speech temporal envelope. Loquens 7: e74.

2019

  • Foulkes, P., French, P., & Wilson, K. (2019). LADO as forensic speaker profiling. In P. Patrick, M. Schmid, & K. Zwaan (Eds.), Language Analysis for the Determination of Origin. Springer.
  • He, L., Zhang, Y., & Dellwo, V. (2019) Between-speaker variability and temporal organization of the first formant. Journal of the Acoustical Society of America 145(3): EL209–EL214.

2018

  • Asadi, H., Nourbakhsh, M., He, L., Pellegrino, E., & Dellwo, V. (2018). Between-speaker rhythmic variability is not dependent on language rhythm, as evidence from Persian reveals. International Journal of Speech, Language and the Law, 25(2), 151–174.
  • Braun, A., Llamas, C., Watt, D., French, J. P., & Robertson, D. (2018). Sub-regional ‘other-accent’ effects on lay listeners’ speaker identification abilities: A voice line-up study with speakers and listeners from the North East of England. International Journal of Speech, Language and the Law, 25(2), 231–255. https://doi.org/10.1558/ijsll.37340
  • Dellwo, V., French, P., & He, L. (2018). Voice biometrics for forensic speaker recognition applications. In S. Frühholz & P. Belin (Eds.), The Oxford Handbook of Voice Perception (pp. 777–798). Oxford: Oxford University Press.

2017

  • He, L., & Dellwo, V. (2017). Between-speaker variability in temporal organizations of intensity contours. Journal of the Acoustical Society of America, 141(5), EL488–EL494.
  • Kolly, M.-J., Boula de Mareüil, P., Leemann, A., & Dellwo, V. (2017). Listeners use temporal information to identify French- and English-accented speech. Speech Communication, 86, 121–134.

2016

  • He, L., & Dellwo, V. (2016). The role of syllable intensity in between-speaker rhythmic variability. International Journal of Speech, Language and the Law, 23(2), 243–273.

Selected Funded Research Projects

  • Madikeri, S. (2026-2029) Speaker recognition across age groups for cantonal law enforcement agencies [Digitalization Initiative of the Zurich Higher Education Institutions]
  • Dellwo, V./Pellegrino, E. (2024-2028) PIs in Voice Communication Sciences (VoCS), a Marie Skłodowska-Curie Actions Doctoral Network (EU MSCA)
  • Dellwo, V./Pellegrino, E. (2020-2024) PIs in NCCR Evolving Language: WP 1: Accommodation; WP 2: Speech Production; (2024-2028): WP 1: Channel Integration [Swiss National Science Foundation, 51AU40_180888]
  • Vukovic, T. (2023-2027) CAPIRE - Computational Analysis of Personal Identity in Interaction: Recognition and Ethics [Digital Society Initiative Postdoc Grant]; 750’000 CHF
  • He, L. (2021-2025) Untangling the relationship between voice and face [Swiss National Science Foundation]; 911’000 CHF.
  • He, L. (2021-2023) The vocal apparatus and acoustics in coalescence: A multimodal approach to speech rhythm [UZH Forschungskredit]; 180’000 CHF.
  • Dellwo, V. (2019-2024) The dynamics of indexical information in speech and its role in speech communication and speaker recognition [Swiss National Science Foundation; 185399; GRS-027/13].
  • Dellwo, V./Frühholz, S. (2018-2020) Voice Theft: chances and risks of digital voice technology [Swiss National Science Foundation; 10DL15_183152]; 250’000 CHF.
  • He, L. (2018-2019) Speaker recognition based on articulatory information [Swiss National Science Foundation]; 88’000 CHF.
  • Dellwo, V./Kerdpol, K. (2017-2018) Speaker recognition in tone languages [State Secretariat for Education, Research and Innovation (SERI); seed money 2013-16]; 10’000 CHF.
  • He, L./Pellegrino, E./Dellwo, V. (2017-2018) Forensic phonetic analysis of electronically disguised voice [International Association for Forensic Phonetics and Acoustics]; 1’800 GBP.
  • Schwab, S./Dellwo, V. (2017-2018) Effects of nicotine on voice quality [International Association for Forensic Phonetics and Acoustics]; 1’800 GBP.
  • Dimos, K./He, L./Dellwo, V. (2015-2016) An investigation of the rhythmic acoustic differences between normal and shouted voices [International Association for Forensic Phonetics and Acoustics]; 1’800 GBP.
  • Dellwo, V. (2014-2016) VoiceTime – Sprecherkennung mittels Zeitbereichsinformationen [speaker recognition using time-domain information] [Gebert-Rüf Foundation; GRS-027/13]; 340’000 CHF.
  • Dellwo, V./Schmid, S. (2012-2015) Speaker identification based on speech temporal information: A forensic phonetic study of Swiss German [Swiss National Science Foundation; 100015_135287]; 432’456 CHF.
  • Dellwo, V. (2011-2015) The functions of speech rhythm in segregating simultaneous speakers [Stiftung für wissenschaftliche Forschung der Universität Zürich]; 14’000 CHF.

Selected Bachelor, Master, and PhD Research Projects

PhD Research Projects

  • Massoumeh Chapariniya (ongoing). Multimodal AI for human identity and behavior analysis
  • Aref Farhadi Pour (ongoing). Multimodal person recognition
  • Andrea Fröhlich (ongoing). Voice super-recognizers: novel approaches to large-scale forensic speaker comparison
  • David Grünert (ongoing). 
  • Alessandro de Luca (ongoing). Vocal identity dynamics
  • Erdem Baha Topbas (ongoing). Temporal voice specific properties for forensic speaker comparison
  • Tianze Xu (ongoing). Attention to vocal identity cues
  • Carolina Lins Machado. Incorporating individual variation in the modeling of acoustic-articulatory relationships in vowel dynamics Link
  • Valeriia Vyshnevetska. Interdisciplinary approaches to voice recognition Link
  • Yu Zhang. Speaker idiosyncrasy in temporal organizations in speech production Link

Master Research Projects

  • Miao Yu (ongoing). Tone co-articulation realization in natural and deep fake voices
  • Shiran Liu. Voice similarity perception in humans and machines: variable investigation and experiment design
  • Damiano N. Picco. The role of expressive audiovisual information on voice-face, face-voice prediction
  • Allison Ponce de Leon Diaz. The role of expressive dynamic facial information on voice-face, face-voice prediction
  • Florian Heinz (programming project). The role of speech rhythm for natural and deep fake voices classification
  • Andrea Fröhlich. Evaluation des forensischen Sprechererkennungssystems 'iVocalise' 
  • Sarah Lim. Speaker recognition in deceptive speech
  • Elena Mayorova. Temporal information in speaker recognition
  • Dario Brander. The use of hesitation for forensic phonetic speaker comparison
  • Lukas Fischer. Voice variability with age
  • Sibylle Sutter. The influence of visual speech on auditory speaker identification ability

Bachelor Research Projects:

  • Linus Manser. Automatic age recognition

Past and Present News & Events

2023

9-12 July 2023: 31st IAFPA Conference held in Zurich

2020

  • Homa Asadi and Volker Dellwo received a bridging grant from the 'Leadinghouse Southeast Asia and Iran' for a joint project between UZH and the University of Isfahan on voice dynamics, starting in autumn 2020.

Additional Information

Contact

Address:
University of Zurich
Department of Computational Linguistics
Centre for Forensic Phonetics and Acoustics (CFPA)
Andreasstr. 15
CH-8050 Zurich
Switzerland

Phone: +41(0)44 634 2995