Who spoke when?

Audio-based speaker location estimation for diarization

Maral Dadvar

Paperback

Other customers were also interested in

Trong Tan Ho (Jason)
When-to-release Planning in consideration of Technical Debt

47,99 €
Eliyahu Miller
Who Knows 9?

14,99 €
Peter Trebuna
Modelling and Simulation in TX Process Simulate software

36,99 €
Cristina Oliveira
The New Faces of Interactivity-Analysis & Implementation of Mobile App

47,99 €
M. Sunil Kumar
Business Process Reengineering

36,99 €
Mina Kumari
Artificial Intelligence Unveiled

40,99 €
Uzair Iqbal
Review-Scrum(R-Scrum): Introduction of Model Driven Architecture (MDA)

19,99 €

Product description

Speaker diarization is the process of detecting active speakers and grouping the speech segments uttered by the same speaker. It has two main applications. Automatic speech recognition systems use the speaker-homogeneous clusters to adapt their acoustic models to individual speakers and thereby improve recognition performance. Speaker indexing and rich transcription systems use the diarization output as one of the pieces of information extracted from a recording, which enables automatic indexing and further processing. In this study, a speaker diarization application is developed that uses multiparty binaural speech recordings to track speaker activity based on interaural time difference (ITD) cues. For a given frame of the speech signal, these cues are computed using gammatone filtering and cross-correlation. Their values are used to determine which speaker in the recording produced the speech fragment under consideration. This study was supervised by Dr. Jon Barker and defended in fulfilment of the requirements for the degree of Master in Advanced Computer Science at the University of Sheffield, United Kingdom, in 2007.
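
As a rough illustration of the ITD idea described above, the following Python sketch filters both ear signals with a single gammatone band, finds the lag that maximises their cross-correlation, and assigns the frame to the speaker whose reference ITD is closest. It is a minimal sketch under stated assumptions, not the method used in the book: the function names, the 500 Hz centre frequency, the 1 ms lag limit, the synthetic noise frame, and the per-speaker reference ITDs are all hypothetical choices made for the example.

```python
import math
import random

def gammatone_ir(fc, fs, duration=0.025, order=4):
    """FIR approximation of a gammatone impulse response centred at fc Hz."""
    b = 1.019 * (24.7 + 0.108 * fc)  # bandwidth taken from the ERB scale
    return [
        (n / fs) ** (order - 1)
        * math.exp(-2 * math.pi * b * n / fs)
        * math.cos(2 * math.pi * fc * n / fs)
        for n in range(round(duration * fs))
    ]

def fir_filter(x, h):
    """Causal FIR filter; the output has the same length as x."""
    return [
        sum(h[k] * x[n - k] for k in range(min(n + 1, len(h))))
        for n in range(len(x))
    ]

def estimate_itd(left, right, fs, max_itd=1e-3):
    """Lag (in seconds) that maximises the cross-correlation of the two ears.

    A positive value means the right channel lags the left one.
    """
    max_lag = round(max_itd * fs)
    span = range(max_lag, len(left) - max_lag)  # same window for every lag

    def xcorr(lag):
        return sum(left[n] * right[n + lag] for n in span)

    return max(range(-max_lag, max_lag + 1), key=xcorr) / fs

def assign_speaker(itd, speaker_itds):
    """Pick the speaker whose reference ITD is closest to the measured one."""
    return min(speaker_itds, key=lambda name: abs(speaker_itds[name] - itd))

# Synthetic frame (assumption): white noise reaching the right ear 5 samples late.
random.seed(0)
fs, delay = 16000, 5
left = [random.gauss(0.0, 1.0) for _ in range(1024)]
right = [0.0] * delay + left[:-delay]

# One gammatone band only; a full system would use a bank of such filters.
h = gammatone_ir(fc=500.0, fs=fs)
itd = estimate_itd(fir_filter(left, h), fir_filter(right, h), fs)

# Hypothetical reference ITDs (seconds) for two seated speakers.
speakers = {"A": 3e-4, "B": -3e-4}
print(f"ITD = {itd * 1e6:.1f} us, assigned to speaker {assign_speaker(itd, speakers)}")
```

The example is written in pure Python for self-containment. With real recordings one would more likely use an array library, pool the cross-correlation over many frequency bands, and estimate the per-speaker reference ITDs from the data rather than fix them by hand.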

Product details

Publisher: LAP Lambert Academic Publishing
Number of pages: 68
Publication date: 1 July 2011
Language: English
Dimensions: 220 mm x 150 mm x 5 mm
Weight: 107 g
ISBN-13: 9783844386288
ISBN-10: 3844386289
Item no.: 33694324

Manufacturer information
Books on Demand GmbH
Überseering 33
22297 Hamburg
bod@bod.de

About the author

Maral Dadvar is a doctoral researcher in the Human Media Interaction Group at the University of Twente, the Netherlands. Dadvar developed an interest in natural language processing while implementing speaker diarization for a master's thesis. Maral holds a master's degree in Advanced Computer Science from the University of Sheffield, United Kingdom.