Man-Machine Speech Communication (eBook, PDF)
19th National Conference, NCMMSC 2024, Urumqi, China, August 15-18, 2024, Proceedings
Editors: Ling, Zhenhua; Li, Ya; He, Liang; Hamdulla, Askar; Chen, Xie
- Format: PDF
This book constitutes the refereed proceedings of the 19th National Conference on Man-Machine Speech Communication, NCMMSC 2024, held in Urumqi, China, during August 15-18, 2024.
- Devices: PC
- No copy protection
- Size: 39.41 MB
The 33 papers included in these proceedings were carefully reviewed and selected from 205 submissions. They deal with topics such as speech technology and large language models, audio processing, prosody modeling, and dialogue systems. Key areas include speech recognition, speaker identification and verification, speech/sound/music synthesis, speech enhancement, sound event detection, multimodal systems, conversational AI, phonetics, phonology and prosody analysis, auditory processing, and acoustic scene modeling.
- Product details
- Publisher: Springer Nature Singapore
- Number of pages: 400
- Publication date: December 26, 2024
- Language: English
- ISBN-13: 9789819610457
- Item no.: 72680477
- M-CMGAN: Attempting to Use Mamba on Speech Enhancement.
- A Backend-friendly On-device Multi-channel Speech Enhancement System with IPD and PHM.
- SESNet: A Speech Enhancement and Separation Network in Noisy Reverberant Environments.
- ASD-Diff: Unsupervised Anomalous Sound Detection With Masked Diffusion Model.
- Emergence of Hemispheric Asymmetries and Predictive Coding in the Neural Mechanism of Speech Perception.
- Phoneme Semantic Backdoor Attacks with Multiple Task Learning for Speech Classification Task.
- AESR: Speech Recognition With Speech Emotion Recogniting Learning.
- A Comparative Analysis of Diphthong Acquisition in Standard Chinese by Learners from 'the Belt and Road'.
- ESTVocoder: An Excitation-Spectral-Transformed Neural Vocoder Conditioned on Mel Spectrogram.
- Transformer-based Model for Auditory EEG Decoding.
- A Neural Denoising Vocoder for Clean Waveform Generation from Noisy Mel-Spectrogram based on Amplitude and Phase Predictions.
- Sound Zone Control Based on a Kronecker Second-Order Tensor Decomposition.
- MCDubber: Multimodal Context-Aware Expressive Video Dubbing.
- TeleSpeechPT: Large-Scale Chinese Multi-Dialect And Multi-Accent Speech Pre-Training.
- Investigation into the Impact of Speaker Adversarial Perturbation on Speech Recognition.
- Pruning and Quantization Enhanced Densely Connected Neural Network for Efficient Acoustic Echo Cancellation.
- Improved DOA Estimation of Sound Source of Small Amplitudes using a Single Acoustic Vector Sensor.
- Investigation on Training Strategy for Cross-Modal Large Language Models with Speech and Text.
- ExARN: Target Speaker Extraction with Attentive Recurrent Networks.
- Tone Perception by Putonghua-Learning Preschool Children in South Xinjiang Uyghur Autonomous Region.
- Study on Prosodic Disambiguation of VP/NP Syntactic Structure by Chinese EFL Learners.
- An electroencephalogram-based study of neural responses to imagined speech in Mandarin.
- A Speech Corpus of Putonghua-Learning Preschoolers From the Uygur Ethnic Group in South Xinjiang Uygur Autonomous Region of China.
- Evaluation of Data Inconsistency for Multi-modal Sentiment Analysis.
- LDMME: Latent Diffusion Model for Music Editing.
- Emotion-Aware Prosodic Phrasing for Expressive Text-to-Speech.
- Speech emotion recognition based on multi acoustic feature fusion.
- DA-KWFormer: A Domain Adaptation Network with K-Weight Transformer for Speech Emotion Recognition.
- An Unsupervised Domain Adaptation Method based on Distribution Alignment for Speaker Verification.
- Cross-Model Knowledge Distillation and Metadata Fusion for Respiratory Sound Classification.
- Effect of Focus on Vowel Duration and Formant in Cantonese.
- A Quantitative Parameter of Pronunciation, TVVF.