InterSpeech Best Papers by Year
InterSpeech
(Primary data source: each year's conference AbstractBook.pdf)
- 2023
Dates: August 20-24, 2023
Location: Dublin, Ireland (in person)
2293 papers submitted, of which 2207 were reviewed; 1097 accepted
No. | Best Student Paper (Nominee) | Authors | Award |
---|---|---|---|
1 | Multimodal Turn-taking Model Using Visual Cues for End-of-Utterance Prediction in Spoken Dialogue Systems | Fuma Kurata, Mao Saeki, Shinya Fujie, Yoichi Matsuyama | Best Student Paper |
2 | Transvelar Nasal Coupling Contributing to Speaker Characteristics in Non-nasal Vowels | Ziyu Zhu, Yujie Chi, Zhao Zhang, Kiyoshi Honda, Jianguo Wei | |
3 | Diverse and Expressive Speech Prosody Prediction with Denoising Diffusion Probabilistic Model | Xiang Li, Songxiang Liu, Max W. Y. Lam, Zhiyong Wu, Chao Weng, Helen Meng | |
4 | MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets | Ziyang Ma, Zhisheng Zheng, Changli Tang, Yujin Wang, Xie Chen | |
5 | Neural Model Reprogramming with Similarity Based Mapping for Low-Resource Spoken Command Recognition | Hao Yen, Pin-Jui Ku, Chao-Han Huck Yang, Hu Hu, Sabato Marco Siniscalchi, Pin-Yu Chen, Yu Tsao | |
6 | Speech Self-Supervised Representation Benchmarking: Are We Doing it Right? | Salah Zaiem, Youcef Kemiche, Titouan Parcollet, Slim Essid, Mirco Ravanelli | |
7 | Speaker Tracking using Graph Attention Networks with Varying Duration Utterances across Multi-Channel Naturalistic Data: Fearless Steps Apollo-11 Audio Corpus | Meena M. C. Shekar, John H. L. Hansen | |
8 | Two-Stage Voice Anonymization for Enhanced Privacy | Francesco Nespoli, Joerg Bitzer, Daniel P. Barreda, Patrick A. Naylor | |
9 | Sociodemographic and Attitudinal Effects on Dialect Speakers’ Articulation of the Standard Language: Evidence from German-Speaking Switzerland | Carina Steiner, Dieter Studer-Joho, Corinne Lanthemann, Andrin Büchler, Adrian Leemann | |
10 | Which aspects of motor speech disorder are captured by Mel Frequency Cepstral Coefficients? Evidence from the change in STN-DBS conditions in Parkinson’s disease | Vojtěch Illner, Petr Krýže, Jan Švihlík, Mário Souza, Paul Krack, Elina Tripoliti, Robert Jech, Jan Rusz | |
11 | An Automatic Multimodal Approach to Analyze Linguistic and Acoustic Cues on Parkinson’s Disease Patients | Daniel Escobar-Grisales, Tomas Arias-Vergara, Cristian David Ríos Urrego, Elmar Noeth, Adolfo Garcia, Juan Rafael Orozco-Arroyave | |
12 | CommonAccent: Exploring Large Acoustic Pretrained Models for Accent Classification Based on Common Voice | Juan Pablo Zuluaga Gomez, Sara Ahmed, Danielius Visockas, Cem Subakan | |
13 | Automatic Prediction of Language Learners’ Listenability Using Speech and Text Features Extracted from Listening Drills | Yingxiang Gao, Jaehyun Choi, Nobuaki Minematsu, Noriko Nakanishi, Daisuke Saito |
- 2022
Dates: September 18-22, 2022
Location: Incheon, South Korea (hybrid: in person and remote)
2490 papers submitted, of which 2140 were reviewed; 1102 accepted
No. | Best Student Paper (Nominee) | Authors | Award |
---|---|---|---|
1 | Trajectories Predicted by Optimal Speech Motor Control Using LSTM Networks | Tsiky Rakotomalala, Pierre Baraduc, Pascal Perrier | |
2 | Transfer Learning Framework for Low-Resource Text-to-Speech Using a Large-Scale Unlabeled Speech Corpus | Minchan Kim, Myeonghun Jeong, Byoung Jin Choi, Sunghwan Ahn, Joun Yeop Lee, Nam Soo Kim | Best Student Paper |
3 | Pharyngealization in Amazigh: Acoustic and Articulatory Marking Over Time | Philipp Buech, Rachid Ridouane, Anne Hermes | |
4 | Tree-constrained Pointer Generator with Graph Neural Network Encodings for Contextual Speech Recognition | Guangzhi Sun, Chao Zhang, Phil Woodland | Best Student Paper |
5 | Where's the Uh, Hesitation? The Interplay between Filled Pause Location, Speech Rate and Fundamental Frequency in Perception of Confidence | Ambika Kirkland, Harm Lameris, Éva Székely, Joakim Gustafson | |
6 | Deep Residual Spiking Neural Network for Keyword Spotting in Low-Resource Settings | Qu Yang, Qi Liu, Haizhou Li | |
7 | Attentive Feature Fusion for Robust Speaker Verification | Bei Liu, Zhengyang Chen, Yanmin Qian | |
8 | Robust Self-Supervised Audio-Visual Speech Recognition | Bowen Shi, Wei-Ning Hsu, Abdelrahman Mohamed | |
9 | Distance-Based Sound Separation | Katharine Patterson, Kevin Wilson, Scott Wisdom, John R. Hershey | |
10 | Investigating Perception of Spoken Dialogue Acceptability through Surprisal | Sarenne Carrol Wallbridge, Catherine Lai, Peter Bell | Best Student Paper |
11 | Complex-Valued Time-Frequency Self-Attention for Speech Dereverberation | Vinay Kothapally, John H.L. Hansen | |
12 | Learning Audio-Text Agreement for Open-vocabulary Keyword Spotting | Hyeon-Kyeong Shin, Hyewon Han, Doyeon Kim, SooWhan Chung, Hong-Goo Kang |
- 2021
Dates: August 30 to September 3, 2021
Location: Brno, Czech Republic (virtual conference)
2277 papers submitted, of which 1990 were reviewed; 963 accepted
No. | Best Student Paper (Nominee) | Authors | Award |
---|---|---|---|
1 | StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion | Yinghao Li, Ali Zare, Nima Mesgarani | Best Student Paper |
2 | Stochastic Process Regression for Cross-Cultural Speech Emotion Recognition | Mani Kumar Tellamekala, Enrique Sanchez, Georgios Tzimiropoulos, Timo Giesbrecht, Michel Valstar | |
3 | Multilingual Transfer of Acoustic Word Embeddings Improves When Training on Languages Related to the Target Zero-Resource Language | Christiaan Jacobs, Herman Kamper | |
4 | Effective Phase Encoding for End-to-end Speaker Verification | Junyi Peng, Xiaoyang Qu, Rongzhi Gu, Jianzong Wang, Jing Xiao, Lukas Burget, Jan Černocký | |
5 | Dialogue Situation Recognition for Everyday Conversation Using Multimodal Information | Yuya Chiba, Ryuichiro Higashinaka | |
6 | Audio Retrieval with Natural Language Queries | Andreea-Maria Oncescu, A. Sophia Koepke, João Henriques, Zeynep Akata, Samuel Albanie | |
7 | Optimally Encoding Inductive Biases into the Transformer Improves End-to-End Speech Translation | Piyush Vyas, Anastasia Kuznetsova, Donald Williamson | Best Student Paper |
8 | WSRGlow: A Glow-based Waveform Generative Model for Audio Super-Resolution | Kexun Zhang, Yi Ren, Changliang Xu, Zhou Zhao | |
9 | Acoustic Indicators of Speech Motor Coordination in Adults With and Without Traumatic Brain Injury | Tanya Talkar, Nancy Solomon, Douglas Brungart, Stefanie Kuchinsky, Megan Eitel, Sara Lippa, Tracey Brickell, Louis French, Rael Lange, Thomas Quatieri | |
10 | A Discriminative Entity-Aware Language Model for Virtual Assistants | Mandana Saebi, Ernest Pusateri, Aaksha Meghawat, Christophe Van Gysel | |
11 | An Automatic, Simple Ultrasound Biofeedback Parameter for Distinguishing Accurate and Misarticulated Rhotic Syllables | Sarah Li, Colin Annand, Sarah Dugan, Sarah Schwab, Kathryn Eary, Michael Swearengen, Sarah Stack, Suzanne Boyce, Michael Riley, T. Mast | |
12 | Active Speaker Detection as a Multi-Objective Optimization with Uncertainty-based Multimodal Fusion | Baptiste Pouthier, Laurent Pilati, Leela Gudupudi, Charles Bouveyron, Frederic Precioso | |
13 | Exploring the Potential of Lexical Paraphrases for Mitigating Noise-Induced Comprehension Errors | Anupama Chingacham, Vera Demberg, Dietrich Klakow | Best Student Paper |
14 | Automatic Analysis of the Emotional Content of Speech in Daylong Child-Centered Recordings from a Neonatal Intensive Care Unit | Einari Vaaras, Sari Ahlqvist-Björkroth, Konstantinos Drossos, Okko Räsänen | |
15 | Leveraging Real-time MRI for Illuminating Linguistic Velum Action | Miran Oh, Dani Byrd, Shrikanth Narayanan |
- 2020
Dates: October 25-29, 2020
Location: Shanghai, China (virtual conference)
2258 papers submitted, of which 2103 were reviewed; 1021 accepted
No. | Best Student Paper (Nominee) | Authors | Award |
---|---|---|---|
1 | Nonlinear ISA with Auxiliary Variables for Learning Speech Representations | Amrith Setlur, Barnabas Poczos, Alan W Black | |
2 | Low Latency End-to-End Streaming Speech Recognition with a Scout Network | Chengyi Wang, Yu Wu, Liang Lu, Shujie Liu, Jinyu Li, Guoli Ye, Ming Zhou | |
3 | A Differentiable Perceptual Audio Metric Learned from Just Noticeable Differences | Pranay Manocha, Adam Finkelstein, Richard Zhang, Nicholas Bryan, Gautham Mysore, Zeyu Jin | |
4 | Speaker discrimination in humans and machines: Effects of speaking style variability | Amber Afshan, Jody Kreiman, Abeer Alwan | |
5 | FaceFilter: Audio-visual speech separation using still images | Soo-Whan Chung, Soyeon Choe, Joon Son Chung, Hong-Goo Kang | Best Student Paper |
6 | ECAPA-TDNN: Emphasized Channel Attention, Propagation and Aggregation in TDNN Based Speaker Verification | Brecht Desplanques, Jenthe Thienpondt, Kris Demuynck | |
7 | Distilling the Knowledge of BERT for Sequence-to-Sequence ASR | Hayato Futami, Hirofumi Inaguma, Sei Ueno, Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara | |
8 | Vector-Quantized Autoregressive Predictive Coding | Yu-An Chung, Hao Tang, James Glass | Best Student Paper |
9 | Automatic Detection of Phonological Errors in Child Speech Using Siamese Recurrent Autoencoder | Si-Ioi Ng, Tan Lee | |
10 | Phonetic Accommodation of L2 German Speakers to the Virtual Language Learning Tutor Mirabella | Iona Gessinger, Bernd Möbius, Bistra Andreeva, Eran Raveh, Ingmar Steiner | Best Student Paper |
11 | Abstractive Spoken Document Summarization using Hierarchical Model with Multi-stage Attention Diversity Optimization | Potsawee Manakul, Mark Gales, Linlin Wang |
- 2019
Dates: September 15-19, 2019
Location: Graz, Austria
2180 papers submitted, of which 1855 were reviewed; 914 accepted
No. | Best Student Paper (Nominee) | Authors | Award |
---|---|---|---|
1 | ReMASC: Realistic Replay Attack Corpus for Voice Controlled Systems | Yuan Gong, Jian Yang, Jacob Huber, Mitchell MacKnight, Christian Poellabauer | |
2 | Curriculum-Based Transfer Learning for an Effective End-to-End Spoken Language Understanding and Domain Portability | Antoine Caubrière, Natalia Tomashenko, Antoine Laurent, Emmanuel Morin, Nathalie Camelin and Yannick Estève | |
3 | Learning Problem-agnostic Speech Representations from Multiple Self-supervised Tasks | Santiago Pascual, Mirco Ravanelli, Joan Serrà, Antonio Bonafonte and Yoshua Bengio | |
4 | Evaluating Near End Listening Enhancement Algorithms in Realistic Environments | Carol Chermaz, Cassia Valentini-Botinhao, Henning Schepker, Simon King | Best Student Paper |
5 | Adversarially Trained End-to-end Korean Singing Voice Synthesis System | Juheon Lee, Hyeong-Seok Choi, Chang-Bin Jeon, Junghyun Koo, Kyogu Lee | Best Student Paper |
6 | The Contribution of Acoustic Features Analysis to Model Emotion Perceptual Process for Language Diversity | Xingfeng Li, Masato Akagi | |
7 | CycleGAN-based Emotion Style Transfer as Data Augmentation for Speech Emotion Recognition | Fang Bao, Michael Neumann, Ngoc Thang Vu | |
8 | Exploiting Visual Features using Bayesian Gated Neural Networks for Disordered Speech Recognition | Shansong Liu, Shoukang Hu, Yi Wang, Jianwei Yu, Rongfeng Su, Xunying Liu, Helen Meng | |
9 | On the Role of Style in Parsing Speech with Neural Models | Trang Tran, Jiahong Yuan, Yang Liu, Mari Ostendorf | |
10 | An Effective Deep Embedding Learning Architecture for Speaker Verification | Yiheng Jiang, Yan Song, Ian McLoughlin, Zhifu Gao, Lirong Dai | |
11 | Towards the Speech Features of Mild Cognitive Impairment: Universal Evidence from Structured and Unstructured Connected Speech of Chinese | Tianqi Wang, Chongyuan Lian, Jingshen Pan, Quanlei Yan, Feiqi Zhu, Manwa L. Ng, Lan Wang, Nan Yan | |
12 | Semi-supervised Sequence-to-sequence ASR using Unpaired Speech and Text | Murali Karthick Baskar, Shinji Watanabe, Ramón Astudillo, Takaaki Hori, Lukas Burget, Jan Černocký | |
13 | Language Modeling with Deep Transformers | Kazuki Irie, Albert Zeyer, Ralf Schlüter, Hermann Ney | Best Student Paper |
- 2018
Dates: September 2-6, 2018
Location: Hyderabad, India
1668 papers submitted; 749 accepted
No. | Best Student Paper (Nominee) | Authors | Award |
---|---|---|---|
1 | Automatic Glottis Localization and Segmentation in Stroboscopic Videos Using Deep Neural Network | Achuth Rao M V, Rahul Krishnamurthy, Pebbili Gopikishore, Veeramani Priyadharshini, Prasanta Kumar Ghosh | |
2 | Effects of User Controlled Speech Rate on Intelligibility in Noisy Environments | John S. Novak, III, Robert V. Kenyon | |
3 | An Interlocutor-Modulated Attentional LSTM for Differentiating between Subgroups of Autism Spectrum Disorder | Yun-Shao Lin, Susan Shur-Fen Gau, Chi-Chun Lee | |
4 | An Improved Deep Embedding Learning Method for Short Duration Speaker Verification | Zhifu Gao, Yan Song, Ian McLoughlin, Wu Guo, Lirong Dai | |
5 | Joint Localization and Classification of Multiple Sound Sources Using a Multi-task Neural Network | Weipeng He, Petr Motlicek, Jean-Marc Odobez | |
6 | A Priori SNR Estimation Based on a Recurrent Neural Network for Robust Speech Enhancement | Yangyang Xia and Richard M. Stern | |
7 | Multi-target Voice Conversion without Parallel Data by Adversarially Learning Disentangled Audio Representations | Ju-chieh Chou, Cheng-chieh Yeh, Hung-yi Lee and Lin-shan Lee | |
8 | Multi-Modal Data Augmentation for End-to-End ASR | Adithya Renduchintala, Shuoyang Ding, Matthew Wiesner and Shinji Watanabe | Best Student Paper |
9 | A GPU-based WFST Decoder with Exact Lattice Generation | Zhehuai Chen, Justin Luitjens, Hainan Xu, Yiming Wang, Daniel Povey and Sanjeev Khudanpur | |
10 | User-centric Evaluation of Automatic Punctuation in ASR Closed Captioning | Ákos Máté Tündik, György Szaszák, Gábor Gosztolya and András Bek | |
11 | Detecting Depression with Audio/Text Sequence Modeling of Interview | Tuka Alhanai, Mohammad Ghassemi and James Glass | Best Student Paper |
12 | Joint Learning of Interactive Spoken Content Retrieval and Trainable User Simulator | Pei-Hung Chung, Kuan Tung, Ching-Lun Tai and Hung-Yi Lee | Best Student Paper |
- 2017
Dates: August 20-24, 2017
Location: Stockholm, Sweden
1711 papers submitted, of which 1582 were reviewed; 799 accepted
No. | Best Student Paper (Nominee) | Authors | Award |
---|---|---|---|
1 | Relating Unsupervised Word Segmentation to Reported Vocabulary Acquisition | Elin Larsen, Alejandrina Cristia, Emmanuel Dupoux | |
2 | Mind the Peak: When Museum Is Temporarily Understood as Musical in Australian English | Katharina Zahner, Heather Kember, Bettina Braun | |
3 | Jointly Predicting Arousal, Valence and Dominance with Multi-Task Learning | Srinivas Parthasarathy, Carlos Busso | |
4 | VoxCeleb: A Large-scale Speaker Identification Dataset | Arsha Nagrani, Joon Son Chung, Andrew Zisserman | Best Student Paper |
5 | Hidden Markov Model Variational Autoencoder for Acoustic Unit Discovery | Janek Ebbers, Jahn Heymann, Lukas Drude, Thomas Glarner, Reinhold Haeb-Umbach, Bhiksha Raj | Best Student Paper |
6 | Tight Integration of Spatial and Spectral Features for BSS with Deep Clustering Embeddings | Lukas Drude, Reinhold Haeb-Umbach | |
7 | VCV Synthesis using Task Dynamics to Animate a Factor-based Articulatory Model | Rachel Alexander, Tanner Sorensen, Asterios Toutios, Shrikanth Narayanan | |
8 | CTC in the Context of Generalized Full-Sum HMM Training | Albert Zeyer, Eugen Beck, Ralf Schlüter, Hermann Ney | |
9 | Residual Memory Networks in Language Modeling: Improving the Reputation of Feed-Forward Networks | Karel Beneš, Murali Baskar, Lukáš Burget | Best Student Paper |
10 | Experiments in Character-level Neural Network Models for Punctuation | William Gale, Sarangarajan Parthasarathy | |
11 | Entrainment in Multi-Party Spoken Dialogues at Multiple Linguistic Levels | Zahra Rahimi, Anish Kumar, Diane Litman, Susannah Paletz, Mingzhi Yu | |
12 | Topic Identification for Speech without ASR | Chunxi Liu, Jan Trmal, Matthew Wiesner, Craig Harman, Sanjeev Khudanpur |
- 2016
Dates: September 8-12, 2016
Location: San Francisco, USA
1644 papers submitted, of which 1585 were reviewed; 779 accepted
No. | Best Student Paper (Nominee) | Authors | Award |
---|---|---|---|
1 | Characterizing Vocal Tract Dynamics Across Speakers Using Real-Time MRI | Tanner Sorensen, Asterios Toutios, Louis Goldstein, Shrikanth Narayanan | Best Student Paper |
2 | Towards Machine Comprehension of Spoken Content: Initial TOEFL Listening Comprehension Test by Machine | Bo-Hsiang Tseng, Sheng-Syun Shen, Hung-Yi Lee, Lin-Shan Lee | |
3 | A DNN-HMM Approach to Story Segmentation | Jia Yu, Xiong Xiao, Lei Xie, Eng Siong Chng, Haizhou Li | |
4 | Head Motion Generation with Synthetic Speech: A Data Driven Approach | Najmeh Sadoughi, Carlos Busso | |
5 | The Rhythmic Constraint on Prosodic Boundaries in Mandarin Chinese Based on Corpora of Silent Reading and Speech Perception | Wei Lai, Jiahong Yuan, Ya Li, Xiaoying Xu, Mark Liberman | Best Student Paper |
6 | Is Deception Emotional? An Emotion-Driven Predictive Approach | Shahin Amiriparian, Jouni Pohjalainen, Erik Marchi, Sergey Pugachevskiy, Björn Schuller | |
7 | Probabilistic Approach Using Joint Clean and Noisy I-Vectors Modeling for Speaker Recognition | Waad Ben Kheder, Driss Matrouf, Ajili Moez, Jean-François Bonastre | |
8 | Majorisation-Minimisation Based Optimisation of the Composite Autoregressive System with Application to Glottal Inverse Filtering | Lauri Juvela, Hirokazu Kameoka, Manu Airaksinen, Junichi Yamagishi, Paavo Alku | |
9 | Local Sparsity Based Online Dictionary Learning for Environment-Adaptive Speech Enhancement with Nonnegative Matrix Factorization | Kwang Myung Jeon, Hong Kook Kim | |
10 | GlottDNN - A Full-Band Glottal Vocoder for Statistical Parametric Speech Synthesis | Manu Airaksinen, Bajibabu Bollepalli, Lauri Juvela, Zhizheng Wu, Simon King, Paavo Alku | Best Student Paper |
11 | Stimulated Deep Neural Network for Speech Recognition | Chunyang Wu, Penny Karanasou, Mark Gales, Khe Chai Sim | |
12 | Contextual Prediction Models for Speech Recognition | Yoni Halpern, Keith Hall, Vlad Schogol, Michael Riley, Brian Roark, Gleb Skobeltsyn, Martin Baeuml |
- 2015
Dates: September 6-10, 2015
Location: Dresden, Germany
1458 papers reviewed; 746 accepted
No. | Best Student Paper (Nominee) | Authors | Award |
---|---|---|---|
1 | Low-Frequency Components Analysis in Running Speech for the Automatic Detection of Parkinson's Disease | T. Villa-Cañas, J.D. Arias-Londoño, J.R. Orozco-Arroyave, J.F. Vargas-Bonilla, E. Nöth | |
2 | Speech Planning in 4-Year-Old Children Versus Adults: Acoustic and Articulatory Analyses | Guillaume Barbier, Pascal Perrier, Lucie Ménard, Yohan Payan, Mark Tiede, Joseph Perkell | |
3 | Using Representation Learning and Out-of-Domain Data for a Paralinguistic Speech Task | Benjamin Milde, Chris Biemann | |
4 | Under-Resourced Speech Recognition Based on the Speech Manifold | Reza Sahraeian, Dirk Van Compernolle, Febe de Wet | |
5 | Estimation of the Air-Tissue Boundaries of the Vocal Tract in the Mid-Sagittal Plane from Electromagnetic Articulograph Data | Satyabrata Parida, Pattem Ashok Kumar, Prasanta Kumar Ghosh | |
6 | Adapting Machine Translation Models toward Misrecognized Speech with Text-To-Speech Pronunciation Rules and Acoustic Confusability | Nicholas Ruiz, Qin Gao, William Lewis, Marcello Federico | Best Student Paper |
7 | A Universal VAD Based on Jointly Trained Deep Neural Networks | Qing Wang, Jun Du, Xiao Bao, Zi-Rui Wang, Li-Rong Dai, Chin-Hui Lee | |
8 | Semi-supervised Maximum Mutual Information Training of Deep Neural Network Acoustic Models | Vimal Manohar, Daniel Povey, Sanjeev Khudanpur | |
9 | Salient Dimensions in Implicit Phonotactic Learning | Elise Michon, Emmanuel Dupoux, Alejandrina Cristia | |
10 | A Time Delay Neural Network Architecture for Efficient Modeling of Long Temporal Contexts | Vijayaditya Peddinti, Daniel Povey, Sanjeev Khudanpur | Best Student Paper |
11 | Representing Nonspeech Audio Signals through Speech Classification Models | Huy Phan, Lars Hertel, Marco Maass, Radoslaw Mazur, Alfred Mertins | |
12 | Objective Intelligibility Assessment of Text-To-Speech Systems through Utterance Verification | Raphael Ullmann, Ramya Rasipuram, Mathew Magimai.-Doss, Hervé Bourlard | Best Student Paper |
- 2014
Dates: September 14-18, 2014
Location: Singapore
1173 papers reviewed; 614 accepted
No. | Best Student Paper (Nominee) | Authors | Award |
---|---|---|---|
1 | Investigating the Effect of F0 and Vocal Intensity on Harmonic Magnitudes: Data from Laryngeal High-Speed Video Endoscopy | Gang Chen, Soo Jin Park, Jody Kreiman, Abeer Alwan | |
2 | Automatic Estimation of the Lip Radiation Effect in Glottal Inverse Filtering | Manu Airaksinen, Tom Bäckström, Paavo Alku | |
3 | Modeling Therapist Empathy through Prosody in Drug Addiction Counseling | Bo Xiao, Daniel Bone, Maarten Van Segbroeck, Zac Imel, David Atkins, Panayiotis Georgiou, Shrikanth Narayanan | |
4 | Intrinsic Spectral Analysis Based on Temporal Context Features for Query by Example Spoken Term Detection | Peng Yang, Cheung-Chi Leung, Lei Xie, Bin Ma, Haizhou Li | |
5 | Direct F0 Control of an Electrolarynx Based on Statistical Excitation Feature Prediction and Its Evaluation through Simulation | Tomoki Toda, Graham Neubig, Sakriani Sakti, Satoshi Nakamura, Ko Tanaka | |
6 | Towards a Neural Measure of Perceptual Distance - Classification of Electroencephalographic Responses to Synthetic Vowels | Manson Cheuk-Man Fong, James William Minett, Thierry Blu, William Shi-Yuan Wang | |
7 | Exploring Modulation Spectrum Features for Speech-Based Depression Level Classification | Elif Bozkurt, Orith Toledo-Ronen, Alexander Sorin, Ron Hoory | |
8 | Lexical Representation of Consonant, Vowels and Tones in Early Childhood | Hwee Hwee Goh, Charlene Fu, Kheng Hui Yeo | |
9 | Speech Synthesis in Various Communicative Situations: Impact of Pronunciation Variations | Sandrine Brognaux, Benjamin Picart, Thomas Drugman | Best Student Paper |
10 | Adaptive Speech Recognition and Dialogue Management for Users with Speech Disorders | Iñigo Casanueva, Heidi Christensen, Thomas Hain, Phil Green | |
11 | Word-level Invariant Representations from Acoustic Waveforms | Stephen Voinea, Chiyuan Zhang, Georgios Evangelopoulos, Lorenzo Rosasco, Tomaso Poggio | Best Student Paper |
12 | Acoustic Modeling with Deep Neural Networks Using Raw Time Signal for LVCSR | Zoltán Tüske, Pavel Golik, Ralf Schlüter, Hermann Ney | Best Student Paper |
- 2013
Dates: August 25-29, 2013
Location: Lyon, France
No. | Best Student Paper | Authors |
---|---|---|
1 | Speaker and Noise Independent Voice Activity Detection | François Germain, Dennis Sun, Gautham Mysore |
2 | A Two-Step Technique for MRI Audio Enhancement Using Dictionary Learning and Wavelet Packet Analysis | Colin Vaz, Vikram Ramanarayanan, Shrikanth Narayanan |
3 | Using Text and Acoustic Features to Diagnose Progressive Aphasia and Its Subtypes | Kathleen Fraser, Frank Rudzicz, Elizabeth Rochon |
- 2012
Dates: September 9-13, 2012
Location: Portland, USA
No. | Best Student Paper | Authors |
---|---|---|
1 | Discriminatively Learning Factorized Finite State Pronunciation Models from Dynamic Bayesian Networks | Preethi Jyothi, Eric Fosler-Lussier, Karen Livescu |
2 | MAP Estimation of Whole-Word Acoustic Models with Dictionary Priors | Keith Kintzley, Aren Jansen, Hynek Hermansky |
3 | Age Estimation from Telephone Speech using i-vectors | Mohamad Hasan Bahari, Mitchell McLaren, Hugo Van hamme, David Van Leeuwen |
- 2011
Dates: August 27-31, 2011
Location: Florence, Italy
No. | Best Student Paper | Authors |
---|---|---|
1 | Low-Frequency Bandwidth Extension of Telephone Speech Using Sinusoidal Synthesis and Gaussian Mixture Model | Hannu Pulakka, Ulpu Remes, Santeri Yrttiaho, Kalle Palomäki, Mikko Kurimo, Paavo Alku |
2 | One-to-Many Voice Conversion Based on Tensor Representation of Speaker Space | Daisuke Saito, Keisuke Yamamoto, Nobuaki Minematsu, Keikichi Hirose |
3 | Modelling Novelty Preference in Word Learning | Maarten Versteegh, Louis ten Bosch, Lou Boves |
- 2010
Dates: September 26-30, 2010
Location: Chiba, Japan
No. | Best Student Paper | Authors |
---|---|---|
1 | Did you say susi or shushi? Measuring the Emergence of Robust Fricative Contrasts in English- and Japanese-Acquiring Children | Jeffrey J. Holliday, Mary E. Beckman, Chanelle Mays |
2 | Using Non-Native Error Patterns to Improve Pronunciation Verification | Joost van Doremalen, Catia Cucchiarini, Helmer Strik |
3 | Reliable Tracking Based on Speech Sample Salience of Vocal Cycle Length Perturbations | Christophe Mertens, Francis Grenez, Lise Crevier-Buchman, Jean Schoentgen |
- 2009
Dates: September 6-10, 2009
Location: Brighton, UK
No. | Best Student Paper | Authors |
---|---|---|
1 | Sequencing of Articulatory Gestures using Cost Optimization | Juraj Simko, F. Cummins |
2 | A Deterministic plus Stochastic Model of the Residual Signal for Improved Parametric Speech Synthesis | Thomas Drugman, G. Wilfart, T. Dutoit |
3 | On the Semi-Supervised Learning of Multi-Layered Perceptrons | Jonathan Malkin, Amarnag Subramanya, J. Bilmes |
- 2008
Dates: September 22-26, 2008
Location: Brisbane, Australia
No. | Best Student Paper | Authors |
---|---|---|
1 | On the Equivalence of Gaussian and Log-linear HMMs | Georg Heigold, Lehnen, P., Schlueter, R., Ney, H. |
2 | Combining Continuous Progressive Model Adaptation and Factor Analysis for Speaker Verification | Mitchell McLaren, Matrouf, D., Vogt R., Bonastre, J. |
3 | Effect of Intonational Phrase Boundaries on Pitch-Accented Syllables in American English | Yen-Liang Shue, Shattuck-Hufnagel, S., Iseli, M., Jun S., Veilleux N., Alwan, A. |
- 2007
Dates: August 27-31, 2007
Location: Antwerp, Belgium
No. | Best Student Paper | Authors |
---|---|---|
1 | Speech Recognition Techniques for a Sign Language Recognition System | Philippe Dreuw, David Rybach, Thomas Deselaers, Morteza Zahedi, Hermann Ney |
2 | An Empirical Investigation of the Nonuniqueness in the Acoustic-to-Articulatory Mapping | Chao Qin, Miguel A. Carreira-Perpinan |
- 2006
Dates: September 17-21, 2006
Location: Pittsburgh, USA
No. | Best Student Paper | Authors |
---|---|---|
1 | Detecting Question-Bearing Turns in Spoken Tutorial Dialogues | Jackson Liscombe, Jennifer J. Venditti, Julia Hirschberg |
2 | Soft Margin Estimation of Hidden Markov Model Parameters | Jinyu Li, Ming Yuan, Chin-Hui Lee |
3 | Acoustic Cues for the Classification of Regular and Irregular Phonation | Kushan Surana, Janet Slifka |
- 2005
Dates: September 4-8, 2005
Location: Lisbon, Portugal
No. | Best Student Paper | Authors |
---|---|---|
1 | On the Integration of Speech Recognition and Statistical Machine Translation | E. Matusov, S. Kanthak, H. Ney |
- 2004
Dates: October 4-8, 2004
Location: Jeju Island, South Korea
No. | Best Student Paper | Authors |
---|---|---|
1 | Hot Discussion or Frosty Dialogue? Towards a Temperature Metric for Conversational Interactivity | Peter Reichl, Florian Hammer |