Senior researcher and project manager at the Telecommunications Research Center Vienna (FTW). |
|
April 2012: Call for Participation: ACM 3rd International Symposium on Facial Analysis and Animation (FAA 2012), Vienna, 21 September 2012 |
|
Projects |
|
Ongoing Research Projects |
|
| [FWF: P23821-N23] AMTV - Acoustic modeling and transformation of varieties for speech synthesis (as principal investigator) | |
| [FWF: P22890-N23] AVDS - Adaptive Audio-Visual Dialect Synthesis (as principal investigator) | |
Completed Research Projects |
|
| [EU-COST] Cost 2102: Cross-Modal Analysis of Verbal and Non-verbal Communication | |
| [WWTF] VSDS - Viennese Sociolect and Dialect Synthesis (as principal investigator) | |
| [COMET] HI-MONI - Highway Monitoring (as project manager) | |
| [T-LABS] TIDE - Testbed for Interactive Dialog System Evaluation (as project manager) | |
| [EU-FP6] AMI - Augmented Multiparty Interaction | |
| [K-PLUS] MONA - Mobile Multimodal Next Generation Applications | |
| [K-PLUS] Service Platform and Interoperability | |
| [K-PLUS] Speech and More | |
Development Projects |
|
| September 2011: Development of HMM-based versions of two adapted Austrian German voices (mpu, kep) for the CSTR HTS Voice Library (freely available for academic research). | |
| March 2011: Development of an HMM-based version of the Austrian German voice "Leopold" (leo), which was added to the CSTR HTS Voice Library (freely available for academic research). | |
| September 2010: "Leopold" available for Windows and Mac OS X from the webshop of CereProc, UK. | |
| May 2010: Development of "Leopold", the first synthetic voice for Austrian German, together with company partners; the voice was integrated into a web reading service for the website of the City of Vienna (Die Wiener Stimme). | |
| May 2010: Open source release of 3 Viennese voices for the Festival Speech Synthesis System presented at the 7th International conference on Language Resources and Evaluation (LREC) [conference paper]. | |
Publications |
|
| Note concerning IEEE publications: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder (see IEEE copyright policies). | |
Journal articles |
|
| 2012, Phillip L. De Leon, Michael Pucher, Junichi Yamagishi, Inma Hernaez, Ibon Saratxaga, Evaluation of Speaker Verification Security and Detection of Synthetic Speech. IEEE Transactions on Audio, Speech, and Language Processing. (ACCEPTED) | |
| 2010, Michael Pucher, Dietmar Schabus, Junichi Yamagishi, Friedrich Neubarth, Volker Strom, Modeling and interpolation of Austrian German and Viennese dialect in HMM-based speech synthesis. Speech Communication, Volume 52, Issue 2, February 2010, Pages 164-179. [audio samples]. | |
| 2002, Georg Niklfeld, Michael Pucher, Robert Finan, Wolfgang Eckhart, Kombinierte Sprache/Display-Schnittstellen für mobile Datendienste. PIK - Praxis der Informationsverarbeitung und Kommunikation, 25 (4), pages 196-201. | |
Conference and workshop papers |
|
Speech synthesis |
|
| 2012, Dietmar Schabus, Michael Pucher, Gregor Hofer, Speaker-adaptive visual speech synthesis in the HMM-framework. INTERSPEECH 2012 (SUBMITTED). | |
| 2012, Ibon Saratxaga, Inma Hernaez, Michael Pucher, Eva Navas, Inaki Sainz, Perceptual Importance of the Phase Related Information in Speech. INTERSPEECH 2012 (SUBMITTED). | |
| 2012, Michael Pucher, Dietmar Schabus, Gregor Hofer, Nadja Kerschhofer-Puhalo, Sylvia Moosmüller, Regionalizing Virtual Avatars - Towards Adaptive Audio-Visual Dialect Speech Synthesis. CogSys 2012, 5th International Conference on Cognitive Systems, Vienna, Austria, pp. 95. | |
| 2012, Michael Pucher, Nadja Kerschhofer-Puhalo, Dietmar Schabus, Sylvia Moosmüller, Gregor Hofer, Sprachressourcen für adaptive Sprachsynthese von Dialekten. 7. Kongress der Internationalen Gesellschaft für Dialektologie und Geolinguistik (SIDG), Vienna, Austria. | |
| 2012, Michael Pucher, Dietmar Schabus, Gregor Hofer, Von Wienerisch zu Österreichisch und wieder zurück - Ein Berechnungsverfahren zur Realisierung eines Varietätenreglers. 7. Kongress der Internationalen Gesellschaft für Dialektologie und Geolinguistik (SIDG), Vienna, Austria. | |
| 2012, Dietmar Schabus, Michael Pucher, Gregor Hofer, Building a synchronous corpus of acoustic and 3D facial marker data for adaptive audio-visual speech synthesis. LREC 2012, Istanbul, Turkey. | |
| 2011, Michael Pucher, Nadja Kerschhofer-Puhalo, Dietmar Schabus, Phone set selection for HMM-based dialect speech synthesis. 1st Workshop on Algorithms and Resources for Modelling of Dialects and Language Varieties (DIALECTS 2011). EMNLP 2011: Conference on Empirical Methods in Natural Language Processing, Edinburgh, UK, pp. 65-69. | |
| 2011, Dietmar Schabus, Michael Pucher, Gregor Hofer, Simultaneous Speech and Animation Synthesis. Poster at 38th International Conference and Exhibition on Computer Graphics and Interactive Techniques (SIGGRAPH 2011), Vancouver, Canada. [video] | |
| 2010, Michael Pucher, Dietmar Schabus, Junichi Yamagishi, Synthesis of fast speech with interpolation of adapted HSMMs and its evaluation by blind and sighted listeners. 11th Annual Conference of the International Speech Communication Association (INTERSPEECH 2010), Makuhari, Japan, pp. 2186-2189. | |
| 2010, Michael Pucher, Friedrich Neubarth, Volker Strom, Sylvia Moosmüller, Gregor Hofer, Christian Kranzler, Gudrun Schuchmann, Dietmar Schabus, Resources for speech synthesis of Viennese varieties. In Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC), Valletta, Malta. [presentation] | |
| 2009, Michael Pucher, Friedrich Neubarth, Volker Strom, Optimizing phonetic encoding for Viennese dialect unit selection speech synthesis. COST 2102 conference, Dublin, 2009. | |
| 2009, Christian Kranzler, Franz Pernkopf, Rudolf Muhr, Michael Pucher, Friedrich Neubarth, Text-to-Speech Engine with Austrian German Corpus. In Proceedings of the XIII International conference Speech and Computer (SPECOM 2009), St. Petersburg, Russia. | |
| 2008, Michael Pucher, Gudrun Schuchmann, Peter Fröhlich, Regionalized Text-to-Speech Systems: Persona Design and Application Scenarios. In Lecture Notes in Artificial Intelligence (LNAI), volume 5398, pages 216-222. COST Action 2102 School, Vietri sul Mare, Italy. | |
| 2008, Friedrich Neubarth, Michael Pucher, Christian Kranzler, Modeling Austrian dialect varieties for TTS. In Proceedings of the 9th Annual Conference of the International Speech Communication Association (INTERSPEECH 2008), pages 1877-1880, Brisbane, Australia. | |
| 2005, Michael Pucher, Peter Fröhlich, A user study on the influence of mobile device class, synthesis method, data rate and lexicon on speech synthesis quality. In Proceedings of the 9th European Conference on Speech Communication and Technology (EUROSPEECH 2005), pages 2501-2504, Lisboa, Portugal. | |
| 2003, Michael Pucher, Friedrich Neubarth, Erhard Rank, Georg Niklfeld, Qi Guan, Combining non-uniform unit selection with diphone based synthesis. In Proceedings of the 8th European Conference on Speech Communication and Technology (EUROSPEECH 2003), pages 1329-1332, Geneva, Switzerland. | |
Speaker recognition |
|
| 2011, Phillip L. De Leon, Inma Hernaez, Ibon Saratxaga, Michael Pucher, Junichi Yamagishi, Detection of synthetic speech for the problem of imposture. In Proceedings of the 36th International Conference on Acoustics, Speech and Signal Processing (ICASSP), Prague, Czech Republic, pp. 4844-4847. | |
| 2010, Phillip L. De Leon, Michael Pucher, Junichi Yamagishi, Evaluation of the Vulnerability of Speaker Verification to Synthetic Speech. In Proceedings of Odyssey 2010 - The Speaker and Language Recognition Workshop, Brno, Czech Republic. | |
| 2010, Phillip L. De Leon, Vijendra Raj Apsingekar, Michael Pucher, Junichi Yamagishi, Revisiting the security of speaker verification systems against imposture using synthetic speech. In Proceedings of the 35th International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Dallas, USA. | |
Sensor fusion |
|
| 2011, Dietmar Schabus, Thomas Zemen, Michael Pucher, Distributed Field Estimation Algorithms in Vehicular Sensor Networks. IEEE 73rd Vehicular Technology Conference (VTC2011-Spring), Budapest, Hungary. | |
| 2010, Michael Pucher, Dietmar Schabus, Peter Schallauer, Yuriy Lypetskyy, Franz Graf, Harald Rainer, Michael Stadtschnitzer, Sabine Sternig, Josef Birchbauer, Wolfgang Schneider, Bernhard Schalko, Multimodal Highway Monitoring for Robust Incident Detection. 13th International IEEE Conference on Intelligent Transportation Systems (ITSC), Madeira, Portugal. | |
Multimodal and spoken dialog systems |
|
| 2010, Michael Pucher, Friedrich Neubarth, Dietmar Schabus, Design and development of spoken dialog systems incorporating speech synthesis of Viennese varieties. In Proceedings of the 12th International Conference on Computers Helping People with Special Needs (ICCHP 2010), Vienna, Austria. | |
| 2007, Michael Pucher, Andreas Türk, Jitendra Ajmera, Natalie Fecher, Phonetic distance measures for speech recognition vocabulary and grammar optimization. In Proceedings of the 3rd congress of the Alps Adria Acoustics Association, Graz, Austria. | |
| 2007, Sebastian Möller, Klaus-Peter Engelbrecht, Michael Pucher, Peter Fröhlich, Lu Huo, Ulrich Heute, Frank Oberle, TIDE: A testbed for interactive spoken dialogue system evaluation. In Proceedings of the XII International conference Speech and Computer (SPECOM 2007), Moscow, Russia. | |
| 2005, Georg Niklfeld, Hermann Anegg, Michael Pucher, Raimund Schatz, Rainer Simon, Florian Wegscheider, Alexander Gassner, Michael Jank, Günther Pospischil, Device independent mobile multimodal user interfaces with the MONA Multimodal Presentation Server. In Proceedings of the Eurescom summit 2005 on Ubiquitous Services and Applications, Heidelberg, Germany. | |
| 2004, Hermann Anegg, Thomas Dangl, Michael Jank, Georg Niklfeld, Michael Pucher, Raimund Schatz, Rainer Simon, Florian Wegscheider, Multimodal interfaces in mobile devices - the MONA project. In Proceedings of the Workshop on Emerging Applications for Wireless and Mobile Access. 13th International World Wide Web Conference (WWW 2004), New York, USA. | |
| 2004, Lynne Baillie, Michael Pucher, Marian Kepesi, A multimodal mobile robot for the home. In Proceedings of the IADIS International Conference e-Society 2004, Avila, Spain. | |
| 2004, Lynne Baillie, Michael Pucher, Marian Kepesi, A supportive multimodal mobile robot for the home. In Lecture Notes in Computer Science (LNCS), volume 3196, pages 375-383. 8th ERCIM Workshop on User Interfaces for All, Vienna, Austria. | |
| 2003, Michael Pucher, Marian Kepesi, Multimodal Mobile Robot Control using Speech Application Language Tags. In Lecture Notes in Computer Science (LNCS), volume 2875, pages 56-64. European Symposium on Ambient Intelligence, Eindhoven, the Netherlands. | |
| 2003, Michael Pucher, Julia Tertyshnaya, Florian Wegscheider, Personal voice call assistant: SIP and VoiceXML in a distributed environment. In Proceedings of the Workshop on Emerging Applications for Wireless and Mobile Access. 12th International World Wide Web Conference (WWW 2003), Budapest, Hungary. | |
| 2002, Georg Niklfeld, Michael Pucher, Robert Finan, Wolfgang Eckhart, Mobile multi-modal data services for GPRS phones and beyond. In Proceedings of the 4th IEEE International Conference on Multimodal Interfaces (ICMI 2002), Pittsburgh, USA. | |
| 2002, Georg Niklfeld, Michael Pucher, Robert Finan, Wolfgang Eckhart, Steps towards multi-modal data services in GPRS and in UMTS or WLAN networks. In Proceedings of the ISCA Tutorial and Research Workshop on Multi-Modal Dialogue in Mobile Environments, Irsee, Germany. | |
| 2001, Georg Niklfeld, Robert Finan, Michael Pucher, Multimodal interface architecture for mobile data services. In Proceedings of the Workshop on Wearable Computing (TCMC 2001), Graz, Austria. | |
| 2001, Georg Niklfeld, Robert Finan, Michael Pucher, Architecture for adaptive multimodal dialog systems based on VoiceXML. In Proceedings of the 7th European Conference on Speech Communication and Technology (EUROSPEECH 2001), pages 2341-2344, Aalborg, Denmark. | |
| 2001, Georg Niklfeld, Robert Finan, Michael Pucher, Component-based multimodal dialog interfaces for mobile knowledge creation. In Proceedings of the Workshop on Human Language Technology and Knowledge Management, pages 103-110. 39th Annual Meeting of the Association for Computational Linguistics (ACL 2001), Toulouse, France. | |
Language modeling for speech recognition |
|
| 2007, Michael Pucher, WordNet-based semantic relatedness measures in automatic speech recognition for meetings. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL 2007), pages 129-132, Prague, Czech Republic. | |
| 2006, Michael Pucher, Yan Huang, Özgür Çetin, Combination of latent semantic analysis based language models for meeting recognition. In Proceedings of the Second IASTED International Conference on Computational Intelligence (CI 2006), pages 465-469, San Francisco, USA. | |
| 2006, Michael Pucher, Yan Huang, Özgür Çetin, Optimization of latent semantic analysis based language model interpolation for meeting recognition. In Proceedings of the 5th Slovenian and 1st International Language Technologies Conference, pages 74-78, Ljubljana, Slovenia. | |
| 2005, Michael Pucher, Performance evaluation of WordNet-based semantic relatedness measures for word prediction in conversational speech. In Proceedings of the 6th International Workshop on Computational Semantics (IWCS 6), pages 332-342, Tilburg, the Netherlands. | |
| 2005, Michael Pucher, Yan Huang, Latent semantic analysis based language models for meetings. 2nd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI 2005), Edinburgh, UK. | |
Book chapters |
|
| 2008, Sebastian Möller, Klaus-Peter Engelbrecht, Michael Pucher, Peter Fröhlich, Lu Huo, Ulrich Heute, Frank Oberle, A New Testbed for Semi-automatic Usability Evaluation and Optimization of Spoken Dialogue Systems. In Usability of Speech Dialog Systems – Listening to the Target Audience (T. Hempel, ed.), pages 81-103, Springer, Berlin, Germany. | |
| 2005, Georg Niklfeld, Michael Pucher, Robert Finan, Wolfgang Eckhart, Wolfgang Minker, A Path to Multimodal Data Services for Telecommunications. In Spoken Multimodal Human-Computer Dialogue in Mobile Environments, pages 149-167, Springer, Netherlands. | |
Theses |
|
| 2007, Michael Pucher, Semantic Similarity in Automatic Speech Recognition for Meetings, Doctoral Thesis, Electrical Engineering, Graz University of Technology. | |
| 2001, Michael Pucher, Formale Wahrheitstheorien nach Alfred Tarski, Diploma Thesis, Department of Philosophy, University of Vienna. | |
Teaching |
|
| Winter semester 2011/2012: Lecture on Cognitive User Interfaces at the Institute of Computer Languages at TU Vienna. | |
| July 2011: Seminar on Audio-Visual Speech Synthesis at the Signal Processing Laboratory (Aholab) of the University of the Basque Country. | |
| Summer semester 2008: Seminar on Speech Synthesis at the Signal Processing and Speech Communication Laboratory (SPSC Lab) at TU Graz. | |
| I am co-supervising the following PhD thesis at FTW: | |
| Dietmar Schabus, Adaptive audio-visual speech synthesis. | |
| I co-supervised the following diploma theses at FTW: | |
| 2009, Dietmar Schabus, Interpolation of Austrian German and Viennese Dialect / Sociolect in HMM-based Speech Synthesis. Diploma thesis. Technische Universität Wien, 2009. | |
| 2008, Christian Kranzler, Text-to-Speech Engine with Austrian German corpus. Diploma thesis. Technische Universität Graz, 2008. | |
| 2008, Michael Bruss, Quantitative und phonetische Analyse von nicht-linguistischen Partikeln in spontan gesprochener Sprache der Wiener Soziolekte. Master's thesis (Magisterarbeit). Universität des Saarlandes, Saarbrücken, 2008. | |
Curriculum Vitae |
|
Education |
|
| 2007: Doctoral degree in Electrical Engineering (with distinction) from Graz University of Technology | |
| 2004 to 2007: Doctoral studies in Electrical Engineering at Graz University of Technology | |
| July 2005: Participation in "European Masters in Language and Speech" summer school, Edinburgh, UK | |
| 2001: Diploma in Philosophy (with distinction) from the University of Vienna | |
| 1995 to 2000: Studies in Computer Science (Computational Logic) at Vienna University of Technology | |
| 1994 to 2001: Studies in Logic, Philosophy, and Mathematics at the University of Vienna | |
| 1994: Studies in Interdisciplinary Art at Wiener Kunstschule | |
| January to April 1992: French language course in Paris | |
| 1984 to 1988: Cook apprenticeship | |
| 1979 to 1984: High school in Judenburg, Austria | |
| 1975 to 1979: Primary school in Trieben, Austria | |
Professional Experience |
|
| Since 2011: Lecturer at Vienna University of Technology (VUT) - Lecture on Cognitive User Interfaces (CUI) at the Institute for Computer Languages. | |
| August to September 2008: Visiting researcher at the Centre for Speech Technology Research (CSTR), University of Edinburgh, UK | |
| Since 2007: Senior Researcher at the Telecommunications Research Center Vienna (FTW) | |
| August 2006: Visiting researcher in the usability group at Deutsche Telekom Laboratories (T-Labs), Berlin, Germany | |
| February to July 2005: Visiting researcher in the speech group at the International Computer Science Institute in Berkeley (ICSI), California | |
| Since 2001: Researcher at the Telecommunications Research Center Vienna (FTW) | |
| 1999 to 2002: Software/database design and development with Java2/Oracle | |
| 1999: Teaching assistant at the Institute for Database Systems and Artificial Intelligence (DBAI) at Vienna University of Technology | |
| 1989 to 1993: Worked as a chef in restaurants in Austria and Liechtenstein | |
Professional activities |
|
Organizing |
|
| FAA 2012 - The ACM 3rd International Symposium on Facial Analysis and Animation | |
| ICAD 2005 Workshop - Combining Speech and Sound in the User Interface | |
Reviewing |
|
| Speech Communication, IEEE SPL, Computer Speech and Language | |
Memberships |
|
| ISCA, IEEE, ACM, EUCOG III, COST2102 | |