Haşim Sak
Ph.D. Candidate
Department of Computer Engineering
34342 Bebek, İstanbul, Turkey
Phone: +90 212 359 7095
Fax: +90 212 287 2461
I am a Ph.D. candidate in the Department of Computer Engineering at Boğaziçi University. My thesis advisors are Tunga Güngör and Murat Saraçlar. I have a B.S. in Computer Science from Bilkent University.
My current research focuses on developing a large vocabulary automatic speech recognition system for Turkish. I am concentrating on the language modeling and speech decoding challenges associated with agglutinative languages and rich morphology.
In my M.S. thesis here, I worked on developing "A corpus-based concatenative speech synthesis system for Turkish". (BibTeX)
I have industrial research experience in speech technologies, specifically speech recognition and text-to-speech. For more information, see my CV.
Research Interests
Speech and Language Processing, Computational Linguistics:
Automatic Speech Recognition, Speech Synthesis, Statistical Language Modeling,
Morphological Parsing, Spelling Correction, Morphological Disambiguation.
Language Resources & Software
- A finite-state stochastic morphological parser for Turkish (Linux-32bit) (Linux-64bit).
- An averaged perceptron-based morphological disambiguator for Turkish text (download).
- A text corpus compiled from the web (download).
- An averaged perceptron-based morphological disambiguator for Turkish text. This version works with the output of Kemal Oflazer's parser (download).
Publications
- Haşim Sak, Tunga Güngör, and Murat Saraçlar: Resources for Turkish Morphological Processing. Language Resources and Evaluation, Vol. 45, No. 2, pp. 249–261, 2011.
- Haşim Sak, Murat Saraçlar, and Tunga Güngör: On-the-fly Lattice Rescoring for Real-time Automatic Speech Recognition. INTERSPEECH 2010, pp. 2450-2453, 2010.
- Haşim Sak, Murat Saraçlar, and Tunga Güngör: Morphology-based and Sub-word Language Modeling for Turkish Speech Recognition. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5402–5405, 2010.
- Haşim Sak, Murat Saraçlar, and Tunga Güngör: Integrating Morphology into Automatic Speech Recognition. IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU), 2009, pp. 354–358.
- Ebru Arısoy, Doğan Can, Sıddıka Parlak, Haşim Sak, and Murat Saraçlar: Turkish Broadcast News Transcription and Retrieval. IEEE Transactions on Audio, Speech & Language Processing, Vol. 17, No. 5, pp. 874–883, 2009.
- Haşim Sak, Tunga Güngör, and Murat Saraçlar. Turkish Language Resources: Morphological Parser, Morphological Disambiguator and Web Corpus. In GoTAL 2008, volume 5221 of LNCS, 2008, pages 417-427. Springer. (springerlink) (BibTeX)
- E. Arısoy, M. Kurimo, M. Saraçlar, T. Hirsimäki, J. Pylkkönen, T. Alumäe and H. Sak. 2008. Statistical Language Modeling for Automatic Speech Recognition of Agglutinative Languages. Speech Recognition; Techniques, Technology and Applications, I-Tech Education and Publishing, Vienna, Austria.
- Tuncay Aksungurlu, Sıddıka Parlak, Haşim Sak, Murat Saraçlar. Türkçe Haber Programları için Dil Modelleme Yaklaşımlarının Karşılaştırılması [A Comparison of Language Modeling Approaches for Turkish News Programs]. IEEE 16th Signal Processing, Communication and Applications Conference (SİU). Didim, Turkey, 2008. (pdf)
- Ebru Arısoy, Haşim Sak, and Murat Saraçlar. Language modeling for automatic Turkish broadcast news transcription. In Proceedings of Interspeech 2007 - Eurospeech, pp. 2381-2384, 2007. (pdf) (BibTeX)
- Haşim Sak, Tunga Güngör, and Murat Saraçlar. Morphological disambiguation of Turkish text with perceptron algorithm. In CICLing 2007, volume LNCS 4394, pages 107-118, 2007. (springerlink) (pdf) (BibTeX)
- Haşim Sak, Tunga Güngör, and Yaşar Safkan. A corpus-based concatenative speech synthesis system for Turkish. Turkish Journal of Electrical Engineering and Computer Sciences, 14(2):209-223, 2006. (pdf) (BibTeX) (Sample TTS output)
- Haşim Sak, Tunga Güngör, and Yaşar Safkan. Generation of synthetic speech from Turkish text. In 13th European Signal Processing Conference (EUSIPCO 2005), 2005. (pdf) (BibTeX)