Haşim Sak
Ph.D. Candidate
Contact Info:
Department of Computer Engineering
Boğaziçi University
34342 Bebek, İstanbul, Turkey
Email:
Gmail:
Phone: +90 212 359 7095
Fax: +90 212 287 2461
I am a Ph.D. candidate in the Department of Computer Engineering at Boğaziçi University. My thesis advisors are Tunga Güngör and Murat Saraçlar. I have a B.S. in Computer Science from Bilkent University.
My current research focuses on developing a large vocabulary automatic speech recognition system for Turkish. I am concentrating on the language modeling and speech decoding challenges associated with agglutinative languages and rich morphology.
In my M.S. thesis here, I worked on developing "A corpus-based concatenative speech synthesis system for Turkish". (BibTeX)
I have industrial research experience in speech technologies, specifically speech recognition and text-to-speech. For more information, see my CV.
Research Interests
Speech and Language Processing, Computational Linguistics: Automatic Speech Recognition, Speech Synthesis, Statistical Language Modeling, Morphological Parsing, Spelling Correction, Morphological Disambiguation.
Language Resources & Software
- A finite-state stochastic morphological parser for Turkish (Linux-32bit) (Linux-64bit). Please contact me if you need the source version for non-commercial research purposes.
- An averaged perceptron-based morphological disambiguator for Turkish text (download).
- A text corpus compiled from the web (download).
- An averaged perceptron-based morphological disambiguator for Turkish text. This version works with the output of Kemal Oflazer's parser (download).
Publications
- Ebru Arısoy, Doğan Can, Sıddıka Parlak, Haşim Sak, and Murat Saraçlar. Turkish Broadcast News Transcription and Retrieval. IEEE Transactions on Audio, Speech & Language Processing, 2009. (in press)
- Haşim Sak, Tunga Güngör, and Murat Saraçlar. Turkish Language Resources: Morphological Parser, Morphological Disambiguator and Web Corpus. In GoTAL 2008, volume 5221 of LNCS, 2008, pages 417-427. Springer. (springerlink) (BibTeX)
- E. Arisoy, M. Kurimo, M. Saraclar, T. Hirsimäki, J. Pylkkönen, T. Alumäe and H. Sak. 2008. Statistical Language Modeling for Automatic Speech Recognition of Agglutinative Languages. To appear in Speech Recognition; Techniques, Technology and Applications, I-Tech Education and Publishing, Vienna, Austria.
- Tuncay Aksungurlu, Sıddıka Parlak, Haşim Sak, Murat Saraçlar. Türkçe Haber Programları için Dil Modelleme Yaklaşımlarının Karşılaştırılması. IEEE 16. Sinyal İşleme, İletişim ve Uygulamaları Konferansı (SİU). Didim, Türkiye, 2008. (pdf)
- Ebru Arısoy, Haşim Sak, and Murat Saraçlar. Language modeling for automatic Turkish broadcast news transcription. In Proceedings of Interspeech 2007 - Eurospeech (To appear), 2007. (pdf) (BibTeX)
- Haşim Sak, Tunga Güngör, and Murat Saraçlar. Morphological disambiguation of Turkish text with perceptron algorithm. In CICLing 2007, volume LNCS 4394, pages 107-118, 2007. (springerlink) (pdf) (BibTeX)
- Haşim Sak, Tunga Güngör, and Yaşar Safkan. A corpus-based concatenative speech synthesis system for Turkish. Turkish Journal of Electrical Engineering and Computer Sciences, 14(2):209-223, 2006. (pdf) (BibTeX) (Sample TTS output)
- Haşim Sak, Tunga Güngör, and Yaşar Safkan. Generation of synthetic speech from Turkish text. In 13th European Signal Processing Conference (EUSIPCO 2005), 2005. (pdf) (BibTeX)
