A tool for detecting De-Da Mi-Mi Ki-Ki spelling mispellings in Turkish

A tool for detecting De-Da Mi-Mi Ki-Ki spelling mispellings in Turkish

Image
The mispelling of de-da is a hot topic among Turkish speakers. Not only are there a lot of spelling mistakes for this case, it is can be quite triggering for many people. Unfortunately, spellchecker/correctors are not very successful in detecting these errors.

 

This project will further develop an exısting machine learning model [3] that detects the incorrect usage of Turkish clitic `de-da'. Then this model will be integrated into a web based application that is integrated into social media.

 

In particular a mechanism for collecting new de-da examples  from Twitter will be implemented and results will be shared. Also, functionality will be offered to detect misspelled words.
 
The coding will be in python. You must be comfortable with APIs and developing web applications. 
 
Useful links:
  1. Complaints related to such mispellings are plenty: see: https://twitter.com/search?q=de%20ayri%20yazilir&src=typed_query
  2. The Role of Contextual Word Embeddings in Correcting the ‘de/da’ Clitic Errors in Turkish, Hasan Öztürk; Alperen Değirmenci; Onur Güngör; Suzan Uskudarli, SIU, 2020.
  3. An earlier work for exploring how well people can identify whether de-da should be together or apart: http://dedatakintisi.derlem.com/deda/bul/
     

  •  

Weekly project meetings will include Dr. Onur Güngör

Project Poster: 

Project Members: 

Mustafa Emir Çolak
Ahmet Melih Aydoğdu

Project Advisor: 

Suzan Üsküdarlı

Project Status: 

Project Year: 

2022
  • Spring

Contact us

Department of Computer Engineering, Boğaziçi University,
34342 Bebek, Istanbul, Turkey

  • Phone: +90 212 359 45 23/24
  • Fax: +90 212 2872461
 

Connect with us

We're on Social Networks. Follow us & get in touch.