BUHMAP-DB Homepage

Boğaziçi University Head Motion Analysis Project Database, Perceptual Intelligence Laboratory, Department of Computer Engineering

Introduction

Sign languages are visual languages. The message is conveyed not only via hand gestures (manual signs) but also via head/body motion and facial expressions (non-manual signs). To allow testing the efficiency of algorithms that analyze and classify non-manual gestures, we present a database of non-manual signs on this web page. The sign classes in the database are non-manual signs that are frequently used in Turkish Sign Language (TSL) and that considerably change the meaning of the performed sign. There are also additional signs that we use in daily life while speaking. The database contains videos of the selected 8 classes of signs as well as ground-truth data of 52 manually landmarked facial points.

About the Database

The non-manual gesture classes used in the database


The database consists of the following 8 classes of signs (each class name links to a sample video; a small class-ID lookup sketch follows the list):
  1. Neutral: The neutral state of the face. The subject neither moves his/her head nor makes any facial expression.
  2. Head L-R: The head is shaken to the right and left. The initial side varies among subjects, and the shaking continues for about 3-5 cycles. This sign is frequently used for negation in TSL.
  3. Head Up: The head is raised upwards while the eyebrows are simultaneously raised. This sign is also frequently used for negation in TSL.
  4. Head F: The head is moved forward, accompanied by raised eyebrows. This sign is used to turn a sentence into a question in TSL. It resembles the surprise expression used in daily life.
  5. Sadness: Lips turned down, eyebrows lowered. It is used to show sadness, e.g. when apologizing. Some subjects also move their head downwards.
  6. Head U-D: The head is nodded up and down continuously. Frequently used for agreement.
  7. Happiness: Lips turned up; the subject smiles.
  8. Happy U-D: Head U-D + Happiness. The two preceding classes are performed together. It is included as a challenge for the classifier, which must distinguish this easily confused class from the two preceding ones.
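
The position of each class in the list above doubles as the numeric class ID used in the video file names (see File Protocol below; e.g. the Head F sample is class 4). As a small convenience, here is a minimal MATLAB lookup-table sketch; the variable name is ours, not part of the database:

    % Class IDs as used in the video file names (see File Protocol below);
    % the position in this list is the class ID.
    classNames = {'Neutral', 'Head L-R', 'Head Up', 'Head F', ...
                  'Sadness', 'Head U-D', 'Happiness', 'Happy U-D'};
    fprintf('Class 4 is %s\n', classNames{4});   % prints: Class 4 is Head F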

Some frames captured from the different sign classes can be seen in the figure on the left.

Properties of the Database

  • It involves 11 different subjects (6 female, 5 male).
  • Each subject performs 5 repetitions of each of the 8 classes, giving 11 × 8 × 5 = 440 videos in total.
  • Each video lasts about 1-2 seconds.
  • A Philips SPC900NC webcam is used, recording at 640×480 resolution and 30 fps.
  • Recording is done in a room shielded from sunlight and illuminated with daylight halogen and fluorescent lights.
  • The videos are compressed with the “Indeo 5.10” video codec.
  • Each video starts in the neutral state, the sign is performed, and the video ends in the neutral state again.
  • No subject has a beard, moustache, or eyeglasses.
  • There is no occlusion or motion blur.
  • 48 of the videos are annotated with ground-truth landmark data (see Annotated Videos below).

Annotated Videos

To support a variety of experiments on the database, a relatively large number of facial landmarks were chosen for manual annotation. The selected 52 points can be seen here. Because manually annotating these landmarks in every frame is laborious, only 3 repetitions of 4 classes (Head L-R, Head Up, Head F, Happiness) performed by 4 subjects (2 male, 2 female) are annotated, giving 4 × 4 × 3 = 48 annotated videos. In total, about 2880 frames (48 videos × 60 average frames per video) are annotated.

File Protocol

The video files are named "[subjectname]_[classId]_[repetitionNo].avi". For example, if you download the Head F sample video ("ismail_4_1.avi"), you will see that it belongs to the subject "ismail" and is his first repetition of the 4th sign class.
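
As a rough illustration of this convention (variable names are ours, not part of the database), a file name can be composed and parsed in MATLAB like so:

    % Minimal sketch: compose and parse a video file name of the form
    % [subjectname]_[classId]_[repetitionNo].avi.
    subject = 'ismail'; classId = 4; repNo = 1;
    fname = sprintf('%s_%d_%d.avi', subject, classId, repNo);   % 'ismail_4_1.avi'

    % Parse the name back into its three fields with a regular expression.
    tok = regexp(fname, '^(\w+)_(\d+)_(\d+)\.avi$', 'tokens', 'once');
    fprintf('subject %s, class %s, repetition %s\n', tok{1}, tok{2}, tok{3});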

The ground-truth files contain the landmark locations: the ith row of a file holds the landmark coordinates (x1, y1, x2, y2, ..., xL, yL) for the ith frame of the corresponding video. To illustrate, this MATLAB code snippet reads and animates the landmarks in this sample file.
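
The linked snippet itself is not reproduced on this page; the following is a minimal MATLAB sketch of the same idea, assuming the ground-truth file is plain whitespace-delimited numeric text (the file name is only a hypothetical example):

    % Minimal sketch: read a ground-truth file (one row per frame, columns
    % x1 y1 x2 y2 ... xL yL) and animate the landmarks frame by frame.
    P = load('ismail_4_1_landmarks.txt');   % hypothetical file name
    for f = 1:size(P, 1)
        x = P(f, 1:2:end);                  % odd columns: x coordinates
        y = P(f, 2:2:end);                  % even columns: y coordinates
        plot(x, y, 'g.', 'MarkerSize', 10);
        axis([0 640 0 480]); axis ij;       % 640x480 frames; y axis points down
        title(sprintf('Frame %d of %d', f, size(P, 1)));
        drawnow; pause(1/30);               % roughly the 30 fps recording rate
    end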

Publications using BUHMAP-DB

How to Get

The database is freely available for academic research purposes. The ground-truth data can be downloaded here, and the dataset can be downloaded here.


How to Cite

If you use the database, please cite BUHMAP-DB as follows:

For Turkish: Aran, O., Arı, İ., Güvensan, M. A., Haberdar, H., Kurt, Z., Türkmen, H. İ., Uyar, A., Akarun, L., "Türk İşaret Dili Yüz İfadesi ve Baş Hareketi Veritabanı", Sinyal İşleme ve Uygulamaları Konferansı (SİU2007), Eskişehir, 2007.
For English: Aran, O., Arı, İ., Güvensan, M. A., Haberdar, H., Kurt, Z., Türkmen, H. İ., Uyar, A., Akarun, L., "A Database of Non-Manual Signs in Turkish Sign Language", Signal Processing and Communications Applications (SIU2007), Eskişehir, 2007.

Credits

The BUHMAP database was collected through the collaboration of the project members (Amaç Güvensan, Aslı Uyar, Hakan Haberdar, İsmail Arı, İrem Türkmen, Rüştü Derici, and Zeyneb Kurt) under the supervision of Prof. Lale Akarun, with the kind help of Abuzer Yakaryılmaz, Didem Çınar, Neşe Alyüz, Onur Güngör, Oya Aran, Öner Zafer, and Pınar Santemiz.


Last update: Dec '09