Study of Auditory and Acoustic Features of English Utterances Spoken by the Major Tribal Groups Khasi, Garo and Pnar of the State of Meghalaya
Audio forensics plays an important role in criminal investigation and judicial administration. The present work is a qualitative study of the influence of the dialects spoken by three major tribal groups of the state of Meghalaya, India, on their English utterances. Meghalaya is predominantly a hill state, with diverse languages spoken across its regions. English is the medium of education and of official state communication among the tribal groups. When speaking a non-native language (English), speakers are influenced by their native accent. In the present study, three major dialects of Meghalaya, namely Khasi, Pnar and Garo, were chosen. A set of 30 speakers, male and female, was selected in equal proportion from the three dialect groups, with comparable educational background, age and normal health. The auditory and acoustic characteristics of the vowels in the English utterances were assessed from the intonation pattern and the formant frequencies, namely the first (F1) and second (F2) formant frequencies. The mean F1 and F2 values for male and female speakers of the three tribal groups speaking specific English utterances were computed and plotted to show the overall distribution of the vowels on the vowel quadrilateral. The study shows that the vowel quadrilaterals of the three groups differ when they speak English. The vowel quality and the degree and nature of variation differ among the corresponding vowels of the speakers of these three dialects. The study indicates that the three tribal groups vary distinctly in vowel quality in their speech, reflecting the identity of the group or region they belong to.
The overall distribution of vowel quality in the vowel quadrilateral was found to be unique to each of the three tribal groups for both male and female speakers. The overall distribution of the vowels on the vowel quadrilateral can therefore be taken as an important parameter for identifying the three tribal groups when they speak English.
Introduction
Meghalaya is one of the states of Northeastern India. Its population as of 2016 was estimated to be 3,211,474. Meghalaya covers an area of approximately 22,430 square kilometers, with a length-to-breadth ratio of about 3:1. The state is bounded to the south and west by Bangladesh and to the north and east by the Indian state of Assam. English is widely spoken in Meghalaya. The other major languages of the state are Khasi, Garo and Jaintia. The Khasi language and its various dialects are spoken mainly in the districts of West Khasi hills, East Khasi hills and Jaintia hills, while Garo is spoken mainly in the districts of East and West Garo hills.
Khasi is an official language of Meghalaya and is spoken by about 1,128,000 people residing in the state. The language spoken by the Khasi tribes belongs to the Mon-Khmer branch of the Austro-Asiatic family [1]. Another language of Meghalaya is that spoken by the people of the Jaintia hills, also known as Pnar. This language is, in fact, a variety of standard Khasi and is spoken, along with Khasi, by the tribal groups Khynriam, Bhoi, Pnar and War. Garo, besides Khasi, is also an official language of Meghalaya. Garo does not have any myth related to the genesis of the language. It is remarkable that the Garo language of Meghalaya has a strong bearing upon the ancestry of its speakers (Figure 1).

was also known as "U KpaKa Khasi Literashor" (Father of Khasi Literature) [1, 2], to continue and improve the task of translating and writing religious and other texts in Khasi for use in school and church.
Another important aspect is the evolution of a standard dialect, based on the Sohra (Cherrapunjee) dialect of the southern slopes of the Khasi hills. Cherrapunjee was important because it was the headquarters of British rule at that time. It is this Sohra dialect that linked the various sub-groups, i.e. the Jaintia (Pnar) in the east, the War-Jaintia in the south-east, the Bhoi in the north, the Lyngngam in the west and the Khynriam (Khasi) in the central region. Much later, when the headquarters was shifted to the Shillong area, the Sohra dialect was carried to Shillong, which became the capital of Meghalaya (Figure 2).

The history of Garo literature began while the Garo were still in Mandalaya, in Upper Burma, long before they came to Tibet. After they left Tibet and wandered towards the plains of India, the Garo were known to use the Bengali script. Towards the end of the 19th century, the American Baptist missionaries put the north-eastern dialect of Garo, called A·we [3, 4], into writing by adopting a new script for the language, the Roman script. The Garo language was first reduced to writing by British officials and the American Baptist missionaries in the last decades of the 19th century. Garo literature began with the compilation of some Garo words. It was John Elliot, the Commissioner of Dacca, who first attempted to compile a Garo vocabulary. He also collected some Garo words, evidently from Am.beng [3, 4], one of the dialects of Garo; these words were compiled in the form of a dictionary and published in the Asiatic Researches. As the years rolled by, the missionaries felt it necessary to change the script from Bengali to Roman and the language from Bengali to Garo. Using the Bengali language and script for Garo was difficult and tedious. Therefore, after years of struggling with the Bengali alphabet, the Roman script was adopted for Garo literature in 1902 and, along with it, the Bengali language was replaced by Garo [3, 4] (Figure 3).

Thus, the languages of Meghalaya reflect the socio-cultural patterns of the different regions of the state. The three major tribal groups, Khasi, Garo and Pnar, were chosen in the present research to study the auditory and acoustic characteristics of the vowels of their English utterances on the vowel quadrilateral.
The aim of this study is to find the differences in the auditory and acoustic description of the vowels of these three tribal groups and in the overall distribution of vowel quality on the vowel quadrilateral. The emphasis is on measuring the formant frequencies of different vowels, plotting the vowel quadrilateral, and determining the degree of similarity and difference in the acoustic vowel space of male and female speakers of the three tribal groups of Meghalaya when they speak English.
| Sl. No. | Speaker | Gender | Age | Qualification | Self-occupation |
|---|---|---|---|---|---|
| 1 | Speaker A | Male | 23 | B. A | Employed |
| 2 | Speaker B | Male | 22 | B. Sc | Employed |
| 3 | Speaker C | Male | 21 | XII | Employed |
| 4 | Speaker D | Male | 25 | XII | Employed |
| 5 | Speaker E | Male | 21 | XII | Employed |
| 6 | Speaker F | Female | 24 | XII | Employed |
| 7 | Speaker G | Female | 23 | B. A | Employed |
| 8 | Speaker H | Female | 28 | XII | Employed |
| 9 | Speaker I | Female | 23 | B. Com | Employed |
| 10 | Speaker J | Female | 26 | B. Com | Employed |
Table 1: Detail of Khasi speakers.
Material and Methods
WAV) with variety of mp3 formats. The speech sample texts of the three major dialect groups of the state of Meghalaya (Khasi, Pnar and Garo) and their English transcripts were prepared for recording and for comparison of vowel quality in the vowel quadrilateral. The speakers of 5 (five) male and 5 (five)
Table 1: Detail of Pnar speakers.
The three tribal groups from which the speech samples were collected are detailed below.
| Sl. No. | Speaker | Gender | Age | Qualification | Self-occupation |
|---|---|---|---|---|---|
| 1 | Speaker A | Male | 23 | B. Sc | Employed |
| 2 | Speaker B | Male | 21 | XII | Employed |
| 3 | Speaker C | Male | 22 | B. Com | Employed |
| 4 | Speaker D | Male | 22 | B. A | Employed |
| 5 | Speaker E | Male | 22 | XII | Employed |
| 6 | Speaker F | Female | 21 | B. Sc | Employed |
| 7 | Speaker G | Female | 22 | B. Sc | Employed |
| 8 | Speaker H | Female | 22 | XII | Employed |
| 9 | Speaker I | Female | 25 | XII | Employed |
| 10 | Speaker J | Female | 24 | B. Sc | Employed |

| Sl. No. | Speaker | Gender | Age | Qualification | Self-occupation |
|---|---|---|---|---|---|
| 1 | Speaker A | Male | 24 | XII | Employed |
| 2 | Speaker B | Male | 26 | B. A | Employed |
| 3 | Speaker C | Male | 23 | B. A | Employed |
| 4 | Speaker D | Male | 21 | XII | Employed |
| 5 | Speaker E | Male | 22 | B. A | Employed |
| 6 | Speaker F | Female | 23 | B. A | Employed |
| 7 | Speaker G | Female | 23 | B. A | Employed |
| 8 | Speaker H | Female | 25 | XII | Employed |
| 9 | Speaker I | Female | 23 | B. A | Employed |
| 10 | Speaker J | Female | 23 | B. A | Employed |
Table 2: Detail of Pnar speakers.
The speakers were informed about the purpose and aim of the study and the procedure that would be used to record their speech samples. They were assured that no potential risk or cost was involved and that their identity would be concealed. Each speaker was instructed beforehand to read the text of the English utterance, to repeat it three times, and to read steadily at their own pace with a comfortable loudness level. A brief practice period was provided so that the participants could familiarize themselves with the text and the environment. During recording, the distance between the mouth and the microphone was maintained at approximately 10-15 cm. For all recordings, the digital recorder was set to a sampling rate of 44.1 kHz and a quantization of 16 bits. The recorded audio signals were stored on the micro SD card of the recorder in wave format. The digital audio data were then transferred to a computer and stored in the same format for analysis.
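As a rough check on these recording settings, the sketch below computes the Nyquist limit, the number of quantization levels and the uncompressed data rate. It assumes the standard 44.1 kHz sampling rate with 16-bit samples; the single (mono) channel is an assumption, since the text does not state the channel count.

```python
# Recording parameters. The mono channel count is an assumption, not stated in the text.
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 1  # assumed


def nyquist_hz(sample_rate_hz: int) -> float:
    """Highest frequency that can be represented at the given sampling rate."""
    return sample_rate_hz / 2


def quantization_levels(bits: int) -> int:
    """Number of distinct amplitude steps available per sample."""
    return 2 ** bits


def bytes_per_second(sample_rate_hz: int, bits: int, channels: int) -> int:
    """Uncompressed PCM data rate of the stored wave file."""
    return sample_rate_hz * (bits // 8) * channels


if __name__ == "__main__":
    print("Nyquist limit (Hz):", nyquist_hz(SAMPLE_RATE_HZ))
    print("Quantization levels:", quantization_levels(BITS_PER_SAMPLE))
    print("Bytes per second:", bytes_per_second(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS))
```

Under these assumptions the Nyquist limit is far above the roughly 2.7 kHz upper end of the formant ranges reported in this study, so the sampling rate does not constrain the F1 and F2 measurements.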
After the recordings were transferred to the computer, the samples were organized and arranged separately according to dialect group. The recorded speech samples of each of the three tribal groups were then analyzed by careful listening (auditory analysis of the recorded samples), and six vowels, i.e. [a], [e], [i], [I], [o] and [u], were chosen for the present study. Common words containing these vowel sounds were selected from the transcript spoken by the three tribal groups and segregated using the GoldWave Digital Audio Editor software. Suitable software was used for segregating the speech samples and for reducing unwanted or background noise. After segregation, the common words containing the vowel sounds were analyzed for all three tribal groups using Multi-Speech, a Windows-based speech analysis system that displays speech as a sound spectrogram. Spectrographic analysis of speech is one of the most widely used techniques for studying the acoustic-phonetic characteristics of different phonemes. Two popular qualitative representations were used for the analysis: wideband and narrowband spectrograms. In a spectrogram, the horizontal dimension represents time and the vertical dimension represents frequency. Each thin vertical slice of the spectrogram shows the spectrum during a short period of time, using darkness to represent amplitude. Darker areas mark the frequencies at which the amplitude is higher.
On a spectrogram, two specific visual elements indicate a voiced sound. The first is the vertical striations, each of which corresponds to one opening of the vocal folds, when air flows through them. The other visual clue is the dark horizontal bands that are typical of vowels, approximants and nasals. These are called formants, and they are the natural resonances of the vocal tract. The size and shape of the vocal tract can be modified to make these formants vary, for example by changing the tongue position or lip position. The common words from the three tribal groups were compared using spectrograms, i.e. by measuring the formant frequency distribution at an appropriate location of the vowel formants for each speaker. The common words chosen for the study of the vowels are given in Table 3 below.
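The difference between wideband and narrowband spectrograms comes down to the length of the analysis window. The sketch below uses the rule of thumb that the frequency resolution of a window is about the reciprocal of its duration; the exact figure depends on the window shape. The window lengths (3 ms and 20 ms) and the 120 Hz fundamental frequency are illustrative assumptions, not settings reported in this study.

```python
# Illustrative values only; none of these settings are taken from the study.
WIDEBAND_WINDOW_S = 0.003    # assumed short window (wideband)
NARROWBAND_WINDOW_S = 0.020  # assumed long window (narrowband)
F0_HZ = 120.0                # assumed fundamental frequency of a speaker


def analysis_bandwidth_hz(window_s: float) -> float:
    """Approximate frequency resolution of an analysis window (about 1 / duration)."""
    return 1.0 / window_s


def resolves_harmonics(window_s: float, f0_hz: float) -> bool:
    """True when the resolution is finer than the harmonic spacing (f0)."""
    return analysis_bandwidth_hz(window_s) < f0_hz


if __name__ == "__main__":
    for name, window in (("wideband", WIDEBAND_WINDOW_S), ("narrowband", NARROWBAND_WINDOW_S)):
        bandwidth = analysis_bandwidth_hz(window)
        print(f"{name}: ~{bandwidth:.0f} Hz resolution, "
              f"harmonics resolved: {resolves_harmonics(window, F0_HZ)}")
```

With these assumed values the short window smears individual harmonics together, which is why formants appear as broad dark bands in a wideband display, while the long window separates the harmonics at the cost of time resolution.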
| Selected Words | Vowel |
|---|---|
| High | [a] |
| Literacy | [i] |
| Essential | [I] |
| Areas | [e] |
| Only | [o] |
| Usually | [u] |
Table 3: Words chosen to study the Vowels.
The formant frequencies F1 (first formant frequency) and F2 (second formant frequency) were measured at an appropriate location (the central portion) of the vowel formant for each speaker, separately for each of the three tribal groups, for the English utterances. Based on the utterances listed in Table 3 above, the observed frequency ranges are 200-2500 Hz for Khasi male, 200-2700 Hz for Khasi female, 300-2200 Hz for Pnar male, 200-2700 Hz for Pnar female, 200-2300 Hz for Garo male and 200-2300 Hz for Garo female speakers. The measured frequencies of the English utterances for each speaker and the computed mean values are given in the tables below. The vowel quadrilaterals of the average first formant frequency (F1) and second formant frequency (F2) are drawn separately for male and female speakers of all three tribal groups.
Results and Discussion
Acoustic analysis of the three dialect groups speaking English utterances
Tables 4 and 6 present the mean values of the first formant frequency (F1) and second formant frequency (F2), measured in English utterances containing the vowels [a], [e], [i], [I], [o] and [u], for the 5 (five) male and 5 (five) female speakers of each group respectively.
| Vowel | Khasi Male F1 (Hz) | Khasi Male F2 (Hz) | Pnar Male F1 (Hz) | Pnar Male F2 (Hz) | Garo Male F1 (Hz) | Garo Male F2 (Hz) |
|---|---|---|---|---|---|---|
| [a] | 563.47 | 1699.18 | 670.93 | 1671.64 | 565.65 | 1520.22 |
| [i] | 372.96 | 1527.92 | 405.97 | 1774.33 | 465.8 | 1723.09 |
| [I] | 334.7 | 1694.29 | 455.51 | 1776.61 | 419.76 | 1700.47 |
| [e] | 359.91 | 2046.51 | 446.57 | 1931.79 | 413.43 | 1855.27 |
| [o] | 380.31 | 1099.68 | 439.4 | 945.56 | 521.44 | 982.16 |
| [u] | 292.36 | 1836.4 | 439.4 | 1184.48 | 280.79 | 1865.16 |
Table 4: Mean Values of F1 and F2 of different vowels of English utterance for Male Speaker.
| Vowel | Khasi Female F1 (Hz) | Khasi Female F2 (Hz) | Pnar Female F1 (Hz) | Pnar Female F2 (Hz) | Garo Female F1 (Hz) | Garo Female F2 (Hz) |
|---|---|---|---|---|---|---|
| [a] | 431.27 | 1834.29 | 322.72 | 773.25 | 389.85 | 1714.21 |
| [i] | 425.59 | 2044.21 | 405.22 | 2100.24 | 517.84 | 1821.89 |
| [I] | 488.68 | 1937.6 | 521.9 | 1993.35 | 522.18 | 1963.97 |
| [e] | 415.82 | 2526.26 | 425.04 | 2465.4 | 517.39 | 1780.1 |
| [o] | 335.92 | 842.42 | 499.71 | 1391.88 | 436.4 | 863.33 |
| [u] | 333.98 | 1962.23 | 295.23 | 1855.15 | 362.91 | 1939.6 |
Table 6: Mean Values of F1 and F2 of different vowels of English utterance for Female Speaker.
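The height comparisons discussed in this section can be reproduced from the mean F1 values in the male and female tables above. The following is a minimal sketch, not the authors' procedure: it relies only on the inverse relation between F1 and vowel height, and the function name `rank_by_height` is a hypothetical helper introduced for illustration.

```python
# Mean F1 (Hz) copied from the male and female tables; tuple order is Khasi, Pnar, Garo.
GROUPS = ("Khasi", "Pnar", "Garo")

F1_MALE = {
    "a": (563.47, 670.93, 565.65),
    "i": (372.96, 405.97, 465.80),
    "I": (334.70, 455.51, 419.76),
    "e": (359.91, 446.57, 413.43),
    "o": (380.31, 439.40, 521.44),
    "u": (292.36, 439.40, 280.79),
}

F1_FEMALE = {
    "a": (431.27, 322.72, 389.85),
    "i": (425.59, 405.22, 517.84),
    "I": (488.68, 521.90, 522.18),
    "e": (415.82, 425.04, 517.39),
    "o": (335.92, 499.71, 436.40),
    "u": (333.98, 295.23, 362.91),
}


def rank_by_height(f1_by_vowel):
    """For each vowel, order the groups from highest vowel (lowest F1) to lowest vowel."""
    ranking = {}
    for vowel, f1_values in f1_by_vowel.items():
        ordered = sorted(zip(f1_values, GROUPS))  # ascending F1 = descending height
        ranking[vowel] = [group for _, group in ordered]
    return ranking


if __name__ == "__main__":
    for label, table in (("Male", F1_MALE), ("Female", F1_FEMALE)):
        for vowel, order in rank_by_height(table).items():
            print(f"{label} [{vowel}]: {' > '.join(order)}")
```

Each printed line lists the groups from the highest to the lowest realization of that vowel, so the first name is the group with the lowest mean F1.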


quadrilateral. The vowel [e] is produced as near-close and tends to shift toward the front of the vowel quadrilateral. The vowel [a] is produced as open-mid and tends to shift toward the center. The vowel [o] is produced as close-mid, between central and back. The vowel [u] is produced as close, between front and central, while speaking English.
Pnar male speakers produce the vowel [i] as close-mid, between front and central in the vowel quadrilateral. The vowel [I] is produced as mid, between front and central. The vowel [e] is produced as mid and tends to shift toward the front. The vowel [a] is produced as open and tends to shift toward the center. The vowel [o] is produced as mid and tends to shift toward the back. The vowel [u] is produced as mid, between central and back, while speaking English.
Garo male speakers produce the vowel [i] as mid, tending to shift toward the center of the vowel quadrilateral. The vowel [I] is produced as close-mid and tends to shift toward the center. The vowel [e] is produced as close-mid, between front and central. The vowel [a] is produced as open-mid, between front and central. The vowel [o] is produced as open-mid and tends to shift toward the center. The vowel [u] is produced as close, between front and central, while speaking English.
The vowel quality of the three male tribal groups is found to be distinct from one another when their respective vowels are compared. Differences appear in the vowels [a], [i], [u], [I], [o] and [e], and the degree and nature of the variation distinguish the three groups. Khasi male speakers produce the vowels [i], [I], [e], [a] and [o] with greater vowel height than Pnar and Garo speakers when speaking English, whereas Pnar male speakers produce the vowels [e], [a], [I] and [u] with lower vowel height than Khasi and Garo speakers. Garo male speakers produce the vowels [i] and [o] with lower vowel height and the vowel [u] with greater vowel height, while their other three vowels, [I], [e] and [a], lie between those of Khasi and Pnar speakers.
Figure 5 shows the vowel quadrilateral of average F1 and F2 for female speakers of the three dialects speaking English utterances. Comparing their respective vowels, Khasi female speakers produce the vowel [i] as open-mid, between front and central in the vowel quadrilateral. The vowel [I] is produced as near-open and tends to shift toward the center. The vowel [e] is produced as mid and tends to shift toward the front. The vowel [a] is produced as open-mid and tends to shift toward the center. The vowel [o] is produced as close-mid and tends to shift toward the back. The vowel [u] is produced as close-mid, between front and central, while speaking English.
Pnar female speakers produce the vowel [i] as mid, between front and central in the vowel quadrilateral. The vowel [I] is produced as open, between front and central. The vowel [e] is produced as open-mid and tends to shift toward the front. The vowel [a] is produced as near-close and tends to shift toward the back. The vowel [o] is produced as near-open and tends to shift toward the center. The vowel [u] is produced as near-close, between front and central, while speaking English.
Garo female speakers produce the vowel [i] as open, tending to shift toward the center of the vowel quadrilateral. The vowel [I] is produced as open and tends to shift toward the center. The vowel [e] is produced as open and tends to shift toward the center. The vowel [a] is produced as mid and tends to shift toward the center. The vowel [o] is produced as open-mid and tends to shift toward the back. The vowel [u] is produced as close-mid, between front and central, while speaking English.
The vowel quality of the three female tribal groups is found to be distinct from one another when their respective vowels are compared. Differences appear in the vowels [a], [i], [u], [I], [o] and [e], and the degree and nature of the variation distinguish the three groups. Khasi female speakers produce the vowels [e], [I] and [o] with greater vowel height than Pnar and Garo speakers when speaking English, whereas Pnar female speakers produce the vowels [i], [a] and [u] with greater vowel height than Khasi and Garo speakers. Garo female speakers produce the vowels [i], [I], [e] and [u] with lower vowel height, while their other two vowels, [o] and [a], lie between those of Khasi and Pnar speakers.
The male tribal groups are found to be distinct from the female tribal groups in overall vowel quality. The variation is most prominent in the vowels [a], [I] and [i]. Male speakers produce the vowel [a] between open-mid and open, whereas female speakers produce it between close-mid and open-mid in the vowel quadrilateral. Male speakers produce the vowel [I] between close-mid and mid, whereas female speakers produce it between near-open and open. Male speakers produce the vowel [i] between close-mid and mid, whereas female speakers produce it between mid and near-open.
It is also observed that Khasi male and female speakers produce the vowels [e], [I] and [o] with greater vowel height than Garo and Pnar speakers when speaking English. Pnar male and female speakers produce the vowels [i] and [I] further front than Khasi and Garo speakers. For Garo male and female speakers, the vowel [a] lies between those of Khasi and Pnar speakers, whereas the vowel [i] is produced with lower vowel height than by Khasi and Pnar speakers [5, 6, 7] (Figure 6).
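The male and female contrast for [a], [I] and [i] can be checked against the tabulated means. The sketch below pools the three group means of F1 for each vowel (an unweighted average, which is an assumption made here for illustration) and reports which gender has the higher F1, i.e. the more open realization. The raw values are in Hz and are not normalized for vocal-tract size, so the comparison is indicative only.

```python
from statistics import mean

# Mean F1 (Hz) per group (Khasi, Pnar, Garo), copied from the male and female tables.
F1_MALE = {
    "a": (563.47, 670.93, 565.65),
    "I": (334.70, 455.51, 419.76),
    "i": (372.96, 405.97, 465.80),
}
F1_FEMALE = {
    "a": (431.27, 322.72, 389.85),
    "I": (488.68, 521.90, 522.18),
    "i": (425.59, 405.22, 517.84),
}


def pooled_f1(f1_by_vowel):
    """Unweighted average of the three group means for each vowel."""
    return {vowel: mean(values) for vowel, values in f1_by_vowel.items()}


def more_open_gender(vowel: str) -> str:
    """Gender whose pooled F1 is higher, i.e. whose vowel is lower (more open)."""
    male = pooled_f1(F1_MALE)[vowel]
    female = pooled_f1(F1_FEMALE)[vowel]
    return "male" if male > female else "female"


if __name__ == "__main__":
    for vowel in ("a", "I", "i"):
        print(f"[{vowel}] male {pooled_f1(F1_MALE)[vowel]:.2f} Hz, "
              f"female {pooled_f1(F1_FEMALE)[vowel]:.2f} Hz, "
              f"more open: {more_open_gender(vowel)}")
```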

Conclusion
The qualitative analysis and comparison of the vowel quadrilateral representations shown in Figures 4 and 5 give a clear indication that the male and female speakers of the three tribal groups vary distinctly in vowel quality, reflecting the group or region they belong to. The overall distribution of vowel quality in the vowel quadrilateral is found to be unique to each of the three tribal groups for both male and female speakers. The auditory and acoustic features of the three tribal groups thus prove to be an important factor in speaker profiling for identifying a person as belonging to a region. The differences in vowel quality between the male and female speakers of the three tribal groups can help to narrow down an investigation and to define its course more clearly.
Notes
• The above discussion and interpretation of Figures 4 and 5 were based on the IPA Vowel Chart shown in Figure 6.
• F1: The first formant (F1) in vowels is inversely related to vowel height, i.e. the higher the formant frequency, the lower the vowel height (and vice versa).
• F2: The second formant (F2) in vowels is somewhat related to the degree of backness and frontness of the vowel, i.e. the more front the vowel, the higher the second formant frequency, and the more back the vowel, the lower the second formant frequency.
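The two relations in these notes are what place a vowel on the quadrilateral: F2 sets the horizontal position (front to back) and F1 the vertical position (close to open), with both axes running opposite to increasing frequency. A minimal sketch of that mapping follows; the function `chart_position` and the axis ranges (200-800 Hz for F1, 700-2700 Hz for F2) are assumptions chosen to cover the tabulated means, not values taken from the study.

```python
def chart_position(f1_hz, f2_hz, f1_range=(200.0, 800.0), f2_range=(700.0, 2700.0)):
    """Map (F1, F2) to (x, y) in the unit square.

    x = 0 is fully front and x = 1 fully back (high F2 plots to the left);
    y = 0 is fully close and y = 1 fully open (high F1 plots lower down).
    The axis ranges are assumed plotting limits.
    """
    f1_min, f1_max = f1_range
    f2_min, f2_max = f2_range
    x = (f2_max - f2_hz) / (f2_max - f2_min)
    y = (f1_hz - f1_min) / (f1_max - f1_min)
    return x, y


if __name__ == "__main__":
    # Mean values for Khasi male speakers, taken from the male table.
    khasi_male = {"e": (359.91, 2046.51), "o": (380.31, 1099.68), "a": (563.47, 1699.18)}
    for vowel, (f1, f2) in khasi_male.items():
        x, y = chart_position(f1, f2)
        print(f"[{vowel}] x={x:.2f} (0=front), y={y:.2f} (0=close)")
```

With these limits the Khasi male [e] falls to the left of [o], consistent with [e] being the more front vowel, and [a] falls below both, consistent with it being the most open of the three.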
Acknowledgement
The authors express their sincere gratitude to the Director of LNLN NICFS, Ministry of Home Affairs, for his inspiration and continuous support in the completion of the research work. The entire research was conducted at NICFS as a dissertation project of the PG Diploma curriculum. The authors express their sincere thanks to the Director, FSL, Meghalaya, for encouragement and motivation during the research work, and to the Principal, PTS Shillong, for providing all the necessary facilities for the collection of voice samples from the training school. Lastly, we wish to thank all the participants for their contribution and cooperation throughout the research work.
Conflict of Interest
The authors declare no conflict of interest in the above paper, as the research is part of a dissertation and is being published with appropriate permission and acknowledgement.
References
1. Pryse W (1855) Introduction to the Khasi Language. Calcutta: School Book Society’s Press.
2. Bareh H (1997) Language and Literature of Meghalaya. Shimla: Indian Institute of Advance Study.
3. Sangma M (1983) History of Garo Literature, Nehu Publication, Shillong.
4. Marak Caroline R (2002) Garo Literature. Sahitya Akademi.
5. Khasi (Omniglot).
6. Garo (Omniglot).
7. Hollien H (2002) Forensic Voice Identification. Academic Press, London. (Forensic Speech and Audio Analysis Forensic Linguistics.