What is VOT? And a brief summary of Kwon (2014)

When I announced the 2013 ALS proceedings a couple of months ago, I asked for any suggestions on a follow-up post. Laxsnail got in touch and asked about Kwon’s article ‘Acoustic observation for English speakers perception of a three-way laryngeal contrast of Korean stops’. Understanding the paper requires first understanding Voice Onset Time and f0. So here we go with some definitions!

Voice Onset Time (VOT)

Voicing is a feature of some of the sounds we make. Hold your fingers lightly against the front of your throat and make the sound ssssssss, and then go zzzzzzzz - feel that buzz for the second one? That’s your vocal folds vibrating really quickly. There are lots of minimal pairs like this in English - s/z are fricatives, but there are also stops (AKA plosives), like t/d and p/b. For stops, the voice onset time (VOT) is the timing relationship between when you open your articulators (AKA mouth bits) and when those vocal folds (AKA voice box bits) start buzzing. Some stops have voicing that starts before the release of the closure, known as a negative VOT; aspirated stops, with a puff of air between the release and the start of voicing, have a positive VOT; and stops where the voicing and the opening occur at roughly the same time have a VOT near zero, known as tenuis just to sound fancy. The VOT of sounds varies across languages, which is a crucial feature of Kwon’s article.

Fundamental Frequency (f0)

Our speech apparatus is basically a very fancy way to create and manipulate acoustic signals. There are lots of different features of speech you can measure - perceptual phoneticians look at how we process these sounds, instrumental phoneticians look at how we produce these sounds, and acoustic phoneticians look at the nature of these sounds. Acoustic phonetics really involves a lot of physics, maths and statistics, and everything I can explain about it I learnt from people much smarter than I am.

Speech signals are waveforms of sound, which is why they make those pretty patterns when you look at a visual representation of them. The fundamental frequency of speech is the lowest frequency of a periodic waveform, called the f0 because they count from zero. The lower the fundamental frequency, the lower a person’s voice will sound to us. f0 is therefore a way of acoustically measuring what we might perceptually call pitch. Men generally have lower pitch than women, due to a number of factors including larynx size and the length of the vocal folds.
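If you want a feel for how f0 can be pulled out of a waveform, here’s a minimal sketch of the autocorrelation idea: find the time lag at which the signal best matches a shifted copy of itself, and f0 is just one over that period. This is my toy illustration on a synthetic ‘voice’, not how real pitch trackers work - real speech needs windowing, voicing detection and so on.

```python
# Minimal sketch of f0 estimation by autocorrelation on a synthetic signal.
# The idea: a periodic waveform repeats after one period, so we look for the
# lag at which the signal best matches a shifted copy of itself.

import math

def estimate_f0(samples, sample_rate, fmin=50.0, fmax=400.0):
    """Return an f0 estimate in Hz, searching periods between fmin and fmax."""
    lo = int(sample_rate / fmax)   # shortest period (in samples) to consider
    hi = int(sample_rate / fmin)   # longest period to consider
    best_lag, best_score = lo, float("-inf")
    for lag in range(lo, hi + 1):
        # How well does the signal line up with itself shifted by `lag`?
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

# A synthetic 120 Hz "voice": a fundamental plus one harmonic at 240 Hz.
rate = 8000
signal = [math.sin(2 * math.pi * 120 * t / rate)
          + 0.5 * math.sin(2 * math.pi * 240 * t / rate)
          for t in range(2000)]
print(estimate_f0(signal, rate))  # a value close to 120
```

Note that even though the harmonic at 240 Hz is part of the signal, the estimate lands near 120 Hz - the fundamental really is the lowest repeating frequency, which is what we hear as the pitch.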

Kwon’s article

OK! As English speakers, you may be familiar with the sounds represented by /p/ and /b/ - /p/ has a positive VOT, as it’s unvoiced, while /b/ has a tenuis VOT (this is a gross oversimplification, and English VOT is actually more complicated, but that’s not what today is about). The Korean equivalents to p/b, t/d and k/g are not the same. Where we have two different sounds for each of those sets, Korean has three (this applies to all of them, but I’m going to stick with talking about the p/b bilabial set for simplicity).

Korean has an aspirated /pʰ/, which is a lot like English /p/ with that puff of air on release of the stop. Korean also has two other bilabial plosives, neither of which is voiced. These other two are distinguished by what is sometimes called a ‘tense/lax’ and other times a ‘fortis/lenis’ distinction. This distinction is partly that the VOT for the lax (AKA lenis) stop is much longer than for the tense (fortis) one, and partly that the fortis gives a higher f0 on the following vowel than the lenis does. So, there’s a lot going on there!

Kwon hypothesised that an English speaker learning Korean would likely be able to separate out the /pʰ/ from the other two, but lump the fortis and lenis together and hear them more like an English /b/ - because we aren’t used to using this combo of VOT and f0 to distinguish stops. Kwon then created and ran an experiment with English speakers listening to these tricky Korean sounds. Her work supports the hypothesis: English speakers were rubbish at telling the fortis/lenis pair apart, even though their VOTs are quite distinct. The f0 therefore appears to be an important cue for telling them apart.

If you’re learning Korean and have been struggling to master certain words, this may be why!

The Sounds of Aboriginal Languages: Free public talk

The opening talk of this year’s Australian Linguistics Society conference will be a public lecture by Prof. Andy Butcher from Flinders University. Andy’s work is utterly fascinating; he looks at the sounds of Indigenous Australian languages and how these may have been influenced by a hearing condition, otitis media with effusion (OME), or ‘glue ear’. This is the summary from the website:

Chronic OME develops in the majority of Aboriginal infants in remote communities within a few weeks of birth, typically affecting hearing and the perception of speech sounds. Among the specific consequences of this are difficulties in hearing differences between sounds “t” and “s” in words like “tap” versus “sap”, or the “p” and “b” in words like “pack” and “back”. Given the importance of these sounds in distinguishing words, OME-induced hearing loss has been shown to disrupt speech and language development in English. It also may have an adverse role in the development of English literacy. Interestingly, the specific sound frequencies where hearing is not lost happen to be typically those that are used in the acoustic makeup of speech sounds in traditional Aboriginal languages.

In other words, these languages favour consonant and vowel sounds which exploit precisely that area of hearing ability which is most likely to remain intact in OME. Thus Aboriginal languages may be acoustically more robust than English as a medium of communication for those with OME-associated hearing loss.

The talk is on Tuesday the 1st of October at 5pm at The University of Melbourne. You can get more information and register here.

ALS 2013 wrap up

Last week was the 2013 Australian Linguistics Society annual conference. This year it was hosted by the University of Melbourne, and as I was involved in the admin side of things I didn’t get to see as many of the talks as in 2011 and 2012 (although the good news is that I’ll be editing the proceedings, so I’ll get a chance to read about some of the research I didn’t get to see).

I’ve storified some of the best tweets from the conference - it’s nice that every year there’s a growing community of people adding an extra layer of conference experience through social media.

I greatly enjoyed the plenaries; we were very lucky to have Eve Clark and Martin Haspelmath in the country for the first time, and after years of hearing about Andy Butcher’s work I really enjoyed hearing him talk about it. There was also a great range of work on Auslan - and it was even better that it was dotted throughout the program. The number of people interested in child language acquisition, especially in remote and small languages, was also really positive; the workshop on the topic had people packed in.

My first ALS was in Melbourne in mid-2009, when I was only a few months into my PhD. Now, a few months beyond graduating, it was lovely to reflect on how much more I feel a part of the ALS community.