Video: Alexander Hamilton in IPA

I love musicals, and no musical has loomed larger in popular consciousness in recent years than Hamilton. We don’t have to discuss all the ways in which it’s groundbreaking, but one of the things that’s significant is how Lin-Manuel Miranda has brought a hip-hop sensibility to Broadway in In The Heights and now Hamilton. One of the most prominent ways he’s done that is through his use of rhythm and rhyme.

Inspired by the Wall Street Journal’s analysis of some of Lin-Manuel Miranda’s rhymes, and by Stephen Malinowski’s musical visualisations, I’ve had the puppets do a cover of Alexander Hamilton. In this video, the lyrics of Alexander Hamilton, written in the International Phonetic Alphabet (IPA), scroll by in time with the music.

How I made this video

I recorded the audio in Pro Tools, with the cast recording as a scratch track to time my audio against. Yes, both voices are me. Yes, those are the puppets' voices; any claim that they are controlled and voiced by a human being is fake news. My sister Felicia played the piano track that you hear, which I then laid under the vocal tracks.

The next step was to write out the song in IPA. For the most part, this was fairly straightforward, but there were points where the pronunciation heard on the cast album is not the canonical General American pronunciation, and I had to decide how to transcribe it.

Props to Lin-Manuel Miranda for writing a song that I still want to listen to after I've transcribed it.

Some broad guidelines I adhered to:

  • If I heard rhotacisation, I transcribed it. If in doubt, I transcribed it. I only transcribed non-rhotic vowels if I was sure I didn’t hear any rhotacisation.
    • e.g. /ə/ instead of /ɚ/ in “bastard”, line 1.
  • If I heard /ɹ/, I transcribed it. If in doubt, I usually transcribed it. I only left it out if:
    • I was sure I didn’t hear it
      • e.g. “smarter” /smɑɾə/ in verse 2
    • I wasn’t sure whether I heard it AND leaving /ɹ/ out of the syllable coda fit the rime.
  • /æ/ is left un-raised, for my sanity’s sake. I personally don’t have /æ̝/, and I’m not comfortable making judgements on whether /æ/ is raised or not, especially when there is no contrast involved.
  • Unstressed syllables tend to be transcribed /ə/ or /ɪ/, but the vowel in “and” is often /æ/, and the vowel in “to” is often /u/. This one was a bit of a judgement call.
  • I transcribed //t// sometimes as /t/, sometimes as /ʔ/, sometimes as /ɾ/, and sometimes as null.
  • Syllabification is based purely on phonotactic rules. This means that sometimes syllable boundaries do not correspond to morpheme boundaries:
    • e.g. “start-er”: /stɑ.ɾə/
    • e.g. “with-out”: /wɪ.ðaʊt/
    • e.g. “drip-pin’”: /dɹɪ.pɪn/
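
One common way to syllabify on phonotactic grounds alone is the maximal onset principle: each syllable takes the longest consonant cluster that is a legal onset, regardless of morpheme boundaries. The sketch below is only an illustration of that idea, not necessarily the exact procedure used for the video. The names `VOWELS`, `LEGAL_ONSETS`, `syllabify` and `to_ipa` are hypothetical, and the inventories are deliberately tiny: they cover just the three examples above.

```python
# Hypothetical, heavily simplified inventories: just enough for the examples.
VOWELS = {"ɑ", "ə", "ɪ", "aʊ"}
LEGAL_ONSETS = {
    (),            # a syllable may start with a vowel
    ("s", "t"),
    ("d", "ɹ"),
    ("ɾ",),
    ("ð",),
    ("p",),
    ("w",),
}


def syllabify(segments):
    """Split a list of segments into syllables by maximal onset."""
    nuclei = [i for i, seg in enumerate(segments) if seg in VOWELS]
    if not nuclei:
        return [list(segments)]

    boundaries = []
    for prev, nxt in zip(nuclei, nuclei[1:]):
        cluster = segments[prev + 1:nxt]
        # Try the longest candidate onset first; fall back to shorter ones.
        split = len(cluster)
        for k in range(len(cluster) + 1):
            if tuple(cluster[k:]) in LEGAL_ONSETS:
                split = k
                break
        boundaries.append(prev + 1 + split)

    syllables, start = [], 0
    for boundary in boundaries:
        syllables.append(list(segments[start:boundary]))
        start = boundary
    syllables.append(list(segments[start:]))
    return syllables


def to_ipa(segments):
    """Join syllables with the IPA syllable-break dot."""
    return ".".join("".join(syl) for syl in syllabify(segments))


print(to_ipa(["s", "t", "ɑ", "ɾ", "ə"]))        # stɑ.ɾə
print(to_ipa(["w", "ɪ", "ð", "aʊ", "t"]))       # wɪ.ðaʊt
print(to_ipa(["d", "ɹ", "ɪ", "p", "ɪ", "n"]))   # dɹɪ.pɪn
```

Diphthongs are passed in as single segments (e.g. "aʊ") so that they count as one nucleus. A real onset inventory for English would be much larger.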

Halfway through the transcription, I realised I was a chump for doing this by hand, and I used an IPA transcriber instead. Of course, like any good student taking shortcuts, I checked every syllable of the transcription, and changed it to fit the pronunciation on the cast album where necessary.

The next step took some time to figure out. I love Stephen Malinowski’s work and I wanted my video to create the same sort of visual impression, but with a rigorous theoretical framework undergirding it. I had visions of creating some sort of graphical featural system, perhaps with a colour for each place of articulation, arranged on a spectrum so that adjacent places of articulation would have neighbouring colours. Manners of articulation would be represented by shapes, with voiceless stops having the sharpest corners and approximants having the gentlest (in line with the bouba/kiki effect). As for the vowels, I imagined some use of the colour wheel to reflect vowel height and backness, so that vowels closer in the vowel space would be closer in colour, and vowels further apart would have contrasting colours.

I ran out of colours.

Wit aside, the simple fact is that music is a discipline built on repeating mathematical patterns, which allows Stephen Malinowski to take advantage of colour in his videos. On the other hand, linguistics does not rest on a primarily mathematical framework, and articulatory phonetics is especially subject to the limitations imposed by human anatomy. If all of music theory flows from mathematics, all of articulatory phonetics flows from anatomy. Consequently, it’s difficult to build a notation that tries to fit consonants and vowels into some kind of colour-based visual system.

Speaking of visual representations of consonants and vowels: yes, I tried creating a kiddie version of a spectrogram that would show the acoustic relationship between similar consonants and similar vowels. It’s really unintuitive to non-linguists. It’s also really, really ugly:

No. Just no. (The above abomination is supposed to represent "how does a...")

The grown-up spectrogram of "how does a..."

Eventually, I came back to just using IPA. English speakers don’t need a linguistics background to figure out which sound each symbol represents. The visual and phonetic relationship between two syllables that have the same nucleus is immediate and obvious: you know at once that /bɹeɪn/ and /peɪn/ rhyme, even if you don’t know the featural specifications of the vowel or diphthong in the syllable nucleus.

While it would have been nice to highlight, say, sibilant fricatives or /s/ + voiceless stop sequences (which frequently fall on the same beat of consecutive bars), I figured that was too much information to try to visually represent or highlight in a video.

It’s a rap! What matters is rhythm and rhyme. I knew how to represent the rhythm; I just needed to focus on the rhyme. So, I did the logical thing: the syllables in this video are coloured according to their nucleus. That is all there is to it.
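
For the curious, colouring by nucleus can be sketched in a few lines. This is an illustration only: the video itself was coloured with conditional formatting, and the names `VOWEL_SYMBOLS`, `PALETTE`, `nucleus` and `colour_by_nucleus`, the vowel set, and the hex colours are all hypothetical. The sketch assumes each syllable's nucleus is its first unbroken run of vowel symbols, which ignores syllabic consonants.

```python
# Assumed, non-exhaustive set of IPA vowel symbols.
VOWEL_SYMBOLS = set("aeiouæɑɒɔəɚɛɜɪʊʌ")

# Hypothetical palette; colours are reused if there are more nuclei than entries.
PALETTE = ["#e6194b", "#3cb44b", "#4363d8", "#f58231", "#911eb4"]


def nucleus(syllable):
    """Return the first unbroken run of vowel symbols in an IPA syllable."""
    run = ""
    for ch in syllable:
        if ch in VOWEL_SYMBOLS:
            run += ch
        elif run:
            break  # the vowel run has ended; the rest is the coda
    return run


def colour_by_nucleus(syllables):
    """Pair each syllable with a colour chosen by its nucleus."""
    colours = {}
    result = []
    for syl in syllables:
        nuc = nucleus(syl)
        if nuc not in colours:
            # Assign palette entries in order of first appearance.
            colours[nuc] = PALETTE[len(colours) % len(PALETTE)]
        result.append((syl, colours[nuc]))
    return result


for syl, colour in colour_by_nucleus(["bɹeɪn", "peɪn", "stɑ"]):
    print(syl, nucleus(syl), colour)
```

Here /bɹeɪn/ and /peɪn/ share the nucleus /eɪ/ and so get the same colour, while /stɑ/ gets a different one.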

This is how the sausage is made.

The rest of it was just mechanics: conditional formatting, stitching screenshots into a long image, displacing each column by the correct number of pixels, sticking the image in Final Cut, calibrating the timing of the scrolling to match up with the timing of the recording, overlaying the puppets…
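
The column displacement and scroll calibration boil down to arithmetic of roughly the following shape. This is a sketch under stated assumptions, not the actual numbers or workflow: it assumes a constant tempo and a fixed pixel width per beat, and the function names and every value in the example are hypothetical.

```python
def scroll_speed(pixels_per_beat, beats_per_minute):
    """Pixels per second the image must travel to stay in time."""
    return pixels_per_beat * beats_per_minute / 60


def column_offset(beat_index, pixels_per_beat):
    """Horizontal position, in pixels, of the column for a given beat."""
    return beat_index * pixels_per_beat


def scroll_duration(image_width, pixels_per_beat, beats_per_minute):
    """Seconds needed to scroll the whole stitched image past a fixed point."""
    return image_width / scroll_speed(pixels_per_beat, beats_per_minute)


# Hypothetical values: 120 px per beat at 90 beats per minute.
print(scroll_speed(120, 90))            # 180.0 px per second
print(column_offset(4, 120))            # 480 px
print(scroll_duration(18000, 120, 90))  # 100.0 seconds
```

If the tempo drifts, as it does in most live performances, a single speed will not hold for the whole song, and the scroll would need to be keyed to the recording in sections instead.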

Two separate videos were recorded, one with each puppet, against a fluorescent green piece of card. They were recorded separately because even though I have two hands, I lack the coordination to control two puppets at the same time. I put the puppets together in post-production, which is pretty obvious if you know what to look out for.

That’s how I arrived at the video you see here. I actually finished making this video a few months ago, but I wanted to wait till the Articulatory Phonetics series was done before publishing it, so that if anyone was curious about the IPA symbols, they could look up the theory behind the phonetics.

If I had the chance, there are a couple of things I might change, but for what is almost a one-person job (once again, shoutout to my sister Felicia for playing the piano part), it’s not too shabby.

I’m happy to take questions about the transcription, the creative process, or the production process. Enjoy.