Speaker: Taylor Berg-Kirkpatrick (UC Berkeley)

Title: Unsupervised Transcription of Language and Music


A variety of transcription tasks, such as historical document transcription and polyphonic music transcription, can be viewed as linguistic decipherment problems. I’ll describe an approach to such problems that involves building a detailed generative model of the relationship between the input (e.g., an image of a historical document) and its transcription (the text the document contains). It turns out that these models can be learned in a completely unsupervised fashion, without ever seeing an example of an input annotated with its transcription, effectively deciphering the hidden correspondence. I’ll demo two state-of-the-art systems, one for historical document transcription and one for polyphonic piano music transcription, that outperform supervised methods.

Slides: (pdf)