On the challenges of spoken language identification in a live setting…
* Lá Fhéile Pádraig – St Patrick’s Day
The current President of Ireland (Uachtarán na hÉireann) is Michael D. Higgins, a poet, university lecturer and former senator from Galway.
On the Irish national holiday on March 17th, he delivered his annual address in Irish, the national and first official language of the Irish State.
This address was uploaded to YouTube, where the captions were generated by automatic speech recognition (ASR). YouTube has been providing this service for a number of years now, and although the results are not always stellar, they provide useful input for content-based video indexing, topic modelling and other natural language processing techniques where an approximate, rather than word-for-word, transcription is sufficient.
Of course, the YouTube subtitling ASR system used was an English language one, which resulted in subtitles such as:
s a Lolla pottery longer talk to me to Kayla
which should have been:
“Is é Lá ‘le Pádraig an lá go dtagaimid le chéile…”
St Patrick’s Day is the day that we come together.
And the rather more embarrassing:
cock merely she can’t play in his scaly
which actually reads:
“…imirceach míle sé chéad bliain ó shin…”
Immigrants one thousand six hundred years ago
Spoken language recognition is a trickier beast than its written-language counterpart. The technology does exist, and is no doubt used for more nefarious purposes than this one, but whether it belongs in the pipeline for YouTube uploads is perhaps not a question for today’s post.
As far as I am aware, Google does not currently deploy an Irish-language ASR system at all, although their Google Translate offering for Irish is not bad (at least in the Irish-English direction), if maybe not fit for official purposes. By law, official documentation of the Irish state, and also a subset of EU documentation, must be translated into Irish.
Furthermore, with regard to Irish-language speech recognition in general, there does not appear to be any great interest in academic circles either, as alluded to by Judge et al. (2012), who note:
“In the area of automated speech recognition there has been no development to date for Irish, however many of the resources which have been developed for synthesis are also crucial for speech recognition and to this extent the foundations for this aspect of technological development are being laid”
As the language is currently spoken on a daily basis by only an ever-dwindling number of people, the demand for such a system could be rather low. Although there are plenty of Irish-language recordings in the wild which could be used to build one, the differences in dialect and pronunciation between the regions where Irish is still spoken could make training an ASR system difficult. Even so, digitisation and the creation of such systems could be a very valid step in the preservation of the language. The Phonetics and Speech Laboratory at TCD also has extensive experience in Irish-language technology systems, although synthesis is the prevailing modality there also.
So cad a cheapann tú (what do you reckon) Google? Maybe by next Paddy’s Day, you could train up a small Irish-language speech recognition system and give Michael D. the subtitling he deserves!
Either that or upload a manual translation in advance, but sure where’s the fun in that?
Is maith an scéalaí an aimsir.
(lit) Time is a good storyteller.
Time will tell.
John Judge, Ailbhe Ní Chasaide, Rose Ní Dhubhda, Kevin P. Scannell, and Elaine Uí Dhonnchadha. An Ghaeilge sa Ré Dhigiteach – The Irish Language in the Digital Age. META-NET White Paper Series: Europe’s Languages in the Digital Age. Georg Rehm and Hans Uszkoreit (series editors). Springer, Heidelberg, New York, Dordrecht, London, September 2012.