Monday, February 8, 2010

Google’s new Tower of Babel

You have to admire the sheer market power and dominance that Google has these days. They announce speech-to-speech machine translation on future Android-powered phones – and the whole tech blogger universe goes ballistic in talking about it and likening it to the famous Babel fish of Douglas Adams’ The Hitchhiker’s Guide to the Galaxy.

So why does everybody think machine translation should suddenly work? If you happen to be bilingual or even just fluent in more than one language, you know very well that tools like Google Translate can only convey the basic and most rudimentary nuances of any document. Now combine that flawed and unreliable piece of technology with something equally unreliable: speech recognition. What do you get when you mix these two and stir well? Possibly the foundation for a new Tower of Babel, but certainly not the famous Babel fish!

[Image: Turmbau_Babel (Tower of Babel)]

But do not take my word for it. Or rather, please do take my word for it exactly as it is being processed by those two technologies. After drafting this little blog entry I decided to put Google to the test. As a Google Voice user I already get access to their famous speech recognition, so I called my own Google Voice number and recorded this script. I then took the transcription that Google provided and fed it through the Google Translate service twice: once translating it into German, and then translating it back to English for your enjoyment.

Do you now see what I mean… this is the stuff that can start wars…

Transcription of the above text recorded with Google Voice:

You have to. My but she in market power dominance that Google has these days. Again that owns speech touch each machine translations in future, and with Pulte Homes and to hold tech blog universe goes the lease taken talking about it and liking it to the famous people cessation of. Douglas Adams the Hitchhiker's Guide. To, galaxy. Why does everybody think machine translation should probably work. If you happen to be bilingual, or you can just fluent in more than one language. You know very well but tools like Google Translate can only come database. He can mostly. Maintree nuances of any documents and I'll come by and that's log on reliable piece of technology was something equally unreliable speech recognition. Both you get when you mix these two Lindstrom well. Possibly the foundation for a new tell label, but certainly not the same Staples which but then I'll take my word for it. Or rather, please do take my word for exactly as it is being processed by those to Technologies at the draft this little blog entry. I decided to put group of the test, google voice use them. I would get access to the famous Peter deficient. So I called my own voice number and records the script. I didn't took the transcription of Google provided and said he'd through google plates of twice, one is translating it into driving and then translating it back to English for your enjoyment. Do you know see what I mean, this is the stuff, but in stock awards.

Taking the Google Voice transcription and translating it into German produces:

You have to. Meine aber sie der Marktmacht Dominanz, dass Google in diesen Tagen. Auch besitzt die Rede berühren sich maschinelle Übersetzungen in Zukunft und mit Pulte Homes und Tech-Blog-Universum zu halten geht den Mietvertrag aufgenommen, darüber zu sprechen und Zuneigung es zu den berühmten Einstellung der Menschen. Douglas Adams The Hitchhiker's Guide. So, Galaxie. Warum alle denken, die maschinelle Übersetzung, sollten Sie vielleicht zu arbeiten. Wenn Sie gerade sind zweisprachig, oder Sie können einfach fließend in mehr als einer Sprache. Sie wissen sehr gut, aber Tools wie Google Translate kann nur Datenbank kommen. Er kann vor allem. Maintree Nuancen von Unterlagen und ich komme vorbei und das ist log auf zuverlässige Teil der Technologie war etwas ebenso unzuverlässig Spracherkennung.Beide erhalten Sie, wenn Sie Mischung dieser beiden Lindstrom gut. Möglicherweise die Grundlage für ein neues Label zu sagen, aber sicherlich nicht die gleiche Staples, die dann aber ich werde mein Wort zu nehmen. Oder besser gesagt, bitte nehmen Sie mich beim Wort genau so, wie es von jenen, die Technologien auf den Entwurf verarbeitet diese kleine Blog-Eintrag. Ich beschloss, Gruppe die Probe gestellt, Google Voice verwenden. Ich möchte Zugang zu dem berühmten Peter mangelhaft. Also rief ich meine eigene Stimme Anzahl und zeichnet das Skript. Ich habe nicht nahm die Transkription von Google zur Verfügung gestellt und sagte, er würde durch google Platten zweimal, ist eine Übersetzung in die Lenk-und übersetzt sie dann wieder auf Englisch für Ihren Genuss. Weißt du was ich meine, das ist der Stoff, sondern in Wertpapieren.

Taking the German translation and translating it back to English yields:

You have to. My but they dominate the market power that Google in those days. The question has come in contact with machine translation in the future, and Pulte Homes, and Tech Blog Universe to hold the lease is taken to talk about it and love it to famous people's attitudes.Douglas Adams The Hitchhiker's Guide. Thus, galaxy. Why do all the thinking, machine translation, you might want to work. If you are not bilingual, or you can just fluent in more than one language. You know very well, but tools like Google Translate can only database. He is above all. Maintree nuances of documents and I come by and this is log on reliable piece of technology was something as unreliable speech recognition. Both of you get when you mix these two well Lindstrom. Perhaps to say the basis for a new label, but certainly not the same Staples, which then I shall take my word. Or rather, please take my word for it just like it from those technologies are processed in the draft this little blog entry. I decided to put the sample group to use Google Voice. I want access to the famous Peter deficient. So I called my own voice number and characterized the script. I did not take the transcript made available by Google and said he would be counted twice by google tiles is a translation into driving and then translated back to English for your enjoyment. You know what I mean, this is the substance, but in securities.
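For a rough sense of how much survives the round trip, here is a minimal sketch that compares the first sentence of my script with what came back from English>German>English. It is only an illustration: the helper names `words` and `word_similarity` are made up for this example, and the word-level ratio from Python's standard `difflib` module is a crude stand-in for meaning, not a real translation quality metric.

```python
import re
from difflib import SequenceMatcher


def words(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def word_similarity(a, b):
    """Return a ratio between 0 and 1 for how closely two word sequences match."""
    return SequenceMatcher(None, words(a), words(b)).ratio()


# First sentence of the original script, as written in this post.
original = (
    "You have to admire the sheer market power and dominance "
    "that Google has these days."
)

# The same stretch after speech recognition and the German round trip.
round_trip = (
    "You have to. My but they dominate the market power "
    "that Google in those days."
)

print(f"word similarity: {word_similarity(original, round_trip):.2f}")
```

A score of 1.0 would mean the words came back unchanged and in the same order. Anything lower only tells you that words were lost or shuffled, not whether the sentence still makes sense.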

But this isn’t limited to just English>German>English translations. You can have as much fun going via a detour into Japanese. Taking the Google Voice transcription and translating it into Japanese results in:

あなたがしなければならない。私が、彼女は市場の力を支配するには、Google、これらの日があります。もう一度、音声将来的に各マシンの翻訳を触れると、所有しているとPulteホームズとハイテクのブログ宇宙保持するためには、リースはそれについての有名な人が停止することand liking話して撮影だ。ダグラスアダムズヒッチハイクガイド』。 、銀河する。なぜ誰もが機械翻訳はおそらく動作するはずだと思う。もしあなたがバイリンガルにするか、またはだけで、複数の言語に堪能することができます。 You know非常によくしかし、ツールGoogle翻訳のような専用のデータベース来ることができます。彼がほとんど。任意のドキュメントのMaintreeニュアンスと私が来るから、技術の信頼性の高い作品には、そのログ何か同じように信頼性の音声認識された。 Both youするときにも、これらの2つのリンドストロムミックスを取得します。新しいが、確かに同じステープルズは、しかし、私はそれを私の言葉を取るよていないラベルを伝えるためのおそらく基盤。というか、してください正確には、ドラフト、この小さなブログのエントリでこれらのtoTechnologiesによって処理されている私の言葉を取るか。私は、テストのグループに配置することを決定、Googleの音声を使用します。私は、有名なピーター欠乏へのアクセスになるだろう。だから、私は自分の声を数と呼ばれるスクリプトを記録します。私は提供される転写of Googleしたことはなく、彼を2回、1つして運転英語を楽しむために戻すの翻訳には翻訳さのプレートのGoogleのだという。私の言いたいことを知って、このものですが、在庫あり賞を受賞した。

Last, but not least, taking the Japanese translation and translating it back to English yields:

You have to do. I, her ability to dominate the market, Google, these days there. Again, and touch each machine in the future translation of speech, and owns space to hold Pulte Holmes and tech blogs, the lease is known to stop people talking about it, shooting it and liking. Dagurasuadamuzuhitchihaikugaido 』. To galaxy. Why Machine Translation I think everyone should probably work. If you or a bilingual, or just can be fluent in several languages. You know But very often, you can get Google tools such as a dedicated database of translations. Most of him. Any documents and I'll come Maintree nuances, reliable technology work is recognized as a reliable voice something that log. Both you even when these two get one Rindosutoromumikkusu. The new Staples is certainly the same, but I was probably based on a label to tell I do not take my word for it. I mean, exactly please the draft, those in this little blog entry to Technologies take my word for it or being handled by. I decided to put to the test group, Google will use the voice. I would want access to the famous Peter. So, I called the script logs the number of your voice. Of Google I never provided a transcript, he twice returned for one to enjoy the English translation of the two drivers is that Google's translation of the plate. You know what I mean, that is, the award-winning stock.

Now imagine any of the above translations being read back to you with text-to-speech synthesis...

And that is why I’m very skeptical about such announcements of speech-to-speech machine translation – even when they come from Google.

Comments:

Brian Barker said...

Google's "Babel Fish" translator will never solve the language problem. Not only does it discriminate against anyone who cannot afford a mobile phone, but against minority language groups as well.

There are 6,800 languages worldwide, not fifty-two!

Moreover, if I met a native in Borneo, and he said to me in Hakka, "I've lost my mobile phone," how would I understand him? :)

And how many starving Africans can afford a mobile phone?

As English loses its economic power, the answer is not for us to move to Mandarin Chinese, but to Esperanto, which puts all speakers on an equal footing.

Have a look at http://www.lernu.net or http://www.esperanto.net

XML Aficionado said...

Funny you should mention Esperanto. I just recently started learning it...

Kurt Cagle said...

The intriguing thing to me is not that the translations are bad. It is that the translations are, even into and out of other languages, still understandable enough that you can, with a lot of effort, get at least a sense of what the original message was.

The analogy here is the old saw about a man in a bar who glances up and sees, at a table near him, an old geezer and a dog playing chess. Astounded, he walks over to the table just as the dog moves its knight to capture a rook.

"That's amazing," the man says to the geezer. "Your dog must be a genius!"

"Nah, not really," the old man replies, "I still beat him three times out of five."

Given the variables involved, the fact that training can only occur (if it happens at all) over the intervals of conversation, and the variations in vocal pitch, tonality and accent, it's rather incredible that the first-order translations are even vaguely cogent, let alone that you can manage translations across a language and back and still retain at least a semblance of meaning.

Sometimes I think we get too blasé about the miracles that we create.