Can Your Phone Translate Signs On The Fly?

NAT: Hey, there. I’m Nat.
LO: And I’m Lo, and this is our 20% Project where we go around Google learning about all the stuff we’re curious about.
NAT: And in this episode we’re finding out how the Google Translate app does the crazy sci-fi magic that it does.
LO: Specifically, there’s this feature where you press the camera button, put your phone up to something you can’t read, and voila. It instantly translates it into a language you can.
NAT: This feature actually started out as an app called Word Lens, and then last year the team that made it came to Google and are now part of Google Translate.
LO: So we met up with Otavio, who helped invent Word Lens, to find out how it works.
NAT: Otavio is a self-professed non-language guy.
OTAVIO: I used to speak Portuguese, but I kind of forgot it.
NAT: In his spare time, he likes to animate fractals.
LO: He goes snowboarding, and at his wedding, he got a quadcopter to deliver the rings.
NAT: Needless to say, we were pretty excited to go talk with him.
LO: So, OK. So how did it all start?
OTAVIO: So I was, you know,
being a tourist in Germany. I was, at some point, in a bookstore, and I was looking around me and I couldn’t read anything. And there were some books that looked really cool. They looked like, maybe, physics books or something. I love physics. And so I wanted to be able to point my camera phone at these things and translate them. And so I put together a prototype over the course of a couple weeks, and that was enough to convince me to quit my job and start working on it full-time.
NAT: And then two and a half years later, with the help of some other talented programmers–
LO: And Otavio’s super awesome mom.
NAT: –who speaks, like, 10 languages, they turned it from this prototype to a real working app.
LO: Which is now a part of Google Translate.
OTAVIO: So here’s a sign. It’s in Russian. This is the kind of thing that you will see in a Russian subway station.
NAT: I’m going to go with dangerous piranhas are in this area.
OTAVIO: So there’s the sign.
NAT: Should I guess it? I have no idea, actually.
LO: Diving.
NAT: Diving is not or something.
LO: So what’s it say?
OTAVIO: It said, Attention. Prohibition of bathing.
NAT: Oh, bathing.
LO: Yeah.
OTAVIO: And I like this
one because it does show, like, you know, the translation isn’t perfect, but it gets the idea across.
LO: So can you explain, like, how it actually works?
OTAVIO: So the way it works is the phone gets an image back from the camera. We just throw out the color. There usually isn’t a whole lot of information in the color. And first we have to process it a little bit. You know, we have to get rid of soft shadows and things that aren’t really contributing to the text. So we put it through a bunch of filters. And then at that point, we can kind of just threshold the image to black and white. So we have dark things and light things. And then we kind of look for little blobs that are sitting next to each other. And those blobs are probably letters. They might be letters, they might be trees, they might be hubcaps. So the next step is to classify what these things are. So when we do the classifier, the letter classifier, it comes back with a score. And, you know, sometimes that score says, I am very confident that this is a letter P. And sometimes the score comes back and says, I have no idea what this is, really. And so from that confidence we can start throwing things out. And then from there you basically have a bunch of letters, and you can look that up in your dictionary, do a simple translation. And then we already read all the letters, so we can pick up what the background color is around these letters. Just draw a big box around the original letters that’s in the background color, and then draw the foreground font.
LO: You make it sound easy, and yet it, like, happens instantaneously.
OTAVIO: I want to make a
window into a world that’s in your language. I want to be able to go like this, and everything that you look at is in your language. What helped me was in Russia I could buy things at the supermarket and understand what they are, you know? I mean, you see all these things, they all look the same. And one of them is milk, and one of them is cream, and one of them is some other strange kind of milk. You know? And you have to figure out which one it is.
NAT: And now Otavio and the rest of the team are trying to get new languages working, which is a lot harder than I think we’d ever realized.
OTAVIO: Latin is one of the easier scripts for the computer to read. You know, in Latin, there’s only about 80 to 100 characters that you really care about. Chinese is harder, because there are thousands of characters. Then there are things like Arabic, where the whole word is kind of– it’s like cursive where the word is continuous and it doesn’t have clear breaks. And that’s really hard for the computer to figure out. So whenever somebody suggests I do Arabic, I want to run away because of the difficulty.
NAT: I don’t know
why, in my head I was thinking hieroglyphics. That would be a hard problem.
OTAVIO: So when I was at a museum a while ago, I did a lot of pictures of hieroglyphics–
NAT: Yeah?
OTAVIO: –just in case I someday had the urge to make a hieroglyphics translator. But it’s actually really hard.
LO: So although Google Translate can’t translate hieroglyphics–
NAT: Just yet.
LO: –it did just get 20 new languages working, including Swedish, Romanian, and a few others I can’t remember.
NAT: We wanted to say thank you for, you know, taking the time out to come talk with us, so we made you this official thank you.
OTAVIO: Uh-oh. I wonder what it says. “Thanks for your help with our project. This official certificate thanks from Natalie and Lorraine. They baked for you cookies.” All right. I like cookies.
LO: Yeah. I like cookies, too. Can I have a cookie?
[MUSIC PLAYING]
-Cookie. Oh. Oh.

