27 March 2008 - 22:02 Silly Google Language Test

I don’t yet have a useful idea for a Google Language project, but the little script at the bottom of this post is me playing with their translation API.

My fundamental theory is that translating text from English to another language and then back provides a good test of your translation engine. This script does that, over and over, cycling through most of the languages that Google knows. If you uncheck one of the languages, it’s excluded from the cycle.
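The cycle described above can be sketched in a few lines of Python. This is not the script from this post, only a minimal illustration. The `translate` callable is a hypothetical stand-in for whatever translation service you wire up, and the sketch assumes every hop goes from English to the other language and straight back. Unchecking a language corresponds to leaving it out of `languages`.

```python
from typing import Callable, List, Sequence

# Hypothetical signature: translate(text, source_lang, target_lang) -> str.
# Plug in a real translation client here; none is assumed by this sketch.
Translator = Callable[[str, str, str], str]


def round_trip_cycle(
    text: str,
    languages: Sequence[str],
    translate: Translator,
    passes: int = 1,
    home: str = "en",
) -> List[str]:
    """Translate home -> lang -> home for each language, `passes` times.

    Returns every intermediate home-language result, starting with the
    original text, so you can watch the meaning drift.
    """
    history = [text]
    for _ in range(passes):
        for lang in languages:
            if lang == home:
                continue  # a round trip through the home language is a no-op
            foreign = translate(history[-1], home, lang)
            history.append(translate(foreign, lang, home))
    return history
```

Calling `round_trip_cycle(phrase, ["ar", "zh", "nl"], translate, passes=2)` would run the phrase through each listed language twice; the last element of the returned list is the final result.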

I have not yet found the limits of this script’s power. For one thing, now we know that if you translate the phrase “I’m pretty sure that you can’t do that on television.” through Arabic, Chinese, and Dutch all the way through Spanish, twice, you get the phrase “This is my body.”

Now I’m going to use the script to tell the future: If you give Google Language the phrase “Will Obama, Clinton, or McCain be the next President of the United States?” you get (after a lot of iterations) the result “The United States, Britain, the next president of the United States, President Clinton, Obama jonmakein mentioned?” That’s pretty clear.

The script can settle an ongoing dispute with my wife: Starting with “Is the ‘guy with an orange for a head’ joke funny?” results in “‘Orange is the President of the United States of America,’ Fortunately, there are several of them?” So…yeah. Pretty damn funny. Thanks, Google.

Finally, I will use this script to turn conflict into art: Starting with “Can you get a cold from being too cold? Faye doesn’t think so.” results in this haunting haiku:

In such cases, the
use of refrigerants. Flu?
I do not think so.

3 Comments » | Tags: Geekery

24 March 2008 - 22:01 Ahem

Okay, so much for the weekly-posting pledge.

To get into the habit of writing something here regularly, I’ll post what I’m thinking about right now, which is this.

Google provides an API for automatic language translation. I have no idea how good the translations are, but soon I’ll use this for something. That thing will be adorable and only slowly, perhaps too late, will people realize that it is also important.

No Comments » | Tags: Geekery

15 February 2008 - 16:40 AudioTurk

I forgot to post - I wrote a first pass at the speech->MTurk->speech script last week. I may clean it up and post the script someplace public, in case anyone else wants to play with it.

I get a variety of kinds of answers:

  • If I ask a question of fact (“How many cows are in Wisconsin?”), I get a cut/paste from Wikipedia.
  • If I ask a question that can be reduced to a question of fact by a literal reading (“How do you feel about the interplay of politics and religion?”), I get a cut/paste from Wikipedia.
  • If I ask a whimsical question (“Why can’t I find my pants?”), I get a cute multi-sentence essay.
  • If I ask an earnest question (“What’s your life like?”), I get something pretty heartfelt.
  • If I ask someone to settle a dispute with my wife (“Who’s right, me or my wife?”), I am invariably told that my wife is right.

1 Comment » | Tags: Crowdsourcing, Geekery

9 February 2008 - 1:59 The Whimsy of Crowds

Huh.

I just posted the “Summer day” question to Mechanical Turk for $1.00 and got a response within seconds.

I posted it again for $0.05 and got a response in less than a minute.

Awesome. I’m going to implement the “Ask the internet” engine now.

But first, I just spent $0.05 on the question “Why can’t I find my pants?” and got this response:

“Because your wife decided it’s time for her to be wearing them in your family. But just in case, did you check under the sofa cushions? That’s like the black hole of lost items in a home and usually where everything missing turns up. Good luck!”

No Comments » | Tags: Crowdsourcing, Geekery

9 February 2008 - 1:24 Request for Ideas: Mechanical Turk

An article today reminded me of Amazon’s Mechanical Turk, which I’ve been meaning to look at for a while.

The idea behind the Mechanical Turk (named after a clever 18th-century scam) is that there are some tasks that are tough to perform computationally, but are very simple for humans…recognizing things in photographs, answering questions about personal experiences, etc. Amazon calls those things “Human Intelligence Tasks,” and it provides an online marketplace for getting them done: A Requester posts a task and a price that he’s willing to pay for it (“Tell me if the person in this picture looks happy, and I’ll give you $0.01.”), and a Worker can choose to perform the task for that price. (Side-question: Does this sound evil? Wikipedia says that some criticize the system as being a “virtual sweatshop,” and I can see the point, but it goes on to say that some people perform these tasks as a hobby. Assuming no-evil, I continue:)

Amazon provides an Application Programming Interface to its MTurk system, which means that I can write a program that submits a task request, gets responses, and pays the people who respond…all automatically. That sounds awesome: I love the idea of having an automatic system that marshals small motes of human effort. But I’m stuck for a good idea for a simple project.
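For a rough picture of that submit/collect/pay loop, here is a minimal Python sketch. It is an assumption-laden illustration, not anything I have running: it assumes a `client` object shaped like the MTurk client in Amazon's boto3 library (which is much newer than this post), with its `create_hit`, `list_assignments_for_hit`, and `approve_assignment` calls. The title, description, reward, and timing values are made up for the example.

```python
from typing import List
from xml.sax.saxutils import escape

# Schema namespace for MTurk's QuestionForm XML format.
QUESTION_XSD = (
    "http://mechanicalturk.amazonaws.com/"
    "AWSMechanicalTurkDataSchemas/2005-10-01/QuestionForm.xsd"
)


def build_question_xml(prompt: str) -> str:
    """Wrap a plain-text prompt in a one-question, free-text QuestionForm."""
    return (
        f'<QuestionForm xmlns="{QUESTION_XSD}">'
        "<Question>"
        "<QuestionIdentifier>answer</QuestionIdentifier>"
        "<IsRequired>true</IsRequired>"
        f"<QuestionContent><Text>{escape(prompt)}</Text></QuestionContent>"
        "<AnswerSpecification><FreeTextAnswer/></AnswerSpecification>"
        "</Question>"
        "</QuestionForm>"
    )


def post_question(client, prompt: str, reward_usd: str = "0.05") -> str:
    """Submit the task. `client` is assumed to be boto3.client("mturk")."""
    response = client.create_hit(
        Title="Answer a short question",            # illustrative values only
        Description="Read one question and type a brief answer.",
        Reward=reward_usd,                          # dollars, as a string
        MaxAssignments=1,
        LifetimeInSeconds=3600,
        AssignmentDurationInSeconds=300,
        Question=build_question_xml(prompt),
    )
    return response["HIT"]["HITId"]


def collect_and_pay(client, hit_id: str) -> List[str]:
    """Fetch submitted answers (raw answer XML) and approve payment for each."""
    result = client.list_assignments_for_hit(
        HITId=hit_id, AssignmentStatuses=["Submitted"]
    )
    answers = []
    for assignment in result["Assignments"]:
        answers.append(assignment["Answer"])
        client.approve_assignment(AssignmentId=assignment["AssignmentId"])
    return answers
```

Passing the client in as a parameter keeps the sketch testable with a stand-in object, and means nothing here spends real money until you hand it a real client.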

Here’s one idea that assumes a much quicker turnaround-time than is probably reasonable:

  • Say I want my computer to intelligently answer verbal questions of the form “Computer: What’s the best thing about a summer day?”
  • I write a program that’s always listening for the word “Computer,” and then records everything that I say until the next pause.
  • The program takes the audio-file that contains my question and posts it as an MTurk task, “Please answer this question for $1.00.”
  • Someone sees the task and answers with text that says “I have never once been hit in the head by a summer day.”
  • My computer is notified of the answer, authorizes the payment of $1.00, and uses text-to-speech to read the answer to me.

That’s pretty cute, but it depends on someone answering the question quickly, and I don’t know if that expectation is reasonable. Although now I want to spend a dollar just to see.
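Since the open question is turnaround time, the program would need to give up gracefully when nobody answers. A minimal sketch of that waiting step in Python, assuming a hypothetical `fetch_answer` callable that returns the answer text, or `None` if no one has responded yet:

```python
import time
from typing import Callable, Optional


def wait_for_answer(
    fetch_answer: Callable[[], Optional[str]],
    timeout_s: float = 60.0,
    poll_interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll until an answer arrives or the deadline passes.

    Returns the answer text, or None if nobody answered in time.
    `sleep` is injectable so the loop can be exercised without waiting.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        answer = fetch_answer()
        if answer is not None:
            return answer
        if time.monotonic() >= deadline:
            return None  # too slow: the caller can apologize out loud instead
        sleep(poll_interval_s)
```

The timeout and polling interval are guesses; the right values depend on how fast people actually respond at a given price.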

So: Anyone have any cute ideas? There has to be something creative to do here; something that provides a programmatic window into human whimsy.

No Comments » | Tags: Crowdsourcing, Geekery