Trying to confuse Google’s Vision algorithms with dogs and muffins

When I saw this set of very similar pictures of dogs and muffins (which comes from @teenybiscuit’s tweets), I had only one question: how would Google’s Cloud Vision API perform on this?

At a quick glance, the difference is not obvious even to a human, so how does the machine perform? It turns out it does pretty well. Check the results in this gallery:

(also find the album on imgur)

In almost every set, there is one tile that is completely wrong, but the rest are at least in the right category. Overall, I am really surprised by how well it performs.

You can try it yourself online with your own images here, and of course find the code on GitHub.

Technically, it is built entirely in the browser. There is no server-side component, except for what’s behind the API of course:

  • Images are loaded from presets or via the browser’s File API.
  • Each tile is converted into its own image, then encoded as base64.
  • All of this is sent at once to the Google Cloud Vision API, asking for label detection results (this is what matters to us here, even if the API can do much more like face detection, OCR, landmark detection…)
  • Only the label with the highest score is kept from the results and printed back into the main canvas.
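
The steps above can be sketched roughly as follows. This is not the code from the GitHub repository, only a minimal illustration: the helper names (dataUrlToBase64, buildLabelRequest, annotate, topLabels) are made up for this example, and the API key is assumed to be supplied by the caller. The request and response shapes follow the Cloud Vision images:annotate REST endpoint.

```javascript
// Strip the "data:image/png;base64," prefix that canvas.toDataURL() produces,
// since the API expects only the raw base64 payload.
function dataUrlToBase64(dataUrl) {
  return dataUrl.slice(dataUrl.indexOf(",") + 1);
}

// Build one batch request body: one entry per tile, label detection only.
function buildLabelRequest(base64Tiles, maxResults) {
  return {
    requests: base64Tiles.map(function (content) {
      return {
        image: { content: content },
        features: [{ type: "LABEL_DETECTION", maxResults: maxResults }]
      };
    })
  };
}

// Send the whole batch at once (apiKey is an assumption: bring your own).
function annotate(body, apiKey) {
  return fetch("https://vision.googleapis.com/v1/images:annotate?key=" + apiKey, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body)
  }).then(function (res) { return res.json(); });
}

// Keep only the highest-scoring label for each tile (null if none came back).
function topLabels(apiResponse) {
  return apiResponse.responses.map(function (r) {
    var labels = r.labelAnnotations || [];
    if (labels.length === 0) return null;
    var best = labels.reduce(function (a, b) { return b.score > a.score ? b : a; });
    return best.description;
  });
}
```

The responses array comes back in the same order as the requests, so the n-th label can be drawn onto the n-th tile of the canvas.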

Call Wikipedia API using jQuery

I decided to directly use the data from Wikipedia. Many pages contain a structured “Infobox” that I will use to gather information I need.

There is a Wikipedia API (more precisely, MediaWiki, the engine of Wikipedia, has an API). I invite you to read the documentation.

Here are some examples of what can be done using the JavaScript library jQuery:

  • get the source of a page (API doc):
    $.getJSON("http://en.wikipedia.org/w/api.php?action=query&format=json&callback=?", {titles:pageName, prop: "revisions", rvprop:"content"}, wikipediaPageResult);
  • get the image names of a page (API doc):
    $.getJSON("http://en.wikipedia.org/w/api.php?action=query&format=json&callback=?", {titles:pageName, prop: "images"}, wikipediaImageResult);
  • get the HTML formatted content of a page (API doc) (does not follow redirects)
    $.getJSON("http://en.wikipedia.org/w/api.php?action=parse&format=json&callback=?", {page:pageName, prop:"text"}, wikipediaHTMLResult);
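
As an illustration of what a callback such as wikipediaPageResult might do with the first query, here is a small sketch. The function name extractWikitext is hypothetical, and it assumes the API's default JSON layout, where pages are keyed by page id and the wikitext of a revision sits under the "*" key.

```javascript
// Pull the raw wikitext out of an action=query&prop=revisions response.
// Returns null when the page is missing or has no revision content.
function extractWikitext(data) {
  var pages = (data && data.query && data.query.pages) || {};
  var ids = Object.keys(pages);
  if (ids.length === 0) return null;

  // Only one title was requested, so only one page entry is expected.
  var page = pages[ids[0]];
  if (!page.revisions || page.revisions.length === 0) return null;

  return page.revisions[0]["*"];
}

// Possible use as the getJSON callback (an assumption, not the gist's code).
function wikipediaPageResult(data) {
  var wikitext = extractWikitext(data);
  if (wikitext === null) {
    console.log("Page not found");
  } else {
    console.log(wikitext.substring(0, 200));
  }
}
```

A missing page comes back with an id of -1 and no revisions, which is why the function checks before indexing.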

Note the required “&callback=?” in the query: it tells jQuery to use JSONP, a way to make cross-site JavaScript calls. Thanks to Stack Overflow for this tip.
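
To make the JSONP mechanism less magical, here is a rough sketch of what happens when jQuery sees “callback=?”. This is not jQuery's actual implementation. The names buildJsonpUrl and jsonp are invented for the example. The idea is to register a temporary global function, then load the API URL through a script tag so the response calls that function.

```javascript
var jsonpCounter = 0;

// Append the query parameters and the callback name to the base URL.
function buildJsonpUrl(baseUrl, params, callbackName) {
  var query = new URLSearchParams(params);
  query.set("callback", callbackName);
  return baseUrl + "?" + query.toString();
}

// Browser only: script tags are not subject to the Same Origin Policy,
// so the response "callbackName({...})" runs as ordinary JavaScript.
function jsonp(baseUrl, params, onResult) {
  var callbackName = "jsonpCallback" + (jsonpCounter++);
  var script = document.createElement("script");

  window[callbackName] = function (data) {
    delete window[callbackName]; // clean up the temporary global
    script.remove();
    onResult(data);
  };

  script.src = buildJsonpUrl(baseUrl, params, callbackName);
  document.head.appendChild(script);
}
```

For example, jsonp("http://en.wikipedia.org/w/api.php", {action: "query", format: "json", titles: pageName, prop: "images"}, wikipediaImageResult) would behave much like the second getJSON call above.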

Note that to get the HTML content of a wiki page using JavaScript from the browser, I cannot use the “action=render” parameter of index.php because of the Same Origin Policy. I had to use the API to do it client-side. I think I will rewrite my system to make this call server-side.

Finally, check my gist on GitHub to get the full code showing how to import a Wikipedia page in JavaScript.