Have you ever wanted to search through your photos automatically, the way you search with Google? Say you recently attended a friend’s wedding and want to find the images that include you among the hundreds taken at the event. Or you want to collect all the photos of your lovable pet golden retriever.
Previously, your only option was to comb through each and every image manually to find the ones you were looking for. Nowadays, machine learning offers a much better option...
You can create your own application that automates searching through your photos for you! Even better, it will learn and get better as you give it more data. The Clarifai API offers image recognition as a service. Basically, it gives you machine learning functionalities without writing a single line of code related to machine learning.
To highlight how specific Clarifai can get when identifying concepts in images, we’ll look at golden retrievers as an example.
Thanks to Elvis Rozario and Major League Hacking for creating and sharing this Clarifai project tutorial and code.
First, a bit more on how Clarifai works. Clarifai will predict the “concepts” or tags represented within a particular image. When you train your own classifier (as we are about to do here), you can create anything you want to search for as a concept, then search any image on the basis of that concept. Later on, you can search on the basis of one or more concepts you’ve created. In our example, we’ll be creating “golden retriever” as a concept and adding corresponding images.
Clarifai also allows you to search for images on the basis of another image, i.e. visual search. To identify a golden retriever, we’ll first need to show it photos of golden retrievers. This is called training, and we’ll use a few photos to train our Clarifai model. The more photos we provide, the better it will get at identifying golden retrievers; and since Clarifai’s API has already been trained on thousands of images of golden retrievers, your model will learn from those as well. Later, we can choose an image and verify how likely it is to contain the dog breed we seek.
Let’s get started with our example application, which we’ll create in 10 short steps. The Clarifai API supports quite a few languages. I’ll be writing this application in NodeJS for its ease of set-up.
Before creating the application, you’ll need to create an account in Clarifai to use their API. Once you’re logged in, you’ll have a default application in your account named ‘my-first-application’; you can either use that one or create a new application. In your application, click on the API key menu on the left and copy your API key.
The part colored over in blue is your API key unique to the application. Copy it so that we can use it later on. Be sure to keep it secret since it is linked to your personal account.
Once that’s done, you’re ready to start developing the application!
Open the code editor of your choice (VS Code, in my case) and create a file named index.js. You can name it anything you want, but the extension should be .js since we’re writing in NodeJS. Paste the following code:
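The original snippet isn’t included above, so here is a minimal sketch based on the legacy `clarifai` npm client. Treat the exact call shape as an assumption to verify against Clarifai’s documentation.

```javascript
// index.js
// Assumes the legacy Clarifai JavaScript client, installed with:
//   npm install clarifai
const Clarifai = require('clarifai');

// Initialize a Clarifai app with your API key (keep it secret!).
const app = new Clarifai.App({ apiKey: 'YOUR_API_KEY' });
```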
Replace ‘YOUR_API_KEY’ with the API key you copied in the first step.
The first line tells the application to use the Clarifai package, which we installed with npm (`npm install clarifai`). The next line initializes a new Clarifai app with your API key. This tells Clarifai that the API calls are for your particular Clarifai account and application.
Let’s add the images we want to search on. In my case, it’ll be a set of random dog images.
In the below code, I’ve collected some dog images from a simple Google search and added them as inputs.
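The upload code itself isn’t shown here, so below is a sketch. It assumes the legacy `clarifai` client’s `app.inputs.create` method; the URLs and the `addImages` helper name are our placeholders, not from the original tutorial.

```javascript
// Assumes `app` is the Clarifai.App instance from index.js.
// Placeholder URLs standing in for the dog photos from a Google search.
const dogImages = [
  { url: 'https://example.com/dogs/pug.jpg' },
  { url: 'https://example.com/dogs/husky.jpg' },
  { url: 'https://example.com/dogs/golden-1.jpg' },
];

// Upload the images as inputs to your Clarifai application.
function addImages(app, images) {
  return app.inputs.create(images);
}

// Usage:
// addImages(app, dogImages)
//   .then(inputs => console.log(`Uploaded ${inputs.length} images`))
//   .catch(err => console.error(err));
```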
Note: A maximum of 128 images can be sent in one call. If you have more than 128 images, batch them into separate calls. You can also send base64-encoded images instead of image URLs.
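Since the 128-input limit applies per call, a small helper (ours, not part of the Clarifai client) can split a larger collection into appropriately sized batches:

```javascript
// Split an array into batches of at most `batchSize` items, so each
// app.inputs.create call stays within Clarifai's 128-input limit.
function toBatches(items, batchSize = 128) {
  const batches = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Usage sketch: upload each batch in its own call.
// for (const batch of toBatches(allImages)) {
//   await app.inputs.create(batch);
// }
```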
Now that the images have been uploaded to Clarifai, you can search on the basis of another image using visual search. In this case, since I want to search for golden retrievers, I’ll find an image of a golden retriever and search with it as an input.
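The search snippet isn’t shown; a sketch of this step, with the call shape assumed from the legacy `clarifai` client’s `app.inputs.search`:

```javascript
// Search previously uploaded inputs for images visually similar
// to a query image.
function searchByImage(app, imageUrl) {
  return app.inputs.search({ input: { url: imageUrl } });
}

// Usage:
// searchByImage(app, 'https://example.com/query/golden.jpg')
//   .then(response => console.log(response))
//   .catch(err => console.error(err));
```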
The response will be a list of image URLs with similarity scores ranked relative to the image you searched with. The image URLs are those of the images you uploaded initially.
The predictions in the response are sorted in descending order. In this example, the highest score is 0.895311.
We’ve now searched images with an image, so let’s explore the other method: searching images with a concept. The Clarifai API already gives us thousands of common concepts, like dog, people, etc.
I’m confident that of the images I uploaded earlier, one of them was of a pug. Since Clarifai’s General model has a predefined concept named pug, I can search with that concept word.
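The concept-search snippet isn’t included; a sketch, again assuming the legacy client’s `app.inputs.search` signature:

```javascript
// Search uploaded inputs by one of the General model's
// predefined concepts, such as 'pug'.
function searchByConcept(app, conceptName) {
  return app.inputs.search({ concept: { name: conceptName } });
}

// Usage:
// searchByConcept(app, 'pug').then(console.log).catch(console.error);
```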
You’ll get a response like this:
This shows that, for my input, there was only one image matching the specified criterion, and the match had a high probability of almost 1 (or 100%).
So you know how to search on the basis of concepts already available in the Clarifai General model, but what if you want to search for something that isn’t already available? Let’s create our own concept of ‘golden retriever’ in our Clarifai app.
We’ll add some images with concepts as labels. We should know what these images contain since we are giving them as labeled examples, so we won’t use random photos like we did before. In this case, let’s add some golden retriever images along with the concept name ‘golden retriever.’
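The labeling snippet isn’t shown; a sketch with placeholder URLs, assuming the same `app.inputs.create` call from the legacy client:

```javascript
// Placeholder URLs standing in for your golden retriever photos.
// Each input now carries a `concepts` array naming the label and
// marking it as present (value: true).
const goldenImages = [
  'https://example.com/goldens/golden-1.jpg',
  'https://example.com/goldens/golden-2.jpg',
  'https://example.com/goldens/golden-3.jpg',
].map(url => ({
  url,
  concepts: [{ id: 'golden retriever', value: true }],
}));

// Upload the labeled examples (assumes `app` from index.js):
// app.inputs.create(goldenImages).then(console.log).catch(console.error);
```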
We’re using the same function we used to add images before. However, this time, instead of only sending images, we’re also sending an array of concepts (labels) for each image. Each concept contains an ‘id’ and a ‘value’. The id is the name of the concept (the way we want to refer to it), while the value is either true or false, indicating whether the concept is present in or absent from the image.
We are extending Clarifai’s General model and so need to create our own custom model, with its own unique id. You’ll need to tell your model to include the new inputs (images with labels) you’ve uploaded. Then, as we’ll see in the next step, we’ll trigger the training of this model by Clarifai’s servers so you can use the fully-trained result to perform visual searches on your custom concept.
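The model-creation snippet isn’t included; a sketch, with the call shape assumed from the legacy client’s `app.models.create`:

```javascript
// Create a custom model whose id matches the concept we labeled.
function createModel(app, modelName, conceptIds) {
  return app.models.create(modelName, conceptIds.map(id => ({ id })));
}

// Usage:
// createModel(app, 'golden retriever', ['golden retriever'])
//   .then(model => console.log('Created model', model))
//   .catch(err => console.error(err));
```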
The code above creates a model named ‘golden retriever’ for the concept ‘golden retriever’.
We’ve uploaded labeled images to our new model, but it hasn’t learned from them yet. To do this we need to trigger the training function, which happens quickly and efficiently on Clarifai’s servers so we don’t have to worry about it. After we run the training function, the model can provide similarities of new images to the images we’ve previously uploaded and tagged.
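The training snippet isn’t shown; a sketch, assuming the legacy client exposes `app.models.train`:

```javascript
// Kick off training on Clarifai's servers. The call returns right away;
// the training itself finishes asynchronously.
function trainModel(app, modelName) {
  return app.models.train(modelName);
}

// Usage:
// trainModel(app, 'golden retriever').then(console.log).catch(console.error);
```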
The code above trains the model we just created. When you train a model, you are telling the system to look at all the images with concepts you’ve provided and learn from them. Training the model is asynchronous, which means that even though we will get an API response right away, it doesn’t necessarily mean that the training is done. The training takes some time depending on the number of models and concepts that need to be trained.
Just like we searched earlier, we can search by the concept ‘golden retriever’ using this code:
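The snippet isn’t included above; a sketch, reusing the same `app.inputs.search` call shape assumed from the legacy client:

```javascript
// Searching with a custom concept uses the same call as before,
// now with our own concept name.
function searchByCustomConcept(app, conceptName) {
  return app.inputs.search({ concept: { name: conceptName } });
}

// Usage:
// searchByCustomConcept(app, 'golden retriever')
//   .then(console.log).catch(console.error);
```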
As before, we’ll get a list of images along with a confidence number:
The highest-confidence images contain golden retrievers, as we expect! There are other dog images as well, but with a much lower likelihood of being goldens. You can control results by setting a cut-off (e.g., requiring above 80% confidence) and by training with more labeled images.
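A confidence cut-off can be applied with a small helper of our own; the `{ url, score }` hit shape and the example data below are illustrative, so adapt them to the fields in the actual response.

```javascript
// Keep only search hits at or above a confidence threshold.
function filterByScore(hits, minScore = 0.8) {
  return hits.filter(hit => hit.score >= minScore);
}

// Hypothetical hits demonstrating the cut-off:
const exampleHits = [
  { url: 'https://example.com/goldens/golden-1.jpg', score: 0.97 },
  { url: 'https://example.com/dogs/pug.jpg', score: 0.22 },
];
// filterByScore(exampleHits) keeps only the first hit.
```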
Great work! You've just built your very own personal visual search app!
As you can see, the visual search is already functional when training on only three images. You can keep training the application to make sure edge conditions are satisfied and keep making it better and better.
Better yet, use what you’ve learned to move beyond dogs and train a new concept that has meaning in your life. Dig into the thorough Clarifai documentation to see all of their public visual models and other object-recognition features.