June 1, 2023

Bulk labeling the fast and easy way

Welcome to our short tutorial on using Clarifai's General Recognition model for fast bulk labeling, followed by a quick look at transfer learning. Together, we’ll annotate a dataset and train our own model at breakneck speed. Our training subjects will be an eclectic mix of animals: five different species, to be exact. This tutorial is based on our Transfer Learning video, but here we focus on the bulk labeling part of it. If you'd like to see this in action, give it a watch!

We begin by taking a look at our dataset. Using Clarifai's General Recognition model, we search the dataset for 'cats' and confirm that it contains none. This is simply a quick negative example: when a concept is absent from the dataset, the search correctly returns nothing.

 

Our search continues, and we decide to look for horses, confident that our dataset does contain them. Success! The Clarifai model identifies numerous images of horses. Here's the magic: we can select all these images at once and label them as 'horses'. These labels become part of our model's training data, so we don't need to label the images individually later. This significantly speeds up our workflow, taking care of the horses swiftly. Of course, we’ll come back later to make sure no horses have been overlooked.
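The search-then-label step above can be sketched in a few lines of plain Python. This is only an illustration of the idea, not Clarifai's API: the `predictions` dictionary, the image ids, the `min_score` threshold, and both helper functions are hypothetical stand-ins for what the platform does when you search and click "Label as...".

```python
from typing import Dict, List

# Hypothetical stand-in for model output: image id -> {concept: score}.
# In the tutorial, these scores come from the General Recognition model.
predictions: Dict[str, Dict[str, float]] = {
    "img_001": {"horse": 0.98, "grass": 0.91},
    "img_002": {"dog": 0.97, "puppy": 0.88},
    "img_003": {"horse": 0.93, "fence": 0.72},
    "img_004": {"butterfly": 0.99, "flower": 0.95},
}


def search_by_concept(preds: Dict[str, Dict[str, float]],
                      concept: str,
                      min_score: float = 0.9) -> List[str]:
    """Return ids of images where the concept scores at least min_score."""
    return sorted(
        image_id
        for image_id, scores in preds.items()
        if scores.get(concept, 0.0) >= min_score
    )


def bulk_label(labels: Dict[str, str], image_ids: List[str], label: str) -> None:
    """Assign the same label to every selected image in one go."""
    for image_id in image_ids:
        labels[image_id] = label


labels: Dict[str, str] = {}

# Negative example: a concept that is absent returns no matches.
cat_hits = search_by_concept(predictions, "cat")

# Positive example: select every horse match and label them together.
horse_hits = search_by_concept(predictions, "horse")
bulk_label(labels, horse_hits, "horses")
```

The point of the sketch is that one search plus one labeling action replaces a per-image decision, which is where the time savings come from.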

We repeat the process for dogs, relying on a quick visual scan to confirm that all identified images do indeed look canine. The process is quick and satisfying: all detected images are dogs! We then click the "Label as..." button at the bottom to bring up the labeling dialog.

Next on our list are the elephants. The dataset is a collection of 500 pictures, evenly divided into 100 images for each type of animal listed under the labels tab. Of the 100 elephants, the model identifies a commendable 97. Again, we label them as "elephant" using the "Label as..." button.

Butterflies are up next. This task proves to be relatively easy, given the striking colors and close-ups of these winged beauties. Despite the last image looking more like a piece of abstract art than a butterfly, we select all and add them to our labels. We're getting closer to completing our animal kingdom catalog.

Finally, we tackle the challenge of identifying hens. Some confusion arises here as we ponder whether to label these feathered friends as hens or chickens. A search for 'hens' reveals 95 matches, which we promptly mark as 'chickens' for consistency.
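As a quick back-of-the-envelope check, the counts reported so far tell us how much manual work is left per class. The sketch below uses only the numbers stated above (100 images per animal, 97 elephant matches, 95 hen matches) and assumes that every search match really is the animal we searched for.

```python
IMAGES_PER_CLASS = 100  # the dataset has 100 images per animal

# Search matches reported in the tutorial, keyed by the label we assigned.
search_hits = {"elephant": 97, "chickens": 95}

summary = {}
for label, hits in search_hits.items():
    coverage = hits / IMAGES_PER_CLASS      # share found by the search
    remaining = IMAGES_PER_CLASS - hits     # images left for manual review
    summary[label] = (coverage, remaining)
    print(f"{label}: {coverage:.0%} found by search, {remaining} left to find by hand")
```

Under that assumption, only 3 elephants and 5 chickens remain to be found in the manual pass.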

With our primary annotation completed, we turn our attention to the unlabeled images. A quick scan reveals a variety of peculiar images, from indecipherable objects to dogs shrouded in watermarks and appearing straight out of a horror movie. We also come across a stock photo of a horse, signaling our return to familiar territory.

The unlabeled images are meticulously scanned and the horses, elephants, butterflies, and chickens that were previously missed are added to our labels. From artistic and silhouette representations of elephants to an elusive butterfly amongst flowers, we’re able to recover and correctly label these images, ensuring our model’s learning experience is as comprehensive as possible.
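The clean-up pass described above amounts to taking everything that has no label yet, reviewing it by hand, and merging the results back in. Here is a minimal sketch of that bookkeeping; the image ids and the `manual_labels` decisions are made up for illustration.

```python
from collections import Counter

# Hypothetical ids and labels after the bulk-labeling pass.
all_image_ids = [f"img_{i:03d}" for i in range(1, 9)]
labels = {
    "img_001": "horses",
    "img_002": "dogs",
    "img_003": "elephant",
    "img_005": "butterflies",
    "img_006": "chickens",
}


def unlabeled_images(all_ids, labels):
    """Return the ids that have not received a label yet, in original order."""
    return [image_id for image_id in all_ids if image_id not in labels]


# Everything the searches missed lands in the review queue.
review_queue = unlabeled_images(all_image_ids, labels)

# Decisions made by a human while scanning the queue (illustrative only).
manual_labels = {"img_004": "horses", "img_007": "elephant"}
labels.update(manual_labels)

# Whatever is still unlabeled is truly unrecognizable or out of scope.
still_unlabeled = unlabeled_images(all_image_ids, labels)
label_counts = Counter(labels.values())
```

Checking `label_counts` after the pass is a cheap way to confirm that each class is close to the number of images you expect.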

The next stage in our process is the creation of a custom model. By using a transfer learning classifier, we're able to use an existing model and fine-tune it to suit our specific needs. 

We name our model 'animal-classifier' and select our training set, including all the animal categories we've identified. We set these categories to be mutually exclusive, since each picture in our dataset contains only one type of animal.
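To see what "mutually exclusive" means for the scores a classifier produces, here is a small sketch comparing two common output functions. With softmax, the classes compete and the probabilities sum to 1; with independent sigmoids, each class is scored on its own. This is a general illustration, not a description of how Clarifai implements the setting internally, and the raw scores (`logits`) are made-up numbers.

```python
import math


def softmax(logits):
    """Mutually exclusive classes: probabilities compete and sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(logits):
    """Independent classes: each score is a separate yes/no probability."""
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


concepts = ["horses", "dogs", "elephant", "butterflies", "chickens"]
logits = [2.0, 0.5, -1.0, 0.0, -0.5]  # illustrative raw scores for one image

exclusive = softmax(logits)
independent = sigmoid(logits)

for name, p_excl, p_ind in zip(concepts, exclusive, independent):
    print(f"{name:12s} exclusive={p_excl:.2f} independent={p_ind:.2f}")

top_concept = concepts[exclusive.index(max(exclusive))]
```

Since every image here shows exactly one animal, the competing (softmax-style) behavior is the natural fit: a high score for one animal should push the others down.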

 

We then train the model. Because transfer learning starts from an existing model rather than from scratch, training is far quicker: it completes in about 3 seconds.
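For readers curious about what transfer learning looks like under the hood, here is a minimal, self-contained sketch in plain Python. It keeps a "base model" frozen and trains only a small softmax classifier on top of its embeddings. Everything here is an assumption made for illustration: `fake_embedding` is a hypothetical stand-in that generates synthetic embeddings, the label names are examples, and the training loop is ordinary gradient descent, not Clarifai's actual training procedure.

```python
import math
import random

# Label names are illustrative; use whatever names you chose while labeling.
CLASSES = ["horses", "dogs", "elephant", "butterflies", "chickens"]
DIM = 8  # toy embedding size; real embeddings are much larger
rng = random.Random(0)


def fake_embedding(class_idx):
    """Hypothetical stand-in for the frozen base model's output for one image.

    Each class gets its own direction in embedding space, plus a little noise.
    """
    return [(1.0 if j == class_idx else 0.0) + rng.gauss(0, 0.1) for j in range(DIM)]


def make_set(per_class):
    """Build (embedding, class index) pairs with per_class examples per class."""
    return [(fake_embedding(k), k) for k in range(len(CLASSES)) for _ in range(per_class)]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_proba(W, b, x):
    logits = [sum(w * v for w, v in zip(row, x)) + bk for row, bk in zip(W, b)]
    return softmax(logits)


def train_head(data, epochs=100, lr=0.5):
    """Fit only the small linear head; the base model is never updated."""
    W = [[0.0] * DIM for _ in CLASSES]
    b = [0.0] * len(CLASSES)
    n = len(data)
    for _ in range(epochs):
        grad_W = [[0.0] * DIM for _ in CLASSES]
        grad_b = [0.0] * len(CLASSES)
        for x, y in data:
            probs = predict_proba(W, b, x)
            for k, p in enumerate(probs):
                err = p - (1.0 if k == y else 0.0)  # cross-entropy gradient for logit k
                grad_b[k] += err
                for j in range(DIM):
                    grad_W[k][j] += err * x[j]
        for k in range(len(CLASSES)):
            b[k] -= lr * grad_b[k] / n
            for j in range(DIM):
                W[k][j] -= lr * grad_W[k][j] / n
    return W, b


def predict_label(W, b, x):
    """Return the most likely class name and its probability."""
    probs = predict_proba(W, b, x)
    best = max(range(len(CLASSES)), key=lambda k: probs[k])
    return CLASSES[best], probs[best]


train_set = make_set(per_class=20)
test_set = make_set(per_class=10)  # 50 held-out examples
W, b = train_head(train_set)
accuracy = sum(predict_label(W, b, x)[0] == CLASSES[y] for x, y in test_set) / len(test_set)
print(f"held-out accuracy: {accuracy:.2f}")
```

In this sketch only the head's few dozen weights are updated, while the expensive feature extraction is reused as-is. That division of labor is the general reason transfer learning tends to train so much faster than starting from scratch.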

Once our model is trained, it’s time to test it. We upload a set of 50 new images into our test dataset. 

Using the 'Predict' feature, we allow our Animal Classifier to make predictions about the contents of the images. The model performs admirably, assigning high probability scores to the animals present in the images.

To sum up this tutorial, we’ve labeled 500 images in a matter of minutes, created a custom transfer learning classifier, and tested it on a set of 50 new images. We wrap up by viewing some adorable puppies, and, thanks to our new Animal Classifier, we can say with high certainty that they are indeed puppies.

So, there you have it! A quick and easy guide to using Clarifai's General Recognition model for bulk labeling and transfer learning. We hope you enjoyed this tutorial and look forward to hearing about your experiences with machine learning.