February 7, 2017

Use Clarifai’s Face Detection Model to Find Faces in Images

Clarifai’s new Face Detection model finds faces in images and returns the bounding box coordinates of each one. This model is useful for security camera footage, photo filter apps, dating apps, digital photography, and more. Here’s a copy + paste tutorial on how to use our Face Detection model and JavaScript to build a fun face photo filter!

Kevin Lewis Face Detection Model Developer

The brand new Face Detection model has recently been released in alpha and does exactly what you would expect – you provide an image with some lovely faces (or not so lovely faces) and it returns the position of any faces it has found. I immediately thought of Snapchat’s photo/video filters as an application and wondered what would happen if I combined Clarifai’s Face Detection model with the 💩 emoji, because I am a serious and mature adult adulting all over the place.

This project isn’t too complex at all, and if you’re happy to just read (commented) code, here is a link to the GitHub repository for this project.
How is this going to work?


We’re going to wait for an image to be selected, and then use the Clarifai JavaScript client to get the position of the faces it’s found. Finally, we’ll take those values, relate them directly to our image and overlay poo emojis on their unsuspecting faces.


Get set up with this project
Firstly, we’ll need a Clarifai account. You can get one here. Create a new application and take note of your Client ID and Client Secret – don’t share these with anyone else.

Next, rename keys.example.js to just keys.js and paste in your Client ID and Client Secret.
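
For reference, keys.js only needs to expose the two values that the Clarifai client is initialized with later on. Here is a minimal sketch, assuming the file defines the global variables cId and cSec that appear further down (the placeholder strings are hypothetical, and your copy of keys.example.js may look slightly different):

```javascript
// keys.js (sketch): placeholder credentials, replace with your own.
// The variable names are assumed to match the cId and cSec
// that get passed to Clarifai.App later in this tutorial.
var cId = "YOUR_CLIENT_ID";
var cSec = "YOUR_CLIENT_SECRET";
```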

Next, let’s set up our markup for this project. If you’re starting from scratch, create a file called index.html and make it look the same as mine. If you’re building this into an existing project, just make sure you have jQuery and the Clarifai JavaScript client included, and that the body contains the following markup:

<div class="input-group">
  <label for="image">Choose an image</label>
  <input type="file" id="image">
</div>
<img src="">
<canvas id="canvas"></canvas>

Let’s build this thing!
If you want to follow along with the finished JavaScript file, here it is.

Right, let’s do some housekeeping and create some initial variables which we will populate later.

var overlay = "💩";
var canvas, ctx;
var imageDetails = {
  clarifaiFaces: [],
  realFaces: []
};

The overlay variable contains the emoji which we wish to paste onto the faces we find in the image. The canvas and ctx variables will hold our HTML canvas element and its drawing context – which is how we’re going to draw the emojis on top of the image. Finally, the imageDetails object and its properties will contain information about our image and face positions later in the file.

Next up, we’re going to initialize a new application using the Clarifai JavaScript client.

var app = new Clarifai.App(cId, cSec);

Now let’s create the code which will be run when the file input has a file selected. I’ve split this application up into three functions, which run one after another. You should hopefully see the flow from one function to the next.

$("input#image").on("change", function() {
  if(this.files[0]) {
    var reader = new FileReader();
    reader.onload = function(e) {
      imageDetails.b64 = e.target.result;
      $("img").attr("src", imageDetails.b64);
      imageDetails.b64Clarifai = imageDetails.b64.replace(/^data:image\/(.*);base64,/, '');
      imageDetails.width = $("img").width();
      imageDetails.height = $("img").height();
      faceDetection(imageDetails.b64Clarifai);
    };
    // Read the selected file as a base-64 data URL, which triggers onload above
    reader.readAsDataURL(this.files[0]);
  }
});

This event listener waits for the file input to change. If a file has been selected, we do the following:

  1. Convert the image to a base-64 encoded string and insert the string into the <img> src, which displays it on the screen. This string is stored in imageDetails.
  2. Create a version of the base-64 encoded string without metadata, which is what Clarifai needs. This is also stored in imageDetails.
  3. Store the image’s width and height in imageDetails too, which we’ll use later to determine where the faces are in our image.
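
The second step is the easiest one to get wrong, so here it is in isolation. This is a minimal sketch using the same regular expression as the handler; the helper name stripDataUrlPrefix and the sample string are made up for illustration:

```javascript
// Remove the "data:image/<type>;base64," header that a data URL
// carries in front of the encoded bytes, leaving only the payload.
function stripDataUrlPrefix(dataUrl) {
  return dataUrl.replace(/^data:image\/(.*);base64,/, "");
}

// Hypothetical sample: a data URL with a short, made-up payload.
var sample = "data:image/png;base64,iVBORw0KGgo=";
var payload = stripDataUrlPrefix(sample);
console.log(payload); // prints "iVBORw0KGgo="
```

A string that does not start with that header is returned unchanged, so the helper is safe to call on an already-stripped value.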


This function finishes by firing off our faceDetection() function. Let’s take a look at how it’s put together.

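function faceDetection(b64Img) {
  app.models.predict("a403429f2ddf4b49b307e318f00e528b", {
    base64: b64Img
  }).then(
    function(res) {
      var data = res.outputs[0].data.regions || []; // regions may be absent when no faces are found
      for (var i = 0; i < data.length; i++) {
        imageDetails.clarifaiFaces.push(data[i].region_info.bounding_box);
      }
      drawBoxes();
    },
    function(err) {
      console.error(err);
    }
  );
}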

This function takes the Clarifai-ready base-64 encoded string and calls the face detection alpha model. It waits for a response and pushes each of the bounding boxes (the location information of each face) into the imageDetails.clarifaiFaces array.
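
To make the response handling concrete, here is a sketch that runs against a hand-written response object instead of a live API call. It assumes each entry in regions carries its box under region_info.bounding_box, with the top_row, left_col, bottom_row and right_col fields used later in this post; the helper name extractBoundingBoxes and the numbers in mockResponse are invented for illustration:

```javascript
// Collect the bounding box of every detected face from a predict response.
// Returns an empty array when the response contains no regions.
function extractBoundingBoxes(res) {
  var regions = res.outputs[0].data.regions || [];
  var boxes = [];
  for (var i = 0; i < regions.length; i++) {
    boxes.push(regions[i].region_info.bounding_box);
  }
  return boxes;
}

// Hypothetical response with a single face (values are made up).
var mockResponse = {
  outputs: [{
    data: {
      regions: [{
        region_info: {
          bounding_box: { top_row: 0.2, left_col: 0.3, bottom_row: 0.6, right_col: 0.7 }
        }
      }]
    }
  }]
};

var boxes = extractBoundingBoxes(mockResponse);
console.log(boxes.length);      // prints 1
console.log(boxes[0].left_col); // prints 0.3
```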

The important thing to note about these values is that they provide the top-left and bottom-right positions of each box with values ranging from 0 to 1. We will later have to relate these to the actual position on our image using the width and height properties we stored earlier.
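
As a quick worked example of that conversion, the sketch below scales one normalized box up to pixels. The helper name toPixelBox and the 400 x 200 image size are assumptions chosen for illustration, not part of the project:

```javascript
// Convert a 0-1 bounding box into a pixel box with a top-left
// corner (x, y) plus a width and height.
function toPixelBox(face, imageWidth, imageHeight) {
  var x = face.left_col * imageWidth;
  var y = face.top_row * imageHeight;
  return {
    x: x,
    y: y,
    w: face.right_col * imageWidth - x,
    h: face.bottom_row * imageHeight - y
  };
}

// Made-up face covering the right-hand middle of a 400 x 200 image.
var face = { top_row: 0.25, left_col: 0.5, bottom_row: 0.75, right_col: 0.75 };
var box = toPixelBox(face, 400, 200);
console.log(box); // { x: 200, y: 50, w: 100, h: 100 }
```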

But I’m getting ahead of myself. Let’s look at the final function, drawBoxes(), which is called at the end of faceDetection().

function drawBoxes() {
  canvas = document.getElementById("canvas");
  $(canvas).attr("width", imageDetails.width).attr("height", imageDetails.height);
  ctx = canvas.getContext("2d");
  ctx.textBaseline = "top";

  for(var i=0; i<imageDetails.clarifaiFaces.length; i++) {
    var face = imageDetails.clarifaiFaces[i];
    // Scale the 0-1 coordinates up to pixel positions on the image
    var box = {
      x: face.left_col * imageDetails.width,
      y: face.top_row * imageDetails.height,
      w: (face.right_col * imageDetails.width) - (face.left_col * imageDetails.width),
      h: (face.bottom_row * imageDetails.height) - (face.top_row * imageDetails.height)
    };
    imageDetails.realFaces.push(box);
    ctx.font = (box.w * 1.4) + "px monospace";
    ctx.fillText(overlay, box.x - (box.w / 5), box.y - (box.h/4));
  }
}

The first line of this function stores the canvas element in the canvas variable we set up at the start of this file. The second line makes the canvas the same size as the image, in both its width and its height. Next, we store the canvas context in the ctx variable – this is how we will interact with the canvas when drawing to it. Finally for the canvas setup, we set the baseline to the top, which is required to place the emoji properly.

Next, we iterate over the bounding boxes which we received from Clarifai and do some arithmetic which does the following for each box:

  1. Converts the top-left values from the 0 to 1 range into the correct pixel values on the image. For example, a box’s top-left position may be (0.124, 0.52), but that assumes the bottom-right of the image is (1, 1), so we scale it by the image’s width and height.
  2. Converts the bottom-right values into the width and height of the box (if a box starts at 1.2 and ends at 2.5, then its width is 1.3, for example).
  3. Stores the top-left, width and height values for each box in imageDetails.realFaces.
  4. Draws the emoji set at the top of the file at the position and size of the box.
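
The drawing step relies on a few hand-tuned numbers from drawBoxes(): the font size is 1.4 times the box width, and the emoji is nudged left by a fifth of the width and up by a quarter of the height. Here is a small sketch of that arithmetic on its own, with a hypothetical helper name (emojiPlacement) and made-up box values:

```javascript
// Work out the font and draw position for an emoji covering a pixel box.
// The 1.4, 1/5 and 1/4 factors are the ones used in drawBoxes().
function emojiPlacement(box) {
  return {
    font: (box.w * 1.4) + "px monospace",
    x: box.x - (box.w / 5),
    y: box.y - (box.h / 4)
  };
}

// Made-up pixel box: 100 x 100, with its top-left corner at (200, 50).
var placement = emojiPlacement({ x: 200, y: 50, w: 100, h: 100 });
console.log(placement.font); // prints "140px monospace"
console.log(placement.x);    // prints 180
console.log(placement.y);    // prints 25
```

Making the emoji a little larger than the box and shifting it up and to the left appears to be what keeps it covering the whole face; the factors are worth tweaking if you swap in a different emoji or font.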


So wait, how does this work again?

[Image: Face Detection Image Recognition Model]

There you have it – a poopifying app using face detection. Of course, you can be more generous with your friends and use hearts, unicorns, or other emojis to show you care – we’re not like that. Share your particular use case with @Clarifai for a chance to be featured on the blog!

