The Predict API returns an array of regions, one for each face detected. Each region contains the bounding box coordinates of the face as well as a data object containing a ‘vector’ and ‘num_dimensions’.
The returned ‘bounding_box’ values are the coordinates of the box outlining each face within the image. They are specified as float values between 0 and 1, relative to the image size; the top-left corner of the image is (0.0, 0.0), and the bottom-right corner is (1.0, 1.0). If the original image size is (500 width, 333 height), then the box above corresponds to the box with top-left corner at (208 x, 83 y) and bottom-right corner at (175 x, 139 y). Note that if the image is rescaled (by the same amount in x and y), the box coordinates remain the same. To convert back to pixel values, multiply by the image size: by the width for “left_col” and “right_col”, and by the height for “top_row” and “bottom_row”.
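The conversion back to pixel values can be sketched in a few lines of Python. This is a minimal illustration, not part of the API: the helper name `box_to_pixels` and the example box values are hypothetical, and the box is assumed to be available as a plain dictionary keyed by “top_row”, “left_col”, “bottom_row” and “right_col”.

```python
def box_to_pixels(box, width, height):
    """Convert a relative bounding box (values in [0, 1]) to pixel coordinates.

    Column values are scaled by the image width, row values by the image height.
    """
    return {
        "left": round(box["left_col"] * width),
        "top": round(box["top_row"] * height),
        "right": round(box["right_col"] * width),
        "bottom": round(box["bottom_row"] * height),
    }


# Hypothetical box values chosen for illustration only (not taken from a real response).
example_box = {"top_row": 0.25, "left_col": 0.4, "bottom_row": 0.42, "right_col": 0.6}

# Image size from the text: 500 pixels wide, 333 pixels high.
pixels = box_to_pixels(example_box, width=500, height=333)
print(pixels)
```

Because the coordinates are relative, the same box can be passed with a different width and height to get pixel positions on a rescaled copy of the image.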
The ‘vector’ is a numerical vector that represents the face detected in a 1024-dimensional space. The numerical values within the vectors are between 0 and 1, inclusive. The vectors of visually similar faces will be close to each other in the 1024-dimensional space. The ‘num_dimensions’ for this model is set at 1024.
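One way to use the closeness of vectors is to compute a distance between two of them. The sketch below uses Euclidean distance, which is an assumption made for illustration; the text above does not say which metric is intended. The function name `euclidean_distance` and the short 4-value vectors are hypothetical stand-ins for real 1024-dimensional vectors.

```python
import math


def euclidean_distance(a, b):
    """Return the Euclidean distance between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy 4-dimensional vectors for illustration only; real vectors have 1024 values.
face_a = [0.1, 0.9, 0.3, 0.5]
face_b = [0.1, 0.8, 0.3, 0.5]  # nearly identical to face_a
face_c = [0.9, 0.1, 0.7, 0.0]  # very different from face_a

dist_ab = euclidean_distance(face_a, face_b)
dist_ac = euclidean_distance(face_a, face_c)

# A smaller distance suggests the two faces are more visually similar.
print(dist_ab < dist_ac)
```

Any cutoff for deciding that two faces match would have to be chosen empirically for the application; none is given here.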
Gather valuable business insights from images, text and data using machine learning, natural language processing and computer vision.
Detect toxic, obscene, racist, or threatening language, or use your own custom moderation models.
Assign tags or categories to analyze text based on content. Build models for topic analysis, sentiment analysis, smart reply and more.
Identify unwanted content such as gore, drugs, explicit nudity or suggestive nudity.
Recognize over 11,000 different concepts including objects, themes, moods and more.
Create your own model and teach it with your own images and concepts.
Predict the age, gender or cultural appearances of faces.
Detect the location of faces with bounding boxes.
Analyze images and return probability scores on the likelihood that the media contains the face(s) of over 10,000 recognized celebrities.
Detect items of clothing or fashion-related items.
Analyze images and return numerical vectors that represent each detected face in the image in a 1024-dimensional space.
Recognize more than 1,000 food items in images down to the ingredient level.
Identify the dominant colors present in your images in hex or W3C form.
Recognize textures and patterns in a two-dimensional image, e.g., feathers, woodgrain, petrified wood and glacial ice, as well as overarching descriptive concepts (veined, metallic).
Identify different levels of nudity in your visual data. Ideal for moderating and filtering offensive content from your platform.
Recognize specific features of residential, hotel, and travel-related properties.
Recognize over 400 concepts related to weddings including bride, groom, flowers and more.
Analyze images and return numerical vectors that represent each detected face in the image in a 1024-dimensional space computed by our General model.
Explore our pre-built, ready-to-use image recognition models to suit your specific needs.