What is image intelligence?
Images are inherently data visualizations. Each digital picture is a big data set in itself, with each pixel a building block for some subset of information. An image taken by the camera of a modern smartphone contains more than 12 million pixels. If you’re looking to analyze 1,000 pictures (more than 12 billion pixels in total), you will need help. That’s where Computer Vision (CV) comes in. This field uses machine-learning models to extract information about objects, text, and subjective qualities from images. In all, image intelligence can generate meaningful data on spaces, users, and activities.
THE SIMPLEST PROBLEM WE COULD FIND
The new layout for Hub—Perkins+Will’s internal project and personnel database—required that we update the aspect ratio on all of its 2,500 employee headshots. All the pre-existing pictures had been cropped manually and without strict guidelines. How could we generate new headshots from the scattered original images, achieving consistency and uniformity of scale and position? Should we have someone manually re-crop them? Old-school. Crowd-source labor to update them, perhaps using Amazon’s Mechanical Turk marketplace? Too time-consuming. Or could we leverage machine intelligence to perform this task?
THE MACHINE INTELLIGENCE WAY
To crop a photo, a human must open an image, look at it, recognize a face, and resize the area around that face with a pleasing ratio of padding. The first, second, and fourth steps are instructions that can be given to a computer at the project’s onset. So how do we get a machine to recognize a face?
To solve our problem, we used Rekognition, an Amazon service that can analyze an image and detect faces and key facial features. (In the image above, the orange box is the facial bounding box returned by the service.) In most cases the bounding box Rekognition returned was adequate. The service also returns “landmark” features such as the eyes, nose, and mouth corners. These landmarks are much more consistent than the bounding box, and more useful for capturing angle/rotation and even identity (more on that later). Using them, we constructed our own crop box around the main facial features, visualized by the red box. We used the center of this box (the red dot) to calculate an even padding around the face, with a slight offset to allow for hairstyle where needed.
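The crop-box construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code we ran: it assumes a face record shaped like one entry of the `FaceDetails` list in a Rekognition DetectFaces response (landmarks as `Type`, `X`, `Y`, with coordinates given as fractions of the image size), and the function names, the choice of landmarks, and the `scale` and `hair_offset` values are all hypothetical.

```python
from typing import Dict, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom) in pixels

# Landmark type names follow the DetectFaces response; which ones to use
# for the crop box is an assumption made for this sketch.
CORE_LANDMARKS = {"eyeLeft", "eyeRight", "nose", "mouthLeft", "mouthRight"}


def landmark_box(face: Dict, img_w: int, img_h: int) -> Box:
    """Tight pixel box around the core landmarks (the 'red box')."""
    pts = [(lm["X"] * img_w, lm["Y"] * img_h)
           for lm in face["Landmarks"] if lm["Type"] in CORE_LANDMARKS]
    if not pts:
        raise ValueError("no usable landmarks in face record")
    xs, ys = zip(*pts)
    return (min(xs), min(ys), max(xs), max(ys))


def padded_crop(box: Box, img_w: int, img_h: int, aspect: float = 1.0,
                scale: float = 3.0, hair_offset: float = 0.15) -> Tuple[int, int, int, int]:
    """Even padding around the box center, shifted upward to leave room for hair.

    aspect is width / height; scale and hair_offset are illustrative values.
    """
    left, top, right, bottom = box
    cx, cy = (left + right) / 2, (top + bottom) / 2  # the 'red dot'
    crop_h = scale * max(right - left, bottom - top)
    crop_w = crop_h * aspect
    # Shrink the crop uniformly if it would be larger than the image.
    fit = min(1.0, img_w / crop_w, img_h / crop_h)
    crop_w, crop_h = crop_w * fit, crop_h * fit
    cy -= hair_offset * crop_h  # move the window up for the hairstyle
    # Clamp the window so it stays inside the image.
    x0 = min(max(cx - crop_w / 2, 0), img_w - crop_w)
    y0 = min(max(cy - crop_h / 2, 0), img_h - crop_h)
    return (round(x0), round(y0), round(x0 + crop_w), round(y0 + crop_h))


# Made-up landmark positions for a 1000 x 1000 pixel image.
sample_face = {"Landmarks": [
    {"Type": "eyeLeft", "X": 0.40, "Y": 0.40},
    {"Type": "eyeRight", "X": 0.60, "Y": 0.40},
    {"Type": "nose", "X": 0.50, "Y": 0.50},
    {"Type": "mouthLeft", "X": 0.42, "Y": 0.60},
    {"Type": "mouthRight", "X": 0.58, "Y": 0.60},
]}
red_box = landmark_box(sample_face, 1000, 1000)
print(padded_crop(red_box, 1000, 1000))
```

Sizing the crop as a multiple of the landmark box, rather than of the service's bounding box, is what keeps the scale of the face consistent from one headshot to the next.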
The final step was to integrate with the cloud for a truly seamless tool. At the outset we identified the steps a human would have to take to crop a photo, which included opening apps and issuing commands. Now that we had an application capable of handling the entire process automatically, why not trigger the process automatically as well? We integrated the tool directly into Box, our cloud content management system, allowing the entire operation to occur autonomously and behind the scenes. A user just needed to drag and drop an image into a folder; automatically, a new folder was created holding headshots in different sizes and aspect ratios. If an updated headshot was loaded, all headshots and avatars linked to that user (Hub, Skype, LinkedIn, etc.) would be formatted, replaced, and archived.
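To make the drag-and-drop step concrete, here is a minimal Python sketch of the planning an upload handler might do. It does not call Box at all: the upload trigger and the actual image writing are left out, and the variant names, pixel sizes, and file-naming scheme are hypothetical, chosen only to illustrate one upload fanning out into several outputs.

```python
from pathlib import PurePosixPath
from typing import Dict, List, Tuple

# Hypothetical output variants: name -> (width, height) in pixels.
VARIANTS: Dict[str, Tuple[int, int]] = {
    "hub": (400, 500),
    "avatar": (256, 256),
}

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def plan_outputs(upload: str,
                 variants: Dict[str, Tuple[int, int]] = VARIANTS
                 ) -> List[Tuple[str, Tuple[int, int]]]:
    """List the (destination path, size) pairs to generate for one upload."""
    src = PurePosixPath(upload)
    suffix = src.suffix.lower()
    if suffix not in IMAGE_SUFFIXES:
        return []  # ignore anything that is not an image
    out_dir = src.parent / src.stem  # new folder named after the uploaded file
    return [(str(out_dir / f"{src.stem}_{name}_{w}x{h}{suffix}"), (w, h))
            for name, (w, h) in variants.items()]


for path, size in plan_outputs("headshots/jane_doe.JPG"):
    print(path, size)
```

Keeping the planning separate from the cloud calls means the naming and sizing rules can be checked locally before anything is written to shared storage.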
HOW MUCH TIME DOES THIS SAVE?
This project is an excellent demonstration of how to leverage cutting-edge technology to improve our internal processes and free up time for our design experts to focus on more valuable efforts. Currently, headshots are manually cropped by a marketing professional in each of our 24 offices. It is a significant effort to manually collect and process each headshot. Even if we assume a conservative estimate of five minutes per image, automating the 2,500 headshots saves 24 over-qualified team members over 200 combined hours. Under the manual process, adding a new service or updating the headshots again in the future would require the same number of hours all over again. Our new method also ensures consistency, labeling accuracy, and correct placement under our file-storage protocols.
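The estimate is easy to check from the figures quoted in this article (2,500 headshots, five minutes each, 24 offices). The per-office number in this small Python calculation assumes the work is spread evenly across offices, which is a simplification made here and not something we measured.

```python
HEADSHOTS = 2_500        # employee headshots in Hub
MINUTES_PER_IMAGE = 5    # conservative manual cropping time
OFFICES = 24             # one marketing professional per office

total_hours = HEADSHOTS * MINUTES_PER_IMAGE / 60
# Assumes an even split across offices (a simplification).
hours_per_office = total_hours / OFFICES

print(f"{total_hours:.0f} hours in total, about {hours_per_office:.1f} per office")
```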
Stay tuned for Part 2 of our investigation into Image Intelligence.