November 21, 2023
Perfect cut: smart image resizing with imgproxy
Wrangling web images to fit different aspect ratios depending on screen size, device type, or design variant should not feel like fitting a square peg in a round hole. There are a few ways you can avoid cropping fiascos with imgproxy, from the built-in smart crop that automatically chooses the part of the image most likely to attract attention to more advanced AI-powered strategies for selecting faces or objects. All of that is done entirely on the fly, with no pre-processing required.
The main idea behind imgproxy is that you can store a single high-resolution image original and use it to generate all the variants you need on the fly. This is a great way to save storage space and avoid the hassle of pre-processing images for different screen sizes and aspect ratios. imgproxy is simply an HTTP server that accepts the location of the original image (HTTP, S3, local, etc.) and a set of processing options. You get the fully processed image back as a response. Check out our getting started guide to get more acquainted with the product.
In this guide, we will use some beautiful high-resolution source images from Unsplash and demonstrate how imgproxy makes cropping decisions when resizing them. The goal is to turn the image into a square of 1500x1500 pixels and achieve a perfect crop. We will rely on the resize:fill processing option, which resizes the image while keeping its aspect ratio until it fills the target dimensions and then crops the parts that stick out. The fill argument makes sure that the image is not stretched or distorted in the process. We will also rely on the gravity option to tell imgproxy which part of the image we want to keep in the frame.
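To make the URL structure concrete, here is a minimal sketch of how the processing options and the source location can be combined into a request path. The helper name imgproxy_path, the example.com source image, and the localhost:8080 host are assumptions made for illustration. The "unsafe" placeholder in the signature position is only accepted when URL signing is not configured on your imgproxy instance.

```python
import base64


def imgproxy_path(source_url: str, *options: str, signature: str = "unsafe") -> str:
    """Build an imgproxy path: /<signature>/<options...>/<Base64 source URL>."""
    # The source URL is passed as URL-safe Base64 without padding.
    encoded = base64.urlsafe_b64encode(source_url.encode()).rstrip(b"=").decode()
    return "/" + "/".join([signature, *options, encoded])


path = imgproxy_path(
    "https://example.com/photos/van.jpg",  # hypothetical source image
    "resize:fill:1500:1500",               # fill a 1500x1500 square
    "gravity:ce",                          # keep the center in the frame
)
print("http://localhost:8080" + path)      # hypothetical imgproxy host
```

Every example in the rest of this guide only changes the options in the middle of the path. The source image stays the same.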
Figure 0: Basic Gravity Types
imgproxy supports all the basic gravity types that you would expect from an image processing tool: by default, imgproxy uses gravity:ce to center the image in the frame. You can also use gravity:no to keep the top edge of the image in the frame, gravity:so to keep the bottom edge, and so on. You can find the full list of options in our docs. In this guide, we will skip the basics and jump straight to the advanced options.
Figure 1: Focus Point
For more precise control over the part of the image to keep, imgproxy has the gravity:fp option (g:fp for short). fp stands for focus point, and it takes two floating-point arguments between 0 and 1. The first argument is the X coordinate, and the second is the Y coordinate. For example, g:fp:0.5:0.5 represents the center of the image (50% width, 50% height), and g:fp:0:1 would be the bottom left corner. This allows you to be very precise when choosing the desired image part. Think of any client-side application that allows for image editing, like an avatar editor or a content management system, where the user can select the area for the crop. With imgproxy, you can implement this functionality without creating new images after each edit and storing them on the server: simply translate user input into the focus point coordinates and use them to crop the image on the fly.
imgproxy automatically tries to produce the image with the focus point as close to the center of the crop area as possible while respecting the bounds of the original image.
For the example above, the full processing options part of the URL will look like this:
resize:fill:1500:1500/gravity:fp:0.48:0.52
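As a rough illustration of the editor scenario, the sketch below converts a pixel position picked by the user into a gravity:fp option, and shows one way a crop window can be centered on the focus point while staying inside the image. Both function names and the 6000x4000 image size are assumptions. The clamping logic is a simplified model that works in original pixel coordinates, not imgproxy's actual implementation.

```python
def focus_point_option(x_px: float, y_px: float, width: int, height: int) -> str:
    """Turn a pixel position picked in an editor into a gravity:fp option."""
    # Normalize to the 0..1 range expected by gravity:fp and clamp stray values.
    fx = min(max(x_px / width, 0.0), 1.0)
    fy = min(max(y_px / height, 0.0), 1.0)
    return f"gravity:fp:{fx:.2f}:{fy:.2f}"


def crop_origin(width: int, height: int, crop_w: int, crop_h: int,
                fx: float, fy: float) -> tuple:
    """Top-left corner of a crop centered on the focus point, kept inside the image."""
    left = round(fx * width - crop_w / 2)
    top = round(fy * height - crop_h / 2)
    # Shift the window back inside the image if it would overflow an edge.
    left = min(max(left, 0), width - crop_w)
    top = min(max(top, 0), height - crop_h)
    return left, top


# A click at (2880, 2080) on a hypothetical 6000x4000 original:
print(focus_point_option(2880, 2080, 6000, 4000))     # gravity:fp:0.48:0.52
# A focus point on the right edge cannot be centered, so the window is clamped:
print(crop_origin(6000, 4000, 4000, 4000, 1.0, 0.5))  # (2000, 0)
```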
Figure 2: Default Smart Crop
Now that setting the focus point manually is out of the way, let's finally demonstrate the first smarter way to crop! Turning it on is super easy: pick the same gravity option, but this time, instead of providing the fp argument and the coordinates for the center of the crop, you can just use gravity:sm (or g:sm for short).
This will tell imgproxy to use the default smart crop algorithm of the underlying libvips image processing library. It is based on the approach popularized by smartcrop.js. The algorithm uses a series of heuristics to find the most "interesting" part of an image, based on the detected edges, regions of high saturation, and skin tones. Here, the default smart crop correctly identifies that the most interesting part of the image is the yellow vintage campervan, so that is what goes into our target 1500x1500 square:
All the processing options we need to supply are:
resize:fill:1500:1500/gravity:sm
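To give a feel for how a heuristic like this can pick a crop, here is a deliberately tiny toy model. It scores pixels by saturation alone and slides a window over the image to find the highest total score. This is not the libvips or smartcrop.js algorithm, which also weigh edges and skin tones; the function names and the 6x4 test image are made up for illustration.

```python
import colorsys


def saturation_map(pixels):
    """Score each pixel by its HSV saturation (just one of the cues a real smart crop mixes)."""
    return [[colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[1] for r, g, b in row]
            for row in pixels]


def best_window(scores, win_w, win_h):
    """Return (left, top) of the win_w x win_h window with the highest total score."""
    height, width = len(scores), len(scores[0])
    best_total, best_pos = -1.0, (0, 0)
    for top in range(height - win_h + 1):
        for left in range(width - win_w + 1):
            total = sum(sum(row[left:left + win_w])
                        for row in scores[top:top + win_h])
            if total > best_total:
                best_total, best_pos = total, (left, top)
    return best_pos


# Hypothetical 6x4 gray image with a saturated yellow 2x2 patch near the right edge.
gray, yellow = (128, 128, 128), (255, 210, 0)
image = [[gray] * 6 for _ in range(4)]
for y in (1, 2):
    for x in (4, 5):
        image[y][x] = yellow

# The 4x4 window slides right until it contains the whole yellow patch.
print(best_window(saturation_map(image), 4, 4))  # (2, 0)
```

The same toy also hints at the weakness discussed in the next section: a saturated background would pull the window away from the subject just as easily.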
Figure 3: Advanced Smart Crop (imgproxy Pro)
The default smart crop algorithm is simple and reliable, but it is not without its faults: it can be fooled by images with a lot of saturation in the background. Plus, it downsizes the image into a tiny thumbnail to speed up the processing, which can interfere with edge detection and emphasize details that are not so prominent in the original.
To demonstrate:
The background of the image in question is full of colorful detail, so the algorithm gets distracted with street decorations and fails to detect that we probably want to focus on the person in the foreground.
Luckily, our business-oriented imgproxy Pro distribution has a setting that can help with that.
Setting IMGPROXY_SMART_CROP_ADVANCED=true in the configuration environment variables replaces the default smart crop with our own bespoke algorithm that relies on a more sophisticated mathematical approach to build the feature map for the image. The method is similar to the well-researched Harris corner detection, and it is able to detect the most attractive part of the image more reliably.
Here's a rare behind-the-scenes look at what the feature map actually looks like from the perspective of the algorithm. The crop is centered around the cluster of the most interesting points, and it ends up being just what a human would expect when looking at the picture.
There is nothing to change in your processing options. Setting IMGPROXY_SMART_CROP_ADVANCED in the environment is enough to swap the cropping algorithm behind the smart gravity option; the URL stays the same:
resize:fill:1500:1500/gravity:sm
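For readers curious about what a corner-based feature map means in practice, below is a textbook Harris corner response written in plain Python. It only illustrates the general family of techniques mentioned above and is not the imgproxy Pro algorithm. The 3x3 window, the k=0.05 constant, and the 12x12 test image are arbitrary choices for this sketch.

```python
def harris_response(img, k=0.05):
    """Harris corner response for a grayscale image given as a list of rows."""
    h, w = len(img), len(img[0])
    # Central-difference gradients, left at zero on the one-pixel border.
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2
    resp = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Structure tensor entries summed over a 3x3 neighborhood.
            sxx = sxy = syy = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    gx, gy = ix[y + dy][x + dx], iy[y + dy][x + dx]
                    sxx += gx * gx
                    sxy += gx * gy
                    syy += gy * gy
            # Large positive at corners, negative along edges, zero on flat areas.
            resp[y][x] = sxx * syy - sxy * sxy - k * (sxx + syy) ** 2
    return resp


# Hypothetical 12x12 image: dark background with a bright square in one quadrant.
img = [[1.0 if x >= 6 and y >= 6 else 0.0 for x in range(12)] for y in range(12)]
resp = harris_response(img)
corner = max((resp[y][x], (x, y)) for y in range(12) for x in range(12))[1]
print(corner)  # (6, 6): the strongest response sits on the corner of the square
```

Points along the straight sides of the square get a negative score, which is why a map like this favors textured, detailed regions over plain outlines.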
Figure 4: Advanced Smart Crop with Face Detection (imgproxy Pro)
Advanced Smart Crop determines the best part mathematically: it is aware of the structure of the image but not of its human-perceptible content. While it's great at separating the subject from the background, it does not "see" human faces in the shot, which might lead to unexpected results.
From a mathematical standpoint, human faces contain far fewer features than the striped shirt on the person in the center, so the heads of the companions are cropped out of the frame.
To mitigate that, you can enable the IMGPROXY_SMART_CROP_FACE_DETECTION configuration option to force the algorithm to prefer faces over other objects. In this case, the algorithm will add an extra step that relies on the Haar Cascade classifier to detect faces and keep them in the frame. If no faces are detected, the algorithm will fall back to the default behavior. Let's see the cascade classifier in action:
Both heuristics to determine the crop are technically valid: the choice will ultimately depend on your use case. If the product you are showing is a colorful shirt, then IMGPROXY_SMART_CROP_FACE_DETECTION does not have to be enabled. If the end result is a social network post that tags people, then enabling face detection is the obvious way to go.
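The "prefer faces, otherwise fall back" behavior can be pictured with a small sketch. Here the face boxes are assumed to come from some detector, and the crop is centered on the area covering all of them. The Box type, the crop_focus helper, and the sample coordinates are hypothetical and only model the decision, not imgproxy Pro's internals.

```python
from typing import Sequence, Tuple

Box = Tuple[int, int, int, int]  # left, top, right, bottom in pixels


def crop_focus(faces: Sequence[Box], fallback: Tuple[float, float],
               width: int, height: int) -> Tuple[float, float]:
    """Relative focus point: center of all detected faces, else the fallback."""
    if not faces:
        # No faces found: keep whatever the feature-based smart crop suggested.
        return fallback
    left = min(box[0] for box in faces)
    top = min(box[1] for box in faces)
    right = max(box[2] for box in faces)
    bottom = max(box[3] for box in faces)
    return ((left + right) / 2 / width, (top + bottom) / 2 / height)


# Two hypothetical faces in a 1000x1000 image pull the focus toward the top.
faces = [(100, 100, 300, 300), (700, 200, 900, 400)]
print(crop_focus(faces, (0.5, 0.6), 1000, 1000))  # (0.5, 0.25)
# With no detections, the fallback focus point is returned unchanged.
print(crop_focus([], (0.5, 0.6), 1000, 1000))     # (0.5, 0.6)
```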
Figure 5: AI Object Detection (imgproxy Pro)
Cascade classifiers as a machine learning technique have been around for a while, and they do a good job at simple object detection while being faster and using fewer resources than more advanced AI models. As usual, there are trade-offs: the accuracy is not always perfect, and cascade classifiers can have a hard time detecting a face if the subject is shot from an unusual angle or if the features are obscured by shadows or fashion accessories.
In our final example, the model wears large sunglasses and does not face the camera straight on. This is a tough job for a cascade classifier, but the AI is able to detect the face with ease and crop the image accordingly:
If you are an imgproxy Pro customer, you can simply pull a private Docker image we provide that already includes the Darknet YOLO model pre-trained to detect faces. After this, achieving the result above is as easy as setting your processing options to:
resize:fill:1500:1500/gravity:obj:face
For the most advanced use cases, you can even train your own YOLOv4 model (see Darknet's official guide) to detect any objects you want and then use it with imgproxy Pro to crop images based on your custom criteria. You can find all the related configuration options in our docs.
We've now covered everything that imgproxy and imgproxy Pro have to offer to smarten up your image transformations. Hopefully, this demonstrates that you can get SaaS-level features without signing up for an external service and start building self-hosted, performant, and feature-rich image processing pipelines with imgproxy. Using our open source version is and forever will be free; imgproxy Pro, which packs more features and offers priority support, has flexible and predictable pricing with plans starting at $499/year.
We hope you've found this guide useful and that it will help you make the most of your images. If you have any questions or suggestions, or need help with your imgproxy or imgproxy Pro deployment, drop us a line!