imgproxy v4: Internal Cache and changes to conditional request behavior
This is the first part of a series of blog posts about the new features in imgproxy v4. In this post, we will focus on caching improvements, including the introduction of an internal cache and changes to conditional request behavior.
imgproxy v4 announcements:
- Internal Cache and changes to conditional request behavior
- Parallel image downloading
- Better SVG minification, RAW formats support, and colorspace preservation
Internal Cache (Pro)
This is definitely the most anticipated and the most requested feature in imgproxy. For a long time, we’ve been encouraging users to rely entirely on an external cache, such as a CDN or a caching reverse proxy. Putting a CDN in front of imgproxy is and will always be a best practice, and it solves the caching problem for most users. However, we learned that some cases require an additional, long-term caching layer.
imgproxy Pro v4 introduces an internal cache that stores the processed images on disk or in cloud storage. This cache can act as both a primary cache and a secondary cache that serves as a fallback when your CDN cache misses.
imgproxy’s internal cache has the following advantages:
- Long-term storage: While CDNs usually delete rarely accessed images after a certain period, the internal cache is persistent.
- One storage to rule them all: Most CDNs store cached images in multiple edge stores. This means that even if an image is cached on one edge, a request to another edge might result in a cache miss. imgproxy’s internal cache, on the other hand, stores images in a single location, eliminating this issue.
- Knows how imgproxy works: External caches are generic and know nothing about imgproxy’s behavior. For example, they need some tweaks and tricks to work properly with imgproxy’s modern image format detection or client hints support. imgproxy’s internal cache, on the other hand, is designed to work seamlessly with all of imgproxy’s features.
- Protected by the same security measures: The internal cache is protected by the same security measures as the rest of imgproxy, such as URL signatures, processing options restrictions, etc.
- URL signature is not a part of the cache key: This means that even if you use multiple URL signature key/salt pairs, or if you rotate your keys, the cache will still work without any issues. And you will still be protected, as the URL signature is checked before the cache is accessed.
- It’s your cache: You can choose where to store it, how to manage it, and how to integrate it with your existing infrastructure.
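To illustrate the point about URL signatures, here is a minimal sketch of a lookup flow in which the signature is verified first and then left out of the cache key. It is not imgproxy's actual implementation: the key/salt pairs, the in-memory dictionary, and the helper names (split_url, cache_key, handle) are hypothetical, and the cache key format is an assumption. The signing scheme follows imgproxy's documented approach (HMAC-SHA256 over salt plus path, URL-safe Base64).

```python
import base64
import hashlib
import hmac

# Hypothetical key/salt pairs; a real deployment configures its own.
KEY_PAIRS = [(b"key-one", b"salt-one"), (b"key-two", b"salt-two")]

# Stand-in for disk or cloud storage.
cache = {}


def sign(path: str, key: bytes, salt: bytes) -> str:
    """HMAC-SHA256 over salt + path, URL-safe Base64 without padding."""
    digest = hmac.new(key, salt + path.encode(), hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode()


def split_url(url_path: str):
    """Split '/<signature>/<rest>' into (signature, '/<rest>')."""
    signature, _, rest = url_path.lstrip("/").partition("/")
    return signature, "/" + rest


def is_valid(signature: str, path: str) -> bool:
    """Accept the request if any configured key/salt pair matches."""
    return any(
        hmac.compare_digest(signature, sign(path, key, salt))
        for key, salt in KEY_PAIRS
    )


def cache_key(path: str) -> str:
    """Assumed key format: a hash of the unsigned path only."""
    return hashlib.sha256(path.encode()).hexdigest()


def handle(url_path: str) -> str:
    signature, path = split_url(url_path)
    if not is_valid(signature, path):
        return "forbidden"  # rejected before the cache is touched
    key = cache_key(path)
    if key in cache:
        return "hit"
    cache[key] = b"processed image bytes"  # stand-in for real processing
    return "miss"
```

In this sketch, the same processing path signed with two different key/salt pairs resolves to one cache entry, so rotating keys does not cause misses, while an unsigned or wrongly signed request never reaches the cache.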
The internal cache supports all the storage backends that imgproxy can read source images from:
- Local filesystem
- Amazon S3 and compatible services (e.g. Cloudflare R2, DigitalOcean Spaces, etc.)
- Google Cloud Storage
- Microsoft Azure Blob Storage
- OpenStack Swift
If you have an idea for a new storage backend that you would like to see supported, please let us know!
Limitations
The internal cache is a new feature, and it has some limitations that we are aware of and are working on improving in future releases:
- No cache invalidation: Currently, imgproxy doesn’t provide any means to invalidate the cache. However, imgproxy includes the cachebuster in the cache key, so you can use it to force cache invalidation when needed. Also, most storage offerings support object expiration, so you can set a reasonable expiration time for your cached images.
- No cache for info requests: The internal cache is currently only used for image processing requests; requests to the /info endpoint are not cached. Info requests are usually pretty lightweight and executed quite rarely, so we don’t see a strong need to cache them. However, we are considering adding caching for info requests in future releases, so if you think this would be useful to you, please let us know!
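As a rough illustration of the cachebuster workaround, the sketch below builds the unsigned part of an imgproxy URL with the cachebuster processing option and shows that changing its value produces a different path, and therefore a different cache entry. The helper names and the hash-based key are assumptions made for the example; the actual cache key format is not described in this post.

```python
import hashlib
from typing import List, Optional


def processing_path(
    source_url: str, options: List[str], cachebuster: Optional[str] = None
) -> str:
    """Build the unsigned part of an imgproxy URL with a plain source URL."""
    parts = list(options)
    if cachebuster is not None:
        # cachebuster is an imgproxy processing option; its value is arbitrary.
        parts.append(f"cachebuster:{cachebuster}")
    return "/" + "/".join(parts) + "/plain/" + source_url


def illustrative_cache_key(path: str) -> str:
    """Hypothetical stand-in for the real cache key derivation."""
    return hashlib.sha256(path.encode()).hexdigest()


old_path = processing_path("https://example.com/cat.jpg", ["rs:fit:300:300"], "v1")
new_path = processing_path("https://example.com/cat.jpg", ["rs:fit:300:300"], "v2")
```

Bumping the value from v1 to v2 leaves the old entry in storage, so pairing this with an object expiration rule on the storage side keeps stale entries from accumulating.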
Etag and Last-Modified behavior changes
imgproxy has supported Etag and Last-Modified HTTP headers for a long time, and we have learned a lot about how they are used in the wild. Based on this experience, we decided to make some changes to their behavior in imgproxy v4 to optimize it.
Etag
In imgproxy v3, the Etag header was generated based on two components:
- The processing options hash.
- The Etag header received from the image source, or the image file hash if the Etag header was not present.
After some consideration, we decided to change the Etag generation algorithm in imgproxy v4:
- The processing options are no longer a part of the Etag generation. Since the processing options are a part of the URL, including them in the Etag generation seems redundant.
- imgproxy no longer uses the image file hash as a fallback when the Etag header is not present. Instead, it relies entirely on the Etag header received from the image source. Most HTTP servers generate ETags by default or can be easily configured to do so, and all the storage backends that imgproxy supports provide Etag headers. So there’s no need to generate ETags from the image file’s hash, which is a costly operation.
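The difference between the two algorithms can be sketched roughly as follows. This is an illustration under assumptions, not imgproxy's code: the function names, the SHA-256 hashing, and the way the components are combined are hypothetical, and whether v4 forwards the source value verbatim or transforms it is not stated here.

```python
import hashlib
from typing import Optional


def etag_v3_style(
    options_hash: str, source_etag: Optional[str], image_bytes: bytes
) -> str:
    """v3-style: combine the options hash with the source ETag or a file hash."""
    if source_etag:
        source_part = source_etag
    else:
        # Costly fallback: the whole image has to be hashed.
        source_part = hashlib.sha256(image_bytes).hexdigest()
    combined = hashlib.sha256(f"{options_hash}/{source_part}".encode()).hexdigest()
    return f'"{combined}"'


def etag_v4_style(source_etag: Optional[str]) -> Optional[str]:
    """v4-style: depend only on the source ETag; no source ETag, no ETag."""
    return source_etag or None
```

The v4-style function never touches the image bytes, which is where the saving described above comes from.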
Last-Modified
The behavior of the Last-Modified and If-Modified-Since headers hasn’t changed much, but we added a way to tune it.
imgproxy v3 had the IMGPROXY_ETAG_BUSTER config option, which let you “invalidate” Etags by adding a cachebuster to them. This may be useful if you changed imgproxy’s config and want to prevent it from responding with the 304 Not Modified status code to requests with old Etags. However, there was no way to do the same for the Last-Modified header.
In imgproxy v4, we added the IMGPROXY_LAST_MODIFIED_BUSTER config option. By setting it to a specific datetime, you can make imgproxy treat all images as if they were modified no earlier than that datetime. This means that:
- If the Last-Modified header received from the image source is older than the IMGPROXY_LAST_MODIFIED_BUSTER value, imgproxy will use the IMGPROXY_LAST_MODIFIED_BUSTER value as the Last-Modified header in the response.
- If the If-Modified-Since header in the request is older than the IMGPROXY_LAST_MODIFIED_BUSTER value, imgproxy will not pass it to the image source and will not return the 304 Not Modified status code based on it.
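The two rules above can be sketched as a pair of small functions. This is a minimal illustration, assuming the buster has already been parsed into a timezone-aware UTC datetime; the function names are hypothetical, and the format of the config value itself is not covered here.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def effective_last_modified(source_last_modified: str, buster: datetime) -> str:
    """Return the Last-Modified value to send: the later of source and buster."""
    source_dt = parsedate_to_datetime(source_last_modified)
    return format_datetime(max(source_dt, buster), usegmt=True)


def forwardable_if_modified_since(
    if_modified_since: str, buster: datetime
) -> Optional[str]:
    """Return the header to pass to the source, or None if it predates the buster."""
    ims_dt = parsedate_to_datetime(if_modified_since)
    if ims_dt < buster:
        return None  # ignored, so no 304 can be based on it
    return if_modified_since


# Example buster value (assumed already parsed from the config).
buster = datetime(2025, 6, 1, tzinfo=timezone.utc)
```

With this buster, a source image last modified in January 2025 is reported as modified on 1 June 2025, and a client that cached it before that date gets a full response instead of a 304.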
Etag and Last-Modified are enabled by default
In imgproxy v3, the Etag and Last-Modified headers were disabled by default, and you had to enable them explicitly by setting the IMGPROXY_USE_ETAG and IMGPROXY_USE_LAST_MODIFIED config options to true. After changing their behavior in imgproxy v4, which virtually eliminates the downsides of using them, we decided to enable them by default. However, you can still disable them if you want to by setting the corresponding config options to false.
Caching is a crucial aspect of any image processing service, and we are excited to bring these improvements to imgproxy v4. The internal cache provides a powerful new tool for users who need long-term storage of processed images, while the changes to Etag and Last-Modified behavior optimize client and CDN cache revalidation.
More announcements are on the way, so stay tuned! And if you want to try imgproxy Pro v4 and get your hands on these juicy new features, just apply to our Early Access program!