How to Spring Clean Your Heroku Slug Size

April 20, 2020

This week I was deploying a routine update to Heroku, but the app wouldn’t deploy. Here is the message that was displayed in the build log:

-----> Discovering process types
       Procfile declares types     -> release, sidekiq, web
       Default types for buildpack -> console, rake
-----> Compressing...
 !     Compiled slug size: 502.5M is too large (max is 500M).
 !     See: http://devcenter.heroku.com/articles/slug-size
 !     Push failed

Slug size is too large was the culprit. 502 MB! The app is decently sized but that seems way off to me. According to rails stats the app is about 58,243 lines of code. Not small, but certainly not huge. And these are mostly source code plain text files too so file size should not be an issue at this scale.

Investigating the Sizing

I used a basic du -sh * command to check the basic file sizes of the folders in the app’s repository to see what’s going on. Here’s the truncated output:

12M    app
352K    config
1.3M    db
376K    lib
332K    public
260M    node_modules
 78M    tmp
226M    vendor

So my app is relatively small, only about 17 MB. The rest of the file size comes from the various dependencies. By some standards, I don’t really even think this app has very many dependencies. There are a few dozen Ruby gems in the Gemfile and there’s nothing out of the ordinary here. Rails, Postgres, Nokogiri, etc. All things you would expect. (Although side note, the google-api-client gem is 61 MB on its own. Yikes, by far the largest gem in there. That’s something to work on for another day.) All of the gems are in vendor/ so the file size there checks out as reasonable.

The tmp directory is pretty big here too, but that’s ok. It is mostly filled with bootsnap caching files to make the app load quicker too. This would be an easy 78MB to clear out if it was the culprit, since I don’t really care how long the app takes to boot, but let’s reserve clearing it for later.

Even that massive node_modules directory isn’t really problem, since Heroku slugs are compressed before this slug size is calculated. Those dependencies compressed down are only about 80MB.

So there’s something up here. Even with my most conservative calculation the slug size should be max of around 300MB after compression. Which is still too big in my opinion for an app of this size, but still nowhere near the limit of 500 MB.

There is a nice Heroku plugin that allows you to download slugs from your builds. It’s a very handy way to download the exact slug from Heroku’s filestore on S3. I downloaded the slug in question to give it a look.

Using the same du command within the extracted slug files, there was one major diference:

961M    public

Whoa. 961MB in the downloaded and unzipped slug, but only 332K in my project’s repo itself. It turns out that inside the public/assets directory there were hundreds of asset build files dating back years. The app itself is a few years old and it looks like Heroku had stored the compiled assets in the slug for each release since the app was created. Yikes.

This is a known feature on Heroku and it is called the build cache. I was mistakenly under the impression that each app deploy would cause a new slug to be built, including the asset compilation. The build cache feature makes sense: why waste time re-downloading files and other dependencies on each build when they can be cached? But caching the output of asset creation until the end of time? That seems excessive. So let’s clean it up.

The Fix

It turns out there’s an easy fix when build cache is your problem. There’s another Heroku plugin for managing the repo that contains your app’s code. Within this plugin is the purge-cache utility, which clears the build cache.

Here’s how it works:

heroku plugins:install heroku-repo
heroku repo:purge_cache -a appname

I redeployed the app and that shaved off a few hundred MB from the slug size. We’re back in business here. But I still couldn’t get past that massive node_modules/ directory. That was 260MB of modules only used for compiling assets. Once the assets were compiled, those modules are no longer used. (This particular app does not use any node_modules at run time, they are all precompiled with webpack.)

There is a community-developed buildpack for Heroku called post-build-clean that gives us the ability to remove files from the slug after the build command finishes running. This is slightly different than the standard .slugignore file which won’t even allow you to use those files during the build phase. I installed the buildpack, added node_modules to the .slug-post-clean file and tried to deploy again.

This time around, it looks much more reasonable to me:

-----> POST BUILD CLEAN app detected
Removing directory node_modules/ from slug
-----> Discovering process types
       Procfile declares types     -> release, sidekiq, web
       Default types for buildpack -> console, rake
-----> Compressing...
       Done: 87.4M
-----> Launching...
 !     Release command declared: this new release will not be available until the command succeeds.
       Released v1019

For a medium-sized app 87 MB seems like a good size to me. The asset files continue to build up after the compilations step, but it’s going to be a while until I need to clear the cache again.