Users of the Rust programming language interact with the project’s infrastructure in several ways. They access the project’s website and documentation, query the crates index, and download Rust releases and crates. These resources are hosted by the Rust project and served through a Content Delivery Network (CDN).
This document outlines why we use CDNs, for what, and how we have set them up.
We have three goals for our use of CDNs in our infrastructure:
- Reduce costs of outbound traffic through cheaper pricing and caching
- Reduce load on origin servers to save compute resources
- Provide a way to rewrite legacy URLs for some resources
As an open source project, we have to be very mindful of our infrastructure costs. Outbound traffic is one of the most expensive items on our monthly bills, and one that will continue to grow as Rust becomes more popular.
Cloud providers typically charge different rates for outbound traffic based on the service. For example, serving data straight from Amazon S3 is more expensive than serving the same data through an Amazon CloudFront distribution. This is why we now use a CDN by default, even for services that can’t make use of other features of a CDN such as caching.
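To make the pricing argument concrete, here is a minimal sketch of the arithmetic. The per-GB prices and the traffic volume are hypothetical placeholders chosen for illustration, not actual AWS rates or figures from our bills, and the function names are made up for this example.

```python
def egress_cost(traffic_gb: float, price_per_gb: float) -> float:
    """Cost of serving `traffic_gb` of outbound traffic at a flat per-GB price."""
    return traffic_gb * price_per_gb


def cdn_savings(traffic_gb: float, direct_price: float, cdn_price: float) -> float:
    """Difference between serving directly from storage and serving through a CDN."""
    return egress_cost(traffic_gb, direct_price) - egress_cost(traffic_gb, cdn_price)


if __name__ == "__main__":
    # Hypothetical numbers: 1000 GB per month, placeholder prices per GB.
    print(cdn_savings(traffic_gb=1000, direct_price=0.09, cdn_price=0.06))
```

Even without any caching, a lower per-GB rate on the CDN side reduces the bill in proportion to the traffic volume, which is why the CDN is the default.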
Most of the project’s resources are hosted on AWS. Static content is stored in Amazon S3, while dynamic content is loaded from a server. Both types of content are served through Amazon CloudFront, the Content Delivery Network of AWS.
When a user accesses a resource, for example to download a crate, the request goes through the CDN. Each distribution maps a domain name to a configuration and a backend (called the origin). For example, downloading a crate from static.crates.io goes through a distribution that fetches the crate from an S3 bucket and then caches it for future requests.
                             ┌──► S3 (static content)
                             │
User ───────► CloudFront ────┤
                             │
                             └──► Server (dynamic content)
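The relationship between domain names, distributions, and origins can be sketched as a small in-memory model. This is only an illustration of the idea, not how CloudFront is implemented or configured: the `Distribution` class, the `serve` helper, and the example domains and origins are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Distribution:
    domain: str              # domain name the distribution answers for
    origin: str              # backend, e.g. an S3 bucket or a server
    cacheable: bool = True   # dynamic content may bypass the cache
    cache: Dict[str, bytes] = field(default_factory=dict)

    def get(self, path: str, fetch: Callable[[str, str], bytes]) -> bytes:
        """Serve `path`, contacting the origin only on a cache miss."""
        if self.cacheable and path in self.cache:
            return self.cache[path]
        body = fetch(self.origin, path)
        if self.cacheable:
            self.cache[path] = body
        return body


# Hypothetical distributions; real ones live in the infrastructure configuration.
DISTRIBUTIONS = {
    d.domain: d
    for d in (
        Distribution("static.example.org", "s3://example-bucket"),
        Distribution("app.example.org", "https://origin.example.org", cacheable=False),
    )
}


def serve(domain: str, path: str, fetch: Callable[[str, str], bytes]) -> bytes:
    """Look up the distribution for `domain` and serve `path` through it."""
    return DISTRIBUTIONS[domain].get(path, fetch)
```

In this model, a second request for the same path on a cacheable distribution never reaches the origin, which is the effect that reduces both outbound traffic from the origin and load on it.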
There are many distributions, all of which are configured in the rust-lang/simpleinfra repository. However, their usage is very unevenly distributed. The following distributions are the most important ones for the project, both in terms of traffic and criticality for the ecosystem.
Whenever a user installs or updates Rust, pre-compiled binaries are downloaded from static.rust-lang.org. The same is true when Rust is installed in a CI/CD pipeline, which is why this distribution has by far the highest traffic volume.
Rust binaries are static and are stored in Amazon S3, from where they are served by the CloudFront distribution.
The distribution for static.rust-lang.org has a custom router that runs in an
AWS Lambda function. The router provides a way to list files for a release and
rewrites the legacy URL for
The cache for Rust releases is invalidated nightly.
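As a rough illustration of what a URL-rewriting router for the releases distribution can look like, the sketch below rewrites a request path before it is forwarded to the origin. It assumes the function is invoked with a CloudFront Lambda@Edge-style request event, which is an assumption and not a description of the actual router. The rewrite table and its paths are hypothetical placeholders, and the file-listing feature is not modelled.

```python
# Hypothetical mapping from legacy paths to their current locations.
LEGACY_REWRITES = {
    "/legacy/example-path": "/current/example-path",
}


def handler(event, context):
    """Rewrite legacy URIs and pass the request on to the origin."""
    # Lambda@Edge request events carry the request under Records[0].cf.request.
    request = event["Records"][0]["cf"]["request"]
    # Unknown paths are forwarded unchanged.
    request["uri"] = LEGACY_REWRITES.get(request["uri"], request["uri"])
    return request
```

Returning the (possibly modified) request object tells CloudFront to continue processing it, so a rewrite like this stays invisible to clients that still use the old URL.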
Similar to Rust releases, crates are served as static content from static.crates.io. Although it is the second-largest distribution in our infrastructure, it is much smaller than the one for Rust releases.
Crates are static and stored in Amazon S3, and served through a CloudFront distribution.