Scaling web apps using read replicas
- How we scaled Alt.xyz to handle growing read requests
- Postgres read replicas
- Migration to Aurora
- Server: bifurcating read/write
- Frontend: bifurcating read/write
How we scaled Alt.xyz to handle growing read requests
At Alt.xyz we built a marketplace for selling investment-grade cards. Over time, our application grew from tens of thousands to millions of daily requests.
We started with a single Postgres instance that all our clients read from and wrote to. Below is a diagram of
our system at the time. With a single database instance serving all traffic, reads and writes contend for the same resources.
To handle the scale, we switched from Postgres to Aurora
database instances and scaled horizontally using read replicas. 99% of our daily requests were reads,
so we opted to send all read requests to the read replicas and only writes to the primary database instance.
This post describes how we implemented such a change in a GraphQL environment. Below is a diagram of the final system
to give an idea of what we're working towards in this post.
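Before the details, the read/write split described above can be sketched in a few lines of TypeScript. This is a minimal illustration, not our production code: the names `OperationKind`, `DbEndpoints`, and `endpointFor`, as well as the hostnames, are assumptions made up for this example. In a real GraphQL server the operation kind would come from the parsed request.

```typescript
// All names here (OperationKind, DbEndpoints, endpointFor) are hypothetical
// and only illustrate the idea; they are not from the Alt.xyz codebase.
type OperationKind = "query" | "mutation";

interface DbEndpoints {
  primary: string; // accepts reads and writes
  replica: string; // read-only, may lag behind the primary
}

// GraphQL queries are expected to be side-effect free, so they can be served
// by a replica. Mutations write data, so they must go to the primary.
function endpointFor(kind: OperationKind, endpoints: DbEndpoints): string {
  return kind === "query" ? endpoints.replica : endpoints.primary;
}

// Example with placeholder hostnames.
const exampleEndpoints: DbEndpoints = {
  primary: "primary.db.example.internal",
  replica: "replica.db.example.internal",
};

console.log(endpointFor("query", exampleEndpoints)); // replica.db.example.internal
console.log(endpointFor("mutation", exampleEndpoints)); // primary.db.example.internal
```

Keeping the decision in one small function means the rest of the server does not need to know how many database instances exist.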
Postgres read replicas
We started with Postgres as our database. Postgres supports read replicas, and adding a read instance is relatively straightforward in the AWS console (shown below). In a production environment, I recommend managing these resources through Terraform, but for the purposes of this post, working directly in the AWS RDS console is sufficient:
Before sending traffic to a read replica, developers should understand how long data takes to be written to it. In our case, it took several seconds for a write to reach the read replica. This information is available in the monitoring section of a read replica.
The ReplicaLag metric is the one that provides this insight:
Once the replication lag is known, developers need to decide whether the product they are building can tolerate showing stale data, and for how long. Replication lag can also spike when the RDS instance is experiencing high CPU usage, which can happen during long-running queries or database migrations on large tables.
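One way to make the staleness question concrete is to compare the observed lag against a per-read tolerance. The sketch below is a hedged illustration, not something we ran in production: `worstLagSeconds`, `targetForRead`, the sample values, and the 30-second threshold are all assumptions made up for this example. The lag samples are assumed to be in seconds, for instance readings of the ReplicaLag metric over a recent window.

```typescript
// Hypothetical sketch: the names and thresholds are illustrative only.
type ReadTarget = "primary" | "replica";

// Worst lag (in seconds) seen in a window of samples, or null if there are none.
function worstLagSeconds(samples: number[]): number | null {
  return samples.length === 0 ? null : Math.max(...samples);
}

// Send a read to the replica only if the observed lag fits within the
// staleness this particular read can tolerate. Unknown lag is treated
// conservatively and falls back to the primary.
function targetForRead(lagSeconds: number | null, maxStalenessSeconds: number): ReadTarget {
  if (lagSeconds === null) return "primary";
  return lagSeconds <= maxStalenessSeconds ? "replica" : "primary";
}

const calm = worstLagSeconds([1.2, 3.5, 2.0]); // 3.5
const spike = worstLagSeconds([2.0, 45.0]); // 45

console.log(targetForRead(calm, 30)); // replica: a listing page can be 30s stale
console.log(targetForRead(calm, 0)); // primary: this read tolerates no staleness
console.log(targetForRead(spike, 30)); // primary: lag spiked past the tolerance
```

Falling back to the primary during a spike shifts read load onto it, so whether this policy is acceptable depends on how much headroom the primary has. Serving slightly stale data from the replica may be the better trade-off for some reads.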
Migration to Aurora
Coming soon
Server: bifurcating read/write
Coming soon
Frontend: bifurcating read/write
Coming soon