We saved $250,000 by running our own RPC nodes

👋 Welcome to our Engineering blog. merkle specializes in MEV protection and monetization on Ethereum. We guarantee minimum $0.10 per transaction. Ideal for Wallets and RPC Providers looking to protect their customers against Sandwiches while generating revenue.

At merkle, we consume hundreds of millions of RPC requests every month, and we need to make sure that our infrastructure is capable of handling the load. We use a custom load balancer that we built in-house to distribute the load across our RPC nodes and achieve high availability while keeping costs low.

RPC services

When merkle started, we used Alchemy. However, we quickly realized this wouldn't scale after receiving a $1,600 bill for just 2 days of usage:

Alchemy bill

Extrapolated, this would have cost us $24,000 per month, which is way too much for a startup. We decided to build our own RPC nodes, and we've been using them ever since.

Building our own RPC nodes

Our goals were simple:

  • High availability: we need to be able to handle billions of requests per month, and we can't afford to have downtime.
  • Low cost: we're a startup, and we need to keep our costs under $1,500 per month.
  • Low latency: we need to be able to handle requests in a timely manner.
  • Easy to scale: we need to be able to scale up and down easily.
  • Low maintenance: we don't want to spend a lot of time maintaining our infrastructure.
  • Multi-chain: we need to be able to support multiple chains / add new chains quickly with zero downtime.

Picking a cloud

We decided to use OVH, a French cloud provider, because they offer a lot of flexibility and low prices for beefy machines. We also use AWS for some services, but we prefer OVH for our RPC nodes.

Specifically, we use ADVANCE-2 servers, which have 16 cores, 32GB of RAM, and 2x 1.92TB NVMe SSDs. They cost $200 (less with commitments) per month, which is a great deal.

For Polygon and BSC, we use the same server with higher disk capacity (2x 3.84TB NVMe SSDs) for $250 per month.

But the real value in OVH servers is the unlimited outgoing/incoming bandwidth.

We spawned a minimum of 2 nodes per chain, adding up to 6 nodes in total and a monthly cost of ~$1,000 (thanks to some long-term commitments and discounts from OVH).

Picking a load balancer

Nginx is a great choice for a load balancer, but RPC nodes are different, which is why we decided to build our own load balancer in Go. merkle products are mostly Rust, but we use Go for some high-traffic services, and it's a great fit for this use case.

Building a load balancer for RPC nodes

We needed a high-throughput, low-latency load balancer that could handle hundreds of millions of requests per month, and we needed to build it quickly.

The architecture

To keep track of all upstream nodes, the load balancer connects to each of them over multiple WebSocket connections (we don't use HTTP).

Load balancer design

At all times, the load balancer keeps between 5 and 10 WebSocket connections open to every upstream server.
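
A minimal sketch of what such an upstream connection pool could look like, assuming gorilla/websocket as the WebSocket client library (the post does not name one); the URL, pool size handling, and round-robin selection here are illustrative, not merkle's actual implementation.

```go
// Sketch: a small pool of persistent WebSocket connections to one upstream RPC node.
package main

import (
	"log"
	"sync"

	"github.com/gorilla/websocket"
)

// upstream holds a pool of open WebSocket connections to a single RPC node.
type upstream struct {
	url   string
	mu    sync.Mutex
	conns []*websocket.Conn
	next  int
}

// dialPool opens `size` connections to the upstream (5-10 per node in the post).
func dialPool(url string, size int) (*upstream, error) {
	u := &upstream{url: url}
	for i := 0; i < size; i++ {
		c, _, err := websocket.DefaultDialer.Dial(url, nil)
		if err != nil {
			return nil, err
		}
		u.conns = append(u.conns, c)
	}
	return u, nil
}

// conn returns the next connection in round-robin order.
func (u *upstream) conn() *websocket.Conn {
	u.mu.Lock()
	defer u.mu.Unlock()
	c := u.conns[u.next%len(u.conns)]
	u.next++
	return c
}

func main() {
	// Hypothetical upstream URL, for illustration only.
	up, err := dialPool("ws://10.0.0.1:8546", 5)
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("connected %d sockets to %s", len(up.conns), up.url)
}
```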

Supporting eth_subscribe

Keep track of the head

State consistency is an issue with normal web services, but when it comes to RPC nodes, it's a totally different problem. We want to make sure we never route requests to a server that is lagging behind the head of the network.

For example, suppose we have servers A and B. When A hears of a new block, it quickly processes it and updates its state, but B might not have received the new block yet. Now you have two nodes with different state.

To solve this problem, we keep track of each node's view of the head of the network, and we only route requests to nodes that are at the latest head. However, we need to wait until the majority of nodes have synced to the new head before advertising it to clients; otherwise all requests would be routed to a single server for a short period, which would put a lot of load on it.
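
A minimal sketch of that head-tracking rule, assuming a simple majority threshold: each node reports its latest block, the advertised head only advances once most nodes have reached it, and only nodes at or above the advertised head are eligible for requests. The type and method names are illustrative.

```go
// Sketch: track per-node heads and only advance the advertised head on majority sync.
package main

import "fmt"

type headTracker struct {
	heads          map[string]uint64 // latest block seen per node
	advertisedHead uint64            // head we expose to clients
}

// onNewHead records a node's latest block and advances the advertised head
// once a majority of nodes have reached that block.
func (t *headTracker) onNewHead(node string, block uint64) {
	t.heads[node] = block
	if block <= t.advertisedHead {
		return
	}
	synced := 0
	for _, h := range t.heads {
		if h >= block {
			synced++
		}
	}
	if synced > len(t.heads)/2 {
		t.advertisedHead = block
	}
}

// healthyNodes returns the nodes eligible to serve requests: those at the advertised head.
func (t *headTracker) healthyNodes() []string {
	var out []string
	for node, h := range t.heads {
		if h >= t.advertisedHead {
			out = append(out, node)
		}
	}
	return out
}

func main() {
	t := &headTracker{heads: map[string]uint64{"A": 100, "B": 100}, advertisedHead: 100}
	t.onNewHead("A", 101) // only A has the new block: advertised head stays at 100
	fmt.Println(t.advertisedHead, t.healthyNodes())
	t.onNewHead("B", 101) // majority reached: advertised head advances to 101
	fmt.Println(t.advertisedHead, t.healthyNodes())
}
```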

eth_subscribe

New blocks:

eth_subscribe is the fastest way to get notified of new blocks and new pending transactions, but we don't want to just proxy the request and attach a stream to a single node, because we want to make sure we don't miss any events. And if a node goes down, we want to make sure the client never notices and keeps receiving new blocks.

Thankfully, we already track every new block event to route requests. Therefore, an eth_subscribe never actually needs to be forwarded to a node: we can just keep track of the subscription on the load balancer and forward the events to the client.
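
A minimal sketch of that idea: subscriptions live on the load balancer, and every new head the balancer already tracks is fanned out to subscribers, so an upstream node failing over is invisible to clients. The hub type, channel payloads, and buffer size are simplifications, not merkle's actual code.

```go
// Sketch: handle eth_subscribe locally and fan out newHeads to all subscribers.
package main

import (
	"fmt"
	"sync"
)

type subscriber chan string // receives JSON-encoded newHeads notifications

type subHub struct {
	mu   sync.Mutex
	subs map[int]subscriber
	next int
}

// subscribe registers a client without ever contacting an upstream node.
func (h *subHub) subscribe() (int, subscriber) {
	h.mu.Lock()
	defer h.mu.Unlock()
	h.next++
	ch := make(subscriber, 16)
	h.subs[h.next] = ch
	return h.next, ch
}

// broadcastNewHead forwards a block the balancer has already seen to every client.
func (h *subHub) broadcastNewHead(blockJSON string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	for _, ch := range h.subs {
		select {
		case ch <- blockJSON:
		default: // drop if the client is too slow to keep up
		}
	}
}

func main() {
	hub := &subHub{subs: map[int]subscriber{}}
	id, ch := hub.subscribe()
	hub.broadcastNewHead(`{"number":"0x10"}`)
	fmt.Println("subscription", id, "got:", <-ch)
}
```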

New pending transactions:

Under the hood, the load balancer connects to our Transaction stream to seamlessly advertise pending transactions as fast as possible.

Caching

We know from experience that as soon as a new block is advertised, the load balancer gets flooded with eth_getTransactionReceipt, eth_getTransactionByHash, eth_getBlockByNumber and eth_getBlockByHash calls. That's why we cache all of these responses before advertising a new block, leading to a 40-80% cache hit rate.

Cache hits
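
A minimal sketch of that warm-up step, under the assumption that the balancer pre-fetches per-block and per-transaction responses before advertising the block; the cache layout and the `fetch` placeholder are illustrative only.

```go
// Sketch: pre-populate the cache for a new block before advertising it to clients.
package main

import (
	"fmt"
	"sync"
)

type rpcCache struct {
	mu      sync.RWMutex
	entries map[string][]byte // key: method + params, value: raw JSON response
}

func (c *rpcCache) put(key string, resp []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[key] = resp
}

func (c *rpcCache) get(key string) ([]byte, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	resp, ok := c.entries[key]
	return resp, ok
}

// warmBlock caches the responses clients ask for right after a block:
// the block itself, plus each transaction and its receipt.
// fetch stands in for a real call to an upstream node.
func warmBlock(c *rpcCache, blockHash string, txHashes []string, fetch func(method, param string) []byte) {
	c.put("eth_getBlockByHash:"+blockHash, fetch("eth_getBlockByHash", blockHash))
	for _, tx := range txHashes {
		c.put("eth_getTransactionByHash:"+tx, fetch("eth_getTransactionByHash", tx))
		c.put("eth_getTransactionReceipt:"+tx, fetch("eth_getTransactionReceipt", tx))
	}
}

func main() {
	cache := &rpcCache{entries: map[string][]byte{}}
	fake := func(method, param string) []byte { return []byte(`{"result":"..."}`) }
	warmBlock(cache, "0xabc", []string{"0x1", "0x2"}, fake)
	if resp, ok := cache.get("eth_getTransactionReceipt:0x1"); ok {
		fmt.Println(string(resp))
	}
}
```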

Quality of life improvements

Our engineers used to always ask "What is the RPC URL for <x> chain?" So we put our load balancer behind rpc.merkle.net (on our internal network). Now they can just use https://rpc.merkle.net/<chain> for any chain that we support.
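
A minimal sketch of that path-based routing: one internal hostname, with the chain taken from the first path segment. The chain names, port, and handler body are placeholders; the real balancer would forward the request to a healthy upstream for that chain.

```go
// Sketch: route https://rpc.merkle.net/<chain> to the right set of nodes by path.
package main

import (
	"fmt"
	"log"
	"net/http"
	"strings"
)

var chains = map[string]bool{"ethereum": true, "polygon": true, "bsc": true}

func handler(w http.ResponseWriter, r *http.Request) {
	chain := strings.Trim(r.URL.Path, "/")
	if !chains[chain] {
		http.Error(w, "unknown chain", http.StatusNotFound)
		return
	}
	// In the real balancer the request would now be forwarded to a healthy
	// upstream node for this chain; here we just acknowledge the route.
	fmt.Fprintf(w, "routing request to %s nodes\n", chain)
}

func main() {
	http.HandleFunc("/", handler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```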

Conclusion

We've been using this load balancer for over 3 months now. It's been working great, has scaled very well, and we were able to save over $250,000 in the process.
