It's almost a classic: a sluggish webshop or online sales platform that breaks during seasonal peaks or when there is an unexpected surge in traffic.
The easiest way to deal with that problem is to throw "more metal" at it, which in IT language means: get a bigger server.
Up to a certain point that is indeed the best strategy - but eventually the conditions change, and forcing "more metal" quickly becomes a liability: a temporary fix that keeps draining your wallet. Preparing the ground for a spectacular failure in the future comes for free...
Some people asked me: why in 2024 are you still blogging about something as obvious as horizontal scaling?
Many smaller businesses rely ONLY on developers to handle their code, strategy and infrastructure - and for many developers such concepts are simply outside their scope and primary interest. Since it's not their first priority, they treat it more like a "forced hobby" they would rather ditch.
Explaining horizontal scaling
Imagine a greeter by the door sending people to tables in a restaurant.
Each waiter can service 3 tables efficiently - 5 for a short period of time.
Now imagine you get so many customers that suddenly each waiter has to service 10 tables instead.
How many tables will be happy? Well - all of them will have to wait much longer for the service.
In horizontal scaling you simply add more waiters (and tables) to the mix, so that you always keep each waiter's workload at around 3-5 tables.
From that example we have the following concepts:
- the greeter becomes our "load balancer"; he or she sends people to tables serviced by waiters that still have capacity
- each table+waiter pair becomes a "worker node"
In such a setup the load balancer has to handle all incoming traffic, but that usually isn't a problem, and the example shows why: it takes much less time to send a person to the right table than to take the order, deliver it etc.
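To make the greeter and the waiters a bit more tangible, here is a minimal sketch of the idea in Python. It is not how any particular load balancer is implemented; the class names, the capacity numbers and the "pick the least busy node" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WorkerNode:
    """A waiter with tables: a node that can serve a limited number of requests at once."""
    name: str
    capacity: int      # hypothetical limit, e.g. 5 "tables"
    active: int = 0    # requests currently being served

    def has_capacity(self) -> bool:
        return self.active < self.capacity


class LoadBalancer:
    """The greeter: sends each new request to the least busy node that still has capacity."""

    def __init__(self, nodes: List[WorkerNode]) -> None:
        self.nodes = nodes

    def route(self) -> Optional[WorkerNode]:
        candidates = [node for node in self.nodes if node.has_capacity()]
        if not candidates:
            return None  # every node is full: the guest has to wait at the door
        node = min(candidates, key=lambda n: n.active)
        node.active += 1
        return node

    def release(self, node: WorkerNode) -> None:
        """Call when a request is finished, so the node frees up a slot."""
        node.active = max(0, node.active - 1)


balancer = LoadBalancer([WorkerNode("waiter-1", 5), WorkerNode("waiter-2", 5)])
assigned = [balancer.route().name for _ in range(4)]
print(assigned)
```

Notice how little work route() does compared to actually serving a request. Scaling horizontally in this sketch is just appending another WorkerNode to balancer.nodes.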
Why do companies need horizontal scaling?
There can be many reasons for that - but in my career two big reasons have stood out, and they were almost always mixed together:
- the company became a victim of its own success: the attention they get either all the time or during seasonal sales (or events) is simply destroying their infrastructure - and credibility
- it's a mixture of "lack of knowledge", "poor coding", "wrong solution", "mismatched infrastructure" and things done "in a hurry"; I am mentioning all of these in one category because all of them were always present, just in different proportions
Horizontal scaling requires an effort - and a bit of strategy, but contrary to popular belief you don't have to build big from the beginning. You can prepare the ground in your code - or pick a framework/system that is scalable. These don't cost a fortune, and can possibly prevent you from losing one in the future.
Why can't we just have a bigger computer?
It's like with trucks - they have a certain size that fits the majority of cases. Need to transport more? You just divide the load and use multiple trucks.
If your load isn't really easy to disassemble and assemble - then it will be a challenge to have it transported in multiple trucks.
Surprisingly or not - that is how it works in IT. You might have code and structure that can be easily scaled - or you might have to use some extra resources.
The good thing is that in IT, if you prepare your code to be horizontally scaled, that investment will keep paying for itself every day.
Typical misconceptions about horizontal scaling
I can scale my software to alleviate code issues/poor code
If you think that your poor code will perform well when you employ horizontal scaling - then I will tell you: it won't.
It will simply perform poorly, just on a larger number of machines. Therefore it's important to have control over potential bottlenecks in your software - and remove them before you go horizontal. This will save you a lot of time in the future.
As a short term strategy this will of course work - you can scale poor code on multiple machines and hence service all your customers, but in the long term it's like balancing on a thin line.
It's expensive
That is relative. If you are not sure about the concept - then I do not recommend building a cargo plane for transporting a box of apples.
Since it's software you can in fact prepare for it with minimum drag on your primary budget - and when you exhaust "the bigger machine" (vertical scaling) strategy - you can quickly go horizontal.
It's expensive mostly in one case: when you have a ready-made product that you have to scale - and inside of it there are some architectural decisions that simply require additional tricks or rewrites.
Prevention is always cheaper.
Preparing for horizontal scaling
Below is a list of important steps that I always recommend to have in mind while building new products and projects that have the potential to become "a lot more popular".
- Stateless Design
Make sure your application doesn't rely on previous interactions or store user data on the server between requests; this makes it easier to add more servers. This is easy to achieve when working with tokens, but what if you are using sessions? Then make sure to use a centralized session store - and I personally recommend Redis for this task.
- Database Optimization
The moment you scale your nodes - your database might become the new bottleneck. Indexing, making sure you fetch only the data that is required etc. - these things sound trivial, but not when you suddenly have to transfer 10MB of completely useless data... 50,000 times a day! (true story!) I am mentioning it because scaling databases is a completely different ball game and things get very expensive very quickly there...
- Caching
Store frequently accessed data in a cache to reduce database load and speed up your application. The moment you begin to scale horizontally - you should be able to access your cache from multiple locations. Redis again works perfectly - but if you want to start without it - not a problem. Just make a Cache facade in your app - then it does not matter what backend is behind it.
- Component Separation
Break your application into pieces - and I do not mean "microservices". There is a middle station between full-vertical and full-horizontal, and that is scaling on the infrastructural level. With minimum effort you can separate your app into the application server, the database server, the queue server, the caching server etc. - and this way you can scale each element independently. In many cases this was enough for my customers: just to split things and scale each service vertically. It is a necessity if you want to go full horizontal.
- Background Processing
Move time-consuming tasks to the background, keeping user requests quick and responsive. This way you are not occupying the threads and you can run queues on a separate machine. In some cases scaling just your queue machine will be enough - and believe it or not - this is much easier than scaling the frontend application.
- File Uploads
Use centralized document storage from the very beginning instead of storing files directly on the server. This is one of the main show-stoppers that plague people who need to scale quickly and have a custom system developed in-house. Yes, you can use clustered storage and mount it on all worker machines - but that again increases the complexity of your infrastructure and introduces another component that can break.
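The Cache facade mentioned in the list above could look roughly like the sketch below. Treat it as one possible shape under my own assumptions: the names Cache, CacheBackend, InMemoryBackend and remember are hypothetical and not taken from any framework. The point is only that the application talks to Cache, while the backend behind it can be replaced later.

```python
import time
from typing import Any, Callable, Dict, Optional, Protocol, Tuple


class CacheBackend(Protocol):
    """What every backend has to offer; a shared (e.g. Redis-backed) one would expose the same two methods."""

    def get(self, key: str) -> Optional[Any]: ...

    def set(self, key: str, value: Any, ttl_seconds: float) -> None: ...


class InMemoryBackend:
    """Per-process backend: fine on a single server, NOT shared between worker nodes."""

    def __init__(self) -> None:
        self._items: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._items[key]  # expired: behave as if it was never cached
            return None
        return value

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._items[key] = (time.monotonic() + ttl_seconds, value)


class Cache:
    """The facade: application code only ever talks to this class."""

    def __init__(self, backend: CacheBackend) -> None:
        self._backend = backend

    def remember(self, key: str, ttl_seconds: float, compute: Callable[[], Any]) -> Any:
        """Return the cached value, or compute and store it on a miss.

        Simplification: a computed value of None is never treated as a hit.
        """
        value = self._backend.get(key)
        if value is None:
            value = compute()
            self._backend.set(key, value, ttl_seconds)
        return value


calls = []

def load_products():
    calls.append("db query")  # stands in for an expensive database call
    return ["apple", "pear"]

cache = Cache(InMemoryBackend())
first = cache.remember("products", 60, load_products)
second = cache.remember("products", 60, load_products)  # served from the cache
```

When the day comes to go horizontal, only the backend passed to Cache has to change; the calls to remember() scattered around the application stay as they are.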
When to go horizontal?
Consider horizontal scaling when:
- Your servers consistently experience high load
- You anticipate increased traffic due to campaigns or growth
- You need to ensure high availability and reliability
Notice that the relation between the number of customers served and the load on your machines isn't linear. The moment things "get stuck" and resources are saturated, things go in the wrong direction much more quickly. This is why having a bit of headroom is important.
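One way to get a feeling for this non-linearity is a textbook queueing model. The sketch below assumes the simplest one (a single server with random arrivals, known as M/M/1), where the average response time equals the service time divided by (1 - utilization). Real applications are messier, so treat the numbers as an illustration and not as a prediction; the function name and the 0.1 s service time are made up for the example.

```python
def mean_response_time(service_time_s: float, utilization: float) -> float:
    """Average response time in the M/M/1 model: service time / (1 - utilization)."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be at least 0 and below 1")
    return service_time_s / (1.0 - utilization)


# A request that takes 0.1 s on an idle server, at increasing levels of "busy".
for utilization in (0.5, 0.8, 0.9, 0.95, 0.99):
    seconds = mean_response_time(0.1, utilization)
    print(f"{utilization:.0%} busy: about {seconds:.2f} s per request")
```

In this model a server that is 50% busy answers in twice the bare service time, while at 90% it already takes ten times as long. The last few percent of utilization are the most expensive ones, which is exactly where the headroom pays off.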
I have previously mentioned that vertical scaling is usually the first and best approach (part A). Honestly, I recommend building things in such a way that part B from the diagram can be achieved seamlessly (and in the majority of cases it can). This way you open up new areas for vertical scaling.
The last part - C - is when your app is horizontally scalable - and as you can see the separation performed in part B is basically a hard requirement for C (there are exceptions of course - but they are rare).
Can this be automated?
Yes, it can - and the process is called auto-scaling. In such cases the number of servers is dynamically adjusted based on demand, adding servers during high load periods and reducing them during low traffic times. This feature requires careful planning but offers efficient resource utilization.
Unfortunately the learning curve becomes very steep when dealing with auto-scaling. That is because there are many nuances regarding the codebase itself, the deployment process and many other elements that influence the whole process. After all, what you want is to be able to scale your infrastructure up and down smoothly, without any downtime - and that in itself is already pretty challenging.
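The decision at the heart of auto-scaling can be sketched in a few lines. The example below uses a simple proportional rule (similar in spirit to the one described in the Kubernetes Horizontal Pod Autoscaler documentation); the function name, the limits and the tolerance are assumptions chosen for illustration, and none of the hard parts mentioned above (deployments, warm-up, draining connections) are covered.

```python
import math


def desired_nodes(
    current_nodes: int,
    current_load_pct: float,   # e.g. average CPU usage across all nodes
    target_load_pct: float,    # the load level you want each node to sit at
    min_nodes: int = 1,
    max_nodes: int = 10,
    tolerance: float = 0.1,
) -> int:
    """Proportional rule: scale the node count by current load / target load."""
    ratio = current_load_pct / target_load_pct
    if abs(ratio - 1.0) <= tolerance:
        wanted = current_nodes  # close enough to the target: avoid flapping up and down
    else:
        wanted = math.ceil(current_nodes * ratio)
    # Never go below the safety minimum or above the budget maximum.
    return max(min_nodes, min(max_nodes, wanted))


print(desired_nodes(4, 90, 60))   # overloaded: add nodes
print(desired_nodes(4, 30, 60))   # mostly idle: remove nodes
print(desired_nodes(4, 62, 60))   # within tolerance: leave it alone
```

The minimum keeps you available during quiet hours, the maximum protects your wallet, and the tolerance stops the system from adding and removing servers on every small fluctuation.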
Conclusion
Horizontal scaling offers tremendous processing capability. It does require some preparation and an investment from your side - but once that is done, scaling up and down becomes pretty seamless and easy.
A great side-effect of horizontal scaling is also increased availability - but here I am only talking about potential hardware/VM issues. When you deploy broken code - it will break everything.