Scaling your web application: basic steps

Published

09/13/2022

Kossi Adzo

It’s not enough to create applications for your business; you also need to make them perform well under load. Scaling is an effective way to do that. In this article you’ll learn about code optimization, architecture optimization, and how to build scalable web applications in general.

Optimizing

Gearheart suggests asking yourself the following questions:

are the database queries optimal (EXPLAIN analysis, use of indexes)?
is the data stored correctly (SQL vs NoSQL)?
is caching used?
are there any unnecessary requests to the file system (FS) or the database?
are the data processing algorithms optimal?
are the environment settings optimal: Apache/Nginx, MySQL/PostgreSQL, PHP/Python?

Each of these questions could fill a separate article, so this article does not cover them in detail. The important point is that before you start scaling an application, you should optimize it as much as possible; you may find that no scaling is required at all.
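As a small illustration of the first question, here is a minimal sketch that checks whether a query uses an index. It relies on Python's built-in sqlite3 module, and SQLite's EXPLAIN QUERY PLAN stands in for the EXPLAIN command of MySQL or PostgreSQL, whose output looks different. The table, column, and index names are hypothetical.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the plan steps SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row is the human-readable description.
    return [row[-1] for row in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

lookup = "SELECT * FROM users WHERE email = ?"
args = ("user42@example.com",)

# Without an index the only option is to scan the whole table.
plan_before = query_plan(conn, lookup, args)

# After adding an index on the filtered column, the planner can search it.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(conn, lookup, args)

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan depends on the SQLite version, but the first plan should mention a scan and the second should name the index.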

Scaling

Suppose you have already optimized your application, but it still cannot handle the load. In this case, the obvious solution is to distribute the application across multiple hosts, increasing its overall performance by increasing the resources available to it. This approach is called “scaling” the application. More precisely, scalability is the ability of a system to increase its performance as more resources are made available to it.

There are two types of scaling: vertical and horizontal. Vertical scaling means increasing application performance by adding resources (CPU, memory, disk) within one node (host). Horizontal scaling is typical for distributed applications and means increasing application performance by adding more nodes.

It is clear that the easiest way is a simple hardware upgrade (processor, memory, disk) – i.e., vertical scaling. In addition, this approach does not require any modifications to the application. However, vertical scaling quickly reaches its limit, after which the developer and administrator have no choice but to switch to horizontal scaling of the application.

Application Architecture

Most web applications are distributed by design, because their architecture can be divided into at least three layers: web server, business logic (application), and data (database, static files).

Each of these layers can be scaled. So if your application and database reside on the same host, the first step should definitely be to move them to separate hosts.

The bottleneck

When you start scaling the system, the first thing to do is determine which layer is the “bottleneck”, i.e., slower than the rest of the system. To begin with, you can use simple utilities such as top (htop) to evaluate CPU and memory consumption, and df and iostat to evaluate disk usage. It is better, however, to set up a separate host that emulates production load (using ab or JMeter), where you can profile the application with tools such as xdebug and oprofile. Utilities like pgFouine can help you identify slow database queries (ideally based on logs from the production server).
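For a Python application, the standard library's cProfile and pstats modules play a role similar to the profilers named above. The following is only a sketch: handle_request, slow_lookup, and fast_lookup are hypothetical functions standing in for real application code.

```python
import cProfile
import io
import pstats

def slow_lookup(items, targets):
    # Membership tests against a list scan it element by element.
    return [t for t in targets if t in items]

def fast_lookup(items, targets):
    # A set gives constant-time membership tests on average.
    item_set = set(items)
    return [t for t in targets if t in item_set]

def handle_request():
    # Hypothetical request handler that calls both variants.
    items = list(range(5000))
    targets = list(range(0, 10000, 5))
    slow_lookup(items, targets)
    fast_lookup(items, targets)

def profile_report(func, top=5):
    """Run func under the profiler and return the top entries as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()

report = profile_report(handle_request)
print(report)
```

In the printed report, the functions with the largest cumulative time are the first candidates for optimization; here that should be slow_lookup.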

The answer depends on the architecture of the application, but in general the most likely bottlenecks are the database and the code. If your application handles a lot of user data, the bottleneck is likely to be static file storage.

Database scaling

As mentioned above, the bottleneck in modern applications is often the database. Problems with it usually fall into two classes: performance and the need to store large amounts of data.

You can reduce the load on the database by spreading it across several hosts. This raises the difficult problem of keeping them synchronized, which can be solved with a master/slave scheme using synchronous or asynchronous replication. For PostgreSQL, you can use PgPool-II for synchronous replication and Slony-I or WAL-based streaming replication (9.0) for asynchronous replication. To split read and write requests, and to balance the load between slaves, you can configure a dedicated database access layer (PgPool-II).
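The idea behind splitting reads and writes can be sketched in a few lines. This is an illustration only, not how PgPool-II is implemented; the class name and host names are hypothetical, and statement classification is deliberately simplistic.

```python
import itertools

class ReadWriteRouter:
    """Send reads to the slaves in turn and everything else to the master."""

    def __init__(self, master, slaves):
        self.master = master
        # Fall back to the master if no slaves are configured.
        self._read_cycle = itertools.cycle(slaves or [master])

    def route(self, sql):
        statement = sql.lstrip().upper()
        # Simplification: only plain SELECTs count as reads. Real routers
        # also handle transactions, SELECT ... FOR UPDATE, and so on.
        if statement.startswith("SELECT"):
            return next(self._read_cycle)
        return self.master

router = ReadWriteRouter("db-master:5432", ["db-slave-1:5432", "db-slave-2:5432"])
for sql in ["SELECT * FROM orders", "INSERT INTO orders VALUES (1)", "SELECT 1"]:
    print(router.route(sql), "<-", sql)
```

Keep in mind that with asynchronous replication a slave can lag behind the master, so a read issued right after a write may return stale data.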

For relational databases, the problem of storing large amounts of data can be solved by partitioning (table partitioning in PostgreSQL), or by deploying the database on a distributed file system such as Hadoop DFS.

You can read about both solutions in the excellent book on configuring PostgreSQL.

However, for storing large amounts of data, the best solution is sharding, which is an inherent advantage of most NoSQL databases (e.g., MongoDB).
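The core of sharding is a rule that maps each record key to one of several hosts. Below is a minimal sketch of hash-based shard selection; the shard names and key format are hypothetical, and this is not how MongoDB or any particular database implements it.

```python
import hashlib

def shard_for(key, shards):
    """Pick a shard deterministically from a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]

SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]  # hypothetical hosts

for user_id in ["user:1", "user:2", "user:3", "user:4"]:
    print(user_id, "->", shard_for(user_id, SHARDS))
```

A plain modulo like this remaps most keys whenever the number of shards changes, which is why production systems typically use more elaborate schemes.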

Moreover, NoSQL databases often work faster than their SQL counterparts on simple operations, because they avoid the overhead of parsing and optimizing queries, checking data integrity, etc. The topic of comparing relational and NoSQL databases is also quite extensive and deserves a separate article.

Also worth noting is the experience of Facebook, which uses MySQL without JOIN queries. This strategy lets them scale the database much more easily by moving load from the database to the code, which, as described below, is easier to scale than the database.
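To make the idea of moving a join into the code concrete, here is a minimal sketch that replaces one JOIN with two simple queries combined in the application. It is not Facebook's actual code; the schema is hypothetical, and an in-memory SQLite database stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO posts VALUES (10, 1, 'Hello'), (11, 2, 'Scaling'), (12, 1, 'Again');
""")

def posts_with_authors(conn):
    """Combine posts and users in application code instead of with a JOIN."""
    posts = conn.execute(
        "SELECT id, user_id, title FROM posts ORDER BY id"
    ).fetchall()
    user_ids = sorted({user_id for _, user_id, _ in posts})
    if not user_ids:
        return []
    # Second query: fetch only the users referenced by the posts.
    placeholders = ",".join("?" * len(user_ids))
    users = dict(conn.execute(
        f"SELECT id, name FROM users WHERE id IN ({placeholders})", user_ids
    ).fetchall())
    # The "join" itself is a dictionary lookup in the application.
    return [(title, users.get(user_id)) for _, user_id, title in posts]

print(posts_with_authors(conn))
```

Because each query touches a single table, the two tables could live on different database hosts, at the cost of extra work in the application.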

Code Scaling

The complexities of scaling code depend on how many shared resources your hosts need to run your application. Will it just be sessions, or will you need to share caches and files? Either way, the first thing to do is to run copies of the application on multiple hosts with the same environment.
Next, you need to set up load balancing of requests between these hosts. You can do this at the TCP level (HAProxy), at the HTTP level (nginx), or through DNS.
The next step, according to Gearheart, is to make the static files, cache, and web application sessions available on every host. For sessions, you can use a server that works over the network (for example, Memcached). It makes sense to use Memcached as the cache server too, but on a different host.
Static files can be mounted from shared file storage via NFS/CIFS, or served from a distributed file system (HDFS, GlusterFS, Ceph).
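The reason sessions must be shared can be shown with a small sketch. Here an in-memory class stands in for a networked session server such as Memcached, and two objects stand in for copies of the application on different hosts. All class and host names are hypothetical.

```python
import uuid

class SessionStore:
    """In-memory stand-in for a networked session server."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        return self._data.get(session_id, {})

    def set(self, session_id, session):
        self._data[session_id] = dict(session)

class AppInstance:
    """One copy of the application; it keeps no session state of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def login(self, username):
        session_id = uuid.uuid4().hex
        self.store.set(session_id, {"user": username})
        return session_id

    def whoami(self, session_id):
        user = self.store.get(session_id).get("user", "anonymous")
        return f"{self.name}: {user}"

shared = SessionStore()
host_a = AppInstance("host-a", shared)
host_b = AppInstance("host-b", shared)

sid = host_a.login("alice")   # the balancer sent the login to host-a
print(host_b.whoami(sid))     # a later request lands on host-b
```

Because neither instance holds the session itself, the balancer is free to send each request to any host.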

It is also possible to store files in a database (e.g., Mongo GridFS), which addresses both availability and scalability (given that NoSQL databases handle scalability through sharding).

Deployment to multiple hosts deserves a separate mention. How do you make sure that a user who clicks “Update” does not see different versions of the application? The simplest solution, in my opinion, is to remove the hosts that have not yet been updated from the load balancer (web server) configuration and re-enable them one by one as they are updated. You can also bind users to specific hosts by cookie or IP. If the update requires significant changes to the database, the easiest way is to take the project offline temporarily.
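The deployment sequence described above can be sketched as a small simulation. It assumes at least two hosts and models the load balancer as a list of active hosts; the Pool class, the deploy function, and the host names are hypothetical, and a real deployment would edit the balancer configuration instead.

```python
class Pool:
    """Tracks host versions and which hosts the balancer sends traffic to."""

    def __init__(self, hosts, version):
        self.version_of = {host: version for host in hosts}
        self.active = list(hosts)

    def serving_versions(self):
        return {self.version_of[host] for host in self.active}

def deploy(pool, new_version, log):
    """Update all hosts so that users never see two versions at once."""
    hosts = list(pool.version_of)
    first, rest = hosts[0], hosts[1:]
    # 1. Take one host out of rotation and update it; the others keep
    #    serving the old version.
    pool.active.remove(first)
    pool.version_of[first] = new_version
    log.append(pool.serving_versions())
    # 2. Switch traffic: only the updated host stays in the pool
    #    (capacity is reduced until the other hosts return).
    pool.active = [first]
    log.append(pool.serving_versions())
    # 3. Update the remaining hosts and re-enable them one by one.
    for host in rest:
        pool.version_of[host] = new_version
        pool.active.append(host)
        log.append(pool.serving_versions())

pool = Pool(["web-1", "web-2", "web-3"], "v1")
history = []
deploy(pool, "v2", history)
print(history)
```

Each entry in history is the set of versions being served at that step; it never contains more than one version.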