Scalability is the ability of a system, network, or process to handle a growing amount of work in a capable manner or its ability to be enlarged to accommodate that growth.
“How do you address scalability?” is a question I hear quite a lot lately. Five years ago, it was “How do you address the cloud?” But in this case, “scalability” isn’t just the latest buzzword concerning Internet apps, services, and infrastructure. The quality of scalability is important in terms of high availability, fault-tolerance, and an expected level of performance.
But the question “How do you address scalability?” seems a bit off-putting, asking first and foremost whether you have learned the taxonomy of scalability as it pertains to the modern web. Where does one start? At the definitions of ‘horizontal’ (add-another-Dell-to-the-problem) and ‘vertical’ (soup-up-the-machine) scaling? And which of the many paths do you go down after that?
A more grounded question is
“How does your design ensure that there is no degradation in performance of your application with increasing loads, for example extra traffic? What other loads can you foresee? Bigger data sets? Would additional application features be an issue?”
A question of actual practice would be
“How do you test for — and find — performance bottlenecks? What tools do you use?”
(The question of “What tools do you use?” is hardly ever asked. And I don’t know why. I think a person’s use of tools gives the most insight into his or her professional competencies.)
The use of modern “best practice” design patterns and programming techniques addresses almost all scalability issues in client and server applications. I personally start with the concept of keeping everything simple: a simple API, simple testing, and simple devops. This simplicity, along with decoupled modules and DRY code, alleviates my baseline performance concerns. Scalable interactions with the backend are also the simple ones: centralizing or avoiding server sessions, and ideally constraining the app to REST calls. I design for scalability by obeying the golden rules: perform operations asynchronously when you can, as concurrently as feasible, in any order if possible, and never contend for resources.
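Those golden rules can be sketched in a few lines. This is a minimal illustration, not a prescription: the resource names and the `fetch_resource` helper are hypothetical stand-ins for real non-blocking I/O calls.

```python
import asyncio

async def fetch_resource(name: str) -> str:
    # Hypothetical stand-in for a non-blocking I/O call (e.g., an HTTP request).
    await asyncio.sleep(0.1)  # simulate latency without blocking the event loop
    return f"result-for-{name}"

async def main() -> list[str]:
    # Launch all requests at once: execution overlaps, no task blocks another,
    # and no shared resource is contended for. gather() returns the results
    # in call order, even though the tasks complete independently.
    tasks = [fetch_resource(n) for n in ("users", "orders", "inventory")]
    return await asyncio.gather(*tasks)

if __name__ == "__main__":
    print(asyncio.run(main()))
```

Because the three calls run concurrently, the whole batch takes roughly as long as the slowest single call rather than the sum of all three.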
Questions I would ask along these lines would be,
“Why does going from 1 computer to 6 computers not necessarily mean 6 times the performance? How would you design your application to achieve the desired 6 times the performance?” (See Amdahl’s Law.)
“What is clustering as it pertains to scalability?”
“How would resource contention affect scalability and what are ways to mitigate such contention?”
“What are ways to persist data, and how does each way enhance or inhibit scalability?”
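The first question above has a concrete answer in Amdahl's Law: if only a fraction p of the work can be parallelized, the speedup on n machines is 1 / ((1 − p) + p / n). A quick sketch makes the point (the 90% figure is an illustrative assumption, not a rule):

```python
def amdahl_speedup(p: float, n: int) -> float:
    # Amdahl's Law: speedup S(n) = 1 / ((1 - p) + p / n),
    # where p is the parallelizable fraction of the work
    # and n is the number of machines (or processors).
    return 1.0 / ((1.0 - p) + p / n)

# Even with 90% of the work parallelizable, 6 machines fall well
# short of a 6x speedup; the serial 10% dominates as n grows.
print(amdahl_speedup(0.90, 6))  # → 4.0
```

Achieving something close to the desired 6x therefore means driving the serial fraction toward zero: removing shared state, sessions, and other points of contention, exactly as argued above.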
There is a kind of institutional bias in this general question about scalability, because it implies that scalability is a new idea with new ways of doing things. But scalability is a broad collection of existing concerns and practices, bundled together under new taxonomies to make the best use of the wave of cloud-commodified IT infrastructure. Neither the taxonomies nor even the concepts are by any means common or standardized. Yet for almost every startup on the rise, crashed servers under the load of an unexpected surge in visitors have been a rite of passage – and a nice problem to have – for decades, and a very real demonstration of the need for scalability.
The bias is against the experienced designer who has successfully dealt with the same issues under different guises for years and who, for example, makes a habit of storing app data in encoded cookies. And it is biased in favor of one who can rattle off an outline of scalability read that morning off the Web, and who makes a full-page articulation of Representational State stored as Encoded Cookie for Scalability seem newly relevant, knowing, and insightful.