16 System Design Concepts Every Developer Must Know

In the previous article, we discussed why System Design matters and how applications face new challenges as they grow from hundreds of users to millions.

Before diving into advanced topics like caching, load balancers, database replication, and microservices, you need a solid grip on the fundamental concepts that everything else is built on.

These ideas are the vocabulary of System Design. Once you know them, every advanced topic becomes far easier to follow.

Let's go through them, grouped by what they actually describe.

Part 1 — How Fast and How Much? (Performance)

1. Latency

Latency is the time it takes to complete a single request.

In plain words: how long does a user wait for a response?

Example

Tap Instagram icon
       ↓
Request sent
       ↓
Feed loaded in 100ms

Here the latency is 100ms. Lower latency means a snappier experience.
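To make the idea concrete, here is a minimal sketch of how you might time a single request. The `load_feed` function is a hypothetical stand-in for a real network call; it just sleeps for a few milliseconds.

```python
import time


def measure_latency_ms(operation):
    """Run `operation` once and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = operation()
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, elapsed_ms


def load_feed():
    # Hypothetical stand-in for a real request to a feed service.
    time.sleep(0.01)
    return ["post-1", "post-2"]


feed, latency_ms = measure_latency_ms(load_feed)
print(f"Feed loaded in {latency_ms:.1f} ms")
```

The number you get is the latency of that one call, as seen by the caller.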

Memory trick: Latency = How fast?

2. Throughput

Throughput is the number of requests a system can handle in a given period, usually measured in Requests Per Second (RPS).

Example

An e-commerce site receives 10,000 requests/second. If the servers successfully process all of them, the throughput is 10,000 RPS.

Latency is about one request; throughput is about volume. A system can have low latency but low throughput, or high throughput with higher latency — they're different dimensions.
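One way to see why these are different dimensions is Little's Law, which says that in a steady state, throughput is roughly the number of requests in flight divided by the latency of each one. The sketch below uses made-up numbers purely for illustration.

```python
def throughput_rps(concurrent_requests, latency_seconds):
    """Steady-state throughput from Little's Law: in-flight requests / latency."""
    return concurrent_requests / latency_seconds


# Fast but narrow: one request at a time, 10 ms each.
low_latency_low_throughput = throughput_rps(1, 0.010)

# Slower but wide: 1,000 requests in flight, 200 ms each.
higher_latency_high_throughput = throughput_rps(1000, 0.200)

print(f"{low_latency_low_throughput:.0f} RPS vs {higher_latency_high_throughput:.0f} RPS")
```

The second system answers each user more slowly, yet handles far more requests per second because it works on many of them at once.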

Memory trick: Throughput = How many?

Part 2 — Can It Grow? (Scalability)

3. Scalability

Scalability isn't just "can it handle more users" — it's whether the system can handle growth without a proportional explosion in cost, complexity, or downtime.

The ideal is linear scaling:

Double the resources → double the capacity
Triple the resources → triple the capacity

Real systems rarely hit that ideal perfectly, but the closer they get, the more scalable they are.
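A simple way to put a number on "how close to linear" is to compare the capacity you actually got with the capacity linear scaling would have given you. This is a rough sketch; the function name and the RPS figures are hypothetical.

```python
def scaling_efficiency(base_capacity, base_servers, new_capacity, new_servers):
    """Return actual capacity divided by the capacity ideal linear scaling predicts."""
    ideal_capacity = base_capacity * (new_servers / base_servers)
    return new_capacity / ideal_capacity


# Hypothetical measurement: 2 servers handle 1,000 RPS, 4 servers handle 1,800 RPS.
efficiency = scaling_efficiency(1000, 2, 1800, 4)
print(f"Scaling efficiency: {efficiency:.0%}")
```

A value of 1.0 means perfectly linear scaling; anything lower means some capacity was lost to coordination or shared bottlenecks.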

Example

A food-delivery app starts with 100 users and grows to 1,000,000 users. If it keeps serving users efficiently — without slowing down, crashing, or becoming wildly expensive — it is scalable.

There are two primary ways to scale a system.

Vertical scaling (scale up) — make one machine stronger

Instead of adding more servers, you upgrade the existing one with more CPU cores, more RAM, faster SSDs, and better network bandwidth.

Why teams start here

Vertical scaling is usually the first strategy because it's simple — no architecture changes, no distributed-system complexity, no load balancing, and data stays in one place (so consistency is easy to maintain).

Advantages: easy to implement, minimal code changes, simpler operations, strong consistency.

Disadvantages: hardware has a hard limit, high-end machines get disproportionately expensive, upgrades may require downtime, and it's still a single point of failure — if the one machine crashes, the whole app goes down.

Analogy: a delivery business with one rider. Instead of hiring more riders, you give that rider a bigger, faster vehicle. More powerful — but there's still only one rider.

Horizontal scaling (scale out) — add more machines

Instead of making one machine larger, you add more machines and spread the workload across them. When traffic grows, you add more servers, containers, or instances, so capacity grows with the number of machines.

Advantages: nearly unlimited scalability, better fault tolerance, high availability, cheaper commodity hardware, easier incremental growth.

Disadvantages: more architectural complexity, requires load balancing, data consistency becomes harder, and monitoring/deployment get more involved.

Analogy: instead of buying one giant truck, a delivery company hires many riders. As orders increase, more riders are added.

So which one should you use? It depends on your application

This is where most beginners go wrong. They assume horizontal scaling is always better. It isn't — the right choice depends on what your application needs, and especially on whether your application is stateful or stateless.

Stateless — the server remembers nothing between requests. Every request carries everything it needs, so any server can handle any request.

Scales out (horizontal) easily: add servers behind a load balancer, and it doesn't matter which one a user lands on.
Example: an API that reads a request, talks to a shared database, and returns a result.

Stateful — the server stores data about the user in its own memory between requests (like a login session kept on Server 1).

Hard to scale out: if the next request lands on Server 2, the user's state isn't there.
Example: a server that keeps your shopping-cart session in its local memory.

So the rule a beginner should remember:

Stateless apps scale out beautifully — which is why teams work hard to keep their app servers stateless.
Stateful components are the hard part. Databases are inherently stateful, which is why scaling them (through replication and sharding) is a whole separate challenge.

The common fix is to move state out of the server: instead of keeping sessions in a server's memory, store them in a shared place like a cache (Redis) or a database. Now the app servers are stateless again and can scale horizontally freely.
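Here is a minimal sketch of that fix. `SessionStore` is an in-memory stand-in for a shared store such as Redis (an assumption for illustration, not a real Redis client), and `AppServer` is a hypothetical app server that keeps no user data of its own.

```python
class SessionStore:
    """In-memory stand-in for a shared store such as Redis (illustration only)."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        return self._data.get(session_id, [])

    def set(self, session_id, cart):
        self._data[session_id] = cart


class AppServer:
    """Stateless server: all per-user data lives in the shared store."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def add_to_cart(self, session_id, item):
        cart = self.store.get(session_id) + [item]
        self.store.set(session_id, cart)
        return cart


shared = SessionStore()
server_1 = AppServer("server-1", shared)
server_2 = AppServer("server-2", shared)

# The first request lands on server 1, the second on server 2.
server_1.add_to_cart("user-42", "pizza")
cart = server_2.add_to_cart("user-42", "soda")
print(cart)
```

Because neither server holds the cart itself, it doesn't matter which one the load balancer picks for the next request.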

Choosing based on requirements

If you need...	Lean toward
Simplicity, small/moderate load, a quick start	Vertical (scale up)
Massive growth, fault tolerance, high availability	Horizontal (scale out)

In practice, real systems do both: scale the stateless app servers horizontally, and handle the stateful database layer carefully — vertical first, then replication and sharding as it grows.

Memory trick: Scalability = Can it grow? Up or out? — and "out" only works cleanly if the app is stateless.

Part 3 — Is It Correct and Trustworthy? (Correctness Guarantees)

4. Availability

Availability measures whether users can reach the system when they need it.

Example

Website opens successfully ✅   → available
503 Service Unavailable    ❌   → unavailable
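Availability is usually reported as the percentage of time the system was reachable. The sketch below shows the arithmetic, using hypothetical helper names and a 30-day month as the example window.

```python
def availability_percent(uptime_seconds, total_seconds):
    """Share of the time window during which the system was reachable."""
    return 100 * uptime_seconds / total_seconds


def allowed_downtime_minutes_per_year(availability_pct):
    """Downtime budget per year for a given availability target."""
    minutes_per_year = 365 * 24 * 60
    return minutes_per_year * (1 - availability_pct / 100)


# A 30-day month with 2,592 seconds (43.2 minutes) of downtime.
month_seconds = 30 * 24 * 60 * 60
monthly = availability_percent(month_seconds - 2592, month_seconds)
yearly_budget = allowed_downtime_minutes_per_year(99.9)

print(f"Availability: {monthly:.1f}%")
print(f"99.9% allows about {yearly_budget:.0f} minutes of downtime per year")
```

This is why each extra "nine" is hard to earn: every step up cuts the downtime budget to a tenth of what it was.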

Memory trick: Availability = Is it up, and how often?

5. Reliability

Reliability means the system consistently does the correct thing.

Example

You transfer ₹1000:

Sender   −₹1000
Receiver +₹1000

If this happens correctly every single time, the system is reliable. (Availability asks "is it up?" — reliability asks "did it do the right thing while it was up?")

Memory trick: Reliability = Is it correct?

6. Consistency

Consistency means every read returns the most recent write — everyone sees the same, up-to-date data.

Example

You update your profile picture. A consistent system guarantees that the next read on any device returns the new picture, not the old one.

Phone  → new picture
Laptop → new picture
Tablet → new picture

The weaker alternative is eventual consistency, where updates propagate over time: different users might briefly see different versions, but all copies converge on the same value once the updates have spread.
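The difference is easier to see in a toy model. The sketch below assumes one primary copy and one replica, with replication triggered by hand; real databases do this automatically and with far more care.

```python
class ReplicatedProfile:
    """Toy model: one primary copy, one replica, replication applied explicitly."""

    def __init__(self):
        self.primary = "old picture"
        self.replica = "old picture"
        self._pending = []

    def write_strong(self, value):
        # Strong: update every copy before the write is acknowledged.
        self.primary = value
        self.replica = value

    def write_eventual(self, value):
        # Eventual: acknowledge after the primary, replicate later.
        self.primary = value
        self._pending.append(value)

    def replicate(self):
        # Apply queued writes to the replica in order.
        for value in self._pending:
            self.replica = value
        self._pending.clear()


profile = ReplicatedProfile()
profile.write_eventual("new picture")
stale_read = profile.replica   # the replica has not caught up yet
profile.replicate()
fresh_read = profile.replica   # now both copies agree

print(stale_read, "->", fresh_read)
```

With `write_strong`, a read from the replica would return the new picture immediately, at the cost of waiting for every copy on each write.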

Memory trick: Consistency = Same, up-to-date data everywhere?

7. Durability

Durability guarantees that once data is successfully saved, it will not be lost — even if the system crashes immediately after.

Example

Payment successful ✅
       ↓
Server crashes ❌

After recovery, the payment record must still be there.
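At the level of a single machine, one common ingredient of durability is forcing data to disk before reporting success. The sketch below is an assumption-laden illustration, not how any particular database does it: the file name, record format, and function names are all hypothetical, and real systems add write-ahead logs and replication on top.

```python
import json
import os
import tempfile


def record_payment(path, payment):
    """Append a payment and force it to disk before reporting success."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(json.dumps(payment) + "\n")
        log.flush()             # push Python's buffer to the operating system
        os.fsync(log.fileno())  # ask the OS to write it to the storage device
    return "success"


def recover_payments(path):
    """Read back every payment that was recorded, e.g. after a restart."""
    with open(path, encoding="utf-8") as log:
        return [json.loads(line) for line in log]


log_path = os.path.join(tempfile.mkdtemp(), "payments.log")
status = record_payment(log_path, {"id": "pay-1", "amount": 1000})
recovered = recover_payments(log_path)
print(status, recovered)
```

The key point is the order: the data reaches storage first, and only then does the caller hear "success".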

Memory trick: Durability = Saved means saved forever.

8. Idempotency

Idempotency means performing the same operation multiple times produces the same result as doing it once.

Example

A payment request times out, so the app retries it. With idempotency, the user is charged once, not twice: the server recognizes the duplicate request and does not apply it again.

This matters wherever retries happen, and in distributed systems retries happen constantly.
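A common way to get this behavior is an idempotency key: the client sends the same unique key with every retry, and the server remembers the result for each key. Here is a minimal in-memory sketch; `PaymentService` and the key format are hypothetical.

```python
class PaymentService:
    """Toy payment service that deduplicates requests by idempotency key."""

    def __init__(self):
        self._results = {}  # idempotency key -> result of the first attempt
        self.charges = []   # every charge that was actually applied

    def charge(self, idempotency_key, user, amount):
        # A repeated key means a retry: return the stored result, charge nothing.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.charges.append((user, amount))
        result = {"status": "charged", "user": user, "amount": amount}
        self._results[idempotency_key] = result
        return result


service = PaymentService()
first = service.charge("order-123", "alice", 1000)
retry = service.charge("order-123", "alice", 1000)  # same key, e.g. after a timeout

print(len(service.charges), first == retry)
```

The retry gets the same response as the first attempt, but only one charge is recorded.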

Memory trick: Idempotency = Do it again, same result.

Part 4 — How Systems Survive Failure (Resilience)

These four concepts are closely related, so it helps to see them as one story instead of four separate ideas. In any large system something is always failing — a disk, a server, a network link. Resilience is how the system keeps running anyway.

9. Redundancy — the technique

Redundancy means keeping backup resources so no single component is irreplaceable.

Instead of:   1 Database
You keep:     Primary Database  +  Backup Database

If the primary fails, the backup takes over.

Memory trick: Redundancy = Always have a spare.

10. Fault Tolerance — the property redundancy gives you

Fault tolerance is the system's ability to keep working when a component fails.

Server 1 ❌ crashes
Server 2 ✅ keeps serving users

Redundancy is what you build; fault tolerance is what you get.
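The relationship can be sketched in a few lines: the spare database is the redundancy, and the failover logic is what turns it into fault tolerance. The `Database` class and its `healthy` flag are hypothetical stand-ins for real connections and health checks.

```python
class Database:
    """Toy database that can be marked as down."""

    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def query(self, sql):
        if not self.healthy:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} answered: {sql}"


def query_with_failover(databases, sql):
    """Try each database in order and return the first successful answer."""
    last_error = None
    for db in databases:
        try:
            return db.query(sql)
        except ConnectionError as error:
            last_error = error
    raise RuntimeError("all databases are down") from last_error


primary = Database("primary", healthy=False)  # simulate a crashed primary
backup = Database("backup")
answer = query_with_failover([primary, backup], "SELECT 1")
print(answer)
```

If both databases were down, the call would fail, which is exactly the situation redundancy is meant to make rare.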

Memory trick: Fault tolerance = Can it survive a failure?

11. Single Point of Failure (SPOF) — what happens without redundancy

A SPOF is any component whose failure takes down the entire system.

Users → Server → Database

If there's only one server and it crashes, the whole app goes down. That server is a SPOF. Redundancy exists precisely to eliminate SPOFs.

Memory trick: SPOF = One failure = system down.

12. High Availability (HA) — the goal

High availability means the system is deliberately designed to stay up, even during failures, by combining redundancy and fault tolerance.

Example

Server A ❌ fails
Server B ✅ takes over

Users keep streaming, shopping, or chatting without noticing anything happened.

Note the difference from plain availability: availability is the measurement (how much uptime you actually got), while high availability is the design choice (engineering the system so that uptime stays high).
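A quick calculation shows why redundancy pays off. The sketch below assumes servers fail independently of each other, which is a simplification (real failures are often correlated), and the 99% figure is just an example.

```python
def combined_availability(single_availability, copies):
    """Probability that at least one of `copies` independent servers is up."""
    return 1 - (1 - single_availability) ** copies


one_server = combined_availability(0.99, 1)
two_servers = combined_availability(0.99, 2)

print(f"1 server:  {one_server:.2%}")
print(f"2 servers: {two_servers:.2%}")
```

Under that assumption, the system is only down when every copy is down at the same time, so each extra server shrinks the expected downtime sharply.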

Memory trick: HA = Designed to stay online through failures.

Part 5 — Distributed Systems Trade-offs

13. Partition Tolerance

A network partition happens when servers in a distributed system can't talk to each other. Partition tolerance is the ability to keep operating when that happens.

Server A  ❌✕❌  Server B
   (network link broken)

In a system spread across machines and regions, partitions are not an "if" — they're a "when."

Memory trick: Partition tolerance = Can it survive a network split?

14. The CAP Theorem (tying it together)

You've now met three properties: Consistency, Availability, and Partition Tolerance. The CAP theorem is the famous rule that connects them.

During a network partition, a distributed system can guarantee only two of the three — and since partitions will happen, you're really choosing between Consistency and Availability.

Choose Consistency (CP): refuse requests that might return stale data. Safer, but some users get errors during a partition. (Example: banking systems.)
Choose Availability (AP): keep answering requests even if some data is temporarily out of date. (Example: social media feeds.)

There's no universally "correct" choice — it depends on what your application can tolerate.
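The trade-off can be shown with a toy node that is cut off from its peers. This is only an illustration of the two behaviors, not a real replication protocol; the `Node` class and its `mode` flag are hypothetical.

```python
class Node:
    """Toy node that behaves as CP or AP while partitioned from its peers."""

    def __init__(self, mode):
        self.mode = mode          # "CP" or "AP"
        self.value = "v1"         # last value this node knows about
        self.partitioned = False  # True when it cannot reach the other nodes

    def read(self):
        if self.partitioned and self.mode == "CP":
            # CP: refuse rather than risk returning stale data.
            raise RuntimeError("unavailable: cannot confirm the latest value")
        # AP: answer with local data, which may be out of date.
        return self.value


cp_node = Node("CP")
ap_node = Node("AP")
cp_node.partitioned = True
ap_node.partitioned = True

ap_answer = ap_node.read()
try:
    cp_answer = cp_node.read()
except RuntimeError as error:
    cp_answer = f"error: {error}"

print(ap_answer, "|", cp_answer)
```

Both nodes behave identically while the network is healthy; the choice only shows up once a partition occurs.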

Memory trick: CAP = Pick two; in a partition, C or A.

Part 6 — Living With the System Over Time

15. Maintainability

Maintainability measures how easy a system is to understand, modify, and operate.

Example

A new developer joins and can quickly understand the codebase, fix bugs, add features, and deploy updates. That system is maintainable.

Memory trick: Maintainability = Easy to change.

16. Performance (the big picture)

Performance isn't a single metric. It's the overall measure of how efficiently the system works, pulling together several of the ideas above: latency, throughput, and scalability, along with resource usage.

Example

Website A: page loads in 5 seconds
Website B: page loads in 500ms

Website B performs better. Performance is the lens you use to judge the whole system, which is why it sits last.

Memory trick: Performance = How well does it all work together?

Quick Revision Table

Concept	Simple Question
Latency	How fast?
Throughput	How many?
Scalability	Can it grow (up or out)?
Availability	Is it up, and how often?
Reliability	Is it correct?
Consistency	Same, up-to-date data everywhere?
Durability	Will saved data stay saved?
Idempotency	Same result if repeated?
Redundancy	Is there a spare?
Fault Tolerance	Can it survive a failure?
Single Point of Failure	What can bring everything down?
High Availability	Designed to stay online through failures?
Partition Tolerance	Can it survive a network split?
CAP Theorem	Consistency or availability during a partition?
Maintainability	Is it easy to change?
Performance	How well does it all work together?

Conclusion

Every large-scale application, from Instagram and Netflix to Amazon, WhatsApp, and Google, is built on these core System Design principles.

What's important to understand is that these concepts don't exist in isolation. They work together to create systems that are fast, reliable, scalable, and resilient.

For example:

Redundancy helps achieve Fault Tolerance.
Fault Tolerance improves High Availability.
Latency and Throughput influence overall Performance.
Consistency, Availability, and Partition Tolerance play a key role in distributed systems.

Stay tuned for more learning notes from Noob Diaries! 🚀
