CAP Theorem
CAP stands for Consistency, Availability and Partition Tolerance.
It states that in a distributed system, during a network partition, we must choose between consistency and availability.
Why not give up partition tolerance instead of the other two? Because network partitions are inevitable in a distributed system, partition tolerance is mandatory, not optional. And if partition tolerance is mandatory, we are left choosing between consistency and availability.
A few things we must understand:
- Here, availability does not simply mean that the server is up or that we get some HTTP response or error. It means every request to a non-failing node must receive a non-error response.
- Partition tolerance means that our system must tolerate a network partition and not crash or fail completely because of it.
So, to summarize the CAP theorem: if a network partition happens between nodes in a distributed system,
- Either the nodes accept requests on both sides of the partition and respond based on possibly inconsistent data (inconsistent because, with the network partitioned, the DB nodes are no longer in sync),
- Or the nodes refuse requests on one or both sides of the partition to ensure consistency.
Now let's understand the CAP theorem in more detail.
If we have a single instance of a DB, do we need the CAP theorem?
The answer is no: with a single instance there is no network partition to worry about, because there is only one node, which is either available or not. As long as that single node does not fail, it provides both consistency and availability.
This means the CAP theorem only applies to distributed systems, where we have multiple nodes and data replicated across them over a network.
Now let's say we have a cluster of distributed DB nodes, and let's walk through the different scenarios that are possible during a network partition.
Let's say a network partition happens. It breaks coordination between the nodes, so the nodes are no longer in sync.
Consider the case where we have made our system capable of tolerating the partition. Requests will now arrive at nodes on both sides of the partition, and how the system responds depends on our design.
- If a node accepts the request and responds based on its local (possibly stale or outdated) data, the system is Available and Partition tolerant; it ensures the AP of CAP.
- If a node rejects the request, the system ensures Consistency (it refuses to serve inconsistent results) and Partition tolerance; it ensures the CP of CAP.
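The two choices above can be sketched as a toy model of one node's read path. This is purely illustrative, not real database code; `Node`, `policy`, and `in_sync` are hypothetical names for this sketch.

```python
# Toy model of one node's read path during a network partition.
# "AP" and "CP" are policies the designer picks, as described above.

class Node:
    def __init__(self, policy, data, in_sync):
        self.policy = policy      # "AP" or "CP" (hypothetical flag)
        self.data = data          # local (possibly stale) copy of the data
        self.in_sync = in_sync    # False while a partition cuts replication

    def read(self, key):
        if self.in_sync:
            return self.data[key]     # normal case: nodes are in sync
        if self.policy == "AP":
            return self.data[key]     # AP: serve possibly stale data
        raise RuntimeError("unavailable: refusing possibly stale read (CP)")

ap_node = Node("AP", {"stock": 5}, in_sync=False)
cp_node = Node("CP", {"stock": 5}, in_sync=False)

print(ap_node.read("stock"))   # AP node answers, even if the value is stale
try:
    cp_node.read("stock")
except RuntimeError as e:
    print(e)                   # CP node rejects the request instead
```

The same data and the same partition, but two different designs: one trades consistency for availability, the other the reverse.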
- So far we have seen the two scenarios, CP and AP. Why do we not ensure CA? Because if we tried to ensure CA and gave up partition tolerance, the whole system could go down whenever a network partition occurred. CA only applies where network partitions cannot happen, or where the system chooses to shut down entirely during one.
But the important point is that partitions are inevitable in a distributed system, so partition tolerance is not a choice; it is mandatory. That is why, according to the CAP theorem, during a network partition we must keep partition tolerance and give up either consistency or availability.
Let's discuss a misunderstanding I had earlier. I used to think that the CAP theorem is only useful for deciding which type of database to choose for an application. However, this is not a correct understanding. The CAP theorem's main purpose is to describe the behavior of a distributed system when a network partition occurs. Database selection involves many factors, and CAP is only one of them.
If we are using a single-node database, CAP does not apply because there is no distributed replication and therefore no partition scenario. In that case, we choose the database based on transaction guarantees such as ACID or BASE, performance, and use case requirements.
CAP theorem applies when we use distributed systems with replicated state, such as database clusters. It helps us understand how the system behaves during network partitions — whether it sacrifices availability to maintain consistency, or allows inconsistency to maintain availability.
Different databases may lean toward CP or AP behavior depending on their design and configuration. Therefore, CAP helps evaluate distributed database behavior, but it does not rigidly classify databases as strictly consistent or strictly available.
Whenever we design a system, CAP is not limited to the DB. Our architecture may include a group of service nodes (microservices), a group of DB nodes, cache nodes, and a messaging system. So where and how does the CAP theorem apply here?
CAP is not applied to an entire application as a single unit. It applies to a distributed subsystem that replicates shared state and can experience network partition.
Examples of subsystems:
- Database cluster
- Cache cluster
- Search cluster
- Distributed messaging system
- Replicated microservices sharing state
We often talk about CAP in terms of features (search, payment, inventory), but CAP applies to the infrastructure that implements a feature, not to the feature itself. The point to note is that each feature depends on a specific distributed subsystem.
So let's say we want to build search functionality. We must decide: during a network partition, should this functionality continue serving requests (Availability) or reject them to preserve correctness (Consistency)? Then we select the subsystem type accordingly. For search we usually prefer availability and may sacrifice consistency, so we might use an Elasticsearch cluster, which provides high availability and partition tolerance.
If we are building inventory or payment functionality, we will prefer consistency over availability, and accordingly select a subsystem type that provides stronger consistency during a partition, such as a relational database cluster.
So, CAP is NOT a business-level decision. It is a distributed-system behavior decision. But business requirements drive which behavior you need.
CAP applies to each distributed subsystem independently. It does not apply across DB + Cache + Search together as one CAP unit.
Let's understand how the CAP theorem plays out in a Postgres DB cluster and in Kafka. Please note that the explanation below is high level and does not deep dive into the actual implementations; it is just meant to show the CAP theorem in use.

Postgres DB cluster
- Postgres is a relational database and follows ACID properties.
- A Postgres cluster follows a primary/replica architecture. Among the Postgres nodes, one acts as the primary and the others act as its replicas. The primary supports both reads and writes, whereas replicas support reads only.
Postgres also supports two types of replication, async and sync. Let's understand these, because they help explain how Postgres ensures the CP of CAP.
- Whenever a write request reaches the primary node, it writes the change to the WAL (Write-Ahead Log). The WAL is shipped to the replicas, and each replica replays it to stay in sync with the primary. This is how writes work in a Postgres cluster.
- Coming back to sync vs. async replication: the only difference is when the primary considers the transaction committed, right after writing to the WAL, or only after a replica has received it and acknowledged the primary.
async replication:
- Primary writes to WAL
- Commits the change
- Ships the WAL to replicas later

sync replication:
- Primary writes to WAL
- Ships the WAL to the replica
- Replica applies the WAL to stay in sync with the primary
- Replica sends an ACK
- Primary commits the change
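The difference in when the primary commits can be sketched as a toy simulation. This is not real Postgres internals; `Primary`, `Replica`, and `receive_wal` are hypothetical names for illustration only.

```python
# Toy sketch of when the primary considers a transaction committed
# under async vs. sync replication (illustrative, not Postgres code).

class Replica:
    def __init__(self, reachable=True):
        self.reachable = reachable   # False simulates a network partition
        self.wal = []

    def receive_wal(self, record):
        if not self.reachable:
            raise TimeoutError("replica unreachable")  # WAL never arrives
        self.wal.append(record)
        return "ACK"

class Primary:
    def __init__(self, replica, mode):
        self.replica = replica
        self.mode = mode             # "async" or "sync"
        self.wal = []
        self.committed = []

    def write(self, record):
        self.wal.append(record)                     # 1. write to WAL
        if self.mode == "async":
            self.committed.append(record)           # 2. commit immediately
            # WAL is shipped to the replica later, in the background
        else:                                       # sync replication
            ack = self.replica.receive_wal(record)  # 2. ship WAL, wait
            assert ack == "ACK"
            self.committed.append(record)           # 3. commit after ACK

healthy = Primary(Replica(reachable=True), mode="sync")
healthy.write("tx1")   # sync commit succeeds because the replica ACKed
```

With `reachable=False`, the sync primary's `write` raises a timeout while the async primary still commits, which mirrors the partition behavior described next.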
When a network partition happens between the primary and replica nodes:

If a write or read request reaches the primary node, then:
- If the cluster uses async replication, the primary accepts the request, writes to the WAL, and responds.
- If the cluster uses sync replication, the request may fail: the primary waits for a replica acknowledgement, and if no replica acknowledges, it waits for some time and then returns a timeout error.

If a write or read request reaches a replica node, the replica still allows read operations even during the partition. However, those reads may be stale, since replication from the primary is interrupted.
Now let's say the primary node also fails:
- In that case, if we have set up a failover mechanism to promote a replica to primary, the cluster again behaves as discussed earlier; otherwise no write requests are accepted. One more point to note: Postgres does not have any default failover mechanism; we must configure it explicitly.
- This is how Postgres ensures CP: consistency and partition tolerance over availability.
Kafka
Kafka architecture consists of multiple brokers forming a cluster.
⇒ A topic is a logical stream of records.
⇒ Each topic is divided into one or more partitions, which are the physical append-only logs stored on brokers.
⇒ Partitions are distributed across brokers for scalability and fault tolerance.
⇒ Replication in Kafka happens at the partition level, not at the topic level.
For example:
- Brokers: B1, B2, B3
- Topic: orders
- Partitions: 1 (orders-0)
- Replication factor = 3
In this case:
Roles for orders-0:
- B1 ⇒ Leader
- B2 ⇒ Follower
- B3 ⇒ Follower

Important clarification: B2 and B3 do not contain different partitions. They contain replicas of the same partition (orders-0).
Each partition has:
- Exactly one Leader
- Zero (in the case of a single broker) or more Followers
ISR (In-Sync Replicas)
Kafka maintains an ISR (In-Sync Replica set) for each partition.
Initially:
ISR = {Leader, all fully caught-up followers}

Only replicas that are fully synchronized with the leader remain in the ISR. If a follower lags beyond a threshold, it is removed from the ISR.
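The pruning rule can be sketched as follows. This is a toy model, not broker code: the function name and the millisecond values are made up, though in real Kafka the threshold is governed by the broker setting `replica.lag.time.max.ms`.

```python
# Toy sketch of ISR maintenance: a follower that has not caught up with
# the leader within the lag threshold is dropped from the ISR.
# (Illustrative only; real Kafka uses replica.lag.time.max.ms.)

LAG_THRESHOLD_MS = 30_000  # assumed threshold for this sketch

def update_isr(leader, followers, last_caught_up_ms, now_ms):
    """Return the new ISR: the leader plus followers caught up recently enough."""
    isr = {leader}
    for f in followers:
        if now_ms - last_caught_up_ms[f] <= LAG_THRESHOLD_MS:
            isr.add(f)
    return isr

now = 100_000
last_caught_up = {"B2": 95_000, "B3": 40_000}  # B3 last caught up 60s ago
print(sorted(update_isr("B1", ["B2", "B3"], last_caught_up, now)))
# B2 stays in the ISR; B3 is lagging and gets removed
```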
Write Flow
When a producer sends a message:
- Producer sends message to the Leader of the partition.
- Leader appends message to its log.
- Followers replicate the message.
- If the producer uses acks=all, the Leader waits for acknowledgments from all replicas in the ISR.
- Once acknowledged, the write is considered committed.
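The commit rule under acks=all can be sketched as a small function. This is a simplified model of the rule described above, not the real broker protocol; `produce` and its parameters are hypothetical.

```python
# Toy sketch of the acks=all commit rule: the leader treats a write as
# committed only once every replica currently in the ISR has ACKed it.

def produce(leader_log, isr, ack_from, record):
    """Append to the leader log; report committed only if all ISR members ACKed."""
    leader_log.append(record)
    missing = isr - ack_from
    if missing:
        return f"uncommitted: waiting for ACKs from {sorted(missing)}"
    return "committed"

isr = {"B1", "B2", "B3"}
print(produce([], isr, {"B1", "B2", "B3"}, "order-42"))  # all ACKed
print(produce([], isr, {"B1", "B2"}, "order-43"))        # still waiting on B3
```

Note that "all replicas" here means all ISR members, not all replicas of the partition; a follower already dropped from the ISR does not block the commit.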
Network Partition Scenario
Now suppose a network partition occurs.
Example (Replication factor = 3):
Side A → Leader only
Side B → 2 Followers
Kafka uses majority (quorum) for leader election.
Majority required = 2 out of 3.
If the Leader is isolated and cannot communicate with majority:
- It steps down.
- The majority side (2 followers) elects a new leader.
- The majority side continues accepting writes.
- Minority side becomes unavailable for writes.
Equal Split Case
If total replicas = 4 and partition splits 2 vs 2:
Majority required = 3.
Neither side has majority.
Result:
- No leader can be elected.
- Writes are rejected.
- Both sides of the partition become unavailable.
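Both scenarios above reduce to one quorum check, which can be sketched in a few lines (an illustrative helper, not Kafka code):

```python
# Toy sketch of the quorum rule used above: a side of the partition can
# elect a leader only if it holds a strict majority of the replicas.

def can_elect_leader(side_replicas, total_replicas):
    return side_replicas > total_replicas // 2

# Replication factor 3, split 1 vs 2: only the 2-replica side has quorum.
print(can_elect_leader(1, 3))   # False (the isolated old leader steps down)
print(can_elect_leader(2, 3))   # True  (the two followers elect a new leader)

# Replication factor 4, split 2 vs 2: neither side has quorum (needs 3).
print(can_elect_leader(2, 4))   # False
```

This is why an even replica count handles an equal split poorly: a strict majority of 4 is 3, so a 2-vs-2 split leaves both sides leaderless.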
CAP Interpretation
Kafka always tolerates partitions (P).
During a partition, only the majority side can elect a leader; the minority side becomes unavailable.
Therefore, Kafka prioritizes:
- ✅ Consistency
- ✅ Partition Tolerance
- ❌ Availability (in minority partitions)
Kafka behaves as a CP system per partition.
Thank you for taking the time to read this article. I hope you found it insightful and informative. If you come across any inaccuracies, please let me know in the comments.


