# Overview
Database replication means keeping a copy of the same data on multiple nodes.
Database replication is used to: #flashcard
- Reduce [[latency]] by bringing data closer to users
- Improve [[Availability]] via [[Redundancy]]
- Improve [[Performance]] by scaling out the number of machines that can be used to serve read queries
- Improve [[Fault Tolerance]] against network interruptions via [[Redundancy]]
<!--ID: 1751507776638-->
# Key Considerations
## Replication Configuration Parameters
Database replication is a large, heavily researched topic. Replication implementations in different databases can look very similar on the surface while differing in subtle ways. Some of the areas where different approaches can be applied are listed in the sections below.
### Strategies for Cascading Writes #flashcard
- [[Single-leader Replication]]
- [[Multi-leader Replication]]
- [[Leaderless Replication]]
<!--ID: 1751507776640-->
### Strategies for Completing a Replication #flashcard
- [[Synchronous Data Replication]]
- [[Asynchronous Data Replication]]
<!--ID: 1752260451479-->
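
The difference between the two strategies comes down to when the leader acknowledges a write. The sketch below is a minimal in-memory illustration, not how any particular database implements it; the `Leader` and `Follower` classes, the `pending` queue, and `flush` are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Follower:
    """Hypothetical in-memory follower that stores replicated key/value pairs."""
    data: dict = field(default_factory=dict)

    def apply(self, key, value):
        self.data[key] = value


@dataclass
class Leader:
    """Hypothetical leader; `pending` stands in for an asynchronous replication queue."""
    followers: list
    data: dict = field(default_factory=dict)
    pending: list = field(default_factory=list)

    def write_sync(self, key, value):
        # Synchronous: apply to every follower before acknowledging the client.
        self.data[key] = value
        for follower in self.followers:
            follower.apply(key, value)
        return "ack"

    def write_async(self, key, value):
        # Asynchronous: acknowledge immediately and replicate later.
        self.data[key] = value
        self.pending.append((key, value))
        return "ack"

    def flush(self):
        # Simulates background replication catching up with queued writes.
        while self.pending:
            key, value = self.pending.pop(0)
            for follower in self.followers:
                follower.apply(key, value)
```

After `write_async` returns, a read from a follower can miss the new value until `flush` runs, which is the replication lag discussed in the next section. `write_sync` avoids that at the cost of waiting on every follower.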
### Strategies for Communicating Changes to Replicate #flashcard
For leader-based replication, there needs to be a method for communicating updates to the followers:
- [[Statement-based Replication]]
- [[Write Ahead Log (WAL) Shipping Replication]]
- [[Logical (row-based) log replication]]
- [[Trigger-based Replication]]
<!--ID: 1752260451482-->
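
One reason the choice of method matters: replaying a nondeterministic statement on a follower can produce a different result than it did on the leader, while shipping the resulting row values cannot. The sketch below illustrates this under simplifying assumptions. "Statements" are plain Python callables, each node's seeded random generator stands in for node-local nondeterminism such as `RAND()` or `NOW()`, and all class and function names are hypothetical.

```python
import random


class Node:
    """Hypothetical node whose 'statements' are plain Python callables."""

    def __init__(self, seed):
        self.rng = random.Random(seed)  # stands in for node-local nondeterminism
        self.rows = {}

    def execute(self, statement):
        # Run a statement locally and return the row changes it produced.
        changes = statement(self)
        self.rows.update(changes)
        return changes

    def apply_rows(self, changes):
        # Apply already-computed row values without re-running anything.
        self.rows.update(changes)


def set_random_token(node):
    # Comparable to `UPDATE ... SET token = RAND()`: the result depends on the node.
    return {"token": node.rng.random()}


def replicate_statement(leader, follower, statement):
    # Statement-based: the follower re-runs the statement itself.
    leader.execute(statement)
    follower.execute(statement)


def replicate_rows(leader, follower, statement):
    # Logical (row-based): the follower receives the resulting values.
    changes = leader.execute(statement)
    follower.apply_rows(changes)
```

With differently seeded nodes, `replicate_statement` leaves the leader and follower holding different tokens, while `replicate_rows` keeps them identical.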
## Replication Lag #flashcard
The delay in replicating data, or replication lag, can cause some strange phenomena for clients reading data. To limit these issues, a database can aim to provide the following guarantees:
- **Read-after-write consistency** - Users should always see data that they submitted themselves.
- **Monotonic reads** - After users have seen the data at one point in time, they shouldn’t later see the data from some earlier point in time.
    - One way to achieve this is session stickiness: routing each user's reads to the same replica
- **Consistent prefix reads** - Users should see the data in a state that makes causal sense: for example, seeing a question and its reply in the correct order.
<!--ID: 1751507776645-->
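
The first two guarantees can be approximated in the read path. The following is a minimal sketch, assuming replication lag stays below a known window: a user's reads go to the leader for a short period after their own write (read-after-write), and otherwise to a replica chosen by hashing the user ID so the same user keeps hitting the same replica (monotonic reads). `ReplicaSelector`, the node names, and `lag_window_seconds` are hypothetical and chosen for illustration.

```python
import hashlib
import time


class ReplicaSelector:
    """Hypothetical selector; node names and the lag window are illustrative."""

    def __init__(self, leader, replicas, lag_window_seconds=1.0, clock=time.monotonic):
        self.leader = leader
        self.replicas = replicas
        self.lag_window_seconds = lag_window_seconds
        self.clock = clock
        self.last_write_at = {}  # user_id -> time of that user's last write

    def record_write(self, user_id):
        self.last_write_at[user_id] = self.clock()

    def node_for_read(self, user_id):
        # Read-after-write: shortly after a user's own write, read from the leader.
        last_write = self.last_write_at.get(user_id)
        if last_write is not None and self.clock() - last_write < self.lag_window_seconds:
            return self.leader
        # Monotonic reads: hash the user ID so a user always reads from the same replica.
        digest = hashlib.sha256(str(user_id).encode()).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.replicas)
        return self.replicas[index]
```

A fixed time window is only a heuristic. If actual lag exceeds it, a user can still read stale data, and if a sticky replica fails, the user has to be moved to another replica that may be further behind.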
# Implementation Details
## Routing Queries to Write vs Read Nodes #flashcard
1. At the application level: choose a write or a read connection based on the query being issued
    1. Adds complexity and is difficult to maintain across services
2. Using an [[Object Relational Mappings (ORMs)]] or database library: some libraries support read-replica-aware connection pools (including Django, SQLAlchemy, and Prisma)
3. Using a proxy layer (PgBouncer, ProxySQL, Citus)
    - Provides centralized control and clean application code, but adds a potential bottleneck
<!--ID: 1752260451485-->
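
Application-level routing can be sketched as a small function that inspects each query. This is a deliberately naive illustration, not a production router. `QueryRouter` and the node names are hypothetical, and classifying a query by its leading keyword will misroute cases such as writable CTEs or functions with side effects.

```python
import itertools

# Leading keywords treated as read-only in this sketch (an assumption, not a complete list).
READ_PREFIXES = ("select", "show")


class QueryRouter:
    """Hypothetical router that returns the node a query should be sent to."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Round-robin over replicas; fall back to the primary if there are none.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql, in_transaction=False):
        statement = sql.lstrip().lower()
        is_read = statement.startswith(READ_PREFIXES) and "for update" not in statement
        # Writes, locking reads, and anything inside a transaction go to the primary.
        if in_transaction or not is_read or self._replicas is None:
            return self.primary
        return next(self._replicas)
```

For example, `QueryRouter("primary", ["replica-1", "replica-2"]).route("UPDATE t SET x = 1")` returns `"primary"`, while consecutive `SELECT` queries alternate between the two replicas. Every service that talks to the database needs equivalent logic, which is the maintenance cost noted in the first option above.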
# Useful Links
# Related Topics
## Reference
#### Working Notes
#### Sources
#### Topics to Cover
- Gossip Protocol
- [[Snapshots]]
#### Related Topics
-