Caching & Redis: Theoretical Squeeze
Deep dive into Caching principles, CPU vs RAM latency, Cache strategies, and Redis internals.
The Essence of Caching
The Problem: Latency Gap
The primary bottleneck in program execution is often memory access, not processing power.
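- CPU: Executes a simple operation in about a nanosecond or less (< 1ns).
- RAM: Access takes ~100ns (~100x slower).
- Disk/Network: An HDD seek or a network round trip is measured in milliseconds (~1,000,000x slower than a CPU operation); SSD reads fall in between, at tens to hundreds of microseconds.
The CPU spends significant time waiting for data. Solution: Add a small, ultra-fast memory layer close to the consumer: L1/L2/L3 caches next to the processor, or an in-RAM cache in front of the database for an application.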
Definition
Caching reduces access latency by storing frequently used data in a faster storage medium.
- Cache Hit: Data is found in the cache (Fast).
- Cache Miss: Data is not found, must be fetched from the slow source (Slow).
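The hit/miss flow above can be sketched in a few lines. This is a minimal illustration, not a production cache: a plain PHP array stands in for the cache, and `cachedGet` and `$slowSource` are hypothetical names invented for the example.

```php
<?php

/**
 * Look up $key in $cache; on a miss, call $loadFromSource and remember the result.
 * Returns [value, wasHit].
 */
function cachedGet(array &$cache, string $key, callable $loadFromSource): array
{
    if (array_key_exists($key, $cache)) {
        return [$cache[$key], true];      // Cache Hit: served from the fast layer
    }
    $value = $loadFromSource($key);       // Cache Miss: go to the slow source
    $cache[$key] = $value;                // Store it so the next read is a hit
    return [$value, false];
}

$cache = [];
$sourceReads = 0;

// Hypothetical slow source (stands in for a DB query); counts how often it is called.
$slowSource = function (string $key) use (&$sourceReads): string {
    $sourceReads++;
    return strtoupper($key);
};

[$first, $firstHit] = cachedGet($cache, 'usd', $slowSource);   // miss, loads from source
[$second, $secondHit] = cachedGet($cache, 'usd', $slowSource); // hit, source not touched
```

After the two calls the source has been read only once; the second request never leaves the cache.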
Write Strategies
When data changes, how do we update the cache and the source of truth?
Write Through
- Data is written to the Cache and the Source (DB) in the same operation; the write is acknowledged only after both succeed.
- Pros: High data consistency.
- Cons: Slower writes (wait for both).
Write Back (Write Behind)
- Data is written only to the Cache initially. It is synced to the Source later (asynchronously).
- Pros: Extremely fast writes.
- Cons: Risk of data loss if the cache node fails (e.g., power loss) before the sync.
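To make the difference concrete, here is a small sketch of both strategies. It assumes two in-memory arrays stand in for the cache and the database; the class names `WriteThroughStore` and `WriteBackStore` are hypothetical and exist only for this illustration.

```php
<?php

// Write Through: every write goes to the source and the cache before returning.
final class WriteThroughStore
{
    public array $cache = [];
    public array $db = [];

    public function write(string $key, string $value): void
    {
        $this->db[$key] = $value;     // caller waits for the source of truth
        $this->cache[$key] = $value;  // then the cache is updated
    }
}

// Write Back: writes land in the cache only; dirty keys are synced later.
final class WriteBackStore
{
    public array $cache = [];
    public array $db = [];
    private array $dirty = [];

    public function write(string $key, string $value): void
    {
        $this->cache[$key] = $value;
        $this->dirty[$key] = true;    // remember that the DB is now behind
    }

    /** Sync all dirty keys to the DB; returns how many were written. */
    public function flush(): int
    {
        $count = 0;
        foreach (array_keys($this->dirty) as $key) {
            $this->db[$key] = $this->cache[$key];
            $count++;
        }
        $this->dirty = [];
        return $count;
    }
}

$through = new WriteThroughStore();
$through->write('currency:USD', '100');

$back = new WriteBackStore();
$back->write('currency:USD', '100');
$pendingBeforeFlush = !isset($back->db['currency:USD']); // DB has not seen the write yet
$flushed = $back->flush();                               // the asynchronous sync, run by hand here
```

In the write-back case, anything still marked dirty when the process dies is exactly the data that would be lost.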
Cache States & Warming
Cold Cache
- The cache is empty or contains irrelevant data.
- User requests result in Cache Misses (slow performance).
Hot Cache
- The cache contains relevant, frequently accessed data.
- User requests result in Cache Hits (fast performance).
Cache Warming
- The process of pre-populating the cache with data before users request it.
- Goal: Ensure users always hit a "Hot Cache".
- Methods:
  - Internal: The app loads data on startup.
  - External: Scripts/Crawlers simulate user traffic to populate the cache.
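The internal method could look roughly like the sketch below. It assumes the application already knows which keys are hot; `warmCache`, the key list, and the loader callback are hypothetical names for illustration, and an array stands in for the real cache.

```php
<?php

/**
 * Internal warming: load a known list of hot keys before serving traffic.
 * Returns the number of keys that had to be loaded.
 */
function warmCache(array &$cache, array $hotKeys, callable $loadFromSource): int
{
    $loaded = 0;
    foreach ($hotKeys as $key) {
        if (!array_key_exists($key, $cache)) {   // skip keys that are already cached
            $cache[$key] = $loadFromSource($key);
            $loaded++;
        }
    }
    return $loaded;
}

$cache = [];
// Hypothetical loader standing in for a database query.
$loader = fn (string $key): string => "row for $key";

$loaded = warmCache($cache, ['product:1', 'product:2'], $loader);
$loadedAgain = warmCache($cache, ['product:1', 'product:2'], $loader); // nothing left to load
```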
Redis (Remote Dictionary Server)
Why Redis?
Disk-backed relational databases (HDD/SSD) often cannot serve high-load scenarios (100k+ RPS) with low latency on their own. For hot data we need a storage system that keeps its dataset entirely in RAM.
Redis vs Memcached
Memcached
- Pros: Extremely fast, simple multi-threaded architecture.
- Cons: Volatile. If the server restarts, all data is lost. Limited data types (strings only).
- Use Case: Simple session caching, temporary page fragments.
Redis
- Pros:
  - Persistence: Can save data to disk (RDB snapshots, AOF logs).
  - Data Structures: Supports Lists, Sets, Hashes, etc.
  - Replication: Master-Slave support out of the box.
- Use Case: Caching, Message Broker, Leaderboards, Session Store, Real-time analytics.
Redis Internals
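Redis is a NoSQL In-Memory Key-Value Store.
- Single-threaded command execution on an event loop (no locking issues, but one heavy command blocks every other client). Redis 6+ can use extra threads for network I/O, but commands still run one at a time.
- Atomic operations: each command runs to completion before the next one starts.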
Data Types
- String: Basic text or binary data (Images, JSON). Max 512MB.
  - Ops: SET, GET, INCR.
- List: Linked lists (efficient head/tail operations).
  - Ops: LPUSH, RPOP. Good for Queues.
- Set: Unordered collection of unique strings.
  - Ops: SADD, SINTER (Intersection).
- Sorted Set (ZSet): Sets with a "score" for sorting.
  - Ops: ZADD, ZRANGE. Good for Leaderboards.
- Hash: Maps string fields to string values (Objects).
  - Ops: HSET, HGET.
- Pub/Sub: Message passing system (not stored).
  - Ops: PUBLISH, SUBSCRIBE.
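The cheat sheet below covers Strings and Lists; for Sets, Sorted Sets, and Hashes, a rough sketch might look like this. It assumes `$redis` is an already connected phpredis client, and every key name and value is made up for illustration. The leaderboard uses `zRevRange` (highest score first) rather than the `ZRANGE` listed above, which returns ascending order.

```php
<?php

// Assumption: $redis is a connected phpredis client (\Redis). Keys are illustrative.

/** Set: tags shared by two posts (SADD + SINTER). */
function commonTags($redis): array
{
    $redis->sAdd('post:1:tags', 'php', 'redis', 'cache');
    $redis->sAdd('post:2:tags', 'redis', 'cache', 'mysql');
    return $redis->sInter('post:1:tags', 'post:2:tags'); // order is not guaranteed
}

/** Sorted Set: top-N leaderboard, highest score first (ZADD + ZREVRANGE). */
function topPlayers($redis, int $n): array
{
    $redis->zAdd('leaderboard', 120, 'alice');
    $redis->zAdd('leaderboard', 300, 'bob');
    $redis->zAdd('leaderboard', 210, 'carol');
    return $redis->zRevRange('leaderboard', 0, $n - 1, true); // member => score
}

/** Hash: store an object field by field, read one field back (HSET + HGET). */
function userEmail($redis): string
{
    $redis->hSet('user:1', 'name', 'Alice');
    $redis->hSet('user:1', 'email', 'alice@example.com');
    return $redis->hGet('user:1', 'email');
}
```

With a real server, `commonTags` would return `redis` and `cache` in some order, and `topPlayers($redis, 2)` would list `bob` before `carol`.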
Cheat Sheet: Common Commands
Strings & TTL
```php
// Basics
$redis->set('currency:USD', 100);
$redis->get('currency:USD'); // 100

// Existence & Atomic
$redis->setnx('lock:user:1', 'locked'); // Set ONLY if not exists (Distributed Lock)
$redis->mset(['key1' => 'val1', 'key2' => 'val2']); // Batch set

// Expiration (TTL)
$redis->set('otp:123', '5555', 60); // Store for 60 seconds
$redis->expire('otp:123', 30); // Update TTL
$redis->ttl('otp:123'); // Check remaining time
$redis->persist('otp:123'); // Remove expiration (make permanent)
```
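A lock taken with plain `setnx` never expires, so a crashed holder would block everyone else. One common way around this, sketched below, is to set the key with both NX and a TTL in a single command and to release it only if the stored token is still ours. This assumes `$redis` is a connected phpredis client; `acquireLock` and `releaseLock` are hypothetical helper names, and the sketch covers a single Redis node only.

```php
<?php

// Assumption: $redis is a connected phpredis client. Key name and TTL are illustrative.

/** Try to take the lock; returns a random token on success, null if it is already held. */
function acquireLock($redis, string $key, int $ttlSeconds): ?string
{
    $token = bin2hex(random_bytes(16));
    // SET key token NX EX ttl: create only if absent, and expire automatically.
    $ok = $redis->set($key, $token, ['nx', 'ex' => $ttlSeconds]);
    return $ok ? $token : null;
}

/** Release only if we still own the lock; the Lua script makes compare-and-delete atomic. */
function releaseLock($redis, string $key, string $token): bool
{
    $script = <<<'LUA'
if redis.call('GET', KEYS[1]) == ARGV[1] then
    return redis.call('DEL', KEYS[1])
end
return 0
LUA;
    // eval(script, [keys..., args...], numberOfKeys)
    return (int) $redis->eval($script, [$key, $token], 1) === 1;
}
```

Typical usage would be `$token = acquireLock($redis, 'lock:user:1', 10);`, doing the protected work if `$token` is not null, then calling `releaseLock($redis, 'lock:user:1', $token);`.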
Counters (Atomic)
```php
$redis->incr('page:views'); // +1
$redis->incrBy('page:views', 10); // +10
$redis->decr('stock:items'); // -1
```
Lists (Queues)
```php
$redis->lPush('queue:emails', 'user@example.com'); // Add to Left
$redis->rPop('queue:emails'); // Remove from Right (FIFO)
$redis->lRange('queue:emails', 0, -1); // Get all items
$redis->lLen('queue:emails'); // Get length
```
Scaling Redis
1. Persistence
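- RDB (Snapshot): Saves a point-in-time snapshot of the dataset to disk at configured intervals (e.g., every N minutes if enough keys changed). Compact, faster restore, but writes made since the last snapshot are lost on a crash.
- AOF (Append Only File): Logs every write command. Larger files and slower restore, but higher durability (less data loss); how much can be lost depends on the fsync policy.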
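As a rough illustration, both mechanisms are switched on in `redis.conf`. The thresholds below are example values only, not recommendations.

```conf
# RDB: snapshot if at least 1 key changed within 900 s,
# or at least 10000 keys changed within 60 s
save 900 1
save 60 10000

# AOF: log every write command and fsync the log once per second
appendonly yes
appendfsync everysec
```

With `appendfsync everysec`, a crash can lose roughly the last second of writes; `always` narrows that window at the cost of write latency.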
2. Replication (Master-Slave)
- Master: Handles Writes.
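- Slaves (called replicas in current Redis documentation): Replicate Master, handle Reads. Replication is asynchronous by default, so a read from a slave can be slightly stale.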
- Improves Read scalability and Redundancy.
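On the application side, the read/write split can be sketched as a thin wrapper. This assumes `$primary` and each entry of `$replicas` are already connected phpredis clients; `ReplicatedRedis` is a hypothetical class name, and real setups often delegate this routing to a proxy or client library instead.

```php
<?php

// Assumption: $primary and every element of $replicas are connected phpredis clients.
final class ReplicatedRedis
{
    private $primary;
    private array $replicas;

    public function __construct($primary, array $replicas)
    {
        $this->primary = $primary;
        $this->replicas = array_values($replicas);
    }

    /** Writes always go to the master. */
    public function set(string $key, string $value): bool
    {
        return (bool) $this->primary->set($key, $value);
    }

    /** Reads go to a random replica, or to the master if there are none. */
    public function get(string $key)
    {
        $client = $this->replicas
            ? $this->replicas[array_rand($this->replicas)]
            : $this->primary;
        return $client->get($key); // may lag behind the master slightly
    }
}
```

Reads that must see the latest write should bypass this wrapper and go to the master directly.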
3. Sharding (Partitioning)
Distributing data across multiple Redis instances.
- Method: Hash the key (e.g., CRC32(key) % N_SERVERS).
- Benefit: Horizontal scaling of RAM and Write throughput.
- Trade-off: Cannot perform multi-key operations (transactions) across different shards.
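The hashing method above fits in a few lines using PHP's built-in `crc32()`. The server list and the `shardFor` helper are hypothetical; the sketch assumes a 64-bit PHP build, where `crc32()` always returns a non-negative integer.

```php
<?php

/** Pick a shard index for $key using CRC32(key) % N_SERVERS. */
function shardFor(string $key, int $serverCount): int
{
    // On 64-bit PHP crc32() is non-negative, so the result is in [0, $serverCount).
    return crc32($key) % $serverCount;
}

// Hypothetical shard addresses.
$servers = ['redis-a:6379', 'redis-b:6379', 'redis-c:6379'];

$index = shardFor('user:42', count($servers));
$target = $servers[$index]; // every client computes the same target for this key
```

Because the index depends on the number of servers, adding or removing a shard remaps most keys. Two related keys can also land on different shards, which is where the multi-key trade-off comes from.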
4. Redis Cluster
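Native distributed implementation that handles sharding and replication automatically: the keyspace is split into 16384 hash slots spread across master nodes, and a replica can be promoted when its master fails.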
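Redis Cluster maps each key to one of 16384 hash slots using CRC16(key) mod 16384, and if the key contains a non-empty `{...}` section, only that part is hashed. The sketch below reimplements that mapping in plain PHP for illustration; the function names are hypothetical, and in practice the server and cluster-aware clients do this for you.

```php
<?php

/** CRC16 (XMODEM variant: polynomial 0x1021, initial value 0). */
function crc16Xmodem(string $data): int
{
    $crc = 0;
    for ($i = 0, $n = strlen($data); $i < $n; $i++) {
        $crc ^= ord($data[$i]) << 8;
        for ($bit = 0; $bit < 8; $bit++) {
            $crc = ($crc & 0x8000) ? (($crc << 1) ^ 0x1021) : ($crc << 1);
            $crc &= 0xFFFF; // keep the register 16 bits wide
        }
    }
    return $crc;
}

/** Hash slot for a key, honoring a {hash tag} if one is present and non-empty. */
function keyHashSlot(string $key): int
{
    $open = strpos($key, '{');
    if ($open !== false) {
        $close = strpos($key, '}', $open + 1);
        if ($close !== false && $close > $open + 1) {
            $key = substr($key, $open + 1, $close - $open - 1); // hash only the tag
        }
    }
    return crc16Xmodem($key) % 16384;
}

// Keys sharing a hash tag land in the same slot, so multi-key commands on them work.
$slotFollowing = keyHashSlot('{user1000}.following');
$slotFollowers = keyHashSlot('{user1000}.followers');
```

Hash tags are the usual answer to the cross-shard multi-key limitation: keys that must be used together are given the same tag.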