2024-07-22
#architecture
#backend
#microservices
#distributed-systems

Microservices: Theory Digest

Deep dive into microservices architecture patterns, CQRS, Event Sourcing, and stability patterns.

The main reasons to adopt a microservices architecture are handling large data volumes, scaling parts of the system independently, and meeting complex delivery requirements.

Monolith vs Microservices

Monolithic Architecture (everything in one program) faces issues with:

  • Scalability: Hard to scale specific parts.
  • Maintenance: Large codebase, hard to understand.
  • Fault Tolerance: One error can crash the whole app.
  • Deployment: Requires rebuilding the entire application for small changes.

Microservices Architecture is a distributed system in which small, isolated services each handle a specific domain.

Advantages:

  • Decentralized Data: Database per service.
  • Domain Driven Design (DDD): Clear boundaries matching business domains.
  • Tech Freedom: Different services can use different technologies.
  • Resilience: Fault isolation (Bulkheads).
  • Scalability: Independent scaling of services.

Disadvantages:

  • Complexity: Higher development and infrastructure costs.
  • Distributed Transactions: Hard to maintain data consistency.
  • Network Latency: Inter-service communication overhead.
  • Design for Failure: Must implement stability patterns.

Command and Query Responsibility Segregation (CQRS)

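The Problem: When one model serves both commands (updates) and queries (reads), it is tuned well for neither. Reads and writes compete for the same schema and resources, which creates performance bottlenecks, and it is harder to apply separate access rules to each side, which creates security risks.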

The Solution: CQRS separates the Write Model (Command) from the Read Model (Query).

  • Command: Changes data (Create, Update, Delete). Needs transactional integrity.
  • Query: Reads data. Can be optimized for speed (e.g., denormalized views).

This allows scaling reads and writes independently and optimizing the data schemas for each use case.
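The split described above can be sketched in a few lines. This is a minimal in-memory illustration, not a production design: the names Order, OrderWriteModel, and OrderReadModel are hypothetical, and the read model is updated synchronously here, whereas a real system would usually update it asynchronously from events.

```python
from dataclasses import dataclass


@dataclass
class Order:
    order_id: int
    customer: str
    total: float


class OrderWriteModel:
    """Command side: validates input and owns the authoritative records."""

    def __init__(self) -> None:
        self.orders: dict[int, Order] = {}

    def create_order(self, order_id: int, customer: str, total: float) -> Order:
        if order_id in self.orders:
            raise ValueError(f"order {order_id} already exists")
        if total <= 0:
            raise ValueError("total must be positive")
        order = Order(order_id, customer, total)
        self.orders[order_id] = order
        return order


class OrderReadModel:
    """Query side: a denormalized view keyed by customer for fast reads."""

    def __init__(self) -> None:
        self.totals_by_customer: dict[str, float] = {}

    def apply(self, order: Order) -> None:
        current = self.totals_by_customer.get(order.customer, 0.0)
        self.totals_by_customer[order.customer] = current + order.total

    def total_spent(self, customer: str) -> float:
        return self.totals_by_customer.get(customer, 0.0)


write_model = OrderWriteModel()
read_model = OrderReadModel()

# Every accepted command is projected into the read model.
for args in [(1, "alice", 30.0), (2, "alice", 12.5), (3, "bob", 7.0)]:
    read_model.apply(write_model.create_order(*args))

print(read_model.total_spent("alice"))  # 42.5
```

The query never touches the write model's records, so the two sides could live in different stores and scale separately.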


Event Sourcing & Eventual Consistency

With distributed databases (Database per Service), maintaining consistency is challenging. Example: Order Service creates an order -> Customer Service reserves credit -> Inventory Service reserves stock.

The Problem: How to update data across services reliably?

Event Sourcing: Instead of storing just the current state, we store the sequence of events that led to it.

  • Events are immutable facts ("OrderCreated", "PaymentProcessed").
  • State is derived by replaying events.
  • Audit Trail is built-in by design.
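Deriving state by replaying events could look roughly like this. The Event class, the replay function, and the state fields are assumptions made for the illustration; only the event names come from the notes above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: events are immutable facts
class Event:
    kind: str
    amount: float = 0.0


def replay(events: list[Event]) -> dict:
    """Fold the event log into the current state of one order."""
    state = {"status": "none", "total": 0.0, "paid": 0.0}
    for event in events:
        if event.kind == "OrderCreated":
            state["status"] = "created"
            state["total"] = event.amount
        elif event.kind == "PaymentProcessed":
            state["paid"] += event.amount
            if state["paid"] >= state["total"]:
                state["status"] = "paid"
    return state


log = [
    Event("OrderCreated", 100.0),
    Event("PaymentProcessed", 40.0),
    Event("PaymentProcessed", 60.0),
]

print(replay(log))      # state after all events
print(replay(log[:2]))  # state as it was after the second event
```

Replaying a prefix of the log reconstructs any past state, which is where the built-in audit trail comes from.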

Eventual Consistency: Data isn't consistent immediately across all services. Instead, it becomes consistent eventually as events propagate through the system.


Event-Driven Architecture (EDA)

EDA is an architectural style where components react to events rather than calling each other directly.

Pub/Sub Pattern:

  • Publisher: Emits an event (doesn't know who listens).
  • Subscriber: Listens for specific events and reacts.
  • Event Broker: Intermediary that manages the delivery and storage of events.

Benefits:

  • Decoupling: Services are independent.
  • Extensibility: Add new listeners without changing the publisher.
  • Asynchrony: Non-blocking operations.
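A toy version of the Pub/Sub pattern helps show the decoupling. The EventBroker class below is a hypothetical in-memory stand-in, and it calls subscribers synchronously for simplicity; a real broker would deliver asynchronously and persist events.

```python
from collections import defaultdict
from typing import Callable


class EventBroker:
    """In-memory stand-in for a broker: maps event names to subscribers."""

    def __init__(self) -> None:
        self._subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event_name: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[event_name].append(handler)

    def publish(self, event_name: str, payload: dict) -> int:
        """Deliver to every subscriber; return how many handlers ran."""
        handlers = self._subscribers.get(event_name, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)


broker = EventBroker()
emails: list[str] = []
reserved: list[int] = []

# Two independent subscribers; the publisher knows about neither.
broker.subscribe("OrderCreated", lambda e: emails.append(f"receipt for order {e['order_id']}"))
broker.subscribe("OrderCreated", lambda e: reserved.append(e["order_id"]))

delivered = broker.publish("OrderCreated", {"order_id": 7})
print(delivered, emails, reserved)
```

Adding a third subscriber means one more subscribe call and no change to the publishing code.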

Message Brokers vs Event Brokers

1. Message Broker (e.g., RabbitMQ) Focus: Delivery & Reliability

  • Smart Broker, Dumb Consumer.
  • Model: Queue (FIFO). Message is removed once the consumer acknowledges it.
  • Use Case: Task queues, background jobs, making sure each task gets processed. Delivery is typically at-least-once, so consumers should be idempotent.

2. Event Broker (e.g., Apache Kafka) Focus: Stream History & Replayability

  • Dumb Broker, Smart Consumer.
  • Model: Log (Stream). Events persist based on retention policy.
  • Use Case: Activity tracking, metrics, event sourcing, stream processing.
  • Offsets: Within a topic, each consumer tracks its own offset (position in the stream), so events can be re-read.
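The difference between the queue model and the log model can be shown with two small classes. Both are simplified, hypothetical stand-ins (MessageQueue and EventLog are not RabbitMQ or Kafka APIs), and the log stores consumer offsets internally only to keep the sketch short.

```python
from collections import deque
from typing import Optional


class MessageQueue:
    """Queue model: a message is gone once a consumer has taken it."""

    def __init__(self) -> None:
        self._messages: deque = deque()

    def send(self, message: str) -> None:
        self._messages.append(message)

    def receive(self) -> Optional[str]:
        return self._messages.popleft() if self._messages else None


class EventLog:
    """Log model: events stay in place; each consumer has its own offset."""

    def __init__(self) -> None:
        self._events: list[str] = []
        self._offsets: dict[str, int] = {}

    def append(self, event: str) -> None:
        self._events.append(event)

    def poll(self, consumer: str) -> list[str]:
        start = self._offsets.get(consumer, 0)
        self._offsets[consumer] = len(self._events)
        return self._events[start:]

    def rewind(self, consumer: str, offset: int = 0) -> None:
        self._offsets[consumer] = offset


queue = MessageQueue()
queue.send("resize-image")
first = queue.receive()   # "resize-image"
second = queue.receive()  # None: the message was consumed

log = EventLog()
log.append("page_view")
log.append("click")
analytics = log.poll("analytics")  # both events
billing = log.poll("billing")      # the same events, read independently
log.rewind("analytics")
replayed = log.poll("analytics")   # both events again
```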

RabbitMQ Deep Dive

Implements AMQP (Advanced Message Queuing Protocol).

Core Flow:

  1. Publisher sends message to Exchange.
  2. Exchange routes message to bound Queues by comparing the Routing Key with each queue's binding.
  3. Consumer reads from Queue and sends Ack.

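Exchange Types:

  • Direct: Routes to queues whose binding key exactly matches the routing key.
  • Fanout: Broadcasts to every bound queue and ignores the routing key.
  • Topic: Pattern matching on dot-separated words (e.g., logs.*), where * matches exactly one word and # matches zero or more.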
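To make the routing rules concrete, here is a pure-Python sketch of how an exchange might pick queues. It does not use a RabbitMQ client; topic_matches and route are hypothetical helpers, and each queue is assumed to have a single binding key.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """AMQP-style topic match: '*' is exactly one word, '#' is zero or more."""

    def match(p: list[str], k: list[str]) -> bool:
        if not p:
            return not k
        if p[0] == "#":
            # '#' may consume zero words, or one word and stay active.
            return match(p[1:], k) or (bool(k) and match(p, k[1:]))
        if not k:
            return False
        if p[0] == "*" or p[0] == k[0]:
            return match(p[1:], k[1:])
        return False

    return match(pattern.split("."), routing_key.split("."))


def route(exchange_type: str, bindings: dict[str, str], routing_key: str) -> list[str]:
    """Return the queues that receive a message; bindings map queue -> binding key."""
    if exchange_type == "fanout":
        return sorted(bindings)
    if exchange_type == "direct":
        return sorted(q for q, key in bindings.items() if key == routing_key)
    if exchange_type == "topic":
        return sorted(q for q, key in bindings.items() if topic_matches(key, routing_key))
    raise ValueError(f"unknown exchange type: {exchange_type}")


bindings = {"errors": "logs.error", "all_logs": "logs.*", "audit": "#"}

print(route("direct", bindings, "logs.error"))    # ['errors']
print(route("topic", bindings, "logs.error"))     # ['all_logs', 'audit', 'errors']
print(route("topic", bindings, "logs.error.db"))  # ['audit']
print(route("fanout", bindings, "anything"))      # ['all_logs', 'audit', 'errors']
```

Note that logs.* does not match logs.error.db, because * stands for exactly one word.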

Load Balancing

Distributes incoming network traffic across multiple servers.

Algorithms:

  • Round Robin: Sequential distribution.
  • Least Connections: Sends to server with fewest active connections.
  • IP Hash: Hashes the client IP so the same client keeps reaching the same server while the pool is unchanged.
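The three algorithms are small enough to sketch directly. The server names and function names below are made up for the example, and the IP hash uses a plain modulo, so it is not consistent hashing: adding or removing a server remaps most clients.

```python
import hashlib
from itertools import cycle

servers = ["app-1", "app-2", "app-3"]  # hypothetical backend pool

_round_robin = cycle(servers)


def pick_round_robin() -> str:
    """Hand out servers in a fixed rotation."""
    return next(_round_robin)


def pick_least_connections(active: dict[str, int]) -> str:
    """Pick the server with the fewest active connections."""
    return min(active, key=active.get)


def pick_ip_hash(client_ip: str) -> str:
    """Map a client IP to a server; stable while the pool does not change."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


rotation = [pick_round_robin() for _ in range(4)]
least_loaded = pick_least_connections({"app-1": 5, "app-2": 2, "app-3": 9})
sticky = pick_ip_hash("203.0.113.7") == pick_ip_hash("203.0.113.7")

print(rotation)      # ['app-1', 'app-2', 'app-3', 'app-1']
print(least_loaded)  # app-2
print(sticky)        # True
```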

High Availability:

  • Active-Passive: Main LB handles traffic behind a VIP (Virtual IP); if it fails, the backup LB takes over the VIP.
  • Health Checks: Removes unhealthy servers from the pool.

Stability Patterns (Design for Failure)

Essential for preventing cascading failures in distributed systems.

1. Retry Pattern

  • Automatically retry failed requests.
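  • Risk: Retries can amplify load on a service that is already down or struggling (a retry storm). Limit the number of attempts and use exponential backoff with jitter.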
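One common way to keep retries from turning into a traffic storm is to cap the attempts and wait longer, with some randomness, before each one. The retry helper below is a sketch under those assumptions; its name, parameters, and defaults are illustrative.

```python
import random
import time


def retry(operation, attempts: int = 4, base_delay: float = 0.1,
          max_delay: float = 2.0, sleep=time.sleep):
    """Call operation(); on failure wait up to base_delay * 2**n, then retry."""
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))  # jitter spreads retries out


calls = {"count": 0}


def flaky():
    """Fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


# The demo passes a no-op sleep so it finishes instantly.
result = retry(flaky, sleep=lambda seconds: None)
print(result, calls["count"])  # ok 3
```

In practice only errors that are safe to retry (timeouts, connection resets) should be caught, and the operation should be idempotent.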

2. Circuit Breaker

  • Prevents calling a failing service.
  • Closed: Normal operation.
  • Open: Fails fast (after error threshold).
  • Half-Open: After a cooldown, lets a trial request through to test whether the service has recovered.
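The three states map naturally onto a small class. This is a simplified, single-threaded sketch: CircuitBreaker and CircuitOpenError are hypothetical names, the thresholds are arbitrary, and the clock is injectable only so the demo can skip the cooldown.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker refuses a call without trying it."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock=time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None
        self.state = "closed"

    def call(self, operation):
        if self.state == "open":
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("failing fast")
            self.state = "half-open"  # cooldown elapsed: allow one trial call
        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.state == "half-open" or self.failures >= self.failure_threshold:
                self.state = "open"
                self.opened_at = self.clock()
            raise
        self.failures = 0
        self.state = "closed"
        return result


now = [0.0]  # fake clock so the demo needs no real waiting
breaker = CircuitBreaker(failure_threshold=2, reset_timeout=10.0, clock=lambda: now[0])


def failing():
    raise ConnectionError("service down")


for _ in range(2):
    try:
        breaker.call(failing)
    except ConnectionError:
        pass

state_after_failures = breaker.state      # "open"
now[0] = 11.0                             # cooldown has elapsed
recovered = breaker.call(lambda: "pong")  # trial call succeeds
state_after_recovery = breaker.state      # "closed"
```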

3. Timeout

  • Fail the request if it takes too long, so threads and connections are not held indefinitely by a slow dependency.
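A timeout can be sketched with the standard library by running the call in a worker thread and waiting a bounded time for the result. The call_with_timeout helper is hypothetical, and it has a real limitation: the caller stops waiting, but the worker thread keeps running until the operation finishes. Network clients usually offer their own timeout settings, which are preferable when available.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def call_with_timeout(operation, timeout: float, executor: ThreadPoolExecutor):
    """Run operation in the pool and give up waiting after `timeout` seconds."""
    future = executor.submit(operation)
    try:
        return future.result(timeout=timeout)
    except FutureTimeout:
        future.cancel()  # only helps if the task has not started yet
        raise TimeoutError(f"no result within {timeout}s")


with ThreadPoolExecutor(max_workers=2) as executor:
    fast = call_with_timeout(lambda: "done", timeout=1.0, executor=executor)
    try:
        call_with_timeout(lambda: time.sleep(0.3), timeout=0.05, executor=executor)
        timed_out = False
    except TimeoutError:
        timed_out = True

print(fast, timed_out)  # done True
```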

4. Bulkhead

  • Isolate resources (thread pools, connections) for different parts of the system.
  • If one part fails, others continue working (like ship compartments).
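A semaphore per dependency is one simple way to build such compartments. In this sketch the Bulkhead class and BulkheadFullError are hypothetical names, and calls beyond the limit are rejected immediately rather than queued.

```python
import threading


class BulkheadFullError(Exception):
    """Raised when a compartment has no free slot."""


class Bulkhead:
    """Caps concurrent calls into one dependency; excess calls are rejected."""

    def __init__(self, max_concurrent: int) -> None:
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, operation):
        if not self._slots.acquire(blocking=False):
            raise BulkheadFullError("compartment is full")
        try:
            return operation()
        finally:
            self._slots.release()


payments = Bulkhead(max_concurrent=1)
reports = Bulkhead(max_concurrent=1)


def slow_payment():
    # While the only payments slot is held, a second payments call is rejected...
    try:
        payments.call(lambda: "second payment")
        rejected = False
    except BulkheadFullError:
        rejected = True
    # ...but the reports compartment is unaffected.
    return rejected, reports.call(lambda: "report ok")


outcome = payments.call(slow_payment)
print(outcome)  # (True, 'report ok')
```

A saturated payments dependency can exhaust only its own slots, so report requests keep flowing.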

5. Handshaking / Backpressure

  • Health Checks: Inform LB about service status.
  • Rate Limiting: Tell clients to slow down (HTTP 429).
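Rate limiting is often implemented with a token bucket. The sketch below is one possible version, not the only algorithm: TokenBucket and handle_request are hypothetical names, and the clock is injectable so the demo can advance time without sleeping. The response is reduced to a status code plus headers.

```python
import time


class TokenBucket:
    """Refills `rate` tokens per second up to `capacity`; one token per request."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated_at = clock()

    def allow(self) -> bool:
        now = self.clock()
        elapsed = now - self.updated_at
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated_at = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def handle_request(bucket: TokenBucket) -> tuple[int, dict[str, str]]:
    """Return (status, headers); 429 tells the client to slow down."""
    if bucket.allow():
        return 200, {}
    retry_after = max(1, round(1 / bucket.rate))
    return 429, {"Retry-After": str(retry_after)}


now = [0.0]  # fake clock for the demo
bucket = TokenBucket(rate=1.0, capacity=2, clock=lambda: now[0])

statuses = [handle_request(bucket)[0] for _ in range(3)]
now[0] = 1.0  # one second later, one token has been refilled
later_status = handle_request(bucket)[0]

print(statuses, later_status)  # [200, 200, 429] 200
```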

Egor Zdioruc | Lead Full Stack Developer | Laravel & AI Solutions