Learn System Design in 10 Days, Chapter 6

Day 6: Microservices & API Design

What You'll Learn Today

  • Monolith vs microservices tradeoffs
  • API design: REST vs gRPC vs GraphQL
  • API Gateway pattern
  • Service discovery mechanisms
  • Rate limiting and throttling strategies
  • Authentication with OAuth 2.0 and JWT
  • Idempotency in API design

Monolith vs Microservices

flowchart LR
    subgraph Monolith["Monolith Architecture"]
        direction TB
        UI1["UI Layer"]
        BL1["Business Logic"]
        DB1[("Single Database")]
        UI1 --> BL1 --> DB1
    end
    subgraph Micro["Microservices Architecture"]
        direction TB
        GW["API Gateway"]
        S1["User Service"]
        S2["Order Service"]
        S3["Payment Service"]
        DB2[("User DB")]
        DB3[("Order DB")]
        DB4[("Payment DB")]
        GW --> S1 & S2 & S3
        S1 --> DB2
        S2 --> DB3
        S3 --> DB4
    end
    style Monolith fill:#f59e0b,color:#fff
    style Micro fill:#3b82f6,color:#fff

| Aspect | Monolith | Microservices |
| --- | --- | --- |
| Deployment | Single unit, simple | Independent, complex |
| Scaling | Scale everything | Scale individual services |
| Development | Easy to start | Better for large teams |
| Data consistency | ACID transactions | Eventual consistency |
| Latency | In-process calls | Network calls (higher) |
| Debugging | Simpler stack traces | Distributed tracing needed |
| Technology | Single stack | Polyglot possible |
| Failure | Entire app fails | Partial failures |

When to Choose Each

  • Start with a monolith when you have a small team, unclear domain boundaries, or are building an MVP.
  • Move to microservices when you need independent scaling, have distinct team boundaries, or need different tech stacks per service.

API Design: REST vs gRPC vs GraphQL

REST (Representational State Transfer)

GET    /api/v1/users/123          → Get user
POST   /api/v1/users              → Create user
PUT    /api/v1/users/123          → Update user
DELETE /api/v1/users/123          → Delete user
PATCH  /api/v1/users/123          → Partial update

Key principles:

  • Resource-based URLs
  • HTTP methods for actions
  • Stateless
  • JSON payloads (typically)
  • HTTP status codes for responses
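
The principles above can be sketched as a single dispatch function. This is a minimal illustration, not a production server: the in-memory `users` dict, the `handle` function, and the choice of status codes for each case are assumptions made for the example.

```python
import itertools
from http import HTTPStatus

users = {}                    # in-memory stand-in for a database (illustrative only)
_ids = itertools.count(1)     # hypothetical ID generator


def handle(method, path, body=None):
    """Route one request for the users resource; return (status, payload)."""
    parts = path.strip("/").split("/")        # e.g. ["api", "v1", "users", "123"]
    if parts[:3] != ["api", "v1", "users"] or len(parts) > 4:
        return HTTPStatus.NOT_FOUND, None
    user_id = parts[3] if len(parts) == 4 else None

    if user_id is None:                       # collection URL: /api/v1/users
        if method != "POST":
            return HTTPStatus.METHOD_NOT_ALLOWED, None
        user_id = str(next(_ids))
        users[user_id] = dict(body or {})
        return HTTPStatus.CREATED, {"id": user_id, **users[user_id]}

    if user_id not in users:                  # item URL: /api/v1/users/{id}
        return HTTPStatus.NOT_FOUND, None
    if method == "GET":
        return HTTPStatus.OK, {"id": user_id, **users[user_id]}
    if method == "PUT":                       # replace the whole resource
        users[user_id] = dict(body or {})
        return HTTPStatus.OK, {"id": user_id, **users[user_id]}
    if method == "PATCH":                     # merge only the supplied fields
        users[user_id].update(body or {})
        return HTTPStatus.OK, {"id": user_id, **users[user_id]}
    if method == "DELETE":
        del users[user_id]
        return HTTPStatus.NO_CONTENT, None
    return HTTPStatus.METHOD_NOT_ALLOWED, None
```

The function keeps no per-client state between calls, so any server instance could answer any request, which is what "stateless" means here.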

gRPC (gRPC Remote Procedure Calls, originally developed at Google)

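syntax = "proto3";

// RideResponse, LocationUpdate, DriverLocation, and Location are not shown here.
service RideService {
  // Unary call: one request, one response.
  rpc RequestRide(RideRequest) returns (RideResponse);
  // Bidirectional streaming: both sides send a stream of messages.
  rpc StreamLocation(stream LocationUpdate) returns (stream DriverLocation);
}

message RideRequest {
  string user_id = 1;
  Location pickup = 2;
  Location dropoff = 3;
}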

Key features:

  • Protocol Buffers (binary serialization)
  • HTTP/2 with multiplexing
  • Bidirectional streaming
  • Code generation for multiple languages
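
To make "binary serialization" concrete, the sketch below hand-encodes a single string field the way the Protocol Buffers wire format does (tag, length, bytes) and compares it with the equivalent compact JSON. It is only an illustration of the encoding; real code would use the generated classes, and the helper names here are made up for the example.

```python
import json


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)   # more bytes follow: set the continuation bit
        else:
            out.append(byte)
            return bytes(out)


def encode_string_field(field_number: int, value: str) -> bytes:
    """Encode one string field: tag, then length, then UTF-8 bytes."""
    data = value.encode("utf-8")
    tag = (field_number << 3) | 2     # wire type 2 = length-delimited
    return encode_varint(tag) + encode_varint(len(data)) + data


# user_id is field number 1 in the RideRequest message above.
wire = encode_string_field(1, "123")
as_json = json.dumps({"user_id": "123"}, separators=(",", ":")).encode("utf-8")
print(len(wire), len(as_json))
```

The field name never appears on the wire, only its number, which is a large part of why Protobuf payloads are smaller than JSON.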

GraphQL

query {
  user(id: "123") {
    name
    email
    rides(last: 5) {
      id
      status
      driver {
        name
        rating
      }
    }
  }
}

Key features:

  • Client specifies exact data needed
  • Single endpoint
  • Reduces over-fetching and under-fetching, because the response contains only the requested fields
  • Strong type system with schema
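
The idea that the client specifies the exact shape of the response can be sketched as a projection over a nested record. This is not a GraphQL engine (there is no parsing, schema, or argument handling such as `last: 5`); `project`, `USER`, and `SELECTION` are hypothetical names, and the sample data is invented.

```python
def project(data, selection):
    """Keep only the selected fields; selection maps field -> None (leaf) or a nested selection."""
    if isinstance(data, list):
        return [project(item, selection) for item in data]
    result = {}
    for field, sub in selection.items():
        value = data[field]
        result[field] = value if sub is None else project(value, sub)
    return result


# A full server-side record with more fields than the client wants.
USER = {
    "name": "Ada",
    "email": "ada@example.com",
    "phone": "555-0100",
    "rides": [
        {"id": "r1", "status": "completed", "fare": 12.5,
         "driver": {"name": "Bo", "rating": 4.9, "license": "X1"}},
    ],
}

# Mirrors the shape of the query above.
SELECTION = {
    "name": None,
    "email": None,
    "rides": {"id": None, "status": None, "driver": {"name": None, "rating": None}},
}

print(project(USER, SELECTION))
```

Fields such as `phone`, `fare`, and `license` never leave the server because the selection does not name them.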

Comparison

| Feature | REST | gRPC | GraphQL |
| --- | --- | --- | --- |
| Protocol | HTTP/1.1+ | HTTP/2 | HTTP |
| Data format | JSON | Protobuf (binary) | JSON |
| Performance | Good | Excellent | Good |
| Streaming | Limited | Bidirectional | Subscriptions |
| Browser support | Native | Requires proxy | Native |
| Best for | Public APIs | Service-to-service | Mobile/frontend |
| Learning curve | Low | Medium | Medium |
| Caching | HTTP caching | Custom | Complex |

API Gateway Pattern

flowchart TB
    C1["Mobile App"] & C2["Web App"] & C3["Third Party"]
    GW["API Gateway"]
    C1 & C2 & C3 --> GW
    subgraph Services["Backend Services"]
        S1["User Service"]
        S2["Ride Service"]
        S3["Payment Service"]
        S4["Notification Service"]
    end
    GW --> S1 & S2 & S3 & S4
    subgraph GWFeatures["Gateway Responsibilities"]
        F1["Authentication"]
        F2["Rate Limiting"]
        F3["Load Balancing"]
        F4["Request Routing"]
        F5["Response Aggregation"]
        F6["SSL Termination"]
    end
    style GW fill:#8b5cf6,color:#fff
    style Services fill:#3b82f6,color:#fff
    style GWFeatures fill:#22c55e,color:#fff

The API Gateway acts as a single entry point for all clients. It handles cross-cutting concerns so individual services don't have to.

Popular implementations: Kong, AWS API Gateway, Netflix Zuul, Envoy
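
A toy version of the pattern, covering only authentication and request routing, might look like the following. The `Gateway` class, the token check, and the path prefixes are assumptions for illustration; none of this reflects how Kong, Zuul, or Envoy are configured.

```python
class Gateway:
    """Single entry point: authenticate once, then route by path prefix."""

    def __init__(self, valid_tokens):
        self.valid_tokens = set(valid_tokens)   # stand-in for real token validation
        self.routes = {}                        # path prefix -> backend handler

    def register(self, prefix, handler):
        self.routes[prefix] = handler

    def handle(self, path, token):
        # Cross-cutting concern handled here, so backends don't repeat it.
        if token not in self.valid_tokens:
            return 401, "unauthorized"
        # Longest matching prefix wins, so /rides/active can override /rides.
        for prefix in sorted(self.routes, key=len, reverse=True):
            if path.startswith(prefix):
                return 200, self.routes[prefix](path)
        return 404, "no route"


gateway = Gateway(valid_tokens={"secret-token"})
gateway.register("/users", lambda path: f"user-service handled {path}")
gateway.register("/rides", lambda path: f"ride-service handled {path}")
print(gateway.handle("/rides/42", "secret-token"))
```

Backend handlers are plain callables here; in a real deployment they would be network calls to separate services.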


Service Discovery

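In a microservices architecture, services need to find each other. Because instances scale up and down and their IP addresses change, hardcoded addresses go stale quickly, so callers look up live instances at request time.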

flowchart TB
    subgraph Client["Client-Side Discovery"]
        C1["Service A"] -->|"1. Query"| R1["Service Registry"]
        R1 -->|"2. Return addresses"| C1
        C1 -->|"3. Direct call"| S1["Service B (instance 1)"]
    end
    subgraph Server["Server-Side Discovery"]
        C2["Service A"] -->|"1. Request"| LB["Load Balancer"]
        LB -->|"2. Query"| R2["Service Registry"]
        LB -->|"3. Forward"| S2["Service B (instance 2)"]
    end
    style Client fill:#3b82f6,color:#fff
    style Server fill:#8b5cf6,color:#fff

| Approach | How It Works | Example |
| --- | --- | --- |
| Client-side discovery | Client queries registry, picks instance | Netflix Eureka |
| Server-side discovery | Load balancer queries registry | AWS ELB, Kubernetes |
| DNS-based | Services register DNS entries | Consul, CoreDNS |
| Service mesh | Sidecar proxy handles routing | Istio, Linkerd |
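
Client-side discovery can be sketched with an in-memory registry and a round-robin client. The heartbeat-with-TTL scheme, the class names, and the addresses are assumptions for this example; real registries such as Eureka or Consul have their own protocols and APIs.

```python
import itertools
import time


class ServiceRegistry:
    """Tracks instances by their last heartbeat; stale instances drop out of lookups."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock                      # injectable so tests can control time
        self.instances = {}                     # service -> {address: last heartbeat}

    def heartbeat(self, service, address):
        self.instances.setdefault(service, {})[address] = self.clock()

    def lookup(self, service):
        now = self.clock()
        entries = self.instances.get(service, {})
        return sorted(addr for addr, seen in entries.items() if now - seen <= self.ttl)


class RoundRobinClient:
    """Queries the registry on every call and rotates through live instances."""

    def __init__(self, registry, service):
        self.registry = registry
        self.service = service
        self.counter = itertools.count()

    def pick(self):
        live = self.registry.lookup(self.service)
        if not live:
            raise LookupError(f"no live instances of {self.service}")
        return live[next(self.counter) % len(live)]
```

A typical use is for each instance to call `heartbeat` on a timer while callers call `pick` before each request; an instance that crashes simply stops heartbeating and disappears after the TTL.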

Rate Limiting & Throttling

Rate limiting protects services from being overwhelmed. It's critical for public APIs and shared resources.

Common Algorithms

flowchart LR
    subgraph TB["Token Bucket"]
        direction TB
        T1["Tokens added at fixed rate"]
        T2["Request consumes a token"]
        T3["No token → rejected"]
        T1 --> T2 --> T3
    end
    subgraph SW["Sliding Window"]
        direction TB
        W1["Track requests in time window"]
        W2["Count requests"]
        W3["Over limit → rejected"]
        W1 --> W2 --> W3
    end
    style TB fill:#3b82f6,color:#fff
    style SW fill:#22c55e,color:#fff

| Algorithm | Pros | Cons |
| --- | --- | --- |
| Token Bucket | Allows bursts, smooth | Per-client state (token count, last refill time); two parameters to tune |
| Leaky Bucket | Smooth output rate | No burst handling |
| Fixed Window | Simple | Burst at window edges |
| Sliding Window Log | Precise | High memory usage |
| Sliding Window Counter | Good balance | Approximate |
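
As one concrete instance of the algorithms above, here is a minimal single-process token bucket. It refills lazily on each call instead of running a timer, which is a common simplification; the class and parameter names are chosen for this sketch, and a multi-server deployment would need shared storage for the state.

```python
import time


class TokenBucket:
    """Tokens refill at `rate` per second up to `capacity`; each request spends one."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock                  # injectable so tests can control time
        self.tokens = float(capacity)       # start full, so an initial burst is allowed
        self.updated = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        # Lazy refill: add the tokens earned since the last call, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False                        # no token: reject (or queue) the request
```

With `rate=1` and `capacity=3`, a client can send three requests at once and then one per second on average, which is the "allows bursts" property from the table.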

Rate Limit Headers

HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1625097600
Retry-After: 60
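
The headers above could be produced by a limiter along these lines. This sketch uses the fixed window algorithm for brevity; the function name, the `counts` dict, and the default limits are assumptions, and old windows are never cleaned up here.

```python
import math


def check_rate_limit(counts, client_id, now, limit=100, window=60):
    """Fixed-window check; returns (status, headers). `now` is Unix time in seconds."""
    window_start = int(now // window) * window
    reset = window_start + window               # Unix time when the window rolls over
    key = (client_id, window_start)
    allowed = counts.get(key, 0) < limit
    if allowed:
        counts[key] = counts.get(key, 0) + 1
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(limit - counts.get(key, 0)),
        "X-RateLimit-Reset": str(reset),
    }
    if not allowed:
        # Tell the client how many seconds to wait before retrying.
        headers["Retry-After"] = str(math.ceil(reset - now))
    return (200 if allowed else 429), headers
```

The `X-RateLimit-*` names are a widely used convention, not a formal standard, so clients should check each API's documentation for the exact names and semantics.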

Authentication: OAuth 2.0 & JWT

OAuth 2.0 Flow

sequenceDiagram
    participant U as User
    participant A as App (Client)
    participant AS as Auth Server
    participant RS as Resource Server
    U->>A: 1. Click "Login"
    A->>AS: 2. Redirect to auth page
    U->>AS: 3. Enter credentials
    AS->>A: 4. Authorization code
    A->>AS: 5. Exchange code for tokens
    AS->>A: 6. Access token + Refresh token
    A->>RS: 7. API call with access token
    RS->>A: 8. Protected resource
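
Steps 2, 4, and 5 of the flow can be sketched with the standard library alone. The parameter names (`response_type`, `client_id`, `redirect_uri`, `scope`, `state`, `grant_type`, `code`) come from the OAuth 2.0 authorization code grant; the `state` check is an addition not shown in the diagram, and the endpoint URL and function names are hypothetical. No network calls are made.

```python
from urllib.parse import parse_qs, urlencode, urlparse

AUTHORIZE_URL = "https://auth.example.com/authorize"   # hypothetical auth server


def build_authorization_url(client_id, redirect_uri, scope, state):
    """Step 2: the URL the app redirects the user's browser to."""
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,          # random value echoed back; guards against CSRF
    })
    return f"{AUTHORIZE_URL}?{query}"


def extract_code(callback_url, expected_state):
    """Step 4: read the authorization code from the redirect back to the app."""
    params = parse_qs(urlparse(callback_url).query)
    if params.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch")
    return params["code"][0]


def build_token_request(code, client_id, redirect_uri):
    """Step 5: form fields the app POSTs to the auth server's token endpoint."""
    return {
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
        "client_id": client_id,
    }
```

In practice the `state` value would be generated per login attempt (for example with `secrets.token_urlsafe()`), and confidential clients also authenticate themselves when exchanging the code.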

JWT (JSON Web Token)

A JWT has three parts: Header.Payload.Signature

eyJhbGciOiJIUzI1NiJ9.           ← Header (algorithm)
eyJ1c2VyX2lkIjoiMTIzIn0.       ← Payload (claims)
SflKxwRJSMeKKF2QT4fwpM...      ← Signature (verification)

| Aspect | Session-based | JWT |
| --- | --- | --- |
| Storage | Server-side | Client-side |
| Scalability | Requires shared store | Stateless, scales easily |
| Revocation | Easy (delete session) | Hard (need blocklist) |
| Size | Small session ID | Larger token |
| Best for | Traditional web apps | Microservices, APIs |
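
The three-part structure of a JWT can be reproduced with the standard library. The sketch below signs and verifies an HS256 token; the function names and the secret are made up for the example, claims such as expiry are not checked, and production code should use a maintained JWT library instead.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url without padding, as used for each JWT segment."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(payload: dict, secret: bytes) -> str:
    header = b64url(json.dumps({"alg": "HS256"}, separators=(",", ":")).encode())
    body = b64url(json.dumps(payload, separators=(",", ":")).encode())
    signature = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{b64url(signature)}"


def verify(token: str, secret: bytes) -> dict:
    """Return the claims if the signature matches; raise ValueError otherwise."""
    try:
        header, body, signature = token.split(".")
    except ValueError:
        raise ValueError("malformed token") from None
    if json.loads(b64url_decode(header)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(secret, f"{header}.{body}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(signature)):
        raise ValueError("bad signature")
    return json.loads(b64url_decode(body))


token = sign({"user_id": "123"}, b"demo-secret")   # hypothetical secret
print(token)
```

The header and payload are only encoded, not encrypted, so anyone holding the token can read the claims; the signature only proves they were not altered.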

Idempotency

An idempotent operation leaves the system in the same state whether it is performed once or many times. This is critical in distributed systems, where a client that times out cannot tell whether its request was applied and has to retry.

| HTTP Method | Idempotent? | Example |
| --- | --- | --- |
| GET | Yes | Fetch user profile |
| PUT | Yes | Update entire resource |
| DELETE | Yes | Remove resource |
| POST | No | Create new resource |
| PATCH | It depends | Partial update |

Idempotency Key Pattern

POST /api/v1/payments
Idempotency-Key: "abc-123-unique-key"

{
  "amount": 50.00,
  "currency": "USD"
}

The server stores the result keyed by the idempotency key. If the same key is sent again, the server returns the stored result instead of processing again. This prevents duplicate payments, duplicate orders, etc.
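
A minimal in-memory version of this pattern is sketched below. The `PaymentService` class, its fields, and the response shape are hypothetical; a real implementation would also need an atomic check-and-store, key expiry, and a check that a reused key carries the same request body.

```python
class PaymentService:
    """Stores each response under its idempotency key and replays it on retries."""

    def __init__(self):
        self.results = {}    # idempotency key -> stored response
        self.charges = []    # side effects actually performed

    def create_payment(self, idempotency_key, amount, currency):
        # Retry of a request we already processed: return the stored result.
        if idempotency_key in self.results:
            return self.results[idempotency_key]
        # First time we see this key: perform the charge exactly once.
        self.charges.append((amount, currency))
        response = {
            "payment_id": f"pay_{len(self.charges)}",
            "status": "succeeded",
            "amount": amount,
            "currency": currency,
        }
        self.results[idempotency_key] = response
        return response


service = PaymentService()
first = service.create_payment("abc-123-unique-key", "50.00", "USD")
retry = service.create_payment("abc-123-unique-key", "50.00", "USD")
print(first == retry, len(service.charges))
```

The amount is passed as a string here to sidestep floating-point rounding; how money is represented is a separate design decision from idempotency.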


Practice Problem: Design APIs for a Ride-Sharing Service

Core Entities

  • User (riders and drivers)
  • Ride (a trip from pickup to dropoff)
  • Payment (transaction for a ride)
  • Location (real-time GPS coordinates)

API Design

# User Service
POST   /api/v1/users                    → Register
POST   /api/v1/auth/login               → Login (returns JWT)
GET    /api/v1/users/{id}/profile       → Get profile

# Ride Service
POST   /api/v1/rides                     → Request a ride
GET    /api/v1/rides/{id}                → Get ride details
PUT    /api/v1/rides/{id}/accept         → Driver accepts
PUT    /api/v1/rides/{id}/start          → Start ride
PUT    /api/v1/rides/{id}/complete       → Complete ride
PUT    /api/v1/rides/{id}/cancel         → Cancel ride
GET    /api/v1/rides/{id}/eta            → Get ETA

# Location Service (gRPC for real-time)
rpc UpdateDriverLocation(stream LocationUpdate) returns (Ack)
rpc SubscribeRiderLocation(RideId) returns (stream DriverLocation)

# Payment Service
POST   /api/v1/payments                  → Process payment
GET    /api/v1/payments/{id}             → Get payment status
POST   /api/v1/payments/{id}/refund      → Refund
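
One way to back the ride action endpoints (accept, start, complete, cancel) is a small state machine. The status names, the transition table, and the choice of 409 for invalid transitions are assumptions for this sketch, not part of the API listing above.

```python
# (current status, action) -> new status. Status names are hypothetical.
TRANSITIONS = {
    ("requested", "accept"): "accepted",
    ("accepted", "start"): "in_progress",
    ("in_progress", "complete"): "completed",
    ("requested", "cancel"): "cancelled",
    ("accepted", "cancel"): "cancelled",
}

# The status each action leads to, used to make repeated calls harmless.
ACTION_TARGET = {
    "accept": "accepted",
    "start": "in_progress",
    "complete": "completed",
    "cancel": "cancelled",
}


def apply_action(status, action):
    """Return (http_status, new_ride_status) for PUT /rides/{id}/{action}."""
    if ACTION_TARGET.get(action) == status:
        return 200, status                    # already applied: retry is a no-op
    new_status = TRANSITIONS.get((status, action))
    if new_status is None:
        return 409, status                    # invalid transition: state unchanged
    return 200, new_status
```

Treating a repeated action as a no-op keeps the PUT endpoints idempotent, so a driver app that retries "accept" after a timeout gets the same outcome.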

Summary

| Concept | Description |
| --- | --- |
| Monolith vs Microservices | Start simple, split when needed |
| REST | Resource-based, widely adopted |
| gRPC | High-performance service-to-service |
| GraphQL | Client-driven queries, reduces over-fetching |
| API Gateway | Single entry point, cross-cutting concerns |
| Service Discovery | Dynamically locate service instances |
| Rate Limiting | Protect services from overload |
| OAuth 2.0 / JWT | Secure authentication for distributed systems |
| Idempotency | Safe retries in unreliable networks |

Key Takeaways

  1. Choose your API style based on your use case: REST for public APIs, gRPC for internal services, GraphQL for flexible frontends
  2. An API Gateway simplifies client interactions and centralizes cross-cutting concerns
  3. Rate limiting is essential for any production API
  4. Design every write API to be idempotent to handle retries safely

Practice Problems

Problem 1: Basic

Design a REST API for a simple blog platform with users, posts, and comments. Define the endpoints, HTTP methods, and response codes.

Problem 2: Intermediate

You're migrating a monolithic e-commerce app to microservices. Identify the service boundaries, define the APIs between services, and explain how you'd handle a transaction that spans multiple services (e.g., placing an order).

Challenge

Design a rate limiting system for a public API that supports: per-user limits, per-endpoint limits, and global limits. The system must work across multiple API server instances. Describe the algorithm, data store, and how you handle edge cases like clock skew.


Next up: In Day 7, we'll walk through a complete system design interview, designing a URL Shortener from scratch.