From monolith to microservices: how to migrate without breaking everything
A practical guide: I start with a monolith, measure traffic per module by logging it into the database, and extract microservices over Redis only when it actually makes sense.
Why starting with a monolith is fine
The industry has an unconditional love for microservices from day zero. But for most projects, starting with a monolith is the most rational decision: less infrastructure, less operational complexity, faster iterations. There is no point solving scale problems that don't exist yet.
My personal approach: I start monolithic, ship fast, measure real traffic, and only think about splitting when there is concrete evidence of a bottleneck. Premature overengineering in architecture is just as harmful as in code.
- Less initial infrastructure: one process, one deploy, one DB.
- Fast iteration: changes anywhere in the system without coordination across services.
- Simple debugging: the whole stack in one place, linear traces.
- Low operational cost: no extra DevOps or service-to-service infrastructure bills.
Typical monolith architecture
All modules live in the same process and share the same database. One deploy, one instance.
Typical structure of a monolithic API
Even if it runs in a single process, the code is split by domain: each module has its own router, controller, and service. Auth handles everything related to authentication; Products has its own routes, business logic, and DB access. The monolith is a single process, but the internal architecture is modular.
That separation isn't just aesthetic. It's what makes future extraction possible. When the Orders module has clear boundaries and isn't coupled to the rest, it can move into its own service without being rewritten from scratch.
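A rough sketch of what such a boundary looks like in code (module, type, and function names here are illustrative, not taken from a real codebase): each domain exposes its service behind a narrow interface, and the rest of the app depends on the interface rather than reaching into the module's internals or tables.

```typescript
// Illustrative module boundary: the Orders module exposes a narrow
// interface; nothing else imports its internals or touches its tables.
export interface Order {
  id: string;
  userId: string;
  productIds: string[];
}

export interface OrdersService {
  createOrder(userId: string, productIds: string[]): Order;
}

// orders/service.ts -- the only surface the rest of the app may use.
export function makeOrdersService(nextId: () => string): OrdersService {
  return {
    createOrder(userId, productIds) {
      // Business logic stays inside the module; extracting it later
      // means moving this folder, not untangling cross-module calls.
      return { id: nextId(), userId, productIds };
    },
  };
}
```

Because callers only see `OrdersService`, swapping the in-process implementation for an HTTP or queue-backed client later is a local change.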
Request flow inside the monolith
Even in a monolith, each module has its own routing, logic, and data-access layers. Boundaries are clear from the start.
Monitoring traffic by module: the rough-and-ready approach
Before making any architecture decision, I need data. My approach: a simple middleware that intercepts every request and logs the minimum necessary into a table in the same monolith database. No Datadog, no Prometheus. Just SQL and what I already have.
The table has basic fields: module, endpoint, HTTP method, response time in milliseconds, and status code. With that I can answer the questions that matter: which module gets the most load, which has the highest response times, which is growing.
It's deliberately rough because at the initial stage I don't want external dependencies or extra infrastructure. The most useful monitoring tool is the one I already have deployed.
- Table: module, endpoint, method, time_ms, status_code, created_at.
- The middleware registers before the controller and logs when the response finishes: it always logs, even on errors.
- Simple analysis queries: GROUP BY module, AVG(time_ms), COUNT(*).
- No external dependency added: it reuses the DB connection the monolith already has.
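A minimal sketch of that middleware, kept framework-agnostic so the mechanics are visible. Assumptions: in a real app this would be an Express middleware, and `insertRow` would be an `INSERT` into the metrics table; here it is injected so the sketch is self-contained.

```typescript
// Row shape matches the table described above.
export interface RequestLog {
  module: string; // first path segment, e.g. "products"
  endpoint: string;
  method: string;
  time_ms: number;
  status_code: number;
}

export function moduleFromPath(path: string): string {
  return path.split("/").filter(Boolean)[0] ?? "root";
}

// Wraps any handler; insertRow stands in for the DB write, and the
// injectable clock makes the sketch testable.
export function withRequestLog(
  insertRow: (row: RequestLog) => void,
  now: () => number = Date.now,
) {
  return (handler: (method: string, path: string) => number) =>
    (method: string, path: string): number => {
      const start = now();
      let status = 500; // assume failure unless the handler returns
      try {
        status = handler(method, path);
        return status;
      } finally {
        // finally runs even when the handler throws: errors get logged too.
        insertRow({
          module: moduleFromPath(path),
          endpoint: path,
          method,
          time_ms: now() - start,
          status_code: status,
        });
      }
    };
}
```

Analysis then stays in plain SQL against the table, e.g. `SELECT module, COUNT(*), AVG(time_ms) FROM request_log GROUP BY module`.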
Monitoring middleware integrated into the flow
The middleware intercepts every request, logs to the same DB, and the system keeps operating normally. Analysis is done with direct queries.
Identifying the bottleneck
After a few weeks of data, a pattern emerges: one module handles 60% of requests, or its average response time is three times higher than the rest's. That's the first candidate.
But the extraction decision isn't just about volume. What matters is whether that module's load is degrading the experience of the rest. If the Reports module is slow but doesn't interfere with checkout, maybe internal optimization is enough. If it's blocking the event loop and slowing login, then yes, that's a real problem.
- Look for modules with the highest number of requests per minute.
- Identify endpoints with abnormal response times or high variance.
- Evaluate whether a module's load consumes resources that impact others (CPU, DB connections).
- Don't extract just because it has a lot of traffic: extract when it affects the rest of the system.
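The criteria above can be sketched as a small decision function over the aggregated metrics. The thresholds (share of total traffic, latency multiple) and the `degradesOthers` flag are illustrative assumptions, not numbers from a real system.

```typescript
// Aggregated per-module metrics, e.g. from GROUP BY queries on the log table.
export interface ModuleStats {
  module: string;
  requestsPerMin: number;
  avgTimeMs: number;
  degradesOthers: boolean; // e.g. hogs CPU or DB connections
}

export function extractionCandidates(
  stats: ModuleStats[],
  trafficShare = 0.5, // handles more than half of all requests...
  latencyFactor = 3,  // ...or is 3x slower than the other modules' average
): string[] {
  const totalReqs = stats.reduce((s, m) => s + m.requestsPerMin, 0);
  const totalTime = stats.reduce((s, m) => s + m.avgTimeMs, 0);
  return stats
    .filter((m) => {
      const avgOfOthers =
        (totalTime - m.avgTimeMs) / Math.max(stats.length - 1, 1);
      const hot =
        m.requestsPerMin / totalReqs > trafficShare ||
        m.avgTimeMs > latencyFactor * avgOfOthers;
      // High load alone is not enough: it must hurt the rest of the system.
      return hot && m.degradesOthers;
    })
    .map((m) => m.module);
}
```

Note the second condition: a slow but isolated module (the Reports case above) never makes the list, no matter how bad its numbers look.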
Decision tree to identify extraction candidates
Extraction is not automatic. High load only justifies extraction if it impacts the performance of the rest of the system.
Extracting the first microservice
Once the candidate is identified, extraction begins. Since the module already has clear boundaries inside the monolith, most of the work is moving code, not rewriting it. I create a new service with its own repo, its own database, and its own deployment process.
The data strategy is the most delicate part. The extracted module needs its own DB, but the historical data lives in the monolith DB. Depending on the case, it can be a full migration, a period of double-writing, or simply that the new service starts clean and the monolith remains the source of truth for old data.
While the microservice stabilizes, the monolith can act as a proxy: it receives requests for the extracted module and forwards them to the new service. This allows fast rollback if something goes wrong.
- Create the new service with its own database from day one.
- Move the existing logic. Don't rewrite from scratch if it already works.
- Define the data migration strategy before the cutover.
- The monolith stops handling that domain directly once the service is stable.
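The proxy phase reduces to a routing decision plus a kill switch. A hedged sketch, with the service URL, module name, and feature flag all illustrative:

```typescript
export interface ProxyConfig {
  extractedModule: string; // e.g. "orders"
  serviceUrl: string;      // e.g. "http://orders-svc:3001"
  cutoverEnabled: boolean; // flip to false for instant rollback
}

// Decide, per request, whether to handle in-process or forward to the
// extracted service. In the real monolith this sits in front of the
// extracted module's old routes.
export function routeRequest(
  path: string,
  cfg: ProxyConfig,
): { target: "monolith" } | { target: "service"; url: string } {
  const module = path.split("/").filter(Boolean)[0] ?? "";
  if (module === cfg.extractedModule && cfg.cutoverEnabled) {
    // Forward, preserving the rest of the path.
    return { target: "service", url: cfg.serviceUrl + path };
  }
  // Everything else, or a rollback, stays in-process.
  return { target: "monolith" };
}
```

Because rollback is a config flip rather than a deploy, a misbehaving new service costs minutes, not an incident.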
Architecture during extraction of the first module
The extracted module lives in its own process with its own DB. During the transition, the monolith can act as a proxy before the final cutover.
Redis as the communication layer
The extracted microservice needs to communicate with the monolith. My default choice is Redis, specifically pub/sub for asynchronous events and BullMQ for tasks that need persistence and retries. Why Redis? Because I already use it for cache in almost every project, it's fast, and it handles both fire-and-forget events and more structured communication well.
The basic pattern: when something relevant happens in the monolith (for example, an order is created), it publishes an event to Redis. The corresponding microservice subscribes to that channel and processes it independently. One caveat: plain pub/sub is fire-and-forget, so if the subscriber is down the message is lost. When an event must survive downtime, it goes through a BullMQ queue instead: jobs accumulate in Redis and the worker drains them when it comes back.
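A minimal sketch of that publish/subscribe flow, with an in-memory bus standing in for Redis so the example is self-contained. With ioredis the equivalent calls are `publisher.publish(channel, payload)` and `subscriber.subscribe(channel)` plus a `"message"` listener; the channel and event names here are illustrative.

```typescript
type Handler = (payload: string) => void;

// In-memory stand-in for Redis pub/sub, matching its key property:
// delivery is fire-and-forget, nothing is stored.
export class InMemoryBus {
  private handlers = new Map<string, Handler[]>();

  subscribe(channel: string, handler: Handler): void {
    const list = this.handlers.get(channel) ?? [];
    list.push(handler);
    this.handlers.set(channel, list);
  }

  publish(channel: string, payload: string): void {
    // No subscriber at publish time means no delivery, as with Redis.
    for (const h of this.handlers.get(channel) ?? []) h(payload);
  }
}

// Monolith side: emit the event and move on.
export function onOrderCreated(bus: InMemoryBus, orderId: string): void {
  bus.publish("orders.created", JSON.stringify({ orderId }));
}
```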
I reserve direct HTTP between services for cases where I need an immediate synchronous response. For everything else, the Redis bus gives decoupling and naturally absorbs load spikes.
A concrete case where BullMQ shines is heavy async processing: AI content generation, video, images, deep research. Instead of the main app running those tasks and blocking its own threads, it queues them in Redis and dedicated workers process them independently.
- Redis pub/sub: asynchronous events where I don't need an immediate response.
- BullMQ (queue on top of Redis): tasks with automatic retries, delay, and persistence.
- HTTP between services: only when I need a synchronous response and it can't be async.
- Redis as a bus decouples services: each one runs at its own pace.
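What BullMQ adds over bare pub/sub is the part worth sketching: jobs persist until processed and failures are retried. BullMQ's real API is roughly `queue.add(name, data, { attempts })` plus `new Worker(name, processor, { connection })` on top of Redis; the tiny in-memory queue below only demonstrates that retry behavior, and the job names and retry limit are illustrative.

```typescript
interface JobRecord<T> {
  data: T;
  attemptsMade: number;
}

export class RetryQueue<T> {
  private jobs: JobRecord<T>[] = [];

  constructor(private maxAttempts = 3) {}

  add(data: T): void {
    this.jobs.push({ data, attemptsMade: 0 });
  }

  // Drain the queue with a worker function. Failed jobs are requeued
  // until maxAttempts, then surfaced as failures (BullMQ moves them to
  // a "failed" set instead of dropping them).
  drain(worker: (data: T) => void): { completed: T[]; failed: T[] } {
    const completed: T[] = [];
    const failed: T[] = [];
    while (this.jobs.length > 0) {
      const job = this.jobs.shift()!;
      try {
        worker(job.data);
        completed.push(job.data);
      } catch {
        job.attemptsMade += 1;
        if (job.attemptsMade < this.maxAttempts) this.jobs.push(job);
        else failed.push(job.data);
      }
    }
    return { completed, failed };
  }
}
```

This is exactly the shape of the heavy-async case above: the app calls `add` and returns immediately, while dedicated workers drain the queue at their own pace.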
Redis as a message bus between services
Redis handles asynchronous events. Direct HTTP only appears when the result is immediately required to continue the flow.
The progressive migration pattern
Extraction is not a big-bang event where one day you switch from monolith to microservices. It's a cycle: I monitor, detect the highest-impact module, extract it, stabilize it, and then monitor again. Each cycle leaves the monolith a little lighter.
The monolith rarely disappears completely, and that's fine. Modules with low traffic and no bottleneck behavior don't justify the operational complexity of becoming independent services. Over time, the monolith becomes the core that handles the less demanding domains, surrounded by specialized microservices for the areas that generate the most load.
Progressive migration cycle
Each iteration reduces monolith load and adds an independent service. No big bang: every extraction is small and controlled.
Final architecture: reduced monolith + microservices
After several extraction cycles, the result is a smaller monolith handling the core domains, plus a few specialized microservices communicating through Redis. Each service has its own database, its own deployment pipeline, and can scale independently.
This is not Netflix architecture with 500 services. It's something pragmatic: 3 to 5 services extracted based on real evidence, with the rest of the logic still living in a monolith that is now faster because it has fewer responsibilities.
Final architecture: core monolith + microservices via Redis
Final result: reduced monolith as the core, specialized microservices communicating via Redis, each with its own database.
Comparison: monolith vs microservices vs progressive migration
None of these three approaches is universally correct. The table shows the real tradeoffs depending on where the project and the team stand.
| Aspect | Pure monolith | Microservices from day 1 | Progressive migration |
|---|---|---|---|
| Initial complexity | Low | High | Low (starts as a monolith) |
| Infrastructure cost | Low | High from the start | Grows with real demand |
| Iteration speed | High at first, low when scaling | Low at first due to overhead | Always high, extract when needed |
| Debugging | Simple (one process) | Complex (distributed system) | Mixed, grows gradually |
| Scalability | Limited (everything scales together) | Granular from the start | Granular where needed |
| Risk of overengineering | Low | High | Low (based on real data) |
When NOT to do this
This approach is not for every project. If the monolith works well and response times are acceptable, don't touch anything. Adding the operational complexity of microservices to a system that has no scale problems is pure overengineering.
It also makes no sense to extract without data. The whole approach is based on measuring first and deciding later. Without monitoring, any architecture decision is a guess. And guesses in architecture are expensive.
- If the monolith shows no production performance problems: don't extract.
- If the team is small: the operational complexity of multiple services may be worse than the original problem.
- If you don't have monitoring implemented: measure first, decide later.
- If the business doesn't justify it: don't optimize infrastructure that isn't causing a real problem.
Checklist before starting extraction
Before starting any extraction, check these three points. If any of them fails, there is a less costly alternative.
Sources
- Martin Fowler: MonolithFirst. The classic argument for starting with a monolith before distributing.
- Strangler Fig Pattern (Martin Fowler). The gradual migration pattern that inspires this progressive extraction approach.
- Redis Pub/Sub documentation. Official documentation for Redis pub/sub.
- BullMQ: queue library for Node.js. Redis-based queue library, ideal for service-to-service communication with retries.
- Redis Streams. Alternative to pub/sub for messaging with persistence and consumer groups.
- Sam Newman: Building Microservices (2nd ed.). General reference for microservice patterns and migration strategies.