From monolith to microservices: how to migrate without breaking everything
A practical guide: I start with a monolith, measure traffic per module by logging it into the database, and extract microservices over Redis only when it actually makes sense.
Why starting with a monolith is fine
The industry has an unconditional love for microservices from day zero. But for most projects, starting with a monolith is the most rational decision: less infrastructure, less operational complexity, faster iterations. There is no point solving scale problems that don't exist yet.
My personal approach: I start monolithic, ship fast, measure real traffic, and only think about splitting when there is concrete evidence of a bottleneck. Premature overengineering in architecture is just as harmful as in code.
- Less initial infrastructure: one process, one deploy, one DB.
- Fast iteration: changes anywhere in the system without coordination across services.
- Simple debugging: the whole stack in one place, linear traces.
- Low operational cost: no extra DevOps or service-to-service infrastructure bills.
Typical monolith architecture
All modules live in the same process and share the same database. One deploy, one instance.
Typical structure of a monolithic API
Even if it runs in a single process, the code is split by domain: each module has its own router, controller, and service. Auth handles everything related to authentication; Products has its own routes, business logic, and DB access. The monolith is a single process, but the internal architecture is modular.
That separation isn't just aesthetic. It's what makes future extraction possible. When the Orders module has clear boundaries and isn't coupled to the rest, it can move into its own service without being rewritten from scratch.
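A rough sketch of what such a boundary looks like in code (module, type, and function names here are illustrative, not taken from a real codebase): each domain exposes its service behind a narrow interface, and the rest of the app depends on the interface rather than reaching into the module's internals or tables.

```typescript
// Illustrative module boundary: the Orders module exposes a narrow
// interface; nothing else imports its internals or touches its tables.
export interface Order {
  id: string;
  userId: string;
  productIds: string[];
}

export interface OrdersService {
  createOrder(userId: string, productIds: string[]): Order;
}

// orders/service.ts -- the only surface the rest of the app may use.
export function makeOrdersService(nextId: () => string): OrdersService {
  return {
    createOrder(userId, productIds) {
      // Business logic stays inside the module; extracting it later
      // means moving this folder, not untangling cross-module calls.
      return { id: nextId(), userId, productIds };
    },
  };
}
```

Because callers only see `OrdersService`, swapping the in-process implementation for an HTTP or queue-backed client later is a local change.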
Request flow inside the monolith
Even in a monolith, each module has its own routing, logic, and data-access layers. Boundaries are clear from the start.
Monitoring traffic by module: the rough-and-ready approach
Before making any architecture decision, I need data. My approach: a simple middleware that intercepts every request and logs the minimum necessary into a table in the same monolith database. No Datadog, no Prometheus. Just SQL and what I already have.
The table has basic fields: module, endpoint, HTTP method, response time in milliseconds, and status code. With that I can answer the questions that matter: which module gets the most load, which has the highest response times, which is growing.
It's deliberately rough because at the initial stage I don't want external dependencies or extra infrastructure. The most useful monitoring tool is the one I already have deployed.
- Table: module, endpoint, method, time_ms, status_code, created_at.
- The middleware registers before the controller and logs when the response finishes: it always logs, even on errors.
- Simple analysis queries: GROUP BY module, AVG(time_ms), COUNT(*).
- No external dependency added: it reuses the DB connection the monolith already has.
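A minimal sketch of that middleware, kept framework-agnostic so the mechanics are visible. Assumptions: in a real app this would be an Express middleware, and `insertRow` would be an `INSERT` into the metrics table; here it is injected so the sketch is self-contained.

```typescript
// Row shape matches the table described above.
export interface RequestLog {
  module: string; // first path segment, e.g. "products"
  endpoint: string;
  method: string;
  time_ms: number;
  status_code: number;
}

export function moduleFromPath(path: string): string {
  return path.split("/").filter(Boolean)[0] ?? "root";
}

// Wraps any handler; insertRow stands in for the DB write, and the
// injectable clock makes the sketch testable.
export function withRequestLog(
  insertRow: (row: RequestLog) => void,
  now: () => number = Date.now,
) {
  return (handler: (method: string, path: string) => number) =>
    (method: string, path: string): number => {
      const start = now();
      let status = 500; // assume failure unless the handler returns
      try {
        status = handler(method, path);
        return status;
      } finally {
        // finally runs even when the handler throws: errors get logged too.
        insertRow({
          module: moduleFromPath(path),
          endpoint: path,
          method,
          time_ms: now() - start,
          status_code: status,
        });
      }
    };
}
```

Analysis then stays in plain SQL against the table, e.g. `SELECT module, COUNT(*), AVG(time_ms) FROM request_log GROUP BY module`.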
Monitoring middleware integrated into the flow
The middleware intercepts every request, logs to the same DB, and the system keeps operating normally. Analysis is done with direct queries.
Identifying the bottleneck
After a few weeks of data, a pattern emerges: one module handles 60% of requests, or its average response time is three times higher than the rest's. That's the first candidate.
But the extraction decision isn't just about volume. What matters is whether that module's load is degrading the experience of the rest. If the Reports module is slow but doesn't interfere with checkout, maybe internal optimization is enough. If it's blocking the event loop and slowing login, then yes, that's a real problem.
- Look for modules with the highest number of requests per minute.
- Identify endpoints with abnormal response times or high variance.
- Evaluate whether a module's load consumes resources that impact others (CPU, DB connections).
- Don't extract just because it has a lot of traffic: extract when it affects the rest of the system.
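The criteria above can be sketched as a small decision function over the aggregated metrics. The thresholds (share of total traffic, latency multiple) and the `degradesOthers` flag are illustrative assumptions, not numbers from a real system.

```typescript
// Aggregated per-module metrics, e.g. from GROUP BY queries on the log table.
export interface ModuleStats {
  module: string;
  requestsPerMin: number;
  avgTimeMs: number;
  degradesOthers: boolean; // e.g. hogs CPU or DB connections
}

export function extractionCandidates(
  stats: ModuleStats[],
  trafficShare = 0.5, // handles more than half of all requests...
  latencyFactor = 3,  // ...or is 3x slower than the other modules' average
): string[] {
  const totalReqs = stats.reduce((s, m) => s + m.requestsPerMin, 0);
  const totalTime = stats.reduce((s, m) => s + m.avgTimeMs, 0);
  return stats
    .filter((m) => {
      const avgOfOthers =
        (totalTime - m.avgTimeMs) / Math.max(stats.length - 1, 1);
      const hot =
        m.requestsPerMin / totalReqs > trafficShare ||
        m.avgTimeMs > latencyFactor * avgOfOthers;
      // High load alone is not enough: it must hurt the rest of the system.
      return hot && m.degradesOthers;
    })
    .map((m) => m.module);
}
```

Note the second condition: a slow but isolated module (the Reports case above) never makes the list, no matter how bad its numbers look.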
Decision tree to identify extraction candidates
Extraction is not automatic. High load only justifies extraction if it impacts the performance of the rest of the system.
Extracting the first microservice
Once the candidate is identified, extraction begins. Since the module already has clear boundaries inside the monolith, most of the work is moving code, not rewriting it. I create a new service with its own repo, its own database, and its own deployment process.
The data strategy is the most delicate part. The extracted module needs its own DB, but the historical data lives in the monolith DB. Depending on the case, it can be a full migration, a period of double-writing, or simply that the new service starts clean and the monolith remains the source of truth for old data.
While the microservice stabilizes, the monolith can act as a proxy: it receives requests for the extracted module and forwards them to the new service. This allows fast rollback if something goes wrong.
- Create the new service with its own database from day one.
- Move the existing logic. Don't rewrite from scratch if it already works.
- Define the data migration strategy before the cutover.
- The monolith stops handling that domain directly once the service is stable.
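The proxy phase reduces to a routing decision plus a kill switch. A hedged sketch, with the service URL, module name, and feature flag all illustrative:

```typescript
export interface ProxyConfig {
  extractedModule: string; // e.g. "orders"
  serviceUrl: string;      // e.g. "http://orders-svc:3001"
  cutoverEnabled: boolean; // flip to false for instant rollback
}

// Decide, per request, whether to handle in-process or forward to the
// extracted service. In the real monolith this sits in front of the
// extracted module's old routes.
export function routeRequest(
  path: string,
  cfg: ProxyConfig,
): { target: "monolith" } | { target: "service"; url: string } {
  const module = path.split("/").filter(Boolean)[0] ?? "";
  if (module === cfg.extractedModule && cfg.cutoverEnabled) {
    // Forward, preserving the rest of the path.
    return { target: "service", url: cfg.serviceUrl + path };
  }
  // Everything else, or a rollback, stays in-process.
  return { target: "monolith" };
}
```

Because rollback is a config flip rather than a deploy, a misbehaving new service costs minutes, not an incident.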
Architecture during extraction of the first module
The extracted module lives in its own process with its own DB. During the transition, the monolith can act as a proxy before the final cutover.
Redis as the communication layer
The extracted microservice needs to communicate with the monolith. My default choice is Redis, specifically pub/sub for asynchronous events and BullMQ for tasks that need persistence and retries. Why Redis? Because I already use it for cache in almost every project, it's fast, and it handles both fire-and-forget events and more structured communication well.
The basic pattern: when something relevant happens in the monolith (for example, an order is created), it publishes an event to Redis. The corresponding microservice subscribes to that channel and processes it independently. One caveat: plain pub/sub is fire-and-forget, so if the subscriber is down the message is lost. When an event must survive downtime, it goes through a BullMQ queue instead: jobs accumulate in Redis and the worker drains them when it comes back.
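A minimal sketch of that publish/subscribe flow, with an in-memory bus standing in for Redis so the example is self-contained. With ioredis the equivalent calls are `publisher.publish(channel, payload)` and `subscriber.subscribe(channel)` plus a `"message"` listener; the channel and event names here are illustrative.

```typescript
type Handler = (payload: string) => void;

// In-memory stand-in for Redis pub/sub, matching its key property:
// delivery is fire-and-forget, nothing is stored.
export class InMemoryBus {
  private handlers = new Map<string, Handler[]>();

  subscribe(channel: string, handler: Handler): void {
    const list = this.handlers.get(channel) ?? [];
    list.push(handler);
    this.handlers.set(channel, list);
  }

  publish(channel: string, payload: string): void {
    // No subscriber at publish time means no delivery, as with Redis.
    for (const h of this.handlers.get(channel) ?? []) h(payload);
  }
}

// Monolith side: emit the event and move on.
export function onOrderCreated(bus: InMemoryBus, orderId: string): void {
  bus.publish("orders.created", JSON.stringify({ orderId }));
}
```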
I reserve direct HTTP between services for cases where I need an immediate synchronous response. For everything else, the Redis bus gives decoupling and naturally absorbs load spikes.
A concrete case where BullMQ shines is heavy async processing: AI content generation, video, images, deep research. Instead of the main app running those tasks and blocking its own threads, it queues them in Redis and dedicated workers process them independently.
- Redis pub/sub: asynchronous events where I don't need an immediate response.
- BullMQ (queue on top of Redis): tasks with automatic retries, delay, and persistence.
- HTTP between services: only when I need a synchronous response and it can't be async.
- Redis as a bus decouples services: each one runs at its own pace.
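What BullMQ adds over bare pub/sub is the part worth sketching: jobs persist until processed and failures are retried. BullMQ's real API is roughly `queue.add(name, data, { attempts })` plus `new Worker(name, processor, { connection })` on top of Redis; the tiny in-memory queue below only demonstrates that retry behavior, and the job names and retry limit are illustrative.

```typescript
interface JobRecord<T> {
  data: T;
  attemptsMade: number;
}

export class RetryQueue<T> {
  private jobs: JobRecord<T>[] = [];

  constructor(private maxAttempts = 3) {}

  add(data: T): void {
    this.jobs.push({ data, attemptsMade: 0 });
  }

  // Drain the queue with a worker function. Failed jobs are requeued
  // until maxAttempts, then surfaced as failures (BullMQ moves them to
  // a "failed" set instead of dropping them).
  drain(worker: (data: T) => void): { completed: T[]; failed: T[] } {
    const completed: T[] = [];
    const failed: T[] = [];
    while (this.jobs.length > 0) {
      const job = this.jobs.shift()!;
      try {
        worker(job.data);
        completed.push(job.data);
      } catch {
        job.attemptsMade += 1;
        if (job.attemptsMade < this.maxAttempts) this.jobs.push(job);
        else failed.push(job.data);
      }
    }
    return { completed, failed };
  }
}
```

This is exactly the shape of the heavy-async case above: the app calls `add` and returns immediately, while dedicated workers drain the queue at their own pace.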
Redis as a message bus between services
Redis handles asynchronous events. Direct HTTP only appears when the result is immediately required to continue the flow.
The progressive migration pattern
Extraction is not a big-bang event where one day you switch from monolith to microservices. It's a cycle: I monitor, detect the highest-impact module, extract it, stabilize it, and then monitor again. Each cycle leaves the monolith a little lighter.
The monolith rarely disappears completely, and that's fine. Modules with low traffic and no bottleneck behavior don't justify the operational complexity of becoming independent services. Over time, the monolith becomes the core that handles the less demanding domains, surrounded by specialized microservices for the areas that generate the most load.
Progressive migration cycle
Each iteration reduces monolith load and adds an independent service. No big bang: every extraction is small and controlled.
Final architecture: reduced monolith + microservices
After several extraction cycles, the result is a smaller monolith handling the core domains, plus a few specialized microservices communicating through Redis. Each service has its own database, its own deployment pipeline, and can scale independently.
This is not Netflix architecture with 500 services. It's something pragmatic: 3 to 5 services extracted based on real evidence, with the rest of the logic still living in a monolith that is now faster because it has fewer responsibilities.
Final architecture: core monolith + microservices via Redis
Final result: reduced monolith as the core, specialized microservices communicating via Redis, each with its own database.
Comparison: monolith vs microservices vs progressive migration
None of these three approaches is universally correct. The table shows the real tradeoffs depending on where the project and the team stand.
| Aspect | Pure monolith | Microservices from day 1 | Progressive migration |
|---|---|---|---|
| Initial complexity | Low | High | Low (starts as a monolith) |
| Infrastructure cost | Low | High from the start | Grows with real demand |
| Iteration speed | High at first, low when scaling | Low at first due to overhead | Always high, extract when needed |
| Debugging | Simple (one process) | Complex (distributed system) | Mixed, grows gradually |
| Scalability | Limited (everything scales together) | Granular from the start | Granular where needed |
| Risk of overengineering | Low | High | Low (based on real data) |
When NOT to do this
This approach is not for every project. If the monolith works well and response times are acceptable, don't touch anything. Adding the operational complexity of microservices to a system that has no scale problems is pure overengineering.
It also makes no sense to extract without data. The whole approach is based on measuring first and deciding later. Without monitoring, any architecture decision is a guess. And guesses in architecture are expensive.
- If the monolith shows no production performance problems: don't extract.
- If the team is small: the operational complexity of multiple services may be worse than the original problem.
- If you don't have monitoring implemented: measure first, decide later.
- If the business doesn't justify it: don't optimize infrastructure that isn't causing a real problem.
Checklist before starting extraction
Before starting any extraction, check these three points. If any of them fails, there is a less costly alternative.
Sources
- Martin Fowler: MonolithFirst. The classic argument for starting with a monolith before distributing.
- Strangler Fig Pattern (Martin Fowler). The gradual migration pattern that inspires this progressive extraction approach.
- Redis Pub/Sub documentation. Official documentation for Redis pub/sub.
- BullMQ: queue library for Node.js. Redis-based queue library, ideal for service-to-service communication with retries.
- Redis Streams. Alternative to pub/sub for messaging with persistence and consumer groups.
- Sam Newman: Building Microservices (2nd ed.). General reference for microservice patterns and migration strategies.