Start Boring, Scale Deliberately
The microservices versus monolith debate has been going on for years. What strikes me is how often teams are arguing about the wrong thing, or more precisely, about three different things at once.
This is an opinionated take. I am going to tell you what I think the smart default is, when to break from it, and why the current AI moment does not overturn the fundamentals. If anything, it makes those fundamentals matter more, because we are moving faster.
The world has changed. Good design has not. #
AI coding agents are now writing production code in real systems. They are good enough to matter, but they still depend on the same thing human engineers do: visibility. They need clear structure, legible contracts, and a codebase they can navigate without guessing across hidden seams.
That is not a new principle. It is the old one. Good software design makes systems legible to the people working in them, and now to the tools as well.
Good design still works.
The smart default #
If you asked me what to start with, knowing nothing about your particular system or team, I would say: a monolith in a monorepo.
A monorepo means your agents and your humans are looking at the same picture. Shared types stay visible. Contracts are easier to inspect. Refactors can happen in one pull request across a domain boundary. Change management gets simpler because the full scope of a change is visible in one place.
A monolith means a function call where you might otherwise have a network hop. It gives you one deployment surface. When something breaks, the question “what does this change affect?” can still be answered by reading a single pull request, not by tracing the change across services.
Starting here is not naive. It is the choice that keeps your options open. A monolith can be made modular, and that matters more than people sometimes admit.
The forces that push teams too early #
Three pressures regularly push teams toward extraction before they are ready.
Conway’s Law. Your communication structure becomes your system architecture. Not metaphorically. Literally. If you want to know what your microservices graph will look like, look at your org chart. Siloed teams build hard walls. If you are introducing microservices to solve an organizational problem, you may need to reorganize first. The code is downstream of the people.
Legacy headwind. Every historical decision becomes a current pushing against you. When teams feel that resistance, they often want to start something clean beside the old system. Usually they are not escaping the old system. They are duplicating it, and now they have two systems that need to talk to each other.
The greenfield pull. “What if we just extracted this one piece and did it right?” That question has launched a lot of services that made perfect sense on a whiteboard and got complicated the moment they needed data from somewhere else. The new thing feels clean until it has to talk to the old thing.
None of these pressures are imaginary. But they are not physics. They are organizational and emotional pressures, and they are worth naming before you treat them as technical requirements.
Three questions teams conflate #
This is the core mistake. Teams talk about architecture as if it were one decision when it is actually three.
Where does your code live? Monorepo versus polyrepo. This is a collaboration and context decision. It is about how your agents, your engineers, and your tools navigate the codebase.
How is your code deployed? Monolith versus microservices. This is a deployment and scaling decision. It is about operational topology, not design.
Where are your domain boundaries? This is a design decision. It has almost nothing to do with deployment. You can define and enforce boundaries inside a monolith, and you should, long before you change deployment topology.
These are separate questions. If you conflate them, you start fighting about deployment when you really have a design problem, or re-architecting your codebase when what you actually need is a different team structure.
You do not need a network call to draw a boundary #
Separate deployment is the strongest technical enforcement of a service boundary. You literally cannot cross it without a network call. That benefit is real. It imposes discipline without relying on convention.
It is also the most expensive enforcement mechanism you have.
You can approximate much of the same discipline with softer tools:
- package structure that enforces isolation at the module level
- ownership files such as CODEOWNERS that make accountability explicit
- import constraints that define which modules may depend on which
- code review conventions that reinforce domain boundaries
The principle is simple: match the strength of your enforcement to the problem you actually have. Before you pay the deployment tax, ask whether these tools get you most of the value.
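To make the import-constraint idea concrete, here is a minimal sketch written as an ordinary Go test. The modules/ layout, the example.com/app module path, and the rule that cross-module imports go through an api package are assumptions for illustration, not a prescription.

```go
// boundaries_test.go — fails the build when one module reaches into another
// module's internals instead of going through its public api package.
package boundaries

import (
	"go/parser"
	"go/token"
	"io/fs"
	"path/filepath"
	"strings"
	"testing"
)

// Hypothetical layout: every domain module lives under modules/<name>,
// and its only public surface is modules/<name>/api.
const modulePrefix = "example.com/app/modules/"

func TestModuleBoundaries(t *testing.T) {
	err := filepath.WalkDir("modules", func(path string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil || d.IsDir() || !strings.HasSuffix(path, ".go") {
			return walkErr
		}
		// The module this file belongs to, e.g. "billing" for modules/billing/charge.go.
		owner := strings.Split(filepath.ToSlash(path), "/")[1]

		f, err := parser.ParseFile(token.NewFileSet(), path, nil, parser.ImportsOnly)
		if err != nil {
			return err
		}
		for _, imp := range f.Imports {
			target := strings.Trim(imp.Path.Value, `"`)
			if !strings.HasPrefix(target, modulePrefix) {
				continue // stdlib, third-party, or shared code: out of scope here
			}
			parts := strings.Split(strings.TrimPrefix(target, modulePrefix), "/")
			sameModule := parts[0] == owner
			viaAPI := len(parts) >= 2 && parts[1] == "api"
			if !sameModule && !viaAPI {
				t.Errorf("%s imports %s: cross-module imports must go through the api package", path, target)
			}
		}
		return nil
	})
	if err != nil {
		t.Fatal(err)
	}
}
```

Run as part of `go test ./...`, a check like this turns the convention into a failing build, which is a large share of what a network boundary buys you in enforcement.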
The modular monolith is the missing middle ground #
A modular monolith deploys as a single unit but enforces hard domain boundaries internally. Each module owns its own code, data, and public interface. Nothing reaches inside. You get the operational simplicity of a monolith with much of the design discipline of microservices, and when a module genuinely needs to be extracted later, the boundary is already clean.
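As a rough sketch of the shape, with hypothetical billing and orders modules: each module's public surface is a small interface, and the monolith wires modules together in one process, so a call across a domain boundary is still just a function call.

```go
package main

import (
	"context"
	"fmt"
)

// Billing is the billing module's public surface. In a modular monolith,
// everything else in billing stays behind this interface (in Go, typically
// under an internal/ directory).
type Billing interface {
	Charge(ctx context.Context, orderID string, cents int64) error
}

// inProcessBilling is the current implementation: a plain struct,
// reached by an ordinary function call.
type inProcessBilling struct{}

func (inProcessBilling) Charge(_ context.Context, orderID string, cents int64) error {
	fmt.Printf("charged %d cents for order %s\n", cents, orderID)
	return nil
}

// OrderService lives in the orders module and depends on billing only
// through the interface, never on its internals.
type OrderService struct {
	billing Billing
}

func (o *OrderService) Checkout(ctx context.Context, orderID string) error {
	// ...order-specific work elided...
	return o.billing.Charge(ctx, orderID, 4999)
}

func main() {
	// The monolith wires modules together in one process.
	orders := &OrderService{billing: inProcessBilling{}}
	_ = orders.Checkout(context.Background(), "order-42")
}
```

If billing ever earns an extraction, only the wiring in main changes to a client that satisfies the same interface; the call sites in orders do not.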
Most teams skip this option entirely, which is a shame, because some of the most successful engineering organizations at scale are running exactly this architecture.
Shopify runs one of the largest Rails codebases on earth and handles Black Friday at global scale. They looked hard at microservices and chose not to go there. Instead, they built a modular monolith and created Packwerk to enforce module boundaries inside it. When the discipline is not strong enough in the culture alone, you build it into the tooling.
Stripe is processing enormous payment volume on a huge Ruby codebase and is still largely monolithic.
Uber went the other direction. They scaled to thousands of microservices, and their own engineers started calling the result the Death Star: an architecture so difficult to navigate that meaningful work required understanding a web of dependencies no single engineer could hold in their head. They have since moved toward what they call macroservices, which are larger services with fewer and cleaner seams.
Martin Fowler put it plainly: do not start a new project with microservices, even if you are confident the application will eventually be large enough to justify them.
There are strong tools supporting this middle path across ecosystems: Packwerk for Ruby, Spring Modulith for Java, ABP Framework for .NET, and Google’s Service Weaver for Go. Service Weaver is especially interesting because it lets you develop as a modular monolith and deploy as microservices without changing the application code.
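As a hedged sketch of what that looks like, modeled on Service Weaver's own getting-started examples (the Billing component here is mine, and the exact API may differ between releases): components are plain Go interfaces, and the deployer decides whether a call crosses a process boundary.

```go
// A sketch in the spirit of Service Weaver's tutorial; it needs
// `weaver generate ./...` before it will run, and Billing is hypothetical.
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/ServiceWeaver/weaver"
)

// Billing is a component: an interface plus an implementation.
type Billing interface {
	Charge(ctx context.Context, orderID string, cents int64) error
}

type billing struct {
	weaver.Implements[Billing]
}

func (b *billing) Charge(_ context.Context, orderID string, cents int64) error {
	fmt.Printf("charged %d cents for %s\n", cents, orderID)
	return nil
}

// app is the main component; it depends on Billing only by reference.
type app struct {
	weaver.Implements[weaver.Main]
	billing weaver.Ref[Billing]
}

func serve(ctx context.Context, a *app) error {
	// A method call when deployed as a single process, an RPC when the
	// deployer places Billing elsewhere; the application code is the same.
	return a.billing.Get().Charge(ctx, "order-42", 4999)
}

func main() {
	if err := weaver.Run(context.Background(), serve); err != nil {
		log.Fatal(err)
	}
}
```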
When you earn the extraction #
Microservices make sense when part of the system has materially different operating constraints.
Not when something merely feels large. Not when a team wants autonomy. When the component has genuinely different physics.
If your image processing pipeline needs 40 GPU instances while the rest of the application runs on two small general-purpose instances, that is different physics. Extract it.
If one component carries a 99.99 percent availability SLA while the rest of the system can tolerate degraded operation, that seam is real. The operating constraints demand a hard wall.
That is surgery with a clear indication, not architecture for architecture’s sake.
The distributed systems tax #
Every extraction is a CAP theorem decision made at the architecture level. The moment you put a network between two components, partition tolerance becomes a constraint you now have to manage. Whether you intended to or not, you are trading between consistency and availability.
The bill includes:
- network latency where there used to be a function call
- distributed failure modes such as partial outages, retry storms, and cascading failures
- versioned contracts that must survive independent release cycles
- harder observability across process boundaries
- reduced context for AI tooling at every service edge
Know what you are paying before you sign.
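To make the first two line items concrete, here is a sketch of what the in-process billing call from earlier becomes once it sits behind a network: a timeout, bounded retries with backoff, and a new ambiguity about whether the charge actually landed. The endpoint and payload are hypothetical.

```go
// What was a one-line in-process call now needs a timeout, bounded retries,
// and backoff, and it still leaves the hard question open: if an attempt
// times out, did the charge land anyway?
package main

import (
	"bytes"
	"context"
	"fmt"
	"net/http"
	"time"
)

const billingURL = "http://billing.internal/charge" // hypothetical service endpoint

func chargeOverNetwork(ctx context.Context, payload []byte) error {
	backoff := 100 * time.Millisecond
	var lastErr error
	for attempt := 1; attempt <= 3; attempt++ { // bounded retries, so a bad day does not become a retry storm
		reqCtx, cancel := context.WithTimeout(ctx, 2*time.Second)
		req, err := http.NewRequestWithContext(reqCtx, http.MethodPost, billingURL, bytes.NewReader(payload))
		if err != nil {
			cancel()
			return err
		}
		resp, err := http.DefaultClient.Do(req)
		cancel()
		switch {
		case err != nil: // timeout, partition, DNS failure: the new failure modes
			lastErr = fmt.Errorf("attempt %d: %w", attempt, err)
		case resp.StatusCode >= 500: // the other side is having a partial outage
			resp.Body.Close()
			lastErr = fmt.Errorf("attempt %d: server error %s", attempt, resp.Status)
		case resp.StatusCode >= 400: // a contract problem; retrying will not help
			resp.Body.Close()
			return fmt.Errorf("charge rejected: %s", resp.Status)
		default:
			resp.Body.Close()
			return nil
		}
		time.Sleep(backoff)
		backoff *= 2
	}
	return lastErr
}

func main() {
	// In the monolith this was a single function call on the billing module.
	err := chargeOverNetwork(context.Background(), []byte(`{"order":"order-42","cents":4999}`))
	if err != nil {
		fmt.Println("charge failed:", err)
	}
}
```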
Navigate the current. Do not fight it. #
The paddler is a useful metaphor here.
A good paddler does not fight every current. They read the water, use force deliberately, and correct only when the physics demand it. The monolith is the boat. Your domain boundaries are your read of the current. Extraction, when it happens, is a correction maneuver. It should be precise, intentional, and earned.
The take #
Microservices solve scaling problems. Monorepos solve collaboration problems. Service boundaries solve design problems. These are different problems. Know which one you actually have.
Start boring. Add complexity when the math forces your hand.
Questions worth asking #
Should the CTO be making this decision? Or is this usually an engineering leadership call that technical executives too often pull upward? What is the right level of the organization to own architecture decisions, and does escalating them to the CTO actually slow teams down?
Is your architecture driven by scaling physics, or by your org chart? Conway’s Law cuts both ways. If your microservices graph looks like your reporting structure, the real question may not be whether to extract more services. It may be whether your teams are organized well.
What is your signal that you have extracted too many services? Most teams have a signal for when to break something out. Almost none have a defined signal for when they have gone too far.
If you were hiring an AI coding agent as a full-time team member tomorrow, would you make the same architectural choices you made three years ago? Agents need context. Every service boundary is a wall they have to reason across.
Extraction gets celebrated because it looks like progress. Consolidation rarely does, even when it is harder and more valuable work. That should make us suspicious of how we measure architectural health.
Nathan Feger is a fractional CTO and engineering leadership coach at Nate-Land Studios. The slides for this talk and other writing are at nate-land.com/talks.