
Our Entire SaaS Is One 13,870-Line File. We Tried Splitting It. We Put It Back.


I can already hear it.

"13,870 lines in one file? That's unmaintainable. That's a code smell. That's tech debt."

Maybe.

But here's what it also is: a micro-SaaS that hosts 99 AI-powered apps, processes real-time billing, manages per-user SQLite databases, handles SSE payment streams, and serves thousands of requests per day — all from a single file running on a single server.

It deploys in 8 seconds. It has zero inter-service communication bugs. Zero distributed transaction failures. Zero service mesh configuration files. Zero Kubernetes manifests.

The question isn't "why didn't you split it?" The question is "why would you?"


I – What's Actually in 13,870 Lines

Let me be clear about what this isn't. It's not 13,870 lines of spaghetti. It's organized into clearly delineated sections with consistent structure.

Configuration and constants. Type definitions. Database utilities. Cryptographic secret derivation. Principal resolution middleware. Authentication. The billing engine. The AI generation pipeline. Key-value store API. App lifecycle management. Asset serving. Server-Sent Event streaming. Rate limiting and quotas. Admin and monitoring endpoints. Webhook handling. Health checks and diagnostics. Route registration. Error handling and logging. Server bootstrap.

Each section has a clear responsibility. Each has well-defined boundaries. They're just all in the same file.
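To make that concrete, here's a minimal sketch of the skeleton. The banner style and every name in it are illustrative, not lifted from the real file:

```ts
// server.ts: one file, sectioned with banner comments (illustrative skeleton)
import { Database } from "bun:sqlite";

// ============================================================
// 1. Configuration and constants
// ============================================================
const CONFIG = {
  port: Number(process.env.PORT ?? 3000),
  dataDir: process.env.DATA_DIR ?? "./data",
};

// ============================================================
// 2. Type definitions
// ============================================================
interface Principal {
  userId: string;
  appId: string;
}

// ============================================================
// 3. Database utilities
// ============================================================
function openUserDb(userId: string): Database {
  // One SQLite database per user, as described above.
  return new Database(`${CONFIG.dataDir}/${userId}.sqlite`);
}

// ... sections 4 through N-1: billing, AI pipeline, SSE, routes ...

// ============================================================
// N. Server bootstrap
// ============================================================
Bun.serve({
  port: CONFIG.port,
  fetch(req: Request): Response {
    return new Response(`hello from one file (${req.url})`);
  },
});
```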

"But Antonio, you could put each of those sections in its own file!"

Yes. And we tried that.

Here's why we came back.


II – The Extraction Experiment

Six months ago, I did what every developer's instinct tells them to do. I extracted each section into its own module. Fourteen files across five directories with a clean separation of concerns that would look beautiful in any architecture diagram.

It looked gorgeous in the file tree. It followed every best practice in every architecture book.

Within two weeks, I reverted the entire thing.


III – Circular Dependencies Killed It

The billing engine needs the principal to know which database to open. The principal middleware needs the billing engine to check if the app has credits. The AI pipeline needs the billing engine to hold credits before generation. The billing engine needs the AI pipeline to confirm credits after generation.

In a single file, these are function calls. Direct. Obvious. Traceable.

In separate modules, they become circular imports that you have to break with careful dependency injection, barrel files, or restructuring, all of which make the code harder to understand, not easier.

I solved the circular dependencies using dependency injection — each module taking its dependencies as constructor parameters. This is architecturally cleaner on paper. In practice, it added roughly 300 lines of wiring code and made it significantly harder to trace what calls what.
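Here's roughly what that wiring looked like, with hypothetical module and method names. Setter injection is a standard way to break a constructor cycle, and it works. It just adds ceremony to what used to be two functions calling each other:

```ts
// Illustrative sketch: module names and shapes are hypothetical,
// but this is the pattern the extraction forced on us.

interface PrincipalPort {
  dbPathFor(userId: string): string;
}

// billing.ts: needs principals to know which database to open.
class BillingEngine {
  private principals!: PrincipalPort; // back-patched after construction
  setPrincipals(p: PrincipalPort) {
    this.principals = p;
  }
  checkCredits(appId: string): boolean {
    // would open this.principals.dbPathFor(...) and read a balance
    return appId.length > 0;
  }
}

// principal.ts: needs billing to check credits during resolution.
class PrincipalResolver implements PrincipalPort {
  constructor(private readonly billing: BillingEngine) {}
  dbPathFor(userId: string): string {
    return `./data/${userId}.sqlite`;
  }
  resolve(userId: string, appId: string) {
    if (!this.billing.checkCredits(appId)) throw new Error("out of credits");
    return { userId, appId };
  }
}

// index.ts: wiring the cycle is a construct-then-patch dance, repeated
// for every coupled pair. In the single file none of this exists;
// checkCredits() and resolve() simply call each other.
const billing = new BillingEngine();
const resolver = new PrincipalResolver(billing);
billing.setPrincipals(resolver);
```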

When debugging a billing issue at 2 AM, I don't want to grep across 14 files. I want to search within one file and see the entire call chain directly.


IV – Shared State Became a Headache

Several parts of the system share state. The database connection pool. The rate limiter counters. The SSE client registry.

In a single file, these are module-level variables that any function can access directly. Clear, simple, visible.

In separate modules, you need to pass them around or create shared singletons. Each piece of shared state requires a manager class, an export, an import, and wiring code in the entry point.

More files. More indirection. More places where a bug can hide.
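A hedged sketch of the difference, with hypothetical names. The rate limiter stands in for all three pieces of shared state:

```ts
// Single file: shared state is a handful of module-level variables
// that every section can see directly. (Illustrative names.)
const rateCounters = new Map<string, number>(); // rate limiter counters
const sseClients = new Set<ReadableStreamDefaultController>(); // SSE registry

function checkRateLimit(key: string, max: number): boolean {
  const count = (rateCounters.get(key) ?? 0) + 1;
  rateCounters.set(key, count);
  return count <= max;
}

// Split version: the same counter grows a manager class, an export,
// an import at every call site, and wiring in the entry point.
export class RateLimiter {
  private counters = new Map<string, number>();
  check(key: string, max: number): boolean {
    const count = (this.counters.get(key) ?? 0) + 1;
    this.counters.set(key, count);
    return count <= max;
  }
}
// index.ts: const limiter = new RateLimiter(); then pass it to
// every module that needs it.
```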

And the total line count actually increased — from 13,870 to about 14,400 — because of all the boilerplate required to wire dependencies together. The extraction made the codebase larger, not smaller.


V – IDE Navigation Got Worse

This one surprised me.

With a single file, I can find any function, type, or constant with a single keystroke. I can scroll to see related code. I can use code folding to collapse sections I'm not working on. I can see the entire execution flow without jumping between tabs.

With 14 files, I was constantly clicking through function definitions, losing context as I jumped between tabs, and falling back to global search because local search only covered the current file.

Modern editors handle large files just fine. Syntax highlighting, IntelliSense, type checking — all work identically on a 1,000-line file and a 14,000-line file. The "large file is hard to navigate" argument assumes tools from 2005.


VI – The Steelman Case for Microservices

Let me be fair to the other side. There are legitimate reasons to decompose a system.

Independent scaling — different components need different resource profiles. Independent deployment — different teams ship different components on different schedules. Technology heterogeneity — different components benefit from different languages. Fault isolation — a crash in one component shouldn't bring down others. Organizational alignment — teams own services, not functions.

Let me evaluate each one honestly for our situation.

Independent scaling? Our components don't have different scaling profiles. The AI pipeline is the most CPU-intensive, but it's also I/O-bound waiting for LLM API responses. The single process handles everything at about 15 percent CPU utilization at peak. Premature scaling optimization is just premature optimization with a fancier name.

Independent deployment? There is one developer. Me. I deploy the entire system when I push to main. The deploy takes 8 seconds. There is no deployment contention because there is no team.

Technology heterogeneity? Everything is TypeScript on Bun. We don't need Python for AI — we call an external LLM API. There is no polyglot pressure.

Fault isolation? If the billing engine throws an unhandled exception, the error handler catches it, returns a 500 to that specific request, and every other request continues normally. The process doesn't crash. The only scenario where a monolith fails and microservices wouldn't is a process-level crash. In 8 months of production, we've had zero.
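The mechanism is nothing exotic: one try/catch at the top of the request handler. A minimal sketch, assuming Bun.serve and a hypothetical route() dispatcher:

```ts
// A minimal sketch of that containment; route() is a hypothetical
// stand-in for the real dispatcher.
Bun.serve({
  port: 3000,
  async fetch(req: Request): Promise<Response> {
    try {
      return await route(req); // billing, AI pipeline, KV: all inside
    } catch (err) {
      console.error("request failed:", err);
      // This request gets a 500. Every other in-flight request,
      // and the process itself, carries on untouched.
      return new Response("internal error", { status: 500 });
    }
  },
});

async function route(_req: Request): Promise<Response> {
  throw new Error("billing engine threw"); // simulate the failure
}
```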

Organizational alignment? One developer. One service. Perfect alignment.

At every decision point, the monolith wins for our current situation.


VII – Where the Seams Are

If I were forced to split — new team member, sudden scale requirement, acquisition — here's exactly where I'd cut.

The AI generation pipeline is the most natural extraction point. About 2,400 lines. Simple input — a prompt. Simple output — generated files. It's purely I/O-bound. The only dependencies are the filesystem write and the SSE broadcast, both of which could be replaced with a message queue.

But extracting it today would require a message broker for event streaming. That's an additional infrastructure dependency and a new failure mode. Currently, the SSE stream goes directly from the LLM API response to the user's browser with no intermediate hop. Making it a service would mean five steps instead of one, five failure points instead of one, and the first-chunk latency would double.
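The direct hop is small enough to show in full. A hedged sketch; the LLM endpoint, the env var, and the frame format are all stand-ins, not our actual integration:

```ts
// Upstream LLM chunks are re-framed as SSE events and flushed
// straight to the browser. One transform, zero queues.
async function streamGeneration(prompt: string): Promise<Response> {
  const upstream = await fetch("https://llm.example.com/v1/generate", {
    method: "POST",
    headers: {
      "content-type": "application/json",
      authorization: `Bearer ${process.env.LLM_API_KEY}`,
    },
    body: JSON.stringify({ prompt, stream: true }),
  });

  const encoder = new TextEncoder();
  const decoder = new TextDecoder();

  const sse = new ReadableStream<Uint8Array>({
    async start(controller) {
      for await (const chunk of upstream.body!) {
        const text = decoder.decode(chunk, { stream: true });
        controller.enqueue(encoder.encode(`data: ${JSON.stringify(text)}\n\n`));
      }
      controller.close();
    },
  });

  return new Response(sse, {
    headers: {
      "content-type": "text/event-stream",
      "cache-control": "no-cache",
    },
  });
}
```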

The billing engine is the second natural seam. About 1,100 lines. A well-defined API with four operations — hold, confirm, release, get balance. Essentially a finite state machine.
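A sketch of that state machine, with illustrative types and names. The in-memory maps stand in for whatever storage actually backs it:

```ts
// A hold is either confirmed (the charge sticks) or released
// (the credits go back). No other transitions are legal.
type HoldState = "held" | "confirmed" | "released";

interface CreditHold {
  id: string;
  appId: string;
  amount: number;
  state: HoldState;
}

class BillingEngine {
  private balances = new Map<string, number>();
  private holds = new Map<string, CreditHold>();

  getBalance(appId: string): number {
    return this.balances.get(appId) ?? 0;
  }

  // held: credits are earmarked but not yet spent.
  hold(appId: string, amount: number): CreditHold {
    if (this.getBalance(appId) < amount) throw new Error("insufficient credits");
    this.balances.set(appId, this.getBalance(appId) - amount);
    const h: CreditHold = { id: crypto.randomUUID(), appId, amount, state: "held" };
    this.holds.set(h.id, h);
    return h;
  }

  // held -> confirmed: the charge is final.
  confirm(holdId: string): void {
    this.transition(holdId, "confirmed");
  }

  // held -> released: the credits go back.
  release(holdId: string): void {
    const h = this.transition(holdId, "released");
    this.balances.set(h.appId, this.getBalance(h.appId) + h.amount);
  }

  private transition(holdId: string, to: HoldState): CreditHold {
    const h = this.holds.get(holdId);
    if (!h || h.state !== "held") throw new Error("invalid hold transition");
    h.state = to;
    return h;
  }
}
```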

But billing is on the critical path of every operation that costs credits. In the monolith, it's a function call taking 0.1 milliseconds that never fails due to network issues. As a service, it'd be an HTTP call taking 2 milliseconds at best, plus retry logic, circuit breakers, and a fallback strategy for when billing is unreachable. The reliability cost of extraction exceeds the organizational benefit for a team of one.

App lifecycle management is the third seam. About 920 lines. Clean CRUD operations with clear boundaries. But it touches the filesystem directly. As a separate service, it would need shared filesystem access through NFS or an object store — adding latency and another failure mode.

The seams are there. The extraction points are documented. But choosing not to cut is itself an architectural decision.


If you're grappling with the monolith vs. microservice question — or any architectural trade-off where the "right" answer depends on your specific situation — I walk through these decisions regularly in mentoring sessions at mentoring.oakoliver.com.


VIII – The One-File Debugging Advantage

Here's a practical benefit that architecture books never mention.

When everything is in one file, every debugging session starts the same way. Open the file. Search for the error message or function name. Read the surrounding code. There is no step four.

Compare that to a microservice architecture. Check which service logged the error. Find the service's repository. Open the relevant file — which one? Check the stack trace. Realize the error originated in a different service. Check inter-service communication logs. Find the originating service. Open that file. Realize the data came from a third service.

I'm exaggerating, but only slightly.

Distributed debugging is exponentially harder than monolith debugging. Network calls fail silently. Timeouts mask root causes. Partial failures create states that no single service's logs fully explain.

In our monolith, there are no partial failures. Either the request succeeds end-to-end, or it fails end-to-end, and the error is in one stack trace in one log file.


IX – Performance of a Large File

"But doesn't a 13,870-line file hurt performance?"

No.

Bun parses and loads the entire file in 47 milliseconds. This happens once, at startup. After that, all function calls are within already-loaded code with zero module resolution overhead.

The multi-file version actually loaded slower, about 112 milliseconds total, because of module resolution and separate filesystem reads across 14 files.
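Numbers like these are cheap to sanity-check on your own codebase. A minimal sketch, using a hypothetical ./server.ts path:

```ts
// bench.ts: time the cold load of the entry module.
const t0 = performance.now();
await import("./server.ts"); // the whole 13,870-line file
console.log(`cold load: ${(performance.now() - t0).toFixed(1)} ms`);
```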

At runtime, the JavaScript engine compiles functions lazily. A large file doesn't compile all its code at startup. It compiles functions when they're first called. Unused code paths are never compiled. Memory at idle is 52 megabytes regardless of whether the code is in one file or fourteen.

Function calls within a single file are identical in performance to function calls across modules. The engine optimizes call sites the same way regardless of module boundaries. There is zero runtime performance difference.


X – When a Monolith Becomes Wrong

I'm not arguing that monoliths are always better. Here are the concrete signals that would tell me it's time to split.

A second developer joins and we're stepping on each other's changes. Frequent edits to the same file mean constant merge conflicts. Splitting along the seams reduces friction.

One component needs a fundamentally different runtime. If we added ML model serving that required Python, that component would need its own process.

One component's failure shouldn't affect others. If the AI pipeline starts crashing due to LLM API changes and we want billing and key-value operations to keep working independently, extraction provides blast radius control.

One component needs to scale independently. If AI generation volume grows 100x while key-value reads stay constant, extracting the pipeline lets us scale it separately.

The file exceeds our ability to understand it. If I can no longer hold the entire system in my head — if I need a wiki to remember how billing interacts with event streaming — it's time to split for cognitive reasons.

None of these conditions are currently true. The system fits in one developer's head, runs on one server, and serves all traffic comfortably.


XI – The Counter-Intuitive Conclusion

Here's what I've learned from running a 13,870-line monolith in production for 8 months.

The monolith is not the absence of architecture. It's the simplest architecture that solves the problem.

The seams are there. The boundaries are clear. The extraction points are documented. We could split in a week if we needed to.

But we don't need to. And choosing not to split buys us simplicity, reliability, debuggability, and deployment speed — at the cost of future flexibility we can purchase later, when we actually need it, with full knowledge of where the traffic patterns really are.

Not where we guessed they'd be on day one.

Premature decomposition is the root of all evil in distributed systems. Don't split until the pain of staying together exceeds the pain of splitting apart.

For us, that day hasn't come.

And honestly? I hope it doesn't come for a while.

How big is the largest single file in your codebase? What would it take — really take — for you to split it? The threshold is different for every team, and I'm genuinely curious where yours falls.

– Antonio

"Simplicity is the ultimate sophistication."