How We Reduced AI Deployment from Weeks to a Day

How we collapsed a sprawling cloud deployment into a single VM — and now stand up our entire AI-native stack inside a customer's VPC in a day.

Keshav · June 9, 2026 · 5 min read

Building an AI product is one thing. Getting it to run cleanly inside someone else's environment is another — and for AI-native products, that second part is where things quietly get hard. You can have the best models and still stall on day one if the system is painful to deploy.

Why deployment is the hard part

An AI-native product has to understand a customer's world before it's useful, and that means ingesting a lot of their data to build a context graph. That data is, almost by definition, the sensitive kind. No one hands that to a system running outside their control.

So the product has to run where the data lives: inside the customer's own VPC. And standing up a complex system inside an environment you don't own and can't see into ahead of time, then repeating that work for every customer, is exactly where most AI products get heavy.

How we used to do it

Our original deployment lived in our own cloud and was sprawling: an application gateway, VPC, NAT gateway, load balancer, cloud functions, three VMs, a scheduler, a key manager, object storage, and pub/sub — roughly a dozen cloud services. Code went onto the VMs by hand; we hadn't built CI/CD yet, a conscious trade-off under real time and resource constraints.

It held up until customers needed single-tenant deployments. Each new tenant meant replicating that entire architecture and deploying into it manually, ourselves, every time. A dozen managed services is also a dozen things to provision and approve inside a customer's VPC. None of it was easy, and none of it scaled.

What we changed

The architecture was the problem, not our execution of it. So we collapsed it: we replaced every managed cloud service with a self-hosted, containerized equivalent and packaged the whole stack to run on a single VM with Docker. Jenkins builds the images and publishes them to an artifact registry, and deployment becomes a simple procedure: stand up the VM, pull the images, run.
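To make the "pull the images, run" step concrete, here is a minimal sketch in Python. It assumes the stack is described by a Docker Compose file, which the post does not confirm; the file name `stack.yml` and the function names are hypothetical. By default it only prints the commands it would run.

```python
import subprocess
from typing import List


def deploy_commands(compose_file: str) -> List[List[str]]:
    """Return the Docker Compose commands for a first install, in order."""
    # Assumption: the stack is defined in a Compose file (hypothetical name).
    base = ["docker", "compose", "-f", compose_file]
    return [
        base + ["pull"],      # fetch the images published by the build pipeline
        base + ["up", "-d"],  # start every service in the background
    ]


def deploy(compose_file: str, dry_run: bool = True) -> List[str]:
    """Print each command; execute it only when dry_run is False."""
    rendered = []
    for cmd in deploy_commands(compose_file):
        line = " ".join(cmd)
        rendered.append(line)
        print(line)
        if not dry_run:
            # check=True stops the deployment at the first failing step.
            subprocess.run(cmd, check=True)
    return rendered
```

Calling `deploy("stack.yml")` prints the two commands without touching Docker; passing `dry_run=False` on the provisioned VM would execute them.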

The payoff is ease. A dozen managed services means a dozen pieces to provision and approve inside the customer's VPC; one VM means one. The deployment surface collapses to a single resource, the whole stack lives entirely inside their environment, and we stand it up in a day.

From a day to minutes

Because the stack is containerized and built through one pipeline, updates ride the same rails as the first install. A new version is a new image in the registry; rolling it out is a pull and a restart. Initial deployment lands in a day. Routine updates land in minutes.
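The pull-and-restart update can be sketched the same way. This example assumes a Docker Compose file whose `image:` entries interpolate a version variable; the variable name `STACK_VERSION`, the file name `stack.yml`, and the function names are all hypothetical and not taken from our actual setup.

```python
import os
import subprocess
from typing import Dict, List, Tuple


def update_plan(compose_file: str, version: str) -> Tuple[Dict[str, str], List[List[str]]]:
    """Return the environment override and commands for a routine update."""
    # Assumption: the Compose file references images as "...:${STACK_VERSION}".
    env = {"STACK_VERSION": version}
    base = ["docker", "compose", "-f", compose_file]
    # "pull" fetches the new images; "up -d" recreates only the containers
    # whose image or configuration changed.
    return env, [base + ["pull"], base + ["up", "-d"]]


def apply_update(compose_file: str, version: str, dry_run: bool = True) -> None:
    """Print the update commands; execute them only when dry_run is False."""
    env, commands = update_plan(compose_file, version)
    for cmd in commands:
        print(f"STACK_VERSION={version} " + " ".join(cmd))
        if not dry_run:
            # Merge the override into the current environment for this call.
            subprocess.run(cmd, check=True, env={**os.environ, **env})
```

Because the update reuses the same two Compose commands as the first install, the only thing that changes between releases in this sketch is the version string.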

Why ease of deployment is the advantage

The market is consolidating around one instinct: people want their AI deployed as close to them as possible, inside their own boundaries, especially as it touches their most sensitive data. The winners won't only have the best intelligence — they'll be the ones who can put that intelligence inside the customer's environment without friction.

The model is half the product. How easily it deploys is the other half — the quiet half. And it's often the half that decides whether the smart part ever gets to do its job.
