For the workloads that belong on infrastructure you control. We bring the inference engineering team you don’t have to hire.
Some inference workloads can’t live on a shared API — regulated data, classified environments, IP that can’t leave the tenancy, weight-level customization, latency where the round-trip is the product. Enterprises have ML teams for those workloads. They don’t have inference engineering teams. These ten engagements are the delivery vehicle that stands the platform up around them and keeps it running — alongside whatever else, frontier APIs included, makes sense for the rest of the portfolio.
Ten services · pick → tune → deploy → operate → scale
Three named entry points. Most accounts come in through one of these.
Six-week assessment to make the case. Optimization engagement to prove the SLA. Or the full migration program with a quality and cost guarantee. The other seven services in the catalog descend from these.
Seven more services. They descend from the entry points.
Most accounts pick one or two of these on top of the entry-point engagement.
Three terms. In writing. Before kickoff.
The migration program is the only engagement we sell with an outcome guarantee. The terms are short and they are signed before we start.
- 01 Quality parity
Defined upfront on a signed eval harness. The migrated workload must hit or exceed the production-baseline scores it’s replacing — on the evals you wrote.
- 02 TCO reduction
Defined upfront as a percentage of prior token spend at matched traffic volume. The percentage and the measurement window are in the contract.
- 03 Miss the guarantee
Service credits against your S9 managed-ops subscription, up to the engagement fee. We eat the overrun, not you.
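For teams that want to see the three terms as arithmetic, here is a minimal sketch of how they could be checked. It is an illustration, not contract language: the `GuaranteeTerms` fields, the function names, and the `claimed_amount` input are all hypothetical, and the contract, not this code, defines how a credit amount is actually determined.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class GuaranteeTerms:
    """Hypothetical container for the signed terms; names are illustrative."""
    baseline_scores: dict[str, float]  # production-baseline score per eval
    tco_reduction_target: float        # e.g. 0.30 means a 30% reduction
    engagement_fee: float              # upper bound on service credits


def quality_parity(terms: GuaranteeTerms, migrated: dict[str, float]) -> bool:
    # Every baseline eval must be matched or exceeded; a missing eval fails.
    return all(
        migrated.get(name, float("-inf")) >= base
        for name, base in terms.baseline_scores.items()
    )


def tco_reduction(prior_spend: float, new_spend: float) -> float:
    # Fractional reduction versus prior token spend at matched traffic volume.
    return 1.0 - new_spend / prior_spend


def guarantee_met(terms: GuaranteeTerms, migrated: dict[str, float],
                  prior_spend: float, new_spend: float) -> bool:
    # Both terms must hold: quality parity and the TCO reduction target.
    return (quality_parity(terms, migrated)
            and tco_reduction(prior_spend, new_spend) >= terms.tco_reduction_target)


def capped_credit(terms: GuaranteeTerms, claimed_amount: float) -> float:
    # Assumption: whatever amount is owed, credits never exceed the engagement fee.
    return min(max(claimed_amount, 0.0), terms.engagement_fee)


# Example with made-up numbers.
terms = GuaranteeTerms(
    baseline_scores={"accuracy": 0.90, "faithfulness": 0.85},
    tco_reduction_target=0.30,
    engagement_fee=100_000.0,
)
migrated_scores = {"accuracy": 0.91, "faithfulness": 0.85}
met = guarantee_met(terms, migrated_scores, prior_spend=200_000.0, new_spend=130_000.0)
```

In this made-up case the migrated workload matches or beats both baseline scores and spend drops by roughly 35%, so `met` is `True`. Had new spend been 150,000 at the same traffic, the reduction would be 25% and the guarantee would be missed.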
Three rules. No exceptions.
Three roles cover everything we sell
Inference Engineer (CUDA, vLLM, SGLang, Dynamo, quantization, parallelism), ML Engineer (training, fine-tuning, distillation, eval design), Platform Engineer / SRE (K8s, compute-agent, observability, on-call). Plus a Solutions Architect on the front end and a Technical Account Manager on the back end of S9.
Services exist to deliver the platform, not paper over it
Anything we hand-build twice in services becomes a product feature. Conversely: we will not sell a service whose only purpose is making the product usable — that is a product bug, not a service. Monthly sync between product and services to keep that boundary honest.
Pricing philosophy
Lead with fixed-fee for predictability. Subscription is the north star — it is how we build durable revenue. Outcome-based where we can measure, like S10. Hourly or T&M only as a last resort. If we are selling hours, we lost.
Ready to talk? Pick the path.
For services engagements, schedule a discovery call. For platform access, join the waitlist. Both routes go to the same team.