On 22 January 2026, the people who build vLLM became a company. Inferact launched with a $150 million seed round — one of the largest AI-infrastructure seeds on record — at an $800 million valuation, co-led by Andreessen Horowitz and Lightspeed, with Sequoia, Altimeter, Redpoint, and Databricks Ventures joining. The founding team is the original vLLM group out of UC Berkeley's Sky Computing Lab: Simon Mo as CEO, Woosuk Kwon as CTO, Kaichao You, Roger Wang, and the professors who seeded the lab, Joseph Gonzalez and Ion Stoica.
The money is the headline everywhere else. It's the least interesting part. The question that actually matters if you serve models in production is quieter: who now sets the direction of an inference engine that a large share of the industry runs on — and what happens to the free version when the same people have $800 million riding on a paid one?
The fact the coverage skips
Start with a detail almost every launch story omitted, because it complicates the "open source goes commercial" arc: vLLM is not Inferact's property. UC Berkeley contributed the project to the Linux Foundation in 2024, and on 7 May 2025 it was accepted as a PyTorch Foundation hosted project, under that foundation's explicitly vendor-neutral governance. The trademark, the governance, the right to say what "vLLM" officially is: those sit with a foundation, not with a company the maintainers happen to also run.
That single structural fact is why vLLM's situation is not the story you've been trained to fear. The open-source cautionary tales of the last few years — Redis relicensing away from BSD, HashiCorp moving Terraform to the BSL (and getting the OpenTofu fork for it), Elastic switching Elasticsearch to SSPL — all share one precondition: the company owned the copyright. Owning the copyright is what let them relicense unilaterally, overnight, and dare the community to fork. Inferact does not have that lever. It cannot wake up one morning and put vLLM under a business-source license, because it does not hold the rights to do so. Foundation hosting is a real check, and it's the check that was missing in every relicensing scandal people are pattern-matching this to.
The Redis question — "will they take it away?" — is mostly off the table here. Foundation governance already answered it.
So where did the risk go?
It didn't vanish. It moved — from the license to the roadmap. And that's the shift worth understanding, because it's subtler and harder to police.
Foundation hosting constrains what can happen to vLLM's license. It does far less about who holds day-to-day commit authority and, more importantly, whose problems the roadmap prioritizes. The people who decide what vLLM optimizes next are now, in large part, employees of a company whose product is a hosted place to run vLLM. Every hour of maintainer attention is now implicitly a capital-allocation decision between "make the open engine better for everyone" and "make the commercial serverless product the best way to consume it." Nobody has to act in bad faith for the second to quietly win; incentives do that on their own.
There is a precedent here so exact it's almost uncomfortable, and it belongs to one of Inferact's own founders. Ion Stoica has run this play twice from the same Berkeley lab: Spark became Databricks, and Ray became Anyscale. Ray/Anyscale is the live case study for what this looks like in practice — a genuinely open project with a VC-backed company built around it by its creators, and a years-long, ongoing negotiation over where the line falls between what's in the free project and what's in the paid platform. That negotiation is not a scandal; it's just the permanent condition of open-core-adjacent infrastructure. It's also happening in stereo: vLLM's sibling engine from the same lab, SGLang, is being commercialized in parallel. The Berkeley inference commons is professionalizing all at once.
To their credit, Inferact's founders said the right thing. Simon Mo's public framing is that Inferact exists to "support, maintain, steward and push forward the open source ecosystem," with the commercial engine as a second goal. That is exactly the correct commitment. The open question, the one no press release can settle, is whether foundation governance has teeth when the paychecks come from the commercial side, or whether "steward" slowly comes to mean "steward toward the thing we sell." The same tension underlies two broader questions: who actually controls a protocol once a foundation is involved, and where the real leverage sits in open-versus-closed infrastructure.
What a builder should actually do
Nothing, today. vLLM is still free, still open, still the same import; there's no migration and no fire. If you're weighing self-hosting vLLM against a managed inference API, the cost math hasn't changed this quarter — and it's worth remembering that not every LLMops project even survives the squeeze, as TensorZero's shutdown showed. A well-funded steward is, on balance, better news for vLLM's longevity than the alternative.
But treat the dependency differently going forward. Your inference engine's roadmap is now set by a company selling a hosted alternative to running it yourself, which is a specific and knowable conflict of interest — not a disqualifying one, a watchable one. Three signals tell you which way it's breaking: does new performance work land in the open repo or only in the serverless product; does the PyTorch Foundation retain genuinely independent maintainers with commit rights; and do vLLM's plugin and extension points start bending toward the commercial layer's needs. None of those will make headlines. All three are the actual story. The relicensing era taught everyone to watch the license. The foundation-plus-startup era asks you to watch the commits instead.
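If you want to put a rough number on the maintainer-independence signal, one cheap approach is to tally recent commits by author email domain. What follows is a sketch under loud assumptions, not a governance audit: `domain_shares` is a hypothetical helper, the input is assumed to be one author email per line (for example, the output of `git log --since="6 months ago" --format=%ae` run in a local clone), and an email domain is only a weak proxy for who employs a contributor, since plenty of people commit from personal addresses.

```python
from collections import Counter


def domain_shares(author_emails):
    """Return {email_domain: share_of_commits}, largest share first.

    Assumption: each item is one commit's author email. The domain is
    only a rough proxy for the contributor's employer.
    """
    domains = []
    for email in author_emails:
        email = email.strip().lower()
        if "@" not in email:
            continue  # skip blank or malformed lines
        domains.append(email.rsplit("@", 1)[1])

    counts = Counter(domains)
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() yields domains in descending order of commit count
    return {domain: n / total for domain, n in counts.most_common()}


# Illustrative input only; these addresses are made up.
sample = [
    "alice@example.com",
    "bob@example.com",
    "carol@example.org",
]
for domain, share in domain_shares(sample).items():
    print(f"{domain}: {share:.0%}")
```

One domain's share creeping up quarter over quarter proves nothing by itself, but it is an inexpensive trend line to keep next to the release notes.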



