vCluster creates lightweight virtual Kubernetes clusters within physical Kubernetes host clusters, dramatically reducing resource consumption while increasing agility and control.
Straight out of the webscale playbook, platform engineering was considered a futuristic discipline until just a few years ago. Would platform engineering really trickle down to mainstream enterprise teams? Did companies really want to operate their infrastructure like the major cloud providers? Now it's a practice that 80% of enterprises will have adopted by 2026, according to Gartner.
Platform engineering means different things to different people, and there's no golden path that prescribes exactly how to do it right. But the main goals are universally understood. On the one hand, platform engineering strives to boost developer velocity by removing bottlenecks and adding self-service. On the other hand, it aims to standardize on central controls like security and compliance, so you can keep costs and complexity in check.
Increasing developer velocity has been a clear win for platform engineering. Containers, microservices, Kubernetes, CI/CD, and the modern development workflow have undeniably made software development a faster, more productive, more automated experience for distributed teams.
But the "central control" part of platform engineering? It's not so easy to declare victory quite yet. We're in the midst of a multi-year backlash against the high cost and complexity of the cloud operating model. And today that central control side of platform engineering isn't just a platform engineering team issue; it's a CFO issue, as cloud bills soar and companies feel severe pressure to find cost savings. The cloud and Kubernetes are here to stay, but fixing a broken central control plane is a multi-million-dollar dilemma that many enterprises are struggling with today.
That's why an open-source project called vCluster is having a breakthrough moment. vCluster takes aim at the heart of the Kubernetes operating model, the cluster abstraction, to deliver a range of benefits to organizations building on Kubernetes. vCluster not only dramatically reduces resource overhead, which in turn can add up to significant cost savings, but also brings more agility and more central control to platform engineering teams.
The open-source path to opportunity
vCluster co-creators (and Loft Labs co-founders) Lukas Gentele and Fabian Kramm met as computer science students at the University of Mannheim. They followed a similar technology path: from Java and graph databases like Neo4j, to web-focused technologies like PHP and JavaScript, then to Go when Docker and Kubernetes took off and Go looked like the future. When Gentele started an IT consultancy while still in college, Kramm was his first hire.
Within that IT services business, Gentele and Kramm created a project called DevSpace, essentially a Docker Compose alternative focused on streamlining Kubernetes workflows, and put it on GitHub. That was their first experience developing and maintaining an open-source project. They'd both contributed fixes to open-source projects before, but had never owned a project or driven one as maintainers. Seeing the magic of open source firsthand, distributing it, watching people use it, value it, and contribute to it, they were hooked.
After graduating college, the two set out to build a PaaS product (what Gentele describes as "like Heroku for Kubernetes"), applied to Y Combinator, were denied but invited to apply again, then parlayed that into a spot in the SkyDeck accelerator program at U.C. Berkeley. Ultimately, they concluded that PaaS was a very difficult business to run. They weren't the first to hit this wall, as Cloud Foundry, Heroku, and the Docker founders' struggles to monetize dotCloud had demonstrated.
"We realized we had a lot of free users for our PaaS but not a lot of willingness to pay," Gentele said. "OK, so what did we learn? We learned that running large Kubernetes clusters and sharing those clusters with users was extremely complicated and expensive, and that there was a much better way to do this that was a much bigger opportunity than the PaaS. An idea that could be useful for anyone running Kubernetes clusters."
Fleets are the wrong abstraction
In the early days of container orchestration, the market got comfortable with treating servers like "cattle" (interchangeable hardware that can be swapped out) rather than "pets" (each server approached with its own care and feeding), a core concept in turning servers into clusters.
For years, there was an architectural debate around whether to create a fleet of small clusters or a single massive cluster. "Kubernetes itself was designed to run at large scale," said Gentele. "It's not meant to run as a five-node cluster. If you have these small clusters, you get so much duplication and inefficiency."
But as the major cloud providers rolled out their Kubernetes offerings, small single-tenant clusters and fleets of multiple clusters were the units of abstraction sold to the enterprise market, complete with "fleet management" solutions for coordinating all the moving parts and keeping services in sync in the "clusters of mini clusters" approach.
Gentele attributes a large portion of today's cloud cost overruns to this original sin by the cloud providers.
The first consequence of the fleet approach is that the penalty of heavyweight infrastructure components is paid multiple times over. Platform teams want to standardize on core services like Istio and Open Policy Agent, services designed to run at scale; running many small instances of them is highly inefficient. In the fleet approach, these services get installed in every cluster, creating massive duplication of core services as the entire platform stack is replicated across many small clusters.
The other major consequence is that these clusters run all the time. Nobody turns them off. There's no easy way to turn off an entire cluster on the major cloud offerings with the click of a button. Instead, it's a manual process that requires a policy to be put in place, plus 30 minutes to spin up the entire platform stack of services used to connect, manage, secure, and monitor the cluster. It's also hard to tell when a cluster is truly "idle" when all of these platform services (security components, policy agents, compliance, backup, monitoring, and logging) continue running underneath.
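To see how the duplication and always-on overhead compound, here is a back-of-the-envelope model. All of the numbers (per-cluster overhead in vCPUs, vCPU-hour price) and the helper names are illustrative assumptions, not measurements:

```python
# Toy cost model: a fleet of small single-tenant clusters vs. one shared
# host cluster serving the same tenants. All figures are illustrative
# assumptions, not benchmarks.

HOURS_PER_MONTH = 730


def fleet_overhead(clusters: int, overhead_vcpu: float = 4.0,
                   vcpu_hour_cost: float = 0.05) -> float:
    """Platform stack (Istio, OPA, monitoring, ...) replicated in every
    cluster and running 24/7 whether or not the cluster is busy."""
    return clusters * overhead_vcpu * HOURS_PER_MONTH * vcpu_hour_cost


def shared_overhead(clusters: int, overhead_vcpu: float = 4.0,
                    vcpu_hour_cost: float = 0.05,
                    tenant_vcpu: float = 0.1) -> float:
    """One shared platform stack on the host, plus a lightweight
    per-tenant control plane (assumed 0.1 vCPU each)."""
    shared = overhead_vcpu * HOURS_PER_MONTH * vcpu_hour_cost
    per_tenant = clusters * tenant_vcpu * HOURS_PER_MONTH * vcpu_hour_cost
    return shared + per_tenant


if __name__ == "__main__":
    n = 50
    print(f"fleet:  ${fleet_overhead(n):,.2f}/month")
    print(f"shared: ${shared_overhead(n):,.2f}/month")
```

Even with generous assumptions for the shared case, the fleet's per-cluster, always-on platform stack dominates the bill as the cluster count grows, because its overhead scales linearly with the number of clusters.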
vCluster: Addition by subtraction
Gentele and Kramm had the epiphany that the fleet approach to clusters could be vastly improved upon, and that Kubernetes multitenancy could be redefined beyond traditional namespace approaches.
In 2023, they released vCluster and introduced the concept of "virtual clusters," an abstraction for creating lightweight virtual Kubernetes clusters. Just as virtual private networks create a virtual network over physical infrastructure, virtual clusters create isolated Kubernetes environments over a shared physical Kubernetes cluster.
vCluster is a certified Kubernetes distribution, so virtual clusters behave exactly like any other Kubernetes cluster, with one important difference: while each virtual cluster manages its own namespaces, pods, and services, it does not replicate the platform stack. Instead, it shares the heavyweight platform components, such as Istio or Open Policy Agent, run by the underlying physical cluster. With this shared platform stack, virtual clusters no longer drag around the albatross of replicated platform services.
And yet each virtual cluster has its own API server and control plane, providing strong tenant isolation and giving platform teams the ability to create their own custom policies for security and governance. Tenants can create their own namespaces, deploy their own custom resources, protect cluster data using their own backing stores, and apply their own access-control policies to users.
At the same time, vCluster gives platform teams far greater speed and agility than a physical cluster. A virtual cluster can be spun up in a fraction of the time it takes to spin up a physical cluster. Restarting a virtual cluster takes about six seconds, versus 30 to 45 minutes to restart a physical Kubernetes cluster that's running heavyweight platform services like Istio underneath.
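The architecture described above, a per-tenant control plane on top of a single shared platform stack, can be sketched in a few lines of Python. This is purely a toy model; the class names and component list are illustrative assumptions, not vCluster internals:

```python
from dataclasses import dataclass, field


@dataclass
class HostCluster:
    """Physical cluster: runs the heavyweight platform stack exactly once."""
    platform_stack: tuple = ("istio", "opa", "cert-manager", "prometheus")


@dataclass
class VirtualCluster:
    """A tenant environment with its own API server and control plane.
    It manages its own namespaces but borrows the host's platform stack."""
    name: str
    host: HostCluster
    namespaces: list = field(default_factory=list)

    def create_namespace(self, ns: str) -> None:
        # Namespaces live inside this virtual cluster's own API server,
        # invisible to other tenants on the same host.
        self.namespaces.append(ns)

    def platform_components(self) -> tuple:
        # Shared, not replicated: every virtual cluster sees the single
        # installation on the host.
        return self.host.platform_stack


host = HostCluster()
dev = VirtualCluster("team-dev", host)
ci = VirtualCluster("team-ci", host)
dev.create_namespace("payments")

# Both tenants reference one and the same platform stack; nothing
# was duplicated, yet their namespaces stay isolated.
assert dev.platform_components() is ci.platform_components()
assert dev.namespaces != ci.namespaces
```

The key property the sketch captures is that adding a tenant adds only a lightweight control-plane object, never another copy of the platform stack.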
"Kubernetes is great from an API perspective, a tooling perspective, a standardization perspective. But the architecture that the cloud providers advocated, running fleets of small clusters, took the industry back to the physical server in terms of cost and heaviness," Gentele said. "In the '90s, someone had to actually physically walk into a data center, plug in a server, issue credentials, and take some other manual steps. We're in a similar boat with Kubernetes today. You have so many enterprises running their entire application stack in each cluster, which creates a lot of duplication."
vCluster makes the Kubernetes cluster more lightweight and ephemeral, similar to what virtual machines did for physical servers and what containers did for workloads in general.
"Spinning up small single-tenant Kubernetes clusters was a really terrible idea in the first place, because it's very costly, and it's very, very hard to manage," Gentele said. "You're going to end up with hundreds of clusters. And then you've got to maintain things like the ingress controller, cert-manager, Prometheus, and metrics across all these clusters. That's a lot of work, and it's really hard to keep in sync."
vCluster by the numbers
vCluster has more than 6,000 stars on GitHub and more than 120 contributors. The project has drawn the attention of Kubernetes experts such as Rancher's co-founder and former CTO Darren Shepherd, who has been advocating for the use of virtual clusters. Teams from Adobe, CoreWeave, and Codefresh have been outspoken about their use of vCluster at events like KubeCon.
Gentele and Kramm's startup Loft Labs was recently funded to extend enterprise capabilities around vCluster. The $24M Series A was led by Khosla Ventures, which is known for being the first institutional investor in companies like GitLab and OpenAI.
The startup's commercial offering on top of vCluster has generated particular excitement over its "sleep mode," which turns off inactive virtual clusters automatically. Typically, enterprises that spin up clusters see them run all the time. Loft Labs' product measures virtual cluster activity by monitoring incoming API requests, and sleep mode automatically scales down virtual clusters when they're not in use, saving cloud resources and overall cost.
vCluster may help run cloud infrastructure more efficiently and drive down cloud costs, but it also gives enterprises a clearer path to winning on both central control and developer velocity. In addition to stretching physical cluster resources, vCluster provides each virtual cluster with its own separate API server and control plane, giving platform teams both more flexibility and more control over the management, security, resource allocation, and scaling of their Kubernetes clusters.


