Kubernetes vs Single VM: Cost, Latency, and Deploy Speed Benchmarks
When deploying a web service, should you opt for the simplicity of a single Virtual Machine or the robust orchestration of a Kubernetes cluster? In this benchmark, we pit a single cloud VM against a three-node Kubernetes cluster using a lightweight HTTP API. We dive into the real-world trade-offs, measuring deployment times, tail latency under heavy load, and monthly infrastructure costs to help you architect the right environment for your needs.

Single VM vs Three-Node Kubernetes Cluster – Benchmark Results
Benchmark Setup and Methodology
A simple web service was deployed in two environments:
- A single-node cloud VM
- A three-node Kubernetes cluster
The same application (a lightweight HTTP API) was used in both cases to ensure a fair comparison.
Load testing was performed using wrk:
https://github.com/wg/wrk
This was used to generate traffic and measure latency distribution, including 99th percentile (p99) latency.
Deployment times were measured by observing rollout duration:
- Example: kubectl rollout status timestamps in Kubernetes
- Goal: capture how long it takes to roll out a new version
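One way to capture rollout duration is simply to time the command that blocks until the rollout completes. A minimal sketch (the deployment name my-api is hypothetical; kubectl rollout status returns once all new pods are ready):

```python
import subprocess
import time

def timed_run(cmd):
    """Run a command and return (exit_code, elapsed_seconds)."""
    start = time.monotonic()
    result = subprocess.run(cmd)
    return result.returncode, time.monotonic() - start

# Example: time a rollout (requires kubectl and a deployment named my-api)
# code, elapsed = timed_run(["kubectl", "rollout", "status", "deployment/my-api"])
# print(f"rollout finished in {elapsed:.1f}s (exit {code})")
```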
We also tracked monthly infrastructure cost for each setup based on:
- Typical cloud VM pricing
- Managed Kubernetes fees
Key Configuration Details
- Same VM instance size used across environments
- Identical application code and container image
- Kubernetes used default rolling update settings (zero downtime)
- Deployment setup:
- 1 replica per node → total 3 replicas
- Readiness probes enabled (pods serve traffic only after passing health checks)
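Put together, a Deployment matching the configuration above might look like this (names, image, port, and probe path are illustrative, not from the benchmark; the probe timings mirror the ~15 s interval and 2-check threshold discussed later):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: http-api
spec:
  replicas: 3
  strategy:
    type: RollingUpdate            # default: replace pods gradually
    rollingUpdate:
      maxUnavailable: 25%          # Kubernetes defaults
      maxSurge: 25%
  selector:
    matchLabels:
      app: http-api
  template:
    metadata:
      labels:
        app: http-api
    spec:
      containers:
        - name: api
          image: registry.example.com/http-api:v1   # illustrative
          ports:
            - containerPort: 8080
          readinessProbe:          # pod receives traffic only after passing checks
            httpGet:
              path: /healthz
              port: 8080
            periodSeconds: 15
            successThreshold: 2
```

Spreading exactly one replica per node, as in this setup, would additionally require pod anti-affinity or topology spread constraints (not shown).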
Load testing setup:
wrk -t12 -c400 -d30s http://service-url
These conditions reflect real-world constraints, not idealized lab setups:
- No ultra high-end hardware
- No unrealistic optimizations
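When run with --latency, wrk prints a latency distribution table; a small parser can pull out the p99 figure for logging. The sample output below is illustrative of wrk's format, not data from this benchmark:

```python
import re

def parse_wrk_p99(output):
    """Extract the 99% line from `wrk --latency` output, normalized to ms."""
    match = re.search(r"99%\s+([\d.]+)(us|ms|s)", output)
    if not match:
        return None
    value, unit = float(match.group(1)), match.group(2)
    scale = {"us": 0.001, "ms": 1.0, "s": 1000.0}
    return value * scale[unit]

sample = """\
  Latency Distribution
     50%  148.21ms
     75%  190.40ms
     90%  233.15ms
     99%  512.77ms
"""
print(parse_wrk_p99(sample))  # 512.77
```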
Deployment Time (Rolling Out New Versions)
Single VM
Deploying a new version on a single VM (e.g., updating a binary or Docker container and restarting) is:
- Simple
- Fast
Observed results:
- Time: ~5–10 seconds
- Downtime: Yes (brief interruption)
This essentially acts as a quick process restart.
Kubernetes Cluster
Deploying on Kubernetes is slower due to rolling updates.
Kubernetes:
- Gradually replaces old pods with new ones
- Ensures zero downtime
However:
- Each node must pull the new image
- Each pod must pass readiness checks before continuing
Example timing:
- Health check interval: ~15 seconds
- Requires 2 successful checks
- Pod readiness time:
- Best case: ~30 seconds
- Worst case: ~60 seconds
Observed results:
- Full rollout time: ~30–60 seconds
- Downtime: None
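The readiness arithmetic above can be sketched as a tiny helper; the 30-second first-probe delay in the worst case is an illustrative stand-in for a slow image pull or container start:

```python
def pod_ready_time(probe_interval_s, success_threshold, first_probe_delay_s=0):
    """Seconds until a pod passes enough readiness checks to receive traffic."""
    return first_probe_delay_s + probe_interval_s * success_threshold

best = pod_ready_time(15, 2)                          # probes start immediately
worst = pod_ready_time(15, 2, first_probe_delay_s=30) # slow pull delays first probe
print(best, worst)  # 30 60
```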
Key Insight
- VM → nearly instantaneous update, but with a brief service blip
- Kubernetes → slower rollout, but continuous availability
Tail Latency (99th Percentile)
Tail latency measures how slow the slowest requests are.
We measured p99 latency under sustained load using wrk.
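p99 is simply the latency below which 99% of requests fall; a minimal nearest-rank computation shows why a few slow outliers dominate the tail even when most requests are fast:

```python
def percentile(samples, pct):
    """Nearest-rank percentile: smallest value >= pct% of the samples."""
    ranked = sorted(samples)
    k = max(0, -(-len(ranked) * pct // 100) - 1)  # ceil(n * pct / 100) - 1
    return ranked[int(k)]

# 97 fast requests plus 3 slow outliers: the outliers set the p99
latencies_ms = [20] * 97 + [200, 450, 900]
print(percentile(latencies_ms, 50))  # 20
print(percentile(latencies_ms, 99))  # 450
```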
Single VM
- Performs well at moderate load
- Under heavy load (~1000 req/s):
- Average latency: ~200 ms
- p99 latency: 500 ms+
Issues:
- Resource saturation
- Request queuing
- High variability
Kubernetes Cluster
Load is distributed across 3 nodes:
- Each pod handles fewer requests
- Reduced contention
Observed results:
- p99 latency: ~150–250 ms
- Much more stable
- Fewer extreme spikes
Real-World Observation
- Initial migration may introduce latency (100–200 ms)
- After tuning (networking, DNS), performance improves significantly
Key Insight
- VM → good average latency, poor tail latency under stress
- Kubernetes → stable and consistent latency
Latency Considerations
Kubernetes introduces small overhead:
- Extra network hop
- Service proxy layer
At low load:
- The VM may respond faster (e.g., ~20 ms vs. 100–200 ms on an untuned cluster)
With proper tuning:
- Overhead becomes minimal
Important Notes
- Resource limits matter:
- Strict CPU limits → throttling → high latency
- Misconfigurations can cause severe issues:
- Example: p99 jumped from 195 ms → 2.5 seconds
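The throttling in question comes from the CPU limit set on the container. A hedged example of requests and limits (values are illustrative):

```yaml
resources:
  requests:
    cpu: 250m        # the scheduler guarantees this much
    memory: 128Mi
  limits:
    cpu: 500m        # hard cap: exceeding it causes CFS throttling, not eviction
    memory: 256Mi    # exceeding the memory limit OOM-kills the container
```

A pod that regularly needs more CPU than its limit gets throttled every scheduling period, which shows up directly as tail latency.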
Summary
- VM: higher p99 under heavy load
- Kubernetes: tighter latency distribution, better stability
Monthly Infrastructure Cost
Cost comparison based on AWS pricing.
Single-Node VM
Example: AWS t3.small
- 1 VM: ~$10/month
- 3 VMs (for node-count parity with the cluster): ~$29/month
Kubernetes Cluster (3 Nodes)
Costs include:
- Node cost: ~$29/month
- Control plane (EKS): ~$80/month
Total: ~$110/month
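The cost math, using the article's rounded figures (the EKS control-plane number is the article's approximation, not a current price quote):

```python
single_vm = 10        # ~$10/month for one t3.small
cluster_nodes = 29    # ~$29/month for three nodes
control_plane = 80    # article's EKS control-plane estimate

k8s_total = cluster_nodes + control_plane
print(k8s_total)                             # 109 -> "~$110/month"
print(round(k8s_total / cluster_nodes, 1))   # 3.8 -> the "~3-4x" multiple
```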
Cloud Differences
- AWS: control plane cost is significant
- Azure AKS / Google GKE:
- Often no control plane fee
- Total cost closer to $100–$150/month
Key Insight
- Kubernetes is ~3–4× more expensive at small scale
- Cost improves when:
- Running multiple services per cluster
Cost Summary
- Single VM:
- Cheapest option
- Minimal overhead
- Kubernetes:
- Higher baseline cost
- Justified by features:
- Auto-scaling
- Self-healing
- High availability
Final Comparison
Deploy Time
- VM: seconds, but downtime
- Kubernetes: tens of seconds, zero downtime
Tail Latency
- VM:
- p99 spikes: 500 ms+
- Kubernetes:
- p99 stable: 150–250 ms
Cost
- VM: $10–$30/month
- Kubernetes: ~$100+/month
Overall Conclusion
This benchmark highlights a clear trade-off:
Kubernetes
- Better reliability
- Better scaling
- Lower tail latency under load
- Zero-downtime deployments
But:
- Higher cost
- More complexity
Single VM
- Simpler
- Much cheaper
- Fast deployments
But:
- Downtime during deploys
- Poor scaling
- High tail latency under stress
Final Takeaway
Choose based on your needs:
- Use a Single VM if:
- You want simplicity
- Traffic is low
- Cost matters most
- Use Kubernetes if:
- You need high availability
- Traffic is high
- Tail latency and uptime are critical
Real-World Numbers Recap
- Deploy time:
- VM: ~5–10 s
- K8s: ~30–60 s
- Tail latency (p99 under load):
- VM: 500 ms+
- K8s: 150–250 ms
- Cost:
- VM: ~$10–30/month
- K8s: ~$100+/month
Kubernetes improves robustness and scalability, while a single VM remains a strong choice for small, cost-sensitive applications.
Enjoyed this article?
Check out more of my work or get in touch to discuss your next project.