Kubernetes vs Single VM: Cost, Latency, and Deploy Speed Benchmarks

When deploying a web service, should you opt for the simplicity of a single Virtual Machine or the robust orchestration of a Kubernetes cluster? In this benchmark, we pit a single cloud VM against a three-node Kubernetes cluster using a lightweight HTTP API. We dive into the real-world trade-offs, measuring deployment times, tail latency under heavy load, and monthly infrastructure costs to help you architect the right environment for your needs.

Kubernetes vs Single VM: Cost, Latency, and Deploy Speed Benchmarks

Single VM vs Three-Node Kubernetes Cluster – Benchmark Results

Benchmark Setup and Methodology

A simple web service was deployed in two environments:

  1. A single-node cloud VM
  2. A three-node Kubernetes cluster

The same application (a lightweight HTTP API) was used in both cases to ensure a fair comparison.

Load testing was performed using wrk:
https://github.com/wg/wrk

This was used to generate traffic and measure latency distribution, including 99th percentile (p99) latency.

Deployment times were measured by observing rollout duration:

  • Example: kubectl rollout status timestamps in Kubernetes
  • Goal: capture how long it takes to roll out a new version

We also tracked monthly infrastructure cost for each setup based on:

  • Typical cloud VM pricing
  • Managed Kubernetes fees

Key Configuration Details

  • Same VM instance size used across environments
  • Identical application code and container image
  • Kubernetes used default rolling update settings (zero downtime)
  • Deployment setup:
    • 1 replica per node → total 3 replicas
  • Readiness probes enabled (pods serve traffic only after passing health checks)

Load testing setup:

wrk -t12 -c400 -d30s http://service-url

These conditions reflect real-world constraints, not idealized lab setups:

  • No ultra high-end hardware
  • No unrealistic optimizations

Deployment Time (Rolling Out New Versions)

Single VM

Deploying a new version on a single VM (e.g., updating a binary or Docker container and restarting) is:

  • Simple
  • Fast

Observed results:

  • Time: ~5–10 seconds
  • Downtime: Yes (brief interruption)

This essentially acts as a quick process restart.


Kubernetes Cluster

Deploying on Kubernetes is slower due to rolling updates.

Kubernetes:

  • Gradually replaces old pods with new ones
  • Ensures zero downtime

However:

  • Each node must pull the new image
  • Each pod must pass readiness checks before continuing

Example timing:

  • Health check interval: ~15 seconds
  • Requires 2 successful checks
  • Pod readiness time:
    • Best case: ~30 seconds
    • Worst case: ~60 seconds

Observed results:

  • Full rollout time: ~30–60 seconds
  • Downtime: None

Key Insight

  • VM → Fast but causes downtime
  • Kubernetes → Slower but zero downtime

Note

  • VM update: nearly instantaneous with a small service blip
  • Kubernetes: slower rollout but continuous availability

Tail Latency (99th Percentile)

Tail latency measures how slow the slowest requests are.

We measured p99 latency under sustained load using wrk.


Single VM

  • Performs well at moderate load

  • Under heavy load (~1000 req/s):

    • Average latency: ~200 ms
    • p99 latency: 500 ms+

Issue:

  • Resource saturation
  • Request queuing
  • High variability

Kubernetes Cluster

Load is distributed across 3 nodes:

  • Each pod handles fewer requests
  • Reduced contention

Observed results:

  • p99 latency: ~150–250 ms
  • Much more stable
  • Fewer extreme spikes

Real-World Observation

  • Initial migration may introduce latency (100–200 ms)
  • After tuning (networking, DNS), performance improves significantly

Key Insight

  • VM → good average latency, poor tail latency under stress
  • Kubernetes → stable and consistent latency

Latency Considerations

Kubernetes introduces small overhead:

  • Extra network hop
  • Service proxy layer

At low load:

  • VM may be faster (e.g., ~20 ms vs 100–200 ms initially)

With proper tuning:

  • Overhead becomes minimal

Important Notes

  • Resource limits matter:
    • Strict CPU limits → throttling → high latency
  • Misconfigurations can cause severe issues:
    • Example: p99 jumped from 195 ms → 2.5 seconds

Summary

  • VM: higher p99 under heavy load
  • Kubernetes: tighter latency distribution, better stability

Monthly Infrastructure Cost

Cost comparison based on AWS pricing.


Single Node VM

Example: AWS t3.small

  • 1 VM: ~$10/month
  • 3 VMs: ~$29/month

Kubernetes Cluster (3 Nodes)

Costs include:

  • Node cost: ~$29/month
  • Control plane (EKS): ~$80/month

Total: ~$110/month


Cloud Differences

  • AWS: control plane cost is significant
  • Azure AKS / Google GKE:
    • Often no control plane fee
    • Total cost closer to $100–$150/month

Key Insight

  • Kubernetes is ~3–4× more expensive at small scale
  • Cost improves when:
    • Running multiple services per cluster

Cost Summary

  • Single VM:

    • Cheapest option
    • Minimal overhead
  • Kubernetes:

    • Higher baseline cost
    • Justified by features:
      • Auto-scaling
      • Self-healing
      • High availability

Final Comparison

Deploy Time

  • VM: seconds, but downtime
  • Kubernetes: tens of seconds, zero downtime

Tail Latency

  • VM:
    • p99 spikes: 500ms+
  • Kubernetes:
    • p99 stable: 150–250ms

Cost

  • VM: $10–$30/month
  • Kubernetes: ~$100+/month

Overall Conclusion

This benchmark highlights a clear trade-off:

Kubernetes

  • Better reliability
  • Better scaling
  • Lower tail latency under load
  • Zero-downtime deployments

But:

  • Higher cost
  • More complexity

Single VM

  • Simpler
  • Much cheaper
  • Fast deployments

But:

  • Downtime during deploys
  • Poor scaling
  • High tail latency under stress

Final Takeaway

Choose based on your needs:

  • Use Single VM if:

    • You want simplicity
    • Traffic is low
    • Cost matters most
  • Use Kubernetes if:

    • You need high availability
    • Traffic is high
    • Tail latency and uptime are critical

Real-World Numbers Recap

  • Deploy time:

    • VM: ~5–10s
    • K8s: ~30–60s
  • Tail latency:

    • VM: 500ms+
    • K8s: 150–250ms
  • Cost:

    • VM: ~$10–30/month
    • K8s: ~$100+/month

Kubernetes improves robustness and scalability, while a single VM remains a strong choice for small, cost-sensitive applications.

Enjoyed this article?

Check out more of my work or get in touch to discuss your next project.