GPU Billing Models Compared: PAYG vs Reserved vs Spot for Finance Teams

PAYG, reserved, or spot? The three GPU billing models, what each costs, and the break-even maths finance teams need before signing a 1-year hyperscaler commit.

GPUaaS.com Team
Procurement & Finance
June 3, 2026

GPU billing has three shapes: pay-as-you-go (PAYG), reserved/committed contracts, and spot/preemptible. Each has a different cash flow profile, a different risk profile, and a different right answer depending on your utilisation. Stop hunting for GPU compute. GPUaaS.com gets you enterprise NVIDIA infrastructure at rates hyperscalers won't offer you, on whichever billing model your finance team can actually live with.

Key takeaways
  • Three billing models dominate GPU procurement in 2026: PAYG (highest unit cost, full flexibility), reserved (20-40% lower rate, multi-month to multi-year commit), and spot (60-80% discount, interruptible)
  • The break-even utilisation between PAYG and reserved is one minus the reserved discount: around 65-70% for a 30-35% discount. Above that, reserved wins. Below it, PAYG is cheaper despite the higher headline rate
  • Spot pricing is only viable for fault-tolerant workloads with checkpointing. For production inference with SLAs, spot is rarely a net win once interruption costs are modelled
  • Hyperscalers require 1 to 3-year non-cancellable commits to access meaningful reserved discounts. GPUaaS.com offers both short-term and long-term contracts without that lock-in
  • For finance teams, the maths comes down to three numbers: forecast utilisation, contract length, and the cost of being wrong. Get those right and the model picks itself

Most GPU procurement conversations start with the GPU model. They should start with the billing model. The hardware decision is well-understood by engineering. The contract structure decision determines your cash flow, your downside risk, and whether your budget holds for the year. This guide is for the finance and procurement leads making that call, with the maths in dollar terms. For the underlying rate structures across GPUs, see the GPU pricing guide.

◆ THREE MODELS
The three GPU billing models: how each one works

Every GPU provider, hyperscaler or otherwise, offers some variation on three models. The names differ. The mechanics don't.

Model | Rate | Commit | Cancellable | Best for
PAYG / On-demand | Baseline (highest) | None to monthly | Yes, anytime | Pre-production, bursty workloads, unknown utilisation
Reserved / Committed | 20-40% below PAYG | 3 months to 3 years | Usually no | Production inference, stable training pipelines
Spot / Preemptible | 60-80% below PAYG | None (interruptible) | N/A (provider can reclaim) | Fault-tolerant batch jobs, research checkpoints

The headline rate is the easy part. The real differences are in the contract terms and the implicit cost of each model. PAYG gives you flexibility but you pay for it on every hour. Reserved gives you a better rate but locks up working capital and creates downside risk if your workload evolves. Spot gives you the deepest discount but offloads continuity risk to your engineering team.

GPUaaS.com offers both short-term and long-term contracts at rates hyperscalers won't offer you, without the multi-year lock-in that hyperscaler reserved pricing typically requires. Get a quote and see what each model looks like for your workload.

◆ PAYG
PAYG: when flexibility justifies the premium

Pay-as-you-go is the cleanest model from a finance perspective. You pay only for the GPU-hours you consume. No commitment, no contract term, no early-termination fee. The downside: you pay the highest unit rate, and providers typically reserve the right to reprovision your capacity if demand spikes elsewhere.

PAYG makes sense in three specific situations. First, when you don't yet know your utilisation profile, typically the first 30 to 90 days of a new workload. Second, when usage is genuinely bursty: training runs that fire monthly, batch processing windows, or research workloads that don't justify persistent capacity. Third, when you're evaluating providers and don't want to lock in before you know whether the infrastructure performs.

The hidden cost of PAYG isn't the rate; it's the operational overhead. Teams running pure PAYG often forget to turn clusters off. A provisioned H100 sitting at 5% utilisation is billed the same as one running at 80%, and that's where most PAYG budgets quietly evaporate. The GPU bill spike post covers the utilisation maths in detail.
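One way to put a number on idle capacity is the cost per useful GPU-hour: the hourly rate divided by utilisation. The sketch below is illustrative only. The $3.00/GPU/hr rate is a hypothetical figure, not a quoted price, and the function name is made up for this example.

```python
def cost_per_useful_gpu_hour(hourly_rate: float, utilisation: float) -> float:
    """Effective price of one hour of real work on a GPU that is billed for
    every provisioned hour but only busy for a fraction of them."""
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be greater than 0 and at most 1")
    return hourly_rate / utilisation


# Hypothetical on-demand rate, for illustration only (not a quoted price).
ASSUMED_RATE = 3.00

for utilisation in (0.05, 0.40, 0.80):
    effective = cost_per_useful_gpu_hour(ASSUMED_RATE, utilisation)
    print(f"{utilisation:.0%} utilised: ${effective:.2f} per useful GPU-hour")
```

At 5% utilisation the effective price is 20 times the headline rate. At 80% it is 1.25 times.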

Finance perspective

PAYG is OpEx with no balance-sheet impact. Cash flow tracks usage. If usage doubles next quarter, so does the bill. If it halves, so does the bill. Predictable in pattern, unpredictable in amount.

◆ RESERVED
Reserved: when the commit pays back

Reserved (also called committed-use, capacity reservations, or savings plans depending on the provider) trades flexibility for a lower rate. You commit to a fixed amount of GPU capacity over a defined term. In exchange, you typically save 20 to 40% off PAYG rates. On hyperscalers, deeper discounts require longer commits, with 1-year savings plans giving roughly 30% off and 3-year reserved instances reaching 50% or more.

$553,000 saved on a 256-GPU H100 cluster over 6 months at a $0.50/GPU/hr rate difference (256 GPUs x $0.50 x 24 hrs x 180 days = $552,960). GPUaaS.com offers up to ~30% less than hyperscaler reserved rates.

The savings scale linearly with cluster size and hours run, which is what makes reserved compelling at scale. A $0.50/GPU/hr rate difference between PAYG and reserved on an H100 cluster running 24/7 over six months:

Cluster size | Rate gap | 6-month saving | In human terms
8-GPU H100 | $0.50/GPU/hr | ~$17,000 | A month of senior eng time
32-GPU H100 | $0.50/GPU/hr | ~$69,000 | A senior ML hire
80-GPU H100 | $0.50/GPU/hr | ~$173,000 | Two senior engineers
256-GPU H100 | $0.50/GPU/hr | ~$553,000 | Your next model training run

Based on 24/7 operation over 180 days. GPUaaS.com offers up to ~30% less than hyperscaler reserved rates.
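The savings figures above are a straight multiplication: GPUs x rate gap x hours per day x days. A minimal sketch to rerun it for your own cluster size follows. The function name is illustrative, and it assumes the same 24/7 operation as the table.

```python
def rate_gap_saving(gpus: int, rate_gap_per_gpu_hr: float, days: int,
                    hours_per_day: int = 24) -> float:
    """Saving from a lower hourly rate, assuming every GPU runs every hour."""
    return gpus * rate_gap_per_gpu_hr * hours_per_day * days


# Cluster sizes and the $0.50 gap come from the table; 180 days = six months.
for gpus in (8, 32, 80, 256):
    saving = rate_gap_saving(gpus, rate_gap_per_gpu_hr=0.50, days=180)
    print(f"{gpus:>3}-GPU cluster: ${saving:,.0f}")
```

If the cluster runs fewer hours per day, lower `hours_per_day` and the saving falls in proportion.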

The catch on hyperscaler reserved contracts: they're typically non-cancellable. If your workload evolves, your model architecture changes, or you find a better rate mid-term, you continue paying. AWS Savings Plans, Azure Reserved Instances, and GCP Committed Use Discounts all carry this risk. Finance teams need to model the commit as a fixed liability, not a flexible spend line.

GPUaaS.com's contract terms are shorter and more flexible than hyperscaler 1 to 3-year commitments. You can start with a shorter contract and extend as your workload matures, without locking in multi-year spend upfront. For the deep-dive framework on when reserved makes economic sense, see the reserved vs on-demand GPU guide.

◆ SPOT
Spot: the discount that often isn't

Spot pricing offers the deepest discount, typically 60 to 80% off PAYG. The catch: the provider can reclaim your capacity with minutes of notice when on-demand demand spikes. For finance teams modelling spot into a budget, the headline rate is half the picture. The other half is the operational cost of building for, running through, and recovering from interruptions.

Spot makes sense for a narrow set of workloads. Batch training runs with frequent checkpointing. Embedding generation. Research workloads where 24-hour delays are acceptable. Anything fault-tolerant where the job can resume from a checkpoint when capacity returns.

Spot does not make sense for production inference with SLAs, real-time serving, customer-facing APIs, or anything where interruption translates to revenue loss. The discount looks attractive in the spreadsheet. Once you model the engineering cost of building checkpointing infrastructure, the cost of interruptions during peak demand (exactly when you need capacity most), and the SLA risk, the maths often reverses.
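A rough way to test whether the spot discount survives is to add the two costs the headline rate hides: hours re-run after interruptions and the one-off engineering cost of checkpointing. The sketch below is an assumption-heavy illustration, not a pricing model. The $3.50 PAYG rate is borrowed from the H200 example in this guide. The 70% discount, 15% rework fraction, and $20,000 engineering cost are hypothetical placeholders to replace with your own figures. It does not attempt to price SLA risk or lost revenue.

```python
def payg_cost(payg_rate: float, gpu_hours: float) -> float:
    """On-demand cost for the useful GPU-hours a job needs."""
    return payg_rate * gpu_hours


def spot_cost(payg_rate: float, spot_discount: float, gpu_hours: float,
              rework_fraction: float, engineering_cost: float) -> float:
    """Spot cost including re-run hours and one-off checkpointing work."""
    spot_rate = payg_rate * (1 - spot_discount)
    billed_hours = gpu_hours * (1 + rework_fraction)  # work lost and repeated
    return spot_rate * billed_hours + engineering_cost


# Hypothetical inputs, for illustration only.
PAYG_RATE = 3.50          # $/GPU/hr
SPOT_DISCOUNT = 0.70      # 70% below PAYG
REWORK_FRACTION = 0.15    # 15% of hours repeated after interruptions
ENGINEERING_COST = 20_000  # one-off cost of building checkpointing

for gpu_hours in (5_000, 10_000, 50_000):
    on_demand = payg_cost(PAYG_RATE, gpu_hours)
    spot = spot_cost(PAYG_RATE, SPOT_DISCOUNT, gpu_hours,
                     REWORK_FRACTION, ENGINEERING_COST)
    winner = "spot" if spot < on_demand else "PAYG"
    print(f"{gpu_hours:>6,} GPU-hours: PAYG ${on_demand:,.0f} "
          f"vs spot ${spot:,.0f} -> {winner}")
```

Under these assumptions the fixed engineering cost outweighs the discount on the smallest job, and spot only wins once the volume is large enough to absorb it.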

⚠ Watch out

Spot capacity tends to disappear at the worst possible moment: during demand spikes, when reserved and on-demand customers are also drawing on the provider's capacity and you need GPUs most. Build that risk into the model before committing to the architecture.

For most enterprise teams, the cleanest finance model is reserved for the baseline (predictable, stable workloads) plus PAYG for the burst (training runs, experimentation). Spot fits as a tactical layer for specific batch workloads, not as a strategic billing model.
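The baseline-plus-burst structure can be sketched as two budget lines. This is a minimal illustration, assuming the indicative H200 rates used in this guide ($2.45 reserved, $3.50 PAYG), a 730-hour month, and a hypothetical 1,000 burst GPU-hours. The function name is made up for the example.

```python
HOURS_PER_MONTH = 730  # assumption: 8,760 hours per year / 12


def monthly_blended_cost(baseline_gpus: int, reserved_rate: float,
                         burst_gpu_hours: float, payg_rate: float) -> dict:
    """Split a monthly GPU budget into a fixed reserved line and a
    usage-driven PAYG line."""
    reserved = baseline_gpus * reserved_rate * HOURS_PER_MONTH  # paid regardless of use
    burst = burst_gpu_hours * payg_rate                         # paid only when used
    return {"reserved": reserved, "burst": burst, "total": reserved + burst}


# Hypothetical workload: 8 reserved GPUs plus 1,000 burst GPU-hours.
budget = monthly_blended_cost(baseline_gpus=8, reserved_rate=2.45,
                              burst_gpu_hours=1_000, payg_rate=3.50)
for line, amount in budget.items():
    print(f"{line:>8}: ${amount:,.0f}")
```

The reserved line is fixed for the term. Only the burst line moves with usage, which is the part to stress-test in the forecast.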

◆ THE MATHS
The break-even maths: a worked example

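The PAYG-vs-reserved break-even point depends on your utilisation. The maths is simple. Reserved bills the discounted rate on every hour of the term, while PAYG bills the full rate only on the hours you use, so the two cost the same when utilisation equals one minus the discount. If your reserved rate is 30% below PAYG, you break even at 70% utilisation. Above 70%, reserved is cheaper. Below 70%, PAYG is cheaper, despite the higher headline rate.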

Worked example. You're considering an 8-GPU H200 cluster over 12 months. PAYG rate: $3.50/GPU/hr. Reserved rate: $2.45/GPU/hr (30% discount). The maths:

Utilisation | PAYG (12 mo) | Reserved (12 mo) | Better choice
30% | ~$73,500 | ~$171,500 | PAYG
50% | ~$122,500 | ~$171,500 | PAYG
70% (break-even) | ~$171,500 | ~$171,500 | Either
85% | ~$208,300 | ~$171,500 | Reserved
95% | ~$232,800 | ~$171,500 | Reserved

PAYG paid only on hours used. Reserved paid on full 12-month commit regardless of utilisation. Indicative GPUaaS.com H200 rates.
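The table can be reproduced with a few lines, which makes it easy to swap in your own rates. This is a sketch under one stated assumption: a full 8,760-hour year. On that basis the exact figures land within a few hundred dollars of the rounded values above. Function names are illustrative.

```python
HOURS_PER_YEAR = 8_760  # assumption: 365 days x 24 hours


def break_even_utilisation(discount: float) -> float:
    """Utilisation at which PAYG and reserved cost the same."""
    return 1 - discount


def annual_costs(gpus: int, payg_rate: float, discount: float,
                 utilisation: float) -> tuple:
    """Return (PAYG cost, reserved cost) for a 12-month term."""
    full_year_payg = gpus * payg_rate * HOURS_PER_YEAR
    payg = full_year_payg * utilisation     # billed only on hours used
    reserved = full_year_payg * (1 - discount)  # billed on every hour
    return payg, reserved


# Worked example from the text: 8 GPUs, $3.50 PAYG, 30% reserved discount.
for utilisation in (0.30, 0.50, 0.70, 0.85, 0.95):
    payg, reserved = annual_costs(8, 3.50, 0.30, utilisation)
    print(f"{utilisation:.0%}: PAYG ${payg:,.0f} vs reserved ${reserved:,.0f}")

print(f"Break-even utilisation: {break_even_utilisation(0.30):.0%}")
```

Changing `discount` moves the break-even directly: a 20% discount needs 80% utilisation, a 40% discount only 60%.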

Two important caveats. First, average GPU utilisation across enterprise Kubernetes clusters in 2026 sits at 5% (Cast AI, April 2026, 23,000 clusters measured). Most teams aren't running anywhere near 70% utilisation on day one. Second, utilisation isn't fixed. A workload that runs at 40% utilisation in month 1 might hit 80% by month 6 as adoption grows. Reserved makes more sense after you've validated the utilisation, not before.
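The second caveat can be made concrete by finding the first month in which a growing workload crosses the break-even. The sketch below assumes a straight-line ramp from 40% to 80% over six months, which is a simplification, and a 30% reserved discount. Function names are illustrative.

```python
from typing import List, Optional


def linear_ramp(start: float, end: float, ramp_months: int,
                total_months: int) -> List[float]:
    """Utilisation per month, rising in equal steps and then holding flat.
    Requires ramp_months >= 2."""
    step = (end - start) / (ramp_months - 1)
    return [min(end, start + step * month) for month in range(total_months)]


def first_month_reserved_wins(utilisation_by_month: List[float],
                              discount: float) -> Optional[int]:
    """First month (1-based) where utilisation exceeds the break-even,
    or None if it never does."""
    break_even = 1 - discount
    for month, utilisation in enumerate(utilisation_by_month, start=1):
        if utilisation > break_even:
            return month
    return None


# Assumed profile: 40% in month 1, 80% by month 6, flat afterwards.
profile = linear_ramp(0.40, 0.80, ramp_months=6, total_months=12)
month = first_month_reserved_wins(profile, discount=0.30)
print(f"Reserved becomes the cheaper monthly option in month {month}")
```

With this ramp the crossover arrives in month 5, so a 12-month commit signed on day one would carry four months of below-break-even utilisation.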

The break-even between PAYG and a 30%-discounted reserved contract sits at 70% utilisation. Above that, reserved wins. Below it, the discount on the reserved rate is wiped out by paying for capacity you don't use.

◆ DECISION FRAMEWORK
The finance team decision framework

Three questions settle the billing model, and a fourth step protects the rate you pay. Work through all four before signing anything.

1. What's your forecast utilisation?

Below 50%: PAYG. 50-70%: PAYG with a plan to move to reserved once usage stabilises. Above 70%: reserved is almost certainly cheaper. Don't guess; look at actual telemetry from the last 30 days.

2. What's the cost of being wrong?

If you commit to 12 months reserved and the workload disappears at month 4, you continue paying for 8 months of unused capacity. Model that downside before signing. On hyperscalers, reserved is typically non-cancellable. On GPUaaS.com, contracts are more flexible.

3. What's the right contract length?

Hyperscaler reserved typically starts at 1 year. Some providers, GPUaaS.com included, offer shorter commits at intermediate discounts. Match the commit length to your utilisation confidence, not to the deepest discount offered.

4. Negotiate the rate before you commit, not after

The headline reserved discount on hyperscalers is fixed. The starting rate isn't. Get competing quotes before signing. GPUaaS.com gives you quotes from multiple vetted providers for H100, H200, B200, and B300 clusters within 24 hours.
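The first two questions can be turned into a quick check. This is a sketch, not a procurement tool. The thresholds are the 50% and 70% cut-offs from this guide, which assume a 30% reserved discount. Treating exactly 70% as still PAYG is an assumption. The 730-hour month and the function names are also illustrative choices.

```python
from statistics import mean
from typing import Sequence

HOURS_PER_MONTH = 730  # assumption: 8,760 hours per year / 12


def recommend_billing_model(daily_utilisation: Sequence[float]) -> str:
    """Map average utilisation from telemetry (e.g. the last 30 days)
    to the guidance in question 1."""
    average = mean(daily_utilisation)
    if average < 0.50:
        return "PAYG"
    if average <= 0.70:
        return "PAYG now, reserved once usage stabilises"
    return "Reserved"


def stranded_commit_cost(gpus: int, reserved_rate: float,
                         commit_months: int, months_used: int) -> float:
    """Spend on reserved capacity after the workload has gone away."""
    unused_months = max(0, commit_months - months_used)
    return gpus * reserved_rate * HOURS_PER_MONTH * unused_months


# Question 1: hypothetical telemetry averaging 60% utilisation.
print(recommend_billing_model([0.60] * 30))

# Question 2: the 12-month commit that loses its workload at month 4,
# using the indicative $2.45 H200 reserved rate on 8 GPUs.
exposure = stranded_commit_cost(gpus=8, reserved_rate=2.45,
                                commit_months=12, months_used=4)
print(f"Stranded spend: ${exposure:,.0f}")
```

The stranded figure is the number to put next to the discount when deciding how long a commit to sign.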

◆ FAQ
Frequently asked questions

PAYG (pay-as-you-go) charges you per GPU-hour with no commitment. You pay only for what you use, at the highest unit rate. Reserved (or committed-use) requires you to commit to a fixed amount of capacity over a defined term, typically 1 to 3 years on hyperscalers, in exchange for a 20 to 40% lower rate. The right choice depends on your forecast utilisation: above 70%, reserved usually wins; below that, PAYG is cheaper despite the higher headline rate.

Spot pricing makes sense for fault-tolerant batch workloads with frequent checkpointing: training runs, embedding generation, research workloads where 24-hour delays are acceptable. It does not make sense for production inference with SLAs or anything customer-facing. The 60-80% discount looks attractive, but once you model the engineering cost of building interruption-tolerant infrastructure and the SLA risk, the maths often reverse.

For a 30% reserved discount, the break-even utilisation is 70%. Above that, reserved is cheaper than PAYG. Below it, PAYG is cheaper despite the higher headline rate. Industry average GPU utilisation sits at 5% (Cast AI, 2026), so most teams aren't anywhere near the threshold on day one. Validate utilisation with 30 to 90 days of PAYG telemetry before committing to reserved.

Hyperscaler reserved contracts typically cannot be cancelled. AWS Savings Plans, Azure Reserved Instances, and GCP Committed Use Discounts are generally non-cancellable. If your workload evolves mid-contract, you continue paying for the full commit regardless. Build that downside risk into your finance model before signing. GPUaaS.com offers more flexible contract terms with shorter commits available, so you can extend as your workload matures rather than locking in multi-year spend upfront.

Model GPU spend as three budget lines: (1) reserved capacity for the stable baseline (fixed monthly OpEx), (2) PAYG burst for variable load (scales with usage), (3) optional spot for batch workloads (deeply discounted but interruptible). Most enterprise teams over-allocate to PAYG and end up paying premium rates for predictable workloads. Build the reserved baseline based on actual telemetry, not forecasts.

GPUaaS.com offers both short-term and long-term contracts across H100, H200, B200, and B300 clusters, at rates hyperscalers won't offer you. Contract terms are more flexible than hyperscaler 1 to 3-year commits. Get a quote and the team will walk through the right model for your workload.

Last reviewed: June 4, 2026. GPU utilisation data from Cast AI 2026 State of Kubernetes Optimisation Report (April 2026, 23,000 clusters). Hyperscaler reserved discount ranges from published AWS, Azure, and GCP pricing pages. GPUaaS.com rates are indicative, contract-based, and quote-dependent on cluster size and contract length.
