What is the difference between PAYG and reserved GPU billing?

PAYG charges per GPU-hour with no commitment, at the highest unit rate. Reserved requires a fixed capacity commitment over a set term (1-3 years on hyperscalers) in exchange for a 20-40% lower rate. At a 30% discount, reserved is cheaper above 70% utilisation; below it, PAYG wins.

When does spot GPU pricing make sense?

Spot makes sense for fault-tolerant batch workloads with frequent checkpointing. It does not make sense for production inference with SLAs. There, the 60-80% discount is often offset by the engineering cost of building interruption-tolerant infrastructure and by the SLA risk.

What utilisation rate justifies a reserved GPU contract?

For a 30% reserved discount, the break-even is 70% utilisation. Average GPU utilisation across enterprise Kubernetes clusters sits at 5% (Cast AI, 2026), so validate utilisation with 30-90 days of PAYG telemetry before committing to reserved.

Are hyperscaler reserved GPU contracts cancellable?

Typically no. AWS, Azure, and GCP reserved contracts are generally non-cancellable. GPUaaS.com offers more flexible contract terms without multi-year lock-in.

GPU Billing Models: PAYG vs Reserved vs Spot (2026)

GPU billing has three shapes: pay-as-you-go (PAYG), reserved/committed contracts, and spot/preemptible. Each has a different cash flow profile, a different risk profile, and a different right answer depending on your utilisation. Stop hunting for GPU compute. GPUaaS.com gets you enterprise NVIDIA infrastructure at rates hyperscalers won't offer you, on whichever billing model your finance team can actually live with.

Key takeaways

Three billing models dominate GPU procurement in 2026: PAYG (highest unit cost, full flexibility), reserved (20-40% lower rate, multi-month to multi-year commit), and spot (60-80% discount, interruptible)
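The break-even utilisation between PAYG and reserved equals one minus the reserved discount: 70% for a 30% discount, and between 60% and 80% across the 20-40% discount range. Above break-even, reserved wins. Below it, PAYG is cheaper despite the higher headline rate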
Spot pricing is only viable for fault-tolerant workloads with checkpointing. For production inference with SLAs, spot is rarely a net win once interruption costs are modelled
Hyperscalers require 1 to 3-year non-cancellable commits to access meaningful reserved discounts. GPUaaS.com offers both short-term and long-term contracts without that lock-in
For finance teams, the maths comes down to three numbers: forecast utilisation, contract length, and the cost of being wrong. Get those right and the model picks itself

Most GPU procurement conversations start with the GPU model. They should start with the billing model. The hardware decision is well-understood by engineering. The contract structure decision determines your cash flow, your downside risk, and whether your budget holds for the year. This guide is for the finance and procurement leads making that call, with the maths in dollar terms. For the underlying rate structures across GPUs, see the GPU pricing guide.

In this article

01 The three GPU billing models: how each one works
02 PAYG: when flexibility justifies the premium
03 Reserved: when the commit pays back
04 Spot: the discount that often isn't
05 The break-even maths: a worked example
06 The finance team decision framework
07 Frequently asked questions

◆ THREE MODELS

The three GPU billing models: how each one works

Every GPU provider, hyperscaler or otherwise, offers some variation on three models. The names differ. The mechanics don't.

Model	Rate	Commit	Cancellable	Best for
PAYG / On-demand	Baseline (highest)	None to monthly	Yes, anytime	Pre-production, bursty workloads, unknown utilisation
Reserved / Committed	20-40% below PAYG	3 months to 3 years	Usually no	Production inference, stable training pipelines
Spot / Preemptible	60-80% below PAYG	None (interruptible)	N/A — provider can reclaim	Fault-tolerant batch jobs, research checkpoints

The headline rate is the easy part. The real differences are in the contract terms and the implicit cost of each model. PAYG gives you flexibility but you pay for it on every hour. Reserved gives you a better rate but locks up working capital and creates downside risk if your workload evolves. Spot gives you the deepest discount but offloads continuity risk to your engineering team.

GPUaaS.com offers both short-term and long-term contracts at rates hyperscalers won't offer you, without the multi-year lock-in that hyperscaler reserved pricing typically requires. Get a quote and see what each model looks like for your workload.

◆ PAYG

PAYG: when flexibility justifies the premium

Pay-as-you-go is the cleanest model from a finance perspective. You pay only for the GPU-hours you consume. No commitment, no contract term, no early-termination fee. The downside: you pay the highest unit rate, and capacity is typically not guaranteed, so the GPUs you release may not be available again when demand spikes elsewhere.

PAYG makes sense in three specific situations. First, when you don't yet know your utilisation profile, typically the first 30 to 90 days of a new workload. Second, when usage is genuinely bursty: training runs that fire monthly, batch processing windows, or research workloads that don't justify persistent capacity. Third, when you're evaluating providers and don't want to lock in before you know whether the infrastructure performs.

The hidden cost of PAYG isn't the rate; it's the operational overhead. Teams running pure PAYG often forget to turn clusters off. A provisioned H100 cluster sitting at 5% utilisation is billed the same as one running at 80%, and that's where most PAYG budgets quietly evaporate. The GPU bill spike post covers the utilisation maths in detail.
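One rough way to see the effect is to divide the billed rate by utilisation, which gives the cost of each GPU-hour of useful work. The sketch below assumes a hypothetical $3.00/GPU/hr PAYG rate (an illustrative figure, not a quoted price) and a cluster that stays provisioned around the clock.

```python
def cost_per_useful_gpu_hour(billed_rate: float, utilisation: float) -> float:
    """Effective cost of one GPU-hour of useful work on always-on capacity.

    billed_rate: price paid per provisioned GPU-hour (hypothetical figure).
    utilisation: fraction of provisioned GPU time doing useful work, in (0, 1].
    """
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in (0, 1]")
    return billed_rate / utilisation


HYPOTHETICAL_RATE = 3.00  # $/GPU/hr, an assumption for illustration only

for u in (0.05, 0.40, 0.80):
    effective = cost_per_useful_gpu_hour(HYPOTHETICAL_RATE, u)
    print(f"{u:.0%} utilised: ${effective:.2f} per useful GPU-hour")
```

Under these assumptions the same $3.00 hour works out to about $60 per useful GPU-hour at 5% utilisation, against $3.75 at 80%.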

Finance perspective

PAYG is OpEx with no balance-sheet impact. Cash flow tracks usage. If usage doubles next quarter, so does the bill. If it halves, so does the bill. Predictable in pattern, unpredictable in amount.

◆ RESERVED

Reserved: when the commit pays back

Reserved (also called committed-use, capacity reservations, or savings plans depending on the provider) trades flexibility for a lower rate. You commit to a fixed amount of GPU capacity over a defined term. In exchange, you typically save 20 to 40% off PAYG rates. On hyperscalers, deeper discounts require longer commits, with 1-year savings plans giving roughly 30% off and 3-year reserved instances reaching 50% or more.

~$553,000 saved on a 256-GPU H100 cluster over 6 months at a $0.50/GPU/hr rate difference (256 GPUs x $0.50 x 24 hours x 180 days = $552,960). GPUaaS.com offers up to ~30% less than hyperscaler reserved rates.

The saving scales linearly with cluster size and running hours, which is what makes reserved compelling at scale. A $0.50/GPU/hr rate difference between PAYG and reserved on an H100 cluster over six months:

Cluster size	Rate gap	6-month saving	In human terms
8-GPU H100	$0.50/GPU/hr	~$17,000	A month of senior eng time
32-GPU H100	$0.50/GPU/hr	~$69,000	A senior ML hire
80-GPU H100	$0.50/GPU/hr	~$173,000	Two senior engineers
256-GPU H100	$0.50/GPU/hr	~$553,000	Your next model training run

Based on 24/7 operation over 180 days.
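The table figures follow from a single multiplication, sketched here so the same sum can be rerun for other cluster sizes or rate gaps. The function name and defaults are illustrative; the $0.50 gap, 24/7 operation and 180-day window are taken from the table.

```python
def rate_gap_saving(gpus: int, rate_gap: float, days: int = 180, hours_per_day: int = 24) -> float:
    """Saving from a per-GPU-hour rate gap on a cluster running around the clock."""
    return gpus * rate_gap * hours_per_day * days


RATE_GAP = 0.50  # $/GPU/hr difference between PAYG and reserved, from the table

for gpus in (8, 32, 80, 256):
    print(f"{gpus:>3}-GPU cluster: ${rate_gap_saving(gpus, RATE_GAP):,.0f} over 180 days")
```

The exact results ($17,280, $69,120, $172,800 and $552,960) round to the approximate figures in the table.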

The catch on hyperscaler reserved contracts: they're typically non-cancellable. If your workload evolves, your model architecture changes, or you find a better rate mid-term, you continue paying. AWS Savings Plans, Azure Reserved Instances, and GCP Committed Use Discounts all carry this risk. Finance teams need to model the commit as a fixed liability, not a flexible spend line.

GPUaaS.com's contract terms are shorter and more flexible than hyperscaler 1 to 3-year commitments. You can start with a shorter contract and extend as your workload matures, without locking in multi-year spend upfront. For the deep-dive framework on when reserved makes economic sense, see the reserved vs on-demand GPU guide.

◆ SPOT

Spot: the discount that often isn't

Spot pricing offers the deepest discount, typically 60 to 80% off PAYG. The catch: the provider can reclaim your capacity with minutes of notice when on-demand demand spikes. For finance teams modelling spot into a budget, the headline rate is half the picture. The other half is the operational cost of building, running, and recovering from interruptions.

Spot makes sense for a narrow set of workloads. Batch training runs with frequent checkpointing. Embedding generation. Research workloads where 24-hour delays are acceptable. Anything fault-tolerant where the job can resume from a checkpoint when capacity returns.

Spot does not make sense for production inference with SLAs, real-time serving, customer-facing APIs, or anything where interruption translates to revenue loss. The discount looks attractive in the spreadsheet. Once you model the engineering cost of building checkpointing infrastructure, the cost of interruptions during peak demand (exactly when you need capacity most), and the SLA risk, the maths often reverses.
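To make that reversal concrete, here is a minimal sketch of spot cost per useful GPU-hour. It assumes a simple model that is not from any provider: a share of billed hours is lost to interrupted work that has to be rerun, and a one-off engineering cost for checkpointing is spread over the job. The $3.50 PAYG rate comes from the worked example later in this guide; the 70% discount is the midpoint of the 60-80% range; the 15% rework share and $60,000 engineering cost are hypothetical. SLA risk is not modelled.

```python
def spot_cost_per_useful_hour(
    payg_rate: float,
    spot_discount: float,
    rework_fraction: float,
    engineering_cost: float,
    useful_gpu_hours: float,
) -> float:
    """Spot cost per GPU-hour of completed work under a simple overhead model.

    rework_fraction: share of billed hours lost to interruptions (hypothetical).
    engineering_cost: one-off cost of checkpointing infrastructure (hypothetical).
    """
    if not 0 <= rework_fraction < 1:
        raise ValueError("rework_fraction must be in [0, 1)")
    spot_rate = payg_rate * (1 - spot_discount)
    billed_hours = useful_gpu_hours / (1 - rework_fraction)
    total = spot_rate * billed_hours + engineering_cost
    return total / useful_gpu_hours


PAYG_RATE = 3.50  # $/GPU/hr, indicative rate from the worked example

for hours in (20_000, 200_000):
    cost = spot_cost_per_useful_hour(PAYG_RATE, 0.70, 0.15, 60_000, hours)
    verdict = "cheaper than PAYG" if cost < PAYG_RATE else "more expensive than PAYG"
    print(f"{hours:,} useful GPU-hours: ${cost:.2f}/hr ({verdict})")
```

With these inputs, a 20,000 GPU-hour job ends up at roughly $4.24 per useful hour, above the PAYG rate, while a 200,000 GPU-hour job lands near $1.54. The discount only survives when there is enough volume to absorb the fixed overhead.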

⚠ Watch out

Spot capacity tends to disappear at the worst possible moments: during demand spikes, when on-demand and reserved customers are also using their full allocation, spot is the first capacity to be reclaimed. Build that risk into the model before committing the architecture.

For most enterprise teams, the cleanest finance model is reserved for the baseline (predictable, stable workloads) plus PAYG for the burst (training runs, experimentation). Spot fits as a tactical layer for specific batch workloads, not as a strategic billing model.
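The reserved-baseline-plus-PAYG-burst split can be sized with a short calculation. The sketch below assumes a hypothetical hourly demand profile (8 GPUs steady, 32 GPUs for four hours a day) and reuses the indicative $3.50 PAYG and $2.45 reserved rates from the worked example later in this guide. It illustrates the idea and is not a sizing tool.

```python
def blended_cost(demand: list[int], reserved_gpus: int, payg_rate: float, reserved_rate: float) -> float:
    """Cost of reserving a fixed baseline and buying the overflow on PAYG.

    demand: GPUs needed in each hour of the period (hypothetical profile).
    """
    reserved = reserved_gpus * reserved_rate * len(demand)
    burst_gpu_hours = sum(max(d - reserved_gpus, 0) for d in demand)
    return reserved + burst_gpu_hours * payg_rate


# Hypothetical week: 8 GPUs for 20 hours a day, 32 GPUs for a 4-hour daily burst.
demand = ([8] * 20 + [32] * 4) * 7
PAYG_RATE, RESERVED_RATE = 3.50, 2.45  # indicative rates from the worked example

costs = {n: blended_cost(demand, n, PAYG_RATE, RESERVED_RATE) for n in range(0, 33)}
best = min(costs, key=costs.get)

print(f"All PAYG:          ${costs[0]:,.2f}")
print(f"All reserved (32): ${costs[32]:,.2f}")
print(f"Best baseline: {best} GPUs reserved -> ${costs[best]:,.2f}")
```

For this profile the cheapest option is to reserve exactly the steady 8-GPU baseline and pay PAYG for the burst. Reserving for the peak costs more than running everything on PAYG.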

◆ THE MATHS

The break-even maths: a worked example

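The PAYG-vs-reserved break-even point depends on your utilisation, meaning the share of the committed hours you would actually run. The maths is simple: PAYG bills only the hours used at the full rate, while reserved bills every hour at the discounted rate, so the two cost the same when utilisation equals one minus the discount. If your reserved rate is 30% below PAYG, you break even at 70% utilisation. Above 70%, reserved is cheaper. Below 70%, PAYG is cheaper, despite the higher headline rate.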

Worked example. You're considering an 8-GPU H200 cluster over 12 months. PAYG rate: $3.50/GPU/hr. Reserved rate: $2.45/GPU/hr (30% discount). The maths:

Utilisation	PAYG (12 mo)	Reserved (12 mo)	Better choice
30%	~$73,600	~$171,700	PAYG
50%	~$122,600	~$171,700	PAYG
70% (break-even)	~$171,700	~$171,700	Either
85%	~$208,500	~$171,700	Reserved
95%	~$233,000	~$171,700	Reserved

Based on 8 GPUs over 8,760 hours (365 days). PAYG paid only on hours used. Reserved paid on full 12-month commit regardless of utilisation. Indicative GPUaaS.com H200 rates.
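The comparison can be reproduced with a few lines of code, which makes it easy to swap in your own rates. This is a sketch under one stated assumption: a 12-month term is taken as 8,760 hours (365 days), so totals may differ slightly from rounded figures quoted in the table. Function names are illustrative.

```python
HOURS_PER_YEAR = 8_760  # assumption: 365 days x 24 hours


def payg_cost(gpus: int, rate: float, utilisation: float, hours: int = HOURS_PER_YEAR) -> float:
    """PAYG bills only the GPU-hours actually used."""
    return gpus * rate * hours * utilisation


def reserved_cost(gpus: int, rate: float, hours: int = HOURS_PER_YEAR) -> float:
    """Reserved bills every committed GPU-hour, used or not."""
    return gpus * rate * hours


def break_even_utilisation(payg_rate: float, reserved_rate: float) -> float:
    """Utilisation at which both models cost the same (one minus the discount)."""
    return reserved_rate / payg_rate


GPUS, PAYG_RATE, RESERVED_RATE = 8, 3.50, 2.45  # figures from the worked example
reserved = reserved_cost(GPUS, RESERVED_RATE)
threshold = break_even_utilisation(PAYG_RATE, RESERVED_RATE)
print(f"Break-even utilisation: {threshold:.0%}")

for u in (0.30, 0.50, 0.70, 0.85, 0.95):
    payg = payg_cost(GPUS, PAYG_RATE, u)
    if abs(payg - reserved) < 1.0:  # treat sub-dollar gaps as a tie
        choice = "Either"
    else:
        choice = "PAYG" if payg < reserved else "Reserved"
    print(f"{u:.0%}: PAYG ${payg:,.0f} vs reserved ${reserved:,.0f} -> {choice}")
```

Because the break-even is just the ratio of the two rates, a 20% discount moves it to 80% utilisation and a 40% discount to 60%.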

Two important caveats. First, average GPU utilisation across enterprise Kubernetes clusters in 2026 sits at 5% (Cast AI, April 2026, 23,000 clusters measured). Most teams aren't running anywhere near 70% utilisation on day one. Second, utilisation isn't fixed. A workload that runs at 40% utilisation in month 1 might hit 80% by month 6 as adoption grows. Reserved makes more sense after you've validated the utilisation, not before.

The break-even between PAYG and a 30%-discounted reserved contract sits at 70% utilisation. Above that, reserved wins. Below it, the discount on the reserved rate is wiped out by paying for capacity you don't use.

◆ DECISION FRAMEWORK

The finance team decision framework

Three questions answer the billing model question. Get all three answered before signing anything.

What's your forecast utilisation?

These bands assume a reserved discount of around 30%. Below 50%: PAYG. 50-70%: PAYG with a plan to move to reserved once usage stabilises. Above 70%: reserved is almost certainly cheaper. Don't guess; look at actual telemetry from at least the last 30 days.
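As a rough illustration of turning telemetry into that call, the sketch below takes daily GPU-hours actually used, converts them to utilisation against the cluster's capacity, and applies the bands above. The telemetry format (one number per day) and the function name are assumptions. Real monitoring exports will look different.

```python
def recommend_billing_model(
    daily_gpu_hours_used: list[float],
    gpus: int,
    review_band: float = 0.50,
    break_even: float = 0.70,
) -> tuple[float, str]:
    """Map measured utilisation to the article's utilisation bands.

    daily_gpu_hours_used: GPU-hours consumed on each day (hypothetical format).
    """
    if not daily_gpu_hours_used:
        raise ValueError("need at least one day of telemetry")
    capacity = gpus * 24 * len(daily_gpu_hours_used)
    utilisation = sum(daily_gpu_hours_used) / capacity
    if utilisation > break_even:
        recommendation = "reserved"
    elif utilisation >= review_band:
        recommendation = "PAYG now, plan a move to reserved"
    else:
        recommendation = "PAYG"
    return utilisation, recommendation


# Hypothetical 30 days of telemetry for an 8-GPU cluster (192 GPU-hours/day capacity).
utilisation, recommendation = recommend_billing_model([100.0] * 30, gpus=8)
print(f"Measured utilisation: {utilisation:.0%} -> {recommendation}")
```

The default 70% threshold matches a 30% discount; for a different discount, pass `break_even` as one minus that discount.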

What's the cost of being wrong?

If you commit to 12 months reserved and the workload disappears after month 4, you keep paying for 8 months of unused capacity. Model that downside before signing. On hyperscalers, reserved is typically non-cancellable. On GPUaaS.com, contracts are more flexible.
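That downside is easy to put a number on. The sketch below prices the scenario using the indicative rates from the worked example (8 GPUs, $3.50 PAYG, $2.45 reserved). It assumes 730 hours per month and full utilisation during the months the workload runs, both simplifications for illustration.

```python
HOURS_PER_MONTH = 730  # assumption: 8,760 hours per year / 12


def stranded_commit_cost(gpus: int, reserved_rate: float, term_months: int, months_used: int) -> float:
    """Reserved spend on months after the workload has gone away."""
    if not 0 <= months_used <= term_months:
        raise ValueError("months_used must be between 0 and term_months")
    return gpus * reserved_rate * HOURS_PER_MONTH * (term_months - months_used)


def regret_vs_payg(gpus: int, payg_rate: float, reserved_rate: float,
                   term_months: int, months_used: int) -> float:
    """Extra cost of the full reserved term over PAYG for only the months used."""
    reserved_total = gpus * reserved_rate * HOURS_PER_MONTH * term_months
    payg_total = gpus * payg_rate * HOURS_PER_MONTH * months_used
    return reserved_total - payg_total


stranded = stranded_commit_cost(gpus=8, reserved_rate=2.45, term_months=12, months_used=4)
regret = regret_vs_payg(gpus=8, payg_rate=3.50, reserved_rate=2.45, term_months=12, months_used=4)
print(f"Stranded reserved spend: ${stranded:,.0f}")
print(f"Extra cost vs PAYG:      ${regret:,.0f}")
```

With these inputs, the 8 unused months cost roughly $114,000, and the reserved contract ends up about $90,000 more expensive than paying PAYG for the four months the workload actually ran.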

What's the right contract length?

Hyperscaler reserved typically starts at 1 year. Some providers, GPUaaS.com included, offer shorter commits at intermediate discounts. Match the commit length to your utilisation confidence, not to the deepest discount offered.

Negotiate the rate before you commit, not after

The headline reserved discount on hyperscalers is fixed. The starting rate isn't. Get competing quotes before signing. GPUaaS.com gives you quotes from multiple vetted providers for H100, H200, B200, and B300 clusters within 24 hours.


◆ FAQ

Frequently asked questions

Last reviewed: June 4, 2026. GPU utilisation data from Cast AI 2026 State of Kubernetes Optimisation Report (April 2026, 23,000 clusters). Hyperscaler reserved discount ranges from published AWS, Azure, and GCP pricing pages. GPUaaS.com rates are indicative, contract-based, and quote-dependent on cluster size and contract length.
