What is the real total cost of ownership for a GPU cluster?

GPU hardware is only 25-35% of 5-year TCO. Power, cooling, networking, staff, and maintenance make up the rest. A 100-GPU H100 cluster with $3M hardware typically reaches $8.6M 5-year TCO (Introl, April 2026).

When does owning a GPU cluster beat renting?

Owned hardware beats renting above ~70% sustained utilisation. Below that, idle capacity outweighs the per-hour discount. The 2026 default is hybrid.

How much does power cost for a 100-GPU H100 cluster?

Roughly $420,000 per year at $0.12/kWh and 1.4 PUE. Higher at current US commercial average rates of $0.13-0.14/kWh.

What is the break-even utilisation for owning vs renting a GPU cluster?

Approximately 70% sustained utilisation for B200 clusters and 60-70% across most GPU generations. Cast AI's 2026 data shows average utilisation is just 5% across 23,000 measured clusters.

How does GPUaaS.com pricing compare to owning a GPU cluster?

GPUaaS.com contract rates from ~$2.50/GPU/hr H100, ~$3.00/GPU/hr H200, ~$4.50/GPU/hr B200 and B300. Owned hardware lands at ~$2.88/GPU/hr effective for 256-GPU B200 at 80% utilisation.

The Real TCO of a GPU Cluster: Hardware, Power, Staff (2026)

A 100-GPU H100 cluster looks like a $3M decision on the spreadsheet. The actual five-year cost is closer to $8.6M once you add power, cooling, networking, staff, and maintenance, according to Introl's April 2026 TCO model. Hardware is only 25-35% of GPU cluster TCO. The rest is what kills budgets that didn't account for it. This is the line-by-line breakdown of what a real GPU cluster costs to run, why hyperscaler "you don't see it" pricing isn't free either, and where GPUaaS.com lands against both.

Key takeaways

GPU hardware is 25-35% of 5-year TCO on a self-built cluster. Power, cooling, networking, staff, and maintenance make up the other 65-75% (Introl, April 2026; KAD, April 2026)
A 100-GPU H100 cluster runs ~$3M in hardware but $8.6M over 5 years all-in. Gartner: 73% of enterprises underestimate AI infrastructure costs by failing to account for opex (Introl, April 2026)
On-prem effective rate for an 8-GPU H100 node is ~$4.75/GPU/hr before networking and facility lease. Spheron's on-demand H100 is $2.90/hr (Spheron, April 2026)
A 256-GPU B200 cluster owned and colocated in Texas costs ~$15.5M over 3 years with loan + opex, or ~$2.88/GPU/hr at 80% utilisation. Buying beats renting above ~70% utilisation (American Compute, March 2026)
Hyperscaler "all-in" pricing isn't free either. Egress, storage, support tiers, and reserved-but-idle capacity stack 20-40% on top of the compute rate
GPUaaS.com offers both short-term and long-term contracts at rates hyperscalers won't offer you. H100 from ~$2.50/GPU/hr, H200 from ~$3.00/GPU/hr, B200 and B300 from ~$4.50/GPU/hr, with no multi-year lock-in

GPU cluster TCO conversations get derailed at the same point every time: someone quotes the hardware sticker price, someone else quotes a per-hour cloud rate, and the comparison breaks because they aren't comparing the same things. The real TCO of a cluster is hardware plus power plus cooling plus networking plus storage plus staff plus depreciation plus the cost of getting it wrong. This guide walks through each line item with 2026 numbers and shows where the contract-based GPU rental model fits. For the broader pricing structure, see the GPU pricing guide.

In this article

01Hardware: the easy 25-35% of the bill 02Power and cooling: the line that compounds 03Networking, storage and the facility 04Staff, maintenance and the obsolescence clock 05The full-stack TCO: worked example for a 100-GPU H100 cluster 06Build vs rent: where the break-even actually sits in 2026 07Frequently asked questions

◆ HARDWARE

Hardware: the easy 25-35% of the bill

Hardware is the line most teams build the whole budget around. It's the wrong line. On a five-year horizon, GPU hardware is roughly 25-35% of total cost (KAD, April 2026; Silicon Analysts cluster TCO calculator). The rest is what the cloud provider absorbs and what the on-prem team forgets to model.

Reference numbers as of mid-2026, from Silicon Analysts' cluster TCO data and io.net's January 2026 buyer's guide:

Cluster	Hardware capex	Power draw	Notes
DGX H100 (8 GPUs)	$200K-$500K	~10 kW	Air-cooled single node
64x H100 training cluster	~$4.0M	~90 kW	+ $0.4M storage, $0.5M network, $0.3M racks
256-GPU H100 SuperPOD	$7-10M	~540 kW	Chilled-water cooling, 20 racks
2,048-GPU H100 datacenter	~$15.8M capex	~2.2 MW	Redundant chillers, InfiniBand spines

Capex ranges from Silicon Analysts cluster TCO calculator and io.net January 2026 buyer's guide. Power draw includes server overhead but excludes PUE.

The numbers compress with scale. Per-GPU hardware cost drops about 8% on bulk orders at 512 GPUs versus 64. The opex doesn't compress the same way: facility, power, cooling, networking, and staff costs are largely fixed regardless of how you got the discount on the silicon.

There's also a depreciation problem nobody likes to model. Cyfuture's January 2026 analysis pegs GPU resale value loss at roughly 50% over two years as new generations land. An H100 purchased in early 2024 is worth meaningfully less in 2026 with B200 and B300 in market and H200 reserved contracts expiring. If you depreciate over 5 years, the back half of that schedule is on hardware that's already obsolescent.

◆ POWER + COOLING

Power and cooling: the line that compounds

An H100 SXM5 draws roughly 1,400W of total system power per GPU once you account for CPU, networking, and cooling overhead, per Silicon Analysts. A 1,024-GPU B200 cluster draws about 1.8 MW. A GB200 NVL72 rack draws 120-130 kW and requires liquid cooling.

Spheron's April 2026 power cost guide breaks down the math for a 1,000-GPU H100 cluster: 1,000 GPUs x 700W x 1.80 server overhead multiplier x 1.4 PUE = approximately 1.76 MW continuous draw. At $0.12/kWh (a conservative floor versus the current US commercial average of $0.13-0.14/kWh per EIA), power alone runs about $152/month per GPU. For a single 8-GPU node, on-prem TCO works out to roughly $1.69/hr depreciation + $0.21/hr power + $2.85/hr everything else = ~$4.75/GPU/hr before networking and facility lease.

$420K/yr

Power bill for a 100-GPU cluster, before cooling overhead (Introl, April 2026)

For context, GPUaaS.com H100 at ~$2.50/GPU/hr buys you the GPU and all of this without separate line items.

Cooling is the line that bites teams late. Air-cooled racks top out around 30 kW. Once you push past that (and B200 already does at the rack level), you need direct-to-chip or immersion liquid cooling. 3exhosting's May 2026 analysis pegs the capex delta for direct-to-chip at $2,500-$4,500 per kW above air-cooled, but operational costs are lower because liquid achieves PUE below 1.2 versus 1.4-1.6 for air. Over 10 years, single-phase immersion has roughly 20% lower TCO than air with rear-door heat exchangers.

According to Spheron's April 2026 power consumption analysis, on-prem TCO for an 8-GPU H100 node works out to approximately $4.75/GPU/hr before networking, maintenance contracts, and facility lease costs, while Spheron's on-demand H100 rate is $2.90/hr.

There's also PUE risk that's rarely modelled honestly. A colocation facility promising PUE 1.3 but delivering 1.5 in summer means your electricity bill is roughly 15% higher than the spreadsheet said. If you're modelling TCO at 1.2 PUE because that's what the brochure says, you're underbudgeting by 15-25% on the power line.

◆ NETWORKING + STORAGE

Networking, storage and the facility

Networking is 10-15% of cluster TCO, per Silicon Analysts. InfiniBand spines for a 2,048-GPU cluster need 10 spine switches and the cabling that goes with them. For a 64-node 512-GPU cluster, networking equipment alone runs around $500K capex.

There's a subtle correctness issue on top of the cost. io.net's January 2026 buyer's guide flags that a single Broadcom PEX8900 96-lane switch hanging eight GPUs behind one NUMA node can add 8 microseconds of DMA contention, reducing throughput by 12%. The fix is a 2-root-complex topology with two CPU sockets, but that's a hardware decision that has to be made at purchase time. Get it wrong and you've paid for 12% throughput you'll never see.

Storage is the other line that compounds. Datasets for foundation model training routinely sit in the petabyte range. High-performance parallel filesystems (Lustre, BeeGFS, WekaFS) cost more than vanilla object storage. A 64-GPU cluster needs roughly $400K of storage capex to feed the GPUs without bottlenecking.

Datacenter space adds another 10-15% of TCO. For a 64-node 512-GPU H100 cluster needing roughly 540 kW and 20 racks, colocation in a tier-3 facility runs anywhere from $300K to $600K per year depending on region and power pricing. Owned datacenter is heavier capex but lower per-kW opex over a 10+ year horizon.

⚠ Watch out

Idle facility costs while waiting for GPUs to arrive are the most-missed line. io.net flags that every month of facility lease before the first GPU lands costs ~$180K. If your supplier slips by 3 months, that's $540K spent on empty racks. Build-from-scratch projects regularly hit this; rented capacity doesn't.

◆ STAFF + MAINTENANCE

Staff, maintenance and the obsolescence clock

Introl's April 2026 TCO model pegs a GPU infrastructure engineer at roughly $275K fully loaded. You need at least two for a 100-GPU cluster, more for anything north of 500 GPUs. io.net's model assumes 5 FTEs for a 24x7 ops team at 512 GPUs.

Maintenance contracts on the hardware itself land in the 8-12% of capex per year range. Cyfuture's January 2026 analysis lumps maintenance and staffing together at $150K+ annually for a single-cluster team. Downtime from failures runs 5-10% utilisation loss in their model. That's not noise: at 80% target utilisation, a 5% downtime hit knocks effective utilisation to 76%, which moves the per-GPU-hour cost up by about 5%.

Obsolescence is the line that nobody likes to write down. Silicon Data's pricing trends analysis notes that many organisations who reserved A100 and H100 hardware under 3-year contracts in 2023-2024 will see those reservations expire in 2026, with additional inventory returning to the market as buyers transition to B200 and GB300. Translation: the resale value of H100 hardware bought in 2024 is meaningfully lower in 2026 than the depreciation curve assumed. Cyfuture estimates 50% value loss over two years.

According to Introl's April 2026 TCO model, a single GPU engineer commands roughly $275K annually fully loaded, and Gartner data cited in the same report shows 73% of enterprises underestimate AI infrastructure costs by failing to account for operational expenses.

◆ WORKED EXAMPLE

The full-stack TCO: worked example for a 100-GPU H100 cluster

Introl's April 2026 TCO model walks through a 100-GPU H100 deployment. Hardware: $3M. Five-year all-in cost: $8.6M. Hardware is 35% of the bill. The rest:

Line item	5-year cost	% of TCO	Source
Hardware (100 x H100 + racks)	$3.0M	35%	Introl April 2026
Power (5 yrs at $420K/yr)	$2.1M	24%	Introl April 2026
Cooling, networking, storage	$1.4M	16%	Silicon Analysts cluster TCO
Staff (2 eng x 5 yrs)	$1.5M	17%	Introl April 2026
Maintenance, downtime, software	$0.6M	8%	Introl April 2026

5-year TCO: $8.6M. Effective rate at 80% utilisation: roughly $2.45/GPU/hr.

Compounding effect at scale matters. A 256-GPU B200 cluster owned and air-cooled colocated in Texas runs about $12M in hardware and $15.5M all-in TCO over 3 years with a loan and operating costs, per American Compute's March 2026 analysis. At 80% utilisation, that's about $2.88/GPU/hr actually used. Most enterprises finance ~70% of hardware via a loan, which means $4M in cash up front plus monthly loan repayments around $300K. The TCO is the obvious number. The cash flow timing is the one that breaks teams.

⚡ The 165% rule

Introl's research on real deployments shows organisations that model only hardware costs discover budget overruns averaging 165% by year three. Examples in their dataset: a biotech that budgeted $2M for 50 H100s and hit $7.8M actual TCO; an autonomous vehicle startup that budgeted $6M for 200 GPUs and hit $28M. Both achieved ROI eventually, but both needed emergency funding mid-deployment.

Hyperscaler "you don't see it" pricing isn't free either. Hivenet's June 2026 analysis estimates that when storage ($0.10-0.50/hr workspace), bandwidth ($0.12-1.00/GB egress), and platform fees are included, the effective hyperscaler GPU hourly cost can reach 2-3x the advertised rate. A $4 H100 hourly rate that becomes $8-12 in practice changes the rent-vs-buy maths entirely. See the GPUaaS.com vs hyperscaler pricing breakdown for the full line-by-line.

◆ BUILD VS RENT

Build vs rent: where the break-even actually sits in 2026

American Compute's March 2026 analysis lands at a clean conclusion for B200 clusters: ownership becomes cheaper than renting somewhere above 70% utilisation. Below that, the idle capacity you've paid for outweighs the per-hour discount. KAD's April 2026 analysis lands at a similar 60-70% break-even threshold.

The break-even shifts with three variables: electricity rate (lower = ownership wins sooner), GPU generation lifespan (longer = ownership wins), and reserved cloud pricing (lower reserved rates push break-even higher). Introl's analysis flags that break-even has shifted to favour cloud below 60-70% utilisation thanks to neo-cloud reserved rates approaching $2/hr for H100 by mid-2026.

Most teams sit below the break-even and don't know it. The dominant 2026 procurement pattern is hybrid: a steady-state reserved base for production load, plus on-demand or short-term contract capacity for everything else. Run a base cluster for steady-state workloads, and add flexible capacity for peaks and experiments. The base can be owned hardware or a long-term reserved contract; the flex layer rides on top.

Option for 256-GPU B200 base	3-yr cost	Effective $/GPU/hr	Notes
Owned hardware + colo (TX)	~$15.5M	~$2.88	80% utilisation, with loan
Hyperscaler 3-yr reserved B200	~$18-25M	~$3.50-4.50	Includes egress + support, non-cancellable
GPUaaS.com contract B200	Quote-dependent	from ~$4.50	Short-term and long-term, no multi-year lock-in

Owned hardware: American Compute March 2026. Hyperscaler reserved estimated from public 1-year tier pricing + typical 3-yr discount. GPUaaS.com rates are indicative.

The case for ownership is utilisation discipline. You commit capital, you make sure it runs hot. The case against ownership is everything that goes wrong before you get there: facility build delays, GPU shipment delays, the ops team you have to hire before you have anything to operate, the obsolescence curve that's faster than the depreciation schedule says. The case for GPUaaS.com is that none of those risks land on you. You get a contract rate, you get dedicated NVIDIA infrastructure, and the contract length matches your utilisation confidence rather than the 36-month minimum your CFO has to write off if the workload changes.

Your search for enterprise GPU compute ends here.

NVIDIA infrastructure at rates hyperscalers won't offer you. H100, H200, B200, B300 clusters. Short-term and long-term contracts. No multi-year lock-in. Quotes within 24 hours.

Get a quote on your cluster

◆ FAQ

Frequently asked questions

Last reviewed: June 5, 2026. Hardware cost ranges from Silicon Analysts cluster TCO calculator and io.net GPU Cluster Buyer's Guide (January 2026). Power and cooling figures from Spheron AI Inference Power Consumption guide (April 2026) and 3exhosting Data Center Power and Cooling Costs (May 2026). TCO percentages and worked example from Introl GPU Infrastructure TCO model (April 2026) and KAD GPU Cluster TCO analysis (April 2026). 256-GPU B200 build-vs-rent figures from American Compute (March 2026). Hyperscaler hidden cost ranges from Hivenet (June 2026) and Cyfuture (January 2026). GPUaaS.com rates are indicative, contract-based, and quote-dependent.

The Real TCO of a GPU Cluster in 2026

Get a wholesale GPU quote in a few hours

Related articles

B200 vs H100 vs H200: What the Price Difference Actually Tells You About Your Workload

The GPU Market Has Two Prices: The One You're Quoted and the One the Market Clears At

FOMO Is Why Enterprises Are Paying for GPUs They Do Not Use