Cloud infrastructure platforms pivoting to AI inference face a predictable, durable gross margin headwind as inference's share of revenue grows. The mechanism: GPU-intensive inference workloads carry materially lower gross margins than core cloud services (Droplets, managed databases, storage) because GPU CapEx amortization, power, and cooling are large per-unit COGS line items. As inference grows from a small to a significant share of revenue, blended gross margins compress even if the underlying unit economics of each segment are stable.
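The mix-shift arithmetic can be shown in a few lines. A minimal sketch with purely hypothetical figures (a 70% core cloud GM, a 40% inference GM, and a mix shift from 5% to 30% inference): both segment margins are held fixed, yet the blend compresses.

```python
# Hypothetical figures: segment margins stay fixed, yet the blended
# gross margin compresses purely from the revenue mix shift.
core_gm, inference_gm = 0.70, 0.40               # assumed steady-state margins

before = 0.95 * core_gm + 0.05 * inference_gm    # blended GM at 5% inference mix
after  = 0.70 * core_gm + 0.30 * inference_gm    # blended GM at 30% inference mix

print(f"{before:.1%} -> {after:.1%}")            # 68.5% -> 61.0%
```

Roughly 750 bps of compression, with no deterioration in either segment's unit economics.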
This is distinct from the SaaS-plus-hardware-device GM compression pattern (AXON). In that pattern, a software company adds a physical device attach; here, a cloud infrastructure company adds a higher-COGS compute product. Both products are cloud compute, but inference is structurally heavier on COGS than general-purpose cloud.
When evaluating cloud platform companies pivoting to AI inference, build a GM bridge model: (1) estimate steady-state inference GM versus core cloud GM, (2) project the inference revenue mix for each quarter, (3) calculate the resulting blended GM trajectory. Do not carry the legacy GM forward as a stable assumption; the mix shift is predictable and quantifiable. For valuation screens, be cautious about applying the 60% gross margin threshold rigidly to cloud platforms mid-pivot; the relevant questions are whether inference-specific unit economics are improving and whether scale reduces per-unit GPU costs over time.
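The three-step bridge above can be sketched directly. All inputs here are illustrative assumptions (margins, the quarterly mix ramp), not figures for any specific company; the point is the mechanical shape of the trajectory, not the levels.

```python
# Sketch of a GM bridge model for a cloud platform pivoting to AI inference.
# All numeric inputs are hypothetical assumptions for illustration.

def blended_gm(core_gm: float, inference_gm: float, inference_mix: float) -> float:
    """Revenue-weighted blended gross margin for a given inference mix."""
    return core_gm * (1 - inference_mix) + inference_gm * inference_mix

# Step 1: assumed steady-state segment margins.
CORE_GM = 0.70        # core cloud (Droplets, managed databases, storage)
INFERENCE_GM = 0.40   # GPU inference, net of CapEx amortization, power, cooling

# Step 2: assumed inference revenue mix by quarter (hypothetical ramp).
mix_by_quarter = [0.05, 0.10, 0.15, 0.22, 0.30]

# Step 3: the implied blended GM trajectory.
trajectory = [blended_gm(CORE_GM, INFERENCE_GM, m) for m in mix_by_quarter]
for q, (mix, gm) in enumerate(zip(mix_by_quarter, trajectory), start=1):
    print(f"Q{q}: inference mix {mix:.0%} -> blended GM {gm:.1%}")
```

Swapping in a model's own margin and mix estimates turns this into the quarterly bridge; sensitivity runs (e.g. inference GM at 35% versus 45%) then show how much of the compression is mix versus unit economics.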