Frequently Asked Questions

How is balance billing different from credits?

With balance billing, the customer thinks in real dollars: each action deducts a monetary amount from their balance. With credits, the customer thinks in abstract units. The two also differ when the balance runs out: credits block usage until the customer buys more, while balance billing keeps usage running and charges the overage at period end.

Can balance billing track AI token costs automatically?

Yes. Balance billing integrates with the AI model catalog to calculate per-token costs using upstream provider rates plus a configurable margin. Each API call deducts the exact computed cost from the customer's balance.

What happens when the balance runs out?

The customer keeps using the product. Any spend beyond the plan's balance accumulates as overage and is charged on the closing invoice at the end of the billing period, similar to metered billing.

What is Balance Billing?

Balance billing converts a plan's base price into a real-dollar spending balance. Usage deducts monetary amounts, and any overage is charged at the end of the billing period.

Balance billing is a pricing model where the customer's plan base price becomes a spendable dollar balance. Every action deducts a real monetary amount. If the balance is exhausted before the period ends, overage accumulates and is charged at the end of the cycle, similar to metered billing. The key difference: the customer thinks in dollars, not abstract units or credit counts.

How balance billing works

A balance plan has a base price that doubles as the customer's spending budget. At the start of each billing period, the customer pays the base price and receives that amount as their balance.

Take a $100/month plan for an AI infrastructure product. The customer starts each month with a $100.00 balance. Every API call to a language model deducts a calculated cost from that balance. A call that processes 2,000 input tokens and generates 500 output tokens at the configured rates might cost $0.0043. After 15,000 such calls, the customer has spent roughly $64.50 and has $35.50 remaining.
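The arithmetic in this example can be checked with a short sketch. The $100.00 balance, the $0.0043 per-call cost, and the 15,000 calls come from the text; the function name `remaining_balance` is hypothetical and is not part of any Commet API.

```python
from decimal import Decimal


def remaining_balance(base_price: Decimal, cost_per_call: Decimal, calls: int) -> Decimal:
    """Return the balance left after `calls` identical usage events."""
    spent = cost_per_call * calls  # 0.0043 * 15,000 = 64.50
    return base_price - spent


balance = remaining_balance(Decimal("100.00"), Decimal("0.0043"), 15_000)
print(balance)  # 35.5000
```

Decimal arithmetic is used here instead of floats so that fractions of a cent add up exactly.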

If the customer spends more than $100.00 in a given month, the excess is charged as overage on the closing invoice. This is the same true-up mechanism used in metered billing: usage is never blocked, and the overage is settled at period end.
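As a minimal sketch of this true-up, the overage on the closing invoice is the spend above the plan balance, and never less than zero. The function name `closing_overage` and the $127.35 spend figure are illustrative assumptions, not taken from the product.

```python
from decimal import Decimal


def closing_overage(plan_balance: Decimal, period_spend: Decimal) -> Decimal:
    """Amount billed on the closing invoice beyond the prepaid balance."""
    return max(Decimal("0"), period_spend - plan_balance)


print(closing_overage(Decimal("100.00"), Decimal("64.50")))   # 0
print(closing_overage(Decimal("100.00"), Decimal("127.35")))  # 27.35
```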

Two pricing modes

Balance billing supports two ways to determine what each unit of usage costs.

Fixed price per unit

The simplest mode. You define a price per unit of consumption, and every usage event deducts that amount. An image processing service might charge $0.02 per image processed. A translation API might charge $0.005 per 1,000 characters. The customer sees a clear, predictable cost per action.
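A fixed-price deduction reduces to one multiplication per usage event. The sketch below uses the two prices from the examples above; the `FIXED_PRICES` table, the feature keys, and `fixed_price_cost` are hypothetical names chosen for illustration.

```python
from decimal import Decimal

# Hypothetical price table (dollars per single unit), built from the examples in the text.
FIXED_PRICES = {
    "image_processed": Decimal("0.02"),            # $0.02 per image
    "character_translated": Decimal("0.000005"),   # $0.005 per 1,000 characters
}


def fixed_price_cost(feature: str, quantity: int) -> Decimal:
    """Dollar amount deducted for `quantity` units of `feature`."""
    return FIXED_PRICES[feature] * quantity


print(fixed_price_cost("image_processed", 250))          # 5.00
print(fixed_price_cost("character_translated", 12_000))  # 0.060000, i.e. $0.06
```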

AI model pricing

For AI products, balance billing integrates with the AI model catalog. Instead of manually setting per-unit prices, the system uses the token costs from the AI model provider (input tokens, output tokens, cache tokens) and applies a configurable margin on top.

For example, if the upstream cost is $3.00 per million input tokens and $15.00 per million output tokens, and you configure a 40% margin, the customer pays $4.20 and $21.00 per million tokens respectively. The margin is defined in basis points per feature per plan, so a 40% margin is entered as 4,000 basis points.
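The margin calculation can be sketched as follows, assuming the margin is added on top of the upstream rate and that 10,000 basis points equal 100%. The function name `price_with_margin` is hypothetical; the rates and the 40% margin are the ones from the example.

```python
from decimal import Decimal

BASIS_POINTS = Decimal(10_000)  # 10,000 basis points = 100%


def price_with_margin(upstream_rate: Decimal, margin_bps: int) -> Decimal:
    """Customer-facing rate: the upstream rate plus a margin given in basis points."""
    return upstream_rate * (BASIS_POINTS + margin_bps) / BASIS_POINTS


margin_bps = 4_000  # 40%
print(price_with_margin(Decimal("3.00"), margin_bps))   # 4.20 per million input tokens
print(price_with_margin(Decimal("15.00"), margin_bps))  # 21.00 per million output tokens
```

Keeping the margin as an integer number of basis points avoids floating-point percentages and makes small adjustments, such as 4,050 for 40.5%, exact.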

This makes balance billing the natural choice for AI products. Pricing stays aligned with your actual costs, and you control your margin without manually updating prices when the upstream provider changes rates. See the AI token billing documentation for configuration.

How balance differs from credits and metered

Balance, credits, and metered are the three consumption models, and each plan uses exactly one.

With credits-based billing, the customer thinks in abstract units. One credit might cost $0.10, but the customer does not see that. They see "50 credits remaining." Credits also block the customer when exhausted, forcing a credit pack purchase to continue.

With metered billing, the customer has an included usage quota (for example, 10,000 API calls) and pays per-unit overage beyond it. The customer thinks in units of consumption, not dollars.

Balance billing behaves like metered billing (overage at period end, no blocking), but the mental model is different. The customer sees "$37.20 remaining" instead of "6,200 calls remaining." For infrastructure and AI products where per-action cost varies by model size, input length, or processing complexity, a single dollar balance is more informative than a count of heterogeneous units.
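The differences between the three models come down to two things: what the customer sees, and whether usage continues once the allowance is used up. The sketch below encodes only what the comparison above states; the enum and function names are hypothetical.

```python
from decimal import Decimal
from enum import Enum


class ConsumptionModel(Enum):
    METERED = "metered"
    CREDITS = "credits"
    BALANCE = "balance"


def allows_usage_when_exhausted(model: ConsumptionModel) -> bool:
    """Credits block at zero; metered and balance continue and bill overage at period end."""
    return model is not ConsumptionModel.CREDITS


def remaining_label(model: ConsumptionModel, remaining) -> str:
    """What the customer sees for each model."""
    if model is ConsumptionModel.BALANCE:
        return f"${remaining:,.2f} remaining"
    if model is ConsumptionModel.METERED:
        return f"{remaining:,} calls remaining"
    return f"{remaining} credits remaining"


print(remaining_label(ConsumptionModel.BALANCE, Decimal("37.20")))  # $37.20 remaining
print(remaining_label(ConsumptionModel.METERED, 6200))              # 6,200 calls remaining
print(remaining_label(ConsumptionModel.CREDITS, 50))                # 50 credits remaining
```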

When to use balance billing

Balance billing is the right model when your product has variable per-action costs and you want the customer to have real-time visibility into their spend in dollars.

AI products are the primary use case. A customer using multiple models with different token prices needs a unified view of spending. Credits would require exchange rates between models, which gets complicated. Balance billing handles this naturally because everything resolves to dollars.

Cloud infrastructure products that combine multiple resource types also benefit. A customer running compute, storage, and network transfer sees one balance reflecting the aggregate cost, rather than tracking three separate meters.

Balance billing is less suitable when you want a hard spending limit. Since overage is charged at period end rather than blocking usage, a customer can overshoot during a spike. If you need a hard stop, credits are a better fit.

Real-world examples

AWS credits work on a balance model: you have a dollar balance, and every service deducts from it. AI API providers like OpenAI and Anthropic charge based on token consumption with dollar-denominated costs, which maps directly to balance billing when resold through a SaaS product.

Implementation considerations

Balance billing requires sub-cent precision. A single API call might cost a fraction of a cent, such as the $0.0043 call in the earlier example. Rounding that to the nearest cent on every transaction would lose or gain significant amounts over millions of events. Commet uses a rate scale where 10,000 equals $1.00, enabling prices as granular as $0.0001 per unit.
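To illustrate why the scale matters, the sketch below stores prices as integers on the 10,000-per-dollar scale described above and compares the exact total with one that rounds each event to the nearest cent. It uses the $0.0043 per-call cost from the worked example earlier in the article; the helper names are hypothetical, and this is not Commet's implementation.

```python
from decimal import Decimal, ROUND_HALF_UP

RATE_SCALE = 10_000  # 10,000 scaled units = $1.00, so one unit = $0.0001


def to_scaled_units(dollars: Decimal) -> int:
    """Convert a dollar price to integer scaled units, rejecting finer precision."""
    scaled = dollars * RATE_SCALE
    if scaled != scaled.to_integral_value():
        raise ValueError("price is finer than $0.0001")
    return int(scaled)


def to_dollars(units: int) -> Decimal:
    return Decimal(units) / RATE_SCALE


events = 1_000_000
price = Decimal("0.0043")

exact_total = to_dollars(to_scaled_units(price) * events)
cent_rounded_total = price.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP) * events

print(to_scaled_units(price))  # 43
print(exact_total)             # 4300
print(cent_rounded_total)      # 0.00
```

With per-event cent rounding, a million calls at $0.0043 each would bill nothing instead of $4,300.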

For AI products, a single usage event might include input tokens, output tokens, and cached tokens, each priced differently. The system must calculate the total cost, apply the margin, and deduct from the balance atomically.
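A minimal sketch of that sequence is shown below. It uses the $3.00 and $15.00 upstream rates and the 4,000 basis point margin from the earlier example, while the cached-token rate, the token counts, and all class and function names are assumptions made for illustration. An in-process lock stands in for whatever transactional mechanism a real billing system would use.

```python
import threading
from decimal import Decimal

MILLION = Decimal(1_000_000)
BASIS_POINTS = Decimal(10_000)


def event_cost(tokens: dict, upstream_rates: dict, margin_bps: int) -> Decimal:
    """Sum each token type at its own per-million rate, then apply the margin once."""
    upstream = sum(
        (Decimal(count) * upstream_rates[kind] / MILLION for kind, count in tokens.items()),
        Decimal(0),
    )
    return upstream * (BASIS_POINTS + margin_bps) / BASIS_POINTS


class BalanceAccount:
    def __init__(self, balance: Decimal) -> None:
        self.balance = balance
        self._lock = threading.Lock()

    def deduct(self, amount: Decimal) -> Decimal:
        """Deduct under a lock so concurrent events cannot interleave.

        The balance may go negative: usage is not blocked, and the
        shortfall is what gets billed as overage at period end.
        """
        with self._lock:
            self.balance -= amount
            return self.balance


rates = {
    "input": Decimal("3.00"),    # from the text, dollars per million tokens
    "output": Decimal("15.00"),  # from the text
    "cached": Decimal("0.30"),   # assumed value for illustration
}
account = BalanceAccount(Decimal("100.00"))
cost = event_cost({"input": 2_000, "output": 500, "cached": 1_000}, rates, margin_bps=4_000)
print(cost)                  # 0.01932 (numerically; trailing zeros may differ)
print(account.deduct(cost))  # 99.98068 (numerically)
```

Computing the full event cost before taking the lock keeps the critical section to a single subtraction.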

Metered Billing: included usage quota with per-unit overage at period end
Credits-Based Billing: abstract credit units that block when exhausted
Usage-Based Billing: overview of all three consumption models
AI Token Billing: configuring AI model pricing with margins
Consumption Models: choosing between metered, credits, and balance
