Google's Gemma 4 family includes two open variants: 26B MoE (3.8B active parameters, optimized for latency) and 31B Dense (all parameters active, optimized for quality). Both support function calling, structured JSON output, vision, up to 256K context, and 140+ languages.
If your product uses Gemma 4 for inference, Commet can track token consumption and charge your customers automatically based on the margin you configure.
How it works
- You configure a feature with Margin AI enabled in your plan.
- You set a margin percentage on top of Gemma 4's base cost.
- Your app reports token usage through the SDK.
- Commet calculates the charge using the AI model catalog pricing and your margin, then bills the customer.
You never need to hardcode per-token prices. Commet looks up the model's pricing from the catalog and applies your margin automatically.
Gemma 4 pricing
| Token type | Cost per million tokens |
|---|---|
| Input | $0.15 |
| Output | $0.30 |
| Cache read | $0.04 |
These are the base costs from the AI model catalog. Your customer pays the base cost plus your configured margin.
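As a rough illustration of what those rates mean for a single call, the sketch below computes the base cost before any margin. It is only a worked example, not Commet's implementation: the `TokenUsage` shape and the helpers `baseCostNanoUsd` and `formatUsd` are hypothetical names, and it assumes input, output, and cache read tokens are each billed at their own rate. Prices are held as integer nano-dollars per token ($0.15 per million tokens is 150 nano-dollars per token) so the arithmetic stays exact.

```typescript
// Base rates from the table above, as integer nano-dollars (1e-9 USD) per token.
// $0.15 per 1M tokens = 150 nano-dollars per token, and so on.
const GEMMA4_NANO_USD_PER_TOKEN = {
  input: 150,
  output: 300,
  cacheRead: 40,
} as const;

// Hypothetical shape for illustration; not a Commet SDK type.
interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
  cacheReadTokens: number;
}

// Sums each token count multiplied by its rate (assumes the three
// token types are counted and billed separately).
function baseCostNanoUsd(usage: TokenUsage): number {
  return (
    usage.inputTokens * GEMMA4_NANO_USD_PER_TOKEN.input +
    usage.outputTokens * GEMMA4_NANO_USD_PER_TOKEN.output +
    usage.cacheReadTokens * GEMMA4_NANO_USD_PER_TOKEN.cacheRead
  );
}

// Formats nano-dollars as a dollar string with six decimal places.
function formatUsd(nanoUsd: number): string {
  return `$${(nanoUsd / 1e9).toFixed(6)}`;
}

const exampleCost = baseCostNanoUsd({
  inputTokens: 1200,
  outputTokens: 450,
  cacheReadTokens: 800,
});
console.log(formatUsd(exampleCost)); // $0.000347
```

Here 1,200 input tokens cost $0.000180, 450 output tokens cost $0.000135, and 800 cache read tokens cost $0.000032, for a base cost of $0.000347.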
Track Gemma 4 usage with the SDK
Report token usage after each inference call:
```typescript
import { Commet } from "@commet/node";

const commet = new Commet({
  apiKey: process.env.COMMET_API_KEY!,
  environment: "production",
});

await commet.usage.track({
  externalId: "org_123",
  feature: "ai_generation",
  model: "google/gemma-4-31b",
  inputTokens: 1200,
  outputTokens: 450,
  cacheReadTokens: 800,
});
```

Commet resolves the model identifier to the correct per-token rates and calculates the charge including your margin.
Automatic tracking with @commet/ai-sdk
If you use the Vercel AI SDK, @commet/ai-sdk reports tokens automatically after every generateText or streamText call:
```typescript
import { google } from "@ai-sdk/google";
import { generateText } from "ai";
import { Commet } from "@commet/node";
import { tracked } from "@commet/ai-sdk";

const commet = new Commet({
  apiKey: process.env.COMMET_API_KEY!,
  environment: "production",
});

const result = await generateText({
  model: tracked(google("gemma-4-31b"), {
    commet,
    feature: "ai_generation",
    customerId: "org_123",
  }),
  prompt: "Explain quantum computing",
});
```

No manual token counting required.
Configuring your margin
In the Commet dashboard, navigate to your plan's feature configuration and enable Margin AI. Set the margin in basis points (1 basis point = 0.01%). For example, 3000 basis points is a 30% margin, so your customer pays 130% of the base token cost.
The margin applies uniformly to input, output, and cache read tokens. Commet calculates the final per-token rate and uses it for all billing.
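To make the basis-point arithmetic concrete, here is a minimal sketch that applies a 3000 basis point margin to each base rate. The helper `applyMarginBps` and the rate objects are hypothetical names for illustration, and rounding to the nearest nano-dollar is an assumption; Commet's actual rounding behavior is not described here.

```typescript
const BPS_DENOMINATOR = 10_000; // 10,000 basis points = 100%

// Applies a margin in basis points to an integer base amount.
// Rounding to the nearest unit is an assumption made for this sketch.
function applyMarginBps(baseAmount: number, marginBps: number): number {
  if (!isFinite(marginBps) || Math.floor(marginBps) !== marginBps || marginBps < 0) {
    throw new RangeError("marginBps must be a non-negative integer");
  }
  return Math.round((baseAmount * (BPS_DENOMINATOR + marginBps)) / BPS_DENOMINATOR);
}

// Base catalog rates in integer nano-dollars (1e-9 USD) per token:
// $0.15, $0.30, and $0.04 per million tokens.
const baseRates = { input: 150, output: 300, cacheRead: 40 };

const marginBps = 3000; // 30% margin

// The same margin is applied to every token type.
const billedRates = {
  input: applyMarginBps(baseRates.input, marginBps),
  output: applyMarginBps(baseRates.output, marginBps),
  cacheRead: applyMarginBps(baseRates.cacheRead, marginBps),
};

console.log(billedRates); // { input: 195, output: 390, cacheRead: 52 }
```

In per-million terms, a 30% margin turns the base rates into $0.195 for input, $0.39 for output, and $0.052 for cache read tokens.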
How your customers pay
Your customer loads a dollar balance into their account. Every Gemma 4 call deducts the computed cost — base token price plus your margin — in real time. No invoices at the end of the month, no usage surprises. They see exactly what they spend as they spend it, and they top up when the balance runs low.
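The balance model can be pictured with a small ledger sketch. This is an illustration only, not Commet's implementation: the `PrepaidBalance` class and its methods are hypothetical, amounts are integer nano-dollars, and rejecting a charge that exceeds the remaining balance is an assumption, since the behavior at zero balance is not specified here.

```typescript
// Hypothetical prepaid balance ledger; amounts are integer nano-dollars (1e-9 USD).
class PrepaidBalance {
  private remainingNanoUsd: number;

  constructor(initialNanoUsd: number) {
    PrepaidBalance.assertAmount(initialNanoUsd);
    this.remainingNanoUsd = initialNanoUsd;
  }

  // Rejects NaN, infinities, fractions, and negative amounts.
  private static assertAmount(amount: number): void {
    if (!isFinite(amount) || Math.floor(amount) !== amount || amount < 0) {
      throw new RangeError("amount must be a non-negative integer");
    }
  }

  remaining(): number {
    return this.remainingNanoUsd;
  }

  // Customer adds funds to the balance.
  topUp(amountNanoUsd: number): void {
    PrepaidBalance.assertAmount(amountNanoUsd);
    this.remainingNanoUsd += amountNanoUsd;
  }

  // Deducts the cost of one call. Throwing on insufficient funds is an
  // assumption of this sketch, not documented Commet behavior.
  charge(costNanoUsd: number): void {
    PrepaidBalance.assertAmount(costNanoUsd);
    if (costNanoUsd > this.remainingNanoUsd) {
      throw new Error("insufficient balance");
    }
    this.remainingNanoUsd -= costNanoUsd;
  }

  // True when the balance has dropped below the given threshold.
  isLow(thresholdNanoUsd: number): boolean {
    return this.remainingNanoUsd < thresholdNanoUsd;
  }
}

const balance = new PrepaidBalance(5_000_000_000); // customer loads $5.00
balance.charge(451_100); // one call: $0.000347 base cost plus a 30% margin
console.log(balance.remaining()); // 4999548900
console.log(balance.isLow(1_000_000_000)); // false
```

Each call reduces the balance immediately, so the customer's remaining funds always reflect usage to date.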