Unit Economics for AI Features: Calculating Profitability Per User

Most AI features bleed money because teams measure success by usage metrics instead of unit economics. A customer support chatbot with 10,000 daily conversations looks successful—until you realize each conversation adds a $0.15 API cost on top of the $0.50 it took to acquire that customer. This guide provides the frameworks to calculate true profitability per user and avoid the “successful failure” trap.

Traditional SaaS unit economics focus on Customer Lifetime Value (LTV) and Customer Acquisition Cost (CAC). AI features add a third dimension: Cost Per Inference (CPI). This creates a dynamic where your most engaged users—those generating the most value—are simultaneously your highest-cost customers.

Consider this scenario: A coding assistant charges $20/month per user. Power users generate 500 inferences daily. At GPT-4o pricing ($5 input/$15 output per 1M tokens), with an average of 2,000 input tokens and 500 output tokens per inference, that’s:

  • Input: 500 × 2,000 = 1,000,000 tokens = $5.00
  • Output: 500 × 500 = 250,000 tokens = $3.75
  • Daily cost: $8.75 ($262.50/month over 30 days)
  • Monthly loss: $242.50 per power user

Without unit economics tracking, you’d celebrate the engagement while burning $242.50 per customer.

Unit economics transform AI feature management from guesswork into financial engineering. When you can calculate margin per inference, you gain three critical levers:

  1. Pricing Precision: Instead of flat-rate subscriptions, you can implement usage-based pricing that preserves margins. A user generating 100 inferences might pay $5 while a user generating 1,000 pays $45—both profitable (a minimal pricing sketch follows this list).

  2. Feature Optimization: If your coding assistant’s “explain code” feature costs $0.12 per call but your “generate test” feature costs $0.45, you can prioritize the former or re-engineer the latter with smaller models.

  3. Customer Segmentation: Identify which user cohorts subsidize others. Your “power users” might be 10% of customers but consume 80% of inference costs—requiring either price increases or feature limits.
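
To make the pricing-precision lever concrete, here is a minimal sketch of usage-based pricing; the base fee, included volume, and overage rate are illustrative assumptions, not recommendations:

def usage_based_price(inferences: int) -> float:
    """Monthly price that scales with inference volume instead of a flat rate."""
    base_fee = 5.00       # covers the first 100 inferences
    included = 100
    overage_rate = 0.045  # per inference beyond the included volume
    if inferences <= included:
        return base_fee
    return base_fee + (inferences - included) * overage_rate

# 100 inferences -> $5.00; 1,000 inferences -> $45.50 (roughly the $5 / $45 example above)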

Without unit economics, you’re flying blind. With them, every product decision becomes a profitability calculation.

Track these metrics for every inference (a minimal record schema is sketched after the list):

  • Input tokens: From your model’s response metadata
  • Output tokens: From your model’s response metadata
  • Model used: For cost lookup
  • User ID: For attribution
  • Feature name: For cohort analysis
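
A minimal sketch of a per-inference log record covering the fields above (the field names are assumptions; adapt them to your own logging pipeline):

from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class InferenceRecord:
    user_id: str        # for attribution
    feature: str        # for cohort analysis, e.g. "explain_code"
    model: str          # for cost lookup, e.g. "gpt-4o"
    input_tokens: int   # from the model's response metadata
    output_tokens: int  # from the model's response metadata
    timestamp: datetime

record = InferenceRecord(
    user_id="user_123",
    feature="generate_test",
    model="gpt-4o",
    input_tokens=2000,
    output_tokens=500,
    timestamp=datetime.now(timezone.utc),
)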

Create a simple table mapping models to their per-token costs. Update this monthly as providers adjust pricing.

For each user interaction:

  1. Look up the model’s cost per 1M tokens
  2. Calculate: (input_tokens × input_cost + output_tokens × output_cost) / 1,000,000
  3. Subtract from your revenue per interaction

Roll up costs and revenue by the following dimensions (a grouping sketch follows the list):

  • User tier (free, pro, enterprise)
  • Feature usage (chat vs. generation vs. analysis)
  • Time period (daily, weekly, monthly)
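
A minimal roll-up sketch, assuming each logged record already carries a tier, feature, and precomputed cost (the cost itself can come from the calculate_inference_cost function in the implementation below):

from collections import defaultdict

def roll_up_costs(records: list[dict]) -> dict[tuple[str, str], float]:
    """Sum inference costs by (tier, feature)."""
    totals: dict[tuple[str, str], float] = defaultdict(float)
    for r in records:
        totals[(r["tier"], r["feature"])] += r["cost"]
    return dict(totals)

records = [
    {"tier": "pro", "feature": "chat", "cost": 0.0175},
    {"tier": "pro", "feature": "generation", "cost": 0.0450},
    {"tier": "free", "feature": "chat", "cost": 0.0175},
]
print(roll_up_costs(records))
# {('pro', 'chat'): 0.0175, ('pro', 'generation'): 0.045, ('free', 'chat'): 0.0175}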

Here’s a Python implementation for tracking unit economics:

from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class ModelPricing:
    input_per_million: float
    output_per_million: float
    provider: str

# Current verified pricing (as of Dec 2024)
MODEL_PRICING: Dict[str, ModelPricing] = {
    "gpt-4o": ModelPricing(5.00, 15.00, "OpenAI"),
    "gpt-4o-mini": ModelPricing(0.15, 0.60, "OpenAI"),
    "claude-3-5-sonnet": ModelPricing(3.00, 15.00, "Anthropic"),
    "haiku-3.5": ModelPricing(1.25, 5.00, "Anthropic"),
}

def calculate_inference_cost(
    model: str,
    input_tokens: int,
    output_tokens: int,
    pricing: Optional[Dict[str, ModelPricing]] = None
) -> float:
    """
    Calculate cost for a single inference.

    Args:
        model: Model identifier (e.g., "gpt-4o")
        input_tokens: Number of input tokens
        output_tokens: Number of output tokens
        pricing: Optional pricing override

    Returns:
        Cost in dollars
    """
    pricing = pricing or MODEL_PRICING
    if model not in pricing:
        raise ValueError(f"Unknown model: {model}")
    model_pricing = pricing[model]
    input_cost = (input_tokens / 1_000_000) * model_pricing.input_per_million
    output_cost = (output_tokens / 1_000_000) * model_pricing.output_per_million
    return input_cost + output_cost

def calculate_user_margin(
    user_id: str,
    inferences: list,
    revenue_per_user: float,
    pricing: Optional[Dict[str, ModelPricing]] = None
) -> Dict[str, float]:
    """
    Calculate margin for a user across all their inferences.

    Args:
        user_id: User identifier
        inferences: List of dicts with model, input_tokens, output_tokens
        revenue_per_user: Monthly revenue from this user
        pricing: Optional pricing override

    Returns:
        Dictionary with total_cost, total_revenue, and margin
    """
    total_cost = sum(
        calculate_inference_cost(
            inf["model"],
            inf["input_tokens"],
            inf["output_tokens"],
            pricing
        )
        for inf in inferences
    )
    margin = revenue_per_user - total_cost
    return {
        "total_cost": total_cost,
        "total_revenue": revenue_per_user,
        "margin": margin,
        "margin_percentage": (margin / revenue_per_user * 100) if revenue_per_user > 0 else 0
    }

# Example usage
if __name__ == "__main__":
    # Power user with 500 daily inferences
    daily_inferences = [
        {
            "model": "gpt-4o",
            "input_tokens": 2000,
            "output_tokens": 500
        }
    ] * 500

    # Monthly: 500 inferences/day × 30 days
    monthly_inferences = daily_inferences * 30

    result = calculate_user_margin(
        user_id="user_123",
        inferences=monthly_inferences,
        revenue_per_user=20.00  # $20/month subscription
    )

    print(f"User 123 - Monthly Cost: ${result['total_cost']:.2f}")
    print(f"User 123 - Monthly Revenue: ${result['total_revenue']:.2f}")
    print(f"User 123 - Margin: ${result['margin']:.2f} ({result['margin_percentage']:.1f}%)")

Output:

User 123 - Monthly Cost: $262.50
User 123 - Monthly Revenue: $20.00
User 123 - Margin: $-242.50 (-1212.5%)

This code reveals the brutal truth: your power user is costing you $242.50/month while paying $20.

Many providers bill cached input tokens at a reduced rate, but teams forget to implement caching. Fix: Check whether your model offers prompt caching and use it aggressively for repeated system prompts and shared context.
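
As a rough illustration, the cost function above could be extended to account for cached input tokens. The 0.5× cached-input multiplier below is an assumption for illustration only; actual discounts vary by provider and model, so take the real rate from your provider’s pricing page:

def calculate_inference_cost_with_cache(
    model: str,
    uncached_input_tokens: int,
    cached_input_tokens: int,
    output_tokens: int,
    cached_input_multiplier: float = 0.5,  # assumed discount, not a quoted rate
) -> float:
    """Cost for a single inference when part of the input is served from cache."""
    p = MODEL_PRICING[model]
    input_cost = (uncached_input_tokens / 1_000_000) * p.input_per_million
    cached_cost = (cached_input_tokens / 1_000_000) * p.input_per_million * cached_input_multiplier
    output_cost = (output_tokens / 1_000_000) * p.output_per_million
    return input_cost + cached_cost + output_cost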

Calculating “average cost per user” hides the variance. Fix: Segment by deciles. Your top 10% of users likely consume 80%+ of costs.
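
A minimal decile sketch, assuming you already have a mapping of user ID to monthly cost built from the tracking code above:

def cost_deciles(user_costs: dict[str, float]) -> list[list[str]]:
    """Split users into ten buckets by monthly cost, cheapest decile first."""
    ranked = sorted(user_costs, key=user_costs.get)
    n = len(ranked)
    return [ranked[i * n // 10:(i + 1) * n // 10] for i in range(10)]

# The number to watch: the top decile's share of total inference cost.
# top_share = sum(user_costs[u] for u in cost_deciles(user_costs)[-1]) / sum(user_costs.values())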

Model providers change prices quarterly. Fix: Build a pricing update mechanism and recalculate historical margins when rates change.
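
One possible shape for that mechanism is a dated pricing history; the effective dates and values below are illustrative:

from datetime import date

PRICING_HISTORY: dict[str, list[tuple[date, ModelPricing]]] = {
    "gpt-4o": [
        (date(2024, 5, 13), ModelPricing(5.00, 15.00, "OpenAI")),
        # append a new (effective_date, ModelPricing) entry whenever rates change
    ],
}

def pricing_as_of(model: str, when: date) -> ModelPricing:
    """Return the pricing version in effect on a given date, for recalculating old margins."""
    versions = [p for effective, p in PRICING_HISTORY[model] if effective <= when]
    if not versions:
        raise ValueError(f"No pricing recorded for {model} before {when}")
    return versions[-1]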

Your $0.15 API call might actually cost $0.18 after adding load balancers, monitoring, and API gateway fees. Fix: Add a 15-20% overhead multiplier to all inference costs.
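
A minimal sketch of the overhead multiplier, with the 18% figure as an assumed placeholder:

INFRA_OVERHEAD = 1.18  # assumed 18% for gateways, load balancers, monitoring, logging

def fully_loaded_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Raw API cost plus an assumed infrastructure overhead."""
    return calculate_inference_cost(model, input_tokens, output_tokens) * INFRA_OVERHEAD

# A $0.0175 gpt-4o call becomes roughly $0.0207 fully loaded.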

Retry logic and errors can double your token consumption. Fix: Log failed calls and include them in cost calculations.

Max CPI = (ARPU - Desired Margin) / Inferences Per Month

Where:

  • ARPU: Average Revenue Per User
  • CPI: Cost Per Inference
  • Desired Margin: Target profit in dollars (e.g., 30% of ARPU)

Example: For a $50/month user with 100 inferences and 30% margin target:

Max CPI = ($50 - $15) / 100 = $0.35 per inference
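
Expressed as code (a small helper mirroring the formula above, with the margin target given as a fraction of ARPU):

def max_cost_per_inference(arpu: float, margin_target: float, inferences_per_month: int) -> float:
    """Maximum affordable cost per inference given ARPU and a margin target."""
    desired_margin = arpu * margin_target
    return (arpu - desired_margin) / inferences_per_month

print(max_cost_per_inference(50.00, 0.30, 100))  # 0.35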

Model               Input/1M   Output/1M   Best For
gpt-4o-mini         $0.15      $0.60       High volume, low complexity
gpt-4o              $5.00      $15.00      Balanced performance
claude-3-5-sonnet   $3.00      $15.00      Complex reasoning
haiku-3.5           $1.25      $5.00       Fast, cost-sensitive tasks

def break_even_analysis(
    subscription_price: float,
    target_margin: float,  # 0.0 to 1.0
    model: str,
    avg_input_tokens: int,
    avg_output_tokens: int
) -> int:
    """Calculate the maximum monthly inferences that still hit the target margin."""
    cost_per_inf = calculate_inference_cost(
        model, avg_input_tokens, avg_output_tokens
    )
    max_cost = subscription_price * (1 - target_margin)
    return int(max_cost / cost_per_inf)
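
For example, plugging in the $20/month subscription and the gpt-4o token profile used throughout this guide:

max_monthly_inferences = break_even_analysis(
    subscription_price=20.00,
    target_margin=0.30,
    model="gpt-4o",
    avg_input_tokens=2000,
    avg_output_tokens=500,
)
print(max_monthly_inferences)  # 800: beyond roughly 27 inferences per day, the margin target is gone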

Unit economics dashboard template + ROI calculator

The calculator below, written as a React/MDX component, lets readers explore how model choice, token volumes, and subscription price affect per-user margin for the models covered in this guide (gpt-4o, gpt-4o-mini, claude-3-5-sonnet, haiku-3.5).

interface UnitEconomicsProps {
  model: string;
  inputTokens: number;
  outputTokens: number;
  subscriptionPrice: number;
  inferencesPerMonth: number;
}

// Example implementation for a React/MDX component
export function UnitEconomicsCalculator({
  model = "gpt-4o",
  inputTokens = 2000,
  outputTokens = 500,
  subscriptionPrice = 20,
  inferencesPerMonth = 100
}: UnitEconomicsProps) {
  const pricing: Record<string, { input: number; output: number }> = {
    "gpt-4o": { input: 5.00, output: 15.00 },
    "gpt-4o-mini": { input: 0.15, output: 0.60 },
    "claude-3-5-sonnet": { input: 3.00, output: 15.00 },
    "haiku-3.5": { input: 1.25, output: 5.00 }
  };
  const costPerInference =
    (inputTokens / 1_000_000) * pricing[model].input +
    (outputTokens / 1_000_000) * pricing[model].output;
  const totalCost = costPerInference * inferencesPerMonth;
  const margin = subscriptionPrice - totalCost;
  const marginPercent = (margin / subscriptionPrice) * 100;

  return (
    <div className="calculator-result">
      <p><strong>Model:</strong> {model}</p>
      <p><strong>Cost per inference:</strong> ${costPerInference.toFixed(4)}</p>
      <p><strong>Monthly cost:</strong> ${totalCost.toFixed(2)}</p>
      <p><strong>Revenue:</strong> ${subscriptionPrice.toFixed(2)}</p>
      <p><strong>Margin:</strong> ${margin.toFixed(2)} ({marginPercent.toFixed(1)}%)</p>
      {margin < 0 && <p style={{color: 'red'}}>⚠️ Unprofitable</p>}
    </div>
  );
}

Unit economics transform AI feature development from a guessing game into a financial discipline. By systematically tracking Cost Per Inference (CPI) against ARPU, you can identify which features, user segments, and model choices drive profitability versus losses. The frameworks provided—profitability equation, break-even analysis, and cohort tracking—enable you to:

  1. Detect hidden losses before they scale (like the $242.50/month power user example)
  2. Optimize pricing based on actual consumption patterns
  3. Select models appropriately for each feature’s complexity
  4. Segment users to prevent free tiers from destroying margins

The verified pricing data shows that model selection critically impacts unit economics: using gpt-4o-mini ($0.15/$0.60) instead of gpt-4o ($5/$15) for simple tasks cuts costs by roughly 30x. Implement the instrumentation, build your cost database, and calculate margins weekly. Your AI features should generate profit, not just engagement.