Use case (monetization) recipes

Choose a pricing pattern that matches your product and audience. All recipes assume you:

Send a stable, pseudonymous user identifier (the user field) on every billable request.
Use an idempotency key (the Idempotency-Key header) for non‑idempotent POSTs.
Let the proxy return renderable assistant messages for auth/top‑ups (no special branching).

Per‑token metering (chat)

When to use

General chat UX where prompts vary widely and you need margin predictability.

How it works

Configure per‑model prices for prompt/completion tokens plus your desired markup.
Charge = token_cost × markup.
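
As a rough illustration of the formula above, the sketch below computes a charge from token counts. The price table, the per‑million‑token units, the model name, and the function name are all assumptions made for illustration; they are not the platform's actual rates or API.

```typescript
// Hypothetical per-model prices in USD per 1M tokens (illustrative values only).
interface ModelPrice {
  promptPerMillion: number;
  completionPerMillion: number;
}

const PRICES: Record<string, ModelPrice> = {
  "example/small-model": { promptPerMillion: 0.5, completionPerMillion: 1.5 },
};

// Charge = token_cost × markup, where markup is a multiplier
// (for example, 1.25 adds 25% on top of the raw token cost).
function estimateCharge(
  model: string,
  promptTokens: number,
  completionTokens: number,
  markup: number,
): number {
  const price = PRICES[model];
  if (!price) throw new Error(`No price configured for model: ${model}`);
  const tokenCost =
    (promptTokens / 1e6) * price.promptPerMillion +
    (completionTokens / 1e6) * price.completionPerMillion;
  return tokenCost * markup;
}
```

An estimate like this is mainly useful for showing users an approximate cost up front; the metered charge itself is finalized by the proxy.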

Steps

In the dashboard, set model pricing.
Use the Chat Completions endpoint normally; metering finalizes at end of stream.

Minimal request

curl https://api.paywalls.ai/v1/chat/completions \
  -H "Authorization: Bearer $PAYWALLS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o-mini",
    "user": "user_123",
    "messages": [{"role":"user","content":"Help me draft a tweet."}]
  }'

Charge per tool call

When to use

Tools (e.g., web browse, RAG search, PDF parse) where value isn’t well expressed in tokens.

Pattern

Combine automatic chat metering with a flat, manual charge per tool call.

Endpoints

Manual charge: POST /v1/user/charge
Optional balance check (display): GET /v1/user/balance

Flow

Before executing the tool, optionally check the user's balance.
If the balance is sufficient, perform the tool action. On success, call POST /v1/user/charge with an idempotency key (e.g., the tool run id).
If the response has success: false, render the returned assistant message (auth/top‑up link) and stop the tool flow.
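
The flow above might be wired up roughly as follows. This is a sketch under stated assumptions: the HttpPost type, the runToolWithCharge helper, and the message field on the response are hypothetical names introduced for illustration; only the /v1/user/charge path, the Idempotency-Key header, and the success flag come from this guide. The optional balance check is left out because its response shape is not described here.

```typescript
// Minimal HTTP abstraction so the flow can be exercised without a network.
type HttpPost = (
  path: string,
  body: unknown,
  headers: Record<string, string>,
) => Promise<{ success: boolean; message?: string }>;

interface ToolOutcome<T> {
  charged: boolean;
  result?: T;
  assistantMessage?: string; // auth/top-up message to render when the charge is refused
}

// Runs the tool first, then charges a flat fee keyed by the tool run id.
async function runToolWithCharge<T>(
  post: HttpPost,
  user: string,
  runId: string,
  amount: string,
  tool: () => Promise<T>,
): Promise<ToolOutcome<T>> {
  const result = await tool(); // if the tool throws, no charge is attempted
  const res = await post(
    "/v1/user/charge",
    { user, amount, metadata: { runId } },
    { "Idempotency-Key": runId }, // same run id on retry, so a retry should not double-charge
  );
  if (!res.success) {
    // Assumption: the refusal carries a renderable message in a `message` field.
    return { charged: false, assistantMessage: res.message };
  }
  return { charged: true, result };
}
```

Passing the HTTP call in as a parameter keeps the helper independent of any particular client library, so the same logic can sit behind fetch, an SDK, or a test double.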

Example charge

curl -X POST https://api.paywalls.ai/v1/user/charge \
  -H "Authorization: Bearer $PAYWALLS_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: run_123" \
  -d '{
    "user": "user_123",
    "amount": "0.25",
    "metadata": {"tool": "web_search", "runId": "run_123"}
  }'

Tips

Charge after your tool succeeds to avoid refunds/adjustments.
If you must pre‑authorize, use a small charge first, then a second charge on completion.

Freemium → prepaid top‑ups → subscription + overage

Goal

Smooth path from zero‑friction trial to predictable recurring revenue, with fair overages.

Today (implemented with existing APIs)

Freemium: grant trial credits by calling POST /v1/user/balance/deposit from your backend when a new user onboards (label as trial in metadata).
Prepaid: in Default mode, use Stripe to sell credits; after a successful payment, call the deposit endpoint (POST /v1/user/balance/deposit). In Shared mode, hosted top‑ups handle funding automatically.
Overage: normal metering and/or manual charges apply once credits are used.

Upcoming

Native subscriptions (alongside usage) are on the roadmap; for now, trigger periodic deposits from your billing system.
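
One way to make those periodic deposits safe to retry is to derive the idempotency key from the user and the billing period. The sketch below only builds the request; the helper name, the key format, and the metadata fields are assumptions for illustration, while the deposit path and the Idempotency-Key header are the ones used elsewhere in this guide.

```typescript
interface DepositRequest {
  path: string;
  headers: Record<string, string>;
  body: { user: string; amount: string; metadata: Record<string, string> };
}

// Builds one deposit per user per billing period. The key format is an assumption;
// any value that stays stable for a given (user, period) pair would do.
function buildSubscriptionDeposit(
  user: string,
  amount: string,
  periodStart: Date,
): DepositRequest {
  const period = periodStart.toISOString().slice(0, 7); // "YYYY-MM" in UTC
  return {
    path: "/v1/user/balance/deposit",
    headers: { "Idempotency-Key": `sub_${user}_${period}` },
    body: { user, amount, metadata: { reason: "subscription", period } },
  };
}
```

Because the key is the same for every attempt within a month, a billing job that reruns after a crash would resend an identical request rather than issue a second deposit.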

Example trial deposit

curl -X POST https://api.paywalls.ai/v1/user/balance/deposit \
  -H "Authorization: Bearer $PAYWALLS_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: signup_user_123" \
  -d '{
    "user": "user_123",
    "amount": "2.00",
    "metadata": {"reason": "trial", "plan": "starter"}
  }'

Tips

Show remaining credits and estimated cost per action to improve conversion.
For subscriptions, align the deposit amount with the monthly entitlement, and let users top up once the balance hits zero.

Multi‑agent app with per‑model pricing

When to use

Multi‑tool/agent systems where some actions require premium models while others run fine on small models.

Pattern

Map each agent to a model id (and price). Expose the model tier in the UI so users understand the cost/performance trade‑off.
Apply guardrails: if balance is low, fall back to a cheaper model or pause premium agents.

Example selection logic (pseudocode)

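// Fall back to the cheaper model when the user's balance is low.
const budgetLow = user.balance < 1.0;
const model = budgetLow ? "openai/gpt-4o-mini" : "openai/gpt-4o";
// Pass the pseudonymous user id (assumed here to be user.id), not the whole user object.
const res = await client.chat.completions.create({ model, user: user.id, messages });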

Operational notes

Keep agent → model mapping centralized so analytics can attribute cost/revenue per agent.
Consider a small per‑request minimum fee for premium agents to stabilize margins.
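
A centralized agent → model mapping with a low‑balance guardrail could look something like the sketch below. The agent names, the premium flag, and the threshold default are hypothetical; the model ids are the ones used in this guide.

```typescript
interface AgentConfig {
  model: string;          // model id used when the balance allows it
  premium: boolean;       // premium agents are subject to the low-balance guardrail
  fallbackModel?: string; // cheaper model to use when the balance is low
}

// Single source of truth, so analytics can attribute cost/revenue per agent.
const AGENTS: Record<string, AgentConfig> = {
  researcher: { model: "openai/gpt-4o", premium: true, fallbackModel: "openai/gpt-4o-mini" },
  reviewer: { model: "openai/gpt-4o", premium: true }, // no fallback: pause when low
  summarizer: { model: "openai/gpt-4o-mini", premium: false },
};

// Returns the model to use, or null when a premium agent should be paused.
function selectModel(
  agent: string,
  balance: number,
  lowBalanceThreshold = 1.0,
): string | null {
  const config = AGENTS[agent];
  if (!config) throw new Error(`Unknown agent: ${agent}`);
  if (!config.premium || balance >= lowBalanceThreshold) return config.model;
  return config.fallbackModel ?? null;
}
```

Callers can treat a null result as the signal to pause the agent and show the user a top‑up prompt instead of sending the request.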

Guardrails & UX patterns

Idempotency: use Idempotency-Key for all manual charges/deposits.
Messaging: always render the assistant message when a request is blocked for auth/top‑up—no custom UI needed.
Analytics: label metadata with requestId, tool, and agent for clear reporting.
