It enables developers to monetize LLM usage without modifying client-side integration logic. By sitting between your app and the inference provider, the Paywalls.ai proxy:
- Validates the user’s authorization and balance
- Calculates the cost of each request in real time
- Applies your pricing rules
- Charges the user before forwarding the request
- Relays the model response without added latency
Your app keeps calling `/v1/chat/completions` or other standard endpoints.
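For example, an app that already uses the OpenAI Python SDK can point at the proxy by changing only the base URL and API key. This is a minimal sketch; the model name and user ID are placeholders.

```python
from openai import OpenAI

# Point an existing OpenAI client at the Paywalls.ai proxy.
# Replace PAYWALLS_API_KEY with the key generated for your paywall.
client = OpenAI(
    base_url="https://api.paywalls.ai/v1",
    api_key="PAYWALLS_API_KEY",
)

# A standard chat completion request; "user" identifies the end user
# whose balance will be charged.
response = client.chat.completions.create(
    model="openai/gpt-4o-mini",  # placeholder: any model your provider supports
    messages=[{"role": "user", "content": "Hello!"}],
    user="end-user-123",
)
print(response.choices[0].message.content)
```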
Developer Setup
When you register and generate an API key, you configure:
Provider & API Key
- Choose one LLM provider (e.g. OpenAI, OpenRouter) and connect your credentials
- Or use Paywalls.ai’s built-in provider access — no credentials required
Model Pricing Table
- Use default pricing (e.g. from OpenRouter)
- Override prices per model
- Restrict availability of specific models
- Add pricing for custom models
Each API key is tied to a single provider. However, every request can target any model that provider supports, with dynamic pricing per model.
Revenue & Payment Distribution
For each paid request, user spend is split into three parts:

| Component | Description |
|---|---|
| Developer revenue | A fixed percentage (e.g. 5%) of user spend, configurable per API key |
| Paywall fee | Always 1% of user spend (platform fee) |
| LLM cost | Remaining balance (e.g. 94%) to cover inference cost |
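As a worked illustration, here is how $1.00 of user spend would be divided at the default 5% developer share (the 5% figure is configurable per API key):

```python
user_spend = 1.00          # amount charged to the end user, in USD

developer_share = 0.05     # configurable per API key (5% used here)
paywall_fee_rate = 0.01    # fixed 1% platform fee

developer_revenue = user_spend * developer_share                 # $0.05
paywall_fee = user_spend * paywall_fee_rate                      # $0.01
llm_cost_budget = user_spend - developer_revenue - paywall_fee   # $0.94

print(developer_revenue, paywall_fee, llm_cost_budget)
```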
Access Modes
BYOK (Bring Your Own Key)
- Requests use your own API key
- You are billed directly by the provider
- You set your own model pricing
- Paywalls.ai charges the user and splits revenue:
| Role | Portion of User Spend |
|---|---|
| Developer revenue | 5% (or configured %) + LLM cost portion (e.g. 94%) |
| Paywall fee | 1% |
| LLM cost | Paid directly by developer |
✅ Access to all models your provider supports
✅ Higher profit margin if you have good provider rates
⚠ More effort — you must handle provider payments
Built-in Provider Access
- Requests are fulfilled with Paywalls.ai’s credentials
- No setup needed — instant access
- You set model pricing just like with BYOK
- Paywalls.ai charges the user and splits revenue:
| Role | Portion of User Spend |
|---|---|
| Developer revenue | 5% (or configured %) |
| Paywall fee | 1% |
| LLM provider | Paid by Paywalls.ai from remaining spend (e.g. 94%) |
✅ No need to manage provider payments
⚠ Limited to models offered by Paywalls.ai
Summary Table
| Feature | BYOK | Built-in Provider |
|---|---|---|
| Who pays for LLM | Developer | Paywall |
| Developer revenue | 5% (or configured) + LLM cost | 5% (or configured) |
| Paywall fee | 1% | 1% |
| Developer control | Full | Limited to exposed models |
| Model availability | Any supported by provider | Only Paywall’s models |
Core Components
| Component | Description |
|---|---|
| Proxy Endpoint (`/chat/completions`) | Handles LLM chat requests. Enforces paywall rules and proxies to OpenAI-compatible model providers. |
| Model Registry (`/models`) | Lists supported models and their pricing (per-token and per-request). |
| User Endpoints (`/user/*`) | Tools to manage paywall authorization, top-up, balance lookup, and manual charges. |
| Pricing Engine | Calculates the usage cost based on your configured model rates and request parameters (tokens, request flat fees, etc.). |
| Metering System | Measures prompt and completion token usage (if token-based billing is enabled). |
| Authorization Logic | Ensures only authorized users can be charged. Returns an authorization link if the user is not authorized. |
| Top-Up System | Returns a top-up link when balance is low, blocking access until funds are added. |
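Since the proxy is OpenAI-compatible, the model registry can likely be read with a standard model-list call. The sketch below assumes the `/models` endpoint follows the OpenAI list format; any pricing metadata it returns is not shown here.

```python
from openai import OpenAI

client = OpenAI(base_url="https://api.paywalls.ai/v1", api_key="PAYWALLS_API_KEY")

# Assumption: /models is exposed in the standard OpenAI list format.
for model in client.models.list():
    print(model.id)
```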
Request Lifecycle
- Request sent to the proxy → Your app calls `https://api.paywalls.ai/v1/chat/completions`
- Authentication → The `Authorization: Bearer ...` API key identifies your paywall
- User identification → Pass `user` in the body or the `X-Paywall-User` header (as sketched below)
- Authorization & balance check
  - Not authorized → return `authorize` link
  - Low balance → return `topup` link
  - Authorized + funded → proceed
- Cost computation → Request fee + token costs
- Charge execution → Deduct balance, record charge
- Forward request → To chosen LLM provider
- Return response → Stream or send full response, log usage
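A minimal sketch of the first three steps, identifying the end user with the `X-Paywall-User` header rather than the `user` body field. The API key, user ID, and model name are placeholders.

```python
import requests

# Steps 1-2: call the proxy, authenticating with your paywall API key.
resp = requests.post(
    "https://api.paywalls.ai/v1/chat/completions",
    headers={
        "Authorization": "Bearer PAYWALLS_API_KEY",  # identifies your paywall
        "X-Paywall-User": "end-user-123",            # step 3: identifies the end user
    },
    json={
        "model": "openai/gpt-4o-mini",  # placeholder model name
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)

# Steps 4-8 happen server-side: an unauthorized or underfunded user gets an
# authorize / top-up link back instead of a model completion.
print(resp.json())
```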
Pricing Engine
The pricing engine ensures every request is profitable by:
- Looking up applicable model pricing
- Measuring token usage (prompt + completion)
- Applying per-request fees and/or per-token rates
- Adding developer-defined margins
- Recording charges in the billing ledger
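A simplified sketch of that computation, assuming a flat per-request fee plus per-token rates and a developer-defined margin. The function, rates, and margin handling are illustrative, not the engine's actual configuration schema.

```python
def request_cost(prompt_tokens: int, completion_tokens: int,
                 request_fee: float, prompt_rate: float,
                 completion_rate: float, margin: float) -> float:
    """Illustrative cost model: flat fee + metered token usage, plus margin."""
    token_cost = prompt_tokens * prompt_rate + completion_tokens * completion_rate
    return (request_fee + token_cost) * (1 + margin)

# Example: 500 prompt + 200 completion tokens, $0.001 flat fee,
# $2 per 1M input tokens, $6 per 1M output tokens, 20% developer margin.
cost = request_cost(500, 200, 0.001, 2e-6, 6e-6, 0.20)
print(f"${cost:.6f}")  # $0.003840
```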
Billing Options
- Per request — fixed price per API call
- Per token — based on prompt + completion token usage
- Manual charges — via `/user/charge` (see the sketch after this list)
- Subscriptions — handled externally with proxy access checks
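For manual charges, a hedged sketch of calling `/user/charge` is shown below. The endpoint name comes from the component table above, but the `/v1` prefix, field names, and response shape are assumptions, so consult the API reference for the actual schema.

```python
import requests

# Assumption: /user/charge sits under the same /v1 base and accepts a user ID
# and an amount; both field names here are hypothetical.
resp = requests.post(
    "https://api.paywalls.ai/v1/user/charge",
    headers={"Authorization": "Bearer PAYWALLS_API_KEY"},
    json={"user": "end-user-123", "amount": 0.50},  # hypothetical fields
)
print(resp.status_code, resp.json())
```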
Why Use Paywalls.ai?
- Drop-in compatible with OpenAI API
- Any model, any provider — with dynamic pricing
- Multiple monetization models — pay-per-message, microtransactions, token quotas
- Integrates anywhere — code and no-code platforms
- No billing infrastructure needed — Paywalls.ai handles metering, charging, and balances