Documentation Index

Fetch the complete documentation index at: https://docs.sutro.sh/llms.txt

Use this file to discover all available pages before exploring further.

Cost Estimates

A significant benefit of batch inference is lower cost and transparent pricing, since the inputs are known in advance. We aim to make pricing predictable enough that you know how much a batch job will cost before running it.

Understanding Pricing

We charge based on the number of input and output tokens that successfully complete inference. Our pricing page lists a blended average cost per million tokens for each model, combining input and output tokens weighted by typical usage patterns. Note that output tokens are generally more expensive than input tokens, and we charge a different rate for each.
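As a sketch of how per-token pricing works, the helper below computes a job cost from separate input and output rates. The rates and token counts are hypothetical placeholders, not Sutro's actual prices; see the pricing page for real per-model rates.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate job cost in dollars from token counts and per-million-token rates."""
    return (input_tokens / 1_000_000) * input_price_per_m \
         + (output_tokens / 1_000_000) * output_price_per_m

# Hypothetical example: 10M input tokens at $0.10/M plus 2M output tokens at $0.40/M.
cost = estimate_cost(10_000_000, 2_000_000, 0.10, 0.40)
print(f"${cost:.2f}")  # $1.80
```

Because output tokens carry the higher rate, jobs that generate long completions cost more than the blended average would suggest.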

Using the Dry Run Feature

To get a cost estimate for a batch job, set dry_run=True in the Python SDK, or, if you call the HTTP API directly, set cost_estimate=true in the request body. Instead of running the inference, the API returns an estimated cost for the job. Dry runs are free, so we recommend getting a cost estimate before launching large jobs.
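As an illustration, an HTTP request body for a dry run might be built like this. Every field except cost_estimate is a placeholder assumption, since the exact request schema isn't shown here; only the cost_estimate flag comes from the documentation above.

```python
import json

# Hypothetical request body: "model" and "inputs" are assumed field names,
# not the documented schema. "cost_estimate" is the documented dry-run flag.
body = {
    "model": "example-model",     # assumed field
    "inputs": ["Hello, world!"],  # assumed field
    "cost_estimate": True,        # documented: return an estimate instead of running inference
}
payload = json.dumps(body)
print(payload)
```

The SDK equivalent would pass dry_run=True to the job-submission call instead of setting this flag manually.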