Cost Estimates
A significant benefit of batch inference is lower cost and predictable pricing, since all inputs are known in advance. We aim to make pricing transparent so you know how much a batch job will cost before running it.

Understanding Pricing
We charge based on the number of input and output tokens that successfully complete inference. Our pricing page lists an average cost per million tokens for each model; this blends input and output tokens, weighted according to typical usage patterns. Note, however, that output tokens are generally more expensive than input tokens, and each is charged at its own rate, as the sketch below illustrates.
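To make the blended rate concrete, here is a minimal sketch of how a blended per-million-token price relates to separate input and output rates. The rates and the 80/20 input/output split below are hypothetical illustrations, not our actual pricing; see the pricing page for real numbers.

```python
# Hypothetical rates, for illustration only -- see the pricing page for real numbers.
INPUT_RATE_PER_MTOK = 1.50   # USD per million input tokens (hypothetical)
OUTPUT_RATE_PER_MTOK = 6.00  # USD per million output tokens (hypothetical)

def blended_rate(input_share: float = 0.8) -> float:
    """Blended cost per million tokens, weighted by an assumed input/output split."""
    output_share = 1.0 - input_share
    return input_share * INPUT_RATE_PER_MTOK + output_share * OUTPUT_RATE_PER_MTOK

def batch_cost(input_tokens: int, output_tokens: int) -> float:
    """Exact cost of a batch, charging input and output tokens at their own rates."""
    return (input_tokens / 1_000_000) * INPUT_RATE_PER_MTOK \
         + (output_tokens / 1_000_000) * OUTPUT_RATE_PER_MTOK

# Example: a batch with 4M input tokens and 1M output tokens.
print(f"Blended rate: ${blended_rate():.2f}/MTok")              # 0.8*1.50 + 0.2*6.00 = $2.40
print(f"Batch cost: ${batch_cost(4_000_000, 1_000_000):.2f}")   # 4*1.50 + 1*6.00 = $12.00
```

Because the blended figure assumes a typical usage mix, a job that is unusually output-heavy will cost more than the blended rate suggests, which is where the dry run feature helps.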
Using the Dry Run Feature

To get a cost estimate for a batch job, set the `dry_run` parameter to `True` in the SDK. Instead of running inference, the API will return an estimated cost for the job. Dry runs are free, so we recommend setting this parameter to `True` before running the job so you understand the cost beforehand.
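The following is a minimal sketch of this workflow. Only the `dry_run` parameter comes from this section; the client class, method names, request shape, and the `estimated_cost_usd` field are illustrative assumptions, so check the SDK reference for the actual interface.

```python
# Illustrative sketch -- the SDK import, client, and response fields below
# are assumptions; only the dry_run parameter is documented above.
from my_inference_sdk import Client  # hypothetical SDK

client = Client(api_key="YOUR_API_KEY")

requests = [
    {"prompt": "Summarize this document...", "max_tokens": 256},
]

# With dry_run=True, no inference runs; the API returns a cost estimate instead.
estimate = client.batches.create(
    model="example-model",
    requests=requests,
    dry_run=True,  # free: returns an estimated cost for the job
)
print(f"Estimated cost: ${estimate.estimated_cost_usd:.2f}")  # hypothetical field

# Once the estimate looks acceptable, submit the same job for real.
batch = client.batches.create(
    model="example-model",
    requests=requests,
    dry_run=False,
)
```

A common pattern is to gate submission on the estimate, for example aborting in CI if the dry-run cost exceeds a budget threshold, since the dry run itself costs nothing.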