What is Sutro?
Sutro helps you build reliable AI decision systems for repeated tasks. Instead of hand-tuning prompts and hoping they hold up in production, Sutro gives you a structured workflow to align AI behavior with your team’s actual preferences and then deploy the result at scale. Sutro has two core components:
- Sutro Functions: Build task-specific judges, along with classifiers and extractors, that reflect your team’s decision preferences. An evolving record book of impactful annotations helps you build an optimized, deployable function that you can invoke by name.
- Batch Inference: Run large-scale offline inference across thousands to millions of inputs. Execute your Sutro Functions at any scale, or run OSS LLMs directly for analytical and generation workloads.
Sutro Functions
With Sutro Functions, you can expect:
- Speed: Maximize prompt quality per unit of your time; you spend minutes labeling, not hours testing and rewriting.
- Stability: Create a consistent foundation of expertise to measure and optimize against.
- Maintainability: Swap models, add new data, and re-optimize without regressing on past failures.
- Adaptability: Compress tasks into the right model for the job.
Key use cases
- Evals for agents and single-call LLMs
- User intent analysis
- Data filtering and transformation (multimodal and text-based)
- Data tagging or labeling
- Classical ML decisioning (lead generation, fraud detection, compliance, KYC, etc.)
Sutro Functions
Learn how Functions work and what you can build with them.
Batch Inference
Sutro’s batch inference platform is the production runtime for Sutro Functions, built to process millions of rows at once. It also handles standalone large-scale offline workloads, such as synthetic data generation, embeddings, and LLM-as-a-judge evaluations. With batch inference, you can expect:
- Speed: Large-scale jobs finish in an hour or less, not a day from now.
- Scale: From a handful of inputs to billions of tokens per job.
- Cost: Less than 25% of the cost of real-time inference providers.
- Security: Custom data retention policies and optional bring-your-own-storage.
Batch Inference
Run your first batch job.