A system prompt to use for inference. Use this parameter to provide consistent, task-specific instructions to
the model. See System Prompts for more information.
If True, any rows whose token count exceeds the context window of the selected model will be
truncated to the maximum length that fits within the context window.
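As a sketch, a request payload with this truncation behavior enabled might look like the following. Note that the field name `truncate` is an assumption: the description above documents the behavior but does not name the parameter, so check the API reference for the exact key.

```python
# Sketch of a batch-inference payload with row truncation enabled.
# NOTE: the key 'truncate' is hypothetical -- the docs describe the behavior
# (rows exceeding the model's context window are cut to the max length that
# fits) but do not state the parameter name.
payload = {
    'inputs': ['What is the capital of France?'],
    'model': 'llama-3.1-8b',
    'truncate': True,  # hypothetical key: cut over-length rows to fit the context window
}
```

The payload would then be sent as the `json` body of the POST request shown below.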
import requests

response = requests.post(
    'https://api.sutro.sh/batch-inference',
    headers={
        'Authorization': 'Key YOUR_SUTRO_API_KEY',
        'Content-Type': 'application/json',
    },
    json={
        'inputs': [
            'What is the capital of France?',
            'Explain quantum computing in simple terms',
            'Write a haiku about programming',
        ],
        'model': 'llama-3.1-8b',
        'system_prompt': 'You are a helpful assistant.',
        'job_priority': 0,
    },
)
result = response.json()
print(f"Job created: {result['job_id']}")