Experience ultra-fast inference with Groq's LPU-powered models. This integration provides access to popular open-source models optimized for speed, making it well suited to real-time applications that require rapid response times.
System Prompt
text
Initial prompt that sets the behavior and context for the model.
optional:true
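For orientation, here is a minimal sketch of how the system prompt maps onto a Groq chat completion, assuming the official `groq` Python SDK and a `GROQ_API_KEY` environment variable; the model id is only an example:

```python
# Minimal sketch: the system prompt becomes the first message of the conversation.
# Assumes `pip install groq` and GROQ_API_KEY set in the environment.
from groq import Groq

client = Groq()  # reads GROQ_API_KEY from the environment

response = client.chat.completions.create(
    model="llama-3.3-70b-versatile",  # example model id; use whatever your workspace exposes
    messages=[
        {"role": "system", "content": "You are a concise technical assistant."},
        {"role": "user", "content": "Summarize what an LPU is in one sentence."},
    ],
)
print(response.choices[0].message.content)
```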
Attachments
text
Accepts multiple
Additional context or documents to be included with the prompt.
optional:true
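How attachments are injected is platform-specific; a common pattern is to inline their text ahead of the user's question. A rough sketch, where the file names and the formatting are illustrative assumptions:

```python
# Illustrative sketch: inlining text attachments into the user message.
# The actual injection format used by the platform may differ.
from pathlib import Path

attachments = [Path("notes.txt"), Path("spec.md")]  # hypothetical files

context = "\n\n".join(
    f"--- {path.name} ---\n{path.read_text()}" for path in attachments
)
user_message = {
    "role": "user",
    "content": f"{context}\n\nQuestion: What are the key requirements?",
}
```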
Last Message Only
toggle
Use only the last message in the conversation for generation, ignoring all earlier messages.
default:false
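Conceptually, the toggle trims the history to its final entry before the request is sent, along these lines (plain Python, with a made-up history):

```python
# Sketch of the "Last Message Only" toggle: full history vs. just the tail.
history = [
    {"role": "user", "content": "What is nucleus sampling?"},
    {"role": "assistant", "content": "It samples from the smallest likely token set."},
    {"role": "user", "content": "And top-k?"},
]

last_message_only = True  # mirrors the toggle; the default is false
messages = history[-1:] if last_message_only else history
```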
Temperature
number
Controls randomness in the output. Lower values (near 0) make the output more focused and deterministic, while higher values make it more creative and random.
optional:true•minimum:0•maximum:2
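A quick sketch of the effect, again assuming the `groq` SDK and an example model id: a low temperature favors a stable answer, a high one favors variety.

```python
from groq import Groq

client = Groq()  # assumes GROQ_API_KEY in the environment
prompt = [{"role": "user", "content": "Name a color."}]

for temperature in (0.0, 1.5):  # near-deterministic vs. noticeably random
    response = client.chat.completions.create(
        model="llama-3.1-8b-instant",  # example model id
        messages=prompt,
        temperature=temperature,
    )
    print(temperature, response.choices[0].message.content)
```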
Seed
number
Random seed for sampling. Reusing the same seed with otherwise identical parameters makes outputs more reproducible (best effort).
default:4508
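A reproducibility sketch, assuming the `groq` SDK; note that seeded determinism is best effort, so the two outputs should usually, but not always, match:

```python
from groq import Groq

client = Groq()
kwargs = dict(
    model="llama-3.1-8b-instant",  # example model id
    messages=[{"role": "user", "content": "Suggest a project codename."}],
    seed=4508,        # fixed seed: repeated calls should match (best effort)
    temperature=1.0,
)

first = client.chat.completions.create(**kwargs)
second = client.chat.completions.create(**kwargs)
print(first.choices[0].message.content == second.choices[0].message.content)
```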
Top P
number
Nucleus sampling parameter between 0 and 1. Lower values (e.g. 0.1) limit responses to only the most likely tokens. Use either this or temperature, not both.
optional:true
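A sketch of nucleus sampling in isolation, with temperature deliberately left unset as the description advises (`groq` SDK, example model id):

```python
from groq import Groq

client = Groq()
response = client.chat.completions.create(
    model="llama-3.1-8b-instant",  # example model id
    messages=[{"role": "user", "content": "Write a one-line tagline for a bakery."}],
    top_p=0.1,  # sample only from the most likely tokens; temperature not set
)
print(response.choices[0].message.content)
```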
Top K
number
Limits sampling to the K most likely tokens at each step, removing low-probability options.
optional:true
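Top K is applied during sampling rather than being a field in Groq's OpenAI-compatible request body, so this is a self-contained sketch of what top-k filtering does to a toy next-token distribution, not an API call:

```python
# Conceptual sketch of top-k filtering on a toy next-token distribution.
import math
import random

logits = {"the": 2.1, "a": 1.7, "an": 0.9, "this": -0.5, "zebra": -3.0}
k = 2

# Keep only the k highest-scoring tokens, then renormalize with a softmax.
top_k = dict(sorted(logits.items(), key=lambda kv: kv[1], reverse=True)[:k])
total = sum(math.exp(v) for v in top_k.values())
probs = {tok: math.exp(v) / total for tok, v in top_k.items()}

token = random.choices(list(probs), weights=list(probs.values()))[0]
print(probs, "->", token)
```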
Max Tokens
number
Maximum number of tokens to generate.
optional:true
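Capping output length, sketched with the `groq` SDK; when the cap is hit, the response is truncated and the finish reason reflects it:

```python
from groq import Groq

client = Groq()
response = client.chat.completions.create(
    model="llama-3.1-8b-instant",  # example model id
    messages=[{"role": "user", "content": "Explain LPUs."}],
    max_tokens=64,  # hard cap on generated tokens; the answer may be cut off
)
print(response.choices[0].finish_reason)  # "length" when the cap is hit
```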
Stop Sequences
text
Accepts multiple
Text sequences that cause the model to stop generating further output when encountered.
optional:true
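A stop-sequence sketch with the `groq` SDK: stopping at the first newline trims a multi-line answer to a single item.

```python
from groq import Groq

client = Groq()
response = client.chat.completions.create(
    model="llama-3.1-8b-instant",  # example model id
    messages=[{"role": "user", "content": "List three fruits, one per line."}],
    stop=["\n"],  # generation halts at the first newline
)
print(response.choices[0].message.content)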
Presence Penalty
number
Penalizes tokens that have already appeared in the text so far, encouraging the model to introduce new topics rather than repeat itself. Higher values reduce repetition.
optional:true•minimum:-2•maximum:2
Frequency Penalty
number
Penalizes tokens in proportion to how often they have already appeared, discouraging verbatim repetition. Higher values encourage more diverse word choice.
optional:true•minimum:-2•maximum:2
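Both penalties can be combined in one request, sketched with the `groq` SDK; the specific values here are illustrative starting points, not recommendations:

```python
from groq import Groq

client = Groq()
response = client.chat.completions.create(
    model="llama-3.1-8b-instant",  # example model id
    messages=[{"role": "user", "content": "Brainstorm names for a chess club."}],
    presence_penalty=0.6,   # nudge the model toward new topics
    frequency_penalty=0.6,  # discourage reusing the same words
)
print(response.choices[0].message.content)
```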