Select the language model to use
Augment responses with real-time web data
Estimated GPU Time: 25.1 seconds
Model Size: 1.7B parameters
Web Search: Disabled
Maximum length of generated response
Higher = more creative, Lower = more focused
Number of top tokens to consider
Nucleus sampling threshold
Penalize repeated tokens
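
The five generation settings above can be pictured with a small, library-free sketch. It is an illustration only, not the app's actual implementation: the function names (`filter_probs`, `sample_token`, `generate`), the `next_logits` callback, and the parameter names are all hypothetical, and the order of operations (repetition penalty, then temperature, then top-k, then top-p) is one common convention assumed here.

```python
import math
import random


def filter_probs(logits, generated=(), temperature=1.0, top_k=0,
                 top_p=1.0, repetition_penalty=1.0):
    """Return {token_id: probability} after applying the sampling controls.

    Assumes temperature > 0; top_k=0 and top_p=1.0 disable those filters.
    """
    logits = list(logits)

    # Repetition penalty (assumed convention): push down the logit of every
    # token that has already been generated.
    for t in set(generated):
        if logits[t] > 0:
            logits[t] /= repetition_penalty
        else:
            logits[t] *= repetition_penalty

    # Temperature: values below 1 sharpen the distribution (more focused),
    # values above 1 flatten it (more varied).
    scaled = [x / temperature for x in logits]

    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-k: keep only the k most probable tokens.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k > 0:
        ranked = ranked[:top_k]
    norm_k = sum(probs[i] for i in ranked)

    # Top-p (nucleus): keep the smallest prefix of the ranked tokens whose
    # cumulative (renormalized) probability reaches the threshold.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i] / norm_k
        if cumulative >= top_p:
            break

    norm = sum(probs[i] for i in kept)
    return {i: probs[i] / norm for i in kept}


def sample_token(logits, rng=random, **settings):
    """Draw one token id from the filtered distribution."""
    dist = filter_probs(logits, **settings)
    ids = list(dist)
    return rng.choices(ids, weights=[dist[i] for i in ids], k=1)[0]


def generate(next_logits, max_new_tokens=16, **settings):
    """Sample up to max_new_tokens tokens.

    next_logits is a hypothetical callback standing in for the model: it
    takes the tokens generated so far and returns a list of logits.
    """
    tokens = []
    for _ in range(max_new_tokens):  # the maximum-length setting caps this loop
        logits = next_logits(tokens)
        tokens.append(sample_token(logits, generated=tokens, **settings))
    return tokens
```

For example, `filter_probs([2.0, 1.0, 0.5, -1.0], top_k=2)` keeps only token ids 0 and 1, and lowering `temperature` from 1.0 to 0.5 on the same logits increases the probability assigned to token 0.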