Overview

The Model Parameters section in the App Editor allows you to fine-tune how prompts are processed by the selected model. Each parameter affects the behavior and response of the model, giving you control over the output.

This article explains each available parameter and its impact.


Available Model Parameters

1. Max Tokens

  • Definition: Determines the maximum number of tokens (word fragments, whole words, punctuation marks, and symbols) the model can generate in the response.

  • Range: Varies by model, typically from 1 to 4096+ tokens.

  • Use Case:

    • Increase to allow for longer responses.

    • Decrease to keep responses brief and focused.

Tip: Be mindful of the token limit to avoid truncating responses or exceeding model constraints.
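To see why a low limit truncates output, here is a minimal toy generation loop (not the App Editor's actual implementation) showing how a model stops either naturally or because it hit the cap:

```python
def generate(next_token_fn, max_tokens):
    """Toy generation loop: stop at a natural end or at the max_tokens cap.

    next_token_fn stands in for a real model; it returns the next token,
    or None when the model is finished.
    """
    tokens = []
    for _ in range(max_tokens):
        tok = next_token_fn(tokens)
        if tok is None:
            return tokens, "stop"    # model finished naturally
        tokens.append(tok)
    return tokens, "length"          # hit the cap: output is truncated

# A stand-in "model" that wants to emit six tokens.
sentence = ["The", "quick", "brown", "fox", "jumps", "high"]

def fake_model(tokens):
    return sentence[len(tokens)] if len(tokens) < len(sentence) else None

out, reason = generate(fake_model, max_tokens=4)
# out == ["The", "quick", "brown", "fox"], reason == "length"
```

A `"length"` stop reason is the signal that the response was cut off and Max Tokens should be raised.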


2. Temperature

  • Definition: Controls the randomness or creativity of the model’s responses.

  • Range: 0 to 2

    • Lower values (0-0.3): More deterministic, ideal for factual or structured responses.

    • Higher values (0.7-1.5): Adds creativity and variety but may increase randomness.

  • Use Case:

    • Use low values for tasks requiring precision.

    • Use higher values for brainstorming or creative writing.

Tip: A value around 0.7 strikes a balance between creativity and consistency.
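Under the hood, temperature is commonly applied by dividing the model's logits before the softmax. This sketch (illustrative only; the selected model's internals may differ) shows how a low temperature sharpens the distribution toward the top token:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature, then softmax.

    Lower temperature sharpens the distribution (more deterministic);
    higher temperature flattens it (more random).
    """
    scaled = [l / temperature for l in logits]
    m = max(scaled)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate tokens.
logits = [2.0, 1.0, 0.5]

cold = softmax_with_temperature(logits, 0.2)   # near-deterministic
hot = softmax_with_temperature(logits, 1.5)    # flatter, more varied
```

At temperature 0.2 nearly all probability mass lands on the top token; at 1.5 the alternatives stay plausible, which is what produces more varied output.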


3. Presence Penalty

  • Definition: Encourages the model to introduce new topics by penalizing any token that has already appeared in the text, regardless of how often it appeared.

  • Range: -2 to 2

    • Higher values: Push the model to include less frequent or novel content.

    • Lower (or negative) values: Allow the model to stay on, or even return to, topics already mentioned.

  • Use Case:

    • Ideal for generating diverse content or exploring new ideas.
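A common way presence penalty is implemented (shown here as an illustrative sketch, not necessarily the selected model's exact behavior) is a flat one-time deduction from the logit of every token that has appeared at least once:

```python
def apply_presence_penalty(logits, generated_tokens, presence_penalty):
    """Subtract a flat penalty from every token that has already appeared,
    regardless of how many times it appeared."""
    seen = set(generated_tokens)
    return {tok: (logit - presence_penalty if tok in seen else logit)
            for tok, logit in logits.items()}

# Hypothetical logits for three candidate tokens.
logits = {"cat": 2.0, "dog": 1.8, "fish": 1.5}
adjusted = apply_presence_penalty(logits, ["cat", "cat", "dog"], presence_penalty=1.0)
# "cat" and "dog" are each penalized once; "fish" is untouched.
```

Note that "cat" is penalized the same amount as "dog" even though it appeared twice; scaling with the repeat count is the job of Frequency Penalty.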


4. Frequency Penalty

  • Definition: Reduces the likelihood of the model repeating the same tokens or phrases; the penalty grows each time a token is reused (unlike Presence Penalty, which applies a flat, one-time penalty).

  • Range: -2 to 2

    • Higher values: Decrease repetition, making the response more varied.

    • Lower values: Allow for more repetition if needed.

  • Use Case:

    • Best used when generating lists or content with minimal duplication.
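In contrast to presence penalty, frequency penalty is typically proportional to how often each token has already appeared. A minimal sketch of that common formulation (illustrative, not the App Editor's exact implementation):

```python
from collections import Counter

def apply_frequency_penalty(logits, generated_tokens, frequency_penalty):
    """Subtract a penalty proportional to how many times each token has
    already appeared, so repeats become progressively less likely."""
    counts = Counter(generated_tokens)
    return {tok: logit - frequency_penalty * counts[tok]
            for tok, logit in logits.items()}

# Hypothetical logits for two candidate tokens.
logits = {"cat": 2.0, "dog": 1.8}
adjusted = apply_frequency_penalty(logits, ["cat", "cat", "dog"], frequency_penalty=0.5)
# "cat" (seen twice) loses 1.0; "dog" (seen once) loses 0.5.
```

Each repetition compounds the penalty, which is why this parameter works well for lists and other content where duplicates should fade out.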


5. Top P (Nucleus Sampling)

  • Definition: Controls the diversity of model outputs by sampling only from the smallest set of tokens whose combined probability reaches P.

  • Range: 0 to 1

    • Lower values (e.g., 0.1): Narrow selection, more focused responses.

    • Higher values (e.g., 0.9): Broader selection, more diverse responses.

  • Use Case:

    • Lower values for factual or predictable tasks.

    • Higher values for creative or exploratory tasks.

Tip: Adjust either Temperature or Top P when fine-tuning creativity, not both at once.
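The "smallest set whose cumulative probability reaches P" idea is easy to see in code. This sketch of nucleus filtering (illustrative only) shows how a low Top P shrinks the candidate pool:

```python
def nucleus_filter(probs, top_p):
    """Keep the smallest set of tokens whose cumulative probability
    reaches top_p, then renormalize; everything else is excluded."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = {}, 0.0
    for tok, p in ranked:
        kept[tok] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())            # renormalize the survivors
    return {tok: p / total for tok, p in kept.items()}

# Hypothetical token probabilities.
probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
narrow = nucleus_filter(probs, 0.5)   # only "a" survives
broad = nucleus_filter(probs, 0.9)    # "a", "b", and "c" survive
```

With Top P = 0.5 the model can only ever pick "a"; with 0.9 three candidates stay in play, giving more diverse output.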


6. Top K

  • Definition: Limits the number of candidate tokens the model considers when generating each next token.

  • Range: 0 to 2048

    • Lower values (e.g., 5-50): Highly focused responses by considering only the most likely tokens.

    • Higher values (e.g., 1000+): Allows more randomness and variety.

  • Use Case:

    • Use lower values for highly structured tasks.

    • Use higher values for exploratory or creative responses.

Tip: Top K is often used in combination with Top P for greater control over response diversity.
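Where Top P cuts by cumulative probability, Top K simply keeps a fixed number of the most probable tokens. A minimal illustrative sketch:

```python
def top_k_filter(probs, k):
    """Keep only the k most probable tokens and renormalize."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in ranked)
    return {tok: p / total for tok, p in ranked}

# Hypothetical token probabilities.
probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
focused = top_k_filter(probs, 2)   # only "a" and "b" remain
```

When both are set, a common convention is to apply the Top K cutoff first and then the Top P cutoff within the survivors, which is why the two combine well for controlling diversity.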


FAQs

Q: What happens if Max Tokens is set too low?
A: The response may get cut off before completion. Increase the limit if you need longer outputs.

Q: Should I adjust Temperature and Top P together?
A: No, it's best to modify one at a time to observe changes in model behavior.

Q: When should I adjust Top K?
A: Use Top K when you need to limit the model’s choices to the most likely tokens for greater control over response variability.