Scan Settings 

The following parameters define how your model is scanned.

| Field Name | Description | Values |
| --- | --- | --- |
| Temperature | Controls the balance between predictability and creativity in the generated text. Lower values return more predictable, conservative responses; higher values return less predictable, more creative responses. | Range: 0 to 1. Default value: 0.7 |
| Top K | Controls the randomness and diversity of the generated text by limiting sampling to the K most likely tokens at each step. | Any integer value. Default value: 10 |
| Maximum Tokens | Maximum number of tokens the Target LLM generates in a single response. | Any integer value. Default value: 512 |
| Top P | Controls the randomness and diversity of the generated text using nucleus (top-p) sampling, which samples from the smallest set of tokens whose cumulative probability reaches the Top P value. | Range: 0 to 1. Default value: 0.95 |
| Repetition Penalty | Reduces the likelihood of repetitive text. A value of 1 applies no penalty; higher values increasingly discourage repeated tokens. | Float value between 1 and 2. Default value: 1.03 |
| Response Timeout | Timeout, in seconds, for a single inference request sent to the Target LLM. | Any integer value. Default value: 120 |
| Retry Attempts | Number of times a single inference request is attempted. | Default value: 5 |
| Parallelism | Number of requests processed concurrently during a model scan. Controls the scan rate and performance. | Supported range: 1–20. Default value: 5 |
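
The Temperature, Top K, and Top P settings all act on the token probabilities the Target LLM samples from. The sketch below is a minimal, illustrative implementation of that sampling logic only; the Target LLM applies its own sampling internally, and the function name and NumPy-based approach here are assumptions for illustration.

```python
# Illustrative only: how temperature, top-k, and top-p shape token selection.
# The Target LLM performs this internally; this is not the product's code.
import numpy as np


def sample_next_token(logits, temperature=0.7, top_k=10, top_p=0.95):
    """Pick one token id from raw logits using temperature, top-k, and top-p."""
    logits = np.asarray(logits, dtype=np.float64)

    # Temperature: divide logits; lower values sharpen the distribution,
    # making the most likely tokens even more likely.
    logits = logits / max(temperature, 1e-6)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # Top K: keep only the K most likely tokens.
    order = np.argsort(probs)[::-1][:top_k]
    kept = probs[order]

    # Top P (nucleus): keep the smallest prefix whose cumulative mass
    # reaches the top_p threshold, then renormalize.
    cutoff = np.searchsorted(np.cumsum(kept), top_p) + 1
    order, kept = order[:cutoff], kept[:cutoff]
    kept /= kept.sum()

    return int(np.random.choice(order, p=kept))


if __name__ == "__main__":
    # Example: sample from a toy 4-token vocabulary.
    print(sample_next_token([2.0, 1.0, 0.5, -1.0]))
```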
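
Maximum Tokens, Response Timeout, Retry Attempts, and Parallelism govern how requests are issued rather than how text is sampled. As a rough, hypothetical sketch of how a scanner might apply them when calling a Target LLM over HTTP (the endpoint URL, payload field names, and response shape are assumptions for illustration, not this product's actual API):

```python
# Hypothetical sketch of a scan client applying these settings.
import concurrent.futures
import requests

SCAN_SETTINGS = {
    "temperature": 0.7,          # 0 to 1; lower = more predictable output
    "top_k": 10,                 # sample only from the 10 most likely tokens
    "top_p": 0.95,               # nucleus sampling cutoff
    "max_tokens": 512,           # cap on tokens generated per response
    "repetition_penalty": 1.03,  # 1 = no penalty; up to 2
    "response_timeout": 120,     # seconds allowed per inference request
    "retry_attempts": 5,         # times a single request is attempted
    "parallelism": 5,            # concurrent requests (supported range 1-20)
}

TARGET_LLM_URL = "https://example.com/v1/generate"  # placeholder endpoint


def send_prompt(prompt: str) -> str:
    """Send one inference request, attempting it up to retry_attempts times."""
    last_error = None
    for _ in range(SCAN_SETTINGS["retry_attempts"]):
        try:
            resp = requests.post(
                TARGET_LLM_URL,
                json={
                    "prompt": prompt,
                    "temperature": SCAN_SETTINGS["temperature"],
                    "top_k": SCAN_SETTINGS["top_k"],
                    "top_p": SCAN_SETTINGS["top_p"],
                    "max_tokens": SCAN_SETTINGS["max_tokens"],
                    "repetition_penalty": SCAN_SETTINGS["repetition_penalty"],
                },
                timeout=SCAN_SETTINGS["response_timeout"],
            )
            resp.raise_for_status()
            return resp.json()["text"]  # assumed response field
        except requests.RequestException as err:
            last_error = err
    raise RuntimeError(f"Request failed after all attempts: {last_error}")


def scan(prompts: list[str]) -> list[str]:
    """Process prompts with up to `parallelism` requests in flight at once."""
    with concurrent.futures.ThreadPoolExecutor(
        max_workers=SCAN_SETTINGS["parallelism"]
    ) as pool:
        return list(pool.map(send_prompt, prompts))
```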