KoboldAI Settings
Standard KoboldAI settings files are used here. To add your own settings, place your .settings file in TavernAI\public\KoboldAI Settings
Temperature
Value from 0.1 to 2.0. Lower values make the answers more logical but less creative. Higher values make the answers more creative but less logical.
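To illustrate why lower values give more predictable answers, here is a minimal sketch of how temperature is commonly applied: the logits are divided by the temperature before the softmax. The function name and the example logits are hypothetical and are not taken from KoboldAI or TavernAI.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities after dividing them by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
cold = softmax_with_temperature(logits, 0.5)  # sharper: the top token dominates
hot = softmax_with_temperature(logits, 2.0)   # flatter: choices are more even
```

With the lower temperature the most likely token receives a larger share of the probability, so sampling picks it more often.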
Repetition penalty
Repetition penalty penalizes tokens that have already appeared in the text. If the character fixates on something or keeps repeating the same phrase, increasing this parameter can help. Increasing it too much is not recommended for the chat format, as it may break that format. The standard value for chat is approximately 1.0 - 1.05.
Repetition penalty range
The number of most recent tokens to which Repetition penalty applies.
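As a rough sketch of how the penalty and its range interact, the code below uses a widely used scheme: positive logits of already seen tokens are divided by the penalty and negative ones are multiplied by it. Whether KoboldAI does exactly this is an assumption, as is treating a range of 0 as "whole context". The function name is hypothetical.

```python
def apply_repetition_penalty(logits, context_ids, penalty=1.05, penalty_range=0):
    """Return a copy of logits with recently seen token ids penalized.

    Assumption: penalty_range == 0 means the whole context is considered.
    """
    recent = context_ids[-penalty_range:] if penalty_range > 0 else context_ids
    out = list(logits)
    for token_id in set(recent):
        if out[token_id] > 0:
            out[token_id] /= penalty  # make a likely token less likely
        else:
            out[token_id] *= penalty  # push an unlikely token further down
    return out

logits = [2.0, -1.0, 0.5]   # made-up scores for token ids 0, 1 and 2
context = [0, 1, 1, 2]      # token ids generated so far
penalized = apply_repetition_penalty(logits, context, penalty=2.0, penalty_range=2)
# Only the last two context tokens (ids 1 and 2) are penalized: [2.0, -2.0, 0.25]
```

A penalty of 1.0 leaves every logit unchanged, which is why values close to 1.0 are gentle enough for chat.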
Repetition penalty slope
Top P Sampling
1 is disabled
Top P (also known as nucleus sampling) is a widely used text generation method. It converts logits into probabilities using the softmax function, then keeps the smallest set of highest-probability tokens whose combined probability reaches the top-p value. At least one token is always kept. A high top-p value is recommended for better creativity, as lower values limit the number of tokens kept. Setting the top-p value to 0 is equivalent to greedy search.
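The following is a minimal sketch of standard top-p (nucleus) filtering, assuming the usual definition in which the most likely tokens are kept until their combined probability reaches top-p. The function names are hypothetical and this is not KoboldAI's actual code.

```python
import math

def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(logits, top_p):
    """Set to -inf every logit outside the smallest set reaching top_p."""
    probs = softmax(logits)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = set(), 0.0
    for i in order:
        kept.add(i)  # the most likely token is always kept
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return [x if i in kept else -math.inf for i, x in enumerate(logits)]

# Made-up distribution with probabilities of about 0.5, 0.3 and 0.2.
logits = [math.log(0.5), math.log(0.3), math.log(0.2)]
filtered = top_p_filter(logits, 0.7)  # keeps the first two tokens
greedy = top_p_filter(logits, 0.0)    # keeps only the most likely token
```

With top-p at 0 the loop stops after the first token, which is the greedy behaviour described above.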
Top K Sampling
0 is disabled
Top K leaves the largest k logits unchanged while setting all the others to negative infinity. However, it has been found to be less effective than other sampling techniques and is often used as a permissive filter before applying more advanced methods. It is recommended to apply top-k sampling first in the sampler order to avoid nullifying the effects of more intelligent samplers.
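The rule above can be sketched in a few lines. The function name is hypothetical, and treating 0 as "disabled" follows the note in this section.

```python
import math

def top_k_filter(logits, top_k):
    """Keep the top_k largest logits and set all others to -inf."""
    if top_k <= 0 or top_k >= len(logits):
        return list(logits)  # 0 disables the filter
    order = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    kept = set(order[:top_k])
    return [x if i in kept else -math.inf for i, x in enumerate(logits)]

logits = [1.0, 3.0, 2.0, 0.5]  # made-up scores for four candidate tokens
filtered = top_k_filter(logits, 2)
# Only the two largest logits survive: [-inf, 3.0, 2.0, -inf]
```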
Top A Sampling
0 is disabled
Top-a sampling is a relatively new sampling method designed for use with BlinkDL's RWKV language models. It converts logits into probabilities using the softmax function and sets to negative infinity the logits of tokens whose probability falls below a threshold controlled by the top-a value. In KoboldAI's implementation that threshold is the top-a value multiplied by the square of the highest token probability, so the cutoff rises when the model is confident. The highest-probability token is always kept. Top-a sampling reduces randomness when the model is confident about the next token, but has little effect on creativity.
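A small sketch may make the effect of the model's confidence clearer. It assumes the cutoff is the top-a value multiplied by the square of the highest token probability; that formula is an assumption about the implementation, and the function names are hypothetical.

```python
import math

def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def top_a_filter(logits, top_a):
    """Drop tokens whose probability is below top_a * (highest probability)^2."""
    if top_a <= 0:
        return list(logits)  # 0 disables the filter
    probs = softmax(logits)
    max_prob = max(probs)
    best = probs.index(max_prob)
    threshold = top_a * max_prob * max_prob  # assumed cutoff formula
    return [
        x if (p >= threshold or i == best) else -math.inf
        for i, (x, p) in enumerate(zip(logits, probs))
    ]

# Confident model: probabilities of about 0.9, 0.06, 0.04. Only the top token survives.
confident = top_a_filter([math.log(0.9), math.log(0.06), math.log(0.04)], 0.1)
# Unsure model: probabilities of about 0.4, 0.35, 0.25. Every token survives.
unsure = top_a_filter([math.log(0.4), math.log(0.35), math.log(0.25)], 0.1)
```

The same top-a value prunes heavily in the confident case and not at all in the unsure case.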
Typical Sampling
1 is disabled
Typical Sampling aims to keep the information content of the generated text consistent throughout. It sorts tokens in ascending order of the absolute difference between the entropy of the distribution and the negative natural logarithm of each token's probability, then keeps the smallest number of tokens whose combined probability reaches the chosen threshold. This method can strongly affect the content of the output but still maintains creativity even at extremely low settings.
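As a hedged sketch of the usual formulation of typical sampling, the code below ranks tokens by how far their surprise (negative log probability) is from the entropy of the distribution and keeps the closest ones until their combined probability reaches the setting. The function names are hypothetical and the code is not taken from KoboldAI.

```python
import math

def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def typical_filter(logits, typical):
    """Keep the tokens whose surprise is closest to the distribution's entropy."""
    if typical >= 1.0:
        return list(logits)  # 1 disables the filter
    probs = softmax(logits)
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    deviation = [abs(-math.log(p) - entropy) if p > 0 else math.inf for p in probs]
    order = sorted(range(len(probs)), key=lambda i: deviation[i])
    kept, cumulative = set(), 0.0
    for i in order:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= typical:
            break
    return [x if i in kept else -math.inf for i, x in enumerate(logits)]

# Made-up distribution with probabilities of about 0.6, 0.3 and 0.1.
logits = [math.log(0.6), math.log(0.3), math.log(0.1)]
filtered = typical_filter(logits, 0.2)
```

In this example the token with probability 0.3 is the most "typical" one, so at a setting of 0.2 it is the only survivor and even the most likely token is removed. This is why the method can change the output so strongly.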
Tail Free Sampling
1 is disabled
Tail Free Sampling aims to remove low probability tokens without compromising the creativity of the generated text. It does this by identifying a "tail" of undesirable tokens in the probability distribution and removing them based on a user-specified threshold. This method is designed to work well on longer pieces of text and can be used in conjunction with other sampling methods for further control over the generated output.
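One common way to locate the "tail" is to sort the probabilities, take the absolute second differences of the sorted curve, normalize them, and cut where their running sum exceeds the threshold. The sketch below follows that description. It is an assumption that KoboldAI works exactly this way, and the function names are hypothetical.

```python
import math

def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def tail_free_filter(logits, tfs):
    """Remove the low-probability tail found via second differences."""
    if tfs >= 1.0 or len(logits) < 3:
        return list(logits)  # 1 disables the filter
    probs = softmax(logits)
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    sorted_probs = [probs[i] for i in order]
    d1 = [b - a for a, b in zip(sorted_probs, sorted_probs[1:])]
    d2 = [abs(b - a) for a, b in zip(d1, d1[1:])]
    total = sum(d2)
    if total == 0:
        return list(logits)  # evenly sloped distribution: no tail to find
    remove = [False]  # the most likely token is always kept
    cumulative = 0.0
    for value in d2:
        cumulative += value / total
        remove.append(cumulative > tfs)
    remove.append(True)  # the least likely token is always dropped
    dropped = {order[pos] for pos, flag in enumerate(remove) if flag}
    return [-math.inf if i in dropped else x for i, x in enumerate(logits)]

# Made-up distribution with probabilities of about 0.5, 0.3, 0.15, 0.04, 0.01.
logits = [math.log(p) for p in (0.5, 0.3, 0.15, 0.04, 0.01)]
filtered = tail_free_filter(logits, 0.6)  # keeps the three most likely tokens
```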
Amount generation
The maximum number of tokens the AI will generate in a response. One token is roughly 3-4 characters of English text, so a word is usually one or two tokens. The larger the value, the longer generation takes.
Context size
How much of the conversation the AI remembers, in tokens. Context size also affects the speed of generation.
Important: The Context Size setting in the TavernAI GUI always overrides the setting in the KoboldAI GUI.