Warning: I'm an amateur at this kind of subject. I'm not very technical, but I know enough to RP decently. Also, the images are linked as URLs.

Alright, so you may be wondering why I use only smoothing and min_p for testing. To start off, I use smoothing for its dynamic characteristic. Dynamic in the sense that it raises and balances the top token probabilities while incrementally lowering the less likely ones, depending on the smoothing value. The difference between smoothing and temp is that smoothing covers and considers 'all' tokens, while temp either focuses mostly on the top tokens (at low to mid values) or boosts the majority of tokens that aren't top probability (at high values). In short, with temp you go either very deterministic or very wild; with smoothing you can hit every point in between, adjusting token probs to any degree of determinism and/or creativity. You can use smoothing curve to tinker with smoothing further, but as of now there's no option to visualize it. Temp is stiff, smoothing is flexible.

To visualize this more easily, check out the example images below, which I got using Artefact2's LLM token probability visualization tool. Do note that the visualizations below are based on open-ended prompts; if I were to visualize token probs on question prompts, it would be a question of factuality, not creativity.

Temp: 3, Min_p: 0.135: https://cdn-uploads.huggingface.co/production/uploads/6580400298aa9fcdd244c071/O0qoPmaMfM3UXjgObqP2V.jpeg

Smoothing: 0.07, Min_p: 0.075: https://cdn-uploads.huggingface.co/production/uploads/6580400298aa9fcdd244c071/HWEy-cOaQV9jC1B7W_2rh.jpeg

As you can see, smoothing arranges the probs in a curve-like manner from the most considered top token down to the least considered one, whereas with temp almost all the low-prob tokens sit at nearly the same level with little difference between them, in contrast to the vast prob differences among the top tokens.

So yeah, with smoothing's flexibility I used it to increase token diversity by raising the probability of the tokens that sit between the top-prob and low-prob ones. Why? A balanced diversity of tokens means more creativity while staying coherent to an extent. Focusing only on the top tokens, which I've noticed is the trend in many recommended sampling presets, is somewhat limiting if you want to grasp the full capabilities of an RP model. I just feel some tokens are underutilized and get lumped in with the rest of the low-prob tokens, which doesn't fare well for creativity. To achieve this diversity I used a low smoothing value, because smoothing values are quite sensitive: at 0.1 or above, token probs get drastically more deterministic, and the opposite happens below 0.05.

Now, the min_p part is for quality control. With low smoothing values the top tokens lose probability and the low-prob tokens gain it, so I needed something to reinforce coherence and cut off nonsensical tokens. That said, I used min_p as minimally as possible so I could retain many tokens for diversification. The relatively high min_p values are there to keep up with smoothing's high-temp-like side effects, so min_p scales roughly in proportion with the effective temperature. If I want more creativity, I use a smoothing value of 0.06-0.07; for more determinism, 0.08-0.09.
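If you'd rather see numbers than pictures, below is a minimal Python sketch of the smoothing-then-min_p pipeline on a toy logit distribution. The quadratic formula is my reading of kalomaze's smoothing factor sampler, so treat the exact expression (and the toy logits) as assumptions for illustration, not a reference implementation.

```python
import numpy as np

def smooth_then_min_p(logits, smoothing_factor=0.07, min_p=0.075):
    """Sketch of the smoothing + min_p pipeline described above.

    The quadratic transform is assumed to be
    new_logit = max_logit - smoothing_factor * (logit - max_logit)^2,
    which is my understanding of the "smoothing factor" sampler.
    """
    logits = np.asarray(logits, dtype=np.float64)

    # Smoothing: bend the logits along a parabola anchored at the top logit.
    # Mid-range tokens lose less ground than with a plain temperature raise,
    # which is where the extra diversity comes from.
    top = logits.max()
    smoothed = top - smoothing_factor * (logits - top) ** 2

    # Convert to probabilities.
    probs = np.exp(smoothed - smoothed.max())
    probs /= probs.sum()

    # min_p: drop anything below min_p * (top token's probability), then
    # renormalize. This is the "quality control" cutoff.
    probs[probs < min_p * probs.max()] = 0.0
    probs /= probs.sum()
    return probs

# Toy example: one dominant token, a few mid-range ones, and a long tail.
toy_logits = [6.0, 4.5, 4.0, 3.0, 2.0, 0.5, -1.0]
print(np.round(smooth_then_min_p(toy_logits, 0.07, 0.075), 3))
```

With these toy numbers the mid-range tokens keep a meaningful share of the probability mass, and only the very last token falls under the min_p threshold, which is roughly the behavior I'm aiming for.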
Overall, I combined the dynamic creativity of smoothing with min_p as a coherence enhancer, which is good enough to test the full capabilities of any RP LLM.

Sampling parameters I considered but that didn't make the cut for testing:

Top_P, Typical_P, Top_K - They specialize in cutting off tokens so that only the top tokens are considered. If the token pool is a bonsai plant, min_p is a pair of shears while these are axes, unwieldy for a small plant (see the sketch at the end of this section).

Tail Free Sampling - I've heard anecdotally that this is somewhat similar to min_p, and going by Artefact2's LLM sampling tool that might be true, though min_p is preferable as it's simpler to understand and more objectively measurable than Tail Free Sampling.

Top_A - Min_p is more exact in coverage than this.

Repetition penalty parameters - I prefer sampling parameters that are universal in usage; to use these effectively I'd have to be very, very specific about the numbers for each model size (7B, 13B, 20B), and repetition penalties are like bombs, affecting every token in their range, both the good and the bad. Besides, I can just increase creativity to offset repetition by altering either smoothing or min_p. Also argued by kalomaze to be unreliable for most models.

Mirostat - Argued by kalomaze in a Reddit post to be somewhat unreliable. There was an anecdotal claim that it's basically just Top_K = 1000000. Also, I don't read scientific math.

Temp, Dynamic Temp - Good ol' reliable, but stiff compared to smoothing.

No Repeat Ngram Size - Uh... apparently meant for repetitive phrases, but same reasons as the repetition penalties.

Beam Search parameters - Dunno what they are, too technical.

Contrastive Search - Requires sampling in general to be disabled, so no.

Supplementary Links:
https://artefact2.github.io/llm-sampling/index.xhtml
https://www.reddit.com/r/LocalLLaMA/comments/17vonjo/your_settings_are_probably_hurting_your_model_why/
https://gist.github.com/kalomaze/4473f3f975ff5e5fade06e632498f73e
https://gist.github.com/kalomaze/4d74e81c3d19ce45f73fa92df8c9b979
https://www.reddit.com/r/LocalLLaMA/comments/17vonjo/comment/k9c1u2h/
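To back up the shears-vs-axes analogy from the Top_P/Typical_P/Top_K entry above, here's a tiny sketch with made-up probabilities showing why min_p adapts to the shape of the distribution while top_k always chops down to a fixed count. The numbers are toy values, not taken from any model.

```python
import numpy as np

def min_p_keep(probs, min_p):
    # min_p keeps every token whose probability is at least min_p * the top
    # probability, so the number of survivors scales with model confidence.
    probs = np.asarray(probs)
    return np.flatnonzero(probs >= min_p * probs.max())

def top_k_keep(probs, k):
    # top_k keeps a fixed number of tokens regardless of how the probability
    # mass is actually shaped.
    probs = np.asarray(probs)
    return np.argsort(probs)[::-1][:k]

probs = np.array([0.40, 0.20, 0.15, 0.10, 0.06, 0.05, 0.03, 0.01])
print("min_p=0.075 keeps indices:", min_p_keep(probs, 0.075))  # threshold 0.03 -> 7 tokens survive
print("top_k=3 keeps indices:   ", top_k_keep(probs, 3))       # always exactly 3 tokens
```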