
New extension: <Statistics> #1602

Open: wants to merge 2 commits into base: main

github-actions[bot]
Contributor

Description

Its main function is the "normal random number generator" (NormalRNG). This function generates a random number that follows a normal (bell-shaped) distribution. Such a distribution has a central value (mean) and a margin of deviation (standard deviation). NormalRNG is bounded so that extreme values (abnormally high or low numbers, called outliers) do not occur.

This extension also includes a library of common values to check for probability, be it the chance of being above a certain number (one-tailed, "potxx") or the chance of being inside a range (two-tailed, "pttxx") where xx is the chance of success.

How to use the extension

Normal random number generator is an expression that creates a normally distributed random number, so not all numbers are equally likely. Numbers close to a central value (the "mean") are more likely to appear, and how far they spread from the mean is set by the "standard deviation". The expression takes one standard deviation (also called "1 sigma") as input, and the numbers it returns stay within three standard deviations of the mean. For example, with mean 5 and standard deviation 1, the expression gives numbers between 2 (5 - 3×1) and 8 (5 + 3×1). The "equivalent" expression for a flat distribution would be RandomFloatInRange(2,8). In the normal distribution, 5 is likely while the extreme values (2 and 8) are less likely; in the flat distribution, all numbers are equally likely. Use this expression to create numbers around a certain value. Real-life examples of normal distributions are people's height, IQ and walking speed.
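The mean 5 / standard deviation 1 example can be sketched in Python as follows. This is only an illustration of the idea, not the extension's events: `bounded_normal`, `clamp` and `share_near_mean` are hypothetical names, and `random.gauss` stands in for whatever generator the extension uses internally.

```python
import random

def clamp(value, low, high):
    """Keep value inside [low, high]."""
    return max(low, min(high, value))

def bounded_normal(mean, std_dev, bound=3.0, rng=random):
    """Normal draw, cut off at `bound` standard deviations from the mean."""
    z = rng.gauss(0.0, 1.0)  # standard normal draw (mean 0, std dev 1)
    return mean + std_dev * clamp(z, -bound, bound)

def share_near_mean(draws):
    """Fraction of draws that land within 1 of the value 5."""
    return sum(1 for d in draws if 4 <= d <= 6) / len(draws)

rng = random.Random(42)  # fixed seed so the comparison is repeatable
normal_draws = [bounded_normal(5, 1, rng=rng) for _ in range(10_000)]
flat_draws = [rng.uniform(2, 8) for _ in range(10_000)]

print("normal, share in [4, 6]:", share_near_mean(normal_draws))
print("flat,   share in [4, 6]:", share_near_mean(flat_draws))
```

Both lists stay between 2 and 8, but the normal draws cluster around 5 (roughly two thirds fall in [4, 6]) while the flat draws spread evenly (roughly one third).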

Probability check is an expression that tells you how likely a number is to occur. Use the OneTail type to find the probability of being above (or below) a goal. A real-life example would be a grade list: the top student is above 100% of their classmates. If you got 8/10, how many classmates are you above? In games, this can be used to check the rarity of an item, drop, piece of equipment, etc. Use the TwoTails type to find whether something belongs to a group or is inside a range. In real life this is used to judge differences: 100.1 is bigger than 100, but barely so. A game could use this to check whether an attack was critical, among other things.
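For readers who want to see the arithmetic behind the two kinds of check, here is a minimal sketch using the exact normal CDF. The extension itself relies on a table of precomputed values, so this is a stand-in under that assumption; `z_score`, `one_tail` and `two_tails` are hypothetical Python names, and the grade-list mean and standard deviation are made up for illustration.

```python
import math

def z_score(x, mean, std_dev):
    """Distance of x from the mean, measured in standard deviations."""
    return (x - mean) / std_dev

def one_tail(x, mean, std_dev):
    """Probability that a normal draw falls below x (one-tailed)."""
    z = z_score(x, mean, std_dev)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def two_tails(x, mean, std_dev):
    """Probability that a normal draw is no farther from the mean than x."""
    z = abs(z_score(x, mean, std_dev))
    return math.erf(z / math.sqrt(2.0))

# Hypothetical grade list: class mean 6/10, standard deviation 1.5.
print("share of classmates below 8/10:", one_tail(8, 6, 1.5))
print("share of grades within 1 sigma:", two_tails(7.5, 6, 1.5))
```

A value exactly at the mean gives `one_tail` = 0.5, and a value one standard deviation away gives `two_tails` of about 0.68, which matches the usual 68% rule.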

Note: Normal distributions and flat distributions have different statistics, so I do not recommend using one for the other (e.g. making a OneTail check on the throw of a single die).

Checklist

  • I've followed all of the best practices.
  • I confirm that this extension can be integrated to this GitHub repository, distributed and MIT licensed.
  • I am aware that the extension may be updated by anyone, and do not need my explicit consent to do so.

What tier of review do you aim for your extension?

Reviewed

Example file

Statistics Example.zip

Extension file

Statistics.zip

@github-actions github-actions bot added the ✨ New extension A new extension label Mar 16, 2025
@github-actions github-actions bot requested a review from a team as a code owner March 16, 2025 16:33
@D8H
Contributor

D8H commented Mar 29, 2025

Thank you for submitting an extension.

I only gave it a quick look because the example is missing the resources and it's not really usable without them.

The extension name "Statistics" sounds a bit generic.

It seems that you used the Irwin–Hall distribution. Please, add these 2 links in a comment to help users understand the implementation:

Why did you choose a 5 parts approximation? Is it precise enough?

I have a few suggestions to simplify the events:

  • Extension variables can only be used within the extension, there is no need to add a prefix. You can use the variables directly without a structure.
  • Add a comment to explain any abbreviation you may use (It's better not to use abbreviation, but in your case, I guess it would make the formulas harder to read).
  • In the NormalRNG function, you could cut the random value right after the 10 RandomFloat to avoid mapping the value 4 additional times. You can also use clamp() instead of 2 conditions.
  • In a 2nd step, you can probably have everything in 1 formula and no longer need the random variable.
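The suggestions above can be sketched outside GDevelop as follows. This assumes the generator sums ten uniform draws (the "10 RandomFloat" in the review), which is the Irwin–Hall approach the reviewer names; the function names are hypothetical and the actual events may differ.

```python
import math
import random

N_UNIFORMS = 10  # assumption: taken from the "10 RandomFloat" remark

def clamp(value, low, high):
    """Keep value inside [low, high]."""
    return max(low, min(high, value))

def irwin_hall_z(rng=random, n=N_UNIFORMS):
    """Approximate standard normal draw from a sum of n uniform draws.

    The sum of n uniforms on [0, 1) has mean n/2 and variance n/12,
    so it is centred and rescaled to mean 0 and standard deviation 1.
    """
    total = sum(rng.random() for _ in range(n))
    return (total - n / 2) / math.sqrt(n / 12)

def normal_rng(mean, std_dev, bound=3.0, rng=random):
    """Single formula: clamp the draw once, then map it to mean/std_dev."""
    return mean + std_dev * clamp(irwin_hall_z(rng), -bound, bound)

print(normal_rng(5, 1))  # a value between 2 and 8, most often near 5
```

Clamping the standardized value once, before the mapping, is what removes the need for two separate conditions and for an intermediate variable.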

@arelaestudio

arelaestudio commented Apr 5, 2025

Hello Davy!

  • The "Name displayed in editor" has been changed to Normal Distribution Statistics. However, I kept the name the same, as the extension provides expressions that may be difficult to use if the name is long.
  • The suggested pages, and additional pages that inform the foundations of the extension have been included in "Description".
  • I chose mostly a 5-part approximation (5%, 10%, etc.) because these steps represent the most common ways to break down probability (by 2, 4, 5, 10 and 20). I have, however, increased the precision so that the probability can also be checked when broken down by 3, 6, 8 and 12. For most purposes this should be enough. If a user requires a very specific way to break down probability, I assume they know what they are doing and are able to calculate it themselves or to add the required variable/value to the extension.
  • I have simplified the variables, however kept some structures. The reasons for this are: 1) to differentiate One-Tail and Two-Tails probability; 2) To be able to call Two-Tails directly from the parameter in the condition TwoTails.
  • The three abbreviations used (NRG, StdDev and p) are explained in "Description". As for Z, that is its name, and users can double-check it in the links. I know that abbreviations are discouraged but, as you said, the names are too long to be friendly. Thanks for understanding.
  • NormalRNG has been completely overhauled because the expressions ZScore and Zvalue were added. Also, thanks, I did not know about clamp()! Very useful indeed. This also solved the use of variables: only the required remain. While the variable ZScore is not absolutely required, not using it would make the expression OneTail to calculate 37 times instead of once... So I guess it does not harm to have it.

!update Statistics.zip

As I cannot upload an example file, I link the game in GDevelop:
https://gd.games/arelaestudio/statistics-example

Contributor Author

github-actions bot commented Apr 5, 2025

✅ Successfully updated the extension.

Contributor Author

github-actions bot commented Apr 5, 2025

❗ No updates found. Please check your file.

@arelaestudio

arelaestudio commented Apr 5, 2025

!update Statistics Example.zip

Contributor Author

github-actions bot commented Apr 5, 2025

❗ No updates found. Please check your file.


@D8H
Contributor

D8H commented Apr 6, 2025

I have a few questions:

  • in NormalRNG
    • can the affine be applied outside the clamp?
    • about the Bound parameter, when cutting extreme values, I wonder if what matters most of the time is deciding at which value we cut (and users can clamp it themselves) rather than at which probability. What is your view on this?
  • What are the use-cases of OneTail and Range aside from explaining the concept of normal distribution in the demo?
  • is it a bug or a formatting issue?
    image
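The first two questions can be illustrated with a small sketch. It assumes a positive standard deviation and assumes that a probability bound means "the share of the distribution kept inside the cut"; both helper pairs and `bound_from_probability` are hypothetical names, not the extension's expressions.

```python
from statistics import NormalDist

def clamp(value, low, high):
    """Keep value inside [low, high]."""
    return max(low, min(high, value))

def clamp_then_scale(z, mean, std_dev, bound):
    """Clamp the standardized value, then apply the affine mapping."""
    return mean + std_dev * clamp(z, -bound, bound)

def scale_then_clamp(z, mean, std_dev, bound):
    """Apply the affine mapping first, then clamp the final value."""
    low, high = mean - bound * std_dev, mean + bound * std_dev
    return clamp(mean + std_dev * z, low, high)

def bound_from_probability(p_inside):
    """z such that a standard normal draw falls within ±z with p_inside."""
    return NormalDist().inv_cdf(0.5 + p_inside / 2.0)

for z in (-5.0, -0.4, 1.7, 9.0):
    print(z, clamp_then_scale(z, 5, 1, 3), scale_then_clamp(z, 5, 1, 3))

print("bound for 95% inside:", bound_from_probability(0.95))  # about 1.96
```

Both orderings give the same number, so the choice is mostly about readability. Converting a probability into a bound is one extra step, after which clamping by value and clamping by probability become the same operation.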

@arelaestudio

Thanks for your review and feedback!
About your questions (sorry if the answers are too long):

  • If by "the affine" you mean first calculating the number and clamping it afterwards, that is perfectly possible. It would just require an additional variable.
  • I understand your point. From a user's perspective it might be more intuitive to clamp to a value instead of a probability. In that case the user can simply use p999 and clamp the result themselves to whatever number they want, at the expense of biasing the distribution. I thought about having two separate functions, one to generate the number and another to clamp it against a probability, but it felt repetitive. All in all, I consider that having an expression that can clamp against a probability does no harm: it offers an advanced option and does not prevent the user from adjusting the number later on.
  • OneTail and Range are expressions to check rarity. They are not extraordinary; they just replace comparing numbers with comparing the probability of those numbers. Mechanics similar to the crab catcher example were the use-cases I had in mind. Something like in Pokemon, where you can catch a taller/smaller/heavier/lighter pokemon, someone evaluates it and gives you a prize. Setting the evaluation thresholds and knowing what is considered "extreme" for every pokemon would require 4 variables each. Checking the rarity against the same numbers used to generate the pokemon requires none, only the rarity level. More generally, if the user sets what they consider "rare" (e.g. above 75% chance), they don't need to calculate any threshold ever again, just input the numbers to check (as the condition will always be OneTail X>p750, for example). Aside from that, in an RPG you could control the chance to hit or miss the enemy and, depending on the range, grant a second chance (or whatever). Range may also be used for dynamic move canceling, checking whether the button was pressed too slowly or too quickly against the user's own performance. All in all, none of the above is something that cannot be done with a compare value, compare variable, etc. However, when using normalized random numbers, usually the probability is known, not the value: "I want the angle to be off 5% of the time" vs. "when the angle is 45 it will miss".
  • The 0% range is an approximation of the real value (0.6%); it is caused by a lack of sensitivity at extremely low values. I could fix this with additional variables (like p005 or p001) or by using the "smaller" type of check ("bigger" is used by default), but then the same problem would happen for extremely high values.
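The rarity argument can be sketched in Python with the standard library's `NormalDist`. The species figures below are invented for illustration, and `is_rare` and `threshold_for` are hypothetical names; the extension compares against stored probability levels such as p750 rather than computing an exact CDF.

```python
from statistics import NormalDist

# Hypothetical species data: (mean height, standard deviation).
SPECIES = {
    "small_crab": (0.30, 0.05),
    "giant_crab": (1.20, 0.20),
}

def is_rare(value, mean, std_dev, rarity=0.75):
    """True when value lies above `rarity` of the population."""
    return NormalDist(mean, std_dev).cdf(value) > rarity

def threshold_for(mean, std_dev, rarity=0.75):
    """The per-species cutoff a plain value comparison would need."""
    return NormalDist(mean, std_dev).inv_cdf(rarity)

for name, (mean, std_dev) in SPECIES.items():
    catch = mean + 1.5 * std_dev  # a catch 1.5 sigma above its species mean
    print(name, "rare catch:", is_rare(catch, mean, std_dev),
          "| cutoff otherwise needed:", threshold_for(mean, std_dev))
```

One rarity level works for every species because the check reuses the mean and standard deviation that generated the value, whereas a plain comparison needs a separate cutoff per species.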
