
Count to Probability

This online calculator takes a list of events along with the number of times each event occurred and calculates the probability (and log probability) of each event by dividing its count by the total count of all events.

Suppose you are analyzing data that is random by nature, and you count the number of times a particular value appears in it. Or, in terms of probability theory, the number of times a particular event has happened.

A good example of such a task is the analysis of letter frequencies in a text. You have the text, and you count how many times each letter of the alphabet appears in it. After that, you probably want to compare your results with the theoretical frequencies of letters (or bigrams, or whatever you count), which are often given as probabilities. So you need to convert counts to probabilities. It is actually easy: you sum all the counts and then divide the count for each letter by the total number of letters in the text. But doing it by hand can be boring and tedious: say, you need to import your data into a spreadsheet program, sum the column, fill another column with the results of the division, and so on.

That's why I've created the calculator below. It takes a list of events along with the number of times each event occurred and calculates the probability of each event by dividing its count by the total count of all events. Also, if there are many events, sometimes you need logarithms of probabilities instead of the probabilities themselves, and I've included this option as well. However, note that you can't take the log of zero, so if any event has a count of zero, the log is computed for a small substitute value instead: in this case, 0.01 divided by the total count.
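For readers who would rather script the conversion, here is a minimal Python sketch of the steps described above. It is not the calculator's actual code: the function name `counts_to_probabilities` and the `zero_substitute` parameter are made up for illustration, and the natural logarithm is an assumption, since the base of the log is not stated here.

```python
import math


def counts_to_probabilities(counts, zero_substitute=0.01):
    """Map each event to a (probability, log probability) pair.

    counts: dict of event name -> non-negative count.
    zero_substitute: stand-in count used only for the log of zero-count
    events (0.01, as described in the text above).
    """
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("the total count must be positive")

    result = {}
    for event, count in counts.items():
        probability = count / total
        # log(0) is undefined, so a zero count is replaced by a small value
        # divided by the total before taking the log.
        log_argument = probability if count > 0 else zero_substitute / total
        # Assumption: natural logarithm; swap in math.log2 or math.log10 if needed.
        result[event] = (probability, math.log(log_argument))
    return result


example = counts_to_probabilities({"a": 3, "b": 1, "c": 0})
for event, (probability, log_probability) in example.items():
    print(event, probability, log_probability)
```

Note that the reported probability of a zero-count event stays zero. Only the argument of the logarithm is replaced.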

Paste your data, tweak the regular expression used for parsing if needed, then choose the separator for the result columns and which values you want to see in the results.
As for the regular expression, the only requirement is that it should produce two capture groups: the first for the event name and the second for the event count. By default, it assumes that the event name and count are separated by a semicolon.
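To illustrate the two-capture-group requirement, here is a small Python sketch of the parsing step. The pattern in `DEFAULT_PATTERN` is a hypothetical stand-in for "name and count separated by a semicolon", not the calculator's actual default. The `parse_counts` name is also made up, and two behaviours are assumptions: lines that do not match are skipped, and repeated event names are summed.

```python
import re

# Hypothetical pattern: event name, a semicolon, then an integer count.
# Group 1 is the event name, group 2 is the event count.
DEFAULT_PATTERN = r"^\s*(.+?)\s*;\s*(\d+)\s*$"


def parse_counts(text, pattern=DEFAULT_PATTERN):
    """Parse 'name;count' lines into a dict of event name -> count."""
    regex = re.compile(pattern)
    if regex.groups < 2:
        raise ValueError("the pattern must define at least two capture groups")

    counts = {}
    for line in text.splitlines():
        match = regex.match(line)
        if match is None:
            continue  # assumption: lines that do not fit the pattern are skipped
        name = match.group(1)
        count = int(match.group(2))
        # Assumption: if an event name repeats, its counts are added together.
        counts[name] = counts.get(name, 0) + count
    return counts


sample = "a;3\nb; 1\n\nnot a row\nc;0"
print(parse_counts(sample))
```

With a different input layout, only the pattern needs to change. For example, `r"^(\S+)\s+(\d+)$"` would read a name and a count separated by whitespace.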

I hope it can save some time for somebody. Enjoy.

PLANETCALC, Count to Probability
