Math 1. Mean, Sigma Notation, Standard Deviation and Variance, Percentile.
Mean
The mean is the average: add up all the values, then divide the sum by the number of values.
Example: find the average number of days per month in a leap year.
First, find the sum (add the numbers):
31 + 29 + 31 + 30 + 31 + 30 + 31 + 31 + 30 + 31 + 30 + 31 = 366
We know that we added 12 numbers, so now divide the sum by the count of numbers:
366 / 12 = 30.5
So the average month in a leap year has 30.5 days (Check: 30.5 * 12 = 366).
Another example: find the mean of the lowest monthly temperatures (in Celsius) for 2017 in the city of Ganja, Azerbaijan.
Add the lowest monthly temperatures:
(-2) + (-1) + 2 + 7 + 12 + 17 + 20 + 19 + 15 + 10 + 4 + 0 = 103
Mean:
103 / 12 ≈ 8.583 °C (Check: 8.583 * 12 = 102.996 ≈ 103)
As you can see, in each case we took the mean of numbers of the same kind (days per month in the first example, monthly temperatures in the second).
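Both calculations above can be reproduced in a few lines of Python. This is a minimal sketch; the function name `mean` and the variable names are chosen here for illustration and are not part of the original text.

```python
def mean(values):
    """Return the arithmetic mean: the sum divided by the count."""
    if not values:
        raise ValueError("the mean of an empty list is undefined")
    return sum(values) / len(values)


# Days per month in a leap year (February has 29 days).
days_in_month_leap_year = [31, 29, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

# Lowest monthly temperatures in Celsius, as listed in the example above.
lowest_monthly_temps_c = [-2, -1, 2, 7, 12, 17, 20, 19, 15, 10, 4, 0]

leap_mean = mean(days_in_month_leap_year)
temp_mean = mean(lowest_monthly_temps_c)

print(leap_mean)            # 30.5
print(round(temp_mean, 3))  # 8.583
```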
Sigma Notation
Σ is the Greek capital letter sigma. It means: sum up whatever comes after it.
Σn means: sum up all the values of n.
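Which values does n take? They are written around the sigma: the starting value goes below it and the last value goes above it.
$$\sum_{n=1}^{5} n$$
This means: sum the values of n from n=1 to n=5, so:
$$\sum_{n=1}^{5} n = 1 + 2 + 3 + 4 + 5 = 15$$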
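The sum from n=1 to n=5 described above maps directly onto Python's built-in `sum` and `range`. The helper `sigma` below is a hypothetical name introduced only for this sketch; note that `range` excludes its upper bound, so the code adds 1 to it.

```python
def sigma(lower, upper, term=lambda n: n):
    """Sum term(n) for every integer n from lower to upper, inclusive."""
    # range() stops before its second argument, hence upper + 1.
    return sum(term(n) for n in range(lower, upper + 1))


total = sigma(1, 5)
print(total)  # 15, i.e. 1 + 2 + 3 + 4 + 5

# The same notation works for any expression after the sigma,
# for example the sum of n squared for n from 1 to 5.
total_of_squares = sigma(1, 5, lambda n: n * n)
print(total_of_squares)  # 55, i.e. 1 + 4 + 9 + 16 + 25
```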
Standard Deviation
Standard Deviation (STD) is a measure of how spread out a set of numbers is (how far the numbers are from the mean).
To find the STD (e.g. our building has 5 flats, the number of people living in each flat is 7, 3, 1, 5, 6, and we want the STD of these numbers):
- find the mean: (7 + 3 + 1 + 5 + 6) / 5 = 4.4
- find the differences: subtract the mean from each number. The size of the result shows how far the number is from the mean, and its sign shows whether the number is lower or higher than the mean:
  - 7 - 4.4 = 2.6
  - 3 - 4.4 = -1.4
  - 1 - 4.4 = -3.4
  - 5 - 4.4 = 0.6
  - 6 - 4.4 = 1.6
- find the squared differences: square each difference. Without this step, negative and positive differences would cancel each other out and the overall measure would be wrong:
  - 2.6 * 2.6 = 6.76
  - (-1.4) * (-1.4) = 1.96
  - (-3.4) * (-3.4) = 11.56
  - 0.6 * 0.6 = 0.36
  - 1.6 * 1.6 = 2.56
- find the variance (the mean of the squared differences):
  - (6.76 + 1.96 + 11.56 + 0.36 + 2.56) / 5 = 4.64
- find the standard deviation (the square root of the variance):
  - √4.64 = 2.154065923 ≈ 2.1541
- STD gives us a rule of thumb for judging which numbers are normal (between mean-STD and mean+STD), which are low (lower than mean-STD) and which are high (higher than mean+STD):
  - mean + STD = 4.4 + 2.1541 = 6.5541
  - mean - STD = 4.4 - 2.1541 = 2.2459
  - 2.2459 < 6.5541 < 7 => 7 is higher than normal for that building
  - 2.2459 < 3 < 6.5541 => 3 is normal for that building
  - 1 < 2.2459 < 6.5541 => 1 is lower than normal for that building
  - 2.2459 < 5 < 6.5541 => 5 is normal for that building
  - 2.2459 < 6 < 6.5541 => 6 is normal for that building
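The steps above can be sketched in Python as follows. Because the variance here divides by the full count (5), this is the population form of the standard deviation. The names `population_std` and `classify` are hypothetical helpers written for this example, and the low/normal/high labels simply follow the rule of thumb used in the list above.

```python
import math

people_per_flat = [7, 3, 1, 5, 6]


def population_std(values):
    """Return (mean, variance, standard deviation), dividing by the count."""
    mean = sum(values) / len(values)
    # Squaring keeps negative and positive differences from cancelling out.
    squared_diffs = [(v - mean) ** 2 for v in values]
    variance = sum(squared_diffs) / len(values)
    return mean, variance, math.sqrt(variance)


def classify(value, low, high):
    """Label a value relative to the band [mean - STD, mean + STD]."""
    if value < low:
        return "low"
    if value > high:
        return "high"
    return "normal"


mean, variance, std = population_std(people_per_flat)
low, high = mean - std, mean + std

print(round(variance, 2))  # 4.64
print(round(std, 4))       # 2.1541

labels = {v: classify(v, low, high) for v in people_per_flat}
print(labels)  # {7: 'high', 3: 'normal', 1: 'low', 5: 'normal', 6: 'normal'}
```

As a cross-check, Python's standard library function `statistics.pstdev` computes the same population standard deviation.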
Percentile
A percentile is the value below which a given percentage of the data falls. The data itself is ordered from lowest to highest. For example, if the 95th percentile of men's height is 187 cm (a statistical measure), then 95% of men are shorter than 187 cm and 5% of men are taller than 187 cm.
To find percentiles and their corresponding values using the nearest-rank method:
- order the list of values, e.g. the number of people living in 5 flats (7, 3, 1, 5, 6):
  - Ordered list: 1, 3, 5, 6, 7
  - Number of values N = 5
- find the minimum, it is the 1st percentile: the 1st is 1
- find the maximum, it is the 100th percentile: the 100th is 7
- to find the p-th percentile: compute the rank p / 100 * N and, if it is not an integer, round it up to the next integer. The value at that position in the ordered list is the percentile value:
  - 25th: 25 / 100 * 5 = 1.25, rounded up to 2
    - means the 2nd number in the list
    - so the 25th percentile value is 3
  - 50th: 50 / 100 * 5 = 2.5, rounded up to 3
    - means the 3rd number in the list
    - so the 50th percentile value is 5
  - 75th: 75 / 100 * 5 = 3.75, rounded up to 4
    - means the 4th number in the list
    - so the 75th percentile value is 6
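The nearest-rank steps above can be sketched in Python like this. The function name `nearest_rank_percentile` is a hypothetical helper for this example, and the sketch assumes the percentile is given as a number greater than 0 and at most 100.

```python
import math


def nearest_rank_percentile(values, p):
    """Return the p-th percentile of values using the nearest-rank method."""
    if not values:
        raise ValueError("cannot take a percentile of an empty list")
    if not 0 < p <= 100:
        raise ValueError("p must be greater than 0 and at most 100")
    ordered = sorted(values)
    # Rank = p / 100 * N, rounded up to the next integer when not whole.
    rank = math.ceil(p * len(ordered) / 100)
    # Ranks count from 1, Python list indexes count from 0.
    return ordered[rank - 1]


people_per_flat = [7, 3, 1, 5, 6]

percentiles = {p: nearest_rank_percentile(people_per_flat, p)
               for p in (1, 25, 50, 75, 100)}
print(percentiles)  # {1: 1, 25: 3, 50: 5, 75: 6, 100: 7}
```

Other percentile methods interpolate between neighbouring values, so library functions may return slightly different numbers than the nearest-rank method for the same list.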