MGT1051 Business Analytics for Engineers

Normal Distribution

© 2018 C. Gangatharan – VIT

Dec 11, 2018 – Tue

Data Distribution
• Data can be “distributed” (spread out) in different ways

What is Normal (Gaussian) Distribution?
• The normal distribution is a descriptive model that approximates many real-world situations.
• It is defined as a continuous frequency distribution of infinite range: the variable can take any value, not just integers as in the case of the binomial and Poisson distributions.
• It is the most important probability distribution in statistics and an important tool in the analysis of epidemiological data and in management science.

Types of Distribution
• Frequency Distribution
• Normal (Gaussian) Distribution
• Probability Distribution
• Poisson Distribution
• Binomial Distribution
• Sampling Distribution
• t distribution
• F distribution

A Bell Curve


What are some examples of things that follow a Normal Distribution?
• Heights of people
• Size of things produced by machines
• Errors in measurements
• Blood Pressure
• Test Scores

Standard Normal Distribution
• Mean = median = mode
• Symmetric about the center
• 50% of the values are less than the mean and 50% are greater than the mean

Characteristics of Normal Distribution
• It links frequency distribution to probability distribution
• It has a bell-shaped curve
• It is symmetric around the mean: the two halves of the curve are the same (mirror images)

The Standard Deviation
• 68% of values are within 1 standard deviation of the mean
• 95% of values are within 2 standard deviations of the mean
• 99.7% of values are within 3 standard deviations of the mean
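The three percentages above are rounded values of the probability that a normally distributed value lies within k standard deviations of its mean. As a minimal sketch (the function name `prob_within` is an illustrative choice, not from the slides), that probability can be computed from the error function in Python's standard library:

```python
import math


def prob_within(k: float) -> float:
    """Probability that a normal value lies within k SDs of its mean."""
    # For any normal distribution, P(|X - mean| < k*SD) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))


coverage = {k: prob_within(k) for k in (1, 2, 3)}

for k, p in coverage.items():
    print(f"within {k} SD: {p:.4f}")
# within 1 SD: 0.6827
# within 2 SD: 0.9545
# within 3 SD: 0.9973
```

The result does not depend on the particular mean or standard deviation, which is why the rule can be stated once for every normal distribution.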

Why do we need to know Standard Deviation?
• Any value is
  • likely to be within 1 standard deviation of the mean
  • very likely to be within 2 standard deviations
  • almost certainly within 3 standard deviations

How good is the rule for real data?
Check some example data:
• Mean weight of the women = 127.8 lb
• Standard deviation (SD) = 15.5 lb

68% of 120 = 0.68 × 120 ≈ 82 runners
In fact, 79 runners fall within 1 SD (15.5 lb) of the mean, that is, between 112.3 lb and 143.3 lb.

[Figure: histogram of weight in POUNDS (horizontal axis 80 to 160, vertical axis Percent 0 to 25), with 112.3, 127.8, and 143.3 marked]

95% of 120 = 0.95 × 120 = 114 runners
In fact, 115 runners fall within 2 SDs of the mean, that is, between 96.8 lb and 158.8 lb.

[Figure: histogram of weight in POUNDS (horizontal axis 80 to 160, vertical axis Percent 0 to 25), with 96.8, 127.8, and 158.8 marked]

99.7% of 120 = 0.997 × 120 ≈ 119.6 runners
In fact, all 120 runners fall within 3 SDs of the mean, that is, between 81.3 lb and 174.3 lb.

[Figure: histogram of weight in POUNDS (horizontal axis 80 to 160, vertical axis Percent 0 to 25), with 81.3, 127.8, and 174.3 marked]
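The three checks above repeat the same arithmetic: mean ± k × SD for the interval, and the rule's percentage × 120 for the expected count. A small sketch that tabulates them, using only the figures quoted on the slides (the underlying 120 individual weights are not given in the material, so the observed counts are entered as reported; the variable names are illustrative):

```python
MEAN_LB = 127.8      # mean weight from the slides
SD_LB = 15.5         # standard deviation from the slides
N_RUNNERS = 120      # sample size from the slides

RULE = {1: 0.68, 2: 0.95, 3: 0.997}      # share expected within k SDs
OBSERVED = {1: 79, 2: 115, 3: 120}       # counts reported on the slides

rows = []
for k, share in RULE.items():
    low = round(MEAN_LB - k * SD_LB, 1)   # lower bound of the k-SD interval
    high = round(MEAN_LB + k * SD_LB, 1)  # upper bound of the k-SD interval
    expected = share * N_RUNNERS          # count predicted by the rule
    rows.append((k, low, high, expected, OBSERVED[k]))
    print(f"{k} SD: {low} to {high} lb, "
          f"expected {expected:.1f}, observed {OBSERVED[k]}")
```

The expected and observed counts differ by at most three runners, which is the sense in which the rule works well for this data set.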

The Normal Distribution: as mathematical function (pdf)

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\; e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$

This is a bell-shaped curve with different centers and spreads depending on μ and σ.

Note constants: π = 3.14159, e = 2.71828
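To see how μ and σ set the center and spread, the density can be evaluated directly. The sketch below assumes the standard form of the normal pdf, f(x) = exp(-((x-μ)/σ)²/2) / (σ√(2π)); the function name `normal_pdf` is illustrative, and the weight example (μ = 127.8, σ = 15.5) is reused only as sample input:

```python
import math


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of the normal distribution with mean mu and SD sigma at x."""
    z = (x - mu) / sigma                      # distance from the mean in SDs
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


# The curve peaks at the mean and is symmetric around it.
peak = normal_pdf(127.8, mu=127.8, sigma=15.5)
left = normal_pdf(127.8 - 15.5, mu=127.8, sigma=15.5)
right = normal_pdf(127.8 + 15.5, mu=127.8, sigma=15.5)

print(f"density at the mean:    {peak:.5f}")
print(f"density 1 SD below/above: {left:.5f} / {right:.5f}")
```

A larger σ lowers the peak and widens the bell, while changing μ only shifts the curve along the horizontal axis.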

Outliers?
Bill Gates makes $500 million a year. He’s in a room with 9 teachers, 4 of whom make $40k, 3 make $45k, and 2 make $55k a year.
What is the mean salary of everyone in the room? What would be the mean salary if Gates wasn’t included?

Mean With Gates: $50,040,500
Mean Without Gates: $45,000
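The two means can be reproduced with a few lines of Python. As an addition not on the slide, the sketch also prints the median, which is a common way to show how little a single extreme value moves it:

```python
from statistics import mean, median

# Salaries from the example: 4 teachers at $40k, 3 at $45k, 2 at $55k.
teachers = [40_000] * 4 + [45_000] * 3 + [55_000] * 2
room = teachers + [500_000_000]           # add Bill Gates

print("mean with Gates:     ", mean(room))        # 50040500
print("mean without Gates:  ", mean(teachers))    # 45000
print("median with Gates:   ", median(room))
print("median without Gates:", median(teachers))
```

One salary out of ten raises the mean by a factor of more than a thousand, while the median stays at $45,000 in both cases.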

What is an outlier?
• Global Outlier: an observation inconsistent with the rest of the dataset
• Local Outlier (special outlier)
  • An observation inconsistent with its neighborhood
  • A local instability or discontinuity

Outlier Detection
Find the mean and median of the following set of numbers:

3  12  7  40  9  14  18  15  17

Mean is 15
Median is 14

Outlier
In a set of numbers, a number that is much LARGER or much SMALLER than the rest of the numbers is called an Outlier.
To find any outliers in a set of data, we need to find the 5 Number Summary of the data.

Outlier Detection
To find any outliers in a set of data, we need to find the 5 Number Summary of the data.
Find the 5 Number Summary of the following numbers:

Step 1: Sort the numbers from lowest to highest
Step 2: Identify the Median
Step 3: Identify the Smallest and Largest numbers
Step 4: Identify the Median between the smallest number and the Median for the entire set of data, and between that Median and the largest number in the set.

3  7  9  12  14  15  17  18  40

Outlier Detection
3 - Smallest number in the set
9 - Median between the smallest number and the median
14 - Median of the entire set
17 - Median between the largest number and the median
40 - Largest number in the set

These are the five numbers in the 5 Number Summary

3  7  9  12  14  15  17  18  40
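The four steps can be sketched in Python as follows. One assumption is made explicit here: to reproduce the slide's values of 9 and 17, the overall median is counted as part of both halves when the number of values is odd. Other quartile conventions exist and can give slightly different results. The function name `five_number_summary` is illustrative.

```python
from statistics import median


def five_number_summary(values):
    """Return (smallest, Q1, median, Q3, largest).

    Assumption: when the count is odd, the middle value belongs to both
    the lower and the upper half, which matches the slide's 9 and 17.
    """
    data = sorted(values)                 # Step 1: sort low to high
    n = len(data)
    half = (n + 1) // 2                   # size of each half
    lower = data[:half]                   # smallest value up to the median
    upper = data[n - half:]               # median up to the largest value
    return (
        data[0],                          # Step 3: smallest
        median(lower),                    # Step 4: lower quartile (Q1)
        median(data),                     # Step 2: median
        median(upper),                    # Step 4: upper quartile (Q3)
        data[-1],                         # Step 3: largest
    )


summary = five_number_summary([3, 12, 7, 40, 9, 14, 18, 15, 17])
print(summary)  # (3, 9, 14, 17, 40)
```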

Outlier Detection
A 5 Number Summary divides your data into four quarters.

3  7  9  12  14  15  17  18  40

[Figure: the sorted numbers marked off into the 1st Quarter, 2nd Quarter, 3rd Quarter, and 4th Quarter]

Outlier Detection
3  7  9  12  14  15  17  18  40

The Lower Quartile (Q1) is the second number in the 5 Number Summary
25% of all the numbers in the set are smaller than Q1
The Upper Quartile (Q3) is the fourth number in the 5 Number Summary
25% of all the numbers in the set are larger than Q3

Outlier Detection
What percent of all the numbers are between Q1 and Q3?

3  7  9  12  14  15  17  18  40

50% of all the numbers are between Q1 and Q3
This span is called the Inter-Quartile Range (IQR)
The size of the IQR is the distance between Q1 and Q3

IQR = Q3 - Q1 = 17 - 9 = 8

Outlier Detection
3  7  9  12  14  15  17  18  40

IQR = 8

To determine if a number is an outlier, multiply the IQR by 1.5
8 × 1.5 = 12

An outlier is any number that is more than 12 below Q1 or more than 12 above Q3

Outlier Detection
3  7  9  12  14  15  17  18  40

IQR = 8

Lower limit: Q1 - 12 = 9 - 12 = -3
Upper limit: Q3 + 12 = 17 + 12 = 29

40 is greater than 29, so 40 is an OUTLIER
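The 1.5 × IQR check can be written out as a short sketch. The quartiles Q1 = 9 and Q3 = 17 are taken from the 5 Number Summary above, so the limits work out to 9 - 12 = -3 and 17 + 12 = 29; the variable names are illustrative:

```python
data = [3, 12, 7, 40, 9, 14, 18, 15, 17]

Q1, Q3 = 9, 17                  # quartiles from the 5 Number Summary
iqr = Q3 - Q1                   # inter-quartile range: 8
step = 1.5 * iqr                # 12.0

lower_limit = Q1 - step         # values below this are outliers
upper_limit = Q3 + step         # values above this are outliers

outliers = [x for x in data if x < lower_limit or x > upper_limit]

print("limits:", lower_limit, upper_limit)   # limits: -3.0 29.0
print("outliers:", outliers)                 # outliers: [40]
```

Only 40 falls outside the limits; the smallest value, 3, is well above the lower limit of -3.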

Outlier Detection
Find the mean and median of the following set of numbers (no outliers):

3  12  7  9  14  18  15  17  (40 removed)

With the outlier: Mean is 15, Median is 14
Without the outlier: Mean is 11.875, Median is 13
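As a quick check of these figures, a minimal sketch that recomputes both statistics before and after dropping the outlier (40) identified by the 1.5 × IQR rule:

```python
from statistics import mean, median

data = [3, 12, 7, 40, 9, 14, 18, 15, 17]
cleaned = [x for x in data if x != 40]    # drop the single outlier

print("with outlier:   ", mean(data), median(data))        # 15 14
print("without outlier:", mean(cleaned), median(cleaned))  # 11.875 13.0
```

Removing one value shifts the mean by more than 3 but the median by only 1, which illustrates why the median is the more robust summary when outliers are present.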
