Statistics - Introduction Arranging Data

  • June 2020
  • PDF

Unit of measure

STATISTICS-INTRODUCTION

• Role of Statistics in Managerial Decisions
• Nature of Data: Population data, Sample data
• Frequency Distribution

* Source:

Footnote Source

You use statistics daily without even realizing it!

Examples?

Statistics is used to help determine

• Which product I should sell (demand statistics)
• How much you pay for insurance (mortality statistics)
• Whether drugs are approved for use (drug trials)
• Which cars you buy (reliability ratings, crash tests)
• Which products are on your grocery shelf (focus groups), and where they are located (Big Bazaar and the snacks shop are right next to each other: what a concept!)
• What politicians claim as their "firm beliefs" (opinion polls)
• Favorites to win in sports
• Whether it will rain
• And on and on

Statistics: Definition

Many people think of statistics as large amounts of numerical data, e.g. share prices, GDP statistics, runs scored by Sachin, etc.

Definition: Statistics refers to the range of techniques and procedures for collecting, summarizing, classifying, analyzing, interpreting and displaying data, and for making decisions based on data.

Definition: By statistics, we mean aggregates of facts, affected to a marked extent by a multiplicity of causes, numerically expressed, enumerated or estimated according to reasonable standards of accuracy, collected in a systematic manner for a predetermined purpose and placed in relation to each other.

Characteristics of Statistics

• Statistics are aggregates of facts
• Statistics are affected to a marked extent by a multiplicity of causes
• Statistics are numerically expressed
• Statistics are enumerated or estimated according to reasonable standards of accuracy
• Statistics should be collected in a systematic manner for a predetermined purpose
• Statistics should be placed in relation to each other

Why Study Statistics

• It presents facts in definite and clear terms.
• It gives a concise shape to a mass of figures and develops meaning from the data.
• It helps to compare two sets of figures.
• It helps in formulating and testing hypotheses.
• It helps in understanding and predicting future events from past and current data.
• It helps in formulating suitable policies.
• It helps in understanding complex happenings.
• Statistics are widely used in business. Usage continues to increase as the business world becomes larger, more complex, and more quantitative.

Limitations of Statistics

• Statistics does not study individual observations. It is only concerned with groups of observations.
• Statistics deals with quantitative characteristics. It does not deal with qualitative characteristics such as beauty, honesty, sharpness, brightness, poverty, intelligence, etc.
• Statistical laws are true only on average.
• Statistics does not reveal the entire story.
• Statistics is only one of the methods of studying a problem.
• Statistics can be misused.
• Statistical data should be uniform and homogeneous.

Decision Making - Businesses

• Accounting: Public accounting firms use statistical sampling procedures when conducting audits for their clients.
• Economics: Economists use statistical information in making forecasts about the future of the economy or some aspect of it.
• Marketing: Electronic point-of-sale scanners at retail checkout counters are used to collect data for a variety of marketing research applications.
• Production: A variety of statistical quality control charts are used to monitor the output of a production process.
• Finance: Financial advisors use price-earnings ratios and dividend yields to guide their investment recommendations.

Uses & Abuses of Statistics

Most of the time, samples are used to infer something (draw conclusions) about the population. Occasionally, however, the conclusions are inaccurate or inaccurately portrayed, for the following reasons:

• The sample is too small. Even a large sample may not represent the population.
• Unqualified or biased sources give wrong information that the public will take as truth. One possibility is a company sponsoring statistical research to prove that the company is better.
• Visual aids may be correct but emphasize different aspects. Specific examples include graphs that do not start at zero, thus exaggerating small differences, and charts that misuse area to represent proportions.
• Precise statistics or parameters may incorrectly convey a sense of high accuracy.
• Misleading, unclear or incomplete information may be shared.

Misleading Statistical Presentation

These two graphs represent sales. Who has seen faster sales growth?

[Figure: two charts of sales over periods 1 to 31. The vertical axis of the first runs from 0 to 16000 in steps of 2000; the vertical axis of the second runs from 10000 to 14000 in steps of 500.]

These are actually the same numbers with different scales along the side.
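A rough sketch of why the two graphs look so different: the height drawn for a point is its value minus the value where the axis starts, so a truncated axis inflates the visible ratio. The two sales figures below are hypothetical (the plotted data did not survive in these slides), and `apparent_ratio` is an illustrative helper, not something from the original deck.

```python
def apparent_ratio(first, last, axis_start):
    """Ratio of the drawn heights of two values on an axis starting at axis_start."""
    return (last - axis_start) / (first - axis_start)

# Hypothetical sales figures, chosen only for illustration.
first_sale, last_sale = 12000, 13500

print(apparent_ratio(first_sale, last_sale, 0))      # axis from 0: true ratio
print(apparent_ratio(first_sale, last_sale, 10000))  # axis from 10000: exaggerated
```

With these assumed figures the real growth is 12.5%, but on the axis starting at 10000 the last point is drawn 75% taller than the first.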

Pictures can be misleading too

[Figure: two pictures, R1 and R2, drawn against a vertical axis from 0 to 1.]

How much more is the second?

It's twice as tall, but it's also twice as wide, which means 4 times the area. It can be misleading.

Avoid Sensationalism!!!

e.g.
• Violence statistic: "Yet another incident, doubling last year's incidents."
• Accident statistic: an accident on the first day of the year is projected to 365 a year (about 36 times last year's figure, when we had just 10 accidents in the whole year).

Branches of Statistics

The academic discipline of statistics can be divided into two major branches:
– Descriptive statistics
– Inferential statistics

Descriptive Statistics

Deals with summarizing and presenting data in a readable, easily understood form. It comprises the tabular, graphical, and numerical methods used to summarize data.

Techniques:
• Visualizing and Summarizing Data: Raw Data, Data Array, Distribution
• Characterizing Distributions with Numerical and Graphical Tools: Histogram, Ogive; Measures of Central Tendency: mean, median, mode; Measures of Dispersion: range, standard deviation, variance, etc.
• Exploring the Relationship between Two Variables: Scatter Diagrams, Correlation Coefficients, Frequency Tables

Inferential Statistics

Drawing conclusions about a population based on information from a sample.

Statistical inference is the process of using information obtained from analyzing a sample to make estimates about characteristics of the entire population. It is a discipline that allows us to estimate unknown quantities by making some elementary measurements. Using these estimates, we can then make predictions and forecast the future.

• Statistical Inference with Hypothesis Testing: null and alternative hypotheses, one-tailed vs. two-tailed tests, test statistics, p-value, statistical significance, decision rules
• The Concept of Risk and Power: risks involved, type I and II errors, confidence level and power of test
• Statistical Inference with Confidence Intervals: how it works, when to use it
• Equivalence of the Hypothesis Testing and the Confidence Interval Approaches
• Statistical Inference for a Single Sample or Group: Hypothesis Testing vs. Confidence Interval Approach

[Flowchart]

START
• Gathering of data
• Classification, summarization, and processing of data
• Presentation and communication of summarized information
• Is the information from a sample?
   – Yes: use sample information to make inferences about the population (Statistical Inference)
   – No: use census data to analyze the population characteristic under study (Descriptive Statistics)
• Draw conclusions about the population characteristic (parameter) under study
STOP

Population & Sample

[Figure: Population and Sample]

Population: The complete set of data elements is termed the population. It is the set of all items in a particular study.
Sample: A sample is a portion of a population selected for further analysis. It is a subset of the population.
Parameter: A parameter is a characteristic of the whole population.
Statistic: A statistic is a characteristic of the sample, presumably measurable.

Remember: Parameter is to Population as Statistic is to Sample.

Why Sample

• Less time consuming than a census
• Less costly to administer than a census
• More practical to administer than a census of the targeted population

Examples of sampling: surveys, opinion polls

Data

– Data are the facts and figures that are collected, summarized, analyzed, and interpreted. A collection of data is called a 'data set' and a single observation is called a 'data element'.
– Data can be further classified as qualitative (attribute) or quantitative (variable).
– Variables: weight, height, etc. There are two types, continuous and discrete.
   • A continuous variable can take any value within a given interval, e.g. weight: 50.0, 50.2, 50.5, 51.0, etc.
   • A discrete variable can take only isolated values, e.g. number of patients visiting a doctor: 50, 51, etc.
– Attributes: honesty, integrity, etc.

Data Types

Data
– Numerical (Quantitative)
   • Discrete
   • Continuous
– Categorical (Qualitative)

Primary Data

Data can be classified as primary data or secondary data.

Primary data are those collected for a specific purpose directly from the field, and hence are original in nature. They are collected by or on behalf of the person or persons who are going to make use of the data. Once the data have been collected, processed and published, they become secondary data for subsequent use by other people for other applications.

Methods for Primary Data Collection
• Direct Personal Interview
• Observations
• Indirect Oral Interviews
• Information from agents/correspondents
• Mailed Questionnaire Method

Secondary Data

Secondary data are numerical information that has already been collected by some agency for a specific purpose and is subsequently compiled from that source for use in other applications.

There are many advantages of using secondary data:
• It is inexpensive.
• A large quantity of data is available from a wide range of sources.
• The data may be available for many years, so we can understand the trend and may forecast future values.

Data Sources

Primary (Data Collection)
• Observation
• Survey
• Experimentation

Secondary (Data Compilation)
• Print or Electronic


Descriptive Statistics


Data Processing Techniques

• Raw Data
• Data Array
• Discrete Frequency Distribution
• Continuous Frequency Distribution

Raw Data & Data Array

Raw Data:
• Information before it is arranged and analysed is raw data. It is called raw because it has not been processed by any statistical method.

Data Array:
• It involves arranging the values in either ascending or descending order.

Example: see Numerical 1.

Numerical 1 – Data Array

Raw Data

14  26   2  34   8  13  27  37   9  12
39  42  45  30  32  24  24  30  20  23
14  18  30  33  24  34  30  10  22  14

Prepare the data array.

Data Array (ascending order)

 2   8   9  10  12  13  14  14  14  18
20  22  23  24  24  24  26  27  30  30
30  30  32  33  34  34  37  39  42  45
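As a minimal sketch, the data array for Numerical 1 can be produced by sorting the raw values. The variable names here are illustrative only.

```python
# Raw marks from Numerical 1, in the order given on the slide.
raw_data = [14, 26, 2, 34, 8, 13, 27, 37, 9, 12,
            39, 42, 45, 30, 32, 24, 24, 30, 20, 23,
            14, 18, 30, 33, 24, 34, 30, 10, 22, 14]

# A data array is simply the raw data arranged in order.
ascending = sorted(raw_data)
descending = sorted(raw_data, reverse=True)

print(ascending)
print("smallest:", ascending[0], "largest:", ascending[-1])
```

Once the data are arrayed, the smallest value (2) and the largest value (45) can be read off directly from the two ends.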

Discrete Distribution

• In a discrete frequency distribution, after arranging the values in ascending order, we count the frequency, i.e. the number of times each value appears in the data set, using tally marks.
• A discrete distribution is also known as an ungrouped FD.
• Example: see Numerical 2.

Numerical 2 - Discrete FD

Marks   Tally Marks   Frequency
2       |             1
8       |             1
9       |             1
10      |             1
12      |             1
13      |             1
14      |||           3
18      |             1
20      |             1
22      |             1
23      |             1
24      |||           3
26      |             1
27      |             1
30      ||||          4
32      |             1
33      |             1
34      ||            2
37      |             1
39      |             1
42      |             1
45      |             1
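The tally-mark counting above can be sketched with `collections.Counter` from the Python standard library. The printed layout is only an approximation of the slide's table.

```python
from collections import Counter

# Marks from Numerical 1 (the same 30 observations).
marks = [14, 26, 2, 34, 8, 13, 27, 37, 9, 12,
         39, 42, 45, 30, 32, 24, 24, 30, 20, 23,
         14, 18, 30, 33, 24, 34, 30, 10, 22, 14]

# Count how many times each distinct value appears.
frequency = Counter(marks)

# Print one row per distinct value, in ascending order, with tally marks.
for value in sorted(frequency):
    count = frequency[value]
    print(f"{value:<6}{'|' * count:<6}{count}")
```

The frequencies should add up to the number of observations (30), which is a quick check on the table.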

Continuous Frequency Distribution

• In this, all the values are classified into groups or classes; hence this type of distribution is known as a grouped or continuous frequency distribution.

Key terms:
• Class Limits
• Class Interval
• Class Frequency
• Class Mid Point or Class Mark

Class Limits

The two boundaries of a class are known as class limits. The class limits are the lowest and the highest values that can be included in the class, e.g. in the class 10-20, 10 is the lower limit and 20 is the upper limit.

• The lower limit of the class is the value below which no observation can be included in the class.
• The upper limit of the class is the value above which no observation can be included in the class.

Class Interval

The difference between the upper limit and the lower limit of a class is known as the class interval (CI) or class width of that class, e.g. class 10-20 has a CI of 10.

If the number of classes is not given, it can be determined using Sturges' formula:

No of Classes (K) = 1 + 3.322 log N

where N is the total number of observations and log is the base-10 logarithm.
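A minimal sketch of Sturges' formula in Python. The helper name `sturges_classes` is hypothetical, and rounding the result up to a whole number of classes is an assumption made here.

```python
import math

def sturges_classes(n):
    """Suggested number of classes K = 1 + 3.322 * log10(n).

    Rounding up to a whole class is an assumption of this sketch.
    """
    return math.ceil(1 + 3.322 * math.log10(n))

print(sturges_classes(30))  # 1 + 3.322 * 1.477 = 5.9, so 6 classes
```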

Class Interval

Formula for the class interval:

Class Interval (i) = (Next unit value after the largest value in the data – Smallest value in the data)/No of Classes

e.g. if the marks of 30 students range between 10 and 40 and we want to divide them into 3 classes, then Class Interval (i) = (41 – 10)/3 = 10.33, i.e. 11 after rounding up.

The classes become 10-21, 21-32, 32-43.
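The class-interval arithmetic can be sketched as follows. The helper names and the `unit` parameter (the step to the "next unit value", taken as 1 for whole-number marks) are assumptions for illustration.

```python
import math

def class_interval(smallest, largest, k, unit=1):
    """(next unit value after largest - smallest) / k, rounded up."""
    return math.ceil((largest + unit - smallest) / k)

def exclusive_classes(smallest, width, k):
    """Build k classes in which each upper limit is the next lower limit."""
    return [(smallest + j * width, smallest + (j + 1) * width) for j in range(k)]

i = class_interval(10, 40, 3)
print(i)                            # (41 - 10) / 3 = 10.33, rounded up to 11
print(exclusive_classes(10, i, 3))  # [(10, 21), (21, 32), (32, 43)]
```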

Cell Nomenclature

[Figure: a cell (class) labelled with its cell interval (i), midpoint and upper boundary.]

Exclusive / Inclusive Method

There are two methods of classifying data according to class intervals.

Exclusive Method: The class intervals are fixed so that the upper limit of one class is the lower limit of the next class. In other words, in the exclusive method the upper limit is excluded from its class, e.g. 10-20, 20-30, 30-40, etc. This is more suitable for a continuous variable.

Inclusive Method: The upper limits are included in the class, e.g. 10-19, 20-29, 30-39, etc. This is more suitable for a discrete variable.

Correction Factor = (Lower Limit of 2nd Class – Upper Limit of 1st Class)/2

Correction Factor

In the inclusive type, to get the correct CI we add the correction factor to the upper limit of each class and subtract it from the lower limit of each class.

Correction Factor = (Lower Limit of 2nd Class – Upper Limit of 1st Class)/2

e.g. for the classes 10-19, 20-29, etc.: Correction factor = (20 – 19)/2 = 0.5. Hence the class 10-19 becomes 9.5-19.5 and the CI becomes 10.

Inclusive to exclusive

Inclusive Type   Exclusive Type
10-14            9.5-14.5
15-19            14.5-19.5
20-24            19.5-24.5
25-29            24.5-29.5
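The inclusive-to-exclusive conversion can be sketched as below, using the classes 10-14, 15-19, 20-24, 25-29. The variable names are illustrative only.

```python
# Inclusive classes as (lower limit, upper limit) pairs.
inclusive = [(10, 14), (15, 19), (20, 24), (25, 29)]

# Correction factor = (lower limit of 2nd class - upper limit of 1st class) / 2
correction = (inclusive[1][0] - inclusive[0][1]) / 2

# Subtract the factor from every lower limit and add it to every upper limit.
exclusive = [(low - correction, high + correction) for low, high in inclusive]

print(correction)  # (15 - 14) / 2 = 0.5
print(exclusive)
```

After the correction, each upper limit coincides with the next lower limit, and every class has a width of 5.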

Constructing FD

Step 1: Decide on the type (inclusive / exclusive) and the number of classes for dividing the data, using Sturges' formula. (If these are given in the numerical, go to Step 2 directly.)
Step 2: Sort the data into the different classes and count the frequency.
Step 3: Illustrate the data in a chart.

Numerical 3 – Continuous FD

Step 1: Calculate the number of classes (Sturges' formula).
No of Classes (K) = 1 + 3.322 log N = 1 + 3.322 log 30 = 1 + 3.322 (1.477) = 5.9, i.e. 6

Step 2: Sort the data points into classes and count the number of points in each class. We have K = 6.
Class interval width = (Next unit value after largest value – Smallest value)/K = (46 – 2)/6 = 44/6 = 7.33, i.e. approx. 8

Hence the classes shall be 2-9, 10-17, 18-25, 26-33, 34-41, 42-49.

Numerical 3 – Continuous FD

Class     Tally Marks   Frequency
2-9       |||           3
10-17     ||||| |       6
18-25     ||||| ||      7
26-33     ||||| |||     8
34-41     ||||          4
42-49     ||            2
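A minimal sketch of Step 2 for Numerical 3: each mark is placed in one of the six inclusive classes of width 8 starting at 2. The variable names are illustrative only.

```python
# Marks from Numerical 1 (30 observations).
marks = [14, 26, 2, 34, 8, 13, 27, 37, 9, 12,
         39, 42, 45, 30, 32, 24, 24, 30, 20, 23,
         14, 18, 30, 33, 24, 34, 30, 10, 22, 14]

lowest, width, k = 2, 8, 6

# Inclusive classes: 2-9, 10-17, 18-25, 26-33, 34-41, 42-49.
classes = [(lowest + j * width, lowest + (j + 1) * width - 1) for j in range(k)]

# Count the observations falling in each class.
frequencies = [0] * k
for mark in marks:
    frequencies[(mark - lowest) // width] += 1

for (low, high), count in zip(classes, frequencies):
    print(f"{low}-{high}: {count}")
```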

Numerical 4 – Home Assignment

The following data set represents the km per litre of 40 similar motorcycles.

40.5, 39.7, 40.6, 39.9, 40.9, 38.9, 41.4, 40.5, 41.0, 38.8,
39.6, 40.4, 39.9, 40.2, 40.8, 40.7, 40.6, 41.7, 40.8, 39.1,
40.1, 40.7, 40.1, 40.7, 40.7, 39.8, 39.3, 39.6, 40.5, 41.3,
41.0, 39.9, 40.4, 40.9, 40.1, 41.2, 40.2, 40.0, 39.4, 40.6

Construct the frequency distribution for this data, taking classes as 38.5-39.0, 39.0-39.5, etc.

Numerical 4 – Home Assignment

Classes      Tally Marks         Frequency
38.5-39.0    ||                  2
39.0-39.5    |||                 3
39.5-40.0    ||||| ||            7
40.0-40.5    ||||| |||           8
40.5-41.0    ||||| ||||| ||||    14
41.0-41.5    |||||               5
41.5-42.0    |                   1
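One way to check the home assignment is the sketch below, which applies the exclusive method (the upper limit belongs to the next class). Converting each reading to whole tenths before classifying is a choice made here to avoid floating-point trouble at the class boundaries; it is not part of the original slides.

```python
mileage = [40.5, 39.7, 40.6, 39.9, 40.9, 38.9, 41.4, 40.5, 41.0, 38.8,
           39.6, 40.4, 39.9, 40.2, 40.8, 40.7, 40.6, 41.7, 40.8, 39.1,
           40.1, 40.7, 40.1, 40.7, 40.7, 39.8, 39.3, 39.6, 40.5, 41.3,
           41.0, 39.9, 40.4, 40.9, 40.1, 41.2, 40.2, 40.0, 39.4, 40.6]

# Work in integer tenths of a km per litre: 38.5 -> 385, class width 0.5 -> 5.
start, width, k = 385, 5, 7

frequencies = [0] * k
for reading in mileage:
    tenths = round(reading * 10)
    # Exclusive method: a value equal to an upper limit falls in the next class.
    frequencies[(tenths - start) // width] += 1

for j, count in enumerate(frequencies):
    low = (start + j * width) / 10
    high = (start + (j + 1) * width) / 10
    print(f"{low:.1f}-{high:.1f}: {count}")
```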