D-u2[1]. Arranging Data

  • Uploaded by: swesi
  • June 2020
  • PDF

Why Statistics: The complexity of real situations makes decision making difficult. Statistics provides methods for collecting, presenting, analyzing, and meaningfully arranging data.

Types of situations:
• When data need to be presented in a form that allows easy grouping (graphs, charts, tables)
• When a hypothesis has to be tested and inferences drawn
• When unknown quantities are to be estimated from observed data
• When a decision about a course of action has to be made under uncertainty

Statistics
• Descriptive: data collection and presentation
• Inductive: statistical inferences; hypothesis testing and inferences (e.g. regression, correlation)
• Statistical Decision Theory: decision problems, alternatives, uncertainties, criterion of choices

ARRANGING DATA

Learning Goals
• MEANING OF DATA
• TYPES OF DATA
• DATA COLLECTION
• DATA PRESENTATION DEVICES

MEANING OF DATA

• Data is a collection of related observations, facts or figures.
• A collection of data is called a data set, and each observation a data point.

TYPES OF DATA
• PRIMARY DATA
• SECONDARY DATA

DATA COLLECTION

The following questions can be posed to test the validity of the data:
• Where does the data originate from?
• Is the source reliable?
• Does the data support or contradict the previous decisions?
• Are the conclusions derived from the data?
• What is the size of the sample?
• Does the sample represent the entire population under consideration for decision making?

METHODS OF COLLECTING DATA
• COMPLETE ENUMERATION
• SAMPLE METHOD

CLASSIFICATION OF DATA
• GEOGRAPHICAL
• CHRONOLOGICAL
• QUALITATIVE
• BY MAGNITUDE

TABULAR PRESENTATION OF DATA

OBJECTIVES are:
• To condense complex data
• To show a trend
• To display huge volumes of data in less space
• To highlight key characteristics of data
• To facilitate comparison of data elements
• To help decision making using statistical methods
• To serve as reference for future decisions

PARTS OF AN IDEAL TABLE
• Table number: acts as an identity for the table.
• Title: gives an idea about the nature of the data in the table.
• Captions: the headings given to the vertical columns, which explain the mode of classification, i.e. time, quantity, region etc.
• Stubs: the headings that explain the basis for classifying the rows.
• Body: the data posted in rows and columns, where the row and column headings explain the data.
• Footnote: any other information needed to explain the data in the table.
• Source: the source of the information.





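Table 1.1: Product wise Sales

           Year wise Sales
Product    2001    2002    2003    2004
P1           40      45      40      50
P2           15      20      22      30
P3           20      30      40      50

Source: Economics Time, 22nd Feb. 2005

In this example, "Table 1.1" is the table number, "Product wise Sales" is the title, the year headings are the captions, the product names P1 to P3 are the stubs (headings of the rows), and the sales figures form the body of the table.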
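As a rough illustration of how the parts of a table relate to each other, the sketch below stores Table 1.1 with the stubs as keys and the body values listed under the captions. The helper name `render_table` and the yearly totals are assumptions added for illustration; they are not part of the original slides.

```python
# Table 1.1 held as: captions (column headings), stubs (row headings), body (values).
captions = [2001, 2002, 2003, 2004]
body = {
    "P1": [40, 45, 40, 50],
    "P2": [15, 20, 22, 30],
    "P3": [20, 30, 40, 50],
}


def render_table(title, captions, body):
    """Return the table as aligned plain-text lines (hypothetical helper)."""
    header = f"{'Product':<8}" + "".join(f"{c:>6}" for c in captions)
    rows = [
        f"{stub:<8}" + "".join(f"{value:>6}" for value in values)
        for stub, values in body.items()
    ]
    return "\n".join([title, header, *rows])


table_text = render_table("Table 1.1: Product wise Sales", captions, body)
print(table_text)

# Column totals: one comparison the tabular layout makes easy.
yearly_totals = [sum(column) for column in zip(*body.values())]
print(dict(zip(captions, yearly_totals)))
```

Keeping captions and stubs separate from the body mirrors the table anatomy listed above, so the same data can be re-rendered or summed by row or column.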

GRAPHICAL PRESENTATION OF DATA
• LINE CHARTS
• BAR CHARTS
• PIE CHARTS
• PICTOGRAMS
• SCATTER DIAGRAMS

LINE CHART

[Figure 1. Line graph: SALES (vertical axis, 0 to 500) plotted by year, 1990 to 1997]

BAR CHARTS

[Figure: bar chart of EXPORTS and IMPORTS (vertical axis, 0 to 4500) by year, starting from 1995]

ARRANGING DATA

• PIE CHART
• HISTOGRAMS
• FREQUENCY POLYGONS
• SKEWNESS
• KURTOSIS

PIE DIAGRAMS

[Figure: pie diagram with segments for Indian Promoters, Indian institutions / mutual funds, FIIs, and Public]

HISTOGRAMS

The histogram graphically shows the following:
• center (i.e., the location) of the data;
• spread (i.e., the scale) of the data;
• skewness of the data;
• presence of outliers; and
• presence of multiple modes in the data.

A HISTOGRAM can be thought of as a set of "sorting bins." You have one variable, and you sort the data by this variable by placing each observation into a "bin." Then you count how many pieces of data are in each bin. The height of the rectangle you draw on top of each bin is proportional to the number of pieces in that bin.

In a bar graph, on the other hand, you have several measurements of different items, and you compare them.

The main question a histogram answers is: "How many measurements are there in each of the classes of measurements?" The main question a bar graph answers is: "What is the measurement for each item?"

Situation, followed by: Bar Graph or Histogram?

• Situation: We want to compare total revenues of five different companies.
  Bar graph. Key question: What is the revenue for each company?

• Situation: We have measured revenues of several companies. We want to compare numbers of companies that make from 0 to 10,000; from 10,000 to 20,000; from 20,000 to 30,000 and so on.
  Histogram. Key question: How many companies are there in each class of revenues?

• Situation: We want to compare heights of ten oak trees in a city park.
  Bar graph. Key question: What is the height of each tree?

• Situation: We have measured several trees in a city park. We want to compare numbers of trees that are from 0 to 5 meters high; from 5 to 10; from 10 to 15 and so on.
  Histogram. Key question: How many trees are there in each class of heights?
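The "sorting bins" idea can be sketched in a few lines of Python. The tree heights below are hypothetical values invented for illustration, and `histogram_counts` is an assumed helper name; the classes follow the 0 to 5, 5 to 10, 10 to 15 meter scheme from the tree example, with each class including its lower limit and excluding its upper limit.

```python
from collections import Counter


def histogram_counts(values, bin_width):
    """Count how many values fall into each bin [k*width, (k+1)*width)."""
    counts = Counter(int(value // bin_width) for value in values)
    return {
        (k * bin_width, (k + 1) * bin_width): counts[k]
        for k in sorted(counts)
    }


# Hypothetical tree heights in meters (not data from the slides).
heights = [3.2, 4.8, 6.1, 7.5, 9.9, 10.0, 12.4, 14.0, 2.1, 8.3]

bins = histogram_counts(heights, 5)
for (low, high), count in bins.items():
    # One '#' per tree, so the bar length is proportional to the bin count.
    print(f"{low:>2} to {high:<2} m: {'#' * count}")
```

The output answers the histogram question (how many trees are in each class of heights), not the bar graph question (how tall each individual tree is).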

FREQUENCY POLYGONS

[Figure: "Less than" ogive of the distribution of 50 employees. Vertical axis: cumulative frequency (0 to 60). Horizontal axis: class upper limits <25, <30, <35, <40, <45, <50, <55, <60]
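A "less than" ogive plots, for each class upper limit, the number of observations below that limit. The sketch below shows how such cumulative frequencies could be built. The class frequencies are hypothetical (the slides give only the chart, not the underlying counts); only the upper limits and the total of 50 employees come from the figure.

```python
from itertools import accumulate

# Class upper limits taken from the horizontal axis of the ogive.
upper_limits = [25, 30, 35, 40, 45, 50, 55, 60]

# Hypothetical class frequencies, chosen only so that they total 50 employees.
frequencies = [2, 5, 8, 12, 10, 7, 4, 2]

# Running total: number of employees below each upper limit.
cumulative = list(accumulate(frequencies))

for limit, total in zip(upper_limits, cumulative):
    print(f"<{limit}: {total}")
```

The last cumulative value always equals the total number of observations, which is why the curve in the figure ends at 50.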

SKEWNESS

Skewness is a measure of symmetry, or more precisely, the lack of symmetry. A distribution, or data set, is symmetric if it looks the same to the left and right of the center point.

• A curve is said to be skewed when the values in the frequency distribution are concentrated more towards the left or right side of the curve, i.e. the values are not equally distributed from the centre of the curve.
• A curve is said to be positively skewed when the tail of the curve is more stretched towards the right side.
• It is said to be negatively skewed when the tail is more stretched towards the left side.
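The slides do not give a formula for skewness. One common choice, assumed here, is the moment coefficient of skewness, m3 / m2^(3/2), where m2 and m3 are the second and third central moments. The sketch below applies it to three small made-up data sets to show the sign convention described above.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population moments)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Made-up data sets for illustration only.
symmetric = [1, 2, 3, 4, 5]       # looks the same on both sides of the center
right_tailed = [1, 2, 2, 3, 10]   # tail stretched towards the right
left_tailed = [1, 8, 9, 9, 10]    # tail stretched towards the left

for name, data in [("symmetric", symmetric),
                   ("right-tailed", right_tailed),
                   ("left-tailed", left_tailed)]:
    print(f"{name}: {skewness(data):+.3f}")
```

A value near zero indicates symmetry, a positive value a longer right tail, and a negative value a longer left tail.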

KURTOSIS
• KURTOSIS is the degree of peakedness of a distribution of points.
• Two curves with the same central location and dispersion may have different degrees of kurtosis.
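To make the second point concrete, the sketch below builds two made-up data sets with the same mean and the same variance but different kurtosis. The measure used, m4 / m2^2 (fourth central moment over the squared second central moment), is an assumption; the slides do not say which kurtosis measure they have in mind.

```python
def central_moment(values, order):
    """Population central moment of the given order."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** order for x in values) / n


def kurtosis(values):
    """Moment-based kurtosis: m4 / m2**2."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2


# Made-up data: both sets have mean 0 and variance 4.
flat = [-2, -2, -2, -2, 2, 2, 2, 2]     # values spread evenly away from the center
peaked = [-4, 0, 0, 0, 0, 0, 0, 4]      # most values at the center, two far out

for name, data in [("flat", flat), ("peaked", peaked)]:
    mean = sum(data) / len(data)
    variance = central_moment(data, 2)
    print(f"{name}: mean={mean}, variance={variance}, kurtosis={kurtosis(data)}")
```

Both sets agree on location and dispersion, yet the second one is far more peaked, which is exactly the distinction kurtosis is meant to capture.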

MEASURES OF CENTRAL TENDENCY
• Objectives of Averaging
• Requisites of a Good Average
• Types of Averages
• Mathematical Averages
• Positional Averages

CENTRAL TENDENCY

The tendency of the data to cluster around a central value is known as CENTRAL TENDENCY, and the corresponding numerical measure of this tendency is known as a measure of central tendency.

The average is of great significance because it depicts the characteristics of the whole group. Since an average represents the entire data, its value lies somewhere between the two extremes, i.e. the largest and the smallest items. For this reason an average is frequently referred to as a measure of central tendency.

MAIN OBJECTIVES
• To find out one value that represents the whole mass of data.
• To facilitate comparison.
• To establish relationship.
• To derive inference about a universe from a sample.
• To aid decision making.

Requisites of a Good Average
• It should be rigidly defined.
• It should be mathematically expressed.
• It should be readily comprehensible and easy to calculate.
• It should be calculated based on all the observations.
• It should be least affected by extreme fluctuations in sampling data.
• It should be suitable for further mathematical treatment.

Types of Averages

AVERAGES
• Mathematical Averages
  • Arithmetic Mean (A.M.)
  • Geometric Mean (G.M.)
  • Harmonic Mean (H.M.)
• Positional Averages
  • Median (Md)
  • Mode (Mo)
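As a quick orientation, the five averages named above can all be computed with Python's standard `statistics` module. The data set is a made-up example, not taken from the slides.

```python
import statistics

# Made-up observations for illustration only.
data = [2, 4, 4, 8]

# Mathematical averages: computed from every observation.
arithmetic_mean = statistics.mean(data)
geometric_mean = statistics.geometric_mean(data)
harmonic_mean = statistics.harmonic_mean(data)

# Positional averages: determined by position or frequency in the data.
median = statistics.median(data)
mode = statistics.mode(data)

print(f"A.M. = {arithmetic_mean}")
print(f"G.M. = {geometric_mean:.4f}")
print(f"H.M. = {harmonic_mean:.4f}")
print(f"Md   = {median}")
print(f"Mo   = {mode}")
```

For positive data the three mathematical averages are always ordered H.M. ≤ G.M. ≤ A.M., which this example also shows.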

ARITHMETIC MEAN
• The ratio obtained by dividing the sum of the observations by the total number of observations is known as the ARITHMETIC MEAN.
• The arithmetic mean is represented by the notation $\bar{X}$ (read "X-bar").

CALCULATING THE MEAN FROM UNGROUPED DATA

The mean $\bar{X}$ of a collection of observations $x_1, x_2, \dots, x_n$ is given by:

$$\bar{X} = \frac{1}{n}\,(x_1 + x_2 + \dots + x_n) = \frac{\sum x}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

In statistics the collection of all the elements under study is called a POPULATION, whereas a collection of some (but not all) of the elements under study is called a SAMPLE. It is necessary to distinguish whether we are considering a population or a sample because certain formulas, like those for computing the standard deviation of a population, are different from those for computing the standard deviation of a sample. Hence the population mean is denoted by $\mu$ and the sample mean by $\bar{X}$:

$$\mu = \frac{\text{sum of all the data points in the population}}{\text{size of the population}}$$

$$\bar{X} = \frac{\text{sum of all the data points in the sample}}{\text{size of the sample}}$$

The following table gives the annual profits of 10 financial services companies for the year 2007-2008. Calculate the arithmetic mean profit of the companies.

Companies    Net Profit (Rs. crore)
A             9.19
B             4.27
C             1.74
D             5.71
E             4.80
F             4.01
G             9.22
H             3.00
I            15.16
J             3.93
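The exercise can be worked through directly with the ungrouped-data formula (sum of the observations divided by their number). The sketch below uses the ten profit figures from the table; the variable names are illustrative.

```python
import math

# Net profit (Rs. crore) for companies A to J, from the table above.
profits = {
    "A": 9.19, "B": 4.27, "C": 1.74, "D": 5.71, "E": 4.80,
    "F": 4.01, "G": 9.22, "H": 3.00, "I": 15.16, "J": 3.93,
}

total_profit = math.fsum(profits.values())   # sum of the observations
n = len(profits)                             # number of observations
mean_profit = total_profit / n               # arithmetic mean

print(f"Sum of profits = {total_profit:.2f} Rs. crore")
print(f"Mean profit    = {mean_profit:.3f} Rs. crore")
```

The sum of the ten profits is 61.03, so the arithmetic mean profit is 61.03 / 10 = 6.103 Rs. crore.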

CALCULATION FOR GROUPED DATA

Discrete Series:

$$\bar{X} = \frac{\sum f x}{\sum f}$$

E.g. In a survey of 50 chemical industries, the following data was obtained:

Xi = Level of profit (Rs. lakh) earned during 2002-2003
fi = No. of companies that earned Xi amount of profit

Xi       fi     Xi fi
20       12       240
16       15       240
24        8       192
25        7       175
31        8       248
TOTAL    50      1095
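The totals in the table are all the discrete-series formula needs: the mean is the sum of the Xi fi column divided by the sum of the fi column. A short sketch with the figures from the table, using illustrative variable names:

```python
# Profit levels (Rs. lakh) and the number of companies at each level.
profit_levels = [20, 16, 24, 25, 31]   # Xi
company_counts = [12, 15, 8, 7, 8]     # fi

# Xi * fi for every row, as in the third column of the table.
products = [x * f for x, f in zip(profit_levels, company_counts)]

sum_fx = sum(products)          # total of the Xi fi column
sum_f = sum(company_counts)     # total number of companies
grouped_mean = sum_fx / sum_f   # mean = sum(f x) / sum(f)

print(f"Sum of f*x = {sum_fx}")
print(f"Sum of f   = {sum_f}")
print(f"Mean       = {grouped_mean} Rs. lakh")
```

With the table's totals this gives 1095 / 50 = 21.9 Rs. lakh as the mean level of profit.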

USES OF A.M.
• The mean is the simplest average to understand and easy to compute.
• It is relatively reliable in the sense that it does not vary too much when repeated samples are taken from one and the same population, at least not as much as other kinds of statistical descriptions.
• The mean is typical in the sense that it is the centre of gravity, balancing the values on either side of it.

Advantages and Disadvantages of A.M.
+ Its concept is familiar and clear to all.
+ It is easy to understand and easy to calculate.
+ It provides a good basis for comparison.
- It may be affected by highly fluctuating (extreme) values that lie far from the other values of the group.
- It is very difficult to find the actual mean.
- Calculation of the mean for a data set with open-ended classes is not possible.
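The sensitivity of the mean to extreme values can be seen in the company profit figures used earlier, where company I (15.16) lies well above the rest. The sketch below compares the mean with and without that value, and with the median, a positional average. Treating 15.16 as the "extreme" value is a choice made here for illustration only.

```python
import statistics

# Net profit (Rs. crore) of the ten companies from the earlier example.
profits = [9.19, 4.27, 1.74, 5.71, 4.80, 4.01, 9.22, 3.00, 15.16, 3.93]

# Same data with the largest value (company I) left out.
without_largest = [p for p in profits if p != max(profits)]

mean_all = statistics.mean(profits)
mean_without_largest = statistics.mean(without_largest)
median_all = statistics.median(profits)

print(f"Mean, all companies:       {mean_all:.3f}")
print(f"Mean, largest value left out: {mean_without_largest:.3f}")
print(f"Median, all companies:     {median_all:.3f}")
```

Dropping a single observation moves the mean by about one crore, while the median of the full data set sits below both means, which illustrates why extreme values are listed as a weakness of the arithmetic mean.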
