Collecting Data 2

  • Uploaded by: api-3818523
  • November 2019
  • PDF


More details

  • Words: 1,618
  • Pages: 38
STX 1110 INTRODUCTION TO QUANTITATIVE METHODS
LECTURE 2
COLLECTING DATA 2

CONTENTS
1. Main Stages in a Statistical Investigation
2. Data
   - Overview
   - Definitions
3. Survey (Data Collection Method)
   - Interviews
   - Postal Questionnaire
4. Survey Guidelines
   - Questionnaire Design
   - Pilot Survey
   - Errors in Surveying
   - Validity and Precision

MAIN STAGES IN A STATISTICAL INVESTIGATION
1. Pose a question
2. Collect relevant data
3. Summarise and present the data
4. Analyse and interpret the results

OVERVIEW OF DATA
Types of data:
• Attributes (categorical): Nominal*, Ordinal*
• Variables (numerical): Discrete, Continuous

Data sources and examples of collection methods:
• Primary data: Survey (Interview, Postal Questionnaire)
• Secondary data: Document extraction (Published Statistics, Annual Report)

* Not under the syllabus of this Module.

DEFINITIONS
Data (the raw material of statistics) is simply a scientific term for facts, figures, information and measurements, both numerical and non-numerical. Collected data are stored as variables and may be qualitative or quantitative in nature.

Attribute (Categorical or Qualitative) is something an object either has or does not have. E.g. gender (male or female); blood group (A; AB; B; O); T-shirt size (S; M; L; XL). Qualitative data describe characteristics that cannot be measured.

DEFINITIONS (Cont’d)
Numerical (Quantitative) is something that can be measured or counted. E.g. children (number); height (in cm); weight (in kg).

Discrete variables are represented by whole numbers only (mainly counts). E.g. number of children.

Continuous variables may take on any value and are typically measured rather than counted. E.g. distance; height.

DEFINITIONS (Cont’d)
Primary data are data collected specifically for the purpose of the survey being conducted.

Secondary data are data which have already been collected elsewhere, for some other purpose, but which can be used or adapted for the survey being conducted. E.g. financial figures extracted from a published annual report.

SURVEY (DATA COLLECTION METHOD)
In the absence of suitable secondary data, primary data will be generated through a survey or an experiment.

A survey is conducted through:
• Observation (observe people’s behaviours)
• Questionnaire (ask people questions), by means of:
  - Interviews
  - Postal Questionnaire

INTERVIEWS
Face to Face (Personal) Interviews
Advantages:
• High response rate
• More reliable in general
Disadvantages:
• Time consuming
• High cost
• Interviewer bias
• Respondents might not talk freely

Telephone Interviews
Advantages:
• Rapid response
• Lower cost
Disadvantages:
• Some people do not have a telephone
• Higher refusal rate
• Respondents might not talk freely

POSTAL QUESTIONNAIRE
Advantages:
• Cheap and easy to organise
• No interviewer bias
• Respondents might express themselves more freely
Disadvantages:
• Low response rate
• No clarification of respondents’ doubts is possible

A reasonable expectation for the response rate of a survey is generally 20%.
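The 20% response rate has a direct planning consequence: to end up with a given number of completed questionnaires, roughly five times as many must be posted. The sketch below is only an illustration; the function name `questionnaires_to_send` and the target of 100 replies are assumptions, not part of the lecture.

```python
def questionnaires_to_send(target_responses: int, response_rate_percent: int = 20) -> int:
    """Estimate how many questionnaires to post to expect `target_responses` replies.

    Uses integer ceiling division so the result is always rounded up.
    """
    if not 0 < response_rate_percent <= 100:
        raise ValueError("response_rate_percent must be between 1 and 100")
    return (target_responses * 100 + response_rate_percent - 1) // response_rate_percent


# Hypothetical target: 100 completed questionnaires at the 20% rate quoted above.
needed = questionnaires_to_send(100)
print(f"Post about {needed} questionnaires to expect 100 replies.")
```

The estimate is only as good as the assumed rate; a pilot survey can give a better figure for a particular population.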

QUESTIONNAIRE DESIGN
Questions should be:
• as short as possible
• simple, easy and clear (unambiguous)
• free of technical jargon
• arranged in a logical sequence
• not offensive or leading
• free of calculations or tests of memory
• closed where possible, with answer categories, rather than open
• relevant to the survey

PILOT SURVEY
A pilot survey pre-tests the questionnaire, i.e. trials it on a few respondents before it is used to collect the required data.

Revise the questionnaire if any problems are discovered in the pilot survey. This may save a lot of time and cost later.

The final version of the questionnaire is then used to gather the required data.

ERRORS IN SURVEYING
• Sampling Error - Arises when the sample selected is not representative of the population.
• Response Error - Occurs when respondents are unable to respond (perhaps because they could not understand the questions) or answer incorrectly.
• Non-Response Error - Occurs when respondents refuse to take part in the survey.

VALIDITY AND PRECISION
Data Quality:
• Validity - The data obtained in the survey should be relevant, i.e. related to the objectives of the survey.
• Precision - The data obtained in the survey should be reliable and accurate.
  - The precision with which data are recorded can affect calculations and cause rounding errors.
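The point about recording precision can be illustrated with a small calculation. In this sketch the four heights are made-up values chosen for illustration only; it compares a total computed from measurements recorded to one decimal place with the total obtained when each measurement is first rounded to the nearest centimetre.

```python
# Hypothetical heights in cm, recorded to one decimal place (illustrative only).
heights_cm = [170.4, 165.4, 180.4, 175.4]

# Total using the values as recorded.
recorded_total = sum(heights_cm)

# Total if each height had been recorded only to the nearest whole centimetre.
rounded_total = sum(round(h) for h in heights_cm)

# Each value loses 0.4 cm when rounded, and the losses accumulate in the total.
rounding_error = recorded_total - rounded_total

print(f"Total from values to 1 d.p.:   {recorded_total:.1f} cm")
print(f"Total from whole-cm values:    {rounded_total} cm")
print(f"Accumulated rounding error:    {rounding_error:.1f} cm")
```

Here every value happens to round down, so the errors add up rather than cancel; in real data the effect is usually smaller but still present.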

SOME SURVEY QUESTIONS
• Do you often go to pubs and restaurants?
• Do you like Klinko coffee?
• How old are you?
• Are you angry about the government’s current plans to deal with housing?
• How much money do you have?
• How often do your parents visit the doctor?
• How did you travel to work today?

STX 1110 INTRODUCTION TO QUANTITATIVE METHODS
LECTURE 2
SUMMARISING AND PRESENTING DATA 1

CONTENTS
• Ways/Methods of Presenting Data
• Format of Tables, Charts and Graphs
• Use Percentages to Compare Counts
• Interpretation of Tables, Charts and Graphs
• Advantages and Disadvantages of Each Method of Presenting Data

WAYS/METHODS OF PRESENTING DATA
• Frequency Table or Frequency Distribution
• Cross Tabulation / Contingency Table
• Pie Chart
• Bar Chart
• Pareto Chart
• Pictogram
• Group Frequency Distribution (to be discussed in Week 3)
• Histogram (to be discussed in Week 3)
• Frequency Polygons (to be discussed in Week 3)
• Line Graph (to be discussed in Week 3)
• Stem and Leaf Display (to be discussed in Week 4)

FREQUENCY TABLE / FREQUENCY DISTRIBUTION
A tabular summary of a set of data showing the frequency (or number) of data items in each category.

Gender for a Workforce

Gender    Frequency    Relative Frequency    %
Male      12           0.8                   80
Female    3            0.2                   20
Total     15           1.0                   100

For discussion purposes
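The relative frequency and percentage columns follow directly from the counts: each frequency is divided by the total, then multiplied by 100. A minimal sketch of that arithmetic, using the workforce counts from the table; the function name `relative_frequency_table` is an assumption introduced for illustration.

```python
def relative_frequency_table(counts):
    """Map each category to (frequency, relative frequency, percentage)."""
    total = sum(counts.values())
    return {
        category: (n, n / total, 100 * n / total)
        for category, n in counts.items()
    }


# Counts taken from the "Gender for a Workforce" table.
workforce = {"Male": 12, "Female": 3}
table = relative_frequency_table(workforce)

print(f"{'Gender':<8}{'Freq':>6}{'Rel.':>8}{'%':>6}")
for category, (freq, rel, pct) in table.items():
    print(f"{category:<8}{freq:>6}{rel:>8.1f}{pct:>6.0f}")
```

Relative frequencies always sum to 1 and percentages to 100, which is a quick check on any frequency table.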

FREQUENCY TABLE / FREQUENCY DISTRIBUTION (Cont’d)
[Figure: Gender for a Workforce]

FREQUENCY TABLE / FREQUENCY DISTRIBUTION (Cont’d)
Exercise
Construct a frequency table for the number of children in a family, based on the following data obtained from 23 families:

0 1 2 0 3 0 1 1 0 2 3 2 1 1 2 4 3 2 2 2 1 0 3
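One way to check a hand-built table for this exercise is to let a short script do the tallying. The sketch below uses `collections.Counter` from the Python standard library to count how many families report each number of children; it is offered as a checking aid, not as the method taught in the lecture.

```python
from collections import Counter

# Number of children reported by each of the 23 families in the exercise.
children = [0, 1, 2, 0, 3, 0, 1, 1, 0, 2, 3, 2,
            1, 1, 2, 4, 3, 2, 2, 2, 1, 0, 3]

# Counter tallies how often each distinct value occurs.
frequency = Counter(children)

print(f"{'Children':<10}{'Frequency':>9}")
for value in sorted(frequency):
    print(f"{value:<10}{frequency[value]:>9}")
print(f"{'Total':<10}{sum(frequency.values()):>9}")
```

The total row should equal the number of families surveyed; if it does not, a value has been missed or counted twice.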


CROSS TABULATION / CONTINGENCY TABLE
A table showing data of two variables simultaneously, which reflects the relationship of the two tabulated variables.

Workforce by Gender and Marital Status

Marital Status    Male    Female    Total
Single            1       1         2
Married           10      2         12
Widowed           1       0         1
Total             12      3         15

For discussion purposes

CROSS TABULATION / CONTINGENCY TABLE (Cont’d)
A cross tabulation can be summarised by calculating percentages of the row or column totals.

If one variable (the explanatory variable) is believed to influence the other (the response variable), then one normally takes percentages of the totals for the explanatory variable.

CROSS TABULATION / CONTINGENCY TABLE (Cont’d)
Workforce by Gender and Marital Status

Marital Status    Male    Female    Total
Single            8%      33%       13%
Married           83%     67%       80%
Widowed           8%      0%        7%
Total             100%    100%      100%

Marital Status    Male    Female    Total
Single            50%     50%       100%
Married           83%     17%       100%
Widowed           100%    0%        100%
Total             80%     20%       100%

Percentages are rounded to the nearest whole number, so the entries in a column may not sum to exactly 100%.

Identify the explanatory variable and the response variable for each table.
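The two percentage tables differ only in which totals are used as the base. The sketch below recomputes both from the counts in the Workforce by Gender and Marital Status table; the function names `column_percentages` and `row_percentages` are assumptions introduced for illustration.

```python
# Counts from the "Workforce by Gender and Marital Status" table.
counts = {
    "Single":  {"Male": 1,  "Female": 1},
    "Married": {"Male": 10, "Female": 2},
    "Widowed": {"Male": 1,  "Female": 0},
}


def column_percentages(table):
    """Express each cell as a percentage of its column (gender) total."""
    columns = list(next(iter(table.values())))
    col_totals = {c: sum(row[c] for row in table.values()) for c in columns}
    return {
        r: {c: 100 * row[c] / col_totals[c] for c in columns}
        for r, row in table.items()
    }


def row_percentages(table):
    """Express each cell as a percentage of its row (marital status) total."""
    return {
        r: {c: 100 * n / sum(row.values()) for c, n in row.items()}
        for r, row in table.items()
    }


by_gender = column_percentages(counts)
by_status = row_percentages(counts)

for status in counts:
    cols = "  ".join(f"{by_gender[status][g]:5.1f}%" for g in ("Male", "Female"))
    rows = "  ".join(f"{by_status[status][g]:5.1f}%" for g in ("Male", "Female"))
    print(f"{status:<8} of gender total: {cols}   of status total: {rows}")
```

Printing one decimal place makes the effect of rounding to whole percentages visible.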


CROSS TABULATION / CONTINGENCY TABLE (Cont’d)
Example 1
[Table: Production shift against type of defect for a furniture manufacturing process]
[Figure: Comparison of type of defect by shift]

Example 2
[Table: Cross tabulation of the quality of a meal by price]
[Figure: Comparison of the quality of a meal by price]

PIE CHART
A pie chart is used to show pictorially the relative sizes of component elements of a total.

Production Costs of Two Factories

Component    Factory A    Factory B
Admin        5%           10%
Materials    35%          20%
Overheads    45%          20%
Labour       15%          50%

For discussion purposes

PIE CHART (Cont’d)
Pie charts are very good for comparing the relative sizes of elements of a total.

Disadvantages:
• The actual numbers or percentages associated with each category need to be presented on the diagram.
• They are not a very good presentation method if there are too many different categories.
• The impression they give is easily distorted, for example by presenting a three-dimensional pie chart.

BAR CHART
A chart in which quantities are shown in the form of bars.

3 main types:
• Simple bar chart
• Component bar chart, including Percentage component bar chart
• Multiple/Compound bar chart

BAR CHART (Cont’d)
A simple bar chart is a chart consisting of one or more bars, in which the length of each bar indicates the magnitude of the corresponding data item.

[Figure: Number of Computers Sold by Each Company. Simple bar chart; vertical axis: Frequency (0 to 14); companies: Apple, Compaq, Gateway, IBM, Packard Bell]

For discussion purposes

BAR CHART (Cont’d)
A component bar chart is a bar chart that gives a breakdown of each total into its components.

[Figure: Category of Beds in Each Hospital, shown as a component bar chart (vertical axis 0 to 250) and as a percentage component bar chart (vertical axis 0% to 100%). Hospitals: Foothills, General, Southern, Heathview, St Johns; components: Psychiatric, Medical, Surgical, Maternity]

For discussion purposes

BAR CHART (Cont’d)
A multiple/compound bar chart is a bar chart in which two or more separate bars are used to present sub-divisions of data.

[Figure: Analysis of Marital Status by Gender. Multiple bar chart; vertical axis: Count; horizontal axis: Gender (Female, Male); bars: Marital status (Unmarried, Married)]

For discussion purposes

PARETO CHART
Essentially a bar chart in which the categories are arranged in descending order of frequency, with the tallest bar at the left.

[Figure: Number of Computers Sold by Each Company. Pareto chart; vertical axis: Frequency (0 to 14); companies from left to right: Apple, Compaq, Packard Bell, IBM, Gateway]

For discussion purposes
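The only step that separates a Pareto chart from a simple bar chart is sorting the categories by frequency before drawing. A minimal text-based sketch follows; the sales figures are hypothetical values invented for illustration, since the actual counts behind the lecture's chart are not given.

```python
# Hypothetical sales counts (NOT the lecture's data), listed alphabetically
# as they would appear in a simple bar chart.
sales = {"Apple": 13, "Compaq": 12, "Gateway": 5, "IBM": 9, "Packard Bell": 11}

# Pareto ordering: sort categories by frequency, largest first.
pareto_order = sorted(sales.items(), key=lambda item: item[1], reverse=True)

# Draw a rough horizontal text chart, with the most frequent category first.
for company, sold in pareto_order:
    print(f"{company:<14}{'#' * sold} {sold}")
```

Sorting this way makes the most frequent categories stand out immediately, which is the purpose of the chart.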

PICTOGRAM
A form of visual presentation in which data are represented by pictures.

[Figure: Number of Chairs Sold by ABC Limited, 1997 to 2001. Pictogram; one symbol = 5000 chairs]

For discussion purposes

PICTOGRAM (Cont’d)
• A very elementary form of visual representation.
• Can be informative and more effective than other methods of presenting data to the general public.
• Not an accurate form of presentation.
• Provides a lot of scope for confusion or misleading interpretations of the data.
