1/16/2019

STATISTICS & PROBABILITY

Agung Nugroho, Ph.D.

Assessment:
1. Assignments (10%)
2. Lab work (15%)
3. Quizzes (15%)
4. Midterm exam, UTS (30%)
5. Final exam, UAS (30%)

Course Material

Week       Topic
Week 1     Introduction to statistics and probability
Week 2     Descriptive statistics
Week 3     Descriptive statistics and Excel lab
Week 4     Probability theory
Week 5     Discrete probability distributions
Week 6     Continuous probability distributions
Week 7     Sampling distributions & sampling techniques
Week 8     Midterm exam (UTS)
Week 9     Point estimation and confidence intervals
Week 10    Hypothesis testing
Week 11    Hypothesis testing 2
Week 12    Regression and correlation analysis
Week 13    Regression and correlation analysis
Week 14    Excel lab
Week 15    Introduction to statistical quality control
Week 16    Final exam (UAS)

References
1. Johnson, R. A., & Bhattacharyya, G. K., “Statistics: Principles and Methods”, 6th Edition, Wiley Global Education, 2014.
2. Montgomery, D. C., & Runger, G. C., “Applied Statistics and Probability for Engineers”, John Wiley & Sons, 2014.
3. Levine, D. M., Ramsey, P. P., & Smidt, R. K., “Applied Statistics for Engineers and Scientists: Using Microsoft Excel and Minitab”, Prentice Hall, 2001.

Introduction

1st week

What Do Engineers Do?
• An engineer is someone who solves problems of interest to society with the efficient application of scientific principles by:
  • Refining existing products
  • Designing new products or processes

The Creative Process

Figure 1-1 The engineering method


Statistics Supports The Creative Process
• The field of statistics deals with the collection, presentation, analysis, and use of data to:
  • Make decisions
  • Solve problems
  • Design products and processes
• It is the science of data.
• For students, statistics is important for collecting, organizing, analyzing, and interpreting data during research and thesis work.

Definition
• Statistics is the science of collecting, organizing, analyzing, and interpreting data in order to make decisions in the presence of uncertainty.
• The word also refers to a collection of facts, generally in the form of numbers arranged in a table or diagram.
• Example: health statistics, birth statistics, etc.

Classification
• Descriptive statistics → collecting, organizing, analyzing, and interpreting data
• Statistical inference makes use of information from a sample to draw conclusions about the population from which the sample was taken.

Descriptive Statistics
✓ Collect
✓ Organize
✓ Summarize
✓ Display
✓ Analyze

Inferential Statistics
✓ Predict and forecast values of population parameters
✓ Test hypotheses about values of population parameters
✓ Make decisions

Variability
• Statistical techniques are useful to describe and understand variability.
• By variability, we mean that successive observations of a system or phenomenon do not produce exactly the same result.
• Statistics gives us a framework for describing this variability and for learning about potential sources of variability.

An Engineering Example of Variability
Eight samples are taken from the output of a wastewater treatment plant and their Cl concentrations are measured (in ppm): 12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1. The samples do not all have the same concentration; the measurements exhibit variability. The dot diagram is a very useful plot for displaying a small body of data, say up to about 20 observations. This plot allows us to easily see two features of the data: the location, or the middle, and the scatter, or variability.

[Figure: dot diagram; axis: Cl concentration]
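A dot diagram for a data set this small can be sketched in plain text. The snippet below is only an illustration, not the plotting method used in the lecture; the function name `dot_diagram` and the 0.1 ppm grid are assumptions chosen to match the one-decimal measurements.

```python
from collections import Counter

def dot_diagram(values, step=0.1):
    """Return a text dot diagram: one row per grid value, one 'o' per observation."""
    # Map each value to an integer grid index so equal readings stack up.
    counts = Counter(round(v / step) for v in values)
    lo, hi = min(counts), max(counts)
    rows = []
    for k in range(lo, hi + 1):
        rows.append(f"{k * step:5.1f} | " + "o" * counts.get(k, 0))
    return "\n".join(rows)

# Cl concentrations (ppm) from the example above.
cl_ppm = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]
print(dot_diagram(cl_ppm))
```

Rows with several dots show where the data are located; the distance between the first and last non-empty rows shows the scatter.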


Hypothesis Tests
Hypothesis test:
• A statement about a process behavior value.
• Compared to a claim about another process value.
• Data are gathered to support or refute the claim.

One-sample hypothesis test:
• Example: chlorine concentration (ppm) = 30 vs. chlorine concentration (ppm) < 30

Two-sample hypothesis test:
• Example: chlorine conc. at A (ppm) – chlorine conc. at B (ppm) = 0 vs. chlorine conc. at A (ppm) – chlorine conc. at B (ppm) > 0

An Experiment in Variation
W. Edwards Deming, a famous industrial statistician and contributor to the Japanese quality revolution, conducted an illustrative experiment on process over-control, or tampering. Let’s look at his apparatus and experimental procedure. Marbles were dropped through a funnel onto a target, and the location where each marble struck the target was recorded. Variation was caused by several factors: marble placement in the funnel and release dynamics, vibration, air currents, and measurement errors.

How Is the Change Detected Graphically?
The center line on the control chart is just the average of the concentration measurements for the first 20 samples, $\bar{x} = 91.5$ g/l, when the process is stable. The upper control limit and the lower control limit are located 3 standard deviations of the concentration values above and below the center line.

Figure 1-5 A control chart for the chemical process concentration data. The process steps outside the control limits at hours 24 and 29: shut down and adjust the process.
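The rule for the center line and the 3-standard-deviation limits can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the sample standard deviation is used as the estimate of spread, and the readings are made-up values, not the data behind Figure 1-5.

```python
from statistics import mean, stdev

def control_limits(baseline, k=3.0):
    """Return (LCL, center line, UCL) from readings taken while the process is stable."""
    center = mean(baseline)
    sigma = stdev(baseline)  # assumption: sample standard deviation of the baseline
    return center - k * sigma, center, center + k * sigma

def out_of_control(readings, lcl, ucl):
    """Return the 1-based positions of readings outside the control limits."""
    return [i for i, x in enumerate(readings, start=1) if x < lcl or x > ucl]

# Hypothetical concentration readings in g/l (for illustration only).
baseline = [90.0, 92.0, 91.0, 93.0, 91.5, 91.5]
lcl, center, ucl = control_limits(baseline)
later = [91.0, 92.5, 97.0, 90.5, 86.0]

print(center)                            # 91.5
print(out_of_control(later, lcl, ucl))   # [3, 5]
```

With this made-up baseline the standard deviation is 1.0, so the limits are 88.5 and 94.5, and the third and fifth later readings fall outside them.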


Mechanistic and Empirical Models
A mechanistic model is built from our underlying knowledge of the basic physical mechanism that relates several variables.
• Example: Ohm’s law, current = voltage/resistance: $I = E/R$. With observed data the model becomes $I = E/R + \epsilon$, where $\epsilon$ is a term added to the model to account for the fact that the observed values of current flow do not perfectly conform to the mechanistic model.
• The form of the function is known.

An empirical model is built from our engineering and scientific knowledge of the phenomenon, but is not directly developed from our theoretical or first-principles understanding of the underlying mechanism.
• The form of the function is not known.

An Example of an Empirical Model
• In a semiconductor manufacturing plant, the finished semiconductor is wire-bonded to a frame. In an observational study, the variables recorded were:
  • Pull strength to break the bond (y)
  • Wire length (x1)
  • Die height (x2)

Visualizing the Data and Resultant Model Using Regression Analysis

3D plot of the pull strength (y), wire length (x1) and die height (x2) data.

3D Plot of the predicted values (a plane) of pull strength from the empirical regression model.


DESCRIPTIVE STATISTICS

Descriptive Statistics
• Describe the basic features of the data in a study.
• Provide simple summaries about the sample and the measures.
• Together with simple graphical analysis, they form the basis of virtually every quantitative analysis of data.

Key terms
• A variable is a characteristic that changes or varies over time and/or for different individuals or objects under consideration. Examples: hair color, white blood cell count, bottom outlet of a distillation tower, etc.
• Data are a set of measurements, which can come from either a sample or a population.
• A population is the set representing all measurements of interest to the investigator. A population is any entire collection of people, animals, plants, or things from which we may collect data. In order to make any generalizations about a population, a sample that is meant to be representative of the population is often studied.
• A sample is a subset of measurements selected from the population of interest. The sample should be representative of the population.

Population and Sample: Example
• Population: 150-plus million adult Americans.
• Sample: 1,500 interviewed.

[Figure: a sample (n) drawn from the population (N)]

Types of Data

Ex: red, black, blue, white

Ex: none, mild, moderate, severe

Ex: 1 person, 3 students, 5 pets

Ex: 166 cm, 63.9 kg, etc.

Types of Measurement Scales
• Nominal scale: groups, classes, categories
  ✓ Gender, color, professional classification, etc.
• Ordinal scale: order matters
  ✓ Ranks (top ten videos, products, etc.)
• Interval scale: difference or distance matters
  ✓ Temperatures (°F, °C)
• Ratio scale: ratio matters
  ✓ Salaries, weight, volume, area, length, etc.

Percentiles and Quartiles
• Percentiles partition the data into 100 segments. The Pth percentile in the ordered set is that value below which lie P% (P percent) of the observations in the set.
• The position of the Pth percentile is given by (n + 1)P/100, where n is the number of observations in the set.

Example
The magazine Forbes annually publishes a list of the world’s wealthiest individuals. For 2007, the net worth of the 20 richest individuals, in $ billions, is as follows:

Billions 33 26 24 21 19 20 18 18 52 56 27 22 18 49 22 20 23 32 20 18

Sorted Billions 18 18 18 18 19 20 20 20 21 22 22 23 24 26 27 32 33 49 52 56

Find the 50th, 80th and the 90th percentiles of this data set.




Example (Continued): Percentiles
To find the 50th percentile, determine the data point in position (n + 1)P/100 = (20 + 1)(50/100) = 10.5. Thus, the percentile is located at the 10.5th position. The 10th observation in the ordered set is 22, and the 11th observation is also 22. The 50th percentile will lie halfway between the 10th and 11th values (which are both 22 in this case) and is thus 22.

Example (Continued)
To find the 80th percentile, determine the data point in position (n + 1)P/100 = (20 + 1)(80/100) = 16.8. Thus, the percentile is located at the 16.8th position. The 16th observation is 32, and the 17th observation is 33. The 80th percentile is a point lying 0.8 of the way from 32 to 33 and is thus 32 + 0.8(33 – 32) = 32.8.

Example (Continued)
To find the 90th percentile, determine the data point in position (n + 1)P/100 = (20 + 1)(90/100) = 18.9. Thus, the percentile is located at the 18.9th position. The 18th observation is 49, and the 19th observation is 52. The 90th percentile is a point lying 0.9 of the way from 49 to 52 and is thus 49 + 0.9(52 – 49) = 49 + (0.9)(3) = 49 + 2.7 = 51.7.
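The position rule (n + 1)P/100 with interpolation between neighbouring observations can be written as a short function. This is a sketch, not a library routine: the name `percentile` is hypothetical, and returning the smallest or largest observation when the position falls outside 1..n is an assumption the slides do not address.

```python
def percentile(data, p):
    """P-th percentile using the (n + 1)P/100 position rule with linear interpolation."""
    xs = sorted(data)
    n = len(xs)
    pos = (n + 1) * p / 100      # 1-based position, may be fractional
    if pos <= 1:                 # assumption: clamp to the smallest observation
        return xs[0]
    if pos >= n:                 # assumption: clamp to the largest observation
        return xs[-1]
    lower = int(pos)             # rank of the observation just below the position
    frac = pos - lower           # how far to move toward the next observation
    return xs[lower - 1] + frac * (xs[lower] - xs[lower - 1])

# Net worth of the 20 richest individuals, in $ billions (from the example).
net_worth = [33, 26, 24, 21, 19, 20, 18, 18, 52, 56,
             27, 22, 18, 49, 22, 20, 23, 32, 20, 18]

for p in (50, 80, 90):
    print(p, round(percentile(net_worth, p), 2))
# 50 22.0
# 80 32.8
# 90 51.7
```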


Quartiles – Special Percentiles
• Quartiles are the percentage points that break down the ordered data set into quarters.
• The first quartile (lower quartile, Q1) is the 25th percentile. It is the point below which lie 1/4 of the data.
• The second quartile (middle quartile, Q2) is the 50th percentile. It is the point below which lie 1/2 of the data. This is also called the median.
• The third quartile (upper quartile, Q3) is the 75th percentile. It is the point below which lie 3/4 of the data.
• The interquartile range (IQR) is the difference between the third and the first quartiles: IQR = Q3 – Q1

Example: Finding Quartiles
Billions: 33 26 24 21 19 20 18 18 52 56 27 22 18 49 22 20 23 32 20 18

Sorted billions: 18 18 18 18 19 20 20 20 21 22 22 23 24 26 27 32 33 49 52 56

                  Position (n+1)P/100         Quartile
First quartile    (20+1)25/100 = 5.25         19 + (0.25)(1) = 19.25
Median            (20+1)50/100 = 10.5         22 + (0.5)(0) = 22
Third quartile    (20+1)75/100 = 15.75        27 + (0.75)(5) = 30.75

Your Turn!
Fifty statistics students were asked how much sleep they get per school night (rounded to the nearest hour). The results were:

Amount of sleep per school night (hours)    Frequency
4                                           2
5                                           5
6                                           7
7                                           12
8                                           14
9                                           7
10                                          3

Find:
• 28th percentile =
• 3rd quartile =
• 80th percentile =
• 90th percentile =

Summary Measures:

Measures of Central Tendency
✓ Mean
✓ Median
✓ Mode

Measures of Variability
✓ Range
✓ Interquartile range
✓ Variance
✓ Standard deviation

Measures of Shape:
✓ Skewness
✓ Kurtosis

MEASURES OF CENTER

(Measures of Central Tendency) Mean, Median, Mode

Arithmetic Mean or Average
The mean of a set of measurements is the sum of the measurements divided by the total number of measurements. Symbol: $\bar{x}$ ("x bar").

Ungrouped data:

$$\bar{x} = \frac{\sum x_i}{n}$$

Grouped data:

$$\bar{x} = \frac{\sum f_i x_i}{n}$$

where n = number of measurements, $\sum x_i$ = sum of measurements, and $f_i$ = frequency of the value $x_i$.

Example: Mean
Consider 8 observations ($x_i$) of pull-off force from engine connectors, as shown in the table.

i       1      2      3      4      5      6      7      8
x_i     12.6   12.9   13.4   12.3   13.6   13.5   12.6   13.1

$$\bar{x} = \frac{\sum_{i=1}^{8} x_i}{8} = \frac{12.6 + 12.9 + \cdots + 13.1}{8} = \frac{104}{8} = 13.0 \text{ pounds}$$

In Excel: =AVERAGE($B2:$B9) returns 13.00.

Figure 6-1 The sample mean is the balance point.

If we were able to enumerate the whole population, the population mean would be called μ (the Greek letter “mu”).
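The same average, and the "balance point" idea, can be checked in a few lines. This is a minimal sketch; the variable names are illustrative.

```python
# Pull-off force observations in pounds (from the example).
force = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]

n = len(force)
xbar = sum(force) / n            # sum of measurements divided by their number
print(round(xbar, 2))            # 13.0

# Balance point: deviations from the mean cancel out (up to rounding error).
deviations = [x - xbar for x in force]
print(abs(sum(deviations)) < 1e-9)   # True
```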


Median
• The median of a set of measurements is the middle measurement when the measurements are ranked from smallest to largest.
• The position of the median is 0.5(n + 1) once the measurements have been ordered.
• Also called the second quartile or 50th percentile.

Example
• The set: 2, 4, 9, 8, 6, 5, 3 (n = 7)
• Sort: 2, 3, 4, 5, 6, 8, 9
• Position: 0.5(n + 1) = 0.5(7 + 1) = 4th
• Median = 5

• The set: 2, 4, 9, 8, 6, 5 (n = 6)
• Sort: 2, 4, 5, 6, 8, 9
• Position: 0.5(n + 1) = 0.5(6 + 1) = 3.5th → average of the 3rd and 4th data points
• Median = (5 + 6)/2 = 5.5
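The position rule 0.5(n + 1) can be turned into code directly. This is a sketch that assumes a non-empty list; the function name `median_by_position` is hypothetical.

```python
def median_by_position(data):
    """Median via the 0.5(n + 1) position rule (data must be non-empty)."""
    xs = sorted(data)
    pos = 0.5 * (len(xs) + 1)    # 1-based position of the median
    lower = int(pos)
    if pos == lower:             # odd n: the position is a whole number
        return xs[lower - 1]
    # even n: position ends in .5, so average the two neighbouring values
    return (xs[lower - 1] + xs[lower]) / 2

print(median_by_position([2, 4, 9, 8, 6, 5, 3]))   # 5
print(median_by_position([2, 4, 9, 8, 6, 5]))      # 5.5
```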


Mode
• The mode is the value that occurs most frequently.
• Example:
  1. The set: 2, 4, 9, 8, 8, 5, 3 → the mode is 8, which occurs twice.
  2. The set: 2, 2, 9, 8, 8, 5, 3 → there are two modes, 8 and 2 (bimodal).
  3. The set: 2, 4, 9, 8, 5, 3 → there is no mode (each value is unique).
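The three cases (one mode, two modes, no mode) can be reproduced by counting how often each value occurs. This is a sketch: the helper `modes` is hypothetical, it assumes non-empty data, and it follows the slide's convention that a set in which every value is unique has no mode.

```python
from collections import Counter

def modes(data):
    """Return the sorted list of modes; empty if every value occurs only once."""
    counts = Counter(data)
    top = max(counts.values())   # highest frequency in the data
    if top == 1:                 # every value is unique: no mode
        return []
    return sorted(v for v, c in counts.items() if c == top)

print(modes([2, 4, 9, 8, 8, 5, 3]))   # [8]
print(modes([2, 2, 9, 8, 8, 5, 3]))   # [2, 8]
print(modes([2, 4, 9, 8, 5, 3]))      # []
```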

Example
The number of quarts of milk purchased by 25 households:
0 0 1 1 1 1 1 2 2 2 2 2 2 2 2 2 3 3 3 3 3 4 4 4 5

• Mean? $\bar{x} = \frac{\sum x_i}{n} = \frac{55}{25} = 2.2$
• Median? m = 2
• Mode? (Highest peak) mode = 2
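Because the milk data contain many repeated values, they are a natural place to try the grouped-data formula, where each distinct value is weighted by its frequency. The sketch below builds the frequency table with `collections.Counter` and uses the standard library's `median` and `mode` for the other two measures; the variable names are illustrative.

```python
from collections import Counter
from statistics import median, mode

# Quarts of milk purchased by 25 households (from the example).
quarts = [0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2,
          3, 3, 3, 3, 3, 4, 4, 4, 5]

freq = Counter(quarts)                  # value -> frequency f_i
n = sum(freq.values())                  # total number of measurements
grouped_mean = sum(f * x for x, f in freq.items()) / n

print(n, grouped_mean)                  # 25 2.2
print(median(quarts), mode(quarts))     # 2 2
```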


MEASURES OF VARIABILITY

(Measures of Dispersion) Range, Interquartile Range, Variance, Standard Deviation

Variability
• Tells us how far scores spread out
• Tells us the degree to which scores deviate from the central tendency

[Figure: two distributions, each with Mean = 10]

Measures of Variability or Dispersion
• Range
  ✓ Difference between maximum and minimum values
• Interquartile range
  ✓ Difference between third and first quartiles (Q3 – Q1)
• Variance
  ✓ Average of the squared deviations from the mean
• Standard deviation
  ✓ Square root of the variance

Sample Range
If the n observations in a sample are denoted by x1, x2, …, xn, the sample range is:

$$r = \max(x_i) - \min(x_i) \qquad (6\text{-}6)$$

It is the largest observation in the sample minus the smallest observation. From the example: r = 13.6 – 12.3 = 1.3

Note that: population range ≥ sample range

Example 1-3: Finding the range and interquartile range
Billions: 33 26 24 21 19 20 18 18 52 56 27 22 18 49 22 20 23 32 20 18

Sorted billions: 18 18 18 18 19 20 20 20 21 22 22 23 24 26 27 32 33 49 52 56

Ranks: 1 to 20. The first quartile falls between ranks 5 and 6, the median between ranks 10 and 11, and the third quartile between ranks 15 and 16.

Range = Maximum – Minimum = 56 – 18 = 38

                  Position (n+1)P/100         Quartile
First quartile    (20+1)25/100 = 5.25         19 + (0.25)(1) = 19.25
Median            (20+1)50/100 = 10.5         22 + (0.5)(0) = 22
Third quartile    (20+1)75/100 = 15.75        27 + (0.75)(5) = 30.75

Interquartile Range = Q3 – Q1 = 30.75 – 19.25 = 11.5
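These figures can be cross-checked with Python's standard library. The sketch below relies on `statistics.quantiles` with `method="exclusive"`, which places the cut points at positions based on n + 1 and therefore matches the (n + 1)P/100 rule used in the slides; the variable names are illustrative.

```python
from statistics import quantiles

# Net worth of the 20 richest individuals, in $ billions (from the example).
net_worth = [33, 26, 24, 21, 19, 20, 18, 18, 52, 56,
             27, 22, 18, 49, 22, 20, 23, 32, 20, 18]

# n=4 asks for the three quartile cut points.
q1, q2, q3 = quantiles(net_worth, n=4, method="exclusive")
iqr = q3 - q1
data_range = max(net_worth) - min(net_worth)

print(q1, q2, q3)    # 19.25 22.0 30.75
print(iqr)           # 11.5
print(data_range)    # 38
```

Other quartile conventions exist (for example `method="inclusive"`), and they can give slightly different values on the same data.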


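Variance

Sample variance:

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$

Population variance:

$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$$

Equivalent shortcut forms:

$$s^2 = \frac{\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n}}{n - 1}$$

$$\sigma^2 = \frac{\sum x_i^2 - \frac{\left(\sum x_i\right)^2}{N}}{N}$$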

Standard Deviation
• The standard deviation is the square root of the variance.
• σ is the population standard deviation symbol.
• s is the sample standard deviation symbol.

Sample standard deviation: $s = \sqrt{s^2}$
Population standard deviation: $\sigma = \sqrt{\sigma^2}$

Example: Sample Variance
The table below displays the quantities needed to calculate the sample variance and sample standard deviation.

i       x_i       x_i − x̄     (x_i − x̄)²
1       12.6      −0.4        0.16
2       12.9      −0.1        0.01
3       13.4       0.4        0.16
4       12.3      −0.7        0.49
5       13.6       0.6        0.36
6       13.5       0.5        0.25
7       12.6      −0.4        0.16
8       13.1       0.1        0.01
sums    104.00     0.0        1.60

x̄ = 104.00 / 8 = 13.00
variance = 1.60 / 7 = 0.2286
standard deviation = √0.2286 = 0.48

Units: x_i is in pounds, the mean is in pounds, the variance is in pounds², and the standard deviation is in pounds. Desired accuracy is generally accepted to be one more place than the data.
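The columns of the table correspond to the steps of the definition: subtract the mean, square, sum, and divide by n − 1. A minimal sketch of those steps, with illustrative variable names:

```python
from math import sqrt

# Pull-off force observations in pounds (from the example).
force = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]

n = len(force)
xbar = sum(force) / n
squared_dev = [(x - xbar) ** 2 for x in force]   # the (x_i - xbar)^2 column
s2 = sum(squared_dev) / (n - 1)                  # sample variance, pounds^2
s = sqrt(s2)                                     # sample standard deviation, pounds

print(round(sum(squared_dev), 2))   # 1.6
print(round(s2, 4))                 # 0.2286
print(round(s, 2))                  # 0.48
```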


Example: Variance by Shortcut

$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1} = \frac{1{,}353.60 - \frac{(104.0)^2}{8}}{7} = \frac{1.60}{7} = 0.2286 \text{ pounds}^2$$

$$s = \sqrt{0.2286} = 0.48 \text{ pounds}$$

i       x_i       x_i²
1       12.6      158.76
2       12.9      166.41
3       13.4      179.56
4       12.3      151.29
5       13.6      184.96
6       13.5      182.25
7       12.6      158.76
8       13.1      171.61
sums    104.0     1,353.60
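The shortcut needs only two running totals, the sum of the values and the sum of their squares. The sketch below computes the variance both ways to show that they agree on this data set; the variable names are illustrative.

```python
from math import sqrt

# Pull-off force observations in pounds (from the example).
force = [12.6, 12.9, 13.4, 12.3, 13.6, 13.5, 12.6, 13.1]

n = len(force)
sum_x = sum(force)                     # 104.0 (up to rounding error)
sum_x2 = sum(x * x for x in force)     # 1,353.60 (up to rounding error)

# Shortcut formula: no need to compute the mean first.
s2_shortcut = (sum_x2 - sum_x ** 2 / n) / (n - 1)

# Definition, for comparison.
xbar = sum_x / n
s2_definition = sum((x - xbar) ** 2 for x in force) / (n - 1)

print(round(s2_shortcut, 4), round(s2_definition, 4))   # 0.2286 0.2286
print(round(sqrt(s2_shortcut), 2))                      # 0.48
```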


Exercise
The concentration of Cl- in a solution is measured by one operator using the same instrument 8 times. She obtains the following data (ppm):

7.15, 7.20, 7.18, 7.19, 7.21, 7.20, 7.16, and 7.18

• Calculate the sample mean, mode, and median.
• Find the 28th percentile, 80th percentile, and 1st quartile.
• Calculate the variance and standard deviation.
