Types of Variables

Quantitative variables
E.g. number of children in a family, score in maths, sales in $ for a product, height/weight of a student, per capita income
- Continuous: infinite number of possible values. E.g. sales in $, per capita income, weight/height of a student
- Discrete: limited (countable) number of values. E.g. number of children in a family, number of cars in a city

Qualitative variables / classification variables / categorical variables
Defined or limited number of levels. E.g. gender, size of T-shirt, winning position, colors
- Nominal: levels cannot be ordered. E.g. gender, color
- Ordinal: levels have an order. E.g. size of T-shirt: S < M < XL < XXL; winning position: 1st > 2nd > 3rd
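The nominal/ordinal distinction can be illustrated with a small Python sketch. The names SIZE_ORDER, COLORS and is_smaller are hypothetical, and the color values are made up for illustration; only the T-shirt order comes from the notes.

```python
# Ordinal levels carry an order; this order is the one given in the notes.
SIZE_ORDER = ["S", "M", "XL", "XXL"]

# Nominal levels are only labels; no ordering is implied.
COLORS = {"red", "green", "blue"}  # illustrative values, not from the notes


def is_smaller(size_a: str, size_b: str) -> bool:
    """Compare two ordinal levels by their position in SIZE_ORDER."""
    return SIZE_ORDER.index(size_a) < SIZE_ORDER.index(size_b)


print(is_smaller("S", "XL"))   # an ordinal comparison is meaningful
print("red" in COLORS)         # nominal values support only equality/membership
```

For a nominal variable such as color there is no sensible counterpart to is_smaller, which is the practical difference between the two kinds of categorical variable.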
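Score data for a class
id   Gender  race  ses  schtyp  prog  read  write  math  science  socst
70   Male    4     1    1       1     57    52     41    47       57
121  Female  4     2    1       3     68    59     53    63       61
86   Male    4     3    1       1     44    33     54    58       31
The remaining rows follow in blocks; within each block, each line holds one column (id, Gender, race, ses, schtyp, prog, read, write, math, science, socst) for the rows of that block.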
141 172 113 50 11 84 48 75 60 95 104 38 115 76 195 114 85 167 143 41 20 12 53 154 178 196 29 126 103 192 150 199 144 200 80 16 153 176 177
Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male
4 4 4 3 1 4 3 4 4 4 4 3 4 4 4 4 4 4 4 3 1 1 3 4 4 4 2 4 4 4 4 4 4 4 4 1 4 4 4
3 2 2 2 2 2 2 2 2 3 3 1 1 3 2 3 2 2 2 2 3 2 2 3 2 3 1 2 3 3 2 3 3 2 3 1 2 2 2
1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 2 2 1 1 1 2 1 2 1 2 1 1 1 2 2
3 2 2 1 2 1 2 3 2 2 2 2 1 2 1 2 1 1 3 2 2 3 3 2 3 2 1 1 2 2 3 2 1 2 2 3 3 2 2
63 47 44 50 34 63 57 60 57 73 54 45 42 47 57 68 55 63 63 50 60 37 34 65 47 44 52 42 76 65 42 52 60 68 65 47 39 47 55
44 52 52 59 46 57 55 46 65 60 63 57 49 52 57 65 39 49 63 40 52 44 37 65 57 38 44 31 52 67 41 59 65 54 62 31 31 47 59
47 57 51 42 45 54 52 51 51 71 57 50 43 51 60 62 57 35 75 45 57 45 46 66 57 49 49 57 64 63 57 50 58 75 68 44 40 41 62
53 53 63 53 39 58 50 53 63 61 55 31 50 50 58 55 53 66 72 55 61 39 39 61 58 39 55 47 64 66 72 61 61 66 66 36 39 42 58
56 61 61 61 36 51 51 61 61 71 46 56 56 56 56 61 46 41 66 56 61 46 31 66 46 46 41 51 61 71 31 61 66 66 66 36 51 51 51
168 40 62 169 49 136 189 7 27 128 21 183 132 15 67 22 185 9 181 170 134 108 197 140 171 107 81 18 155 97 68 157 56 5 159 123 164 14 127 165 174 3 58 146 102
Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male Male
4 3 4 4 3 4 4 1 2 4 1 4 4 1 4 1 4 1 4 4 4 4 4 4 4 4 4 1 4 4 4 4 4 1 4 4 4 1 4 4 4 1 4 4 4
2 1 3 1 3 2 2 2 2 3 2 2 2 3 1 2 2 2 2 3 1 2 3 2 2 1 1 2 2 3 2 2 2 1 3 3 2 3 3 1 2 1 2 3 3
1 1 1 1 1 1 2 1 1 1 1 2 1 1 1 1 2 1 2 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1
2 1 1 1 3 2 2 2 2 2 1 2 2 3 3 3 2 3 2 2 1 1 2 3 2 3 2 3 1 2 2 1 3 2 2 1 3 2 2 3 2 2 3 2 2
52 42 65 55 50 65 47 57 53 39 44 63 73 39 37 42 63 48 50 47 44 34 50 44 60 47 63 50 44 60 73 68 55 47 55 68 31 47 63 36 68 63 55 55 52
54 41 65 59 40 59 59 54 61 33 44 59 62 39 37 39 57 49 46 62 44 33 42 41 54 39 43 33 44 54 67 59 45 40 61 59 36 41 59 49 59 65 41 62 41
57 43 48 63 39 70 63 59 61 38 61 49 73 44 42 39 55 52 45 61 39 41 50 40 60 47 59 49 46 58 71 58 46 43 54 56 46 54 57 54 71 48 40 64 51
55 50 63 69 49 63 53 47 57 47 50 55 69 26 33 56 58 44 58 69 34 36 36 50 55 42 65 44 39 58 63 74 58 45 49 63 39 42 55 61 66 63 44 63 53
51 41 66 46 47 51 46 51 56 41 46 71 66 42 32 46 41 51 61 66 46 36 61 26 66 26 44 36 51 61 66 66 51 31 61 66 46 56 56 36 56 56 41 66 56
117 133 94 24 149 82 8 129 173 57 100 1 194 88 99 47 120 166 65 101 89 54 180 162 4 131 125 34 106 130 93 163 37 35 87 73 151 44 152 105 28 91 45 116 33
Male Male Male Male Male Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female
4 4 4 2 4 4 1 4 4 4 4 1 4 4 4 3 4 4 4 4 4 3 4 4 1 4 4 1 4 4 4 4 3 1 4 4 4 3 4 4 2 4 3 4 2
3 2 3 2 1 3 1 1 1 2 3 1 3 3 3 1 3 2 2 3 1 1 3 2 1 3 1 3 2 3 3 1 1 1 2 2 2 1 3 2 2 3 1 2 1
1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 2 2 1 1 1 1 2 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1
3 3 2 2 1 2 2 1 1 2 2 3 2 2 1 2 2 2 2 2 3 1 2 3 2 2 2 2 3 1 2 2 3 1 1 2 3 3 2 2 1 3 3 2 2
34 50 55 52 63 68 39 44 50 71 63 34 63 68 47 47 63 52 55 60 35 47 71 57 44 65 68 73 36 43 73 52 41 60 50 50 47 47 55 50 39 50 34 57 57
49 31 49 62 49 62 44 44 62 65 65 44 63 60 59 46 52 59 54 62 35 54 65 52 50 59 65 61 44 54 67 57 47 54 52 52 46 62 57 41 53 49 35 59 65
39 40 61 66 49 65 52 46 61 72 71 40 69 64 56 49 54 53 66 67 40 46 69 40 41 57 58 57 37 55 62 64 40 50 46 53 52 45 56 45 54 56 41 54 72
42 34 61 47 66 69 44 47 63 66 69 39 61 69 66 33 50 61 42 50 51 50 58 61 39 46 59 55 42 55 58 58 39 50 50 39 48 34 58 44 50 47 29 50 54
56 31 56 46 46 61 48 51 51 56 71 41 61 66 61 41 51 51 56 56 33 56 71 56 51 66 56 66 41 46 66 56 51 51 56 56 46 46 61 56 41 46 26 56 56
66 72 77 61 190 42 2 55 19 90 142 17 122 191 83 182 6 46 43 96 138 10 71 139 110 148 109 39 147 74 198 161 112 69 156 111 186 98 119 13 51 26 36 135 59
Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female
4 4 4 4 4 3 1 3 1 4 4 1 4 4 4 4 1 3 3 4 4 1 4 4 4 4 4 3 4 4 4 4 4 4 4 4 4 4 4 1 3 2 3 4 4
2 2 1 3 2 2 2 2 1 3 2 2 2 3 2 2 1 1 1 3 2 2 2 2 2 2 2 3 1 2 3 1 2 1 2 1 2 1 1 2 3 3 1 1 2
1 1 1 1 2 1 1 2 1 1 1 1 1 2 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 2 1 1 1 1 1 1 1 1
3 3 2 2 2 3 3 2 1 2 3 2 2 2 3 2 2 2 2 2 3 1 1 2 3 3 1 2 2 2 2 2 2 3 2 1 2 3 1 3 1 2 1 2 2
68 42 61 76 47 46 39 52 28 42 47 47 52 47 50 44 47 45 47 65 43 47 57 68 52 42 42 66 47 57 47 57 52 44 50 39 57 57 42 47 42 60 44 63 65
62 54 59 63 59 52 41 49 46 54 42 57 59 52 62 52 41 55 37 54 57 54 62 59 55 57 39 67 62 50 61 62 59 44 59 54 62 60 57 46 36 59 49 60 67
56 47 49 60 54 55 33 49 43 50 52 48 58 43 41 43 46 44 43 61 40 49 56 61 50 51 42 67 53 50 51 72 48 40 53 39 63 51 45 39 42 62 44 65 63
50 47 44 67 58 44 42 44 44 50 39 44 53 48 55 44 40 34 42 58 50 53 58 55 54 47 42 61 53 51 63 61 55 40 61 47 55 53 50 47 31 61 35 54 55
51 46 66 66 46 56 41 61 51 52 51 41 66 61 31 51 41 41 46 56 51 61 66 71 61 61 41 66 61 58 31 61 61 31 61 36 41 37 43 61 39 51 51 66 71
78 64 63 79 193 92 160 32 23 158 25 188 52 124 175 184 30 179 31 145 187 118
Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female Female
4 4 4 4 4 4 4 2 2 4 2 4 3 4 4 4 2 4 2 4 4 4
2 3 1 2 2 3 2 3 1 2 2 3 1 1 3 2 3 2 2 2 2 2
1 1 1 1 2 1 1 1 1 1 1 2 1 1 2 2 1 2 2 1 2 1
2 3 1 2 2 1 2 3 2 1 1 2 2 3 1 3 2 2 1 3 1 1
39 50 52 60 44 52 55 50 65 52 47 63 50 42 36 50 41 47 55 42 57 55
54 52 65 62 49 67 65 67 65 54 44 62 46 54 57 52 59 65 59 46 41 62
54 45 60 49 48 57 55 66 64 55 42 56 53 41 42 53 42 60 52 38 57 58
53 58 56 50 39 63 50 66 58 53 42 55 53 42 50 55 34 50 42 36 55 58
41 36 51 51 51 61 61 56 71 51 36 61 66 41 41 56 51 56 56 46 52 61
137  Female  4  3  1  2  63  65  65  53  61
Variable types
Variable                               Type
Gender                                 Qualitative variable
Score of read, write, math, science    Continuous variable

Read Variable: Measures of Central Tendency (Read Scores)
Measure   Value   Meaning
Mean      52.23   Average
Median    50      Middle value
Mode      47      Value with maximum replication (the most frequent value)

Dispersion
STD       10.25   Standard deviation: the typical deviation of the observations from the mean of the data.

For example, the first observation of read has a value of 57 and the mean read score is 52.23, so its deviation is 57 - 52.23 = 4.77 (~5). For the STD, the deviation of every data point from the mean is taken, each deviation is squared, the squared deviations are averaged, and the square root of that average is the STD.
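The measures above can be reproduced with Python's standard library. This is a minimal sketch on the first ten read scores only, so the results differ from the full-data values (52.23, 50, 47, 10.25). Dividing by n - 1 is an assumption here; the notes do not say whether the spreadsheet used n or n - 1.

```python
import math
import statistics

# First ten read scores from the data set, used only as a small sample.
sample = [57, 68, 44, 63, 47, 44, 50, 34, 63, 57]

mean = statistics.mean(sample)          # average
median = statistics.median(sample)      # middle value of the sorted data
modes = statistics.multimode(sample)    # value(s) with maximum replication

# Standard deviation by hand: square each deviation from the mean,
# average the squares (dividing by n - 1, assumed), take the square root.
squared_devs = [(x - mean) ** 2 for x in sample]
std_manual = math.sqrt(sum(squared_devs) / (len(sample) - 1))

print(mean, median, modes, round(std_manual, 2))
```

multimode is used instead of mode because a small sample can have several values tied for the highest count.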
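Percentiles/Quartiles of read (200 observations)
Quartile           Value   Obs number (position in ascending order)
Q1 (25%)           44      200*0.25 = 50th
Q2/Median (50%)    50      200*0.5 = 100th
Q3 (75%)           60      200*0.75 = 150th

Read Asc Order (each Obs number is followed by its read value)
1
28
2
31
3
34
4
34
5
34
6
34
7
34
8
34
9
35
10
36
11
36
12
36
13
37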
14
37
15
39
16
39
17
39
18
39
19
39
20
39
21
39
22
39
23
41
24
41
25
42
26
42
27
42
28
42
29
42
30
42
31
42
32
42
33
42
34
42
35
42
36
42
37
42
38
43
39
43
40
44
41
44
42
44
43
44
44
44
45
44
46
44
47
44
48
44
49
44
50
44
51
44
52
44
53
45
54
45
55
46
56
47
57
47
58
47
59
47
60
47
61
47
62
47
63
47
64
47
65
47
66
47
67
47
68
47
69
47
70
47
71
47
72
47
73
47
74
47
75
47
76
47
77
47
78
47
79
47
80
47
81
47
82
47
83
48
84
50
85
50
86
50
87
50
88
50
89
50
90
50
91
50
92
50
93
50
94
50
95
50
96
50
97
50
98
50
99
50
100
50
101
50
102
52
103
52
104
52
105
52
106
52
107
52
108
52
109
52
110
52
111
52
112
52
113
52
114
52
115
52
116
53
117
54
118
55
119
55
120
55
121
55
122
55
123
55
124
55
125
55
126
55
127
55
128
55
129
55
130
55
131
57
132
57
133
57
134
57
135
57
136
57
137
57
138
57
139
57
140
57
141
57
142
57
143
57
144
57
145
60
146
60
147
60
148
60
149
60
150
60
151
60
152
60
153
60
154
61
155
63
156
63
157
63
158
63
159
63
160
63
161
63
162
63
163
63
164
63
165
63
166
63
167
63
168
63
169
63
170
63
171
65
172
65
173
65
174
65
175
65
176
65
177
65
178
65
179
65
180
66
181
68
182
68
183
68
184
68
185
68
186
68
187
68
188
68
189
68
190
68
191
68
192
71
193
71
194
73
195
73
196
73
197
73
198
73
199
76
200
76
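The positional rule used for the quartiles of the sorted read scores (take the observation at position n * 0.25, n * 0.5 and n * 0.75) can be sketched as follows. The function name value_at_percent and the eight sample values are hypothetical; rounding the position up when n * percent is not a whole number is also an assumption, since the notes only show the case n = 200.

```python
import math


def value_at_percent(sorted_values, percent):
    """Return the observation at position n * percent (1-based), as in the notes."""
    n = len(sorted_values)
    position = math.ceil(n * percent)     # e.g. 200 * 0.25 = 50th observation
    position = max(position, 1)           # guard against position 0
    return sorted_values[position - 1]    # convert to a 0-based index


# Hypothetical small example (not the class data): 8 values, sorted ascending.
data = sorted([12, 5, 7, 9, 20, 15, 3, 18])
q1 = value_at_percent(data, 0.25)   # 2nd observation
q2 = value_at_percent(data, 0.50)   # 4th observation
q3 = value_at_percent(data, 0.75)   # 6th observation
print(q1, q2, q3)
```

Statistical packages often interpolate between neighbouring observations (for an even n, the median is usually the average of the two middle values), so their quartiles can differ slightly from this positional rule.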
Normal distribution or a bell curve
Mean = Median = Mode

Confidence Interval: the range that covers the central part of the data distribution. It is used in hypothesis testing: if a value lies in this range, the value is taken to belong to this distribution.

Concepts used in hypothesis testing:
- H0 and H1 hypothesis
- Critical region
- Significance level (5%)
- Type I & II error
- One-tail and two-tail test

Example with Mean = 50 and Std Dev = 10 (interval limit = Mean ± 1.96 × Std Dev):
Z statistic   Confidence Interval   p-value (area in the tail)
-1.96         30.4                  2.5%
+1.96         69.6                  2.5%
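The interval limits 30.4 and 69.6 follow from the mean 50, the standard deviation 10 and the Z values ±1.96. A minimal sketch of that arithmetic (the function name interval_limits is hypothetical):

```python
def interval_limits(mean, std_dev, z=1.96):
    """Return (lower, upper) = mean -/+ z * std_dev."""
    return mean - z * std_dev, mean + z * std_dev


lower, upper = interval_limits(50, 10)
print(round(lower, 1), round(upper, 1))
```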
Standard Error = Std Dev / √n

Problem Statement: Is the Sample Mean representative of the Population?
Sample mean score of Read subject: 52.23
Hypothesized population mean: 51

Z Statistic = (52.23 - 51) / (10 / SQRT(200))
Z Stat = 1.739483, which is compared with the cut-off (±1.96 at the 5% significance level, two-tail)
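The Z statistic calculation can be checked with a few lines of Python. This is a sketch using the figures from the notes (sample mean 52.23, hypothesized mean 51, Std Dev 10, n = 200). The ±1.96 cut-off and the two-tail p-value are assumptions based on the 5% significance level mentioned earlier.

```python
import math

sample_mean = 52.23
hypothesized_mean = 51
std_dev = 10
n = 200

standard_error = std_dev / math.sqrt(n)                 # Std Dev / sqrt(n)
z_stat = (sample_mean - hypothesized_mean) / standard_error

# Two-tail p-value from the standard normal distribution.
p_value = math.erfc(abs(z_stat) / math.sqrt(2))

CUT_OFF = 1.96                                          # assumed: 5% level, two-tail
reject_h0 = abs(z_stat) > CUT_OFF
print(round(z_stat, 6), round(p_value, 4), reject_h0)
```

Under these assumptions the statistic stays inside the ±1.96 range, so H0 (population mean = 51) is not rejected.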
Key Highlights of normal distribution
- Data is plotted on the X axis and the Y axis represents frequency
- Data distribution: 68% of data lies within -1 Stdev and +1 Stdev of the mean, and so on
- Mean = Median = Mode
- While testing a hypothesis, data beyond -2 Stdev and +2 Stdev is considered not to belong to the distribution and H0 is rejected.
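The "68% ... and so on" statement can be made concrete with the standard normal distribution. The sketch below uses statistics.NormalDist from the Python standard library (Python 3.8+); the helper share_within is a hypothetical name.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Share of a normal distribution lying within -k and +k Stdev of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(share_within(k) * 100, 2))
```

This gives roughly 68%, 95% and 99.7% for 1, 2 and 3 Stdev, which is why values beyond about 2 Stdev are treated as unlikely to belong to the distribution.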
Read Score 57 68 44 63 47 44 50 34 63 57 60 57 73 54 45 42 47 57 68 55 63 63 50 60 37 34 65 47 44 52 42 76 65 42 52 60 68 65 47 39 47
Steps in creating bell curve with an example
1. Compute the mean and Stdev of the distribution
2. Calculate Mean - 1 Stdev, Mean - 2 Stdev, Mean - 3 Stdev
3. Similarly, calculate Mean + 1 Stdev, Mean + 2 Stdev, Mean + 3 Stdev
4. Plot the frequency of the data against the points from steps 2 and 3
Normal distribution Standards   Value         Frequency
Mean - 3 Stdev                  21.47118952   0
Mean - 2 Stdev                  31.72412635   2
Mean - 1 Stdev                  41.97706317   22
Mean                            52.23         91
Mean + 1 Stdev                  62.48293683   39
Mean + 2 Stdev                  72.73587365   39
Mean + 3 Stdev                  82.98881048   7
Go to the Data tab -> Data Analysis -> Histogram, then select the data range and the bin range to populate the frequency.
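The bin limits and frequency counts can also be sketched in Python. MEAN is the read mean from the notes and STDEV is the value implied by the table (62.48293683 - 52.23). The function name bin_frequencies is hypothetical, and counting each value in the first bin whose limit is greater than or equal to it is an assumed convention. Only the first ten read scores are used to keep the sketch short, so the counts are not the full-data frequencies.

```python
from bisect import bisect_left

MEAN = 52.23          # mean of read, from the notes
STDEV = 10.25293683   # Stdev implied by the table (Mean + 1 Stdev minus Mean)

# Bin limits: Mean - 3 Stdev ... Mean + 3 Stdev (steps 2 and 3).
edges = [MEAN + k * STDEV for k in range(-3, 4)]


def bin_frequencies(values, limits):
    """Count values per bin; a value goes to the first limit that is >= value."""
    counts = [0] * (len(limits) + 1)   # last slot collects values above all limits
    for v in values:
        counts[bisect_left(limits, v)] += 1
    return counts


# First ten read scores only.
sample = [57, 68, 44, 63, 47, 44, 50, 34, 63, 57]
freq = bin_frequencies(sample, edges)
print([round(e, 2) for e in edges])
print(freq)
```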
Rejection area. H0 is rejected for values at this level i.e between -2 to -3 Stdev. Here Frequency = 0
[Chart: ~ Normally distributed data (Y axis: Frequency)]
55 52 42 65 55 50 65 47 57 53 39 44 63 73 39 37 42 63 48 50 47 44 34 50 44 60 47 63 50 44 60 73 68 55 47 55 68 31 47 63 36 68 63 55 55 52 34 50
55 52 63 68 39 44 50 71 63 34 63 68 47 47 63 52 55 60 35 47 71 57 44 65 68 73 36 43 73 52 41 60 50 50 47 47 55 50 39 50 34 57 57 68 42 61 76 47
46 39 52 28 42 47 47 52 47 50 44 47 45 47 65 43 47 57 68 52 42 42 66 47 57 47 57 52 44 50 39 57 57 42 47 42 60 44 63 65 39 50 52 60 44 52 55 50
65 52 47 63 50 42 36 50 41 47 55 42 57 55 63
Covariance and corrected Sum of squares calculation between Maths (x) and Science (y)

Corrected SS (cross-products) = Σ (x - x{mean}) * (y - y{mean})
Cov = Corrected SS / (n - 1)

id   Gender  race  ses  schtyp  prog  read  write  math  science  socst  Corrected SS  Covariance
121  Female  4     2    1       3     68    59     53    63       61     3.95825       65.163
82   Female  4     3    1       2     68    62     65    69       61     211.88825     1845.183
8    Female  1     1    1       2     39    44     52    44       48     5.06325       -10.707
129  Female  4     1    1       1     44    44     46    47       51     32.22825      -43.407
173  Female  4     1    1       1     50    62     61    63       51     93.15825      -121.017
57   Female  4     2    1       2     71    65     72    66       56     273.87325     998.593
100  Female  4     3    1       2     63    65     71    69       71     314.78825     5839.823
1    Female  1     1    1       3     34    44     40    39       41     162.48825     -1848.567
194  Female  4     3    2       2     63    63     69    61       61     149.64825     1264.023
88   Female  4     3    1       2     68    60     64    69       66     194.73825     2676.843
99   Female  4     3    1       1     47    59     56    66       61     47.47325      441.003

Totals over all 200 observations: Corrected SS = 11642.35, Covariance = 11642.35 / 199 = 58.50
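The covariance calculation between Maths (x) and Science (y) can be sketched in Python. The function names are hypothetical. The eleven score pairs are the rows shown in the notes, so the means used here are those of the eleven rows only and the result will not equal the full-data covariance of 58.50.

```python
def corrected_cross_products(xs, ys):
    """Sum of (x - x_mean) * (y - y_mean) over all pairs."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    return sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))


def covariance(xs, ys):
    """Sample covariance: corrected cross-products divided by n - 1."""
    return corrected_cross_products(xs, ys) / (len(xs) - 1)


# Math (x) and Science (y) scores of the 11 rows shown in the notes.
math_scores = [53, 65, 52, 46, 61, 72, 71, 40, 69, 64, 56]
science_scores = [63, 69, 44, 47, 63, 66, 69, 39, 61, 69, 66]
print(round(covariance(math_scores, science_scores), 2))

# Check of the totals in the notes: 11642.35 / (200 - 1).
full_data_cov = 11642.35 / (200 - 1)
print(round(full_data_cov, 2))
```

A positive covariance indicates that higher maths scores tend to go with higher science scores.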