Exercise 1 Data Management
Ecology Laboratory
Submitted by Gerardo, Mary Antonette Maguslog, Justine Salumbre, Renz Surquia, Joseph
Introduction
Data management is essential in the study of ecology. It relies on statistical methods and supports decision-making, especially in scientific studies that use controlled experiments. To gain information, data are first collected; a datum is a value that a variable assumes. The collected data are then subjected to a test suited to the experiment and to the hypothesis involved. Most statistical tests are designed so that error is minimized, if not completely prevented. The value of these tests rests on the fact that studying a whole population is rarely possible, so a sample must stand in for it. In other words, a statistical test provides a reasoned generalization about a population from a sample and reveals its salient features.

Spreadsheet and statistical software has made data management routine. Programs such as Microsoft Excel, Apple Numbers, Minitab, SPSS, and some advanced open-source spreadsheets for Linux make it easier to carry out a statistical protocol. These programs still have some complexity, however. An understanding of commands and special program functions is required, and since no two programs are built the same way, the commands vary widely. Learning them will nonetheless prove advantageous.

The objectives of the Data Management exercise are: 1) to learn some of the principles and techniques of data management; and 2) to become familiar with the use of the computer and spreadsheet.

Materials and Methodology
The materials used in the exercise were bond paper, graphing paper, pencils, erasers, a scientific calculator, and a personal computer or laptop with spreadsheet software.

Before the statistical tests were performed, a general method of hypothesis testing was followed. First, the null and alternative hypotheses were stated; second, the level of significance was selected; third, the critical value and the rejection region were determined; fourth, the decision rule was stated; fifth, the test statistic was computed; and finally, a decision was made on whether or not to reject the null hypothesis.

In practice exercise I, a Student’s t-test was performed to see whether there is a significant difference in the growth of oat coleoptiles treated with indole acetic acid (IAA) in comparison with untreated controls. The group used Microsoft Excel, with the first two columns labeled Control and IAA.
Formula 1. t-test formula
In practice exercise II, Microsoft Excel was used to plot and print an XY graph of larval growth in Noctua pronuba using the given data. In practice exercise III, Microsoft Excel was also used to plot and print a graph of “Growth in Pisces hallucigenia” based on the equation (Formula 2) $w = aL^b$, where w is weight in grams and is the dependent variable, a is the coefficient of proportionality (0.001), L is length in mm and is the independent variable, and b is 3. In practice exercise IV, the length-weight equation from the previous problem was plotted on a log-log plot. In practice exercise V, a species effort curve was made. This involves taking samples and identifying and counting the species in each sample; the cumulative number of species was then plotted against the number of samples. In practice exercise VI, a histogram was made from the data on the age distribution of male perch in Lake Windermere, England, in 1966. In practice exercise VII, a graph of the population growth of Selenastrum capricornutum was plotted using the exponential growth equation (Formula 3) $N_t = N_0 e^{rt}$. In this graph, population size was taken as the dependent variable and time in days as the independent variable. In practice exercises VIII and IX, ANOVA and the Kruskal-Wallis test were performed.
Results and Discussion
1.
Sample    1     2     3     4     5     6     7     8     9     10
Control   10.1  9.8   10.3  10.2  9.9   10.5  10.7  10.0  10.7  9.8
IAA       11.8  12.7  11.2  13.0  12.9  13.2  13.5  12.6  13.9  13.9
Table 1.1 Data for problem 1

N         1       2       3       4       5       6       7       8       9       10      Total    Mean
Control   10.1    9.8     10.3    10.2    9.9     10.5    10.7    10.0    10.7    9.8     102      10.20
Control²  102.01  96.04   106.09  104.04  98.01   110.25  114.49  100     114.49  96.04   1041.46
IAA       11.8    12.7    11.2    13.0    12.9    13.2    13.5    12.6    13.9    13.9    128.7    12.87
IAA²      139.24  161.29  125.44  169     166.41  174.24  182.25  158.76  193.21  193.21  1663.05
Table 1.2 Squares of Control and IAA with their respective Means

                     Control       IAA
Standard Deviation   0.343187671   0.86158768
Table 1.3 Standard Deviation of Control and IAA
Graph 1.1 Sample vs. Coleoptile length in Control (x-axis: Sample; y-axis: Coleoptile length)
Graph 1.2 Sample vs. Coleoptile Length in IAA (x-axis: Sample; y-axis: Coleoptile length)

Using manual computation of the t-test, the group obtained a result of -8.97, while Microsoft Excel generated 0.37589951; the conclusion was to reject the null hypothesis.
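As a cross-check on the manual and Excel results, the t statistic can be recomputed directly from the data in Table 1.1. The following is a minimal Python sketch, not the group's procedure. It assumes a pooled-variance Student's t-test (Formula 1 is not reproduced in this text), the function name `pooled_t` is illustrative, and the critical value 2.101 is the standard two-tailed table value for α = 0.05 with 18 degrees of freedom.

```python
import math
import statistics

# Coleoptile lengths from Table 1.1
control = [10.1, 9.8, 10.3, 10.2, 9.9, 10.5, 10.7, 10.0, 10.7, 9.8]
iaa = [11.8, 12.7, 11.2, 13.0, 12.9, 13.2, 13.5, 12.6, 13.9, 13.9]


def pooled_t(x, y):
    """Two-sample Student's t statistic with pooled variance (assumed form)."""
    n1, n2 = len(x), len(y)
    m1, m2 = statistics.mean(x), statistics.mean(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)  # n - 1 denominators
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / std_error, n1 + n2 - 2


t_stat, df = pooled_t(control, iaa)
T_CRIT = 2.101  # assumption: two-tailed table value, alpha = 0.05, df = 18

print(f"SD control = {statistics.stdev(control):.6f}")
print(f"SD IAA     = {statistics.stdev(iaa):.6f}")
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
print("reject H0" if abs(t_stat) > T_CRIT else "fail to reject H0")
```

The decision is made by comparing |t| with the critical value, which follows the hypothesis-testing steps listed in the methodology.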
The t-test assesses whether the means of two groups are statistically different from each other, so it is appropriate whenever the means of two groups are compared. The t-test gives the probability that a difference at least as large as the one observed would arise by chance alone. By convention, if this probability is less than 0.05, the difference is called 'significant', that is, unlikely to be caused by chance. The result of the t-test shows that there is a significant difference in growth between coleoptiles treated with IAA and untreated controls.

2.
Instar                  1     2     3     4      5      6      7
Mean body length (mm)   2.52  4.3   6.62  10.35  15.14  23.36  35.90
Table 2.1 Data for problem 2

Graph 2.1 Instar vs. Mean Body Length
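Whether the growth in Table 2.1 is closer to linear or to exponential can be checked numerically by fitting a straight line to body length, and then to the logarithm of body length, against instar. The sketch below is an illustration and not part of the original exercise. The helper `r_squared_and_slope` is a hypothetical name, and ordinary least squares is assumed as the fitting method.

```python
import math

# Data from Table 2.1
instar = [1, 2, 3, 4, 5, 6, 7]
length_mm = [2.52, 4.3, 6.62, 10.35, 15.14, 23.36, 35.90]


def r_squared_and_slope(x, y):
    """Return (R^2, slope) of the least-squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy ** 2 / (sxx * syy), sxy / sxx


# Straight-line fit on the raw lengths (linear growth model)
r2_linear, _ = r_squared_and_slope(instar, length_mm)

# Straight-line fit on ln(length) (exponential growth model)
r2_log, log_slope = r_squared_and_slope(instar, [math.log(v) for v in length_mm])
growth_factor = math.exp(log_slope)  # multiplicative increase per instar

print(f"R^2, linear model:      {r2_linear:.3f}")
print(f"R^2, exponential model: {r2_log:.3f}")
print(f"length multiplies by about {growth_factor:.2f} per instar")
```

A higher R² for the log-transformed fit indicates that a constant proportional increase per instar describes the data better than a constant absolute increase.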
This graph was used to test the null hypothesis that the growth rate is linear and does not change as the caterpillar grows. The graph shows that body length increases with instar, but the growth rate is not linear: it changes as the caterpillar grows. The relationship between instar and body length is approximately exponential, which means that growth that seems slow in the early instars becomes very fast in the later ones.

3.
Length in mm   Weight in grams
50             125
65             274.625
80             512
95             857.375
110            1331
125            1953.125
140            2744
155            3723.875
170            4913
185            6331.625
200            8000
Table 3.1 Data for problem 3

Graph 3.1 Length-Weight Relationship in the growth of Pisces hallucigenia (x-axis: length in mm; y-axis: weight in grams)

4.
Graph 4.1 Log Length and Log Weight Relationship in the growth of Pisces hallucigenia (x-axis: log length in mm; y-axis: log weight in grams)

Log length in mm   Log weight in grams
1.698970004        2.096910013
1.812913357        2.43874007
1.903089987        2.709269961
1.977723605        2.933170816
2.041392685        3.124178055
2.096910013        3.290730039
2.146128036        3.438384107
2.190331698        3.570995095
2.230448921        3.691346764
2.267171728        3.801515185
2.301029996        3.903089987
Table 4.1 Data for problem 4 in log
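The values in Tables 3.1 and 4.1 follow directly from Formula 2 with a = 0.001 and b = 3. The short Python sketch below regenerates them and shows why the log-log plot is a straight line: taking logarithms gives log w = log a + b log L, so the slope is b and the intercept is log a. Variable names are illustrative, and base-10 logarithms are assumed because they match the tabulated values.

```python
import math

A, B = 0.001, 3  # coefficient of proportionality and exponent from Formula 2

lengths_mm = list(range(50, 201, 15))  # 50, 65, ..., 200 mm
weights_g = [A * length ** B for length in lengths_mm]

log_length = [math.log10(length) for length in lengths_mm]
log_weight = [math.log10(weight) for weight in weights_g]

# For an exact power law any two points define the log-log line.
slope = (log_weight[-1] - log_weight[0]) / (log_length[-1] - log_length[0])
intercept = log_weight[0] - slope * log_length[0]

for length, weight, ll, lw in zip(lengths_mm, weights_g, log_length, log_weight):
    print(f"{length:>4} mm  {weight:>9.3f} g  {ll:.9f}  {lw:.9f}")
print(f"slope = {slope:.3f} (b), 10**intercept = {10 ** intercept:.4f} (a)")
```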
5.
Number of samples         1   2   3   5   10  15  20  30  40  50  80  100
Cumulative # of species   6   8   14  19  22  24  25  27  28  28  29  29
Table 5.1 Data for problem 5

Graph 5.1 Species effort curve (x-axis: # of samples; y-axis: cumulative # of species)
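The flattening of the species effort curve can be made explicit by computing how many new species each additional sample contributes between successive sampling efforts in Table 5.1. This is a small illustrative sketch in Python; the variable names are assumptions and the calculation is not part of the original exercise.

```python
# Data from Table 5.1
samples = [1, 2, 3, 5, 10, 15, 20, 30, 40, 50, 80, 100]
cumulative_species = [6, 8, 14, 19, 22, 24, 25, 27, 28, 28, 29, 29]

# New species gained per additional sample over each interval
rates = []
for i in range(1, len(samples)):
    extra_samples = samples[i] - samples[i - 1]
    new_species = cumulative_species[i] - cumulative_species[i - 1]
    rates.append(new_species / extra_samples)

for start, end, rate in zip(samples, samples[1:], rates):
    print(f"samples {start:>3} to {end:>3}: {rate:.2f} new species per sample")
```

The rate falls toward zero as sampling effort increases, which is the plateau seen in the curve.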
In the species effort curve, the group concluded the following: 1) the most common species are found first; 2) the most dominant species will control the whole population; 3) intensive sampling is necessary to approach the true number of species; and 4) the curve depends primarily on two factors: the community or area sampled and the method of trapping.

6.
Age (years)            2   3   4   5   6   7   8   9   10  11  12
% of male perch pop.   0   2   0   2   9   60  6   12  3   6   0
Table 6.1 Data for problem 6

Graph 6.1 Histogram of Age Distribution of Male Perch in Lake Windermere, England (x-axis: Age in Years; y-axis: Percentage of Male Perch Population)
Male perch are aged using scales, otoliths, spines, and opercles. The histogram shows that 7-year-old male perch were the most abundant in Lake Windermere, England, in 1966. This suggests that many of the male perch hatched seven years earlier survived, and that a widespread fish kill affected the male perch hatched 2, 4, and 12 years earlier, since none of them survived. The study used male rather than female perch because male perch are more stable than female perch. The age of male perch is also easier to identify because they exhibit a standard length, weight, and markings at a specific age. A histogram was used in this study instead of a pie chart because a histogram can display zero values, whereas a pie chart cannot. The zero values are significant in this study because they carry information, as discussed above.
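The point about zero values can be illustrated with a simple text histogram of Table 6.1, in which the empty age classes remain visible as blank bars. This Python sketch is only an illustration; the scale of one `#` per 2% is an arbitrary assumption.

```python
# Data from Table 6.1
ages = list(range(2, 13))  # 2 to 12 years
percent = [0, 2, 0, 2, 9, 60, 6, 12, 3, 6, 0]

# One '#' per 2 percentage points (arbitrary scale); zero classes stay visible
for age, pct in zip(ages, percent):
    print(f"{age:>2} yr | {'#' * (pct // 2)} {pct}%")

modal_age = ages[percent.index(max(percent))]
empty_ages = [age for age, pct in zip(ages, percent) if pct == 0]

print(f"most abundant age class: {modal_age} years")
print(f"age classes with no fish: {empty_ages}")
```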
7.
$N_0 = (5382)(2.7)^{(1.5)(0)} = 5382$ cells/mL
$N_1 = (5382)(2.7)^{(1.5)(1)} = 23878$ cells/mL
$N_2 = (5382)(2.7)^{(1.5)(2)} = 105934$ cells/mL
$N_3 = (5382)(2.7)^{(1.5)(3)} = 469981$ cells/mL

Time (days)                  0      1       2        3
Population size (cells/mL)   5382   23878   105934   469981
Table 7.1 Data for problem 7

Graph 7.1 Population Growth of Selenastrum capricornutum (x-axis: Time in Days; y-axis: Population Size)
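The values in Table 7.1 can be regenerated with a few lines of Python. The sketch below assumes the exponential growth model with initial size 5382 cells/mL and growth rate 1.5 per day, as used in the computations above. The function name `population` and its `base` parameter are illustrative; the `base` parameter is there only to show the effect of rounding e to 2.7.

```python
import math

N0 = 5382  # initial population size, cells/mL
R = 1.5    # growth rate per day, as used in the hand computation


def population(t, base=math.e):
    """Population size after t days under exponential growth."""
    return N0 * base ** (R * t)


print("day  base 2.7 (as in Table 7.1)  base e")
for day in range(4):
    rounded_e = round(population(day, base=2.7))
    exact_e = round(population(day))
    print(f"{day:>3}  {rounded_e:>26}  {exact_e:>6}")
```

Using the full value of e instead of 2.7 gives somewhat larger populations on later days, because the rounding error compounds with the exponent.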
8. ANALYZE THE GIVEN DATA USING ANOVA
A   78  88  87  88  83  82  81  80  80  89
B   78  78  83  81  78  81  81  82  76  76
C   79  73  79  75  77  78  80  78  83  84
D   77  69  75  70  74  83  80  75  76  75
Table 8.1. Data for problem set 8

    One-way ANOVA: C1, C2, C3, C4

    Source  DF     SS     MS     F      P
    Factor   3  341.9  114.0  9.01  0.000
    Error   36  455.6   12.7
    Total   39  797.5

    S = 3.557   R-Sq = 42.87%   R-Sq(adj) = 38.11%

                              Individual 95% CIs For Mean Based on
                              Pooled StDev
    Level   N    Mean  StDev  -+---------+---------+---------+--------
    C1     10  83.600  4.033                        (------*-----)
    C2     10  79.400  2.503            (------*-----)
    C3     10  78.600  3.307          (------*-----)
    C4     10  75.400  4.142  (-----*------)
                              -+---------+---------+---------+--------
                             73.5      77.0      80.5      84.0

    Pooled StDev = 3.557
Fig. 8.1. Minitab-generated analysis of variance

H0                      µ1 = µ2 = µ3 = µ4
HA                      at least one µi differs from the others
Level of Significance   α = 0.05
Critical Value
Conclusion
Table 8.2. Summary of results
Analysis of Variance (ANOVA) is a statistical technique that uses the F-test to test a hypothesis concerning the means of more than two populations. In an experiment, observations of the same kind may still vary from one situation to another, and in such cases ANOVA is used as an estimating tool. ANOVA accounts for the total variation and partitions it among the factors of interest to the observer or experimenter. ANOVA rests on assumptions similar to those of the t-test and the F statistic. The basic assumption that must first be satisfied is that the data are normally distributed with a common variance; otherwise, another test, such as the Kruskal-Wallis nonparametric test, is performed.
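The partitioning of variation described above can be reproduced by hand from the data in Table 8.1. The following Python sketch computes the between-group (Factor) and within-group (Error) sums of squares and the F statistic. It is a minimal illustration of one-way ANOVA using only the standard library, not the Minitab procedure itself, and the variable names are assumptions.

```python
import statistics

# Data from Table 8.1
groups = {
    "A": [78, 88, 87, 88, 83, 82, 81, 80, 80, 89],
    "B": [78, 78, 83, 81, 78, 81, 81, 82, 76, 76],
    "C": [79, 73, 79, 75, 77, 78, 80, 78, 83, 84],
    "D": [77, 69, 75, 70, 74, 83, 80, 75, 76, 75],
}

all_values = [v for values in groups.values() for v in values]
grand_mean = statistics.mean(all_values)

# Between-group variation: group means around the grand mean
ss_between = sum(
    len(values) * (statistics.mean(values) - grand_mean) ** 2
    for values in groups.values()
)
# Within-group variation: observations around their own group mean
ss_within = sum(
    (v - statistics.mean(values)) ** 2
    for values in groups.values()
    for v in values
)

df_between = len(groups) - 1
df_within = len(all_values) - len(groups)
f_stat = (ss_between / df_between) / (ss_within / df_within)

print(f"Factor: DF = {df_between}, SS = {ss_between:.1f}")
print(f"Error:  DF = {df_within}, SS = {ss_within:.1f}")
print(f"F = {f_stat:.2f}")
```

The sums of squares, degrees of freedom, and F value can be compared line by line with the Minitab output in Fig. 8.1.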
9. ANALYZE THE GIVEN DATA USING KRUSKAL-WALLIS TEST
A   78  88  87  88  83  82  81  80  80  89
B   78  78  83  81  78  81  81  82  76  76
C   79  73  79  75  77  78  80  78  83  84
D   77  69  75  70  74  83  80  75  76  75
Table 9.1. Data for problem set 9
A    Rank    B    Rank    C    Rank    D    Rank
78   16.5    78   16.5    79   20.5    77   12.5
88   38.5    78   16.5    73   3       69   1
87   37      83   33.5    79   20.5    75   6.5
88   38.5    81   27.5    75   6.5     70   2
83   33.5    78   16.5    77   12.5    74   4
82   30.5    81   27.5    78   16.5    83   33.5
81   27.5    81   27.5    80   23.5    80   23.5
80   23.5    82   30.5    78   16.5    75   6.5
80   23.5    76   10      83   33.5    76   10
89   40      76   10      84   36      75   6.5
Table 9.2 Data and ranking
$\sum T_i^2 / n_i = 18909.40$
H = 15.36

H0                      µ1 = µ2 = µ3 = µ4
HA                      at least one µi differs from the others
Level of Significance   α = 0.05
Critical Value          7.81
Conclusion              Reject null hypothesis
Table 9.3. Summary of results.
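The ranking and the H statistic reported above can be recomputed from the data in Table 9.1. The Python sketch below assigns average ranks to tied values, sums the ranks per group, and applies the Kruskal-Wallis formula without a tie correction, which is an assumption made here because it reproduces the reported H. The helper name `average_ranks` is illustrative, and the critical value 7.81 is the one given in the summary table.

```python
# Data from Table 9.1
groups = {
    "A": [78, 88, 87, 88, 83, 82, 81, 80, 80, 89],
    "B": [78, 78, 83, 81, 78, 81, 81, 82, 76, 76],
    "C": [79, 73, 79, 75, 77, 78, 80, 78, 83, 84],
    "D": [77, 69, 75, 70, 74, 83, 80, 75, 76, 75],
}


def average_ranks(values):
    """Map each distinct value to its rank; ties share the mean rank."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        rank_of[ordered[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    return rank_of


pooled = [v for values in groups.values() for v in values]
rank_of = average_ranks(pooled)
n_total = len(pooled)

# T_i: sum of ranks in each group
rank_sums = {name: sum(rank_of[v] for v in values) for name, values in groups.items()}
sum_term = sum(total ** 2 / len(groups[name]) for name, total in rank_sums.items())

# H without tie correction (assumption; matches the reported value)
h_stat = 12 / (n_total * (n_total + 1)) * sum_term - 3 * (n_total + 1)

CHI2_CRIT = 7.81  # critical value from the summary table (alpha = 0.05, df = 3)
print("rank sums:", rank_sums)
print(f"sum of T_i^2 / n_i = {sum_term:.2f}")
print(f"H = {h_stat:.2f}")
print("reject H0" if h_stat > CHI2_CRIT else "fail to reject H0")
```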
The Kruskal-Wallis test is a nonparametric test that serves as an alternative to ANOVA. It is used to detect differences in location among more than two population distributions based on independent random sampling. In this respect the Kruskal-Wallis test is similar to ANOVA, except that it is appropriate only when the data cannot be assumed to have a normal distribution or when a problem with heteroscedasticity arises. The test is commonly performed when there is one attribute variable and one measurement variable, and the measurement variable does not meet the normality assumption of ANOVA. If the original data set consists of one attribute variable and one ranked variable, an ANOVA cannot be performed.

Conclusion
Data management is necessary for the interpretation of data. There are many methods of analyzing data, such as the Student’s t-test, Analysis of Variance, and the Kruskal-Wallis test. There are many criteria against which data can be checked so that an appropriate test can be chosen. There are also various ways of transforming the data without compromising the integrity of the collected data, and many ways of representing results, such as a histogram, a scatter plot, or a simple line graph. Many software packages can be used for statistical data analysis. One of the most common is Microsoft Excel. Programs such as SPSS and Minitab are more sophisticated in that they are dedicated to statistics.