Analysis of Panel Data : Multilevel Panel, Growth, and Latent Curve Models Sessions 6-7: 17-18 June 2014 Steven Finkel, PhD Daniel Wallace Professor of Political Science University of Pittsburgh USA
• A final way to deal with temporal dependence is to conceive of Y as a function of Time itself. That is, Y at time t is a function of TIME, rather than of Yt-1 or of an error term εt (or U) that depends on εt-1. The relationship between Yt and Yt-1 is produced by the progression or “growth” (either positive or negative) of the unit through time.
• Increasingly popular in many areas of the social sciences, e.g. psychology, education, biostatistics:
– Children’s acquisition of vocabulary or motor skills as a function of time
– Delinquency behaviors in adolescents and young adulthood
– Sexual behaviors post-puberty
– Recovery from depression, post-therapeutic treatment
• In all of these examples, we can talk about growth over time, and we could model the outcome for each individual as a positive or negative function of time itself. As months past puberty increase, amount of sexual activity increases, etc. This is a very intuitive and very direct way to talk about change. Analysis of Panel Data, University of Gothenburg, 15-18 June 2015
• Growth models will be our window into the general area of multilevel panel models, the third major analytic framework for panel analysis.
• NOTE: The dependent variable in growth (multilevel) models need not be continuous, though this is how we will introduce the topic.
• In the examples above, in fact, the DV might be conceived (no pun intended) as the occurrence of sexual behaviors of some kind, or in the second example, as a frequency or count of the number of delinquent acts at a given point in time. In the last example, we can model the “growth” in some ordinal measure of non-depression (so to speak), i.e., recovery, as a function of time since a treatment for an experimental and control group.
• We can model all of these kinds of growth processes in non-continuous dependent variables within the framework of “Multilevel Generalized Linear Mixed Models” (or its SEM equivalent, “Generalized SEM”), which we will cover in the last session. For now, we will stick to continuous outcomes to talk about the basic ideas of the approach and basic modeling and estimation issues (which are complicated enough in the linear case!).
Growth Models in Political Science
• Relatively few applications so far, but a promising method since many DVs in our discipline also involve growth:
– Plutzer, “Becoming a Habitual Voter,” 2001 American Political Science Review: Models growth in the likelihood of voting among people entering the electorate. The model includes each election after the individual becomes eligible, so we can talk of a developmental process of voting. Models influences on the initial vote decision and then the likelihood of subsequent voting as a function of time and other variables.
– Finkel, Pérez-Liñán, Seligson, “The Effects of US Foreign Assistance on Democracy Building, 1990-2003,” 2007 World Politics: Models the growth in democracy among countries world-wide from 1990-2003. Using Polity IV and Freedom House measures, it plots trajectories of democratic growth as a function of time, and then models the factors that determine the initial 1990 starting point and then the growth rates over time.
– Paxton, Hughes and Painter, 2010 European Journal of Political Research: Models growth in women’s representation 1975-2000 in 110 countries as a function of the existence of gender quotas, overall growth in democracy, and other factors that vary over countries and/or over time.
• Possible applications to other areas of political science: – Growth in system or regime support among individuals, either in new or advanced democracies – Growth in party identification in new democracies – Growth in party support during a political campaign – Growth in budgets, revenues, political violence, trade, democracy, human rights violations, etc. etc.
The Basic Growth Model • We can start to model this idea of growth very simply with:
(1)  $Y_{it} = \beta_0 + \beta_1\,\mathrm{Time}_i + \varepsilon_{it}$
• which models Y for a particular unit at a particular point in time as a function of TIME. The regression coefficient β1 is interpreted in terms of the unit of time (years, months, days, etc.) chosen for the given model.
• Note on subscripts: Time formally has an i subscript because different individuals may be observed at different waves. In a balanced design you would not need the i because everyone would have the same values for Time at all waves. You could also add a t to the subscript of Time, but this seems superfluous, since the value of Time at any t is already t. I will omit the i subscript from here on, but it is implicitly there.
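A minimal sketch of equation (1), assuming simulated data: every unit shares the same β0 and β1, and pooled OLS recovers them. The parameter values (β0 = 2, β1 = 0.5), the sample sizes, and the function names are illustrative assumptions, not part of the lecture.

```python
import random

def simulate_pooled_growth(n_units=200, n_waves=5, beta0=2.0, beta1=0.5,
                           sigma_eps=1.0, seed=42):
    """Simulate Y_it = beta0 + beta1*Time + eps_it with common parameters."""
    rng = random.Random(seed)
    rows = []
    for i in range(n_units):
        for t in range(n_waves):
            y = beta0 + beta1 * t + rng.gauss(0.0, sigma_eps)
            rows.append((i, t, y))
    return rows

def ols_line(times, ys):
    """Closed-form OLS intercept and slope for a single predictor."""
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
    sxx = sum((t - mean_t) ** 2 for t in times)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return intercept, slope

rows = simulate_pooled_growth()
b0_hat, b1_hat = ols_line([r[1] for r in rows], [r[2] for r in rows])
print(f"estimated beta0 = {b0_hat:.3f}, estimated beta1 = {b1_hat:.3f}")
```

With Time coded 0, 1, 2, ..., the intercept is the expected Y at the first wave and the slope is the expected change per wave.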
• But this model only contains one of the two main ideas from this approach. In fact it is a model that applies equally to every unit, as every case in the sample (and population) is assumed to be governed by the same parameters (β0 and β1) in determining Y. This is not realistic in the panel situation, due to…
• HETEROGENEITY of the units! This is a big reason why we do panels in the first place, and it has been the focus of much of our attention this week.
– We need at least to take into account the fact that observations generally will be higher or lower on Y due to unmeasured stable factors (previously labeled Ui); alternatively,
– We can also say that we include unit-level heterogeneity to model the fact that the *same* unit is being observed over time, and so the observations are not independent. We model the clustered nature of the data so as to make correct inferences (i.e. correct standard errors of estimated parameters) and to model possible differences in the outcomes and the effects of variables on the outcomes over time.
How to Incorporate Heterogeneity into the Growth Model? • One way: different units are likely to have different “starting” points or intercepts as well, exactly as in the econometric models we’ve looked at with fixed or random effects in the non-growth model context.
• Some cases might start higher or lower than others, and stay that way regardless of the overall effect of TIME. In the FE tradition, we have a dummy variable that stands for this unobserved unit-specific effect, while in the RE tradition, we assume a normal distribution for the unit effects, and a random draw from this distribution makes a given case higher or lower at all times from the common intercept value.
(2)  $Y_{it} = \beta_{0i} + \beta_1\,\mathrm{Time} + \varepsilon_{it}$  and  $\beta_{0i} = \beta_{00} + U_i$
• Where β0i is the intercept for a given case, separate from the influence of TIME, and is composed of the common intercept plus the individual unit effect • This is old news to us!!!!
• But growth models, and the more general framework that we will deal with in this unit, also allow another kind of heterogeneity: unit-level differences in the slopes or rates of change over time. That is, in growth models we allow for possible differences in β1 across units, so that some units may grow at a faster rate than others.
• In terms of the above examples: Some children pick up vocabulary at faster rates than others, some countries increase in democracy at faster rates than others, some adolescents develop delinquent behaviors more rapidly over time than others, some people change more strongly in their support for a candidate during a particular campaign than others. All of these processes necessitate heterogeneity in the TIME effect and not only in the INTERCEPT effect.
(3)  $Y_{it} = \beta_{0i} + \beta_{1i}\,\mathrm{Time} + \varepsilon_{it}$
• where each case now has its own intercept as well as its own slope for growth over time. Each unit has its own “growth trajectory.”
• This is the basic structure of the growth model: individual units start at different points and change at different rates on some dependent variable of interest. This structure accounts for unit-level clustering with the different intercepts, which push cases up or down in general compared to other cases, and allows different rates of change from case to case as well.
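To make the idea of unit-specific trajectories in equation (3) concrete, the sketch below fits a separate OLS line to each unit in a tiny made-up panel. The unit names and data values are hypothetical, and unit-by-unit OLS is used here only as a descriptive device; the multilevel model discussed in these notes estimates all trajectories jointly.

```python
def fit_line(points):
    """OLS intercept and slope for one unit; points is a list of (time, y)."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    slope = sxy / sxx
    return mean_y - slope * mean_t, slope

def fit_unit_trajectories(panel):
    """Return {unit: (intercept_i, slope_i)} for a dict of unit -> [(time, y)]."""
    return {unit: fit_line(points) for unit, points in panel.items()}

# Hypothetical panel: one unit starts low and grows, the other starts high and declines.
panel = {
    "country_A": [(0, 1.0), (1, 2.0), (2, 3.0)],
    "country_B": [(0, 5.0), (1, 4.0), (2, 3.0)],
}
fits = fit_unit_trajectories(panel)
for unit, (b0i, b1i) in fits.items():
    print(f"{unit}: intercept = {b0i:.2f}, slope = {b1i:.2f}")
```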
Examples: Finkel/ Pérez-Liñán/Seligson (2007) on growth in democracy
Modeling Unit-Specific Growth Trajectories
• Next step: use the pooled data for all cases to model the differential growth trajectories for different units, that is, to determine what unit-level factors predict the units’ rates of change and their “starting points” or growth trajectory intercepts.
• Examples:
– Children with educated parents learn vocabulary more quickly than children with non-educated parents
– Countries with higher levels of ethnic fractionalization democratize less quickly than others
– Patients who were treated with one type of therapy recovered from depression more quickly than patients treated with another type
– Children who come from divorced households are more likely to show increases in delinquent behavior over the course of adolescence and early adulthood than children from stable households
• So we model change within units over time, and then account for the rates of change across units with a series of independent variables.
Key Features of the Growth Model • Aside from focusing on modeling unit-specific trajectories, two aspects are of most interest: – It is a multilevel or a hierarchical model. The data comprise a hierarchical or “nested” structure, with analysis and modeling taking place at different levels of the hierarchy; and – It is a “mixed” model -- with some elaboration of equation (3) above, it will be seen as containing both “fixed” and “random” effects (though the former term is used in a slightly different sense than what we have talked about so far)
• So the growth model is one type of multilevel, or hierarchical mixed model as applied to longitudinal data – There are also multilevel or mixed models for non-longitudinal data – There are also multilevel or mixed longitudinal models that are not explicitly growth models – In fact, all of the models we’ve looked at can fit into the multilevel framework!
• The HIERARCHICAL structure of panel data: we conceive of the “wave of observation” as comprising the “lowest level” of the hierarchy, the observations of a given individual i at occasions t1, t2, t3, etc. • These observations over time are then nested within units – that is, the level 1 observations for unit 1 are nested within “unit 1,” the level 1 observations for unit 2 are nested within “unit 2,” and so on. • The entire sample consists of N units at Level 2, with T observations nested within each unit at Level 1. Each Level 2 unit comprises a cluster of non-independent observations at Level 1, with the non-independent observations in the longitudinal context being the panel waves where each unit is observed. • We build explanatory models at both levels of the hierarchy
The Growth Model as a “Multilevel” Model
• Two different levels of analysis
• Level 1: Change at the Individual Level, that is, Intra-Individual Growth Over Time as represented by equation (3):
(3)  $Y_{it} = \beta_{0i} + \beta_{1i}\,\mathrm{Time} + \varepsilon_{it}$
• Level 1 is the occasion or “wave” of measurement for all units
• Each unit has its own Level 1 intercept and its own slope
• By the way: the slope is linear in this model but need not be limited to the linear case. We could have quadratic or other polynomial patterns of change over time (e.g. political science faculty productivity as a function of time: up before tenure, dip post-tenure, up once the kids are out of the house…)
• Level 2: Unit-Level Factors that Account for Inter-Individual Differences in the Level 1 Parameters.
– At Level 2 we have unit-level variables that determine the β0 and the β1 parameters at Level 1. What characteristics of individual units cause intercepts to be higher or lower? What characteristics of individual units cause slopes to be higher or lower?
• We can write out one possible Level 2 model as:
(4a)  $\beta_{0i} = \beta_{00} + \beta_{01} X_{1i} + \beta_{02} X_{2i}$
(4b)  $\beta_{1i} = \beta_{10} + \beta_{11} X_{1i} + \beta_{12} X_{2i}$
– X1 and X2 are Level 2 variables predicting: in (4a), why the intercept β0 at Level 1 in equation (3) is higher or lower for some Level 2 units than others, and in (4b), why the slope β1 at Level 1 in equation (3) is higher or lower for some Level 2 units than others.
• Important point: We conceive of the causal processes as occurring at levels that are nested in some hierarchical fashion, and we model processes at each level of the hierarchy.
• In the youth delinquency longitudinal example: X1 could be a dummy for parental divorce, and X2 could be how often the child moved before age 13. These would be the “Level 2 explanatory variables”.
• Hypotheses:
– Children from divorced households are more likely to start adolescence at higher levels of delinquency (prediction: a positive effect of β01 in 4a)
– Children who moved more often before age 13 are more likely to start adolescence at higher levels of delinquency (prediction: a positive effect of β02 in 4a)
– Children from divorced households increase in delinquent behaviors over time at a faster rate (prediction: a positive effect of β11 in 4b); and
– Children who moved more often before age 13 increase in delinquent behaviors over time at a faster rate (prediction: a positive effect of β12 in 4b)
• NOTE: The variables that are used to predict the intercept value across units need not be the same variables that are used to predict the slope. Theory should be the guide as always (though the default is usually including them in both Level 2 equations).
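As a worked illustration of the Level 2 equations (4a) and (4b), the sketch below computes the intercept and slope implied for two hypothetical children. All coefficient values are invented for the example, chosen only so that the four hypothesized effects are positive; they are not estimates from any study.

```python
def level2_parameters(x1, x2, coef):
    """Level 1 intercept and slope implied by the Level 2 equations (4a)/(4b)."""
    b0i = coef["b00"] + coef["b01"] * x1 + coef["b02"] * x2  # (4a) intercept
    b1i = coef["b10"] + coef["b11"] * x1 + coef["b12"] * x2  # (4b) slope
    return b0i, b1i

# Hypothetical coefficient values (assumptions for illustration only).
coef = {"b00": 1.0, "b01": 0.5, "b02": 0.25,
        "b10": 0.25, "b11": 0.125, "b12": 0.0625}

# x1 = 1 if parents divorced, x2 = number of moves before age 13.
baseline = level2_parameters(x1=0, x2=0, coef=coef)
divorced_two_moves = level2_parameters(x1=1, x2=2, coef=coef)
print("baseline intercept, slope:", baseline)
print("divorced, two moves intercept, slope:", divorced_two_moves)
```

Under these made-up values, the second child starts higher (intercept 2.0 versus 1.0) and grows faster (slope 0.5 versus 0.25), which is the pattern the four hypotheses describe.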
Multilevel Models are Ubiquitous in Social Sciences, Panels or No Panels!!
• In education research, children (Level 1) are nested within schools (Level 2), nested within districts (Level 3).
• Survey research: respondents (Level 1) are (hypothetically) nested within blocks (Level 2), which are nested within Primary Sampling Units (Level 3), which are nested within, for example, U.S. states (Level 4).
• Trend analysis: respondents (Level 1) are nested within Time (Level 2), since in this kind of design different individuals from the same aggregate unit (country, state) are interviewed at multiple points in time. This is the reverse of panel data, where time of measurement (Level 1) is nested within individuals (Level 2). Example: Pooled NES Election data, 1948-2012.
• Cross-national survey research: respondents (Level 1) are nested within countries (Level 2), and perhaps Time (Level 3) as well. Examples: World Values Surveys, Barometers.
• In panel analysis, we can also extend the levels. The Ames/Renno/Baker (2005) Brazilian election data are three-level: time (Level 1) nested within individuals (Level 2) nested within neighborhoods (Level 3).
• Whenever data structures are nested or hierarchical, you can use the framework we are discussing now!
The Growth Model as a “Mixed” Model
• Other key aspect of growth models: the prediction of Level 1 coefficients may have random components, such that the Level 2 variables predict them imperfectly, with some amount of unexplained random variation.
• In the Level 2 models so far in equation (4) there were no error terms; in keeping with the conceptual discussion above, we can also include a random disturbance term in the Level 2 equations. This would mean that we can account for some of the heterogeneity in the intercepts and slopes with the independent variables specified at Level 2, but that there is still some random or unexplained variation that remains. This would be represented by a “random effect” in the Level 2 equations capturing the unexplained, random variation.
• We can thus modify the Level 2 equations:
(5a)  $\beta_{0i} = \beta_{00} + \beta_{01} X_{1i} + \beta_{02} X_{2i} + \zeta_{0i}$
(5b)  $\beta_{1i} = \beta_{10} + \beta_{11} X_{1i} + \beta_{12} X_{2i} + \zeta_{1i}$
where ζ0i is the random portion of the Level 1 intercept that is unexplained by Level 2 variables; and ζ1i is the random portion of the Level 1 slope that is unexplained by Level 2 variables.
• So the growth model is a kind of random effects model, with random effects for both intercepts and slopes (here, the slope for the variable “TIME”). So it is what we have dealt with before in terms of a random intercept, along with a new random effect term in the slope as well.
• NOTE: We could also simply put in a dummy variable for each case in the Level 2 equations and turn it into a “fixed effect growth model,” but this is almost never done.
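A small simulation may help fix ideas about equation (5). The sketch below draws unit-level intercepts and slopes from a simplified version of (5a) and (5b) with a single binary Level 2 variable X1, and then checks that the average difference between groups is close to β01 and β11. The parameter values are invented, X2 is omitted, and ζ0i and ζ1i are drawn independently for simplicity, whereas the model in general allows them to covary.

```python
import random

def simulate_level2(n_units=2000, b00=1.0, b01=0.5, b10=0.2, b11=0.3,
                    sd_zeta0=0.5, sd_zeta1=0.3, seed=7):
    """Draw (x1, b0i, b1i) for each unit from a one-predictor version of (5)."""
    rng = random.Random(seed)
    units = []
    for i in range(n_units):
        x1 = i % 2                        # alternate 0/1 so groups are balanced
        zeta0 = rng.gauss(0.0, sd_zeta0)  # unexplained part of the intercept
        zeta1 = rng.gauss(0.0, sd_zeta1)  # unexplained part of the slope
        units.append({"x1": x1,
                      "b0i": b00 + b01 * x1 + zeta0,
                      "b1i": b10 + b11 * x1 + zeta1})
    return units

def mean(values):
    return sum(values) / len(values)

units = simulate_level2()
intercept_gap = (mean([u["b0i"] for u in units if u["x1"] == 1])
                 - mean([u["b0i"] for u in units if u["x1"] == 0]))
slope_gap = (mean([u["b1i"] for u in units if u["x1"] == 1])
             - mean([u["b1i"] for u in units if u["x1"] == 0]))
print(f"intercept gap = {intercept_gap:.3f} (true b01 = 0.5)")
print(f"slope gap     = {slope_gap:.3f} (true b11 = 0.3)")
```

Individual units still differ within each group because of ζ0i and ζ1i; the Level 2 variable explains only the systematic part of the difference.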
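• Why a “mixed” model? We can see this by plugging equation (5) into the Level 1 growth equation (3):
(3)  $Y_{it} = \beta_{0i} + \beta_{1i}\,\mathrm{Time}_i + \varepsilon_{it}$
(6)  $Y_{it} = \underbrace{(\beta_{00} + \beta_{01} X_{1i} + \beta_{02} X_{2i} + \zeta_{0i})}_{\beta_{0i}} + \underbrace{(\beta_{10} + \beta_{11} X_{1i} + \beta_{12} X_{2i} + \zeta_{1i})}_{\beta_{1i}} \cdot \mathrm{Time}_i + \varepsilon_{it}$
• And rearranging to yield:
(7)  $Y_{it} = \underbrace{\beta_{00} + \beta_{01} X_{1i} + \beta_{02} X_{2i} + \beta_{10}\,\mathrm{Time}_i + \beta_{11} X_{1i}\,\mathrm{Time}_i + \beta_{12} X_{2i}\,\mathrm{Time}_i}_{\text{FIXED PART OF THE MODEL}} + \underbrace{(\zeta_{0i} + \zeta_{1i}\,\mathrm{Time}_i + \varepsilon_{it})}_{\text{RANDOM PART}}$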
• We call all the coefficients before the parentheses in (7) “FIXED EFFECTS,” and we call all the terms within the parentheses “RANDOM EFFECTS.” The model is thus “MIXED” in the sense of containing both fixed and random effects.
• FIXED effects are those coefficients that do not vary across individuals: everyone in the sample gets the same β. Fixed effects are the “nonstochastic” part of the model. This is a slightly different sense of fixed effects from what we have discussed so far in Unit 2 of the course, but only slightly. If we had N-1 dummy variables to represent the unit effect in equation (7), we would simply have added N-1 “fixed effects,” because everyone in the sample/population still gets the same β (though only one dummy variable is non-zero per case). So the fixed effects models for the Unit Effect (Ui) that we considered earlier are one type of “FIXED EFFECT” in the more general sense we are considering here.
• RANDOM effects are those that do vary across individuals, as can be seen from the i subscript associated with the ζ and ε terms in equation (7). The random effects represent the stochastic, unexplained part of the model, the part that differs by a random amount for each individual case or unit. In this model there are three random effects: a random part of the unit’s intercept; a random part of the slope for the TIME variable for each unit; and a random idiosyncratic error at each point in time.
• We can thus describe the growth model as a Multilevel Mixed Model with Random Intercepts and Random Time Trends.
• It is very useful to get comfortable with both ways of looking at the model: one way with separate equations for Level 1 and Level 2 processes, as in (3) and (5), and the other way as the full MIXED equation of (7). They are identical and you can move back and forth between the two kinds of specifications. In some software programs (e.g. HLM), you program the two level equations separately, while in other software (e.g. STATA, SPSS, SAS), you program the MIXED model directly.
Separate Level 1 and Level 2 Equations:
(3)  $Y_{it} = \beta_{0i} + \beta_{1i}\,\mathrm{Time} + \varepsilon_{it}$
(5a)  $\beta_{0i} = \beta_{00} + \beta_{01} X_{1i} + \beta_{02} X_{2i} + \zeta_{0i}$
(5b)  $\beta_{1i} = \beta_{10} + \beta_{11} X_{1i} + \beta_{12} X_{2i} + \zeta_{1i}$
Full “Mixed” Model:
(7)  $Y_{it} = \beta_{00} + \beta_{01} X_{1i} + \beta_{02} X_{2i} + \beta_{10}\,\mathrm{Time}_i + \beta_{11} X_{1i}\,\mathrm{Time}_i + \beta_{12} X_{2i}\,\mathrm{Time}_i + (\zeta_{0i} + \zeta_{1i}\,\mathrm{Time}_i + \varepsilon_{it})$
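The claim that the two specifications are identical can be checked numerically. The sketch below computes Y once from the separate Level 1 and Level 2 equations, (3) with (5), and once from the mixed equation (7), for many random inputs. The coefficient values and function names are hypothetical and chosen only for the check.

```python
import random

def y_two_level(time, x1, x2, beta, zeta0, zeta1, eps):
    """Y from the Level 2 equations (5a), (5b) plugged into Level 1 equation (3)."""
    b0i = beta["b00"] + beta["b01"] * x1 + beta["b02"] * x2 + zeta0  # (5a)
    b1i = beta["b10"] + beta["b11"] * x1 + beta["b12"] * x2 + zeta1  # (5b)
    return b0i + b1i * time + eps                                    # (3)

def y_mixed(time, x1, x2, beta, zeta0, zeta1, eps):
    """Y from the mixed equation (7): fixed part plus random part."""
    fixed = (beta["b00"] + beta["b01"] * x1 + beta["b02"] * x2
             + beta["b10"] * time
             + beta["b11"] * x1 * time + beta["b12"] * x2 * time)
    random_part = zeta0 + zeta1 * time + eps
    return fixed + random_part

# Hypothetical coefficients, used only to compare the two formulations.
beta = {"b00": 1.0, "b01": 0.5, "b02": 0.25,
        "b10": 0.2, "b11": 0.1, "b12": 0.05}

rng = random.Random(0)
max_diff = 0.0
for _ in range(1000):
    args = (rng.randint(0, 5), rng.randint(0, 1), rng.randint(0, 4), beta,
            rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1))
    max_diff = max(max_diff, abs(y_two_level(*args) - y_mixed(*args)))
print(f"largest difference between the two formulations: {max_diff:.2e}")
```

Any difference is floating-point rounding only, since (7) is just an algebraic rearrangement of (3) and (5).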
• Hierarchical/Multi-Level panel models don’t *need* a time trend. You can still have randomly varying slopes for *any* independent variable that varies over time.
• Example: Democracy as a function of USAID Democracy Assistance at a given point in time
– Level 1:
(8)  $Y_{it} = \beta_{0i} + \beta_{1i} X_{1it} + \varepsilon_{it}$
– Level 2:
(9)  $\beta_{0i} = \beta_{00} + \beta_{01} Z_{1} + \zeta_{0i}$
     $\beta_{1i} = \beta_{10} + \beta_{11} Z_{1} + \zeta_{1i}$
– At Level 2, we think that the country’s level (intercept) of democracy, and the effect of AID on democracy in that country, will depend on the country’s Human Development (Z1) as measured, for example, by the UNDP index. So we might hypothesize Human Development to positively affect the intercept and (perhaps) negatively affect the slope, such that AID “works” better in tougher economic/social contexts.
– This is called a “Random Coefficient Model” (RCM), where the effect of any independent variable (X) may be determined by both fixed and random components.
• We can also combine this RCM model with a growth model by including TIME as an additional independent variable:
(10) Level 1:
     $Y_{it} = \beta_{0i} + \beta_{1i}\,\mathrm{Time}_i + \beta_{2i} X_{1it} + \varepsilon_{it}$
(11) Level 2:
     $\beta_{0i} = \beta_{00} + \beta_{01} Z_{1} + \zeta_{0i}$
     $\beta_{1i} = \beta_{10} + \beta_{11} Z_{1} + \zeta_{1i}$
     $\beta_{2i} = \beta_{20} + \beta_{21} Z_{1} + \zeta_{2i}$
• So we have fixed and random components for the intercept, fixed and random components for the time trend slope, and fixed and random components for the slope of the AID variable at Level 1. In this model we would call AID (X) a “time-varying covariate.”
• Note: The effects of time-varying covariates need not have a random component associated with them, but they may. This is your modeling choice. The same thing goes for the effect of Time, but the standard growth model does have a random component for the time trends at Level 2.
• YOU DON’T NEED ANY INDEPENDENT VARIABLES AT LEVEL 2 AT ALL. AS LONG AS YOU INCLUDE RANDOM EFFECTS, YOU WOULD STILL HAVE A “MIXED MODEL”
(12) Level 1:  $Y_{it} = \beta_{0i} + \beta_{1i} X_{1it} + \varepsilon_{it}$
(13) Level 2:  (a) $\beta_{0i} = \beta_{00} + \zeta_{0i}$
               (b) $\beta_{1i} = \beta_{10} + \zeta_{1i}$
• There are no variables predicting the Level 1 parameters, but there are random effects for both the intercept and the slope associated with X1.
– Every unit has the fixed overall or average “intercept” (β00) plus some random error ζ0i.
– Every unit has the fixed overall or average “slope” for X (β10) plus some random error ζ1i.
• MIXED VERSION OF THIS MODEL:
(14)  $Y_{it} = \beta_{00} + \beta_{10} X_{1it} + (\zeta_{0i} + \zeta_{1i} X_{1it} + \varepsilon_{it})$
with the fixed portion representing the average intercept for the sample and the average slope, and the random portion representing random deviations from the average value for a given unit.
• And if we did not have a random effect in this model for the slope at Level 2 (i.e. no ζ1i in (13b)), we would arrive at the following mixed model:
(15)  $Y_{it} = \beta_{00} + \beta_{10} X_{1it} + \zeta_{0i} + \varepsilon_{it}$
• DOES THIS LOOK FAMILIAR? IT IS THE BASIC RANDOM INTERCEPT PANEL MODEL!!!
• How about this?
(16) Level 1:  $Y_{it} = \beta_{0i} + \beta_{1i} X_{1it} + \varepsilon_{it}$
     Level 2:  $\beta_{0i} = \beta_{00} + \beta_{01} \bar{X}_{1i} + \zeta_{0i}$
               $\beta_{1i} = \beta_{10}$
     Mixed:    $Y_{it} = \beta_{00} + \beta_{10} X_{1it} + \beta_{01} \bar{X}_{1i} + \zeta_{0i} + \varepsilon_{it}$
where $\bar{X}_{1i}$ is the unit (cluster) mean of X1it over time.
• It is the random effects hybrid model, with cluster-level (Level 2) means of the time-varying independent variables included as additional predictors of the Level 2 intercept (see Bell & Jones!)
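The extra Level 2 predictor in the hybrid model is simply each unit's own mean of the time-varying variable. A minimal sketch of how that variable could be constructed from long-format panel rows follows; the data values, unit labels, and field names are hypothetical.

```python
from collections import defaultdict

def add_unit_means(rows):
    """rows: list of (unit, time, x). Returns dicts that also carry the unit mean of x."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for unit, _, x in rows:
        totals[unit] += x
        counts[unit] += 1
    means = {unit: totals[unit] / counts[unit] for unit in totals}
    # Every wave of a unit gets the same (time-invariant) mean attached.
    return [{"unit": unit, "time": time, "x": x, "x_mean": means[unit]}
            for unit, time, x in rows]

# Hypothetical unbalanced panel: unit A has three waves, unit B has two.
rows = [("A", 1, 1.0), ("A", 2, 2.0), ("A", 3, 3.0),
        ("B", 1, 10.0), ("B", 2, 20.0)]
augmented = add_unit_means(rows)
for record in augmented:
    print(record)
```

The resulting `x_mean` column varies only between units, so it enters the model as a Level 2 variable alongside the time-varying `x`.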
• We can even eliminate the independent variable X altogether, and we arrive at a mixed model that is:
(17)  $Y_{it} = \beta_{00} + \zeta_{0i} + \varepsilon_{it}$
• This model says that Y at a given point in time is equal to an overall intercept, or a “GRAND MEAN” (β00), plus a random unit effect ζ0i, plus an idiosyncratic unit-time error term εit. This is the first RE model that Rabe-Hesketh and Skrondal discuss in the book, Multilevel and Longitudinal Modeling Using Stata.
• SO: ALL PANEL MODELS CAN BE EXPRESSED IN THE MULTILEVEL (“MIXED”) FRAMEWORK!!!
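A short worked derivation shows what equation (17) implies for the variances and covariances of Y. It assumes, as is standard for this model, that ζ0i and εit are independent and that εit is uncorrelated over time, and it writes σ0² for the variance of ζ0i and σε² for the variance of εit. The final ratio is commonly called the intraclass correlation, a term not used in the slides above.

```latex
\begin{align*}
\operatorname{Var}(Y_{it}) &= \operatorname{Var}(\zeta_{0i}) + \operatorname{Var}(\varepsilon_{it})
  = \sigma_0^2 + \sigma_\varepsilon^2 \\
\operatorname{Cov}(Y_{it}, Y_{is}) &= \operatorname{Var}(\zeta_{0i}) = \sigma_0^2
  \qquad (t \neq s) \\
\operatorname{Corr}(Y_{it}, Y_{is}) &= \frac{\sigma_0^2}{\sigma_0^2 + \sigma_\varepsilon^2}
\end{align*}
```

So even in this simplest mixed model, two observations on the same unit are correlated purely because they share ζ0i, and the correlation is larger the more of the total variance lies between units.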
Estimation of the Growth and Other Multilevel Longitudinal Models
• We won’t have time to discuss the statistical estimation procedures in depth, but briefly, the estimation of (7) is very complicated! We need to estimate:
– 6 fixed parameters (β00, β01, β02, β10, β11, β12); these are (usually) the primary theoretical parameters of interest.
– But we must do so in the context of a composite error term that has three separate random components, and that is by construction heteroskedastic (because ζ1i*Time varies over time) and autocorrelated (because of the presence of ζ0i and ζ1i in the error term for each unit at every point in time). This makes estimation difficult and sometimes very slow, depending on the number of random effects included in the models.
– Moral: Choose the random effects wisely (and judiciously)!
Fixed and Random Effects: Youth Delinquency Example
$Y_{it} = \beta_{00} + \beta_{01} X_{1i} + \beta_{02} X_{2i} + \beta_{10}\,\mathrm{Time}_i + \beta_{11} X_{1i}\,\mathrm{Time}_i + \beta_{12} X_{2i}\,\mathrm{Time}_i + (\zeta_{0i} + \zeta_{1i}\,\mathrm{Time}_i + \varepsilon_{it})$
• Level 2 variables: (X1) Divorced Parents (yes/no), and (X2) Number of Moves pre-Age 13.
• FIXED EFFECTS IN THE MODEL
– β00: Overall or Average Delinquency Intercept for all Units when X1 and X2 both equal 0
– β01: Difference in the Unit-Level Delinquency Intercept for Children of Divorced Parents (i.e. when X1=1)
– β02: Effect on the Unit-Level Delinquency Intercept for each Additional Move Made by Child’s Family Before Age 13
– β10: Overall or Average Slope Effect for Time for All Units when X1 and X2 both equal 0
– β11: Difference in the Unit-Level Slope for Time for Children of Divorced Parents (i.e. when X1=1). If this coefficient is positive, it means that children of divorced parents increase in delinquent behavior over time more so than children whose parents are not divorced.
– β12: Effect on the Unit-Level Slope for Time for each Additional Move Made by Child’s Family Before Age 13
• RANDOM EFFECTS IN THE MODEL
– ζ0i: the Random Portion of the Unit-Level Intercept (with variance σ0²)
– ζ1i: the Random Portion of the Unit-Level Slope for Time (with variance σ1²)
– εit: the idiosyncratic error term for a given unit at a given time (with variance σε²)
Predictions in the Growth Model
• “Fixed Predicted” growth trajectory for a given individual is based on the fixed portion of the model. All individuals with identical values on the Xs will have the same “fixed predicted” intercept and the same “fixed predicted” slope, and hence the same “fixed predicted” growth trajectory. This can be viewed as the “average” growth trajectory for all individuals with identical values on the Xs.
• “Full Predicted” growth trajectory of Y will deviate from the fixed predicted value, depending on the sign and size of the random components ζ0i and ζ1i. If ζ0i is large and positive, the unit’s intercept will be higher than that predicted by the Xs, and if ζ1i is large and positive, the unit’s slope will also be higher than that predicted by the Xs (negative values pull them lower). So every unit has some additional random effect that affects the magnitude of both its intercept and slope in the growth curve.
• “Actual” value of Y at any given time will equal the predicted Y based on the full predicted growth trajectory plus the idiosyncratic unit-time error term εit.
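The three kinds of prediction can be written as three small functions that build on one another. This is a sketch under invented values: the coefficients, the random effects ζ0i and ζ1i, and the error εit below are hypothetical numbers chosen for easy arithmetic, not estimates.

```python
def fixed_predicted(time, x1, x2, beta):
    """Prediction from the fixed part only: same for all units with the same Xs."""
    intercept = beta["b00"] + beta["b01"] * x1 + beta["b02"] * x2
    slope = beta["b10"] + beta["b11"] * x1 + beta["b12"] * x2
    return intercept + slope * time

def full_predicted(time, x1, x2, beta, zeta0, zeta1):
    """Fixed prediction shifted by the unit's own random intercept and slope."""
    return fixed_predicted(time, x1, x2, beta) + zeta0 + zeta1 * time

def actual_value(time, x1, x2, beta, zeta0, zeta1, eps):
    """Full prediction plus the idiosyncratic unit-time error."""
    return full_predicted(time, x1, x2, beta, zeta0, zeta1) + eps

# Hypothetical values for one child at Time = 3 (divorced parents, two moves).
beta = {"b00": 1.0, "b01": 0.5, "b02": 0.25,
        "b10": 0.25, "b11": 0.125, "b12": 0.0625}
fixed = fixed_predicted(3, 1, 2, beta)
full = full_predicted(3, 1, 2, beta, zeta0=0.5, zeta1=-0.25)
actual = actual_value(3, 1, 2, beta, zeta0=0.5, zeta1=-0.25, eps=0.125)
print(f"fixed predicted = {fixed}, full predicted = {full}, actual = {actual}")
```

Here the child starts above the fixed prediction (positive ζ0i) but grows more slowly than predicted (negative ζ1i), so by Time = 3 the full prediction has fallen below the fixed one.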
Notes
• Statistical Output from STATA, SPSS, HLM, etc. will produce the Fixed Effect estimates with associated standard errors, as well as all of the “Variance Components” of the random terms that are specified in the model. These will represent how much random variance there is estimated to be in the intercept, slope, and in the Level 1 residuals, conditional on the independent variables in the model, and how much covariance there may be between the various random effects as well. These are the crucial quantities in the entire procedure, which is why sometimes mixed models are called “variance components” models.
• One can also view the growth model as a Random Effects Model with a Random Time Trend and Time*Independent Variable interaction terms. If we think variable X influences the rate of change, then it enters multiplicatively with Time, as can be seen in the MIXED formulation of (7). If we think that X influences the intercept, then it enters additively. So multilevel growth models are “just” random intercept models with interactions of variables with Time, and with random components to the intercepts and to the interactions.
SUMMARY • Modeling in the multilevel tradition consists of specifying random intercepts to account for the clustered observations and unit-level heterogeneity in exactly the same way we did in unit 2. In addition, we can also include: – The effects of time to capture intra-unit growth – Random effects for the effects of time across units (Hierarchical growth models) – Random effects for the slopes of other time-varying independent variables (RCM models)
How to Estimate the Mixed Model?
• Several methods, all of which are based in whole or in part on maximum likelihood methods.
– (Note: we are not overly concerned with the technical details and algorithms, but some aspects of estimation are important so that you can interpret coefficient estimates, assess model fit correctly, and make appropriate comparisons between different models)
• How does ML estimation work?
– ML methods estimate the values of the population parameters that maximize the likelihood of observing the sample data that we, in fact, did observe.
– In SEMs we maximized the likelihood of observing the sample variances and covariances that we did observe, given the implied V-C matrix of our SEM model; here we will maximize the likelihood of observing the values of the individual Yit in our sample.
The “Mixed” or Hierarchical Growth Model
• Level 1: Intra-Individual Change Over Time
(18)  $Y_{it} = \beta_{0i} + \beta_{1i}\,\mathrm{Time}_i + \varepsilon_{it}$
• Level 2: Inter-Individual Differences in Level 1 Parameters
(19a)  $\beta_{0i} = \beta_{00} + \beta_{01} X_{1i} + \beta_{02} X_{2i} + \zeta_{0i}$
(19b)  $\beta_{1i} = \beta_{10} + \beta_{11} X_{1i} + \beta_{12} X_{2i} + \zeta_{1i}$
• Mixed Formulation:
(20)  $Y_{it} = \beta_{00} + \beta_{01} X_{1i} + \beta_{02} X_{2i} + \beta_{10}\,\mathrm{Time}_i + \beta_{11} X_{1i}\,\mathrm{Time}_i + \beta_{12} X_{2i}\,\mathrm{Time}_i + (\zeta_{0i} + \zeta_{1i}\,\mathrm{Time}_i + \varepsilon_{it})$
• with three random effects, whose variances are:
– σ0²: the variance of ζ0i, the random (unexplained) portion of the unit-level intercept
– σ1²: the variance of ζ1i, the random (unexplained) portion of the unit-level slope for time
– σε²: the variance of the idiosyncratic error term εit for a given unit at a given time
• ML estimation proceeds by writing out the probability of observing the outcome Yit that we did observe for a particular case, given that the data were generated from combining the parameters in the model (the β and the σ²) with the X independent variables.
– The probability of observing the outcome Y for a given case is a function of its distance from the mean (like a “z-score”) in a normal distribution; hence the importance of the normality assumption for the variables and random effects.
• We then aggregate the individual probabilities (through multiplication) to arrive at a joint likelihood function of observing the given sample, given the estimated parameters.
• Iterative procedures find the estimated parameters which yield the highest joint likelihood of having observed the set of outcomes that comprise the sample data. These parameters yield predictions of Yit that are as close as possible to the actual observed Yit.
– In practice, the process maximizes the log of the joint likelihood to make the mathematics more tractable, which is why summary statistics for ML estimation are given as log-likelihoods.
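The logic of multiplying individual probabilities, and then working with logs, can be sketched for a deliberately simplified case: a single-level model Y = β0 + β1·Time + ε with normal errors and no ζ random effects. The data points and candidate parameter values are hypothetical; a real mixed-model likelihood must also account for the Level 2 random effects.

```python
import math

def normal_logpdf(y, mean, sigma):
    """Log of the normal density of y given its mean and standard deviation."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (y - mean) ** 2 / (2 * sigma ** 2)

def log_likelihood(data, beta0, beta1, sigma):
    """Sum of log densities, i.e. the log of the product of the densities."""
    return sum(normal_logpdf(y, beta0 + beta1 * t, sigma) for t, y in data)

# Hypothetical (time, y) observations lying close to the line y = 1 + 1*time.
data = [(0, 1.1), (1, 1.9), (2, 3.2), (3, 3.8), (4, 5.1)]

ll_good = log_likelihood(data, beta0=1.0, beta1=1.0, sigma=0.2)
ll_bad = log_likelihood(data, beta0=0.0, beta1=0.5, sigma=0.2)

# The joint likelihood is the product of the individual densities.
joint = 1.0
for t, y in data:
    joint *= math.exp(normal_logpdf(y, 1.0 + 1.0 * t, 0.2))

print(f"log-likelihood near the data: {ll_good:.3f}")
print(f"log-likelihood far from the data: {ll_bad:.3f}")
print(f"product of densities vs exp(log-likelihood): {joint:.6f} vs {math.exp(ll_good):.6f}")
```

Parameter values whose predictions sit closer to the observed Y receive the higher log-likelihood, which is the quantity an iterative ML routine climbs.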
• Complications in mixed models stem from the multiple random effects: the probability of observing Y for a given case is a function of the Xs, the $\beta$, and its place on three separate normal distributions, those of $\zeta_{0i}$, $\zeta_{1i}$, and $\varepsilon_{it}$. (In “regular” regression we worried only about X, $\beta$ and $\varepsilon_{it}$.)
• So we need to consider in the estimation process the variances and covariances of the random effects or, as we called them earlier, the “variance components” of the model. In this case we have:
– the $\varepsilon_{it}$, which we assume to be normal, homoskedastic with variance $\sigma_\varepsilon^2$, and uncorrelated with previous values over time; and
– the variances of the two Level 2 random effects ($\sigma_0^2$ and $\sigma_1^2$) and their covariance ($\sigma_{01}$). We assume that the Level 2 random effects are also normally distributed and independent of the $\varepsilon_{it}$ errors, though the covariance between the two Level 2 random effects may be nonzero.
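• Formally, we assume the following for the Level 2 random effects:
(21) $$\begin{bmatrix} \zeta_{0i} \\ \zeta_{1i} \end{bmatrix} \sim N\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_0^2 & \sigma_{01} \\ \sigma_{10} & \sigma_1^2 \end{bmatrix} \right)$$
• In combination with the assumptions for $\varepsilon_{it}$, this produces in the mixed growth model, or what Singer and Willett call the “standard multilevel model for change”, a complex variance-covariance matrix of the compound error term $(\zeta_{0i} + \zeta_{1i}\,\mathrm{Time}_i + \varepsilon_{it})$ in (20) above. For a three-wave example:
(22) $$\Sigma = \begin{bmatrix}
\sigma_0^2 + \sigma_1^2(1)^2 + 2\sigma_{01}(1) + \sigma_\varepsilon^2 & & \\[4pt]
\dfrac{\sigma_0^2 + \sigma_{01}(1+2) + \sigma_1^2(1 \times 2)}{\sqrt{\sigma_{c1}^2\,\sigma_{c2}^2}} & \sigma_0^2 + \sigma_1^2(2)^2 + 2\sigma_{01}(2) + \sigma_\varepsilon^2 & \\[4pt]
\dfrac{\sigma_0^2 + \sigma_{01}(1+3) + \sigma_1^2(1 \times 3)}{\sqrt{\sigma_{c1}^2\,\sigma_{c3}^2}} & \dfrac{\sigma_0^2 + \sigma_{01}(2+3) + \sigma_1^2(2 \times 3)}{\sqrt{\sigma_{c2}^2\,\sigma_{c3}^2}} & \sigma_0^2 + \sigma_1^2(3)^2 + 2\sigma_{01}(3) + \sigma_\varepsilon^2
\end{bmatrix}$$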
• where the terms in brackets signify the time period in question (wave 1, 2, 3), and the terms under the square root signs signify the composite error variance for a given point in time. So in element (2,1), the terms under the square root sign signify the wave 1 composite variance, i.e., the diagonal element (1,1), and the wave 2 composite variance, i.e., the diagonal element (2,2).
• NOTE: You can see the intrinsic heteroskedasticity and autocorrelation built into the model (right?)
• So what we want is for ML to estimate those values of the $\beta$ fixed effects, and those values of the four variance components ($\sigma_\varepsilon^2$, $\sigma_0^2$, $\sigma_1^2$, $\sigma_{10}$), that produce the highest overall likelihood of observing the data that we did observe, given the following distribution for Y that is assumed in our model:
(23) $Y \sim N(\beta_{00} + \beta_{01}X_{1i} + \beta_{02}X_{2i} + \beta_{10}\,\mathrm{Time}_i + \beta_{11}X_{1i}\,\mathrm{Time}_i + \beta_{12}X_{2i}\,\mathrm{Time}_i,\ \Sigma)$
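As a rough numerical illustration of the composite error structure for waves 1, 2, and 3, the sketch below computes the composite variances, the covariances between waves, and the implied correlations (covariance divided by the square root of the product of the two composite variances). The four variance components are hypothetical values chosen only to make the pattern visible.

```python
import math

def composite_cov(t, s, var0, var1, cov01, var_eps):
    """Covariance of (zeta0 + zeta1*t + eps_t) and (zeta0 + zeta1*s + eps_s).

    When t == s this is the composite variance:
    var0 + var1*t^2 + 2*cov01*t + var_eps.
    """
    cov = var0 + cov01 * (t + s) + var1 * t * s
    if t == s:
        cov += var_eps          # idiosyncratic error only enters the diagonal
    return cov

# Hypothetical variance components (illustrative values only).
components = dict(var0=4.0, var1=0.5, cov01=-0.3, var_eps=2.0)
waves = [1, 2, 3]

sigma = [[composite_cov(t, s, **components) for s in waves] for t in waves]
corr = [[sigma[r][c] / math.sqrt(sigma[r][r] * sigma[c][c])
         for c in range(3)] for r in range(3)]

for row in sigma:
    print([round(v, 3) for v in row])
for row in corr:
    print([round(v, 3) for v in row])
```

With these values the diagonal entries differ from wave to wave (heteroskedasticity) and the off-diagonal entries are nonzero (autocorrelation), even though the idiosyncratic errors themselves are homoskedastic and uncorrelated.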
• ML will:
– estimate the $\beta$ fixed effects that, in combination with the $\sigma$ random effects, generate $Y_{it}$ that come as close as possible to the actual unit-time values of $Y_{it}$. This is what it means to “maximize the likelihood of observing the data.” By maximizing the likelihood in this fashion, the estimation procedure also minimizes the variance of the Level 1 residuals $\varepsilon_{it}$.
– estimate “fixed predicted growth trajectories” from the $\beta$ fixed effects that come as close as possible to the average growth trajectories of all cases with the same values of X (so that the overall deviations of the actual trajectories from the “fixed predicted” regression lines will be as small as possible).
– estimate Level 2 variance components that describe the amount of variation of the units around the fixed predicted growth trajectories in terms of the intercepts and the slopes, and the amount of idiosyncratic variation around the “actual predicted trajectories,” given the normality assumptions and the form of the $\Sigma$ variance-covariance matrix that is specified in the model.
Conditional Effects and Standard Errors of Variables in Mixed Models, and the Impact of Centering
• All mixed growth model coefficients are conditional effects, as are all effects estimated in multiplicative interaction models. In this case the interaction is with TIME.
• This means that the estimated effect reported in statistical output for any of the component variables in a multiplicative model formulation is conditional on the value of the other components being 0. The standard errors and statistical tests for the components of the interaction terms are likewise conditional on the value of the other term being 0. So the estimated intercept effect of $X_{1i}$ is conditional on TIME being equal to 0, and the estimated effect of TIME in the mixed model ($\beta_{10}$) is conditional on the Xs being 0.
• Moreover, there is a separate standard error (called the “conditional standard error”) associated with the effect of one variable at each level of the other variable. So we can calculate not only the effect of YEARS OF PRIOR DEMOCRACY at each level of TIME, but also its standard error and statistical significance. The same holds for the effect of TIME at each level of YEARS OF PRIOR DEMOCRACY.
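For the interaction between $X_{1i}$ and TIME in the mixed formulation (20), the conditional effect and its conditional standard error can be written out as below. This is the standard variance formula for a linear combination of two estimated coefficients, given here as a sketch in the notation of (20); the hats, which denote estimates, are notation added for this illustration.

```latex
\begin{align}
\frac{\partial Y_{it}}{\partial X_{1i}}
  &= \beta_{01} + \beta_{11}\,\mathrm{Time}_i \\
\widehat{SE}\!\left(\hat\beta_{01} + \hat\beta_{11}\,\mathrm{Time}_i\right)
  &= \sqrt{\operatorname{Var}(\hat\beta_{01})
     + \mathrm{Time}_i^{2}\,\operatorname{Var}(\hat\beta_{11})
     + 2\,\mathrm{Time}_i\,\operatorname{Cov}(\hat\beta_{01},\hat\beta_{11})}
\end{align}
```

At $\mathrm{Time}_i = 0$ both expressions collapse to the coefficient and standard error reported in the output for $X_{1i}$, which is why those reported values describe only the zero point of TIME.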
• It might be important in some applications to calculate conditional standard errors at different levels of $X_{1i}$; if so, see the formula in Brambor, Thomas, William Roberts Clark, and Matt Golder. 2006. "Understanding Interaction Models: Improving Empirical Analyses." Political Analysis 14 (1):63-82.
• This also means that the effects in mixed models will depend on where the “Zero Point” of a variable is. For example, assume that the overall intercept in the model ($\beta_{00}$) is .75. This is actually the intercept when TIME and all other variables at Level 2 are 0. At TIME=5, we would multiply the effect associated with $\beta_{10}$ (for example, 2) by 5 and add it to the intercept, producing an overall predicted value of Y at TIME=5 of 10.75, with all other variables being 0. This means that if TIME=5 corresponds to 1995 and we had centered time around 1995, such that 1995=0, 1996=1, 1997=2, 1994=-1, etc., we would have estimated $\beta_{00}$ to be 10.75.
• So the estimated effects depend on where “0” is on TIME and on all other relevant variables. This may lead you to draw inaccurate conclusions if you are not cognizant of the implications of the zero point on your variables, as well as the fully interactive nature of the mixed model.
• Other variables could be centered as well to facilitate the interpretation of the effects. For example, for the UNDP Human Capital measure, the impact of TIME (by itself) would be interpreted as its impact when UNDP=0. This may be unrealistic since there are no countries where UNDP=0. If we centered the UNDP variable at its “grand mean,” then the impact of TIME (by itself) would be interpreted as its impact when “centered UNDP” is 0, that is, when UNDP is at its “average” value. This may make interpretations easier in the mixed model and be useful for you and for the reader.
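The centering arithmetic in the example above can be checked directly. The sketch below takes the intercept of .75 and the TIME effect of 2 from the text, sets all other variables to 0, and shows that moving the zero point of TIME to 5 changes the intercept but leaves the slope and every predicted value unchanged. The function names are hypothetical.

```python
def predict(intercept, slope, time):
    """Fixed predicted value of Y with all other variables at 0."""
    return intercept + slope * time

def recenter(intercept, slope, new_zero):
    """Intercept and slope after redefining time as (time - new_zero)."""
    return intercept + slope * new_zero, slope

b00, b10 = 0.75, 2.0                      # values used in the text
b00_centered, b10_centered = recenter(b00, b10, new_zero=5)

print(b00_centered)                       # intercept when TIME=5 is the zero point

# Predictions are identical under either coding of time.
for raw_time in range(0, 9):
    original = predict(b00, b10, raw_time)
    centered = predict(b00_centered, b10_centered, raw_time - 5)
    assert abs(original - centered) < 1e-12
```

Only the interpretation of the intercept (and, in the full model, of every coefficient that interacts with TIME) moves with the zero point; the fitted trajectories do not.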
Interpreting the Estimated Random Effects and the “Explained Variance(s)” in the Mixed Model
• There are two general kinds of random effects: the variance terms $\sigma^2$, and the individual random effects $\zeta$ that are estimated for each unit, though Stata does not provide the individual effects in the output unless you ask for them.
• The interpretation of the variance terms provides a starting point for discussing “explained” and “unexplained” variation, and “R-squared” in the mixed model framework. All of the variance components make these interpretations somewhat more complicated than in “normal” regression.
Interpreting the Variance Components
• $\sigma_\varepsilon^2$: tells us how much Level 1 error variance there is in the model, that is, how much (squared) difference there is on average between the predicted and actual $Y_{it}$. If you divide this value by the overall variance in $Y_{it}$, you arrive at $1 - R^2$ for the Level 1 equation.
• $\sigma_0^2$: tells us how much Level 2 error variation there is in equation (19a) predicting the growth trajectory intercepts, that is, how much (squared) difference there is on average between the “fixed predicted” $\beta_{0i}$ and the “actual predicted” $\beta_{0i}$ that includes the unit effects $\zeta_{0i}$. If this value is statistically significant, it means that there is random Level 2 variation in the estimated intercepts, controlling for the Xs; if the value is not significant, you cannot reject the hypothesis that all of the units with identical values of the Xs share a common intercept.
• $\sigma_1^2$: tells us how much Level 2 error variation there is in equation (19b) predicting the growth trajectory slopes, that is, how much (squared) difference there is on average between the “fixed predicted” $\beta_{1i}$ and the “actual predicted” $\beta_{1i}$ that includes the unit effects $\zeta_{1i}$. If this value is statistically significant, it means that there is random Level 2 variation in the estimated slopes, controlling for the Xs; if the value is not significant, you cannot reject the hypothesis that all of the units with identical values of the Xs share a common slope.
• $\sigma_{10}$: tells us how much covariation there is between the two Level 2 random effects. If the value is positive, it means that as the unit effect for the intercept ($\zeta_{0i}$) gets bigger, the unit effect for the slope ($\zeta_{1i}$) also gets bigger; when the value is negative, it means that as the unit effect for the intercept gets bigger, the unit effect for the slope gets smaller. It is like a correlation between the two $\zeta$. Often in growth models you see a negative correlation between initial status and rate of change; this is how “regression to the mean” effects manifest themselves within this analytic framework.
• The first three variance components thus give you the unexplained variation in each of the three equations of the multilevel model: (18) for the Level 1 error variation, and (19a) and (19b) for the Level 2 error variation. So you naturally want to talk about the absolute level of “explained variation” in each of these equations as well.
• This makes some sense for the Level 1 residuals because there is an observed variance in $Y_{it}$ to compare the value of $\sigma_\varepsilon^2$ with. So an “$R^2$” at Level 1 is relatively straightforward conceptually. But since we do not observe a true distribution for the $\beta_i$ variables, it is not clear what actual variances $\sigma_0^2$ and $\sigma_1^2$ could be compared with to arrive at Level 2 “R-squared” values.
• The upshot of this is that we do not really talk about “R-squared” in mixed models; rather, we talk about “Pseudo-R-squared” statistics that reflect intuitively some of the same ideas as normal R-squared but do not share their exact statistical (and nice algebraic) properties.
• What is done in the mixed modeling tradition is to compare the estimated Level 2 variance components to their counterparts in models that have no independent variables at Level 2 at all; that is, to equations (19a) and (19b) with no Xs and only the random $\zeta_{0i}$ and $\zeta_{1i}$ terms in their respective models.
• Intuitively, we then base our notion of R-squared at Level 2 on the idea of “proportional reduction of error” (PRE), which we know is a perfectly valid way of looking at R-squared. We ask: “how much does the addition of explanatory variables $X_1$, $X_2$, etc. reduce the error variation in the Level 2 intercept (slope) equations?” If we reduce the error variance by a lot through adding the X variables, we say that these variables “explain” the variation in the Level 2 equations. If not, we still have much unexplained variation in those parameters.
• We can start with a model that has no IVs at Level 2 and no effect of TIME. This model reduces to:
(20a) Level 1: $Y_{it} = \beta_{0i} + \varepsilon_{it}$
(20b) Level 2: $\beta_{0i} = \beta_{00} + \zeta_{0i}$
(20c) Mixed: $Y_{it} = \beta_{00} + \zeta_{0i} + \varepsilon_{it}$
• This is the “Unconditional Means Model” (UM). It says that $Y_{it}$ is a function of an overall population mean, a random unit effect that represents the intercept difference for unit i from the population mean, and a random Level 1 residual. This model will produce two variance components:
– $\sigma_\varepsilon^2$: Level 1 error variance in the individual $Y_{it}$ (from equation 20a)
– $\sigma_0^2$: Level 2 error variance in the $\beta_{0i}$ intercepts (from equation 20b)
• This is the first “baseline model” for Pseudo R-squared.
• Forgetting about growth models for the moment, consider a new model that tries to explain the magnitude of the Level 1 intercept by adding two new variables at Level 2.
(21a) Level 1: $Y_{it} = \beta_{0i} + \varepsilon_{it}$
(21b) Level 2: $\beta_{0i} = \beta_{00} + \beta_{01}X_{1i} + \beta_{02}X_{2i} + \zeta_{0i}$
(21c) Mixed: $Y_{it} = \beta_{00} + \beta_{01}X_{1i} + \beta_{02}X_{2i} + \zeta_{0i} + \varepsilon_{it}$
• This model (call it Model “A”) will also produce estimates of:
– $\sigma_\varepsilon^2$: Level 1 error variance in the individual $Y_{it}$ (from equation 21a)
– $\sigma_0^2$: Level 2 error variance in the $\beta_{0i}$ intercepts (from equation 21b)
• The estimate of $\sigma_\varepsilon^2$ should not change (too much) from equation (20a or c) to equation (21a or c) because nothing in the model has been added at Level 1.
• But we *can* compare the estimate of $\sigma_0^2$ from equation (20b) to that from equation (21b), and this (proportional) difference will tell you how much the addition of the two Xs reduced the error variance of the Level 2 intercepts, or, in other words, how much the two Xs “explain” the Level 2 intercept variation.
• Formula:
$$\text{“Pseudo Intercept R-squared”} = \frac{\sigma^2_{0(UM)} - \sigma^2_{0(A)}}{\sigma^2_{0(UM)}}$$
• We can extend these ideas to the mixed growth model. First, we can ask: How much does the inclusion of TIME, and its associated random effect $\zeta_{1i}$, reduce the overall error variation in the Level 1 residuals?
(22a) Level 1: $Y_{it} = \beta_{0i} + \beta_{1i}\,\mathrm{Time}_i + \varepsilon_{it}$
(22b) Level 2: $\beta_{0i} = \beta_{00} + \zeta_{0i}$
$\beta_{1i} = \beta_{10} + \zeta_{1i}$
(22c) Mixed: $Y_{it} = \beta_{00} + \beta_{10}\,\mathrm{Time}_i + (\zeta_{0i} + \zeta_{1i}\,\mathrm{Time}_i + \varepsilon_{it})$
• This is called the “Unconditional Growth Model” (UG). It allows for a common intercept and a common fixed effect for time, and then four variance components:
– $\sigma_\varepsilon^2$: Level 1 error variance in the individual $Y_{it}$ (from equation 22a)
– $\sigma_0^2$: Level 2 error variance in the $\beta_{0i}$ intercepts (from equation 22b)
– $\sigma_1^2$: Level 2 error variance in the $\beta_{1i}$ slopes (from equation 22b)
– $\sigma_{10}$: covariance between the $\zeta_{0i}$ and $\zeta_{1i}$ random effects
• We can calculate the proportional difference between the Level 1 residual variance from this equation (22a or c) and its value from the UM model in (20a or c), and this will tell you how much the addition of TIME reduced the Level 1 error variance; in other words, how much variation TIME (and its associated random effect $\zeta_{1i}$) “explains” at Level 1 compared to a model without these terms.
• Formula:
$$\text{“Pseudo Level 1 R-squared”} = \frac{\sigma^2_{\varepsilon(UM)} - \sigma^2_{\varepsilon(UG)}}{\sigma^2_{\varepsilon(UM)}}$$
• If we add additional Level 1 covariates to the Level 1 model, we can go through the same procedures to determine their relative explanatory power.
• Finally, we can compare any subsequent model that includes Level 2 explanatory variables in the growth framework to the UG model, and see how much reduction in variance in both the intercept and slope they may produce. So, if we call the model we have been working with (with two independent variables predicting the Level 2 intercepts and slopes) Model “B”, we can use the variance components from that estimation to arrive at two “Pseudo R-squared” values, one for the intercept and one for the slope, as:
$$\text{“Pseudo Intercept R-squared”} = \frac{\sigma^2_{0(UG)} - \sigma^2_{0(B)}}{\sigma^2_{0(UG)}}$$
$$\text{“Pseudo Slope R-squared”} = \frac{\sigma^2_{1(UG)} - \sigma^2_{1(B)}}{\sigma^2_{1(UG)}}$$
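All of the Pseudo R-squared statistics above share one proportional-reduction-of-error calculation, which can be sketched as a single helper. The variance components below are hypothetical numbers standing in for estimates from the UM, UG, and “B” models; they are not results from any analysis in these slides.

```python
def pseudo_r2(baseline_var, model_var):
    """Proportional reduction in a variance component relative to a baseline."""
    return (baseline_var - model_var) / baseline_var

# Hypothetical estimated variance components for three nested specifications.
um = {"eps": 10.0, "var0": 45.0}                    # unconditional means
ug = {"eps": 6.0, "var0": 40.0, "var1": 0.08}       # unconditional growth
b = {"eps": 6.0, "var0": 30.0, "var1": 0.06}        # growth + Level 2 Xs

level1_r2 = pseudo_r2(um["eps"], ug["eps"])         # TIME at Level 1
intercept_r2 = pseudo_r2(ug["var0"], b["var0"])     # Xs on the intercepts
slope_r2 = pseudo_r2(ug["var1"], b["var1"])         # Xs on the slopes

print(round(level1_r2, 3), round(intercept_r2, 3), round(slope_r2, 3))
```

Because the statistic is a ratio of two separately estimated variance components, it can come out negative if a component happens to increase after predictors are added; that is one of the ways it departs from an ordinary R-squared.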
Assessing the Overall Fit of Mixed Models
• We can use the ML estimation procedures to derive summary statistics of the overall fit of mixed models. These are identical to the statistics we used in Unit 1. We use these statistics in the model building process to arrive at the “best” model among the several that we might estimate in a given situation.
• The most common measure for mixed models is called the model “Deviance”, and it is the same thing as the Model $\chi^2$ from Unit 1. We begin with the final maximized (log)-likelihood function from the ML estimation procedure. This is the logarithm of the maximized likelihood of observing the data in the entire sample that we did observe, given the estimated fixed and random effects. Call this value $LL_C$, for the “current” model. It is the largest value (i.e., least negative value) possible, given the assumptions of the model and the sample data.
• We can also conceive of the “best possible model,” which would produce a predicted likelihood of 1 of observing this sample of $Y_{it}$, for a LL of 0 (as the log of 1 is 0). This is like the “saturated” model of Unit 1 that fully accounted for the observed variances and covariances between the variables in the model. In this case, it would be a model that perfectly predicts each unit’s Y at each point in time. Call this value $LL_S$, for “saturated.”
• The “model $\chi^2$” or “Deviance” is: $-2(LL_C - LL_S) = -2 \times LL_C$
• This value follows a $\chi^2$ distribution, as we discussed earlier. More importantly, the difference in Model Deviances between any two nested models also follows a $\chi^2$ distribution, so you can test whether relaxing constraints from one model to another results in a significant improvement in overall fit. The difference between the two Model Deviances will have degrees of freedom equal to the number of relaxed constraints.
• For example, we can compare a “constrained” model that has a common slope for TIME (i.e., where $\sigma_1^2$ is assumed to be 0) with an “unconstrained” one that has non-zero random unit-level variation in the slope for TIME. We know that we can arrive at the constrained model simply by imposing the constraint that $\sigma_1^2 = 0$, so that model is nested in the model with randomly varying slopes. This would be calculated as: $-2(LL_{Constrained} - LL_{Unconstrained})$, with 1 df
• Since the Unconstrained LL will always be larger than (less negative than) or equal to the Constrained LL, this expression will be greater than or equal to zero.
• NOTE: The difference in Model Deviances between nested models with different fixed effects must be based on the LLs that you obtain through Full Maximum Likelihood (FML) methods. If you use Restricted ML (REML) methods (which are advantageous in point estimation), you can only test nested models that differ in their random effects, for reasons noted above. So it is often the case that you will estimate models using both FML and REML methods, the latter to provide the best point estimates for the fixed and random effects, the former to use in testing alternative models with different fixed effects.
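The deviance-difference test described above can be sketched with only the standard library, because the upper-tail probability of a chi-square variable has a closed form for 1 and 2 degrees of freedom. The two log-likelihood values are hypothetical, and the helper names are illustrative; the example uses 1 df as in the text.

```python
import math

def deviance(loglik):
    """Model deviance relative to a saturated model with LL = 0."""
    return -2.0 * loglik

def chi2_upper_tail(x, df):
    """P(chi-square with df degrees of freedom > x), closed forms for df 1 and 2."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only covers df = 1 or df = 2")

# Hypothetical maximized log-likelihoods from two nested models.
ll_constrained = -1250.4      # common slope for TIME
ll_unconstrained = -1243.1    # randomly varying slope for TIME

lr_stat = deviance(ll_constrained) - deviance(ll_unconstrained)
p_value = chi2_upper_tail(lr_stat, df=1)

print(round(lr_stat, 2), p_value)
```

A small p-value would indicate that relaxing the constraint improves fit significantly; for comparisons with more relaxed constraints, a statistics library's chi-square distribution would replace the closed forms used here.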
Comparing Non-Nested Models
• If you have non-nested models, you cannot use the Model Deviance differences because these differences follow no known distribution. But you can use what we discussed in Unit 1, Information-Based Goodness of Fit Indices, to compare non-nested models that are estimated via ML methods.
• Note that these must be fit on the same data, so you cannot use them in completely different situations. But the models you compare with the same data need not be nested within one another.
• Two common measures (both with smaller values being “better”):
– Akaike Information Criterion (AIC): Model Deviance plus a penalty equal to 2 times the number of estimated parameters
– Bayesian Information Criterion (BIC): Model Deviance plus a penalty equal to 2 times the number of estimated parameters times .5(ln(N)), i.e., ln(N) times the number of estimated parameters
Further Issues in Mixed Models
• Non-linear growth models. In these models we do not assume linear growth across the population, but rather some other functional form (polynomial, quadratic, logarithmic, etc.). You can add a squared term for TIME or some other transformation of TIME to model the substantive process you think is operating. (See Chapter 6 of Singer and Willett.)
• Time-Varying Covariates at Level 1. There is no reason to exclude other Level 1 predictors aside from TIME from the specification. This would mean that, controlling for the fixed predicted growth trajectory and the random effects for individual i, there is still an additional source of Level 1 variation based on the values of the time-varying Level 1 covariate. If you have these kinds of variables in your model, you can, in addition to examining statistical significance, see how they affect the reduction of Level 1 residuals in assessing their importance. (Example: USAID expenditures on democracy as a Level 1 time-varying covariate in Finkel et al. (2007).)
• You can extend the interaction models to include possible interactions between Level 1 time-varying covariates and TIME, and/or between Level 1 time-varying covariates and Level 2 predictors as well. In our study, e.g., we examine how the impact of AID differs under different Level 1 conditions (good economic performance versus bad, large US Military expenditures versus small), as well as under different Level 2 conditions (lots of Human Capital in the country versus little, lots of ethnic fractionalization versus little).
• You can also modify the assumptions regarding the error covariances to accommodate more complex patterns in the model’s disturbances.
• Recall that the variance-covariance matrix of the composite error term for the “standard” mixed growth model was given as equation (22) above for the three-wave case as:
$$\Sigma = \begin{bmatrix}
\sigma_0^2 + \sigma_1^2(1)^2 + 2\sigma_{01}(1) + \sigma_\varepsilon^2 & & \\[4pt]
\dfrac{\sigma_0^2 + \sigma_{01}(1+2) + \sigma_1^2(1 \times 2)}{\sqrt{\sigma_{c1}^2\,\sigma_{c2}^2}} & \sigma_0^2 + \sigma_1^2(2)^2 + 2\sigma_{01}(2) + \sigma_\varepsilon^2 & \\[4pt]
\dfrac{\sigma_0^2 + \sigma_{01}(1+3) + \sigma_1^2(1 \times 3)}{\sqrt{\sigma_{c1}^2\,\sigma_{c3}^2}} & \dfrac{\sigma_0^2 + \sigma_{01}(2+3) + \sigma_1^2(2 \times 3)}{\sqrt{\sigma_{c2}^2\,\sigma_{c3}^2}} & \sigma_0^2 + \sigma_1^2(3)^2 + 2\sigma_{01}(3) + \sigma_\varepsilon^2
\end{bmatrix}$$
• It may be, however, that the model does a poor job of reproducing the observed composite errors because the assumptions of the “standard” model do not hold in a given situation. For example, there may be autocorrelation in the idiosyncratic errors $\varepsilon_{it}$, or there may be time-wise heteroskedasticity in the idiosyncratic errors. There are many other possibilities as well (see Chapter 7 of Singer and Willett, or pp. 293-325 in Rabe-Hesketh/Skrondal).
Hierarchical Growth Models in SEM
• Over the past fifteen years, models have been developed for incorporating the longitudinal growth perspective into SEMs.
• Actually, this was one of the first ways that SEM panel analysis was integrated with other traditions, in this case the multilevel modeling framework.
• Advantages of the SEM approach:
– Flexibility in modeling and testing (via $\chi^2$ differences, e.g.) of alternative specifications, error structures, etc.;
– Ability to incorporate multiple indicators of the X and Y variables into the analysis to control for measurement error;
– Ability to incorporate reciprocal linkages or associations between growth processes in different domains, i.e., it will be possible to have growth curves of two different variables and see how growth in one influences growth in another over time.
• So what are known as Latent Curve, or Latent Growth, Models provide a very flexible way of approaching multilevel longitudinal (growth) processes.
• The way it proceeds is surprisingly simple: we specify the two random effects in the growth model (the random intercept and random slope for time) as latent variables within the SEM framework.
• These latent variables are then linked to the observed outcome over time via the measurement model portion of the SEM model(!!!), with fixed constraints that model their appropriate theoretical linkages.
• Then, as in all SEM analysis, we estimate the parameters that minimize the difference between the observed variances and covariances of the X and Y variables and the variance-covariance matrix that is implied by the model, and finally assess the model (if it is overidentified) in terms of how well it can “reproduce” the variances and covariances of the observed variables.
Example: Growth in Democracy, 4 waves, 1990-2002 • Let’s use wide-form data at first, then we will show long-form equivalent via “GSEM” later • We model each Polity outcome as being predicted by two latent variables called “Intercept” and “Slope” • So, at Level 1 of the hierarchy, we have a T-Indicator Y-measurement model with two latent variables, one corresponding to the intercept of the growth trajectory and one corresponding to the slope.
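Latent Growth Curve, No Level 2 Predictors
[Figure: path diagram. The latent variables Intercept and Slope each point to the four observed indicators dg01i0, dg01i4, dg01i8, and dg01i12. The Intercept loadings are fixed to 1 for every indicator; the Slope loadings are fixed to 0, 4, 8, and 12. Each indicator has its intercept fixed to 0 and an error term (ε1 to ε4) whose variance carries the common label “var”.]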
• Can you see that the parameters of the multilevel growth model are actually latent variables from the perspective of the SEM approach? That is, INTERCEPT for each case is a latent $\eta$ variable with 4 indicators, $Y_1$, $Y_2$, $Y_3$, and $Y_4$, and SLOPE is also a latent $\eta$ variable with 4 indicators, $Y_1$, $Y_2$, $Y_3$, and $Y_4$.
• So the outcome variables in the multilevel growth framework are treated as if they are imperfect measures of the INTERCEPT and SLOPE.
• Weird, but true!
• The means of the latent variables give you the “fixed effects” associated with the intercept and slope; their variances give you the variance components.
• Look again at the Level 1 multilevel growth equation (18):
$Y_{it} = \beta_{0i} + \beta_{1i}\,\mathrm{Time}_i + \varepsilon_{it}$
• INTERCEPT is actually $\beta_{0i}$, and it is “causing” $Y_{it}$ with a regression coefficient of “1” in each wave.
• SLOPE is actually $\beta_{1i}$, and it is “causing” $Y_{it}$ with a regression coefficient of “0” at time 1, a regression coefficient of “4” at time 2, a regression coefficient of “8” at time 3, and so forth. (These are the values of “Time” for those observations.)
• Writing the equation for all waves gives:
$Y_{1i} = \beta_{0i} \times 1 + \beta_{1i} \times 0 + \varepsilon_{1i}$
$Y_{2i} = \beta_{0i} \times 1 + \beta_{1i} \times 4 + \varepsilon_{2i}$
$Y_{3i} = \beta_{0i} \times 1 + \beta_{1i} \times 8 + \varepsilon_{3i}$
$Y_{4i} = \beta_{0i} \times 1 + \beta_{1i} \times 12 + \varepsilon_{4i}$
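The four wave equations amount to a fixed loading matrix with a column of 1s for INTERCEPT and a column of time scores (0, 4, 8, 12) for SLOPE. The sketch below shows how the latent means, the latent variances and covariance, and the error variances would then imply the means and the variance-covariance matrix of the four observed outcomes. All parameter values are hypothetical, and the helper functions are illustrative rather than part of any SEM package.

```python
TIMES = [0, 4, 8, 12]
# Loading matrix: column 0 is INTERCEPT (all 1s), column 1 is SLOPE (time scores).
LOADINGS = [[1.0, float(t)] for t in TIMES]

def implied_means(loadings, latent_means):
    """Implied mean of each wave: loading row times the latent means."""
    return [row[0] * latent_means[0] + row[1] * latent_means[1]
            for row in loadings]

def implied_cov(loadings, psi, theta):
    """Implied covariance: loadings * psi * loadings' plus diagonal error variances."""
    n = len(loadings)
    sigma = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            value = sum(loadings[r][a] * psi[a][b] * loadings[c][b]
                        for a in range(2) for b in range(2))
            if r == c:
                value += theta[r]      # error variance only on the diagonal
            sigma[r][c] = value
    return sigma

# Hypothetical parameter values (illustrative only).
latent_means = [1.0, 0.25]             # fixed effects for intercept and slope
psi = [[40.0, -0.5], [-0.5, 0.06]]     # variances and covariance of the latents
theta = [6.0, 6.0, 6.0, 6.0]           # equal error variances across waves

means = implied_means(LOADINGS, latent_means)
sigma = implied_cov(LOADINGS, psi, theta)

print(means)
for row in sigma:
    print([round(v, 2) for v in row])
```

SEM estimation then searches for the parameter values whose implied means and covariances come closest to the observed ones, which is why the constraints on the loadings do all of the work of defining “intercept” and “slope”.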
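Stata SEM Results: Latent Growth Model
[Figure: the latent growth path diagram with estimates. Intercept loadings are fixed to 1 and Slope loadings to 0, 4, 8, and 12 for the indicators dg01i0, dg01i4, dg01i8, and dg01i12; indicator intercepts are fixed to 0, and the error variance of each of ε1 to ε4 is 6.4. Values displayed around the Intercept and Slope latent variables: -.6, .057, .25, .075, 41.]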
Additional Tests
• Examine GOF statistics to assess model fit
• Evaluate Modification Indices to see where the model could be improved
– Relaxation of equal error variances
– Relaxation of no autocorrelation assumption
• Add Level 2 explanatory variables
• Add Level 1 time-varying covariates
Growth Model with Level 2 Explanatory Variables
• Recall the Level 2 Growth Model:
(a) $\beta_{0i} = \beta_{00} + \beta_{01}X_{1i} + \beta_{02}X_{2i} + \zeta_{0i}$
(b) $\beta_{1i} = \beta_{10} + \beta_{11}X_{1i} + \beta_{12}X_{2i} + \zeta_{1i}$
• Since, in the SEM framework, the $\beta_{0i}$ and $\beta_{1i}$ are latent variables, we need only to model them now as Latent Endogenous Variables and have the specific Level 2 X variables as exogenous predictors.
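Full Multilevel SEM Growth Model
[Figure: path diagram. The observed Level 2 predictors l225 and l203 (plus _cons) predict the latent Intercept and Slope, which have disturbances ε5 and ε6. Intercept and Slope in turn point to the indicators dg01i0, dg01i4, dg01i8, and dg01i12, with Intercept loadings fixed to 1, Slope loadings fixed to 0, 4, 8, and 12, indicator intercepts fixed to 0, and error terms ε1 to ε4 with the common variance label “var”.]
Important Note: Estimating the Means of INTERCEPT and SLOPE provides the population average fixed effects for these parameters.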
Extensions
• Multiple indicators for latent exogenous variables
• Effects of one growth curve on another
• Effects of growth curve parameters (INTERCEPT and/or SLOPE) on other variables, as in mediation models
• Inclusion of time-varying covariates
• Non-linear growth with an additional latent variable representing, for example, the effects of a slope for “TIME-SQUARED”
• Heteroskedastic error variances
• Autocorrelated disturbances
Example: Growth Parameters as Independent Variables
[Figure: path diagram. The Level 2 predictors l225 and l203 (plus _cons) predict the latent Intercept and Slope (disturbances ε5 and ε6), which are measured by dg01i0, dg01i4, dg01i8, and dg01i12 as in the previous model (Intercept loadings fixed to 1, Slope loadings fixed to 0, 4, 8, and 12, error terms ε1 to ε4 with common variance label “var”). The growth parameters in turn predict the outcome prf01m, which has disturbance ε7.]
Hypothesis: Countries with Stronger Democratic Trajectories Have Higher Levels of GDP Growth During the Time Period
Note: Model is also an example of an SEM mediation model: Prior Democracy and Ethnic Fractionalization “cause” the Level 2 growth parameters, which then “cause” Level 2 GDP growth. See Selig, James, and K. Preacher. 2009. “Mediation Models for Longitudinal Data in Developmental Research.” Research in Human Development 6 (2-3):144-69.
Alternate Specification: Long Form Data with Multilevel Random Effects and Random Coefficients
• A more general procedure allows estimation of multilevel models with random effects/random coefficients with long-form data.
• Advantages: Can be used for all kinds of random coefficient models, not only “time”, which was made possible because we knew the exact parameter constraints to impose in the wide-form data.
• It is also straightforward to extend this framework to handle non-continuous DVs, as we will see next time.
• Disadvantages: slightly more difficult to deal with heteroskedasticity and autocorrelation and relaxation of other equation-by-equation constraints (as in all long-form data analyses); missing values are more difficult to handle in Stata gsem.
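The difference between the two data layouts can be shown with a small reshaping sketch. It borrows the wide-form variable names dg01i0, dg01i4, dg01i8, and dg01i12 from the diagrams; the country labels, the Polity values, and the long-form column name yearnum are hypothetical choices made for this illustration.

```python
# Wide form: one row per country, one column per wave (hypothetical values).
wide = [
    {"country": "A", "dg01i0": -7, "dg01i4": -3, "dg01i8": 2, "dg01i12": 6},
    {"country": "B", "dg01i0": 5, "dg01i4": 6, "dg01i8": 6, "dg01i12": 8},
]

def wide_to_long(rows, stub="dg01i", times=(0, 4, 8, 12)):
    """Stack the wave-specific columns into one row per country-wave."""
    long_rows = []
    for row in rows:
        for t in times:
            long_rows.append({
                "country": row["country"],
                "yearnum": t,                 # time score for this wave
                stub: row[f"{stub}{t}"],      # outcome at this wave
            })
    return long_rows

long_form = wide_to_long(wide)
for record in long_form:
    print(record)
```

In wide form the time scores live in the fixed loadings of the measurement model; in long form they become an ordinary variable whose coefficient is allowed to vary randomly across units.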
GSEM Random Effects Model
[Figure: two path diagrams for Polity (error ε1). “Linear Model with Random Effect”: variables shown are Ethnic Frac, Prior Democ, aid100, and the random effect cc_21. “Multilevel Model Version”: the same variables arranged by level, with Ethnic Frac, Prior Democ, and cc_21 (error ε2) at Level 2, and aid100 and Polity (error ε1) at Level 1.]
Multilevel Random Effects Hybrid Model
[Figure: two path diagrams. First panel: Ethnic Frac, Prior Democ, aidmean, and cc_21 (error ε2) at Level 2; aid100 and Polity (error ε1) at Level 1. Second panel, “With time dummies”: the same variables with the time dummies yrr2, yrr3, and yrr4 added at Level 1.]
Long-Form Multilevel Growth Model: Level 2 Random Intercept and Random Time Slope
[Figure: path diagram. Level 2 contains the random effects cc_21 and cc_22; Level 1 contains yearnum2 and the outcome dg01i (error ε1).]
cc_2 is the Level 2 marker variable; double circles signify a Level 2 random effect.
Long-Form Multilevel Growth Model: Level 2 Explanatory Predictors
[Figure: path diagram. Level 2 contains the predictors l203 and l225 and the random effects cc_21 and cc_22 (errors ε2 and ε3); Level 1 contains yearnum2 and the outcome dg01i (error ε1).]
Multilevel Random Coefficient Model with Level 2 Explanatory Predictors
[Figure: path diagram. Level 2 contains the predictors l203 and l225 and the random effects cc_21 and cc_22 (errors ε2 and ε3); Level 1 contains aid100 and the outcome dg01i (error ε1).]
Exact same structure for this model as the growth model, but the IV here is another time-varying factor, Aid100.
Extensions: Measurement Error and “Slopes as Predictors”
[Figure: Fig. 2 from Zyphur et al. (2009). Results from a multilevel structural equation model, showing unstandardized effects; y1ij to y5ij = the five status items for each individual i in a group j; RS = random slope of the within-group T–status effect; gender_C = gender composition; CE = collective efficacy; size = group size. The diagram has a within-group part (Testosterone_ij and Status_ij, with Status_ij measured by y1ij to y5ij) and a between-group part (Size_j, Gender_C_j, RS_j, and CE_j).]
Excerpt from the article: “When a group’s standing along the latent slope variable is negative and strong, this means that there is a strong negative relationship between T and status within the group and, thus, there is a T–status mismatch. Alternatively, when a group’s standing along the latent slope variable is positive and strong, this means that there
Testosterone–status mismatch lowers collective efficacy in groups: Evidence from a slope-as-predictor multilevel structural equation model Michael J. Zyphur a,1, Jayanth Narayanan b,1,*, Gerald Koh c, David Koh c a
Department of Management and Marketing, Level 10, 198 Berkeley Street, The University of Melbourne, Victoria 3010, Australia Department of Management & Organization, NUS Business School, National University of Singapore, 1 Business Link, Singapore 117592, Singapore Department of Epidemiology and Public Health, Yong Loo Lin School of Medicine, National University of Singapore, National University Health System, Block MD3, 16, Medical Drive, Singapore 117597, Singapore b c
aOrganizational r t i c l e i n f Behavior o Article history: Received 21 July 2008 Accepted 4 May 2009 Available online 21 June 2009
and Human Processes 110 (2009) 70–79 a b s t r Decision a c t The study of the biological underpinnings of behavior is in its nascent stages in the field of management. We study how the hormone testosterone (T) is related to status and collective efficacy in a group. We assessed salivary testosterone of 579 individuals in 92 teams. We find that T does not predict status within the group. We also tested the effects of a mismatch between T and status in the group on the col-
Analysis of Panel Data, University of Gothenburg, 15-18 June 2015
Contents lists available at ScienceDirect
Extension: Multilevel Mediation Models with Indirect Effects
[Figure: gsem path diagram with nodes Ethnic Frac, Prior Democ, aidmean, cc_21, aid100, Polity and errors ε1, ε2, ε3.]
Further Reading: Preacher, Zyphur and Zhang, “A General Multilevel SEM Framework for Assessing Multilevel Mediation”, Psychological Methods 2010, 15 (3): 209-33.
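As a rough illustration of what an indirect effect is, the sketch below computes the product of two path coefficients, a (predictor to mediator) and b (mediator to outcome), with a first-order delta-method standard error. This is a generic single-level calculation that assumes the two estimates are uncorrelated; it is not the multilevel procedure from the reading, and the input values are hypothetical rather than results from the course data.

```python
import math

def indirect_effect(a, b, se_a, se_b):
    """Product-of-coefficients indirect effect and its first-order delta-method SE.

    Assumes the estimates of a and b are uncorrelated.
    """
    estimate = a * b
    se = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    return estimate, se

# Hypothetical path estimates and standard errors.
est, se = indirect_effect(a=0.40, b=0.25, se_a=0.10, se_b=0.05)
print(f"indirect effect = {est:.3f}, se = {se:.3f}, z = {est / se:.2f}")
```

In a multilevel setting the a and b paths can differ between the within-unit and between-unit levels, so the product would need to be formed separately at each level.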