Dissertation

A population based approach to diabetes mellitus risk prediction: Methodological advances and practical applications

by Laura C. A. Rosella
H.B.Sc., University of Toronto, 2003
M.H.Sc., University of Toronto, 2005

A Thesis submitted in conformity with the requirements of the degree of: Doctor of Philosophy (PhD) in Epidemiology

DEPARTMENT OF PUBLIC HEALTH SCIENCES
DALLA LANA SCHOOL OF PUBLIC HEALTH
UNIVERSITY OF TORONTO

2009

© by Laura C. Rosella 2009

A population based approach to diabetes mellitus risk prediction: Methodological advances and practical applications Laura C.A. Rosella Doctor of Philosophy (PhD) 2009 Dalla Lana School of Public Health, University of Toronto.

Abstract

Since the publication of the Framingham algorithm for heart disease, tools that predict disease risk have been increasingly integrated into standards of practice. Algorithms applied at the population level can serve several purposes in health care decision-making and planning. A population-based risk prediction tool for Diabetes Mellitus (DM) can be particularly valuable for public health given the significant burden of diabetes and its projected increase in the coming years. This thesis addresses various aspects related to diabetes risk in addition to incorporating methodologies that advance the practice of epidemiology. The goal of this thesis is to demonstrate and inform the methods of population-based diabetes risk prediction. This is studied in three components: (I) development and validation of a diabetes population risk tool, (II) measurement, and (III) obesity risk. Analytic methods used include prediction survival modeling, simulation, and multilevel growth modeling. Several types of data were analyzed, including population health survey, health administrative, simulation and longitudinal data. The results from this thesis reveal several important findings relevant to diabetes, obesity, population-based risk prediction, and measurement in the population setting. In this thesis, a model to predict 10-year risk for diabetes (the Diabetes Population Risk Tool, or DPoRT), which can be applied using commonly collected national survey data, was developed and validated. Conclusions drawn from the measurement analysis can inform research on the influence of measurement properties (error and type) on modeling and statistical prediction. Furthermore, the use of new modeling strategies to model change of body mass index (BMI) over time both enhances our understanding of obesity and diabetes risk and demonstrates an important methodology for future epidemiological studies. Epidemiologists are in need of innovative and accessible tools to assess population risk, making these types of risk algorithms an important scientific advance. Population-based prediction models can be used to improve health planning, explore the impact of prevention strategies, and enhance our understanding of the distribution of diabetes in the population. This work can be extended to future studies that develop tools for disease planning at the population level in Canada, and it can enrich the epidemiologic literature on modeling strategies.

Supervisors: Dr. Douglas Manuel / Dr. Cam Mustard, Dalla Lana School of Public Health, University of Toronto

Committee Members:
Dr. Paul Corey, Dept. of Public Health Sciences, University of Toronto
Dr. Jan Hux, Health Policy Management and Evaluation, University of Toronto
Dr. Les Roos, Dept. of Community Health Sciences, University of Manitoba
Dr. Thérèse A. Stukel, Dept. of Health Policy Management and Evaluation, University of Toronto

Acknowledgements I would like to thank my supervisors and committee for their brilliant mentorship and guidance. Each has provided a unique contribution to my training and this work. I would like to acknowledge the valuable insight of my classmates: Jennifer Bethel, Sarah Jane Taleski and Marcelo Urquia and my colleagues at ICES: Nick Daneman, Jeff Kwong, and Refik Saskin. I would like to thank my husband Luigi for his unending patience, support and love. Also, I would like to thank my parents, and my older brothers and sisters who have encouraged and inspired me. I am blessed to have all their support and without them this work would not be possible.

Table of Contents

1. THESIS INTRODUCTION
   1.1 Introduction/Background
   1.2 References
2. DEVELOPMENT AND VALIDATION OF DPORT
   2.1 Abstract
   2.2 Introduction
   2.3 Methods
   2.4 Results
   2.5 Discussion
   2.6 References
3. MEASUREMENT IN RISK PREDICTION MODELS
   3.1 The influence of measurement error on accuracy (calibration), discrimination, and overall estimation of a risk prediction model
       3.1.1 Abstract
       3.1.2 Introduction
       3.1.3 Methods
       3.1.4 Results
       3.1.5 Discussion
       3.1.6 References
   3.2 The role of ethnicity in the population-based prediction of diabetes
       3.2.1 Abstract
       3.2.2 Introduction
       3.2.3 Methods
       3.2.4 Results
       3.2.5 Discussion
       3.2.6 References
4. OBESITY RISK
   4.1 Abstract
   4.2 Introduction
   4.3 Methods
   4.4 Results
   4.5 Discussion
   4.6 References
5. CONCLUSION
6. APPENDIX
   6.1 Glossary of frequently used terms and acronyms
   6.2 Ethics
   6.3 Study flow diagram and SAS code for simulation
   6.4 An example of ICC in the context of BMI distribution in the population
   6.5 Detailed results from simulations

1. Thesis Introduction

1.1 Introduction/Background

In many scientific disciplines, studies that predict or forecast what will happen in the future have contributed to our understanding of the world. Scientific studies that provide models to inform strategies that can modify, and possibly mitigate, future events are of value to society. Examples include estimating the impact of climate or environmental changes on the Earth's ecosystems or the impact of policy changes on the economy (1-3). These prediction models have been accepted as valuable tools by scientists and have provided critical information for the development of strategies to modify predicted trends (4, 5).

In the field of epidemiology, prediction models are underrepresented and the concept of risk prediction is overshadowed by the estimation of relative risk measures to clarify etiological perspectives of disease. Etiological models use the same estimation procedures as most predictive modeling (i.e., regression) in order to quantify the relative risk associated with a particular exposure on an outcome. Though regression is often used for both purposes, the way in which the model is constructed will differ due to the goals of the model. The goal of a multivariate etiological model is to optimize the accuracy of the relative risk estimate by controlling for confounding. The goal of a prediction model differs in several important ways. First, the outcome to be optimized is an absolute measure of risk, often expressed as a percentage or probability (versus a risk or hazard ratio). Second, a prediction model aims to maximize the ability to discriminate between at-risk groups and to correctly classify true risk, known as discrimination and calibration respectively. Typically these indices are not evaluated in etiological models. Third, prediction models must be generalizable to other populations in which the model may be applied.
Typically etiological models fit the data used to generate the relative risk estimate as tightly as possible and, as a result, may not be reproducible using data in other settings or may not be applicable to describe risk in another population. These goals change the criteria for model assessment and, concomitantly, the methodological framework being used.

In medicine, prediction models, in the form of risk algorithms, have been used as tools for patient decision-making. A risk algorithm is a tool used to estimate the absolute risk of an outcome for an individual as a function of their baseline characteristics. Typically, risk is expressed as the probability of dying or developing a disease in a given time period (6). Risk algorithms have contributed important advances in individual patient treatment and disease prevention. One of the most utilized risk tools is the Framingham Heart Score (7). This tool is used to calculate the probability that a patient will develop coronary heart disease in 5 or 10 years, and has been widely integrated into cardiovascular disease prevention and management throughout the world (8-12). Risk algorithms are widely recommended by medical societies for appropriate identification of patients who will benefit from specific interventions. This is exemplified in clinical guidelines for pharmacologic interventions such as cholesterol-lowering medications (13).

Typically, risk prediction tools are used clinically and applied at the individual level. Several potential benefits may be realized by extending the application of these tools to the population level. As at the individual level, predictive risk tools applied at the population level have the potential to provide insight into the future burden of a disease in an entire region or nation and the influence of specific risk factors. These tools can support health care decision-making, including the effective and efficient allocation and distribution of health care resources and planning for effective disease prevention interventions. To date, prediction tools specifically designed for use at the population level have been neither created nor used for planning.
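
The distinction between discrimination and calibration described above can be made concrete with a small sketch. The Python example below is illustrative only: the logistic form, the coefficients, and the simulated cohort are hypothetical assumptions chosen for readability and are not the model or data used in this thesis. It computes a concordance (C) statistic as a measure of discrimination and compares mean predicted risk with observed risk across groups of predicted risk as a measure of calibration.

```python
import math
import random


def predicted_risk(bmi, age, intercept=-9.0, b_bmi=0.15, b_age=0.04):
    """Hypothetical logistic risk function returning an absolute risk in (0, 1).

    The coefficients are made up for illustration only.
    """
    linear_predictor = intercept + b_bmi * bmi + b_age * age
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def c_statistic(risks, outcomes):
    """Discrimination: probability that a randomly chosen case has a higher
    predicted risk than a randomly chosen non-case (ties count as one half)."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    non_cases = [r for r, y in zip(risks, outcomes) if y == 0]
    concordant = 0.0
    for risk_case in cases:
        for risk_non_case in non_cases:
            if risk_case > risk_non_case:
                concordant += 1.0
            elif risk_case == risk_non_case:
                concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


def calibration_table(risks, outcomes, groups=5):
    """Calibration: (mean predicted risk, observed proportion) within
    equal-sized groups ordered by predicted risk."""
    ordered = sorted(zip(risks, outcomes))
    size = len(ordered) // groups
    table = []
    for g in range(groups):
        start = g * size
        end = (g + 1) * size if g < groups - 1 else len(ordered)
        chunk = ordered[start:end]
        mean_predicted = sum(r for r, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        table.append((mean_predicted, observed))
    return table


# Simulated cohort (hypothetical): BMI and age drawn from simple distributions,
# and outcomes drawn from the predicted risks themselves.
random.seed(1)
cohort = [(random.gauss(27, 5), random.uniform(20, 80)) for _ in range(2000)]
risks = [predicted_risk(bmi, age) for bmi, age in cohort]
outcomes = [1 if random.random() < r else 0 for r in risks]

print("C statistic:", round(c_statistic(risks, outcomes), 3))
for mean_predicted, observed in calibration_table(risks, outcomes):
    print("predicted", round(mean_predicted, 3), "observed", round(observed, 3))
```

Because the simulated outcomes are drawn from the predicted risks themselves, the sketch is well calibrated by construction. Applying the same two functions to predictions made for a different population is one way a lack of generalizability would become visible.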

Epidemiology of Diabetes

Diabetes is a chronic endocrine disorder affecting the body's metabolism and resulting in structural changes affecting the organs of the vascular system. Serious complications resulting from diabetes include coronary heart disease, stroke, retinopathy, renal failure, peripheral artery disease, and neuropathy (14). The two main forms of diabetes are type 1 diabetes and type 2 diabetes. Type 1 diabetes is a result of pancreatic islet beta-cell destruction, usually due to an autoimmune response, which results in insulin deficiency requiring exogenous insulin to prevent serious complications (15). Type 2 diabetes is characterized by insulin resistance and/or abnormal insulin secretion (15). In people with type 2 diabetes, blood sugar must be controlled through diet, with oral hypoglycemic drugs or, in severe cases, with exogenous insulin (16, 17). Type 2 diabetes accounts for over 90% of all diabetes cases worldwide (18) and is the focus of this thesis.

Diabetes and its complications are a leading cause of death and disability, reducing overall life expectancy and healthy life expectancy. Global estimates place the number of people with diabetes at approximately 200 million and increasing rapidly (19). Over 2 million Canadians have diabetes and prevalence is increasing annually (20). In Canada's largest province, Ontario, the 2005 prevalence estimate of diabetes had already exceeded the rate that was predicted by the World Health Organization (WHO) for 2030 (21). There is a growing concern that, if left unchecked, these trends may slow or even reverse life expectancy gains in the US and other developed countries (18). In Ontario alone, diabetes has been shown to reduce healthy life expectancy by 2.7 and 2.9 years respectively (22). The economic and medical consequences of complications arising from diabetes, including cardiovascular disease and kidney failure, are significant. Diabetes and its associated complications pose a significant economic burden to the health care system in Canada, which was estimated at 1.6 billion in 1998 (23).

Excess body weight or obesity is associated with insulin resistance (24) and is overwhelmingly associated with incidence of type 2 diabetes (25-29). Furthermore, it has now been demonstrated through randomized trials that diabetes can be prevented or delayed through reduction in body weight (30-33), and consequently intervention strategies that target weight reduction or prevention of weight gain are largely recognized as integral to primary prevention of diabetes (34, 35). Currently, more than 1.1 billion adults are overweight worldwide, including 312 million who are obese (36). In Canada, obesity doubled between 1985 and 1998 (37) and currently approximately 6.8 million adults aged 20 to 64 are overweight (including 4.5 million who are obese) (38). Therefore, obesity in Canada is a significant problem (39) that must be considered for population planning and health intervention for diabetes.

In addition to obesity, several lifestyle, environmental and demographic factors have been associated with diabetes, the main ones being ethnicity, physical activity, alcohol and tobacco (40). There is growing evidence that certain ethnic groups are at increased risk for developing type 2 diabetes. Globally, non-European populations have a higher proportional burden of type 2 diabetes compared with other populations of the world (41). The highest diabetes rates in the world are seen in Aboriginal populations, including those in Australia (42, 43), the United States (44, 45), and Canada (46, 47). In the United States, studies have shown that those of African and Hispanic descent are at increased risk for developing diabetes compared with non-Hispanic white Americans (48-50).
Throughout the world, those of South Asian descent have been shown to carry an increased burden of type 2 diabetes compared with both non-white and white ethnicities (51-53). The variability in risk between ethnicities may be due to a contribution of genetic factors, which can influence insulin metabolism and predisposition to weight gain (54, 55). On the other hand, the clustering of risk factors for type 2 diabetes within both families and cultural groups may partly explain the variability in risk (56-59). The debate on the relative contributions of genetic susceptibility versus environmental and lifestyle factors is ongoing and will continue to be discussed as more scientific evidence emerges (60).

Both intervention and observational studies have supported the reduced incidence of type 2 diabetes associated with increased physical activity (61-71). Physical activity can reduce diabetes risk by eliciting physiological changes at the metabolic level related to insulin-stimulating pathways (72), in addition to aiding in the maintenance of a healthy body weight. Several observational studies have now shown that moderate alcohol consumption is associated with a reduced risk of developing diabetes (73-77), whereas smoking appears to increase risk of diabetes (78-80).

Population health planning for diabetes

Estimating the future burden of diabetes is a valuable aspect of population health planning, especially given the current high prevalence of diabetes, the startling rise in the occurrence of obesity, and the substantial associated costs and consequences of the disease for the health system and individuals. Population estimates of future diabetes burden can be found in the literature using a variety of different techniques. Previous studies that estimate future diabetes burden have either extrapolated overall trends in diabetes prevalence or indirectly incorporated information on the influence of risk factors with various assumptions (19, 41, 48, 81, 82). Previous studies of diabetes lifetime risk and life expectancy are not predictive; rather, they describe diabetes from a life-course perspective using a period or stationary population approach (22, 48). Clinical risk algorithms have been developed for use in a clinical primary care setting. These types of algorithms are limited in their ability to be used at the population level because they are developed for the individual (versus the population) and require clinical data that are infrequently available at the population level, such as fasting blood sugar (83-85), or require detailed information such as diabetes family history (86, 87).

Population planning tools for diabetes in the form of a population-based risk algorithm have not yet been developed. Estimates of future incidence of diabetes using a risk algorithm approach and based on current baseline risk factors in the population will alert policy makers, planners, and physicians to the extent of the diabetes epidemic in communities. In addition, the effectiveness of widespread prevention strategies can be improved by knowing which groups to target and how extensive a strategy is needed to stabilize or reduce the number of new cases. In order to maximize the benefits of a population-based risk prediction tool for diabetes, it must be available to the widest audience and tightly integrated into existing risk factor surveillance systems to maximize efficiency. Therefore, an effective tool is required that is valid statistically but also meaningful and accessible in Canadian communities.

Thesis Objectives

The goal of this thesis is to demonstrate and inform the methods of population-based diabetes risk prediction. This goal will be investigated in three broad areas: (i) development and validation of a population-based risk tool; (ii) measurement; and, (iii) modeling of obesity trends. These components will be investigated using different types of data and distinct analytic approaches.

The first objective of this thesis is to develop and validate a risk tool to predict the probability of developing diabetes in the subsequent 10 years (10-year risk) for Canadians using widely available data. The model developed in this study is known as the Diabetes Population Risk Tool (or DPoRT). This is the first time that a risk tool has been developed specifically for the purpose of population health assessment and using regularly collected national data. The conceptual framework of this model can be extended to future studies to develop tools for disease planning at the population level in Canada and to enrich the epidemiologic field on predictive modeling strategies. Epidemiologists are in need of innovative and accessible tools to assess population risk, thereby making these risk algorithms an important scientific advance. The development and application of this novel population-based risk tool for diabetes raises important questions concerning the measurement of risk factors in the population and understanding obesity trends in the population.

The second objective is to understand the influence of measurement in risk prediction. Indeed, there are few issues more important to epidemiology than measurement. Though significant research is devoted to this area for relative risk estimation, there exists little information on how measurement error and types of variables influence a prediction algorithm. Measurement in the context of a prediction algorithm can differ from measurement in the context of an etiological model in several important ways. Firstly, the influence of measurement properties on risk prediction may differ from the influence on relative risk estimation in etiological models because they rely on different endpoints for their assessment.
Secondly, the data needed for population prediction must be generalizable to the population in addition to being regularly collected and reported on, and this has implications for the type of data that can be used to make the predictions. Specifically, the scale of this type of data may require reliance on methods (such as self-reporting) that may be more prone to error. In addition, the level of detail available at the population level may be more limited than clinical data used in epidemiological studies.

The first objective in the measurement section is to use simulation to understand the impact of measurement error on risk algorithm performance. Simulation is a useful tool for epidemiologists that allows one to carry out investigations that otherwise may not be possible due to lack of data. Specifically, this section will quantify and describe the effect of measurement error (differential and non-differential) in self-reported height and weight on the performance (discrimination and accuracy) and outcome (predicted risk) of a diabetes risk algorithm.

The second objective in this section is to investigate the influence of the scope of variables (or lack thereof) available at the population level. Detailed ethnic information is collected but not publicly reported in Canada. Therefore, in keeping with the goal of DPoRT to use publicly available population health surveillance data, DPoRT uses the form of ethnicity that is publicly reported as "white/non-white". The purpose of this section is to assess the impact of this data restriction by comparing DPoRT to an algorithm for diabetes that uses detailed ethnicity and to report on its relative importance in terms of improvement to the predictive accuracy of the model and policy implications. This work also provides insight into the role of ethnicity on diabetes risk in the context of other risk factors.

The third segment of this thesis is focused on studying obesity trends over time using longitudinal data and multilevel growth modeling. The objective of this section is to study how lifestyle and socio-demographic factors are associated with weight gain, which is key for preventing diabetes since obesity is its most influential risk factor.
The results from such an analysis are useful for the Diabetes Population Risk Tool for two reasons. First, it can provide further insight into the likely trajectories of obesity over time, and this can be used to improve DPoRT predictions or to allow for predictions farther into the future. Second, the results of the model can be used to determine characteristics of the population that identify those who are most likely at risk for weight gain (and thus are at increased risk for developing diabetes). In addition to elucidating the factors that modify weight gain, the multilevel growth model can identify factors that influence the rate at which it occurs over time. This can be used to better inform interventions that are assessed using DPoRT.

This thesis is an investigation into diabetes, obesity, population-based risk prediction, analytic methods, and measurement in the population setting. This thesis describes the development and validation of DPoRT, which can be applied using commonly collected national survey data. Conclusions drawn from the measurement analysis will inform research on the influence of measurement properties (error and type) on modeling and statistical prediction. Furthermore, the use of novel modeling strategies to model change of BMI provides important information on factors that influence the relationship between obesity and diabetes and demonstrates an important methodology for future epidemiological studies.
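
The simulation approach outlined for the measurement objective can be sketched as follows. This is a minimal illustration under stated assumptions, not the SAS simulation reported in the appendix: the distributions of height and weight, the sizes of the reporting errors, and the risk function `risk_from_bmi` are all hypothetical. The sketch compares mean predicted risk when BMI is calculated from true values, from values with random (non-differential) error, and from weights that are under-reported by an amount that grows with true weight, which is one simple way of making the error depend on a person's characteristics.

```python
import math
import random
import statistics


def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def risk_from_bmi(bmi_value, intercept=-7.0, slope=0.15):
    """Hypothetical risk function: predicted risk increases with BMI.

    The intercept and slope are made up for illustration only.
    """
    return 1.0 / (1.0 + math.exp(-(intercept + slope * bmi_value)))


def simulate_mean_risk(n=5000, seed=42):
    """Return mean predicted risk under three measurement scenarios."""
    rng = random.Random(seed)
    risks = {"true": [], "random_error": [], "under_reported": []}
    for _ in range(n):
        # Assumed "true" height (m) and weight (kg) for one person.
        height = rng.gauss(1.70, 0.09)
        weight = rng.gauss(78.0, 14.0)

        # Scenario 1: zero-mean random error in both self-reported values.
        noisy_height = height + rng.gauss(0.0, 0.02)
        noisy_weight = weight + rng.gauss(0.0, 3.0)

        # Scenario 2: heavier people under-report more
        # (assumed 5% of the weight above 70 kg).
        reported_weight = weight - 0.05 * max(weight - 70.0, 0.0)

        risks["true"].append(risk_from_bmi(bmi(weight, height)))
        risks["random_error"].append(
            risk_from_bmi(bmi(noisy_weight, noisy_height)))
        risks["under_reported"].append(
            risk_from_bmi(bmi(reported_weight, height)))
    return {name: statistics.mean(values) for name, values in risks.items()}


results = simulate_mean_risk()
for name, mean_risk in results.items():
    print(name, round(mean_risk, 4))
```

In this sketch, under-reported weight can only lower each person's BMI, so the mean predicted risk under that scenario falls below the mean based on true values. The effect of purely random error on the mean is less predictable, which is one reason to examine discrimination and calibration as well as the overall estimate.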

1.2 References

1. Stern N et al. Stern Review: The Economics of Climate Change. London: HM Treasury, 2006.
2. Hulme M et al. Relative impacts of human-induced climate change and natural climate variability. Nature 1999;397:688-91.
3. Steigerwald DG, Stuart C. Econometric estimation of foresight: Tax policy and investment in the United States. Review of Economics and Statistics 1997;79:32-40.
4. Ayres I. Supercrunchers: Why thinking-by-numbers is the new way to be smart. Random House, 2007.
5. Orrell D. Apollo's Arrow: The Science of Prediction and the Future of Everything. Toronto: HarperCollins Publishers, 2006.
6. Diamond GA. What price perfection? Calibration and discrimination of clinical prediction models. Journal of Clinical Epidemiology 1992;45:85-9.
7. Anderson KM et al. An Updated Coronary Risk Profile - A Statement for Health Professionals. Circulation 1991;83:356-62.
8. Hense HW et al. Framingham risk function overestimates risk of coronary heart disease in men and women from Germany-results from the MONICA Augsburg and the PROCAM cohorts. European Heart Journal 2003;24:937-45.
9. Kannel WB. Some Lessons in Cardiovascular Epidemiology from Framingham. American Journal of Cardiology 1976;37:269-82.
10. Leaverton PE et al. Representativeness of the Framingham risk model for coronary heart disease mortality: A comparison with a national cohort study. Journal of Chronic Diseases 1987;40:775-84.
11. Thomsen TF et al. A cross-validation of risk scores for coronary heart disease mortality based on data from the Glostrup Population Studies and Framingham Heart Study. International Journal of Epidemiology 2002;31:817-22.
12. Wilson PWF et al. Prediction of coronary heart disease using risk factor categories. Circulation 1998;97:1837-47.
13. Manuel DG et al. Effectiveness and efficiency of different guidelines on statin treatment for preventing deaths from coronary heart disease: modeling study. BMJ 2005;332:1419-22.
14. Nathan DM. Long-Term Complications of Diabetes-Mellitus. New England Journal of Medicine 1993;328:1676-85.
15. Alberti KGMM, Zimmet PZ. Definition, diagnosis and classification of diabetes mellitus and its complications. Part 1: Definition, diagnosis and classification of diabetes mellitus provisional report of a WHO consultation. Diabetic Medicine 1998;15:559-3.
16. Gannon MC, Nuttall FQ. Control of blood glucose in type 2 diabetes without weight loss by modification of diet composition. Nutrition and Metabolism 2006;3.
17. Inzucchi SE. Oral antiglycemic therapy for type 2 diabetes - scientific review. JAMA: The Journal of the American Medical Association 2002;287:360-72.
18. Zimmet P, Alberti KGMM, Shaw J. Global and societal implications of the diabetes epidemic. Nature 2001;414:782-7.
19. Wild S et al. Global prevalence of diabetes - Estimates for the year 2000 and projections for 2030. Diabetes Care 2004;27:1047-53.
20. Health Canada. Diabetes in Canada - Second Edition. H49-121. 2002. Ottawa, Health Canada. 1999.
21. Lipscombe LL, Hux JE. Trends in diabetes prevalence, incidence, and mortality in Ontario, Canada 1995-2005: a population-based study. Lancet 2007;369:750-6.
22. Manuel D, Schultz S. Health-related quality of life and health-adjusted life expectancy of people with diabetes in Ontario, Canada, 1996-1997. Diabetes Care 2004;27:407-14.
23. Health Canada. Economic Burden of Illness in Canada. 1998. Ottawa, Public Health Association of Canada.
24. Kahn SE, Hull RL, Utzschneider KM. Mechanisms linking obesity to insulin resistance and type 2 diabetes. Nature 2006;444:840-6.
25. Colditz G et al. Weight as a risk factor for clinical diabetes in women. American Journal of Epidemiology 1990;132:501-13.
26. Colditz G et al. Weight gain as a risk factor for clinical diabetes mellitus in women. Annals of Internal Medicine 1995;122:481-6.
27. Perry IJ et al. Prospective study of risk factors for development of non-insulin dependent diabetes in middle aged British men. British Medical Journal 1995;310:555-9.
28. Vanderpump MPJ et al. The incidence of diabetes mellitus in an English community: a 20-year follow-up of the Whickham Survey. Diabetic Medicine 1996;13:741-7.
29. Wilson P et al. Prediction of incident diabetes mellitus in middle-aged adults. Archives of Internal Medicine 2007;167:1068-74.
30. Knowler WC et al. Reduction in the incidence of type 2 diabetes with lifestyle intervention or metformin. New England Journal of Medicine 2002;346:393-403.

31. Li G et al. The long-term effect of lifestyle interventions to prevent diabetes in China Da Qing Diabetes Prevention Study: a 20-year follow-up study. Lancet 2008;371:1783-9.
32. Lindstrom J et al. Sustained reduction in the incidence of type 2 diabetes by lifestyle intervention: follow-up of the Finnish Diabetes Prevention Study. Lancet 2006;368:1673-9.
33. Tuomilehto J et al. Prevention of type 2 diabetes mellitus by changes in lifestyle among subjects with impaired glucose tolerance. New England Journal of Medicine 2001;344:1343-50.
34. Avenell A et al. Systematic review of the long-term effects and economic consequences of treatments for obesity and implications for health improvement. Health Technology Assessment 2004;8:1-+.
35. Mayor S. International Diabetes Federation consensus on prevention of type 2 diabetes. International Journal of Clinical Practice 2007;61:1773-5.
36. Haslam DW, James WP. Obesity. Lancet 2005;366:1197-209.
37. Katzmarzyk PT. The Canadian obesity epidemic. Obesity Research 2002;10.
38. Statistics Canada. Canadian Community Health Survey, 2005. 2005.
39. Katzmarzyk PT. The Canadian obesity epidemic. CMAJ 2002;166:1039-40.
40. van Dam RM. The epidemiology of lifestyle and risk for type 2 diabetes. European Journal of Epidemiology 2003;18:1115-25.
41. World Health Organization. Report of a WHO consultation on obesity, Obesity: preventing and managing the global epidemic. 1998. Geneva, World Health Organization.
42. O'Dea K. Westernization, Insulin Resistance and Diabetes in Australian Aborigines. Medical Journal of Australia 1991;155:258-64.
43. O'Dea K et al. Obesity, Diabetes, and Hyperlipidemia in A Central Australian Aboriginal Community with A Long History of Acculturation. Diabetes Care 1993;16:1004-10.
44. Mokdad AH et al. The continuing epidemics of obesity and Diabetes in the United States. JAMA: The Journal of the American Medical Association 2001;286:1195-200.
45. Pavkov ME et al. Changing patterns of type 2 diabetes incidence among Pima Indians. Diabetes Care 2007;30:1758-63.
46. Harris SB et al. The prevalence of NIDDM and associated risk factors in native Canadians. Diabetes Care 1997;20:185-7.


47. Young TK et al. Type 2 diabetes mellitus in Canada's First Nations: status of an epidemic in progress. Canadian Medical Association Journal 2000;163:561-6.
48. Narayan KMV et al. Lifetime risk for diabetes mellitus in the United States. JAMA 2003;290:1884-90.
49. Winkleby MA et al. Ethnic and socioeconomic differences in cardiovascular disease risk factors - Findings for women from the third national health and nutrition examination survey, 1988-1994. JAMA: The Journal of the American Medical Association 1998;280:356-62.
50. Brancati FL et al. Diabetes mellitus, race, and socioeconomic status - A population based study. Annals of Epidemiology 1996;6:67-73.
51. Abate N, Chandalia M. Ethnicity and type 2 diabetes - Focus on Asian Indians. Journal of Diabetes and Its Complications 2001;15:320-7.
52. Ramachandran A et al. Risk of noninsulin dependent diabetes mellitus conferred by obesity and central adiposity in different ethnic groups: A comparative analysis between Asian Indians, Mexican Americans and Whites. Diabetes Research and Clinical Practice 1997;36:121-5.
53. Ramachandran A et al. Rising prevalence of NIDDM in an urban population in India. Diabetologia 1997;40:232-7.
54. Qi L, Hu FB, Hu G. Genes, environment, and interactions in prevention of type 2 diabetes: A focus on physical activity and lifestyle changes. Current Molecular Medicine 2008;8:519-32.
55. Qi L. Genetic effects, gene-lifestyle interactions, and type 2 diabetes. Central European Journal of Medicine 2008;3:1-7.
56. Carmelli D, Cardon LR, Fabsitz R. Clustering of Hypertension, Diabetes, and Obesity in Adult Male Twins - Same Genes Or Same Environments. American Journal of Human Genetics 1994;55:566-73.
57. DeFronzo RA, Ferrannini E. Insulin resistance. A multifaceted syndrome responsible for NIDDM, obesity, hypertension, dyslipidemia, and atherosclerotic cardiovascular disease. Diabetes Care 1991;41:173-94.
58. Martin BC et al. Familial clustering of insulin sensitivity. Diabetes 1992;41:850-4.
59. Meigs JB et al. Risk variable clustering in the insulin resistance syndrome - The Framingham Offspring Study. Diabetes 1997;46:1594-600.
60. Zimmet P, Alberti KGMM, Shaw J. Global and societal implications of the diabetes epidemic. Nature 2001;414:782-7.


61. Bassuk SS, Manson JE. Epidemiological evidence for the role of physical activity in reducing risk of type 2 diabetes and cardiovascular disease. Journal of Applied Physiology 2005;99:1193-204.
62. Gill JMR, Cooper AR. Physical Activity and Prevention of Type 2 Diabetes Mellitus. Sports Medicine 2008;38:807-24.
63. Haapanen N et al. Association between leisure time physical activity and 10-year body mass change among working-aged men and women. International Journal of Obesity 1997;21:288-96.
64. Helmrich SP et al. Physical activity and reduced occurrence of non-insulin-dependent diabetes mellitus. New England Journal of Medicine 1991;325:147-52.
65. Hu FB et al. Walking Compared With Vigorous Physical Activity and Risk of Type 2 Diabetes in Women. JAMA 1999;282:1433-9.
66. Hu G et al. Physical activity, body mass index, and risk of type 2 diabetes in patients with normal or impaired glucose regulation. Archives of Internal Medicine 2004;164:892-6.
67. Meisinger C et al. Leisure time physical activity and the risk of type 2 diabetes in men and women from the general population. Diabetologia 2005;48:27-34.
68. Norman A et al. Total physical activity in relation to age, body mass, health and other factors in a cohort of Swedish men. International Journal of Obesity 2002;26:670-5.
69. Petersen L, Schnohr P, Sorensen TIA. Longitudinal study of the long-term relation between physical activity and obesity in adults. International Journal of Obesity 2004;28:105-12.
70. Villegas R et al. Physical activity and the incidence of type 2 diabetes in the Shanghai women's health study. International Journal of Epidemiology 2006;35:1553-62.
71. Manson JE et al. Physical-Activity and Incidence of Non-Insulin-Dependent Diabetes-Mellitus in Women. Lancet 1991;338:774-8.
72. Perseghin G et al. Increased glucose transport-phosphorylation and muscle glycogen synthesis after exercise training in insulin-resistant subjects. New England Journal of Medicine 1996;335:1357-62.
73. Davies MJ et al. Effects of moderate alcohol intake on fasting insulin and glucose concentrations and insulin sensitivity in postmenopausal women - A randomized controlled trial. JAMA: The Journal of the American Medical Association 2002;287:2559-62.
74. Friedman GD, Klatsky AL. Is alcohol good for your health? New England Journal of Medicine 1993;329:1882-3.
75. Howard AA, Arnsten JH, Gourevitch MN. Effect of alcohol consumption on diabetes mellitus - A systematic review. Annals of Internal Medicine 2004;140:211-9.


76. Marques-Vidal P et al. Relationships between alcoholic beverages and cardiovascular risk factor levels in middle-aged men, the PRIME study. Atherosclerosis 2001;157:431-40.
77. Wei M et al. Alcohol intake and incidence of type 2 diabetes in men. Diabetes Care 2000;23:18-22.
78. Manson JE et al. A prospective study of cigarette smoking and the incidence of diabetes mellitus among US male physicians. American Journal of Medicine 2000;109:538-42.
79. Nakanishi N et al. Cigarette smoking and risk for impaired fasting glucose and type 2 diabetes in middle-aged Japanese men. Annals of Internal Medicine 2000;133:183-91.
80. Wannamethee SG, Shaper AG, Perry IJ. Smoking as a modifiable risk factor for type 2 diabetes in middle-aged men. Diabetes Care 2001;24:1590-5.
81. Boyle JP et al. Projection of diabetes burden through 2050 - Impact of changing demography and disease prevalence in the US. Diabetes Care 2001;24:1936-40.
82. King H, Aubert RE, Herman WH. Global burden of diabetes, 1995-2025 - Prevalence, numerical estimates, and projections. Diabetes Care 1998;21:1414-31.
83. Eddy DM, Schlessinger L. Validation of the Archimedes diabetes model. Diabetes Care 2003;26:3102-10.
84. Hanley AJG et al. Prediction of type 2 diabetes using simple measures of insulin resistance - Combined results from the San Antonio Heart Study, the Mexico City Diabetes Study, and the Insulin Resistance Atherosclerosis Study. Diabetes 2003;52:463-9.
85. Ito C et al. Prediction of diabetes mellitus (NIDDM). Diabetes Research and Clinical Practice 1996;34:S7-S11.
86. Herman WH et al. A new and simple questionnaire to identify people at increased risk for undiagnosed diabetes. Diabetes Care 1995;18:382-7.
87. Lindstrom J, Tuomilehto J. The diabetes risk score: a practical tool to predict type 2 diabetes risk. Diabetes Care 2007;26:725-31.


2. A Population Based Prediction Algorithm for the Development of Physician Diagnosed Diabetes Mellitus: Development and Validation of the Diabetes Population Risk Tool (DPoRT)

2.1 Abstract

Background: Diabetes prevalence is increasing in most countries and its burden poses a serious threat to public health. National estimates of the magnitude of the upcoming diabetes epidemic are needed to understand the distribution of diabetes risk in the population and to inform health policy and resource planning.

Objective: The objective of this study is to create and validate a population-based risk prediction tool for incident diabetes mellitus (DM) using widely available public data.

Methods: Using a cohort design that links baseline risk factors to a validated population-based diabetes registry, a model (Diabetes Population Risk Tool or DPoRT) to predict 9-year risk for diabetes using commonly collected national survey data was developed and validated. The development cohort was the Ontario portion of the 1996/7 National Population Health Survey (NPHS) linked to the validated Ontario Diabetes Database (ODD), a provincial component of the National Diabetes Surveillance System (NDSS) in Canada. Variables were restricted to factors routinely measured in the population. The probability of developing diabetes was modeled using sex-specific Weibull survival functions for those > 20 years, without diabetes and not pregnant at baseline (N = 19,861). The model was validated in two external validation cohorts: the 2000/1 Canadian Community Health Survey (CCHS) in Ontario (N = 26,465) and the 1996/7 NPHS in Manitoba (N = 9,899), both linked to administrative data for NDSS-defined physician-diagnosed diabetes. Predictive accuracy was assessed by comparing observed physician-diagnosed diabetes rates with predicted risk estimates from DPoRT. Discrimination of the model was assessed using a C statistic and calibration was assessed with the Hosmer-Lemeshow chi-square statistic (χ²H-L).

Results: In the development cohort, 9-year age-standardized diabetes rates were 6.7% for males and 5.6% for females. Predictive factors included in the final model were: BMI, age, ethnicity, hypertension, immigrant status, smoking, education status and heart disease. The DPoRT-predicted incidence closely agreed with the observed diabetes incidence rates in the validation cohorts. Calibration, measured using the Hosmer-Lemeshow statistic, was satisfactory (χ²H-L < 20), and overall observed and predicted rates differed by less than 0.4%. Good discrimination was found in both validation cohorts. In CCHS-Ontario, C = 0.78 (95% CI 0.76, 0.80) for males and C = 0.77 (95% CI 0.75, 0.79) for females. In NPHS-Manitoba, C = 0.81 (95% CI 0.78, 0.83) for males and C = 0.81 (95% CI 0.77, 0.83) for females.

Conclusions: The algorithm accurately predicted diabetes incidence in two validation cohorts and demonstrated good discrimination. This algorithm can be used by health planners to estimate diabetes incidence and quantify the impact of interventions using routinely collected survey data.


2.2 Introduction

In medicine, prediction tools are used to calculate risk, defined as the probability of developing a disease or state in a given time period. Within the clinical setting, predictive tools such as the Framingham Heart Score (1), used to calculate the probability that a patient will develop coronary heart disease, have contributed important advances in individual patient treatment and disease prevention (2). Similarly, applying predictive risk tools to populations can provide insight into the influence of risk factors, the future burden of disease in an entire region or nation, and the value of interventions at the population level.

Global estimates place the number of people with diabetes at approximately 200 million, and increasing rapidly (3). There is a growing concern that if left unchecked these trends may slow or even reverse life expectancy gains in the US and other developed countries (4). Planning for health care and public health resources can be informed by robust prediction tools. Estimates of future diabetes incidence will alert policy makers, planners, and physicians to the extent and urgency of the diabetes epidemic. In addition, a population prediction tool for diabetes can identify the optimal target groups for new intervention strategies, and determine how extensive a strategy must be to achieve the desired reduction in new cases. This insight can improve the effectiveness and efficiency of large-scale prevention strategies.

Clinical risk algorithms have been used at the population level but with considerable challenges (5). In particular, clinical risk tools usually require clinical data, which are infrequently available at the population level. For diabetes, several clinical risk prediction tools exist, but they require clinical data that are collected infrequently or not at all at the population level, such as fasting blood sugar (6-8), or require detailed information such as diabetes family history (9, 10). In addition, some prediction tools apply only to specific sub-groups of the population, such as individuals within specific age ranges or individuals who have other co-morbid conditions, and thus are not suitable for national population estimates (11-13).

There are several ways in which an algorithm intended for population health application differs from an algorithm intended for clinical use. For a population algorithm the input variables must be representative of the entire population (ideally population-based), meaningful for health policy decision makers, available to a wide audience, and regularly collected so that estimates can be updated frequently. Algorithms used in clinical settings often maximize discrimination at the expense of accuracy, meaning that they do well at rank-ordering subjects but less well at predicting actual risk (14). In contrast, algorithms used in populations may favor accuracy over discrimination, because population health decision makers rely more heavily on estimates of absolute risk and numbers of disease cases than on the rank-ordering of individuals.

The objective of this study was to design and validate a diabetes risk prediction tool for use in populations using publicly available data. The creation and application of a population-based risk algorithm for diabetes was feasible because the risk factors for diabetes are well known and readily measured through self-reported questionnaires in many countries' population health surveys. Studies have overwhelmingly shown that the most influential risk factor for the development of diabetes is excess weight, or obesity (15, 16), which is collected nationally in Canada. DPoRT generates diabetes risk using data that are routinely collected in Canada through health surveys.

The objective of this study is to create a risk algorithm for diabetes incidence that can be applied at the level of populations using widely available public data. The Diabetes Population Risk Tool or DPoRT was created and validated by individually linking three different provincial population health surveys to population-based registries of physician-diagnosed diabetes.


2.3 Methods

DPoRT derivation cohort

The cohort used to develop the risk algorithm was derived from 23,403 Ontario residents who responded to the 1996/7 National Population Health Survey (NPHS-ON), conducted by Statistics Canada, which had an overall 83% response rate (17), and who were linkable to health administrative databases in Ontario. In the NPHS, households were selected through stratified, multilevel cluster sampling of private residences using provinces and/or local planning regions as the primary sampling unit. The survey was conducted by telephone and all responses were self-reported. Persons under the age of 20 (n = 2,407) and those who had self-reported diabetes (n = 894) were excluded. Those who were pregnant at the time of the survey were also excluded (n = 241) because baseline Body Mass Index (BMI) could not be accurately ascertained, leaving a total of 19,861 individuals (Figure 1). Sixty-six males were further excluded when applying the algorithm due to missing baseline BMI, resulting in 9,177 males and 10,618 females in the final cohort. Respondents from the survey were individually linked to a chart-validated registry of physician-diagnosed diabetes, the Ontario Diabetes Database, allowing each member of the cohort to be followed to determine their diabetes status for the next 9 years (to 2005/6).
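The exclusion counts reported above can be checked with simple arithmetic. The short sketch below only re-adds the numbers given in the text; the variable names are illustrative and are not taken from the thesis.

```python
# Counts as reported in the text; names are illustrative only.
respondents = 23_403
exclusions = [
    ("age < 20 years", 2_407),
    ("self-reported diabetes", 894),
    ("pregnant at interview", 241),
]

remaining = respondents
for reason, n in exclusions:
    remaining -= n
    print(f"after excluding {reason}: {remaining:,}")

# Males with missing baseline BMI are dropped when the algorithm is applied.
missing_bmi_males = 66
analysed = remaining - missing_bmi_males
males, females = 9_177, 10_618
print(f"analysed: {analysed:,} (males + females = {males + females:,})")
```

The running total reaches 19,861 after the three exclusions, and removing the 66 males with missing BMI gives the 9,177 males and 10,618 females of the final cohort.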

DPoRT validation cohorts

The Manitoba respondents of the 1996/7 NPHS (NPHS-MB) were used as one of two validation cohorts. The same sampling strategy and methodology used to carry out the Ontario portion of the NPHS was used in Manitoba. There were N = 10,118 persons linked to health care data in Manitoba. The same exclusion criteria were applied to the NPHS-MB cohort, resulting in 9,899 individuals for the validation.

The second validation data set was from the Ontario portion of the 2000/1 Canadian Community Health Survey (CCHS, Cycle 1.1, N = 37,473), a national telephone survey administered by Statistics Canada. The target population of the CCHS consisted of persons aged 12 and over resident in private dwellings in all provinces and territories, except those living on Aboriginal reserves, on Canadian Forces Bases, or in some remote places. The CCHS included the same self-reported health questions as the NPHS. Like the NPHS, this survey uses a multistage stratified cluster design and provides cross-sectional data representative of 98% of the Canadian population over the age of 12 years, and it attained an 80% overall response rate (18, 19). After the exclusion criteria were applied there were 26,465 individuals in the validation cohort.

The NPHS-MB cohort had a 9-year follow-up (1996-2005) and the CCHS-ON cohort had a 5-year follow-up (2000-2005); the predicted risks from the DPoRT algorithm were therefore generated for both 9-year and 5-year follow-up periods.

Figure 1. Cohort used to develop DPoRT
[Flow diagram: N = 23,403 respondents; 2,407 aged < 20 years excluded (1,138 and 1,269, by sex), leaving N = 20,996; 241 pregnant at interview excluded, leaving N = 20,755; 894 with prior DM excluded (452 and 442, by sex), leaving N = 19,861. During follow-up, 1,410 developed DM (692 females, 718 males) and 18,451 did not; 859 females and 773 males died, leaving N = 16,819.]


Identifying respondents who develop diabetes

Survey data from the development and validation cohorts were linked to provincial administrative health care databases that include all persons covered under the government-funded universal health insurance plan. The diabetes status of all respondents in Ontario was established by linking persons to the Ontario Diabetes Database (ODD), which contains all physician-diagnosed diabetes patients in Ontario identified since 1991. The database was created using hospital discharge abstracts and physician service claims. A patient is said to have physician-diagnosed diabetes if he or she meets at least one of the following two criteria: (a) a hospital admission with a diabetes diagnosis (International Classification of Diseases, Clinical Modification (ICD-9-CM) code 250 before 2002, or ICD-10 codes E10 to E14 after 2002), or (b) a physician services claim with a diabetes diagnosis (code 250) followed within two years by either a physician services claim or a hospital admission with a diabetes diagnosis. Individuals entered the ODD as incident cases when they were defined as having diabetes according to these criteria. Hospital records with a diagnosis of pregnancy care or delivery close to a diabetic record (i.e. a gestational admission date between 90 days before and 120 days after the diabetic record date) were considered to relate to a diagnosis of gestational diabetes and were therefore excluded. The ODD has been validated against primary care health records and demonstrated excellent accuracy for determining the incidence and prevalence of diabetes in Ontario (sensitivity 86%, specificity 97%) (20, 21). Information regarding vital statistics and eligibility for health care coverage for linked respondents was captured from the Registered Persons Data Base (RPDB).

The algorithm used to create the ODD was applied to Manitoba's administrative health care data to ascertain physician-diagnosed diabetes status in that province. The ODD algorithm is applied nationally using provincial administrative registries (known as the National Diabetes Surveillance System (NDSS)) and has been successfully validated in several Canadian provinces (22).
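The two-criterion case definition described above can be illustrated with a minimal sketch. This is not the ODD implementation: the function name and inputs are hypothetical, "within two years" is assumed to mean 730 days, a confirming record is assumed to fall strictly after the first claim, gestational records are assumed to have been removed beforehand, and the returned date (the day the definition is first met) is also an assumption.

```python
from datetime import date, timedelta

TWO_YEARS = timedelta(days=730)  # assumption: "within two years" = 730 days


def date_case_definition_met(hospital_dates, claim_dates):
    """Return the date the case definition is first met, or None.

    hospital_dates: dates of hospital admissions with a diabetes diagnosis
    claim_dates: dates of physician service claims with a diabetes diagnosis
    Gestational records are assumed to have been removed already.
    """
    met = []
    # Criterion (a): any hospital admission with a diabetes diagnosis.
    if hospital_dates:
        met.append(min(hospital_dates))
    # Criterion (b): a claim followed within two years by another claim
    # or by a hospital admission with a diabetes diagnosis.
    all_records = sorted(list(hospital_dates) + list(claim_dates))
    for claim in sorted(claim_dates):
        confirming = [d for d in all_records if claim < d <= claim + TWO_YEARS]
        if confirming:
            met.append(confirming[0])
            break
    return min(met) if met else None


# A single claim is not enough; a second claim within two years is.
print(date_case_definition_met([], [date(2000, 1, 10)]))
print(date_case_definition_met([], [date(2000, 1, 10), date(2001, 6, 1)]))
```

Requiring a confirming record for claims, but not for hospital admissions, mirrors the asymmetry between criteria (a) and (b) in the text.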

Variables

To ensure that DPoRT would be widely applicable across different populations, variables considered for the algorithm had to fulfil the following criteria: (i) be based on well established evidence, (ii) be captured in a consistent manner across surveys (i.e. using the same questions), (iii) be unlikely to be subject to serious self-reporting error (a known problem for alcohol and dietary habits, for example), and (iv) be easily captured using survey data. All potential variables were obtained from self-report responses to the NPHS and CCHS telephone surveys, including: age, height and weight, presence of chronic conditions diagnosed by a health professional (including hypertension and heart disease), ethnicity, immigration status, smoking status, highest level of achieved education, total household income, alcohol consumption and physical activity (based on metabolic equivalents). Body Mass Index (BMI) in kg/m² was used as an indicator of obesity. The derived BMI variable provided in the NPHS (weight in kilograms divided by the square of height in meters) is only calculated for respondents aged 30 to 64; for respondents outside this age range, BMI was therefore calculated from the individual's reported weight and height (23). In order to facilitate future use in a variety of settings, all variables in the risk algorithm were kept in the form in which they are released in the public use data.
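As a small illustration of the BMI calculation described above, the sketch below computes BMI from self-reported weight and height and assigns the categories that appear in Table 1. The function names and category labels are illustrative assumptions, not variable names from the survey files.

```python
def bmi(weight_kg, height_m):
    """Body Mass Index in kg/m^2, or None if either input is unusable."""
    if weight_kg is None or height_m is None or height_m <= 0:
        return None
    return weight_kg / height_m ** 2


def bmi_category(value):
    """Assign the BMI groups reported in Table 1 (labels are illustrative)."""
    if value is None:
        return "missing"
    if value < 23:
        return "<23"
    if value < 25:
        return "23-<25"
    if value < 30:
        return "25-<30"
    if value < 35:
        return "30-<35"
    return ">=35"


value = bmi(70.0, 1.75)
print(round(value, 2), bmi_category(value))
```

Keeping "missing" as its own category, rather than dropping those respondents, allows missingness itself to be examined as a predictor.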


Statistical Analysis

The goal of the model was to create a risk algorithm that would accurately predict diabetes risk with high discrimination and calibration. Discrimination is the ability to differentiate between those who are at high risk and those who are at low risk, or in this case between those who will and will not develop diabetes given a fixed set of variables. Calibration is the existence of close agreement between observed and predicted diabetes risk, assessed by comparing observed diabetes rates with predicted probabilities from DPoRT. Detailed definitions of discrimination and calibration are in appendix 6.1.

For each cohort member, the probability of physician-diagnosed diabetes was assessed from the interview date until censoring for death or end of follow-up. The final model was fit using a Weibull accelerated failure time model, which provides simple and robust survival probabilities. Importantly, this model includes time in its equation, allowing the user to predict probability for a range of follow-up periods. Alternative parametric models (exponential and log-logistic) and a semi-parametric model (Cox proportional hazards model) were also considered but either did not fit the data appropriately or did not perform as well, particularly when applied to external cohorts. Furthermore, the ability to assess risk for a variety of follow-up times is an important feature needed when applying this model in different settings.

Diabetes risk functions were derived separately for men and women above age 20 without prior diabetes diagnoses. The following covariates were considered as candidates for the algorithm: age, BMI and their interactions, presence of hypertension, presence of heart disease, ethnicity, immigration status, smoking status, education, income, and physical activity (based on metabolic equivalents or METS).
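To make concrete how a Weibull accelerated failure time model turns covariates and a follow-up time into a probability, the sketch below uses the common log-linear parameterization log(T) = mu + scale × W, where W has a standard extreme value (minimum) distribution, so that S(t) = exp(−exp((log t − mu)/scale)). Treating this as the parameterization behind DPoRT is an assumption, as is measuring time in days; the coefficients are made up for illustration and are not the DPoRT estimates.

```python
import math


def weibull_aft_risk(mu, scale, t):
    """Probability of the event occurring by time t.

    Assumes log(T) = mu + scale * W with W standard extreme value (minimum),
    so that S(t) = exp(-exp((log(t) - mu) / scale)).
    """
    z = (math.log(t) - mu) / scale
    return 1.0 - math.exp(-math.exp(z))


# Hypothetical coefficients for illustration only (NOT DPoRT estimates).
intercept = 10.0
coefficients = {"hypertension": -0.5, "non_white": -0.4}
scale = 0.8


def linear_predictor(values):
    """Intercept plus the sum of coefficient x covariate products."""
    return intercept + sum(coefficients[k] * values[k] for k in coefficients)


person = {"hypertension": 1.0, "non_white": 0.0}
baseline = {"hypertension": 0.0, "non_white": 0.0}

# Assumption: follow-up time is expressed in days.
for label, days in [("5-year", 5 * 365.25), ("9-year", 9 * 365.25)]:
    risk = weibull_aft_risk(linear_predictor(person), scale, days)
    print(label, round(risk, 4))
```

Because time enters the formula directly, the same fitted coefficients yield both 5-year and 9-year risks. In this parameterization a negative coefficient shortens the expected time to diagnosis and therefore raises the predicted risk.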
First, unadjusted relationships with diabetes were inspected, and variables with a clinically significant unadjusted hazard ratio were considered for the final model. Risk factors were added to the model in a nested fashion based on clinical importance, and the marginal statistical and predictive significance of each group was evaluated, controlling for variables already in the model. For both males and females, the BMI-age interactions were fit first, then hypertension and heart disease, then ethnicity and immigration status, then the remaining risk factors. Covariates with missing information were also assessed for their relationship with the outcome, because missingness is often associated with higher risk; in this case this applied to missing BMI for females. Polynomial functions and splines were fit to try to accurately capture the relationship between BMI and diabetes and its interaction with age. Each variable was centered on the population mean before inclusion in the model to allow for easier calibration with other cohorts. This means that when the algorithm is applied, all variables are centered on the mean values for the cohort to which it is being applied, allowing levels of risk to reflect the average baseline risk in the cohort of interest. Overall risk (predicted probability) of diabetes for each person was calculated by multiplying the individual's risk factor values by the corresponding regression coefficients and summing the products (24).

The form of the model was assessed using likelihood ratio tests to compare nested parametric models (25). A plot of log(-log S(t)) versus log(t) was inspected for linearity to assess consistency of the survival times with the Weibull distribution. Cox-Snell residual plots were also constructed to assess the adequacy of the Weibull distribution assumption. The Cox proportional hazards model was fit to compare the risks with the parametric model. Discrimination was measured using a C statistic modified for survival data developed by Pencina et al. (26), analogous to the area under the ROC curve (27). Calibration was assessed using a modified version of the Hosmer-Lemeshow χ² statistic developed by Nam (28, 29).
This statistic is computed by dividing the validation cohort into deciles of predicted risk and comparing the observed versus predicted risk in each decile using a chi-square statistic (see appendix 6.1). To mark sufficient calibration, χ² = 20 was used as a cutoff (p < 0.01), consistent with D'Agostino's method in validating the Framingham algorithms (28). Discrimination and calibration statistics were also computed using the algorithm coefficients generated from the validation cohort itself, labeled "own cohort". This was done to assess whether the algorithms generated from the validation cohort produced significantly different measures of predictive accuracy than DPoRT.

In addition to the Hosmer-Lemeshow test for calibration, graphical representations of predicted and observed rates were produced to assess accuracy in prediction across quantiles of risk. To generate these plots, predicted risks are ordered from smallest to largest and separated into quantiles of risk (in this study deciles or quintiles). The number of predicted diabetes cases in each quantile is determined by summing the predicted risks within the quantile, and the observed diabetes cases are also summed within each quantile. To present these numbers as proportions, the numbers of predicted and observed cases are each divided by the number of individuals within the decile or quintile. The observed and predicted proportions are then plotted against each other at each quantile of risk.

Re-calibration was achieved by substituting the mean values from the validation cohort to define all variables (which were mean-centered), following the approach used with the Framingham risk function. Due to systematic case ascertainment differences between jurisdictions, a further adjustment was applied to predicted rates outside of Ontario. This adjustment was done by estimating the average ratio between observed and predicted risk across deciles (after taking into account differences in risk level through mean-centering).
The aggregate risk estimate was divided by the amount of over-prediction averaged across all risk groups. The assumption for this adjustment was that the differences are a function of case ascertainment (due to billing practices) between provinces. This approach has been used in previous research involving the calibration of risk algorithms, including by Brindle et al. to adjust Framingham risk functions for the UK (30).

All estimates (including betas and variance estimates) incorporated bootstrap replicate survey weights to accurately reflect the demographics of the Canadian population and to account for the survey sampling design based on selection probabilities and post-stratification adjustments. Variance estimates and 95% confidence intervals were calculated using bootstrap survey weights (31, 32). All statistics were computed using SAS statistical software (version 9.1; SAS Institute Inc, Cary, NC).
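The calibration steps described in this section (grouping by predicted risk, comparing observed and predicted proportions, and estimating average over-prediction) can be sketched as follows. This is a deliberately simplified illustration, not the Nam-D'Agostino statistic used in the thesis: it treats the outcome as binary, ignores censoring and survey weights, and all function names are hypothetical.

```python
def calibration_table(predicted, observed, groups=10):
    """Order people by predicted risk, split into groups, and summarise each."""
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    rows = []
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        rows.append({
            "n": size,
            "predicted": sum(p for p, _ in chunk) / size,  # mean predicted risk
            "observed": sum(y for _, y in chunk) / size,   # observed proportion
        })
    return rows


def hosmer_lemeshow_chi2(rows):
    """Sum over groups of n * (observed - predicted)^2 / (p * (1 - p))."""
    return sum(
        r["n"] * (r["observed"] - r["predicted"]) ** 2
        / (r["predicted"] * (1.0 - r["predicted"]))
        for r in rows
    )


def mean_overprediction(rows):
    """Average predicted-to-observed ratio across groups with observed cases."""
    ratios = [r["predicted"] / r["observed"] for r in rows if r["observed"] > 0]
    return sum(ratios) / len(ratios)


# Toy data: two risk groups in which predictions match observed proportions.
predicted = [0.05] * 20 + [0.20] * 20
observed = [1] + [0] * 19 + [1] * 4 + [0] * 16
rows = calibration_table(predicted, observed, groups=2)
print(rows)
print(hosmer_lemeshow_chi2(rows), mean_overprediction(rows))
```

Dividing an aggregate predicted risk by the value returned from `mean_overprediction` corresponds to the adjustment described in the text, under the stated assumption that the discrepancy reflects case ascertainment rather than model error.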


2.4 Results

In the development cohort, 718 males and 692 females developed physician-diagnosed diabetes within the 9-year follow-up period (crude 9-year incidence rates of 7.78% and 6.13% respectively). The age-standardized (standardized to the 1991 Canadian population) 9-year incidence rates in the development cohort were 6.67% for males and 5.59% for females. The 5-year age-standardized incidence rates in the development cohort were 3.59% for males and 2.81% for females. In the Manitoba validation cohort, 272 males and 258 females developed physician-diagnosed diabetes within the 9-year follow-up period (crude 9-year incidence rates of 7.22% and 4.75% respectively). The age-standardized 9-year incidence rates in the Manitoba validation cohort were 6.55% for males and 4.27% for females. In the Ontario-CCHS validation cohort, 559 males and 558 females developed physician-diagnosed diabetes within the 5-year follow-up period (crude 5-year incidence rates of 4.60% and 3.69% respectively). The age-standardized 5-year incidence rates in the Ontario-CCHS validation cohort were 3.95% for males and 3.35% for females.

All baseline population characteristics in the derivation cohort and the two validation cohorts are shown in table 1. Both the Manitoba and Ontario validation cohorts differed from the derivation cohort. They were similar in age distribution; however, both NPHS-MB and CCHS-ON had a higher proportion of obese individuals. The CCHS-ON cohort had a significantly higher proportion of non-white respondents than the derivation cohort, whereas the derivation cohort and the Manitoba validation cohort were similar. Compared to the derivation cohort, the CCHS-ON cohort had a higher baseline prevalence of hypertension and heart disease but a lower prevalence of smoking. The Manitoba validation cohort had higher levels of hypertension and heart disease than the derivation cohort in women only.
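The age-standardized rates above weight age-specific rates by a standard population. A minimal sketch of direct standardization is given below; the age strata, rates and standard population counts are invented for illustration and are neither the 1991 Canadian population nor results from this study.

```python
def directly_standardized_rate(stratum_rates, standard_population):
    """Weighted average of stratum rates, weighted by the standard population."""
    total = sum(standard_population.values())
    return sum(
        stratum_rates[age] * standard_population[age]
        for age in standard_population
    ) / total


# Hypothetical age-specific 9-year incidence proportions and standard counts.
stratum_rates = {"20-44": 0.02, "45-64": 0.10, "65+": 0.15}
standard_population = {"20-44": 600, "45-64": 300, "65+": 100}

rate = directly_standardized_rate(stratum_rates, standard_population)
print(f"{100 * rate:.2f}%")
```

Applying the same standard population to every cohort removes differences in age structure from the comparison, which is why crude and standardized rates can differ within a cohort.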


Table 1. Baseline characteristics of development and validation cohorts. Categorical variables are represented as a proportion (%) and continuous variables are represented as a mean/median.

| Risk factor | Males: Development Cohort, Ontario NPHS (N = 9,177) | Males: Validation Cohort, Manitoba NPHS (N = 4,670) | Males: Validation Cohort, Ontario CCHS (N = 12,020) | Females: Development Cohort, Ontario NPHS (N = 10,618) | Females: Validation Cohort, Manitoba NPHS (N = 5,229) | Females: Validation Cohort, Ontario CCHS (N = 14,445) |
| --- | --- | --- | --- | --- | --- | --- |
| Body Mass Index (kg/m²), mean/median | 26.10/25.70 | 26.86/26.31 | 26.12/25.62 | 24.47/23.50 | 25.43/24.59 | 24.98/24.03 |
| Age, mean/median | 44/42 | 44/42 | 44/42 | 46/42 | 47/42 | 46/42 |
| Age < 45, % | 54.80 | 55.67 | 55.85 | 51.68 | 52.71 | 51.59 |
| 45 ≤ Age < 65, % | 30.78 | 29.71 | 31.00 | 29.92 | 27.79 | 31.51 |
| Age ≥ 65, % | 14.42 | 14.63 | 13.15 | 18.39 | 19.51 | 16.90 |
| BMI < 23, % | 19.48 | 17.79 | 22.23 | 40.39 | 35.89 | 39.29 |
| 23 ≤ BMI < 25, % | 22.11 | 20.34 | 21.51 | 19.01 | 16.65 | 17.79 |
| 25 ≤ BMI < 30, % | 43.97 | 44.34 | 40.03 | 24.36 | 28.51 | 27.19 |
| 30 ≤ BMI < 35, % | 11.31 | 14.40 | 12.74 | 8.50 | 10.55 | 9.47 |
| BMI ≥ 35, % | 2.40 | 2.63 | 3.05 | 2.77 | 3.15 | 4.11 |
| BMI missing, % | 0.73 | 0.51 | 0.44 | 4.98 | 5.26 | 2.14 |
| Non-white, % | 11.51 | 10.42 | 16.68 | 10.41 | 10.51 | 16.76 |
| Hypertension, % | 10.23 | 10.16 | 12.50 | 12.32 | 13.22 | 14.94 |
| Current smoker, % | 29.67 | 30.76 | 24.78 | 24.48 | 24.40 | 18.97 |
| Physical activity, METs (kcal/day), mean/median | 1.86/1.20 | 1.79/1.10 | 1.97/1.30 | 1.62/1.10 | 1.44/1.00 | 1.63/1.10 |
| Heart disease, % | 4.97 | 4.43 | 5.19 | 4.16 | 4.62 | 5.24 |
| Graduated post-secondary school, % | 81.12 | 73.28 | 82.11 | 81.86 | 73.81 | 81.09 |
| Number of incident diabetes cases (unweighted) | 718 | 272 | 559 | 692 | 258 | 558 |
| Crude 9-year incidence rate, % | 7.78 | 7.22 | - | 6.13 | 4.75 | - |
| Age-standardized* 9-year incidence rate, % | 6.67 | 6.55 | - | 5.59 | 4.27 | - |
| Crude 5-year incidence rate, % | 4.26 | - | 4.60 | 3.23 | - | 3.69 |
| Age-standardized* 5-year incidence rate, % | 3.59 | - | 3.95 | 2.81 | - | 3.35 |

*standardized to the 1991 Canadian population


For both males and females, diabetes risk was strongly and positively related to BMI and age (tables 2a and 2b). BMI was considered in both its continuous and categorical form. The relationship between BMI and diabetes was non-linear, and complicated power transformations were needed to model the association correctly. Continuous BMI not only had a complicated exponential form, but its relationship with diabetes also varied significantly by age and sex. BMI categories had the advantage of not being constrained to a specific mathematical function and of having regression coefficients that were easier to interpret than those from a series of polynomial BMI terms. Ultimately, the best goodness-of-fit and calibration were achieved by categorizing BMI and including its interactions with age, since the effect of BMI was highly dependent on age. This categorization was least likely to over-fit when applied to other data while at the same time remaining discriminating.

Non-white ethnicity, hypertension, and less than post secondary education were also important factors associated with an increased risk of diabetes in the multivariate model. For males, smoking and heart disease were also important independent risk factors that were found to improve model characteristics; for women, immigrant status was an additional important risk factor found to improve the model. Some variables were excluded from both models because they did not improve the model or worsened predictive accuracy, including income, physical activity, and alcohol consumption. The majority of discrimination (C = 0.70) was achieved by including only age, BMI and their interaction. The sex-specific estimates for the DPoRT algorithm are shown in table 3. Figure 2 demonstrates how the risk coefficients in DPoRT were used to calculate risk, using a high-risk male as an example.


Table 2a. DPoRT multivariate-adjusted hazard ratios and 95% confidence intervals for 9-year physician diagnosed diabetes for males

| Risk Factor | Level | Hazard ratio | 95% CI |
|---|---|---|---|
| Hypertension | No | 1.00 | |
| | Yes | 1.38 | (1.08, 1.78)* |
| Non-white ethnicity | No | 1.00 | |
| | Yes | 2.19 | (1.54, 3.10)‡ |
| Heart Disease | No | 1.00 | |
| | Yes | 1.95 | (1.43, 2.65)‡ |
| Current Smoker | No | 1.00 | |
| | Yes | 1.25 | (1.00, 1.57) |
| Education | < Post-secondary | 1.00 | |
| | Secondary | 0.75 | (0.61, 0.92)† |
| Age-BMI category | BMI<23*Age<45 | 1.00 | |
| | 23≤BMI<25*Age<45 | 4.65 | (1.53, 14.12)‡ |
| | 25≤BMI<30*Age<45 | 6.85 | (2.41, 19.46)‡ |
| | 30≤BMI<35*Age<45 | 23.58 | (7.89, 70.43)‡ |
| | BMI≥35*Age<45 | 74.68 | (24.23, 230.17)‡ |
| | BMI<23*Age≥45 | 11.70 | (3.97, 34.50)‡ |
| | 23≤BMI<25*Age≥45 | 20.79 | (7.27, 59.50)‡ |
| | 25≤BMI<30*Age≥45 | 34.44 | (12.17, 97.48)‡ |
| | 30≤BMI<35*Age≥45 | 61.69 | (20.75, 183.38)‡ |
| | BMI≥35*Age≥45 | 86.04 | (26.61, 278.20)‡ |

Adjusted for all variables in the model simultaneously; all standard errors are computed using survey bootstrap weights *0.01

Table 2b. DPoRT multivariate-adjusted hazard ratios and 95% confidence intervals for 9-year physician diagnosed diabetes for females

| Risk Factor | Level | Hazard ratio | 95% CI |
|---|---|---|---|
| Hypertension | No | 1.00 | |
| | Yes | 1.44 | (1.11, 1.88)† |
| Non-white ethnicity | No | 1.00 | |
| | Yes | 1.74 | (1.16, 2.60)† |
| Immigrant Status | No | 1.00 | |
| | Yes | 1.45 | (1.17, 1.81)‡ |
| Education | < Post-secondary | 1.00 | |
| | Secondary | 0.77 | (0.61, 0.97)* |
| Age-BMI category | BMI<23*Age<45 | 1.00 | |
| | 23≤BMI<25*Age<45 | 2.00 | (0.89, 4.50) |
| | 25≤BMI<30*Age<45 | 2.94 | (1.47, 5.92)† |
| | 30≤BMI<35*Age<45 | 6.08 | (3.01, 12.28)‡ |
| | BMI≥35*Age<45 | 13.75 | (6.27, 30.17)‡ |
| | BMI = missing*Age<45 | 4.26 | (1.95, 9.33)‡ |
| | BMI<23*45≤Age<65 | 0.91 | (0.41, 2.04) |
| | 23≤BMI<25*45≤Age<65 | 2.45 | (1.18, 5.10)* |
| | 25≤BMI<30*45≤Age<65 | 6.13 | (3.23, 11.63)‡ |
| | 30≤BMI<35*45≤Age<65 | 17.03 | (8.63, 33.60)‡ |
| | BMI≥35*45≤Age<65 | 18.25 | (8.69, 18.33)‡ |
| | BMI = missing*45≤Age<65 | 9.11 | (4.29, 19.31)‡ |
| | BMI<23*Age≥65 | 4.00 | (2.05, 7.77)‡ |
| | 23≤BMI<25*Age≥65 | 4.31 | (2.12, 8.75)‡ |
| | 25≤BMI<30*Age≥65 | 7.75 | (4.16, 14.45)‡ |
| | 30≤BMI<35*Age≥65 | 11.75 | (5.43, 25.42)‡ |
| | BMI≥35*Age≥65 | 16.61 | (6.12, 45.06)‡ |
| | BMI = missing*Age≥65 | 10.38 | (2.75, 39.21)‡ |

Adjusted for all variables in the model simultaneously; all standard errors are computed using survey bootstrap weights *0.01

Table 3. DPoRT Functions for predicting 9-year physician diagnosed diabetes for a) males and b) females

(a) Men

| Risk Factor | Level | Coefficient |
|---|---|---|
| Intercept | | 10.5971 |
| Hypertension | No | 0.00 |
| | Yes | -0.2624 |
| Non-white Ethnicity | No | 0.00 |
| | Yes | -0.6316 |
| Heart Disease | No | 0.00 |
| | Yes | -0.5355 |
| Current Smoker | No | 0.00 |
| | Yes | -0.1765 |
| Education | < Post-secondary | 0.00 |
| | Secondary | 0.2344 |
| Age-BMI category | BMI<23*Age<45 | 0.00 |
| | 23≤BMI<25*Age<45 | -1.2378 |
| | 25≤BMI<30*Age<45 | -1.5490 |
| | 30≤BMI<35*Age<45 | -2.5437 |
| | BMI≥35*Age<45 | -3.4717 |
| | BMI<23*Age≥45 | -1.9794 |
| | 23≤BMI<25*Age≥45 | -2.4426 |
| | 25≤BMI<30*Age≥45 | -2.8488 |
| | 30≤BMI<35*Age≥45 | -3.3179 |
| | BMI≥35*Age≥45 | -3.5857 |
| Scale | | 0.8049 |

(b) Women

| Risk Factor | Level | Coefficient |
|---|---|---|
| Intercept | | 10.5474 |
| Hypertension | No | 0.00 |
| | Yes | -0.2865 |
| Non-white Ethnicity | No | 0.00 |
| | Yes | -0.4309 |
| Immigrant Status | No | 0.00 |
| | Yes | -0.2930 |
| Education | < Post-secondary | 0.00 |
| | Secondary | 0.2042 |
| Age-BMI category | BMI<23*Age<45 | 0.00 |
| | 23≤BMI<25*Age<45 | -0.5432 |
| | 25≤BMI<30*Age<45 | -0.8453 |
| | 30≤BMI<35*Age<45 | -1.4104 |
| | BMI≥35*Age<45 | -2.0483 |
| | BMI = missing*Age<45 | -1.1328 |
| | BMI<23*45≤Age<65 | 0.0711 |
| | 23≤BMI<25*45≤Age<65 | -0.7011 |
| | 25≤BMI<30*45≤Age<65 | -1.4167 |
| | 30≤BMI<35*45≤Age<65 | -2.2150 |
| | BMI≥35*45≤Age<65 | -2.2695 |
| | BMI = missing*45≤Age<65 | -1.7260 |
| | BMI<23*Age≥65 | -1.0823 |
| | 23≤BMI<25*Age≥65 | -1.1419 |
| | 25≤BMI<30*Age≥65 | -1.5999 |
| | 30≤BMI<35*Age≥65 | -1.9254 |
| | BMI≥35*Age≥65 | -2.1959 |
| | BMI = missing*Age≥65 | -1.8284 |
| Scale | | 0.7814 |
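The hazard ratios in tables 2a and 2b appear to follow from the coefficients in table 3 through the usual Weibull accelerated failure time identity, HR = exp(−β/scale). The short Python check below is written under that assumption; the function name is illustrative and is not part of DPoRT.

```python
import math


def hazard_ratio(beta, scale):
    """Hazard ratio implied by a Weibull AFT coefficient: exp(-beta / scale)."""
    return math.exp(-beta / scale)


MALE_SCALE = 0.8049  # scale parameter from table 3(a)

# Coefficients from table 3(a); compare the results with table 2a.
print(round(hazard_ratio(-0.6316, MALE_SCALE), 2))  # non-white ethnicity
print(round(hazard_ratio(-1.2378, MALE_SCALE), 2))  # 23<=BMI<25, Age<45
print(round(hazard_ratio(0.2344, MALE_SCALE), 2))   # education
```

Under this reading, a negative coefficient shortens the expected time to diagnosis and therefore corresponds to a hazard ratio above 1.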

Figure 2. Example use of DPoRT to predict the 9-year risk of diabetes mellitus for a specific high-risk male

Profile: Male; 55 years old; BMI = 29 kg/m²; hypertension; white; non-smoker; heart disease; graduated secondary school

**Note: all variables are centered

\[
\begin{aligned}
\mu ={}& 10.5971 - 0.2624\,[\text{hypertension}] - 0.6316\,[\text{non-white ethnicity}] - 0.5355\,[\text{heart disease}] \\
&- 0.1765\,[\text{smoker}] + 0.2344\,[\text{secondary school education}] \\
&+ 0.00\,[\text{BMI}<23,\ \text{Age}<45] - 1.2378\,[23\le\text{BMI}<25,\ \text{Age}<45] - 1.5490\,[25\le\text{BMI}<30,\ \text{Age}<45] \\
&- 2.5437\,[30\le\text{BMI}<35,\ \text{Age}<45] - 3.4717\,[\text{BMI}\ge35,\ \text{Age}<45] - 1.9794\,[\text{BMI}<23,\ \text{Age}\ge45] \\
&- 2.4426\,[23\le\text{BMI}<25,\ \text{Age}\ge45] - 2.8488\,[25\le\text{BMI}<30,\ \text{Age}\ge45] \\
&- 3.3179\,[30\le\text{BMI}<35,\ \text{Age}\ge45] - 3.5857\,[\text{BMI}\ge35,\ \text{Age}\ge45]
\end{aligned}
\]

\[
\begin{aligned}
\mu ={}& 10.5971 - 0.2624\,(1 - 0.1083811) - 0.6316\,(0 - 0.108281) - 0.5355\,(1 - 0.0519083) \\
&- 0.1765\,(0 - 0.2939450) + 0.2344\,(1 - 0.6288460) \\
&- 1.2378\,(0 - 0.1290286) - 1.5490\,(0 - 0.2207441) - 2.5437\,(0 - 0.0558456) - 3.4717\,(0 - 0.0120304) \\
&- 1.9794\,(0 - 0.068159) - 2.4426\,(0 - 0.0860440) - 2.8488\,(0 - 0.2189925) \\
&- 3.3179\,(0 - 0.0572891) - 3.5857\,(1 - 0.0119579)
\end{aligned}
\]

\[
\mu = 8.8816
\]

\[
m = \frac{\log(365.25 \times 9) - \mu}{\sigma} = \frac{8.09781 - 8.8816}{0.8049} = -0.97374
\]

9-year predicted risk for developing diabetes is:

\[
P = 1 - \exp\left(-e^{-0.97374}\right) = 0.314542896 \text{ or } 31.45\%
\]
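The two steps in the worked example, mean centering each predictor and converting the linear predictor μ into a 9-year risk, can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the time scale is assumed to be days (as implied by log(365.25*9)), and only the final step and a single centered term from the example are reproduced.

```python
import math


def centered_term(beta, value, cohort_mean):
    """Contribution of one mean-centered predictor: beta * (value - cohort mean)."""
    return beta * (value - cohort_mean)


def weibull_risk(mu, scale, years):
    """Probability of diagnosis within `years`, given linear predictor mu (log-days)."""
    m = (math.log(365.25 * years) - mu) / scale
    return 1.0 - math.exp(-math.exp(m))


# Non-white ethnicity term for a white male, using the cohort mean from the example.
print(centered_term(-0.6316, 0, 0.108281))

# Final step of the worked example: mu = 8.8816, scale = 0.8049, 9 years.
print(round(weibull_risk(8.8816, 0.8049, 9), 4))
```

Summing `centered_term` over every predictor and adding the intercept gives μ; applying the same function with the means of a different cohort is what re-calibration to that cohort amounts to in this sketch.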

The algorithm was re-calibrated to the mean of the validation cohorts. For example, the coefficient for the 'non-white ethnicity' variable in males is -0.6316 (table 3a). If the variable were not mean centered, this coefficient would be multiplied by 1 or 0 according to the ethnicity of the individual; for someone of non-white ethnicity the term would be -0.6316 x 1. With mean centering, the value for the individual is expressed relative to the baseline rate of non-white ethnicity in the cohort to which the algorithm is being applied. If that cohort has a 10.828% baseline rate of non-white ethnicity for males, then the resulting arithmetic becomes -0.6316 x (1 - 0.10828). Observed and predicted diabetes rates differed by ≤ 0.4% in the two validation cohorts. In the CCHS cohort, the mean-calibrated DPoRT 5-year predicted (and observed) DM incidence rates were 4.2% (4.6%) for males and 3.4% (3.7%) for females. Observed and predicted risks for the CCHS cohorts, by decile of risk, are shown in figure 3. In the NPHS-MB cohort, the mean-calibrated DPoRT 9-year DM predicted (observed) incidence rates were 8.2% (7.1%) for males and 6.6% (4.7%) for females. After adjusting for case ascertainment differences between provinces, the predicted (observed) incidence rates were 7.0% (7.1%) for males and 5.1% (4.7%) for females (figure 2).

Figure 2. Overall calibrated predicted versus observed physician diagnosed diabetes rates for males and females in the two validation datasets.

[Bar chart: observed and predicted diabetes incidence rate (%) for males and females in the CCHS-ON validation cohort and the NPHS-MB validation cohort.]


Due to the smaller sample size in the NPHS-MB cohort, the risks are reported by quintiles. R-squared between observed diabetes incidence rates and predicted probabilities from the DPoRT algorithm across deciles of risk in both validation data sets exceeded 98% for both males and females after re-calibration. C-statistics when applying DPoRT to the validation cohorts were high (0.77 – 0.80) and were not appreciably lower than those generated from the "own cohort" models (table 4). Hosmer-Lemeshow χ2 statistics to assess calibration are shown in table 4. For men and women, sufficient calibration was demonstrated in the CCHS validation cohort after mean adjustment (χ2 < 20). For NPHS-MB, additional calibration was needed to achieve goodness of fit, in order to adjust for case ascertainment differences between provinces.
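For readers less familiar with the two performance measures, the Python sketch below computes a C statistic and a Hosmer-Lemeshow χ2 from predicted risks and binary outcomes. It is a simplified illustration, not the procedure used in this thesis: it treats the outcome as binary and ignores censoring (the survival-based versions cited in the text are more involved), and it assumes every risk group has a mean predicted risk strictly between 0 and 1.

```python
def c_statistic(risks, outcomes):
    """Share of case/non-case pairs in which the case has the higher predicted
    risk (ties count one half). Binary-outcome version; ignores censoring."""
    cases = [r for r, y in zip(risks, outcomes) if y == 1]
    controls = [r for r, y in zip(risks, outcomes) if y == 0]
    concordant = 0.0
    for case_risk in cases:
        for control_risk in controls:
            if case_risk > control_risk:
                concordant += 1.0
            elif case_risk == control_risk:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


def hosmer_lemeshow_chi2(risks, outcomes, groups=10):
    """Binary Hosmer-Lemeshow statistic over equal-sized groups of ranked risk."""
    ranked = sorted(zip(risks, outcomes))
    n = len(ranked)
    chi2 = 0.0
    for g in range(groups):
        chunk = ranked[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(r for r, _ in chunk)   # predicted number of cases
        observed = sum(y for _, y in chunk)   # observed number of cases
        mean_risk = expected / size
        chi2 += (observed - expected) ** 2 / (size * mean_risk * (1.0 - mean_risk))
    return chi2
```

Setting `groups=5` corresponds to reporting by quintiles, as was done for the smaller NPHS-MB cohort.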


Figure 3. Calibrated predictions for incidence of physician diagnosed diabetes for males and females in two validation datasets across deciles or quintiles of risk.

A. DPoRT applied to Ontario CCHS †
Males: N = 12,018; C = 0.77, 95% CI (0.76, 0.79)
Females: N = 14,442; C = 0.76, 95% CI (0.74, 0.77)

B. DPoRT applied to Manitoba NPHS † ¥
Males: N = 4,670; C = 0.79, 95% CI (0.77, 0.82)
Females: N = 5,229; C = 0.80, 95% CI (0.77, 0.82)

† Goodness of fit was achieved (H-L χ2 < 20) with the calibrated model


Table 4. C statistics with 95% confidence intervals and calibration χ2 statistics for DPoRT and cohorts' own functions.

| | Men: NPHS ON | Men: CCHS ON | Men: NPHS MB | Women: NPHS ON | Women: CCHS ON | Women: NPHS MB |
|---|---|---|---|---|---|---|
| C statistic (95% CI) | | | | | | |
| DPoRT | 0.77 (0.76, 0.79) | 0.77 (0.76, 0.79) | 0.79 (0.77, 0.82) | 0.78 (0.76, 0.79) | 0.76 (0.74, 0.77) | 0.80 (0.77, 0.82) |
| Own Function† | 0.80 (0.78, 0.83) | 0.78 (0.76, 0.79) | 0.80 (0.77, 0.82) | 0.80 (0.78, 0.83) | 0.77 (0.75, 0.79) | 0.80 (0.77, 0.82) |
| Calibration χ2 | | | | | | |
| Uncalibrated DPoRT | 4.33 | 13.23 | 136.13 | 5.22 | 24.84 | 35.07 |
| Mean Calibrated DPoRT‡ | … | 13.04 | 32.616 | … | 18.27 | 25.83 |
| Mean and CA Calibrated DPoRT¥ | … | … | 18.35 | … | … | 17.61 |
| Own function | … | 8.89 | 8.32 | … | 10.44 | 4.88 |

† Own function is the factors of the algorithm applied using coefficients derived from the validation cohort's own data
‡ Calibrated DPoRT is the function adjusted using the validation cohort's own means for factors; does not include case ascertainment adjustment
¥ Based on binary H-L statistic (approximation to survival based H-L)

2.5 Discussion

This study demonstrated that diabetes risk can be accurately predicted at the population level using self-reported age, sex, BMI and other measures available in population health surveys. In addition to displaying good discrimination, DPoRT-predicted rates closely agreed with observed rates for both males and females in both external validation cohorts, and this agreement was generally maintained across deciles and quintiles of risk. To my knowledge, DPoRT is the first validated risk tool that is integrated into commonly-collected population health survey data.

DPoRT offers advantages over existing methods used to estimate future diabetes risk in populations. Previous studies that estimate future diabetes burden have either extrapolated overall trends in diabetes prevalence or indirectly incorporated information on the influence of risk factors with various assumptions (3, 33-36). Previous studies of diabetes lifetime risk and life expectancy are not predictive; rather, they describe diabetes from a life-course perspective using a period or stationary population approach (36, 37). Other studies focus on overall diabetes burden, a useful approach, but one which does not enable users to directly assess the impact of risk factors, such as BMI, on future diabetes. Furthermore, these studies did not assess how future diabetes can be prevented by targeting risk factors, since they do not directly quantify the influence of risk factors on baseline risk or diabetes incidence. Complex modeling and simulation studies differ from the approach used in this study in that they use additional information on how populations and risk factors change over time (38, 39). Other simulation studies add more detailed clinical information, such as fasting blood sugar level or information on diabetes family history, data not available at the population level. A strength of these simulation models is that they can combine different data sources and study findings (40). However, these models are complex and often represent clinical or theoretical populations, making their estimates difficult to validate in external populations that are meaningful for population health planning. DPoRT could be incorporated into simulation models that consider future changes in population composition and risk factors.

The nature of diabetes risk allowed us to discriminate and explain risk using a limited number of variables, most importantly BMI. Discrimination of DPoRT (C statistic 0.77 – 0.80 and a wide range of predictive risk across deciles) is as high as or higher than that of many risk prediction tools used in clinical practice. The most stringent test of model accuracy is the application of the model to a different population (41, 42). This study demonstrates that DPoRT is discriminating and accurate in two external populations that varied across geography and time. The algorithm was further calibrated using population means, which may attenuate differences between populations since risk estimates are relative to baseline risk in the population.

Given current data in most countries, DPoRT is a more balanced approach to estimating diabetes risk than methods used in previous research. A number of important clinical values are excluded from DPoRT, such as hip to waist ratio, waist circumference, fasting blood glucose, and family history (10, 41, 42). Although these variables may be clinically important for assessing diabetes risk, adding these, or other detailed anthropometric measures, is not feasible because they are not routinely collected in most populations. These omitted variables are unlikely to have a major impact on the performance characteristics of the model due to the clustering of risk factors, particularly when dealing with abnormalities of the metabolic system (43-47).
Variables not included in DPoRT, such as family history of diabetes or poor diet, are also associated with the clustering of metabolic risk factors that are included in the algorithm, such as hypertension and BMI. There is increasing awareness that simple clinical risk tools, including those with self-report risk factor data, perform as well as or better than complex models (13, 43, 45-52). Adding more predictors or clinical measures results in negligible improvements to discrimination, and this is likely due to the high correlation of these risk factors with obesity and other variables already included in the model. Previous studies have shown that additional predictive variables need to have a high independent risk (odds ratio of 6.9 or greater) to result in significant improvements in discrimination once a discrimination of 0.8 is already achieved (53). This phenomenon was corroborated in the creation of DPoRT, as maximal levels of discrimination were achieved using few predictor variables.

Obesity is the most important factor in predicting diabetes risk. BMI is the most commonly used marker of obesity; however, measures of central obesity may capture the entire risk domain in a more comprehensive manner and be more meaningful across all age groups (54). This may be considered a limitation of DPoRT, since measures of central obesity are not routinely collected at the population level and thus are not included. If included, these measures may increase both the discrimination and accuracy of the algorithm; however, debate about their clinical utility continues. A recent meta-analysis has shown that there is no evidence of a difference in estimates associated with incident diabetes between BMI, waist circumference and waist/hip ratios (55). Furthermore, algorithms to identify individuals for weight loss in populations did not differ when using BMI or waist circumference (56).

To ensure DPoRT can be applied in different populations, we gave preference to variables that were based on established evidence, remained stable over time, were unlikely to be subject to serious measurement error (unlike, for example, alcohol and dietary habits), and were easily captured using survey data in different populations.
For example, physical activity has been shown to have a protective effect on diabetes incidence (57) but was removed from the final algorithm because it could not be captured in a reliable and reproducible manner across studies, and because of its marginal improvement in the discrimination of diabetes risk in our creation cohort. Despite placing considerable constraints on variable selection as a means of ensuring maximum feasibility, DPoRT maintained good discrimination.

Using self-report measures is a limitation which could affect predictive risk accuracy, because these measures may be more subject to reporting error and/or bias than measured anthropometric measures. In general there is a high correlation between self-reported and measured height and weight; however, validation studies do show that weight tends to be slightly underestimated and height tends to be slightly overestimated, and as a result reported BMI is generally lower than measured BMI (58-60). The effect of self-reported BMI may depend on the population where the algorithm is being applied, since these patterns have been shown to vary across gender and socioeconomic status (61-63). The ability of DPoRT's predictive estimates to agree with observed diabetes risk in different populations will be reduced if systematic errors associated with responses vary across populations or time. In chapter 3.1, simulation modeling is used to specifically explore the impact of potential reporting biases on risk prediction.

Another limitation of this study is the use of physician-diagnosed diabetes, as opposed to true diabetes status (diagnosed plus undiagnosed). The estimates from DPoRT may exclude diabetics in the population who are not yet identified by a physician. This could reflect patients with less severe disease and/or poorer access to medical care. Physician-diagnosed diabetes is currently the most commonly measured definition of diabetes at the level of populations.
Although estimates put the true prevalence of diabetes higher, advocates of the physician-diagnosed outcome argue that it is meaningful to people with recognized diabetes and to the treatment of these diagnosed patients within the health care system. If diabetes testing/screening increases over time, predicted estimates could be lower than the observed estimates (under the assumption that this would lead to increased case detection). DPoRT has been found to be accurate in different populations for different time periods; however, DPoRT could be recalibrated to predict total diabetes cases using revised information on screening/testing or using estimates of the number of undiagnosed cases relative to diagnosed cases in the population. Finally, the potential for inaccuracy increases the longer into the future the predictions are made or when unforeseen changes occur; therefore, it is recommended that predictions from DPoRT be updated frequently by using the most recent data, limiting predictive calculations to 9 years or less, and validating the risk tool in the population where it is being applied. This tool was developed in Canada and is likely most appropriate in the Canadian setting; however, like other risk tools, DPoRT may be transportable once validated and calibrated as needed. Careful interpretation and adjustment for contextual variables, such as immigrant status, will need to be made in populations outside of Canada.

Curbing the diabetes epidemic has been identified by several governments and health policy makers as one of their top priorities for improving, and even maintaining, the health of their nations. Population-based prediction models such as DPoRT can be used to improve health planning, explore the impact of prevention strategies, and enhance understanding of the distribution of disease in populations. This study demonstrates that DPoRT accurately predicts diabetes incidence and is effective at discriminating risk at the population level. This algorithm can be used by health planners to estimate diabetes incidence, to stratify the population by risk, and to quantify the impact of interventions using routinely collected survey data.
Validation of DPoRT will continue as more effective methods to quantify variation in case ascertainment in different populations are developed. As the surveillance of risk factors and diabetes advances, DPoRT can be adapted to become even more accurate, while maintaining its accessibility for decision makers.


2.6 References

1. Anderson KM et al. An Updated Coronary Risk Profile - A Statement for Health Professionals. Circulation 1991;83:356-62.
2. Hippisley-Cox J et al. Derivation and validation of QRISK, a new cardiovascular disease risk score for the United Kingdom: prospective open cohort study. British Medical Journal 2007;335:136-41.
3. Wild S et al. Global prevalence of diabetes - Estimates for the year 2000 and projections for 2030. Diabetes Care 2004;27:1047-53.
4. Zimmet P, Alberti KGMM, Shaw J. Global and societal implications of the diabetes epidemic. Nature 2001;414:782-7.
5. Manuel DG et al. Effectiveness and efficiency of different guidelines on statin treatment for preventing deaths from coronary heart disease: modelling study. BMJ 2005;332:1419-22.
6. Eddy DM, Schlessinger L. Validation of the archimedes diabetes model. Diabetes Care 2003;26:3102-10.
7. Hanley AJG et al. Prediction of type 2 diabetes using simple measures of insulin resistance - Combined results from the San Antonio Heart Study, the Mexico City Diabetes Study, and the Insulin Resistance Atherosclerosis Study. Diabetes 2003;52:463-9.
8. Ito C et al. Prediction of diabetes mellitus (NIDDM). Diabetes Research and Clinical Practice 1996;34:S7-S11.
9. Herman WH et al. A new and simple questionnaire to identify people at increased risk for undiagnosed diabetes. Diabetes Care 1995;18:382-7.
10. Lindstrom J, Tuomilehto J. The diabetes risk score: a practical tool to predict type 2 diabetes risk. Diabetes Care 2007;26:725-31.
11. Larsson H et al. Prediction of diabetes using ADA or WHO criteria in postmenopausal women: a 10-year follow-up study. Diabetologia 2000;43:279-88.
12. Stern MP, Williams K, Haffner SM. Identification of persons at high risk of type 2 diabetes mellitus: Do we need the oral glucose tolerance test. Annals of Internal Medicine 2009;136:575-81.
13. Wilson P et al. Prediction of incident diabetes mellitus in middle-aged adults. Archives of Internal Medicine 2007;167:1068-74.
14. McGeechan K et al. Assessing new biomarkers and predictive models for use in clinical practice. Archives of Internal Medicine 2008;168:2304-10.

15. Ford ES, Giles WH, Dietz WH. Prevalence of the metabolic syndrome among US adults - Findings from the Third National Health and Nutrition Examination Survey. JAMA 2002;287:356-9.
16. Hu FB et al. Diet, Lifestyle, and the Risk of Type 2 Diabetes Mellitus in Women. New England Journal of Medicine 2001;345:790-7.
17. Statistics Canada. 1996-97 NPHS Public Use Microdata Documentation. 1999. Ottawa.
18. Statistics Canada. Canadian Community Health Survey Methodological Overview. Health Reports 2002;13:9-14.
19. Canadian Community Health Survey 2000–2001. Ottawa: 2003.
20. Hux JE, Ivis F. Diabetes in Ontario. Diabetes Care 2005;25:512-6.
21. Lipscombe LL, Hux JE. Trends in diabetes prevalence, incidence, and mortality in Ontario, Canada 1995-2005: a population-based study. Lancet 2007;369:750-6.
22. Health Canada. Responding to the Challenge of Diabetes in Canada. 2003. Ottawa, ON.
23. Statistics Canada. 1996-7 National Population Health Survey: Derived Variable Specifications. 1999. Ottawa.
24. Odell PM, Anderson KM, Kannel WB. New Models for Predicting Cardiovascular Events. Journal of Clinical Epidemiology 1994;47:583-92.
25. Farewell VT, Prentice RL. A study of distributional shape in life testing. Technometrics 1977;19:69-75.
26. Pencina M, D'Agostino RB. Overall C as a measure of discrimination in survival analysis: model specific population value and confidence interval estimation. Statistics in Medicine 2004;23:2109-23.
27. Campbell G. General Methodology I: Advances in statistic methodology for the evaluation of diagnostic and laboratory tests. Statistics in Medicine 2004;13:499-508.
28. D'Agostino RB et al. Validation of the Framingham Coronary Disease Prediction Scores. JAMA 2001;286:180-7.
29. Nam B-H. Discrimination and Calibration in Survival Analysis: Extension of the ROC Curve for Discrimination and Chi-square test for Calibration. 2000. Boston University.
30. Brindle P et al. Predictive accuracy of the Framingham coronary risk score in British men: prospective cohort study. BMJ 2003;327:1267.


31. Yeo D, Mantel H, Lui TP. Bootstrap variance estimation for the National Population Health Survey. 778-783. 1999. Baltimore, American Statistical Association.
32. Kovacevic MS, Mach L, Roberts G. Bootstrap variance estimation for predicted individual and population-average risks. Proceedings of the Survey Research Methods Section. 2008. American Statistical Association.
33. Boyle JP et al. Projection of diabetes burden through 2050 - Impact of changing demography and disease prevalence in the US. Diabetes Care 2001;24:1936-40.
34. King H, Aubert RE, Herman WH. Global burden of diabetes, 1995-2025 Prevalence, numerical estimates, and projections. Diabetes Care 1998;21:1414-31.
35. World Health Organization. Report of a WHO consultation on obesity, Obesity: preventing and managing the global epidemic. 1998. Geneva, World Health Organization.
36. Narayan KMV et al. Lifetime risk for diabetes mellitus in the United States. JAMA 2003;290:1884-90.
37. Manuel D, Schultz S. Health-related quality of life and health-adjusted life expectancy of people with diabetes in Ontario, Canada, 1996-1997. Diabetes Care 2004;27:407-14.
38. Forouhi NG et al. Diabetes prevalence in England, 2001 - estimates from an epidemiological model. Diabetic Medicine 2006;23:189-97.
39. Mainous AG et al. Impact of the population at risk of diabetes on projections of diabetes burden in the United States: an epidemic on the way. Diabetologia 2007;50:934-40.
40. Ford ES et al. Explaining the decrease in US deaths from coronary disease, 1980-2000. New England Journal of Medicine 2007;356:2388-98.
41. Altman DG, Royston P. What do we mean by validating a prognostic model? Statistics in Medicine 2000;19:453-73.
42. Harrell FE, Lee KL, Mark DB. Multivariable prognostic models: Issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Statistics in Medicine 1996;15:361-87.
43. Carmelli D, Cardon LR, Fabsitz R. Clustering of Hypertension, Diabetes, and Obesity in Adult Male Twins - Same Genes Or Same Environments. American Journal of Human Genetics 1994;55:566-73.
44. DeFronzo RA, Ferrannini E. Insulin resistance. A multifaceted syndrome responsible for NIDDM, obesity, hypertension, dyslipidemia, and atherosclerotic cardiovascular disease. Diabetes Care 1991;41:173-94.


45. Lorenzo C et al. The metabolic syndrome as predictor of type 2 diabetes - The San Antonio Heart Study. Diabetes Care 2003;26:3153-9.
46. Meigs JB et al. Risk variable clustering in the insulin resistance syndrome - The Framingham Offspring Study. Diabetes 1997;46:1594-600.
47. Schmidt MI et al. Clustering of dyslipidemia, hyperuricemia, diabetes, and hypertension and its association with fasting insulin and central and overall obesity in a general population. Metabolism-Clinical and Experimental 1996;45:699-706.
48. Mainous AG et al. A Coronary heart disease risk score based on patient-reported information. The American Journal of Cardiology 2007;1236-41.
49. Ambler G, Brady AR, Royston P. Simplifying a prognostic model: a simulation model based on clinical data. Statistics in Medicine 2002;21:3803-22.
50. Chambless LE et al. Prediction of ischemic Stroke Risk in the Atherosclerosis Risk in Communities Study. American Journal of Epidemiology 2004;160:259-69.
51. Wilson PWF et al. Prediction of coronary heart disease using risk factor categories. Circulation 1998;97:1837-47.
52. DeFronzo RA. Insulin Resistance, Hyperinsulinemia, and Coronary-Artery Disease - A Complex Metabolic Web. Journal of Cardiovascular Pharmacology 1992;20:S1-S16.
53. Pepe MS et al. Limitations of the odds ratio in gauging the performance of a diagnostic, prognostic, or screening marker. American Journal of Epidemiology 2004;159:882-90.
54. Flint E, Rimm E. Obesity and cardiovascular disease risk among the young and old - is BMI the wrong benchmark? International Journal of Epidemiology 2006;35:187-9.
55. Vazquez G et al. Comparison of body mass index, waist circumference, and Epidemiology 2007;29:115-28.
56. Mason C, Katzmarzyk PT, Blair SN. Eligibility for obesity treatment and risk of mortality in men. Obesity Research 2005;13:1803-9.
57. Bassuk SS, Manson JE. Epidemiological evidence for the role of physical activity in reducing risk of type 2 diabetes and cardiovascular disease. Journal of Applied Physiology 2005;99:1193-204.
58. Nawaz H et al. Self-reported weight and height: implications for obesity research. Journal of Preventive Medicine 2001;20:294-8.
59. Rowland M. Self-reported height and weight. American Journal of Clinical Nutrition 2007;52:1125-33.

60. Shields M, Gorber SC, Tremblay MS. Estimates of obesity based on self-report versus direct measures. Health Reports 2008;19:1-16.
61. Bostrom G, Diderichsen F. Socioeconomic differentials in misclassification of height, weight and body mass index based on questionnaire data. International Journal of Epidemiology 1997;26:860-6.
62. Niedhammer I et al. Validity of self-reported weight and height in the French GAZEL cohort. International Journal of Obesity 2000;24:1111-8.
63. Wardle K, Johnson F. Sex differences in the association of socioeconomic status with obesity. International Journal of Obesity and Related Metabolic Disorders 2002;26:1144-9.


3. Measurement in Risk Prediction Models

3.1 The influence of measurement error on accuracy (calibration), discrimination, and overall estimation of a risk prediction model

3.1.1 Abstract

Background: Self-reported height and weight can be subject to several types of measurement error. The impact of this measurement error on the estimated risk, discrimination, and calibration of a model that uses height and weight expressed as body mass index (BMI) to predict the 10-year risk of developing diabetes has never been systematically studied.

Objective: To use simulation to quantify and describe the effect of random and systematic measurement error in self-reported height and weight on the performance of a model for predicting diabetes. The three performance measures used are predicted risk, accuracy (calibration), and discrimination.

Methods: Predicted risk, the Hosmer-Lemeshow goodness-of-fit χ2, the correlation between observed and predicted probabilities of developing diabetes, and the C statistic were measured under various levels of random and systematic error. Simulations were run 500 times with sample sizes typical of the population-based surveys (~10,000) used to collect self-reported risk factor information. Simulations were run separately for males and females.

Results: Simulation data successfully reproduced estimates, discrimination and calibration values similar to those generated by the algorithm that was derived from actual population data. Increasing levels of random error in height and weight reduced the calibration and discrimination of the risk algorithm. Furthermore, random error biased the predicted risk upwards. Systematic error biased predicted risk in the direction of the under- or over-estimation and reduced calibration; however, it did not affect discrimination.

Conclusions: This study demonstrates that random and systematic errors have the potential to influence the performance of risk algorithms. Overall predicted risk should be carefully considered in the context of potential measurement errors. Further research that quantifies the amount and direction of error can improve the performance of prediction tools by allowing for adjustments in exposure measurements.


3.1.2 Introduction

Epidemiologic studies rely on the measurement of exposure and outcome variables. Measurement error is said to occur when the measured value of a variable does not equal the 'true' value. Measurement error is a major concern in epidemiologic research and has been discussed extensively in the literature (1-5). In general there are two classes of measurement error: (i) random error, which on average is equal to zero, and (ii) systematic error, which on average is not equal to zero (6). Measurement error has mainly been examined with respect to its effect on risk estimates, such as risk ratios or hazard ratios (1, 7, 8). This research has led to improvements in the critical appraisal and interpretation of epidemiologic findings and has provided guidance for measurement in future studies. While useful for understanding the effects of error on etiological estimates of disease, the findings from these studies do not directly apply to risk algorithms.

Risk prediction is a key aspect of clinical work and has recently been applied to population health through the Diabetes Population Risk Tool (DPoRT) (Chapter 2). Accurate and valid prediction of the probability of developing a disease given a set of baseline risk factors is a valuable tool for clinical management, health policy and population health decision making. In risk algorithms, disease prediction is based on a set of baseline variables that may contain measurement error that could affect the prediction, discrimination and accuracy of the risk tool.

DPoRT is made more accessible by using characteristics routinely measured in population health surveys to make predictions. National health surveys are conducted through telephone interviews and are based on self-reported responses. These self-reported responses can result in random error due to imperfect recall or misunderstanding of the question. They can also result in systematic error as a result of psychosocial factors, such as social desirability. The influence of error from self-report surveys on disease prediction has not been systematically studied. In particular, the influence that measurement error has on predictive accuracy has never been examined. By understanding the consequences of measurement error for risk algorithms, efforts could be made to correct for these errors and thus improve the accuracy and validity of a risk algorithm. Furthermore, other developers of population risk tools can use this information to better weigh the pros and cons of using different types of data.

The objective of this study is to use simulation to understand the effect of measurement error in self-reported risk factors on the performance of a simple risk algorithm to predict diabetes. This study will focus on the measurement of body mass index (BMI), a function of height and weight, because it has the greatest influence on diabetes risk (9-13). In general, there is a high correlation between self-reported and measured height and weight (14); however, a recent systematic review summarizing the empirical evidence regarding the concordance of objective and subjective measures of height and weight found that the general trend was an underestimation of weight and an overestimation of height (15). Currently, there are no linked data available that allow assessment of self-reported versus measured or 'true' BMI simultaneously with the diabetes outcome. Simulation allows one to examine the impact of varying levels of measurement error on discrimination and accuracy. These results will inform DPoRT and assist in understanding how its predictions and validation are affected by the use of population health surveys.

Research from studies that examine the impact of measurement error on relative risk shows that, in general, random error results in an estimate that is closer to the null value, whereas systematic error results in over- or underestimates of risk in a particular direction depending on the nature of the bias (1, 4, 7). These findings were the basis for the following a priori hypotheses:

(i) Random measurement error will affect both discrimination and calibration of a model. The presence of measurement error will increase the observed variance in BMI and thus widen the distribution of BMI in both diabetics and non-diabetics. This will affect discrimination by increasing the overlap between the BMI distribution of those who are likely to develop diabetes and the distribution of those who are not. This increased overlap will make it more difficult to resolve them into populations of diabetics and non-diabetics. Calibration will be affected because the predicted risk estimate will be higher or lower than the true estimate, due to the over- or under-estimation of weight and height, and will result in misclassification of diabetes status, resulting in a lower concordance with the observed estimate.

(ii) Systematic error will have minimal effects on discrimination but significant effects on calibration of a model. Systematic error can result in an average over- or under-estimation of predicted risk; however, this should not influence the ability to rank order subjects, under the assumption that the error is not differential between persons who do and do not develop the disease. The same rationale outlined in (i) applies to calibration under this scenario, in that the over- or under-estimation of weight and height will reduce the concordance with the observed estimate. However, the effect will be more apparent here since the error occurs entirely in one direction.


(iii) Random error will not affect the overall predicted risk of the model and systematic error will influence the predicted risk in the direction of the systematic error. Random error increases the variability, which increases the range of predicted risk equally in both directions and thus will not skew the distribution in any particular direction. Therefore, the overall predicted risk estimate should be similar to the predicted risk estimate from the model without measurement error.


3.1.3 Methods

Simulating Data

Data to simulate the relationship between a continuous variable (BMI) and the probability of developing diabetes were created to closely match the actual data used to create DPoRT. For this simulation, the algorithm predicting the binary outcome of physician-diagnosed diabetes was modeled using a logistic regression equation that included BMI and its quadratic term as predictor variables. The model can be described as follows:

$\text{logit}(p) = \beta_0 + \beta_1 \text{BMI} + \beta_2 \text{BMI}^2$

where p represents the probability of developing diabetes in the next 10 years and is calculated as:

$P(Y_i = 1 \mid X_i) = \frac{\exp(\beta' X_i)}{1 + \exp(\beta' X_i)}$ and $P(Y_i = 0 \mid X_i) = \frac{1}{1 + \exp(\beta' X_i)}$

where i = 1, ..., n; β' represents the vector of parameter estimates estimated by maximum likelihood; and X_i represents the vector of fixed covariates for each subject.

To randomly generate a person's diabetes status, first height (m) and weight (kg) are generated as Gaussian random variables with means and variances obtained from the DPoRT development cohort. Using the specified values for the regression coefficients β0, β1 and β2, the logit is calculated, from which the probability P of the person having diabetes is calculated. Then a 0/1 Bernoulli random variable with probability P is generated, with 1 or 0 meaning that the person does or does not have diabetes. BMI is calculated as weight in kilograms (kg) divided by the square of height in meters (m). Both height and weight were varied individually from the true value assuming different levels of error in the observed estimate.

The size of the random measurement error is defined by the intraclass correlation coefficient (ICC), the proportion of the overall observed variance attributable to the "true" variance between subjects. The systematic levels of measurement error in height and weight were taken from a recent systematic review that summarized the empirical evidence regarding the concordance of objective and subjective measures of height and weight (16). These error values are consistent with a recent study examining the difference between measured and self-reported height and weight in the CCHS (17).

The observed mean and standard deviation of BMI, taken from the National Population Health Survey (NPHS) (18), along with the parameter estimates from the linkage with the Ontario Diabetes Database (ODD) (19, 20) (see Chapter 2), were used in our simulation. This simpler algorithm, which included only BMI and not the entire set of variables used in DPoRT, was chosen because it has been shown to explain the majority of the variation in diabetes incidence across individuals in a population and produces C statistics as high as 0.73. For this study, the simulation data were created such that the discrimination value was equal to that observed in the actual DPoRT development cohort (~0.7). The simulated data were created using observed data assuming no measurement error, and the model was then re-fit using various values of systematic and random error in height and weight.
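The data-generating steps described above can be sketched in code. The following is a minimal illustration in Python, not the SAS program used in the thesis. The function name `simulate_cohort`, the seed, and the independent draws of height and weight are assumptions made for brevity (the thesis handles the correlation between height and weight separately). The means, standard deviations and coefficients are the male NPHS values reported in Table 1.

```python
import math
import random


def simulate_cohort(n, mean_h, sd_h, mean_w, sd_w, b0, b1, b2, seed=0):
    """Simulate (height, weight, BMI, risk, outcome) tuples for n people.

    Assumption: height and weight are drawn independently here; the
    thesis additionally imposes a height-weight correlation.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        height = rng.gauss(mean_h, sd_h)            # metres
        weight = rng.gauss(mean_w, sd_w)            # kilograms
        bmi = weight / height ** 2                  # kg/m^2
        logit = b0 + b1 * bmi + b2 * bmi ** 2       # quadratic logistic model
        p = 1.0 / (1.0 + math.exp(-logit))          # P(Y = 1 | BMI)
        outcome = 1 if rng.random() < p else 0      # Bernoulli(p) draw
        rows.append((height, weight, bmi, p, outcome))
    return rows


# Male starting values and NPHS coefficients as reported in Table 1.
cohort = simulate_cohort(
    n=9177, mean_h=1.768, sd_h=0.075, mean_w=81.624, sd_w=13.805,
    b0=-10.4034, b1=0.4202, b2=-0.00437,
)
incidence = sum(row[4] for row in cohort) / len(cohort)
print(f"simulated 10-year incidence: {incidence:.3f}")
```

Because the outcome is a Bernoulli draw from each person's own predicted probability, a model re-fit to these error-free data should be well calibrated by construction, which is the baseline against which the error scenarios are compared.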


Random Error

The observed variance of height and weight can be described as follows:

(I) $\sigma^2_{\text{observed}} = \sigma^2_{\text{true}} + \sigma^2_{\text{error}}$

where $\sigma^2_{\text{true}}$ is the true variance of the measurement, $\sigma^2_{\text{error}}$ is the variance attributed to measurement error, and $\sigma^2_{\text{observed}} \geq \sigma^2_{\text{true}}$.

By these definitions, in the case where no error exists, $\sigma^2_{\text{observed}} = \sigma^2_{\text{true}}$. The intraclass correlation coefficient (ICC), denoted as ρ, is an estimate of the fraction of the total measurement variance associated with the true variation among individuals (21, 22).

(II) $\rho = \frac{\sigma^2_{\text{true}}}{\sigma^2_{\text{true}} + \sigma^2_{\text{error}}}$

From (I) we can express $\sigma^2_{\text{true}}$ as $\sigma^2_{\text{true}} = \sigma^2_{\text{observed}} - \sigma^2_{\text{error}}$, and substituting this expression back into (II) gives us:

$\rho = \frac{\sigma^2_{\text{observed}} - \sigma^2_{\text{error}}}{(\sigma^2_{\text{observed}} - \sigma^2_{\text{error}}) + \sigma^2_{\text{error}}}$

and therefore ρ can be expressed as:

(III) $\rho = \frac{\sigma^2_{\text{observed}} - \sigma^2_{\text{error}}}{\sigma^2_{\text{observed}}}$

Re-arranging (III) allows us to express $\sigma^2_{\text{error}}$ as a function of ρ:

(IV) $\sigma^2_{\text{error}} = \sigma^2_{\text{observed}}(1 - \rho)$

If no error exists in the measurement, ρ = 1.0 and $\sigma^2_{\text{error}} = 0$, and from (I) $\sigma^2_{\text{observed}} = \sigma^2_{\text{true}}$. In other words, all the observed variance is due to the true variance among individuals. An example of the influence of ICC on the distribution of BMI can be seen in the appendix section 6.4.
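As a small worked illustration of equation (IV), the sketch below splits an observed variance into its true and error components for the ICC values examined in this study. The helper names `error_variance` and `true_variance` are hypothetical; the observed standard deviation of weight is the male value from Table 1a.

```python
def error_variance(observed_var, icc):
    """Equation (IV): variance attributed to random measurement error."""
    if not 0.0 < icc <= 1.0:
        raise ValueError("ICC must lie in (0, 1]")
    return observed_var * (1.0 - icc)


def true_variance(observed_var, icc):
    """Rearranged equation (III): true between-subject variance."""
    return observed_var * icc


observed_var_weight = 13.805 ** 2  # male weight SD from Table 1a, squared

for icc in (1.0, 0.9, 0.8, 0.7, 0.6, 0.5):
    err = error_variance(observed_var_weight, icc)
    tru = true_variance(observed_var_weight, icc)
    print(f"ICC={icc:.1f}  true variance={tru:8.2f}  error variance={err:8.2f}")
```

At ICC = 1.0 the error variance is zero, and at ICC = 0.5 half of the observed variance is attributed to error, as the definitions imply.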

Systematic Error

Deviation of average self-reported height and weight can be represented as:

(V) $\bar{Y}_{\text{observed}} = \bar{Y}_{\text{true}} + \omega_i$

where $\bar{Y}_{\text{observed}}$ is the average self-reported response in the survey, $\bar{Y}_{\text{true}}$ is the average true value of the response in the survey, and $\omega_i$ represents the average amount of over- or under-estimation from the true value.

Traditionally it is assumed that $E(\omega_i) = 0$, so that the reported measurement is, on average, equal to the true measurement; however, in this simulation $\omega_i$ will represent various levels of over- and underestimation.
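To make the effect of a systematic shift concrete, the following sketch applies the average reporting biases cited in this chapter (weight under-reported by 1.7 kg, height over-reported by 2.5 cm) to a single hypothetical person whose true height and weight equal the male means in Table 1a. The function name `bmi` and the choice of person are illustrative assumptions.

```python
def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


# Hypothetical person at the male mean height and weight (Table 1a).
true_weight, true_height = 81.624, 1.768

# Average systematic errors from the systematic review cited in the text.
reported_weight = true_weight - 1.7      # under-reported by 1.7 kg
reported_height = true_height + 0.025    # over-reported by 2.5 cm

true_bmi = bmi(true_weight, true_height)
reported_bmi = bmi(reported_weight, reported_height)
shift = reported_bmi - true_bmi
print(f"true BMI {true_bmi:.2f}, reported BMI {reported_bmi:.2f}, shift {shift:.2f}")
```

Both errors push BMI in the same direction, so for this person the reported BMI is a little over 1 kg/m² below the true value.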


Correlation of Height and Weight

In the simulation, the errors associated with self-reported height and weight will be varied independently; however, there is a strong correlation between height and weight that must be considered. All starting correlation values were taken from the original data. The following definitions will be used to express the correlation between height and weight. Let:

$r_{hw(\text{observed})}$ = observed correlation coefficient between height and weight

$A_1 = \sigma_{\text{height}}$

$B_1 = \sigma_{\text{weight}} \times r_{hw(\text{observed})}$

$B_2 = \sqrt{\sigma^2_{\text{weight}} - B_1^2}$

B1 and B2 will be used to define the variance in weight, such that B1 defines the portion of the variance in weight that is related to its correlation with height. The correlation of height and weight will vary according to the ICC values, such that rhw for the true height and weight under measurement error was defined as:

$r_{hw(\text{true})} = \frac{r_{hw(\text{observed})}}{\sqrt{ICC_{\text{height}} \times ICC_{\text{weight}}}}$

Therefore, for the observed height and weight, let:

$M_1 = A_1 \times Z_1$

$M_2 = B_1 \times Z_1 + B_2 \times Z_2$

or, equivalently,

$M_2 = B_1 \times Z_1 + \sqrt{\sigma^2_{\text{weight}} - B_1^2} \times Z_2$

where $\sigma^2_{\text{weight}}$ is the observed variance of weight and Z1 and Z2 are independent standard normal random variables. The correlation between height and weight arises through Z1, which is common to both M1 and M2.

The true height and weight values were generated using the same approach except that rhw(true) will be used instead of rhw(observed).
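The construction of M1 and M2 can be checked numerically. The sketch below is an illustration under stated assumptions: the helper names, the seed and the number of draws are hypothetical, and the square roots follow the reading of the definitions in which B1² + B2² equals the variance of weight. It draws correlated deviations with the male values from Table 1a and confirms that the sample correlation is close to the target.

```python
import math
import random


def disattenuated_correlation(r_observed, icc_height, icc_weight):
    """Correlation between true height and weight implied by the ICCs."""
    return r_observed / math.sqrt(icc_height * icc_weight)


def draw_deviations(sd_height, sd_weight, r_hw, rng):
    """Return (M1, M2): correlated deviations of height and weight."""
    a1 = sd_height
    b1 = sd_weight * r_hw                       # part of weight shared with height
    b2 = math.sqrt(sd_weight ** 2 - b1 ** 2)    # remaining, independent part
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    return a1 * z1, b1 * z1 + b2 * z2


def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(1)
pairs = [draw_deviations(0.075, 13.805, 0.475, rng) for _ in range(20000)]
m1 = [p[0] for p in pairs]
m2 = [p[1] for p in pairs]
r_hat = pearson(m1, m2)
r_true = disattenuated_correlation(0.475, 0.8, 0.8)
print(f"sample correlation {r_hat:.3f}; implied true correlation {r_true:.3f}")
```

Adding M1 and M2 to the mean height and weight then yields simulated individuals whose height and weight have the intended standard deviations and correlation.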


Simulation using various levels of ICC and systematic error

Simulated values of height and weight were generated by applying a random component to the standard deviation of the variable. By applying the definitions of M1 and M2 above and using the following definitions:

$\overline{\text{Height}} = \frac{\sum_{i} \text{Height}_i}{n}$ and $\overline{\text{Weight}} = \frac{\sum_{i} \text{Weight}_i}{n}$

where i = 1, ..., n for each individual in the cohort, the simulated values were calculated as:

$\text{Height} = \overline{\text{Height}} + M_1$

$\text{Weight} = \overline{\text{Weight}} + M_2$

$\text{BMI} = \frac{\text{Weight}}{\text{Height}^2}$

$\text{BMI}^2_{\text{OBSERVED}} = \text{BMI}_{\text{OBSERVED}} \times \text{BMI}_{\text{OBSERVED}}$

*Note that when using ICC = 1.0 it follows that σ2observed = σ2true, which is the way the data are currently used when applying DPoRT.

This analysis quantified the influence of random error by simulating scenarios where a proportion of the observed variance was attributed to random error. These steps were replicated for the given sample (9,177 for males and 10,618 for females) 500 times. Ten-year cumulative incidence was maintained between 7% and 10%, in accordance with Ontario male and female diabetes rates taken from the Ontario Diabetes Database. The model was first fit assuming that there was no error in the observed height and weight values. This was done to ensure that the percentage of simulations rejecting the null hypothesis, which under the null distribution equals the type I error rate (α), was 5%, and that the average C statistics were consistent with those seen in the DPoRT development cohort. All simulations were done using SAS statistical software (version 9.1, SAS Institute Inc, Cary, NC) and random numbers were generated using the RAN family of functions (RANUNI and RANNORM).

Calibration and Discrimination

Calibration was measured using the Hosmer-Lemeshow goodness-of-fit statistic (χ2H-L), where observed and expected values are compared across deciles of risk (23-25). This statistic is computed by dividing the validation cohort into deciles of predicted risk. Consistent with D'Agostino's approach for evaluating observed and predicted values using risk algorithms, the value 20, the 99th percentile of a chi-square distribution with 8 degrees of freedom, was used as a cutoff to mark sufficient calibration (26). Power to detect calibration was therefore defined as the proportion of the 500 simulations in which χ2H-L was below the cutoff of 20. This proportion was calculated under the various error situations. Discrimination was measured using a C-statistic, analogous to the area under the ROC curve (27). The C-statistic was calculated using rank statistics and verified by the output from the LOGISTIC procedure. A study flow diagram and an example of the SAS code for females can be seen in the Appendix section 6.4.
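The two performance measures can be sketched as follows. This is a minimal Python illustration of a decile-based Hosmer-Lemeshow statistic and a rank-based C-statistic, not the SAS code referenced above. The function names are hypothetical, and the grouping rule (equal-sized groups after sorting by predicted risk) is one common convention that may differ in detail from the thesis implementation.

```python
def hosmer_lemeshow(probs, outcomes, groups=10):
    """Hosmer-Lemeshow chi-square across equal-sized groups of predicted risk.

    Assumes every group has a mean predicted risk strictly between 0 and 1.
    """
    pairs = sorted(zip(probs, outcomes), key=lambda t: t[0])  # sort by risk only
    n = len(pairs)
    chi2 = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        n_g = len(chunk)
        expected = sum(p for p, _ in chunk)     # expected number of events
        observed = sum(y for _, y in chunk)     # observed number of events
        mean_p = expected / n_g
        chi2 += (observed - expected) ** 2 / (n_g * mean_p * (1.0 - mean_p))
    return chi2


def c_statistic(probs, outcomes):
    """C-statistic from rank sums (Mann-Whitney), using mid-ranks for ties."""
    order = sorted(range(len(probs)), key=lambda i: probs[i])
    ranks = [0.0] * len(probs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and probs[order[j + 1]] == probs[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1   # average rank within the tie
        i = j + 1
    n1 = sum(outcomes)
    n0 = len(outcomes) - n1
    rank_sum = sum(r for r, y in zip(ranks, outcomes) if y == 1)
    return (rank_sum - n1 * (n1 + 1) / 2) / (n1 * n0)


# Tiny example: a replication counts as calibrated when the statistic is below 20.
example_probs = [0.5] * 20
example_outcomes = [0, 1] * 10
calibrated = hosmer_lemeshow(example_probs, example_outcomes) < 20
print(calibrated, c_statistic([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1]))
```

In the simulation, the first function would be applied to each of the 500 replications and the share of replications below the cutoff reported, while the second would be averaged across replications.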


Assumptions

For this study the following assumptions were made:

(i) The random or systematic error does not differ between persons who do and do not develop the disease, or between other subpopulations.

(ii) Other sources of error, including error in ascertainment of diabetes status and selection bias in the survey and/or sampling, are assumed to be absent.


3.1.4 Results

The fitted risk algorithms for males and females are shown in Table 1. These algorithms predict the probability of developing diabetes in 10 years given the respondents' BMI, which is a function of their reported height and weight.

Table 1a. Starting values taken from the 1996-7 National Population Health Survey used in the simulation.

| Parameter, mean (standard deviation) | Males (N = 9,177) | Females (N = 10,618) |
|---|---|---|
| Height (m) | 1.768 (0.075) | 1.627 (0.069) |
| Weight (kg) | 81.624 (13.805) | 64.761 (12.320) |
| BMI (kg/m2) | 26.076 (3.995) | 24.495 (4.586) |
| Correlation for height and weight (rhw) | 0.475 | 0.311 |
| 10-year DM incidence | 9.17% | 7.35% |

Table 1b. Properties of the actual risk equation relating BMI to the probability of developing diabetes from the NPHS cohort and values from the simulation model.

| Model | Variable | Coefficient | Standard error | P-value |
|---|---|---|---|---|
| Males, NPHS data (N = 9,177) | BMI | 0.4202 | 0.0383 | <0.0001 |
| | BMI2 | -0.00437 | 0.000618 | <0.0001 |
| | Intercept | -10.4034 | | |
| Males, simulation (N = 9,177) | BMI | 0.4263 | 0.0111 | <0.0001 |
| | BMI2 | -0.00448 | 0.001049 | <0.0001 |
| | Intercept | -10.4897 | | |
| Females, NPHS data (N = 10,618) | BMI | 0.4565 | 0.0554 | <0.0001 |
| | BMI2 | -0.00509 | 0.00091 | <0.0001 |
| | Intercept | -10.8967 | | |
| Females, simulation (N = 10,618) | BMI | 0.4593 | 0.0779 | <0.0001 |
| | BMI2 | -0.00514 | 0.00141 | <0.0001 |
| | Intercept | -10.5899 | | |

Model properties:

| Model | Calibration (χ2HL) | p-value | Discrimination (C-statistic) |
|---|---|---|---|
| Males, NPHS data | 5.67 | 0.6841 | 0.677 |
| Males, simulation | 9.951 | 0.3689 | 0.686 |
| Females, NPHS data | 9.33 | 0.3153 | 0.726 |
| Females, simulation | 10.466 | 0.3356 | 0.718 |
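As a worked example of how the equations in Table 1b translate BMI into a 10-year probability, the sketch below evaluates the NPHS equations at the mean BMI for each sex from Table 1a. The function name `ten_year_risk` and the choice of evaluation points are illustrative assumptions; only the coefficients and mean BMI values come from the tables.

```python
import math


def ten_year_risk(bmi, intercept, b_bmi, b_bmi2):
    """Predicted probability from the quadratic logistic equation in BMI."""
    logit = intercept + b_bmi * bmi + b_bmi2 * bmi ** 2
    return 1.0 / (1.0 + math.exp(-logit))


# NPHS coefficients from Table 1b, evaluated at the mean BMI from Table 1a.
risk_male_mean = ten_year_risk(26.076, -10.4034, 0.4202, -0.00437)
risk_female_mean = ten_year_risk(24.495, -10.8967, 0.4565, -0.00509)

# The same male equation evaluated at a higher BMI of 30 kg/m^2.
risk_male_30 = ten_year_risk(30.0, -10.4034, 0.4202, -0.00437)

print(f"male, BMI 26.076:   {risk_male_mean:.3f}")
print(f"female, BMI 24.495: {risk_female_mean:.3f}")
print(f"male, BMI 30:       {risk_male_30:.3f}")
```

Note that the risk at the mean BMI is not the same as the mean risk in the cohort, because the equation is non-linear in BMI. This distinction is relevant to the findings on random error reported below.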


The results from the simulations are presented by error type (random and systematic error) and by the influence on three areas: predicted risk, calibration, and discrimination. Due to the amount of data, most results are displayed in graphical form; however, tables of the results can be found in appendix 6.9.

Random Error

Random error in height and weight was examined for ICC = 1.0, 0.9, 0.8, 0.7, 0.6 and 0.5.

Impact on Predicted Risk

In the presence of random error, the overall probability of developing diabetes predicted from the risk algorithm was higher than the estimate from the algorithm applied to the data without random error. In other words, the presence of random error biased the overall predicted risk estimate upwards. The differences between the predicted risk without error and with error were relatively small, with the biggest differences being 0.99% higher than the truth (N = 90 more cases) for males and 0.89% higher for females (N = 95 more cases) under extreme levels of random error. Random error in weight had a bigger influence on the predicted risk than random error in height. When the ICC for weight was held at 1.0 (i.e. no error in weight) but ICCs in height were allowed to vary (from 1.0 to 0.5), the largest overestimate in diabetes risk was 0.30% (28 more cases) in males and 0.16% (70 more cases) in females. However, when the ICC for height was held at 1.0 but the ICCs for weight were allowed to vary, the largest overestimate was 0.70% (64 more cases) for females and 0.66% (39 more cases) for males (Figure 1).

In the presence of random error, the observed distribution of predicted risk is skewed right compared to the true distribution. Figure 2 shows the observed distribution of predicted risk under the assumption of no random error (thus reported values = true values) compared with the true distribution of predicted risk if the observed data had random error (ICC values of 0.8 and 0.6). The fact that the true distributions have a narrower range relative to the observed distributions indicates that, if the data to which the algorithm was applied contained some level of random error, then the true distribution of the predicted probability of developing diabetes would be narrower than what is estimated from the algorithm. Further, the effect largely influences the right side of the distribution since the left side is bounded by zero.

Impact on Calibration

Under the assumption of no random error in the observed data, the χ2H-L value was < 20 (calibration cutoff achieved) 97% of the time for males and females. At increasing levels of random error, the proportion of simulations where χ2H-L was less than the cutoff of 20 decreased steadily. Overall, ICCs of approximately 0.8 or higher resulted in the algorithm achieving the calibration criterion, that is, an H-L chi-square less than 20, at least 80% of the time. In both males and females, errors in weight led to larger decreases in calibration than errors in height, such that even a perfect height measurement (ICC height = 1.0) would not result in calibration if the ICC for weight dropped below 0.8. On the other hand, if the ICC for weight was 1.0, the algorithm could still achieve calibration almost 80% of the time even if the ICC for height was 0.6 (Figure 3).

Impact on Discrimination

Discrimination was decreased in the presence of random error. Under the most severe measurement error, the C-statistic was reduced from 0.69 (with no error) to 0.55 in males and from 0.72 (with no error) to 0.63 in females. If the ICCs for height and weight were higher than 0.8, then the differences in the C-statistic compared to the estimate that had no random error were less than 0.02. As with calibration, error in weight had a bigger impact on the C-statistic than errors in height (Figure 4).
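The direction of the two random-error findings above (higher average predicted risk and lower discrimination) can be reproduced qualitatively with a simplified sketch. Several assumptions are made here that differ from the thesis: error is added directly to BMI rather than separately to height and weight, BMI is treated as normally distributed, the male NPHS equation from Table 1b is held fixed rather than re-fit, and the seed, sample size and function names are hypothetical. The magnitudes are therefore not comparable to the reported results.

```python
import math
import random
from bisect import bisect_left, bisect_right

B0, B1, B2 = -10.4034, 0.4202, -0.00437  # male NPHS coefficients (Table 1b)


def risk(bmi):
    """Predicted 10-year probability from the fixed risk equation."""
    logit = B0 + B1 * bmi + B2 * bmi ** 2
    return 1.0 / (1.0 + math.exp(-logit))


def c_statistic(scores, outcomes):
    """Share of case/non-case pairs ranked correctly (ties count one half)."""
    non_cases = sorted(s for s, y in zip(scores, outcomes) if y == 0)
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    wins = 0.0
    for s in cases:
        below = bisect_left(non_cases, s)            # non-cases strictly lower
        ties = bisect_right(non_cases, s) - below    # non-cases with equal score
        wins += below + 0.5 * ties
    return wins / (len(cases) * len(non_cases))


rng = random.Random(2009)
n, mean_bmi, sd_observed, icc = 20000, 26.076, 3.995, 0.5
sd_true = math.sqrt(icc) * sd_observed          # true between-person spread
sd_error = math.sqrt(1.0 - icc) * sd_observed   # random measurement error

true_bmi = [rng.gauss(mean_bmi, sd_true) for _ in range(n)]
observed_bmi = [b + rng.gauss(0.0, sd_error) for b in true_bmi]
# Outcomes depend on the true BMI only; the error is non-differential.
outcome = [1 if rng.random() < risk(b) else 0 for b in true_bmi]

mean_true = sum(risk(b) for b in true_bmi) / n
mean_observed = sum(risk(b) for b in observed_bmi) / n
c_true = c_statistic([risk(b) for b in true_bmi], outcome)
c_observed = c_statistic([risk(b) for b in observed_bmi], outcome)

print(f"mean predicted risk: true {mean_true:.4f}, observed {mean_observed:.4f}")
print(f"C-statistic:         true {c_true:.3f}, observed {c_observed:.3f}")
```

In this setup the upward shift in average predicted risk comes from the curvature of the risk equation over the bulk of the BMI distribution: widening the BMI distribution raises predicted risk in the upper tail by more than it lowers it in the lower tail.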


Figure 1. Difference in overall risk (observed − true) under random error for males and females (averaged over 500 simulations).


Figure 2. Observed (top graphs) and true (middle and bottom graphs) distribution of the probability of developing diabetes under increasing levels of random error (ICC = 0.8 and 0.6) compared with observed distribution.

[Three histograms; x-axis: predicted probability (prob), 0.005 to 0.345; y-axis: percent.]

Observed distribution (ICC = 1.0): mean 7.20%; min, max (0.04%, 33.49%); 90th percentile 7.38%.

True distribution when ICC = 0.8: mean 7.00%; min, max (0.01%, 32.65%); 90th percentile 7.06%.

True distribution when ICC = 0.6: mean 6.80%; min, max (0.01%, 31.98%); 90th percentile 6.85%.

Figure 3a. Ability to detect calibration in 500 replications under various levels of random error in height and weight among males (top) and females (bottom).

Figure 4. Average C-statistic for 500 replications under various levels of random error in height and weight.

Systematic Error

The values for systematic measurement error in height and weight used in this study were taken from a recent systematic review that summarized the empirical evidence regarding the concordance of objective and subjective measures of height and weight (16). The under-reporting of weight was varied from 0 kg to 3.0 kg below the true value in increments of 0.5 kg. The over-reporting of height was varied from 0 cm to 3.0 cm above the true value in increments of 0.5 cm. The average level of systematic error found in the systematic review was an under-reporting of weight of 1.7 kg and an over-reporting of height of 2.5 cm.

Impact on Predicted Risk

As expected, under-reporting of weight and over-reporting of height resulted in an underestimate of the predicted probability of developing diabetes mellitus. The expected levels of over-reporting of height and under-reporting of weight taken from the systematic review (+2.5 cm and -1.7 kg) would result in a 0.86% reduction (91 fewer cases) in the overall predicted probability of developing diabetes for females and a 0.91% reduction (84 fewer cases) for males (Figure 5). The presence of random error in conjunction with systematic error slightly reduced the amount of underestimation (Figure 6).

Impact on Calibration

Overall, under-reporting of weight of at least 1.7 kg or over-reporting of height of at least 1.5 cm resulted in the algorithm achieving the calibration at least 80% of the time. None of the 500 simulations achieved calibration under the maximum biases in reported height and weight (Figure 7). It must be noted that there is no evidence that these extreme biases are likely in self-reported height and weight; rather, they were investigated to illustrate the range of results of under- and over-reporting. The presence of random error in conjunction with systematic error did not significantly worsen or improve power to detect calibration.

Impact on Discrimination

Under- or over-reporting of weight and height did not have a significant effect on the discrimination of the model (Figure 8). C-statistics were reduced very slightly in the most extreme case of under-reporting of weight or over-reporting of height; however, the difference from the true estimate was never more than 0.01 for either males or females. When both random error and systematic error were imposed, the C-statistic was reduced; however, this was due to the influence of random error and not systematic error.


Figure 5. Difference in overall risk (observed − true) under systematic error for males and females (averaged over 500 simulations).


Figure 6. Difference in overall risk (observed − true) under systematic error and random error in height and weight for males and females (averaged over 500 simulations).


Figure 7. Ability to detect calibration in 500 replications under various levels of systematic error in males and females.

[Chart: percent of 500 replications that achieved calibration in the H-L test in the presence of bias. Series: males, females. Y-axis: % H-L < 20 (0% to 100%). X-axis: bias (none; weight -0.5 kg to -3.0 kg in 0.5 kg steps; height +0.5 cm to +3.0 cm in 0.5 cm steps).]

Figure 8. Average C-statistic for 500 replications under various levels of systematic error.

Results in conjunction with a priori hypotheses

Below is a summary of the a priori hypotheses and the findings of the study in response to those hypotheses:

(i) Random measurement error will affect both discrimination and calibration of the risk algorithm. This was confirmed in this study. Increasing levels of random error (generally ICCs below 0.8) reduced the discrimination and calibration compared to a model without random error.

(ii) Systematic error will have minimal effects on discrimination but large effects on calibration of the risk algorithm. This was confirmed in this study. Systematic error resulted in reductions in calibration; however, even in extreme situations of under-reporting of weight or over-reporting of height, the discrimination (C-statistic) remained essentially unaffected.

(iii) Random error will not affect the overall predicted risk of the model and systematic error will influence the predicted risk in the direction of the under- or over-estimation of height and weight. This hypothesis was only partly confirmed in this study. It was confirmed that systematic error influenced the predicted risk in the direction of the systematic error. However, contrary to this hypothesis, random error also biased the overall estimate of predicted risk and the direction of this bias for this scenario was upward.


Doctor of Philosophy Epidemiology (PhD) Dissertation Laura C.A. Rosella 3.1.5 Discussion This study systematically examined the impact of measurement error in the context of a prediction algorithm. Simulation studies are flexible and permit a range of assumptions about the magnitude and direction of errors. This simulation study reveals several interesting aspects of the influence of measurement error (systematic and random) on the performance of a risk algorithm. Two out of the three a priori hypotheses were realized in this study. As hypothesized, random error reduced calibration and discrimination of the algorithm. From equation (I) σ2observed = σ2true + σ2error , it can be seen that that the observed variance is greater than the true variance in the presence of measurement error. In the context of Canada‘s population health surveys, the observed variance of self-reported BMI would be greater than the true variance because self-reported BMI may contain some random error. This study confirmed that measurement error increases the observed variance of BMI, thus, making the observed BMI distribution from the public use data wider than the true distribution. This affects both diabetics and non-diabetics resulting in greater overlap between the BMI distributions making assigning risk according to different levels of BMI more difficult to achieve. Even though random error in height and weight should on average correctly estimate the true BMI in the population (since it does not skew the mean in a particular direction), it can still influence the performance of a prediction model due to decreased precision, which leads to greater dispersion in the BMI distribution. This study confirmed that systematic error in height and weight will result in bias in the predicted risk estimates. This affects calibration, which is not surprising since the concordance of observed and predicted events would be influenced by the under- or over-reporting of the level BMI. 
In other words, the risk of persons who over- or under-report their weight will be over- or under-estimated by the risk model, resulting in disagreement with observed estimates.

As hypothesized, systematic error did not influence the ability to rank order subjects. In other words, the ability to discriminate between who will and will not develop diabetes was not affected by systematic error when variance due to random error is held constant. This was reflected by the stability of the C-statistic under varying degrees of systematic error. Systematic error was examined in this study such that the distribution of BMI was shifted to the left (as a result of underestimating weight or overestimating height or both) compared to the true distribution. This is an overall effect, and the decreased precision or increased variability seen with random error is not observed. Even though the distribution is shifted to the left, those with higher BMI still have a higher probability of developing diabetes compared to those with lower BMI, although the absolute levels of risk will be underestimated in both groups. This is a classic example of how discrimination and calibration are often discordant. Due to the nature of probability, it is possible for a prediction algorithm to exhibit perfect discrimination (i.e. it can perfectly resolve a population into those who will and will not experience the event) and at the same time have deficient accuracy (meaning that the predicted probability of a person experiencing the event does not agree with what will actually be observed) (28). This study did not impose systematic error with respect to disease status, but it could be hypothesized that if the systematic error were differential between diabetics and non-diabetics, this could indeed affect discrimination. This is a topic of future research.
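The contrast between random and systematic error described in this section can be illustrated with a small simulation. The sketch below is not the simulation program used in this study: the logistic risk function, its coefficients, the BMI distribution, and the error magnitudes are all hypothetical values chosen for illustration. It adds either random error (with an ICC of 0.5) or a constant downward shift to a simulated true BMI, then compares the variance of BMI, the C-statistic, and the mean predicted risk.

```python
import math
import random

def c_statistic(scores, outcomes):
    """Rank-based C-statistic (area under the ROC curve); assumes no tied scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    rank_sum = sum(rank + 1 for rank, i in enumerate(order) if outcomes[i] == 1)
    n_events = sum(outcomes)
    n_nonevents = len(outcomes) - n_events
    return (rank_sum - n_events * (n_events + 1) / 2) / (n_events * n_nonevents)

def predicted_risk(bmi, intercept=-6.0, slope=0.15):
    """Hypothetical logistic risk function of BMI (not the DPoRT coefficients)."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * bmi)))

def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

def mean_risk(bmis):
    return sum(predicted_risk(b) for b in bmis) / len(bmis)

rng = random.Random(2009)  # fixed seed so the sketch is reproducible
n = 20000
true_bmi = [rng.gauss(26.0, 4.5) for _ in range(n)]
outcome = [1 if rng.random() < predicted_risk(b) else 0 for b in true_bmi]

# Random error: error variance equal to the true variance, i.e. ICC = 0.5.
random_error_bmi = [b + rng.gauss(0.0, 4.5) for b in true_bmi]
# Systematic error: every BMI under-reported by the same amount (1 kg/m^2).
systematic_error_bmi = [b - 1.0 for b in true_bmi]

results = {}
for label, bmi in [("true", true_bmi),
                   ("random error", random_error_bmi),
                   ("systematic error", systematic_error_bmi)]:
    # Risk is monotone in BMI, so ranking by BMI equals ranking by predicted risk.
    results[label] = (variance(bmi), c_statistic(bmi, outcome), mean_risk(bmi))
    print(f"{label:17s} var={results[label][0]:6.2f} "
          f"C={results[label][1]:.3f} mean risk={results[label][2]:.3f}")
```

Under these assumptions the constant shift leaves the rank order, and hence the C-statistic, unchanged while lowering the mean predicted risk, whereas random error widens the BMI distribution and lowers the C-statistic.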
The finding that random error biased the overall predicted risk upwards was contradictory to the hypothesis that only systematic error would bias the risk estimate. As mentioned, random error increases the variability of a measurement and increases the range of predicted risk. From other simulations not reported here, it was shown that if the true prediction was greater than 0.5 then the estimated probability based on variables measured with error would be smaller than the observed probability, and conversely larger if the true prediction was less than 0.5. That is, in both cases the prediction using variables with measurement error pushes the estimated probabilities closer to 0.5.

Not surprisingly, the error in predicted risk resulting from under-reporting weight or over-reporting height is in the anticipated direction, i.e. if weight is under-reported the risk will be underestimated. Also not surprisingly, based on the above discussion, the addition of random error to this type of systematic error slightly reduced the amount of underestimation because the random and systematic errors work in opposite directions. In another situation, random error could potentially augment the error in predicted risk. Such would be the case if systematic error tended to result in an overestimate of risk.

There are several results from this study that have implications for DPoRT. DPoRT relies on self-report survey data, which is likely to suffer from some form of random error. This study confirms that random error making up no more than 20% of the total observed variance (ICC of 0.8 or higher) is unlikely to affect the performance or validation of the model. Research shows that the random error in height and weight reporting is unlikely to exceed that amount (14). However, the level of random error in the self-reporting of height and weight in the national health surveys needs to be confirmed to ensure that it does not make up more than 20% of the total observed variance. Interestingly, the effects on the estimate of predicted diabetes risk in the population were relatively minor, even in situations of high under- or over-reporting of weight and height.
This is likely because BMI has such a strong relationship with the outcome of diabetes that increased risk is apparent even with significant underestimation. In other words, the true distributions of BMI in diabetics and non-diabetics are so distinct that even in the presence of underreporting these populations have dissimilar risk for developing diabetes. Had this misclassification affected a variable which did not have such a strong relationship with the outcome, the effect on predicted risk may have been more severe. Furthermore, in this study systematic error in self-reported height and weight was taken as an overall effect in the population. If self-reporting error were significantly more likely to occur in those who were more likely to develop diabetes, then the impact of this bias could be augmented. This is another topic for future research.

This study examined a range of error values found in validation studies comparing self-reported with measured height and weight. A recent study by Shields et al. (17) examined agreement between self-reported and measured BMI in a subsample of the CCHS population. Overall systematic error in females was +0.5 cm for height and -2.5 kg for weight, which corresponded to an average underestimate of BMI of 1.2 kg/m². In males the bias was +1.0 cm for height and -1.8 kg for weight, which corresponds to an average underestimate of BMI of 0.9 kg/m². According to these values, DPoRT predicted risks may be underestimated by ~1% for both sexes; however, this underestimation is in the context of no random error, which as discussed above may reduce the underestimation. The relatively small amount of underestimation in predicted risk that occurs due to systematic error in self-reported height and weight is almost exactly proportional to the magnitude of error, i.e. BMI is underestimated by ~1 kg/m² (0.9 and 1.2 for males and females, respectively) and overall predicted risk is underestimated by ~1% in males and females.
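As a rough check on how reporting biases in height and weight translate into a bias in BMI, the arithmetic can be worked through for a single person. In the sketch below, the reporting biases are the values attributed to Shields et al. (17) above, while the reference weights and heights are hypothetical. A single reference person will not reproduce the population averages of 1.2 and 0.9 kg/m² exactly, since those average over individual-level errors.

```python
def bmi(weight_kg, height_cm):
    """Body Mass Index in kg/m^2."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2

# Hypothetical reference persons: (true weight in kg, true height in cm).
reference = {"female": (68.0, 162.0), "male": (82.0, 176.0)}
# Average reporting bias as cited in the text: (weight in kg, height in cm).
reporting_bias = {"female": (-2.5, 0.5), "male": (-1.8, 1.0)}

bmi_underestimate = {}
for sex, (weight, height) in reference.items():
    d_weight, d_height = reporting_bias[sex]
    true_bmi = bmi(weight, height)
    reported_bmi = bmi(weight + d_weight, height + d_height)
    bmi_underestimate[sex] = true_bmi - reported_bmi
    print(f"{sex}: true {true_bmi:.1f}, self-reported {reported_bmi:.1f}, "
          f"underestimate {bmi_underestimate[sex]:.1f} kg/m^2")
```

For these reference persons the underestimate is close to 1 kg/m² in both sexes, in line with the magnitude discussed above.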
If DPoRT were applied in a setting where reporting error is thought to be higher than in the NPHS (where the tool was developed), this increased error must be taken into account when predicting risk and validating the tool. If the amount of error can be quantified, this study outlines a way to correct predicted risk estimates using the ICC and equation (IV), $\sigma^2_{error} = \sigma^2_{observed}(1 - \rho)$. Since this study shows that measurement error has the ability to influence the discrimination and accuracy of a risk algorithm, research into understanding and quantifying these errors could potentially improve the performance of DPoRT.

This study focused on the overall trend of self-reporting error seen in several validation studies, that is, an underestimation of weight and an overestimation of height (15); however, these patterns may also vary across subpopulations defined by characteristics such as gender and socioeconomic status. Generally, women tend to underestimate weight more than men, and men tend to overestimate height more than women (14, 29). Socioeconomic status has been shown to modify these associations such that those of lower socioeconomic status may actually overestimate their weight and/or underestimate their height (30, 31). These subgroups may also have differential diabetes risk, and the extent to which this error influences population risk prediction is a topic of future research.

There are several limitations to consider in the context of this study. Conclusions drawn from this simulation study relate only to the scenarios simulated and may not apply to all risk algorithm situations. To make conclusions applicable elsewhere, simulation programs must be created that reflect the specific conditions of the study to which they are applied. Another caution in interpreting the findings of this study is that the models examined in this exercise are simpler than the complicated multivariate risk algorithms encountered in practice. This simpler model allows us to focus on the height and weight error, which is the greatest potential source of error in DPoRT. It should be noted that one of the assumptions of this study is that the only source of error is that in self-reported height and weight.
Other sources of error, including error in diabetes status and selection bias in the survey or in sampling, are assumed to be absent. We cannot confirm that the results of this study would be the same in the presence of the above-mentioned sources of error.

This study has provided novel information about the influence of measurement error in a prediction model. By understanding the consequences of measurement error on prediction and algorithm performance, efforts can be made to correct for these errors and thus improve the accuracy and validity of a risk algorithm. Future research will include investigation into systematic error with respect to disease status or other characteristics of the population. Further, efforts must be made to understand the nature of error in self-reported measurements. Ongoing work to improve the quality of measurements used in risk algorithms will improve model performance. Researchers developing and validating risk tools must be aware of the presence of measurement error and its impact on the performance of their risk tools.
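The link between the ICC and the variance components in equation (I) can be written out in a few lines. The derivation below assumes that the ICC, $\rho$, is defined as the ratio of true to observed variance, which is consistent with the statement above that an ICC of 0.8 corresponds to random error making up 20% of the total observed variance.

```latex
\begin{align*}
\rho &= \frac{\sigma^2_{true}}{\sigma^2_{observed}}
  && \text{(assumed definition of the ICC)} \\
\sigma^2_{error} &= \sigma^2_{observed} - \sigma^2_{true}
  && \text{(rearranging equation (I))} \\
  &= \sigma^2_{observed}\,(1 - \rho) \\
\rho = 0.8 \;&\Rightarrow\; \sigma^2_{error} = 0.2\,\sigma^2_{observed}
\end{align*}
```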

3.1.6 References

1. Flegal KM, Keyl PM, Nieto FJ. The effects of exposure misclassification on estimates of relative risk. Epidemiology 1986;123:736-51.
2. Greenland S. The effect of misclassification in the presence of covariates. American Journal of Epidemiology 1980;112:564-9.
3. Wacholder S. When measurement errors correlate with truth: Surprising effects of nondifferential misclassification. Epidemiology 2007;6:157-61.
4. Willet WC. An overview of issues related to the correction of non-differential errors in exposure measurement. Statistics in Medicine 1989;8:1040.
5. Wong JVA, Le ND, Burnett R. Causality, Measurement error and multicollinearity in epidemiology. Environmetrics 1996;7:441-51.
6. Last JM. A Dictionary of Epidemiology. Oxford: Oxford University Press, 2001.
7. Fuller WA. Estimation in the presence of measurement error. International Statistical Review 1995;63:121-41.
8. Weinstock MA, Colditz GA, Willet WC. Recall (report) bias and reliability in the retrospective assessment of melanoma risk. American Journal of Epidemiology 1991;133:240-5.
9. Colditz G et al. Weight as a risk factor for clinical diabetes in women. American Journal of Epidemiology 1990;132:501-13.
10. Colditz G et al. Weight gain as a risk factor for clinical diabetes mellitus in women. Annals of Internal Medicine 1995;122:481-6.
11. Perry IJ et al. Prospective study of risk factors for development of non-insulin dependent diabetes in middle aged British men. British Medical Journal 1995;310:555-9.
12. Vanderpump MPJ et al. The incidence of diabetes mellitus in an English community: a 20-year follow-up of the Wickham Survey. Diabetic Medicine 1996;13:741-7.
13. Wilson P et al. Prediction of incident diabetes mellitus in middle-aged adults. Archives of Internal Medicine 2007;167:1068-74.
14. Nawaz H et al. Self-reported weight and height: implications for obesity research. Journal of Preventive Medicine 2001;20:294-8.
15. Gorber SC et al. A comparison of direct vs. self-report measures for assessing height, weight, and body mass index: a systematic review. Obesity Reviews 2007;8:307-26.
16. Gorber SC et al. The feasibility of establishing correction factors to adjust self-reported estimates of obesity. Health Reports 2009;19.
17. Shields M, Gorber SC, Tremblay MS. Estimates of obesity based on self-report versus direct measures. Health Reports 2008;19:1-16.
18. Statistics Canada. 1996-97 NPHS Public Use Microdata Documentation. 1999. Ottawa.
19. Hux JE, Ivis F. Diabetes in Ontario. Diabetes Care 2005;25:512-6.
20. Lipscombe LL, Hux JE. Trends in diabetes prevalence, incidence, and mortality in Ontario, Canada 1995-2005: a population-based study. Lancet 2007;369:750-6.
21. Deyo RA, Diehr P, Patrick DL. Reproducibility and responsiveness of health status measures: Statistics and strategies for evaluation. Controlled Clinical Trials 2008;12:142S-58S.
22. Fleiss J. Statistical Methods for Rates and Proportions. New Jersey: John Wiley & Sons, 2003.
23. Hosmer DW, Lemeshow S. Applied Logistic Regression. New York: Wiley, 1989.
24. Hosmer DW et al. A comparison of goodness-of-fit tests for the logistic regression model. Statistics in Medicine 1997;16:965-80.
25. Hosmer DW, Lid Hjort N. Goodness-of-fit processes for logistic regression: simulation results. Statistics in Medicine 2002;21:2723-38.
26. D'Agostino RB et al. Validation of the Framingham Coronary Disease Prediction Scores. JAMA 2001;286:180-7.
27. Campbell G. General Methodology I: Advances in statistic methodology for the evaluation of diagnostic and laboratory tests. Statistics in Medicine 2004;13:499-508.
28. Diamond GA. What price perfection? Calibration and discrimination of clinical prediction models. Journal of Clinical Epidemiology 1992;45:85-9.
29. Niedhammer I et al. Validity of self-reported weight and height in the French GAZEL cohort. International Journal of Obesity 2000;24:1111-8.
30. Bostrom G, Diderichsen F. Socioeconomic differentials in misclassification of height, weight and body mass index based on questionnaire data. International Journal of Epidemiology 1997;26:860-6.
31. Wardle K, Johnson F. Sex differences in the association of socioeconomic status with obesity. International Journal of Obesity and Related Metabolic Disorders 2002;26:1144-9.

3.2 The role of ethnicity in the population-based prediction of diabetes

3.2.1 Abstract

Background: Certain ethnic groups have been shown to be at increased risk for type 2 diabetes. The current form of the Diabetes Population Risk Tool (DPoRT) includes a non-specific category of ethnicity (white/non-white, in concordance with public data available in Canada). Having to include detailed ethnic information would limit the applicability of the risk tool since this information is not routinely collected at the population level. Given the importance of ethnicity in influencing diabetes risk and its significance in Canada's multi-ethnic population, it is prudent to determine whether detailed ethnic information would significantly improve the prediction of diabetes risk using a population-based risk tool.

Objective: To apply and compare the Diabetes Population Risk Tool (DPoRT) with a modified version that includes detailed ethnic information in Canada's largest and most ethnically diverse province, Ontario.

Methods: Two diabetes prediction models were built using the same principles and data as DPoRT. The 2 models created in this study were: (i) a model that contained predictors specific to the following ethnic groups: White, Black, Asian, South Asian, and First Nation, and (ii) a reference model which did not include a term for ethnicity. In addition to discrimination and calibration measures of model performance, the 10-year diabetes incidence rate and predicted number of new diabetes cases in Ontario using the different algorithms were compared. The algorithms were developed using the 1996/7 National Population Health Survey in Ontario (N = 19,861) and validated in the 2000/1 Canadian Community Health Survey in Ontario (N = 26,465).

Results: All non-white ethnicities were associated with higher risk for developing diabetes than white ethnicity. Among non-white ethnicities, South Asian ethnicity had the highest hazard ratio for diabetes in both males and females. Discrimination was similar across all algorithms (C-statistic 0.75 to 0.77). Sufficient calibration (χ²H-L < 20) was maintained in the development and validation cohorts for all models except the detailed ethnicity model for males (χ²H-L = 33.9). For both males and females, applying DPoRT resulted in the lowest overall ratio between observed and predicted diabetes risk compared to the other two algorithms. DPoRT appears to identify more cases at high risk than the other two algorithms in males, whereas in females both DPoRT and the full ethnicity model identified more high risk cases compared to the algorithm without ethnic information. Overall, across deciles of risk, the DPoRT and full ethnicity algorithms were very similar in terms of predictive accuracy and estimated risk in the population.

Conclusions: Although incorporating information on ethnicity may be important from the individual risk perspective, when predicting new cases of diabetes at the population level and accounting for other risk factors, detailed ethnic information did not improve the discrimination and accuracy of the model or identify significantly more diabetes cases in the population.


3.2.2 Introduction

Planning for the health care and public health resources needed to address the significant burden of diabetes (1) is an important aspect of population health management, which can be informed by robust prediction tools such as the Diabetes Population Risk Tool (DPoRT) (Chapter 2). This tool can aid policy makers, planners, and physicians by providing reliable estimates of the upcoming diabetes epidemic. In addition, the effectiveness of widespread prevention strategies can be improved by knowing which groups to target and how extensive a strategy is needed to stabilize or reduce the number of new cases.

Risk prediction tools for estimating disease risk are common in clinical settings and are used for clinical decision-making (2). One of the limitations of clinical risk prediction tools for population prediction is the reliance on physical measurements or special risk questions, such as fasting blood sugar (3-5) or diabetes family history (6, 7) in the case of diabetes. At the population level in Canada, these measurements are not easily, accurately, or systematically captured. Currently, data on the prevalence of risk factors in the population are only collected through national population health surveys using self-reported measures. One of the key attributes of DPoRT is its accessibility to a broad audience. This is achieved by using data from surveys that are publicly available. Detailed ethnic information, though collected, is not publicly reported. Ethnic information from the surveys is reported publicly as "white/non-white," and thus this form of ethnicity was used in DPoRT in order to ensure that the tool can be applied to public data.

There is growing evidence that certain ethnic groups are at increased risk for developing type 2 diabetes.
Globally, non-European populations have a higher proportional burden of type 2 diabetes compared to the other regions of the world (8). The highest diabetes rates in the world are seen in Aboriginal populations, including those in Australia (9, 10), the United States (11, 12), and Canada (13, 14). In the United States, studies have shown that those of African and Hispanic descent are at increased risk for developing diabetes compared with non-Hispanic white Americans (15-17). Throughout the world, those of South Asian descent are another ethnic group which has been shown to carry an increased burden of type 2 diabetes compared with both non-white and white ethnicities (18-20). Data from Ontario demonstrate that overall, immigrant and ethnic minority populations suffer from a higher burden of diabetes and its complications (21). The importance of ethnicity when considering those at high risk for developing diabetes in the clinical setting has been emphasized through diabetes guidelines that recommend that people of Aboriginal, Hispanic, South Asian, Asian, or African descent be targeted for screening (22, 23).

Canada's immigrant population is largely made up of non-white ethnicities. Immigrants account for 18-20% of Canada's population (24), and this percentage is expected to increase over time. Estimates of immigrant populations are as high as 50% for major urban centers such as Toronto. Though clinically and epidemiologically important risks associated with ethnicity are apparent, it is not clear how ignoring ethnic-specific predictors will affect a population-based prediction tool for diabetes. Currently, given that DPoRT performed well in 2 external validation cohorts, it is assumed that diabetes risk is sufficiently estimated using the current form of the model. However, given the significance of ethnicity in Canada and its important influence on diabetes risk, it may be possible that failing to apply ethnic-specific predictors will reduce the ability to identify high risk groups.
In order to have confidence in applying this tool in Canada, it needs to be determined if the inclusion of detailed ethnic predictors will significantly change the performance of DPoRT.

The purpose of this study was to assess the impact of including detailed ethnic information in a prediction algorithm for diabetes. Specifically, this study described the relative benefits to predictive accuracy and model outputs that are gained with the addition of ethnic-specific predictors to the model. In addition to informing the application of DPoRT, this work also provides insight into the independent role of ethnicity in diabetes risk once additional risk factors are considered. No study has previously taken such an approach (i.e. prediction) to understand the impact of ethnicity on estimating the probability of developing diabetes.


3.2.3 Methods

Creation and validation of DPoRT

Development Cohort

The study cohort was derived from 23,403 people from Ontario who responded to the 1996/7 National Population Health Survey (NPHS-ON) conducted by Statistics Canada. In the NPHS, households were selected through a stratified, multilevel cluster sampling of private residences using provinces and/or local planning regions as the primary sampling unit. The survey, conducted via telephone, had an overall 83% response rate and all responses were self-reported (15). Persons under the age of 20 (n = 2,407) and those who had previously diagnosed diabetes or self-reported diabetes (n = 894) were excluded. Those who were pregnant at the time of the survey were also excluded (n = 241) because baseline Body Mass Index (BMI) could not be accurately ascertained, leaving a total of 19,861 individuals (Figure 1). Sixty-six males were further excluded due to missing baseline BMI, resulting in 9,177 males and 10,618 females in the final cohort.

Validation Cohorts

The DPoRT algorithm was validated in two external cohorts. Further details on the DPoRT validation are provided in Chapter 2. One validation cohort was used in this study to compare the performance of the 3 risk functions in an external cohort. The validation cohort used in this study was derived from the Ontario portion of the 2000/1 Canadian Community Health Survey (CCHS, Cycle 1.1, N = 37,473), a national telephone survey administered by Statistics Canada. The target population of the CCHS consisted of persons aged 12 and over resident in private dwellings in all provinces and territories, excepting those living on Aboriginal reserves, on Canadian Forces Bases, or in some remote places. The CCHS included the same self-reported health questions as the NPHS. Like the NPHS, this survey uses a multistage stratified cluster design; it provides cross-sectional data representative of 98% of the Canadian population over the age of 12 years and attained an 80% overall response rate (25, 26). After the exclusion criteria were applied there were 26,465 individuals in the validation cohort.

Identifying respondents who develop diabetes

Survey data from the development and validation cohorts were linked to provincial administrative health care databases that include all persons covered under the government-funded universal health insurance plan. The diabetes status of all respondents in Ontario was established by linking persons to the Ontario Diabetes Database (ODD). The ODD contains all physician-diagnosed diabetes patients in Ontario identified since 1991. The database is created using hospital discharge abstracts and physician service claims. A patient is said to have physician-diagnosed diabetes if he or she meets at least one of the following two criteria: (a) a hospital admission with a diabetes diagnosis (International Classification of Diseases Clinical Modification (ICD9-CM) code 250 before 2002 or ICD-10 code E10-E14 after 2002), or (b) a physician services claim with a diabetes diagnosis (code 250) followed within two years by either a physician services claim or a hospital admission with a diabetes diagnosis. Individuals entered the ODD as incident cases when they were defined as having diabetes according to the criteria described above. Diabetic records close to a hospital record with a diagnosis of pregnancy care or delivery (i.e. a gestational admission date between 90 days before and 120 days after the diabetic record date) were considered to represent gestational diabetes and so were excluded. The ODD has been validated against primary care health records as an accurate measure of incidence and prevalence of diabetes in Ontario (sensitivity of 86%, specificity of 97%) (27, 28). Information regarding the vital statistics and eligibility for health care coverage for linked respondents was captured from the Registered Persons Data Base (RPDB). The algorithm used to create the ODD was applied to Manitoba's administrative health care data to ascertain physician-diagnosed diabetes status in that province. The ODD algorithm is applied nationally using provincial administrative registries (known as the National Diabetes Surveillance System (NDSS)) and has been used and validated in several Canadian provinces (29).
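The two-part case definition described above can be sketched as a small function. This is an illustration only, not the ODD's actual implementation: the record layout, the function names, the 730-day reading of "two years", and the choice of the second record's date as the entry date are all assumptions made for the example.

```python
from datetime import date, timedelta

TWO_YEARS = timedelta(days=2 * 365)  # assumption: "two years" taken as 730 days

def near_obstetric_admission(record_date, obstetric_admissions):
    """True if an obstetric admission falls 90 days before to 120 days after the record."""
    return any(
        record_date - timedelta(days=90) <= adm <= record_date + timedelta(days=120)
        for adm in obstetric_admissions
    )

def diabetes_entry_date(records, obstetric_admissions=()):
    """Return the date the case definition is first met, or None.

    records: iterable of (date, source) pairs for records carrying a diabetes
    diagnosis, where source is "hospital" or "claim" (hypothetical layout).
    """
    kept = sorted(
        r for r in records
        if not near_obstetric_admission(r[0], obstetric_admissions)
    )
    for i, (record_date, source) in enumerate(kept):
        if source == "hospital":
            return record_date  # criterion (a): one admission is sufficient
        if i + 1 < len(kept) and kept[i + 1][0] - record_date <= TWO_YEARS:
            return kept[i + 1][0]  # criterion (b): claim followed by a second record
    return None

# A claim followed by a second claim within two years meets criterion (b).
example = [(date(2001, 1, 10), "claim"), (date(2002, 6, 1), "claim")]
print(diabetes_entry_date(example))
```

A single physician claim with no follow-up record returns None, which reflects the requirement for a confirming record under criterion (b).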

Variable Definitions

Variables used in this study were obtained from responses in the NPHS and CCHS, including: age, Body Mass Index (BMI), presence of chronic conditions diagnosed by a health professional (including hypertension and heart disease), ethnicity, immigration status, smoking status, and highest level of achieved education. BMI in kg/m² was used as an indicator of obesity. The derived BMI variable in the NPHS, calculated by dividing weight in kilograms by height in meters squared, is only provided for respondents aged 30 to 64; therefore, for those who fell outside this age range, BMI was calculated from weight and height according to the derived variable specification (30). Ethnicity was ascertained by the question, 'To which ethnic or cultural groups do your ancestors belong?' The ethnic groups were classified as White, Black, Asian, South Asian, and First Nation according to Statistics Canada's definition (31). Statistics Canada releases public-use micro data files of the national health surveys; however, certain variables are suppressed or modified in these files to protect privacy. In the public-use file ethnicity is only categorized as white or non-white, derived from the response to the ethnicity question (20). The shared population health survey files, which are available at the provincial level, contain more detailed information (including detailed ethnicity). Access to the shared data files is highly restricted, which is why DPoRT was developed using variables from the public use file rather than the shared file. In this study the shared file was used in order to allow both forms of the variable (white/non-white vs 5 ethnic groups) to be compared.

Statistical Analysis

Creation of DPoRT

The goal in the creation of DPoRT was to create a risk algorithm that would accurately predict diabetes risk with high discrimination and calibration using risk factors that are measured reliably from health survey data. A detailed description of the development of DPoRT can be found in Chapter 2. Briefly, for each cohort member, the probability of physician-diagnosed diabetes was assessed from the interview date until censoring for death or end of follow-up using a Weibull accelerated failure time model. Diabetes risk functions were derived separately for men and non-pregnant women above the age of 20 without a prior diabetes diagnosis. Each variable was centred on the population mean before inclusion in the model to allow for easier calibration with other cohorts. This means that when the algorithm is applied, all variables are centred on the mean values for the cohort to which it is being applied, allowing levels of risk to be reflective of the average baseline risk in that cohort. Overall risk (predicted probability) of diabetes for each person was calculated by multiplying the individual's risk factor values by the corresponding regression coefficients and summing the products (32). The form of the model was assessed using likelihood ratio tests to compare nested parametric models (33). A plot of log(-log S(t)) vs log(t) was inspected for linearity to assess consistency of the survival times with the Weibull distribution. Cox-Snell residual plots were also constructed to assess the adequacy of the Weibull distribution assumption. All estimates (including betas and variance estimates) incorporated bootstrap replicate survey weights to accurately reflect the demographics of the Canadian population and account for the survey sampling design based on selection probabilities and post-stratification adjustments. Variance estimates and 95% confidence intervals were calculated using bootstrap survey weights (34, 35). All statistics were computed using SAS statistical software (version 9.1, SAS Institute Inc, Cary, NC).
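The step from mean-centred risk factors to a predicted probability can be sketched as follows. This is a minimal illustration, assuming the common log-linear parameterization of the Weibull accelerated failure time model (log T = mu + scale × W, with W following a standard extreme value distribution); the coefficients, means, intercept, and scale are hypothetical and are not the DPoRT estimates, and survey weights are ignored.

```python
import math

def linear_predictor(x, means, betas, intercept):
    """Intercept plus the sum of coefficient times mean-centred risk factor value."""
    return intercept + sum(betas[k] * (x[k] - means[k]) for k in betas)

def weibull_aft_risk(mu, scale, t):
    """P(T <= t) when log T = mu + scale * W and W is standard extreme value."""
    return 1.0 - math.exp(-math.exp((math.log(t) - mu) / scale))

# Hypothetical values for illustration only; these are not DPoRT coefficients.
means = {"bmi": 26.0, "hypertension": 0.15}
betas = {"bmi": -0.06, "hypertension": -0.30}
intercept, scale = 4.29, 0.80  # time measured in years

mu_average = linear_predictor(means, means, betas, intercept)
mu_high = linear_predictor({"bmi": 32.0, "hypertension": 1}, means, betas, intercept)
risk_average = weibull_aft_risk(mu_average, scale, 10)
risk_high = weibull_aft_risk(mu_high, scale, 10)
print(f"10-year risk at cohort means: {risk_average:.3f}")
print(f"10-year risk, BMI 32 with hypertension: {risk_high:.3f}")

def cloglog_survival(mu, scale, t):
    """log(-log S(t)), the quantity plotted against log(t) to check the Weibull form."""
    return math.log(-math.log(1.0 - weibull_aft_risk(mu, scale, t)))

# Under a Weibull model this plot is a straight line with slope 1/scale.
slope = (cloglog_survival(mu_average, scale, 10) - cloglog_survival(mu_average, scale, 5)) / (
    math.log(10) - math.log(5)
)
print(f"slope of log(-log S(t)) vs log(t): {slope:.2f}")
```

Because every variable is centred, a person at the cohort means has a linear predictor equal to the intercept, which is what allows the baseline risk to follow the cohort to which the algorithm is applied.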

Creation of additional models

Two additional models were created in the NPHS-ON development cohort as described above, except that the models were modified to either include ethnic-specific predictors or remove any ethnic predictors; therefore, in total 3 prediction models were compared:

(i) DPoRT minus ethnicity, called "no ethnicity"
(ii) DPoRT
(iii) DPoRT plus detailed ethnic information, called "Full ethnicity model"

In DPoRT (model (ii)) ethnicity is grouped as in the public use file as white/non-white. In model (iii) ethnicity is broken up into the categories consistent with the diabetes screening guidelines (36): White, Black, Asian, South Asian, and First Nation.


Comparison across models

The performance of each model was measured by the discrimination and calibration values in the external validation cohort. Discrimination was measured using a C statistic modified for survival data developed by Pencina et al. (37), analogous to the area under the ROC curve (38). Calibration was assessed using a modified version of the Hosmer-Lemeshow χ² statistic developed by Nam (39, 40). This statistic is computed by dividing the validation cohort into deciles of predicted risk and comparing the observed versus predicted risk in each decile using a chi-square statistic (see appendix 6.1). To mark sufficient calibration, χ² = 20 was used as a cutoff (p<0.01), consistent with D'Agostino's validation of the Framingham algorithms (39). Observed versus predicted cases of diabetes were also compared across ethnicities to examine the concordance across ethnic groups using the three algorithms.

The policy implications of the 3 models were assessed by applying each model to the 2000/1 data to predict 10-year diabetes incidence rates and cases and then comparing these values between the algorithms. The proportion of the population identified as high risk was also reported and compared across algorithms. In this study the probability of developing diabetes was stratified into the following categories: <2%, 2-5%, 5-10%, 11-20%, ≥20%, where 11-20% and ≥20% were considered high risk. In order to describe the impact of disagreement between observed and predicted diabetes risk as a function of the proportion of the population where that disagreement exists, an index called the Population Disagreement Index (PDI) was developed. PDI was summarized across ethnic groups and compared between models.
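The decile-based calibration check described above can be sketched in its basic binary-outcome form. The version used in this study is the survival-data adaptation by Nam, detailed in the appendix, which handles censoring; the function below is the simpler Hosmer-Lemeshow-style statistic, ignores survey weights, and uses made-up data purely to show the mechanics.

```python
def decile_calibration_chi2(predicted, observed, groups=10):
    """Hosmer-Lemeshow-style chi-square over groups of increasing predicted risk.

    predicted: predicted probabilities; observed: 0/1 outcomes (no censoring).
    """
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    chi2 = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        n_g = len(chunk)
        mean_p = sum(p for p, _ in chunk) / n_g
        events = sum(o for _, o in chunk)
        expected = n_g * mean_p
        chi2 += (events - expected) ** 2 / (n_g * mean_p * (1.0 - mean_p))
    return chi2

# Made-up cohort: 10 groups of 100 people with true risks of 5%, 10%, ..., 50%,
# and exactly the expected number of events in each group.
true_risk, outcomes = [], []
for g in range(1, 11):
    events = 5 * g
    true_risk += [0.05 * g] * 100
    outcomes += [1] * events + [0] * (100 - events)

chi2_calibrated = decile_calibration_chi2(true_risk, outcomes)
# A model that predicts half the true risk is poorly calibrated.
chi2_halved = decile_calibration_chi2([p / 2 for p in true_risk], outcomes)
print(f"calibrated: {chi2_calibrated:.2f}, risk halved: {chi2_halved:.2f}")
```

With the cutoff of 20 used in this study, the first model would be judged sufficiently calibrated and the second would not.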


This index is defined as follows:

Population Disagreement Index (PDI)

$$\mathrm{PDI} = \sum_{i=1}^{n} \frac{O_i}{P_i} \times Pp_i$$

where
$O_i / P_i$ = ratio between observed and predicted in subgroup $i$
$Pp_i$ = proportion of the population made up of subgroup $i$
$i = 1, \ldots, n$, where $n$ = number of subgroups in the population
$0 < \mathrm{PDI} < \infty$

The unweighted ratio between observed and predicted was also calculated to demonstrate the influence of the distribution of the subgroups (i.e. $Pp_i$). This was calculated by taking the overall ratio, i.e. $\sum_{i=1}^{n} O_i / P_i$.
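As an illustration of the index, the following sketch computes the PDI and an unweighted comparison for made-up subgroup data. The function name and all numbers are hypothetical. The unweighted value is taken here as the simple mean of the subgroup ratios, which is an assumption based on the "unweighted average ratio" wording of the Figure 3 caption.

```python
def population_disagreement_index(observed, predicted, population_share):
    """Return (PDI, unweighted mean ratio) for a set of subgroups.

    observed, predicted: cases (or risks) per subgroup.
    population_share: proportion of the population in each subgroup (sums to 1).
    """
    if abs(sum(population_share) - 1.0) > 1e-9:
        raise ValueError("population shares must sum to 1")

    # Observed:predicted ratio within each subgroup.
    ratios = [o / p for o, p in zip(observed, predicted)]

    # PDI: each ratio weighted by the subgroup's share of the population.
    pdi = sum(r * share for r, share in zip(ratios, population_share))

    # Unweighted comparison (assumed here to be the simple mean of the ratios).
    unweighted = sum(ratios) / len(ratios)
    return pdi, unweighted

# Hypothetical example: a large, well-predicted subgroup and a small,
# under-predicted subgroup (observed is twice the predicted value).
pdi, unweighted = population_disagreement_index(
    observed=[100, 30], predicted=[100, 15], population_share=[0.9, 0.1]
)
print(round(pdi, 2), round(unweighted, 2))
```

Because the under-predicted subgroup makes up only 10% of this hypothetical population, the weighted index sits much closer to 1 than the unweighted mean does.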


3.2.4 Results
The observed 10-year diabetes risk (cumulative incidence rate) in the Ontario development cohort was 8.8% for males and 7.3% for non-pregnant females aged 20 years and older at baseline. In addition to non-white ethnicity, the other attributes in the model, which were previously validated, were: BMI, age (and its interactions), hypertension, smoking, heart disease and immigrant status. The DPoRT development section shows the multivariate-adjusted hazard ratios for the risk factors in the DPoRT algorithm for males and females.

Ethnicity
Adjusted hazard ratios for the ethnic categories in DPoRT and the full ethnicity algorithm are shown in Figure 1. Non-white ethnicity has a hazard ratio of 2.14 (95% CI 1.74, 2.63) in males and 1.71 (95% CI 1.35, 2.16) in females, adjusted for all other variables in the risk algorithm. In the full ethnicity model, hazard ratios for specific ethnic groups ranged from 1.11 to 3.02 compared to white ethnicity. All non-white ethnic groups were at higher diabetes risk than white ethnicity. South Asian ethnicity had the highest hazard ratio for diabetes in both males and females.


Figure 1. Adjusted Hazard Ratios (white ethnicity as reference) and 95% CIs for Ethnic Variables in DPoRT.


Model Performance
All 3 models showed good discrimination in the development and validation cohorts (C-statistic ranging from 0.75 to 0.77). The algorithm with white/non-white ethnicity predictors had slightly higher discrimination than the model with no ethnicity (i.e. DPoRT versus the no ethnicity algorithm) (Table 1). The full ethnicity algorithm achieved the same discrimination as DPoRT for males and females. Sufficient calibration (H-L χ2 < 20) was maintained in the development and validation cohorts for all models except the detailed ethnicity model for males (H-L χ2 = 33.9). All models had a similar ratio of observed to predicted risk across deciles of risk (Figures 1a and 1b). All 3 models under-predicted risk in South Asian males, and the largest under-prediction was in the model without any ethnic information (Figures 2a and 2b). Of the three risk algorithms, DPoRT had the lowest overall average ratio between observed and predicted (Figure 3). Weighting by population proportion significantly reduces the overall disagreement in the population because larger disagreement occurs in smaller proportions of the population.

Table 1. C statistics with 95% confidence intervals and calibration χ2 statistics for 3 algorithms.

| Measure | Model | Males, development | Males, validation | Females, development | Females, validation |
|---|---|---|---|---|---|
| C-Statistic | No ethnicity | 0.75 (0.74, 0.77) | 0.76 (0.75, 0.78) | 0.77 (0.75, 0.78) | 0.76 (0.74, 0.78) |
| C-Statistic | DPoRT | 0.76 (0.74, 0.77) | 0.77 (0.75, 0.78) | 0.77 (0.75, 0.79) | 0.76 (0.74, 0.78) |
| C-Statistic | Full ethnicity | 0.76 (0.74, 0.77) | 0.77 (0.76, 0.79) | 0.77 (0.76, 0.79) | 0.76 (0.74, 0.78) |
| H-L χ2 | No ethnicity | 3.22 | 11.38 | 9.01 | 6.05 |
| H-L χ2 | DPoRT | 1.54 | 15.36 | 7.02 | 8.00 |
| H-L χ2 | Full ethnicity | 6.27 | 33.91 | 7.18 | 9.11 |
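The discrimination values in Table 1 are concordance statistics for survival data. The sketch below shows a simplified pairwise concordance calculation for right-censored data in the spirit of such a statistic. It is not the Pencina and D'Agostino estimator used in this study and gives no confidence interval; the function name and the example data are hypothetical.

```python
def concordance_index(time, event, risk):
    """Pairwise concordance for right-censored data (simplified sketch).

    A pair (i, j) is usable when subject i has an observed event and a
    shorter follow-up time than subject j. The pair is concordant when the
    subject who failed earlier also has the higher predicted risk.
    """
    concordant = 0.0
    usable = 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # a censored subject cannot be the earlier failure
        for j in range(n):
            if time[i] < time[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied predictions count as half
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable

# Hypothetical example: predicted risk falls as time-to-event rises,
# so every usable pair is concordant.
c = concordance_index(
    time=[1, 2, 3, 4], event=[1, 1, 0, 1], risk=[0.9, 0.7, 0.5, 0.1]
)
print(c)
```

A value of 0.5 corresponds to predictions that carry no ranking information, which is the reference point for reading the values of 0.75 to 0.77 in Table 1.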


Figure 1a. Observed versus predicted in the validation cohort by decile of risk using the algorithm without ethnicity, with 'white/non-white' ethnicity, and with detailed ethnic predictors for males.
[Figure: three panels for males (No Ethnicity, DPoRT, Full Ethnicity), each plotting Observed and Predicted against Deciles of Risk (1 to 10) on a vertical axis running from 0 to 70000.]

Figure 1b. Observed versus predicted in the validation cohort by decile of predicted risk using the algorithm without ethnicity, with 'white/non-white' ethnicity, and with detailed ethnic predictors for females.
[Figure: three panels for females (No Ethnicity, DPoRT, Full Ethnicity), each plotting Observed and Predicted against Deciles of Risk (1 to 10) on a vertical axis running from 0 to 70000.]

Figure 2a. Observed versus predicted 5-year risk and number of cases by ethnicity for males.
[Figure: three panels for males (No Ethnicity, DPoRT, Full Ethnicity) by ethnic group (White, Black, Asian, First Nation, South Asian, Other). Axes: 5-year DM rate (%), 0.00 to 15.00, with series Observed Rate and Predicted Rate; Predicted number of DM cases (thousands), 0 to 160, with series Observed Cases and Predicted Cases.]

Figure 2b. Observed versus predicted 5-year risk and number of cases by ethnicity for females.
[Figure: three panels for females (No Ethnicity, DPoRT, Full Ethnicity) by ethnic group (White, Black, Asian, First Nation, South Asian, Other). Axes: 5-year DM rate (%), 0.00 to 15.00, with series Observed Rate and Predicted Rate; Predicted number of DM cases (thousands), 0 to 160, with series Observed Cases and Predicted Cases.]

Figure 3. Unweighted average ratio between observed and predicted cases and weighted average ratio between observed and predicted (i.e. PDI).
[Figure: two panels (Males, Females) showing the average ratio observed:predicted for each algorithm (DPoRT, Full ethnicity, No ethnicity), unweighted and weighted. Data labels in the males panel: 1.89, 1.32, 1.17, 1.09, 1.08, 1.03. Data labels in the females panel: 1.27, 1.09, 1.07, 1.06, 1.08, 0.93.]


Policy Implications: estimates of 10-year population risk in Ontario
In males the overall 10-year predicted risk was 9.85% for the no ethnicity model, 10.11% in DPoRT and 10.06% in the full ethnicity model. In females the average 10-year predicted risk was 7.83% for the no ethnicity model, 7.95% in DPoRT and 7.97% in the full ethnicity model. There were 9,660 more predicted cases in males and 5,013 more predicted cases in females in DPoRT than in the model without ethnicity. In males, 1,409 fewer cases were predicted in the full model compared with DPoRT, and in females 934 more cases were predicted in the full model compared with DPoRT (Table 2). Ethnic-specific risk diverged more in males and females when including ethnic-specific terms (Figure 4). The distribution of risk in the population for males and females can be seen in Figure 5. Overall, the distribution of risk (i.e. the proportion of the population belonging to each risk category) is similar across the algorithms for females and males. DPoRT appears to identify more cases at high risk than the other two algorithms in males, whereas in females both DPoRT and the full ethnicity algorithm identify more high-risk cases and are not substantially different. Across deciles of risk, the number of diabetes cases predicted using the DPoRT and full ethnicity algorithms was very similar in both males and females (Figure 6).

Table 2. 10-year risk and predicted new diabetes cases in Ontario from baseline in the 2001 CCHS.

| Model | Males: diabetes rate (%) | Males: med rate (%) | Males: # new cases | Females: diabetes rate (%) | Females: med rate (%) | Females: # new cases |
|---|---|---|---|---|---|---|
| No ethnicity | 9.85 | 5.78 | 386,964 | 7.83 | 4.18 | 321,724 |
| DPoRT | 10.11 | 6.50 | 396,624 | 7.95 | 4.74 | 326,737 |
| Full ethnicity | 10.06 | 6.74 | 395,215 | 7.97 | 4.69 | 327,671 |
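The differences in predicted cases quoted in the text can be reproduced from the counts in Table 2. The short sketch below does this arithmetic; the dictionary layout and the helper name are illustrative only.

```python
# Predicted new diabetes cases over 10 years, taken from Table 2.
predicted_cases = {
    "No ethnicity": {"males": 386_964, "females": 321_724},
    "DPoRT": {"males": 396_624, "females": 326_737},
    "Full ethnicity": {"males": 395_215, "females": 327_671},
}

def case_difference(model_a, model_b, sex):
    """Predicted cases under model_a minus predicted cases under model_b."""
    return predicted_cases[model_a][sex] - predicted_cases[model_b][sex]

print(case_difference("DPoRT", "No ethnicity", "males"))      # 9660
print(case_difference("DPoRT", "No ethnicity", "females"))    # 5013
print(case_difference("Full ethnicity", "DPoRT", "males"))    # -1409
print(case_difference("Full ethnicity", "DPoRT", "females"))  # 934
```

The negative value for males reflects the 1,409 fewer cases predicted by the full ethnicity model than by DPoRT.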

Figure 4. Average 10-year diabetes risk for males and females by ethnicity using 3 algorithms
[Figure: for males and for females, panels by algorithm (DPoRT, Full Ethnicity, No Ethnicity) showing 10-year risk (%), with axis ticks from 5 to 20, by ethnicity (Asian, Black, First Nation, Other, South Asian, White).]


Figure 5a. Distribution of diabetes risk (left) and number of new cases by risk group in the population (below) using 3 algorithms for males.


Figure 5b. Distribution of diabetes risk (left) and number of new cases by risk group in the population (below) using 3 algorithms for females.


Figure 6. Predicted diabetes cases in 10 years by decile of predicted risk for males and females using three algorithms
[Figure: two panels (Males, Females) plotting predicted cases, on an axis from 0 to 140000, against Decile of Risk (1 to 10) for the No Ethnicity, DPoRT and Full algorithms.]

3.2.5 Discussion
The aim of this study was to assess the impact of including detailed ethnic predictors in a population-based risk tool for diabetes in Canada. In addition to identifying relative hazards of developing diabetes by ethnicity, this study provides estimates of the predicted number of cases in a provincial population by ethnicity for the next 10-year period. Using a population-based cohort, this study confirmed that those of non-Caucasian descent are at increased risk of developing diabetes and, consistent with previous research, that hazard ratios were highest among South Asians (18). In terms of overall model performance, no additional predictive value was detected when adding detailed ethnic predictors. At the population level, the distribution of risk was similar across the different risk levels in the population, particularly between DPoRT and the full ethnicity model. This study suggests that using DPoRT in its current form is sufficient for accurately predicting diabetes cases in ethnically diverse populations similar to Ontario.
The finding that the algorithm to predict diabetes that uses detailed ethnicity did not significantly differ from one that uses a broad categorization of ethnicity can be driven by two mechanisms involving statistical prediction and population disagreement influence (PDI). There are several reasons why a clinically important risk factor may not improve the performance of a prediction tool. Even though a variable is independently associated with an outcome, it may not provide incremental improvements in the test characteristics that are relevant for prediction (i.e. discrimination and calibration) in the context of existing predictors. This phenomenon has also been shown for other clinical predictors and outcomes, such as C-reactive protein for cardiac risk prediction (41).
In fact, research has shown that although a battery of novel risk factors has been developed for the prediction of major coronary heart disease (CHD) events, these novel factors have been generally unimpressive in their ability to improve CHD prediction (42). Furthermore, it has been shown that for a variable to make significant improvements in discrimination (i.e. improvement in AUC from 0.8 to 0.9) its multivariable odds ratio must be 6.9 or greater (43), suggesting that in order for detailed ethnicity to improve the algorithm beyond its current discrimination, the adjusted hazard ratio must be very large in magnitude. Interestingly, the model without ethnicity was not detectably worse in terms of model performance, such that discrimination and calibration were only marginally decreased compared to DPoRT. This is likely because many of the reasons that ethnicity plays a role in diabetes risk are related to other factors captured in the model. In particular, socioeconomic status, obesity (particularly younger onset of obesity), and other lifestyle factors have been shown to be related to both ethnicity and diabetes risk (18). Most importantly, immigration status, which is already captured in the model, may explain a significant amount of the variability in diabetes incidence that is associated with ethnicity. The diminishing return on model performance when adding statistically significant predictors to the model was also noted in the model building process of DPoRT and was one of the reasons that DPoRT maintained good discrimination, even with considerable constraints on variable selection.
Population disagreement influence (PDI) is an extension of the idea of population attributable risk (PAR). PAR describes the impact of a risk factor on population risk as a function of the prevalence of the exposure and the relative risk of disease (44). In this study the estimate of relative risk in PAR is translated to the disagreement between observed and predicted, and the prevalence of the exposure is translated to the proportion of the population where the disagreement exists.
Therefore, PDI describes the impact of disagreement between observed and predicted risk for a risk tool as a function of the proportion of the population where that disagreement exists. PDI exemplifies how population risk is driven by where the cases lie in the population. This means that a large relative discrepancy between observed and predicted that is concentrated in a subgroup covering a small proportion of the population will have less impact on the overall population estimate of diabetes than disagreement of the same magnitude that affects a larger proportion of the population. This finding emphasizes an important difference between individual and population risk prediction; differences in individuals or subgroups may be important if the algorithm were to be applied to an individual, but these differences may not be as critical if it is applied for population estimates. This also identifies a potential difference in the way that algorithms must be independently validated, depending on whether they are intended for use on the individual or in small subpopulations. Of course, in the same way that PAR is affected by the prevalence of the risk factor in the population, the influence of disagreement within ethnic groups is affected by the ethnic composition of the population. Ethnic composition can change over time and the impact of this on the validity of the algorithm should be regularly assessed. Prediction tools used at the individual level or in small subpopulations must be independently validated in those subpopulations and used with caution where there is evidence of poor fit.
The purpose of this study was to examine the impact of ethnicity on population risk prediction and not to validate the tool for use within specific ethnic groups. Nonetheless, looking at performance within ethnic groups provides important information about diabetes risk by ethnicity. DPoRT performed well in all ethnic groups (especially in females), with the exception of South Asian males, where even with the inclusion of full ethnic information the algorithm resulted in an under-prediction of diabetes risk.
This may indicate that there is an aspect of risk in this population which is not captured by either the variables in DPoRT or detailed ethnicity. This result is consistent with emerging evidence about the nature of metabolic risk in South Asian males. This population has significantly more insulin resistance than Caucasian populations, even in the absence of excessive obesity (18). It has been proposed that the excessive insulin resistance in Asian Indians could be explained by an abdominal fat distribution which may be genetically determined (45). A recent study looking at detailed radiographic and anthropometric measures in Asian and Caucasian men showed that for a given BMI or waist circumference, South Asian men had approximately 6% higher total body fat than Caucasian men. Other studies have shown that adjustments for BMI or waist circumference to define obesity do not entirely account for possible differences in inherent insulin resistance in the South Asian population (46). Several physiological mechanisms for this occurrence have been proposed, including that South Asian men have a defect in adipose tissue metabolism which occurs independently of obesity or abdominal fat distribution. These abnormalities of adipose tissue metabolism are concomitant with insulin resistance (47). These studies indicate that there may be an important aspect of diabetes risk which is not captured by simply including ethnicity and BMI along with the other predictors of the model. The type of detailed physiological information which may be needed is not captured at the population level, nor is it feasible to include in a tool such as DPoRT. Regardless, these differences did not affect the performance of the model or the validity of overall population estimates of diabetes.
Another difficulty in estimating the ethnicity-diabetes risk among males is the possibility of confounding by physical activity. Immigrant men are more likely to engage in jobs that require physical activity on a daily basis (48), which has been shown to reduce the risk of developing diabetes (49).
This may explain why the full ethnicity algorithm actually performs worse than DPoRT for some ethnic groups. Inclusion of full ethnic information may result in over-fitting of the model. This phenomenon was also seen during DPoRT creation.
Previous studies indicate that weight cut-offs may differ in their associated risk for diabetes within ethnic groups and that different cut-offs should be used to identify those at high risk (50, 51). Our study suggests that as long as additional risk factor differences among ethnic groups are captured in the prediction algorithm, the difference may not actually be as substantial as previously noted. This difference may be due to the fact that previous studies did not fully account for possible confounders, including age and additional metabolic disorders (50). This study examined the interaction between age-specific BMI and ethnicity and found no significant differences.
There are several limitations to consider when interpreting the results of this study. Firstly, the minimal difference detected between DPoRT and the full ethnicity algorithm may not be found in other populations with different ethnic compositions. Secondly, using self-report measures from the health survey is a limitation which could affect predictive risk accuracy, since these measures may be more subject to reporting error and bias than clinical measures. For self-reported height and weight, agreement is generally high; however, validation studies show that weight tends to be slightly underestimated and that height may be slightly overestimated, and as a result reported BMI is generally lower than measured BMI (52-54), which would result in a slight underestimation of predicted risk (Chapter 3.1). Misclassification is also possible with the use of self-reported ethnicity, even though it is the most common means of acquiring ethnic information in epidemiological studies (55). Interestingly, self-reported ethnicity is generally preferred by epidemiologists and federal agencies, such as the US and Canadian Census and the National Center for Health Statistics (56).
This is because self-identification with an ethnicity is most important for studying influences of lifestyle on disease risk. Misclassification due to self-reported ethnicity may be more problematic when examining the genetic associations with disease (57). Another important limitation is the exclusion of an important subpopulation, which compromises the generalizability of this research to all ethnicities in Canada. The cohorts covered in the NPHS and CCHS exclude those living on Aboriginal reserves. Therefore, estimates for First Nations people apply only to those living off-reserve and are not intended to represent First Nations people on-reserve. Previous studies show that First Nations people are at greater risk for diabetes than other members of the Canadian population, including their off-reserve First Nations counterparts (13, 14, 58-60). This is an important component of the population to consider for diabetes prevention, and a population risk algorithm developed specifically for on-reserve populations would be beneficial for estimating the overall diabetes burden in Canada.
Overall there are several key messages to be taken from the results of this study. Firstly, this analysis provides adjusted hazard ratios and risk estimates to quantify the impact of ethnicity on diabetes risk using a prospective population-based cohort study in Ontario. This is the first study that reports 10-year risk and number of cases of diabetes from a prediction model according to ethnicity. These estimates provide key information for predicting diabetes risk at the provincial or national level, particularly in the increasingly multiethnic Canadian population. Secondly, this study shows that DPoRT in its current form is as effective as, or in some cases better than, the algorithm with full ethnic information for predicting diabetes risk at the population level. Furthermore, it also appears to work well within ethnic groups, in particular for women. Though overall model performance was good, analysis by ethnicity shows that further research is required to improve model fit in South Asian males.
This study, consistent with other prediction tools, has reaffirmed that using a measure that has a statistically significant association with a disease is not enough to improve predictive performance of a model (61).

3.2.6 References

1. Wild S et al. Global prevalence of diabetes - Estimates for the year 2000 and projections for 2030. Diabetes Care 2004;27:1047-53.
2. Anderson KM et al. An Updated Coronary Risk Profile - A Statement for Health Professionals. Circulation 1991;83:356-62.
3. Eddy DM, Schlessinger L. Validation of the Archimedes diabetes model. Diabetes Care 2003;26:3102-10.
4. Hanley AJG et al. Prediction of type 2 diabetes using simple measures of insulin resistance - Combined results from the San Antonio Heart Study, the Mexico City Diabetes Study, and the Insulin Resistance Atherosclerosis Study. Diabetes 2003;52:463-9.
5. Ito C et al. Prediction of diabetes mellitus (NIDDM). Diabetes Research and Clinical Practice 1996;34:S7-S11.
6. Herman WH et al. A new and simple questionnaire to identify people at increased risk for undiagnosed diabetes. Diabetes Care 1995;18:382-7.
7. Lindstrom J, Tuomilehto J. The diabetes risk score: a practical tool to predict type 2 diabetes risk. Diabetes Care 2007;26:725-31.
8. World Health Organization. Report of a WHO consultation on obesity, Obesity: preventing and managing the global epidemic. 1998. Geneva, World Health Organization.
9. O'Dea K. Westernization, Insulin Resistance and Diabetes in Australian Aborigines. Medical Journal of Australia 1991;155:258-64.
10. O'Dea K et al. Obesity, Diabetes, and Hyperlipidemia in A Central Australian Aboriginal Community with A Long History of Acculturation. Diabetes Care 1993;16:1004-10.
11. Mokdad AH et al. The continuing epidemics of obesity and diabetes in the United States. JAMA: The Journal of the American Medical Association 2001;286:1195-200.
12. Pavkov ME et al. Changing patterns of type 2 diabetes incidence among Pima Indians. Diabetes Care 2007;30:1758-63.
13. Harris SB et al. The prevalence of NIDDM and associated risk factors in native Canadians. Diabetes Care 1997;20:185-7.

14. Young TK et al. Type 2 diabetes mellitus in Canada's First Nations: status of an epidemic in progress. Canadian Medical Association Journal 2000;163:561-6.
15. Narayan KMV et al. Lifetime risk for diabetes mellitus in the United States. JAMA 2003;290:1884-90.
16. Winkleby MA et al. Ethnic and socioeconomic differences in cardiovascular disease risk factors - Findings for women from the third national health and nutrition examination survey, 1988-1994. JAMA: The Journal of the American Medical Association 1998;280:356-62.
17. Brancati FL et al. Diabetes mellitus, race, and socioeconomic status - A population based study. Annals of Epidemiology 1996;6:67-73.
18. Abate N, Chandalia M. Ethnicity and type 2 diabetes - Focus on Asian Indians. Journal of Diabetes and Its Complications 2001;15:320-7.
19. Ramachandran A et al. Risk of noninsulin dependent diabetes mellitus conferred by obesity and central adiposity in different ethnic groups: A comparative analysis between Asian Indians, Mexican Americans and Whites. Diabetes Research and Clinical Practice 1997;36:121-5.
20. Ramachandran A et al. Rising prevalence of NIDDM in an urban population in India. Diabetologia 1997;40:232-7.
21. Manuel D, Schultz S. Diabetes in Ontario: An ICES Practice Atlas. Hux, J. E., Booth, G. L., Slaughter, P. M., and Laupacis, A. 4.77-4.94. 2003. Toronto, Institute for Clinical and Evaluative Sciences.
22. Calonge N et al. Screening for type 2 diabetes mellitus in adults: U.S. Preventive Services Task Force recommendation statement. Annals of Internal Medicine 2008;148:846-U63.
23. Norris SL et al. Screening adults for type 2 diabetes: A review of the evidence for the U.S. Preventive Services Task Force. Annals of Internal Medicine 2008;148:855-U82.
24. Newbold KB, Danforth J. Health status and Canada's immigrant population. Social Science & Medicine 2003;57:1981-95.
25. Statistics Canada. Canadian Community Health Survey Methodological Overview. Health Reports 2002;13:9-14.
26. Canadian Community Health Survey 2000–2001. Ottawa: 2003.
27. Hux JE, Ivis F. Diabetes in Ontario. Diabetes Care 2005;25:512-6.


28. Lipscombe LL, Hux JE. Trends in diabetes prevalence, incidence, and mortality in Ontario, Canada 1995-2005: a population-based study. Lancet 2007;369:750-6.
29. Health Canada. Responding to the Challenge of Diabetes in Canada. 2003. Ottawa, ON.
30. Statistics Canada. 1996-7 National Population Health Survey: Derived Variable Specifications. 1999. Ottawa.
31. Statistics Canada. Population and Dwelling Counts, for Census Divisions, Census Subdivisions (Municipalities) and Designated Places, 2001 and 1996. 2001.
32. Odell PM, Anderson KM, Kannel WB. New Models for Predicting Cardiovascular Events. Journal of Clinical Epidemiology 1994;47:583-92.
33. Farewell VT, Prentice RL. A study of distributional shape in life testing. Technometrics 1977;19:69-75.
34. Yeo D, Mantel H, Lui TP. Bootstrap variance estimation for the National Population Health Survey. 778-783. 1999. Baltimore, American Statistical Association.
35. Kovacevic MS, Mach L, Roberts G. Bootstrap variance estimation for predicted individual and population-average risks. Proceedings of the Survey Research Methods Section. 2008. American Statistical Association.
36. Berg AO et al. Screening for type 2 diabetes mellitus in adults: Recommendations and rationale. Annals of Internal Medicine 2003;138:212-4.
37. Pencina M, D'Agostino RB. Overall C as a measure of discrimination in survival analysis: model specific population value and confidence interval estimation. Statistics in Medicine 2004;23:2109-23.
38. Campbell G. General Methodology I: Advances in statistical methodology for the evaluation of diagnostic and laboratory tests. Statistics in Medicine 2004;13:499-508.
39. D'Agostino RB et al. Validation of the Framingham Coronary Disease Prediction Scores. JAMA 2001;286:180-7.
40. Nam B-H. Discrimination and Calibration in Survival Analysis: Extension of the ROC Curve for Discrimination and Chi-square test for Calibration. 2000. Boston University.
41. Lloyd-Jones DM et al. Narrative review: Assessment of C-reactive protein in risk prediction for cardiovascular disease. Annals of Internal Medicine 2006;145:35-42.
42. Wilson PWF et al. C-reactive protein and risk of cardiovascular disease in men and women from the Framingham Heart Study. Archives of Internal Medicine 2005;165:2473-8.

43. Pepe MS et al. Limitations of the odds ratio in gauging the performance of a diagnostic, prognostic, or screening marker. American Journal of Epidemiology 2004;159:882-90.
44. Levin ML, Bertell R. Re - Simple Estimation of Population Attributable Risk from Case-Control Studies. American Journal of Epidemiology 1978;108:78-9.
45. Banerji MA et al. Body composition, visceral fat, leptin, and insulin resistance in Asian Indian men. Journal of Clinical Endocrinology and Metabolism 1999;84:137-44.
46. Chandalia M et al. Relationship between generalized and upper body obesity to insulin resistance in Asian Indian men. Journal of Clinical Endocrinology and Metabolism 1999;84:2329-35.
47. Abate N et al. Adipose tissue metabolites and insulin resistance in nondiabetic Asian Indian men. Journal of Clinical Endocrinology and Metabolism 2004;89:2750-5.
48. Norman A et al. Total physical activity in relation to age, body mass, health and other factors in a cohort of Swedish men. International Journal of Obesity 2002;26:670-5.
49. Qi L, Hu FB, Hu G. Genes, environment, and interactions in prevention of type 2 diabetes: A focus on physical activity and lifestyle changes. Current Molecular Medicine 2008;8:519-32.
50. Barba C et al. Appropriate body-mass index for Asian populations and its implications for policy and intervention strategies. Lancet 2004;363:157-63.
51. Diaz VA et al. How does ethnicity affect the association between obesity and diabetes? Diabetic Medicine 2007;24:1199-204.
52. Nawaz H et al. Self-reported weight and height: implications for obesity research. Journal of Preventive Medicine 2001;20:294-8.
53. Rowland M. Self-reported height and weight. American Journal of Clinical Nutrition 2007;52:1125-33.
54. Shields M, Gorber SC, Tremblay MS. Estimates of obesity based on self-report versus direct measures. Health Reports 2008;19:1-16.
55. Comstock RD, Castillo EM, Lindsay SP. Four-year review of the use of race and ethnicity in epidemiologic and public health research. American Journal of Epidemiology 2004;159:611-9.
56. Gomez SL et al. Inconsistencies between self-reported ethnicity and ethnicity recorded in a health maintenance organization. Annals of Epidemiology 2005;15:71-9.
57. Burchard EG et al. The importance of race and ethnic background in biomedical research and clinical practice. New England Journal of Medicine 2003;348:1170-5.

Doctor of Philosophy Epidemiology (PhD) Dissertation Laura C.A. Rosella 58. Green C et al. The epidemiology of diabetes in the Manitoba-registered first nation population - Current patterns and comparative trends. Diabetes Care 2003;26:1993-8. 59. Horn OK et al. Incidence and prevalence of type 2 diabetes in the first nation community of Kahnawa : ke, Quebec, Canada, 1986-2003. Canadian Journal of Public Health-Revue Canadienne de Sante Publique 2007;98:438-43. 60. Kaler SN et al. High rates of the metabolic syndrome in a first nations community in western Canada: Prevalence and determinants in adults and children. International Journal of Circumpolar Health 2006;65:389-402. 61. Pencina MJ et al. Evaluating the added predictive ability of a new marker: From area under the ROC curve to reclassification and beyond. Statistics in Medicine 2008;27:157-72.

125

4. Determining predictors of body mass index and change in body mass index from 1994 – 2004 using a multilevel growth model

4.1 Abstract

Background: Increases in obesity and body weight are major contributors to chronic disease around the world, particularly type 2 diabetes. Understanding the determinants and trajectories of weight change is an important aspect of preventing chronic disease through obesity reduction. Few studies have examined change in Body Mass Index (BMI) over time and its association with lifestyle and demographic factors using longitudinal population-based data.

Objective: To understand the predictors of BMI and BMI trajectories in the Canadian population over a 10-year period.

Methods: This study uses a population-based sample of 14,123 adults in cycles 1 – 6 of the longitudinal National Population Health Survey. BMI (at baseline and over time) was modeled separately for men and women using multilevel growth models with random and fixed effects. Demographic and lifestyle variables were investigated for their association with these BMI outcomes, with lifestyle variables modeled in their time-dependent form.

Results: The multilevel analysis showed that age and initial BMI were associated with higher BMI, whereas increased physical activity, immigrant status, and smoking were negatively associated with BMI. Those who were older and those who had higher BMI had significantly lower rates of BMI increase over time. Female immigrants were less likely to increase BMI over time compared to non-immigrants. Adjusting for all factors in the model, an important interaction between income and sex was found: compared to those with higher income, low-income males had lower BMI, whereas the opposite was true for females.

Conclusions: Lifestyle and demographic factors are associated with BMI and BMI change over time. Longitudinal data and appropriate analytic techniques, such as multilevel growth models, are crucial to describing these effects accurately.

4.2 Introduction

Few risk factors are as influential in the development of a disease as increased Body Mass Index (BMI) is in the development of type 2 diabetes (1-5). The past and projected increases in diabetes incidence throughout the world have been attributed mainly to the increasing incidence of obesity in the population (6). From a primary prevention perspective, understanding the factors that influence excess weight is integral to planning and implementing effective diabetes prevention strategies. Several cross-sectional studies have examined correlates of BMI; however, longitudinal studies of weight change are less common. Change in self-reported weight in Canada has previously been analyzed using ordinary least squares (OLS) regression (7). In that study, Orpana et al. found, on average, a trend of weight gain in the Canadian population from 1996/7 to 2004/5 and recommended further research to identify the correlates and causes of this trend. However, as stated by the authors of that study, OLS is not as efficient as other statistical methods because it does not exploit all the information present in longitudinal data. Accordingly, this study uses longitudinal data and multilevel growth modeling techniques to expand on previous research and achieve a better understanding of the determinants of weight and weight change in the Canadian population.

Hierarchical or multilevel growth models can be used to model individual change over time in an intuitive and flexible way. These analytic methods extend the concept of multilevel modeling, taking into account the hierarchical structure of the data, such that individuals (level-1) are nested into groups (level-2). Multilevel growth models treat level-1 variables as within-person differences over time and level-2 variables as between-person differences independent of time (8). In addition to identifying the factors that influence BMI at each time point, multilevel growth models allow one to study the factors that influence BMI change. Compared to other models, this technique exploits all of the available data by using the variability that exists within and between individuals to inform estimates of associations. In doing so, this approach uses longitudinal data to model the predictors of weight and the predictors of weight change distinctly, yet efficiently, in one model. Using Canadian population-based longitudinal data, this study employs multilevel growth models to assess predictors of BMI and BMI change over time.

4.3 Methods

Study Population

This analysis used the longitudinal National Population Health Survey (NPHS), a nationally representative longitudinal household survey conducted by Statistics Canada (9). The NPHS longitudinal sample contains 17,276 persons of all ages, sampled in 1994/1995. These same persons were interviewed at regular cycles (i.e., every two years) and will continue to be interviewed up to a total period of 18 years, i.e. 10 cycles. This study uses the first 6 cycles of the survey, up to 2004/5, covering a 10-year period. Households were selected through stratified, multilevel cluster sampling of private residences using local planning regions as the primary sampling unit, excluding residents of Indian reserves, long-term care institutions, prisons, and remote areas, and Foreign Service personnel. The survey design was a two-stage probability sample. The overall response rates were: cycle 1, 83.6%; cycle 2, 92.8%; cycle 3, 88.2%; cycle 4, 84.8%; cycle 5, 80.6%; and cycle 6, 77.4%. The cumulative attrition rates (i.e. those who did not complete the questionnaire in all 6 cycles) were: cycle 2, 9.3%; cycle 3, 15.4%; cycle 4, 21.3%; cycle 5, 27.3%; and cycle 6, 32.7%. The most significant causes of attrition were inability to trace and refusal. Average non-response for questionnaire items was < 0.1%. Nevertheless, the methods used in this study do not require respondents to have complete data for all waves of the survey. Population weights reflecting the population characteristics were computed based on selection probabilities and post-stratification adjustments. Subjects with missing BMI at the first wave of data collection and those under the age of 18 were excluded from this analysis.

Variables

Self-reported measures of exposures (highest level of education achieved, income, rural status, chronic conditions, ethnicity, physical activity, and smoking) were obtained from the NPHS at every cycle. The outcome, BMI, was based on self-reported height and weight at each cycle. BMI is the most commonly used metric of relative weight and is calculated as weight [in kilograms (kg)] divided by height [in meters (m)] squared (kg/m2). Baseline BMI (BMI at the first wave of data collection) was included in the model in order to assess its effect on BMI change. Categories of physical activity were based upon a Physical Activity Index (PAI). The PAI is based on an individual's leisure-time metabolic energy expenditure (EE). EE is calculated using the frequency and duration of several physical activities, as well as their metabolic equivalent (MET) values. The MET is the energy cost of the activity expressed as kilocalories (kcal) expended per kg of body weight per hour of activity. A PAI < 1.5 kcal/kg/day was considered inactive, 1.5 ≤ PAI < 3.0 kcal/kg/day was considered moderately active, and PAI ≥ 3.0 kcal/kg/day was considered active (10). Income was assessed using income adequacy, which is calculated as the dollar distance between the individual's gross household income in the past 12 months and the low-income cut-off, calculated annually to reflect inflation and adjusted for household size.
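The two derived variables described above can be sketched in a few lines of Python. The BMI formula and the PAI cut-points are taken from the text; the energy expenditure formula (annual frequency times duration in hours times MET value, summed over activities and divided by 365), the MET values, and all function names are assumptions made for illustration and are not the Statistics Canada derivation.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


def daily_energy_expenditure(activities):
    """Leisure-time energy expenditure in kcal/kg/day.

    `activities` is a list of (times_per_year, hours_per_session, met_value)
    tuples. Summing frequency * duration * MET and dividing by 365 is an
    assumed formula used here for illustration only.
    """
    return sum(n * hours * met for n, hours, met in activities) / 365.0


def pai_category(ee):
    """Classify energy expenditure using the cut-points given in the text."""
    if ee < 1.5:
        return "inactive"
    if ee < 3.0:
        return "moderate"
    return "active"


# Hypothetical respondent: walking 4 times a week for 30 minutes (MET 3.0)
# and swimming once a week for 1 hour (MET 6.0).
example_activities = [(208, 0.5, 3.0), (52, 1.0, 6.0)]
ee = daily_energy_expenditure(example_activities)
print(round(bmi(80.0, 1.75), 1), round(ee, 2), pai_category(ee))
```

Keeping the cut-points in one function makes it easy to reclassify each respondent at every cycle, which is what a time-varying physical activity variable requires.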

Statistical Analysis

Descriptive statistics on baseline characteristics and average BMI at each survey cycle were reported as means (for continuous variables) and percentages (for categorical variables). As part of the descriptive and exploratory analysis, BMI change (expressed as a slope) and BMI at the first wave of data collection (expressed as an intercept) were calculated using OLS regression. For each individual, a regression line was fit using BMI values at each time point; the slopes and intercepts were then averaged across individuals. Unadjusted BMI and rate of BMI change were compared across sociodemographic and lifestyle factors and tested using a t-test or one-way ANOVA, depending on whether the variable was categorical or continuous. All descriptive statistics were calculated using survey weights to accurately reflect the demographics of the Canadian population and account for the survey sampling design. Significance and variance estimates were calculated using a bootstrap method (11).

The multilevel growth model was fit to the longitudinal data using PROC MIXED in SAS (8), including both fixed and random effects. A random effect describes an estimate in the model that is generated from a subset of all possible subjects in the population; therefore, when a random effect is estimated, the variance for that parameter is also estimated. The multilevel growth model allows the user to examine both within-individual factors and systematic differences in growth trajectories that occur across groups. The model addresses the within- and between-individual variation in two stages and can include random effects for variation in slopes and intercepts between individuals (12, 13). This allows the variances of these parameters to be estimated. The models are described in detail below.

The level-1 model describes within-person variation in BMI. This model can be written as:

(I) BMIij = π0i + π1i(WAVEij) + εij

where
i = individual and j = measurement occasion or wave (j = 0 to 5, one value for each wave of data);
π0i is the intercept of individual i's BMI trajectory;
π1i is the slope or rate of change of individual i's BMI (i.e. average rate of change across cycles);
εij is the deviation of individual i's BMI from his or her linear trajectory on occasion j (wave);
BMIij is subject i's BMI value at time j (j goes from 0 – 5).

Level-1 equations predict the smallest data unit, which in this study is the BMI-time value. For example, if 10 subjects are measured 6 times there are 60 level-1 units. Variables that change over time (time-varying covariates) may also enter into level-1 equations (Figure 1). In this study, the time variable (WAVE) was tested in both its linear and quadratic form to determine which functional form was most appropriate for the data.

The level-2 model represents the between-person differences in the change trajectories (intercept and slope of BMI over time) and the time-invariant characteristics of the individual. The level-1 model only allows people to vary in the values of their individual growth parameters, whereas the level-2 model attributes differences in individuals' slopes or intercepts to predictors, such that individuals who share common predictor values should, on average, vary only according to their individual change trajectories. The level-2 or between-person model represents the association between a predictor (shown here as one dichotomous predictor, Xi = 1 or 0) and each subject's estimated initial BMI (i.e. intercept) and rate of change over time (i.e. slope):

π0i = γ00 + γ01Xi + δ0i (Xi is the predictor of initial BMI)
π1i = γ10 + γ11Xi + δ1i (Xi is the predictor of rate of change of BMI)

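where

\[
\begin{pmatrix} \delta_{0i} \\ \delta_{1i} \end{pmatrix}
\sim N\left(
\begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} \sigma_{0}^{2} & \sigma_{01}^{2} \\ \sigma_{10}^{2} & \sigma_{1}^{2} \end{pmatrix}
\right)
\]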

The γs represent the structural or fixed effects of the model. The fixed effects capture systematic inter-individual differences such as age, ethnicity, or immigrant status. In the above model:

γ00 = mean population initial BMI when Xi = 0
γ01 = effect of covariate Xi on initial average BMI
γ10 = mean population rate of change (slope) of BMI when Xi = 0
γ11 = effect of covariate Xi on rate of change (slope) of BMI

γ00 and γ10 are the level-2 intercepts, which represent the mean initial BMI (intercept) and mean rate of change of BMI (slope) in the population. δ0i and δ1i are level-2 error variables (residuals) that represent the stochastic variation in the individual growth models, allowing each individual's growth parameters (slope and intercept) to differ from the population average (see Figure 2). In other words, δ0i and δ1i represent the between-individual stochastic variation in the level-1 growth parameters (slope and intercept). Specifically:

δ0i = difference between population average initial BMI and individual i's initial BMI
δ1i = difference between population average rate of change of BMI and individual i's rate of change of BMI

δ0i and δ1i represent the unexplained variation in initial BMI and BMI rate of change. Their variances and covariance, defined by the matrix above (σ₀², σ₁² and σ₀₁²), represent the population variation in the individual intercepts and slopes around the mean initial BMI and the mean rate of change:

σ₀² = population variance in initial BMI
σ₁² = population variance in BMI rate of change
σ₀₁² = population covariance between baseline BMI and BMI rate of change

Interpreting these values (σ₀², σ₁² and σ₀₁²) allows us to comment on how much heterogeneity exists in the change parameters after accounting for the variables in the model.

Figure 1. Graphical representation of physical activity as a time-varying and time-invariant predictor.

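[Figure 1: BMI (y-axis) plotted against wave (x-axis, waves 1 to 3) for active, moderately active, and inactive individuals, with physical activity shown in its time-varying and time-invariant forms.]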


Figure 2. Ten randomly selected growth trajectories from individuals in the cohort to represent variation in growth trajectories between individuals in the cohort.


In growth models, level-1 units typically represent time and are nested within individuals. Using the earlier example of 10 subjects measured 6 times, we would have 10 level-2 units. The full joint multilevel growth model can therefore be written as:

BMIij = γ00 + γ01Xi + γ10(WAVEij) + γ11Xi(WAVEij) + δ0i + δ1i(WAVEij) + εij

where γ01 describes the effect of covariate X on initial BMI and γ11 describes the effect of covariate X on the rate of change of BMI.
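The substitution that produces the joint model can be written out step by step. The block below is a sketch in LaTeX that uses only the symbols defined in the level-1 and level-2 models. The final variance line additionally assumes that the level-1 residual is independent of the level-2 residuals and has constant variance σε², and it keeps the thesis notation in which the intercept-slope covariance is written with a superscript 2.

```latex
\begin{align*}
\mathrm{BMI}_{ij}
  &= \pi_{0i} + \pi_{1i}\,\mathrm{WAVE}_{ij} + \varepsilon_{ij} \\
  &= (\gamma_{00} + \gamma_{01}X_i + \delta_{0i})
     + (\gamma_{10} + \gamma_{11}X_i + \delta_{1i})\,\mathrm{WAVE}_{ij}
     + \varepsilon_{ij} \\
  &= \underbrace{\gamma_{00} + \gamma_{01}X_i + \gamma_{10}\,\mathrm{WAVE}_{ij}
     + \gamma_{11}X_i\,\mathrm{WAVE}_{ij}}_{\text{fixed part}}
     + \underbrace{\delta_{0i} + \delta_{1i}\,\mathrm{WAVE}_{ij}
     + \varepsilon_{ij}}_{\text{random part}} \\[4pt]
\operatorname{Var}\!\left(\mathrm{BMI}_{ij} \mid X_i, \mathrm{WAVE}_{ij}\right)
  &= \sigma_{0}^{2} + 2\,\mathrm{WAVE}_{ij}\,\sigma_{01}^{2}
     + \mathrm{WAVE}_{ij}^{2}\,\sigma_{1}^{2} + \sigma_{\varepsilon}^{2}
\end{align*}
```

The last line shows why a random slope matters: once δ1i is in the model, the variance of BMI is allowed to change from wave to wave rather than being held constant.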

Multilevel growth modeling makes efficient use of intra-individual and inter-individual variation (thus accounting for clustering within subjects), since coefficients are iteratively estimated at both levels. Although attrition can be a significant problem with longitudinal data, multilevel models are able to accommodate missing or unbalanced data. The underlying assumption in multilevel growth models is that each individual's observed data are a random sample from their underlying growth trajectory, allowing trajectories to be estimated even without complete data (14). The model can also handle between-individual variation in the timing and frequency of measurements because the timing of each measurement occasion is treated as an explanatory variable in the model; therefore, data from individuals with different measurement patterns (e.g. some measured only once and others at several irregularly spaced intervals) can be analyzed simultaneously in the same model. Each BMI trajectory is a combination of the observed trajectory for the individual, based on the measured waves, and the model-based trajectory, based on the predictor variables.
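As an illustration of the person-period data structure and of the exploratory per-person OLS step described earlier, the sketch below simulates unbalanced data from a joint model of this form and averages individual intercepts and slopes. Every number in it (coefficients, variances, the 20% chance of a missed wave) is an arbitrary assumption chosen for illustration and is not an estimate from the NPHS. The two-step averaging shown here is not the PROC MIXED fit, which estimates both levels jointly.

```python
import random


def simulate_person_period(n_people=2000, n_waves=6, p_missing=0.2, seed=1):
    """Simulate long-format rows (person, x, wave, bmi) from
    BMI_ij = g00 + g01*X_i + (g10 + g11*X_i)*WAVE_ij + d0_i + d1_i*WAVE_ij + e_ij.
    All parameter values are illustrative assumptions."""
    rng = random.Random(seed)
    g00, g01, g10, g11 = 25.0, -0.5, 0.20, -0.05
    rows = []
    for i in range(n_people):
        x = i % 2                      # one dichotomous level-2 predictor
        d0 = rng.gauss(0.0, 3.0)       # person-specific intercept deviation
        d1 = rng.gauss(0.0, 0.10)      # person-specific slope deviation
        for wave in range(n_waves):    # waves coded 0..5 as in the level-1 model
            if wave > 0 and rng.random() < p_missing:
                continue               # unbalanced data: this wave was missed
            e = rng.gauss(0.0, 0.5)    # level-1 residual
            bmi = g00 + g01 * x + (g10 + g11 * x) * wave + d0 + d1 * wave + e
            rows.append((i, x, wave, bmi))
    return rows


def ols_line(points):
    """Least-squares intercept and slope for a list of (wave, bmi) pairs."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_t, slope


def average_trajectories(rows):
    """Average per-person OLS intercepts and slopes within each level of X."""
    by_person = {}
    for person, x, wave, bmi in rows:
        by_person.setdefault((person, x), []).append((wave, bmi))
    sums = {}
    for (person, x), points in by_person.items():
        if len(points) < 2:
            continue                   # a line needs at least two occasions
        intercept, slope = ols_line(points)
        s = sums.setdefault(x, [0.0, 0.0, 0])
        s[0] += intercept
        s[1] += slope
        s[2] += 1
    return {x: (s[0] / s[2], s[1] / s[2]) for x, s in sums.items()}


rows = simulate_person_period()
averages = average_trajectories(rows)
for x in sorted(averages):
    print(f"X={x}: mean intercept {averages[x][0]:.2f}, mean slope {averages[x][1]:.3f}")
```

Note that the per-person step has to drop anyone observed only once, whereas a multilevel model can still use that single observation to inform the intercept estimates.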


The following covariance structures were examined and compared to determine the best fit to the data: Unstructured (UN), Compound Symmetry (CS), first-order Autoregressive (AR(1)), and Heterogeneous AR(1) (ARH(1)). The final covariance structure was chosen based on the most favorable information criteria [Akaike's Information Criterion (AIC) and the Bayesian Information Criterion (BIC)], log-likelihood statistics, and number of iterations. Standard diagnostics with level-1 and level-2 residuals were performed to check the model assumptions of normality and homoscedasticity.

As mentioned in the study population section, NPHS households were selected through stratified, multilevel cluster sampling of private residences using local planning regions as the primary sampling unit. To account for the way that respondents were sampled, normalized weights that account for selection probabilities were used. Normalized weights represent the population bootstrap weights provided by Statistics Canada (described in the study population section) divided by the total sample size; accordingly, the sample size is not inflated but the differential probabilities associated with the survey design are accounted for. All statistical analyses were carried out using SAS version 9.2 (SAS Institute, Inc, Cary, NC).
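Comparing covariance structures by information criteria amounts to a small calculation once each model's log-likelihood is known. The sketch below uses the textbook definitions AIC = -2 log L + 2k and BIC = -2 log L + k ln(n). The log-likelihoods, parameter counts, and n are invented placeholders rather than values from this analysis, and SAS PROC MIXED may count parameters and n differently depending on the estimation method.

```python
import math


def aic(log_lik, n_params):
    """Akaike information criterion (smaller is better)."""
    return -2.0 * log_lik + 2.0 * n_params


def bic(log_lik, n_params, n_obs):
    """Bayesian information criterion (smaller is better)."""
    return -2.0 * log_lik + n_params * math.log(n_obs)


# Placeholder fit summaries: (log-likelihood, number of covariance parameters).
# These numbers are invented for illustration only.
fits = {
    "CS": (-10250.0, 2),
    "AR(1)": (-10190.0, 2),
    "ARH(1)": (-10120.0, 7),
    "UN": (-10105.0, 21),
}
n_obs = 6000  # hypothetical number of observations

table = {name: (aic(ll, k), bic(ll, k, n_obs)) for name, (ll, k) in fits.items()}
best_aic = min(table, key=lambda name: table[name][0])
best_bic = min(table, key=lambda name: table[name][1])
for name, (a, b) in table.items():
    print(f"{name:7s} AIC={a:10.1f} BIC={b:10.1f}")
print("lowest AIC:", best_aic, "| lowest BIC:", best_bic)
```

Because BIC penalizes each extra parameter more heavily than AIC for any reasonably large n, the two criteria can favor different structures, which is one reason to consider secondary criteria such as log-likelihood and iteration count as well.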


4.4 Results

There were 17,276 respondents in the first wave of the longitudinal NPHS. Those under the age of 18 (n = 3,159) and those without BMI information in wave 1 or 2 (n = 354) were excluded from the cohort. The final sample size was 14,123: 7,496 women and 6,627 men (Figure 3).

Figure 3. Cohort development.

[Figure 3: flow diagram. Starting cohort N = 17,276; 3,159 excluded (< 18 yrs); 354 excluded (no BMI in wave 1 or 2); N = 13,763; final cohort N = 7,496 females and N = 6,627 males.]

Baseline characteristics of the cohort are shown in Table 1. Males had an average baseline BMI of 25.9 kg/m2 and females 24.7 kg/m2. Male BMI increased on average by 4.4% and female BMI increased on average by 5.3% over the 10-year period (Figure 4). OLS regression analysis revealed that baseline BMI and average change in BMI were associated with a number of population characteristics (Tables 2 & 3). Higher levels of physical activity and smoking status were associated with lower baseline BMI; however, these factors also tended to be associated with higher rates of BMI increase over time (not statistically significant). Low income status was negatively associated with baseline BMI in men, but the opposite was true for women. In a cross-sectional analysis at the mid-point of the study (2003), income quintile was not strongly associated with the percentage of individuals who reached the obese cutoff (BMI ≥ 30 kg/m2) in men; however, the likelihood of being obese was strongly inversely related to income in women (Figure 5).

Table 1. Baseline means and proportions (standard deviations) of characteristics at wave 1 (1994-5), by sex.

| Variable | Males (N = 6,627) | Females (N = 7,496) |
|---|---|---|
| BMI (kg/m2) | 25.9 (0.1) | 24.7 (0.1) |
| Age (yrs) | 43 (0.2) | 45 (0.2) |
| Immigrant (%) | 19.5 (7.3) | 20.0 (6.5) |
| Low Income (%) | 13.1 (5.5) | 19.0 (6.3) |
| Current Smoker (%) | 32.7 (7.6) | 28.4 (6.9) |
| Education (%) | | |
| < Secondary School | 24.7 (9.1) | 25.4 (8.9) |
| Secondary School Grad | 15.2 (6.3) | 17.3 (6.1) |
| Post-secondary School Grad | 60.1 (8.8) | 57.3 (8.3) |

Figure 4. Mean unadjusted Body Mass Index (BMI) (+/- standard error) across time [National Population Health Survey (NPHS) 2-year cycles], by sex.

[Figure 4: line plot of mean BMI (kg/m2, y-axis, 24 to 28) by cycle (x-axis, 1 to 6) for males and females. Plotted means rise from 25.9 to 27.1 kg/m2 in males and from 24.7 to 26.0 kg/m2 in females.]

Table 2. Average BMI at the first wave of data collection (i.e. baseline BMI; intercepts from OLS) (+/- standard error) and tests for significant differences between cohort characteristics, by sex. For each characteristic, the reference row shows the mean (se) and the remaining rows show the difference from the reference (se).

| Characteristic | Males (N = 6,627): mean or difference from reference (se)* | Males: P-value | Females (N = 7,496): mean or difference from reference (se)* | Females: P-value |
|---|---|---|---|---|
| Overall | 25.88 (0.07) | | 24.70 (0.07) | |
| Age | | | | |
| 18-30 | 24.48 (0.12) | (ref) | 23.18 (0.14) | (ref) |
| 31-60 | +1.86 (0.14) | <0.0001 | +1.88 (0.18) | <0.0001 |
| ≥ 61 | +1.54 (0.20) | <0.0001 | +2.25 (0.20) | <0.0001 |
| Low Income | | | | |
| No | 25.93 (0.07) | (ref) | 24.56 (0.07) | (ref) |
| Yes | -0.59 (0.19) | 0.0026 | +0.57 (0.19) | 0.0027 |
| Immigrant | | | | |
| No | 25.92 (0.08) | (ref) | 24.75 (0.07) | (ref) |
| Yes | -0.48 (0.20) | 0.0155 | -0.22 (0.19) | 0.2842 |
| Physical Activity | | | | |
| Low | 25.97 (0.09) | (ref) | 24.97 (0.09) | (ref) |
| Medium | -0.27 (0.17) | 0.1139 | -0.48 (0.18) | 0.0058 |
| High | -0.38 (0.16) | 0.0152 | -1.16 (0.19) | <0.0001 |
| Education | | | | |
| < Secondary School | 26.12 (0.13) | (ref) | 25.63 (0.15) | (ref) |
| Secondary School Grad | -0.15 (0.23) | 0.6139 | -0.94 (0.25) | 0.002 |
| Post-secondary School Grad | -0.49 (0.17) | 0.0047 | -1.35 (0.18) | <0.0001 |
| Marital Status | | | | |
| Married | 26.44 (0.08) | (ref) | 25.17 (0.10) | (ref) |
| Common-law | -1.22 (0.22) | <0.0001 | -0.16 (0.31) | <0.0001 |
| Single | -2.00 (0.21) | <0.0001 | -2.05 (0.18) | <0.0001 |
| Separated or Divorced | -0.60 (0.21) | 0.0039 | -0.06 (0.24) | <0.0001 |
| Widowed | -0.65 (0.78) | 0.4072 | +0.15 (0.22) | 0.7904 |
| Current Smoker | | | | |
| No | 26.03 (0.08) | (ref) | 24.92 (0.09) | (ref) |
| Yes | -0.66 (0.14) | <0.0001 | -0.76 (0.15) | <0.0001 |
| Any chronic condition | | | | |
| Yes | 25.60 (0.11) | (ref) | 23.94 (0.12) | (ref) |
| No | +0.42 (0.13) | 0.0015 | +1.28 (0.15) | <0.0001 |

* se: standard error - all standard errors are computed using survey bootstrap weights

Table 3. Average change in BMI (+/- standard error) (i.e. slopes from OLS) and tests for significant differences between cohort characteristics, by sex. For each characteristic, the reference row shows the mean (se) and the remaining rows show the difference from the reference (se).

| Characteristic | Males: mean or difference from reference (se) | Males: P-value | Females: mean or difference from reference (se) | Females: P-value |
|---|---|---|---|---|
| Overall | 0.206 (0.01) | | 0.221 (0.01) | |
| Age | | | | |
| 18-30 | 0.365 (0.02) | (ref) | 0.327 (0.02) | (ref) |
| 31-60 | -0.186 (0.03) | <0.0001 | -0.108 (0.02) | <0.0001 |
| ≥ 61 | -0.414 (0.04) | <0.0001 | -0.420 (0.03) | <0.0001 |
| Quartile of baseline BMI | | | | |
| Q1 | 0.339 (0.03) | (ref) | 0.265 (0.01) | (ref) |
| Q2 | -0.109 (0.03) | 0.0005 | -0.055 (0.02) | 0.0150 |
| Q3 | -0.186 (0.03) | <0.0001 | -0.131 (0.02) | <0.0001 |
| Q4 | -0.272 (0.03) | <0.0001 | -0.225 (0.03) | <0.0001 |
| Low Income | | | | |
| No | 0.207 (0.01) | (ref) | 0.193 (0.01) | (ref) |
| Yes | -0.007 (0.03) | 0.8375 | -0.050 (0.03) | 0.0463 |
| Immigrant | | | | |
| No | 0.191 (0.03) | (ref) | 0.187 (0.07) | (ref) |
| Yes | -0.058 (0.20) | 0.0750 | -0.057 (0.19) | 0.0198 |
| Physical Activity | | | | |
| Low | 0.162 (0.013) | (ref) | 0.1588 (0.02) | (ref) |
| Medium | +0.047 (0.02) | 0.044 | +0.038 (0.02) | 0.0913 |
| High | +0.038 (0.03) | 0.145 | +0.155 (0.03) | 0.0091 |
| Education | | | | |
| < Secondary School | 0.138 (0.02) | (ref) | 0.063 (0.02) | (ref) |
| Secondary School Grad | +0.008 (0.03) | 0.7963 | +0.140 (0.03) | <0.0001 |
| Post-secondary School Grad | +0.070 (0.02) | 0.0040 | +0.155 (0.02) | <0.0001 |
| Marital Status | | | | |
| Married | 0.130 (0.01) | (ref) | 0.152 (0.03) | (ref) |
| Common-law | +0.101 (0.03) | <0.0001 | +0.145 (0.04) | 0.0007 |
| Single | +0.035 (0.04) | 0.0017 | +0.153 (0.04) | <0.0001 |
| Separated or Divorced | +0.130 (0.04) | <0.0001 | +0.09 (0.03) | 0.0232 |
| Widowed | -0.017 (0.03) | 0.4115 | -0.207 (0.03) | <0.0001 |
| Current Smoker | | | | |
| No | 0.161 (0.02) | (ref) | 0.151 (0.01) | (ref) |
| Yes | +0.059 (0.02) | 0.0029 | +0.091 (0.02) | <0.0001 |
| Any chronic condition | | | | |
| Yes | 0.222 (0.01) | (ref) | 0.2120 (0.01) | (ref) |
| No | -0.080 (0.02) | 0.0015 | -0.062 (0.02) | 0.0007 |

se: standard error - all standard errors are computed using survey bootstrap weights

[Figure 5: percentage of individuals who were obese (y-axis, % Obese, 10 to 24) by income adequacy quintile (x-axis: Lowest, Lower-Middle, Middle, Upper Middle, Highest) for males (p < 0.1109) and females (p < 0.0001).]

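Figure 6. Conceptual diagram of the multilevel growth model showing variables in the model at each level.

[Figure 6: BMI (y) is modeled from an intercept (π0i), wave (π1i), and time-varying covariates (π2i - π8i: physical activity, smoking status, marital status). Level-2 predictors of the intercept: age (γ01), baseline BMI (γ02), low income (γ03), immigrant (γ04), ethnicity (γ05). Level-2 terms for wave: intercept (γ10), age (γ11), baseline BMI (γ12). Variance components: σ₀², σ₁², σε².]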

The conceptual diagram of the multilevel growth model can be seen in Figure 6 and the results from the model are shown in Table 4. All four covariance structures (CS, ARH(1), AR(1), and UN) produced similar results. However, AIC, BIC and log-likelihood statistics showed that ARH(1) and UN produced the best fit. Since iteration time was shorter for UN, this covariance structure was chosen for subsequent analyses. Quadratic forms of time were considered but were found not to be statistically significant and not to improve the model properties; therefore, the linear form of time was used in the model. All variables were assessed for their variability over time. Physical activity, marital status, and smoking status varied across waves and thus were kept in their time-varying form. Income adequacy was kept in its time-invariant form because of its minimal variation over time and in order to assess its effect in the level-2 equations.

The multilevel growth model shown in Table 4 reveals several important predictors of BMI and its trajectory. The full specification of the multilevel model for males is shown below.

Level 1 model (within individual); variables in this model are time-varying:

BMIij = π0i + π1i(WAVEij) + π2i(Physical Activity: moderate) + π3i(Physical Activity: active) + π4i(Marital Status: common-law) + π5i(Marital Status: single) + π6i(Marital Status: separated/divorced) + π7i(Marital Status: widowed) + π8i(Current smoker) + εij

Level 2 model (between individual)

Predictors of initial BMI:

π0i = γ00 + γ01(Age) + γ02(Baseline BMI) + γ03(Low income) + γ04(Immigrant) + γ05(Ethnicity: Black) + γ06(Ethnicity: Asian) + γ07(Ethnicity: Aboriginal) + γ08(Ethnicity: South Asian) + γ09(Ethnicity: other/mixed) + δ0i

Predictors of BMI rate of change:

π1i = γ10 + γ11(Age) + γ12(Baseline BMI) + δ1i

Combined model:

BMIij = γ00 + γ01(Age) + γ02(Baseline BMI) + γ03(Low income) + γ04(Immigrant) + γ05(Ethnicity: Black) + γ06(Ethnicity: Asian) + γ07(Ethnicity: Aboriginal) + γ08(Ethnicity: South Asian) + γ09(Ethnicity: other/mixed) + γ10(WAVE) + γ11(Age)x(WAVE) + γ12(Baseline BMI)x(WAVE) + π2i(Physical Activity: moderate) + π3i(Physical Activity: active) + π4i(Marital Status: common-law) + π5i(Marital Status: single) + π6i(Marital Status: separated/divorced) + π7i(Marital Status: widowed) + π8i(Current smoker) + π9i(Has a chronic condition) + εij + δ0i + δ1i(WAVE)

Reference groups:
- For marital status, "married" is the reference category
- For physical activity, "inactive" is the reference category
- For income, "medium or high income quintile" is the reference category
- For ethnicity, "white" is the reference category
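The reference categories listed above determine how each categorical variable enters the model as indicator (dummy) variables: the reference level is coded as all zeros, and every other level gets its own 0/1 column. A minimal Python sketch of that coding follows. The category labels mirror the text, while the function name, dictionary layout, and example record are hypothetical.

```python
# Reference level and non-reference levels for each categorical predictor.
# The reference level is the one named in the text and is coded as all zeros.
LEVELS = {
    "marital_status": ("married", ["common-law", "single", "separated/divorced", "widowed"]),
    "physical_activity": ("inactive", ["moderate", "active"]),
    "income": ("medium or high", ["low"]),
    "ethnicity": ("white", ["black", "asian", "aboriginal", "south asian", "other/mixed"]),
}


def dummy_code(record):
    """Return 0/1 indicators for every non-reference level of each variable."""
    coded = {}
    for variable, (reference, others) in LEVELS.items():
        value = record[variable]
        if value != reference and value not in others:
            raise ValueError(f"unknown level {value!r} for {variable}")
        for level in others:
            coded[f"{variable}={level}"] = 1 if value == level else 0
    return coded


# Hypothetical person-wave record.
example = {
    "marital_status": "single",
    "physical_activity": "inactive",
    "income": "low",
    "ethnicity": "white",
}
indicators = dummy_code(example)
print({name: flag for name, flag in indicators.items() if flag})
```

With this coding, each coefficient is read as the difference in BMI between that level and the reference level, holding the other terms constant.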


Table 4 shows the multilevel results from 3 models. The first model (model 1) contains no covariates other than time. This model is also known as the unconditional growth model, since initial BMI and rate of change are estimated independently of any other terms in the model. The second model (model 2) includes only age and initial BMI and was used for comparison when building the multivariate model. The third model (model 3) is the fully adjusted model. Each multilevel model has two intercepts, one for baseline BMI at wave 1 (γ00) and another for subsequent change in BMI over time (γ10). In the unconditional growth model, average BMI and BMI change were 26.03 kg/m2 and 0.2031 kg/m2 for males and 24.92 kg/m2 and 0.2245 kg/m2 for females. As each term was added to the second model, results were compared to the unconditional and conditional growth models and improvement was tested using the information criteria (AIC and BIC) and likelihood ratio statistics. Controlling for all variables in the model, males have an average increase in BMI of 0.1655 (± 0.0157) kg/m2 and females have an average increase of 0.2216 (± 0.0110) kg/m2.

In model 3, in both men and women, increasing age and baseline BMI were associated with an increase in BMI; however, those who were older and/or had a higher BMI at baseline increased at a reduced rate. Prototypical trajectories, based on the multivariate model in females, demonstrating the effects of age on rates of change of BMI are shown in Figure 7. In the final model, marital status, physical activity, smoking status, and chronic conditions (for females) were modeled in their time-varying form due to their fluctuations over the 10-year period. Figure 1 demonstrates the difference between a time-varying and a time-invariant predictor. Controlling for the effects of time and other variables in the model, increasing levels of physical activity were associated with lower levels of BMI, and the effect was stronger in females than in males. Smoking was also associated with lower levels of BMI, and being married was associated with increased BMI. Having a chronic condition was not associated with BMI or BMI rate of change in males but was positively associated with average BMI in females.

Income, immigrant status and ethnicity were modeled in their time-invariant form. In the multivariate models, immigrant status was associated with lower BMI; however, the effect was stronger in females. In addition, South Asian ethnicity was associated with increased BMI compared to white ethnicity, in females only. Adjusted for all other factors, low income was associated with lower BMI in men compared to those in the medium or high income category, whereas in women low income status was associated with increased BMI. When the models were analyzed with only the subset of individuals with complete, balanced data, the findings did not differ in direction or statistical significance; however, the magnitude of the associations was slightly larger.

The random variance components for the intercept and slope of the multivariate models are also shown in Tables 4a and 4b. The random effects for both initial BMI status (σ₀²) and rate of change (σ₁²) were statistically significant (P<0.0001) in all models, indicating significant variation in baseline levels of BMI and rate of change within the population after controlling for all variables in the final model. Furthermore, the covariance between baseline BMI and BMI rate of change (σ₀₁²) was also significant (P<0.0001), meaning that variance in BMI trajectories depends on initial BMI. The majority of the variation in initial BMI was explained by the fixed effects added to the model (σ₀² decreased from 13.625 to 0.0781 in males and from 22.839 to 0.1143 in females). Age and baseline BMI were the main fixed effects that reduced the variance in initial BMI.


Variation in trajectories stayed relatively stable across models, which is a common finding in multilevel growth models (14). The variation in initial status was almost double the magnitude in females versus males, indicating that BMI is much more variable in Canadian women than men.
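As a reading aid for the symbols reported in Tables 4a and 4b, a two-level growth model of this kind can be sketched as follows. This is a generic specification in the style of Singer and Willett (14), not a restatement of the exact model fitted here: the covariate placeholders X (time-varying) and Z (time-invariant) and the random-effect names ζ0i and ζ1i are assumptions introduced for illustration.

```latex
\begin{align*}
% Level 1: repeated BMI measurements j within individual i
\mathrm{BMI}_{ij} &= \pi_{0i} + \pi_{1i}\,\mathrm{WAVE}_{ij}
                   + \sum_{k=2}^{8} \pi_{ki}\,X_{kij} + \varepsilon_{ij},
  & \varepsilon_{ij} &\sim N(0,\ \sigma_{\varepsilon}^{2}) \\
% Level 2: between-individual variation in initial status and rate of change
\pi_{0i} &= \gamma_{00} + \gamma_{01}\,\mathrm{Age}_{i} + \gamma_{02}\,\mathrm{BaselineBMI}_{i}
          + \sum_{q \ge 3} \gamma_{0q}\,Z_{qi} + \zeta_{0i} \\
\pi_{1i} &= \gamma_{10} + \gamma_{11}\,\mathrm{Age}_{i} + \gamma_{12}\,\mathrm{BaselineBMI}_{i} + \zeta_{1i} \\
\pi_{ki} &= \gamma_{k0}, \qquad k = 2, \dots, 8 \\
% Random effects: variances of intercept and slope and their covariance
\begin{pmatrix} \zeta_{0i} \\ \zeta_{1i} \end{pmatrix}
  &\sim N\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},
  \begin{pmatrix} \sigma_{0}^{2} & \sigma_{01} \\ \sigma_{01} & \sigma_{1}^{2} \end{pmatrix} \right)
\end{align*}
```

Under this reading, γ00 and γ10 are the two intercepts described in the text (average initial BMI and average rate of change), while σ0², σ1² and σ01 are the random variance components for initial status, rate of change, and their covariance.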


Table 4a. Results of multilevel growth model in males. All variables in the model were centered to their overall mean.

| Level-1 parameter | Symbol | Term | Model 1: fixed coefficient (standard error) | p-value | Model 2: fixed coefficient (standard error) | p-value | Model 3: fixed coefficient (standard error) | p-value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| INTERCEPT (π0i) | γ00 | Intercept | 26.034 (0.052) | <.0001 | 25.445 (0.0157) | <.0001 | 25.429 (0.0191) | <.0001 |
| | γ01 | Age | | | 0.0009 (0.001) | 0.4110 | 0.0003 (0.0013) | <.0001 |
| | γ02 | BMI | | | 0.9465 (0.004) | 0.0031 | 0.9418 (0.004) | 0.8203 |
| | γ03 | Low income | | | | | -0.0881 (0.0440) | 0.0455 |
| | γ04 | Immigrant | | | | | -0.0950 (0.0454) | 0.0366 |
| | ref | Ethnicity - White (ref) | | | | | (ref) | |
| | γ05 | Ethnicity - Black | | | | | -0.1697 (0.1090) | 0.1197 |
| | γ06 | Ethnicity - Asian | | | | | -0.09784 (0.0878) | 0.2653 |
| | γ07 | Ethnicity - Aboriginal | | | | | 0.03259 (0.1070) | 0.7607 |
| | γ08 | Ethnicity - South Asian | | | | | 0.1575 (0.1726) | 0.3613 |
| | γ09 | Ethnicity - Other | | | | | 0.1132 (0.1546) | 0.4639 |
| WAVE (π1i) | γ10 | Intercept | 0.2031 (0.008) | <.0001 | 0.1655 (0.008) | <.0001 | 0.1655 (0.0157) | <.0001 |
| | γ11 | Age | | | -0.009 (0.006) | <.0001 | -0.01074 (0.0001) | <.0001 |
| | γ12 | BMI | | | -0.014 (0.002) | <.0001 | -0.0187 (0.002) | <.0001 |
| Time-varying covariates | | Physical activity - Inactive (ref) | | | | | (ref) | |
| π2i | | Physical activity - Moderate | | | | | -0.0421 (0.024) | 0.0806 |
| π3i | | Physical activity - Active | | | | | -0.0895 (0.027) | 0.0009 |
| | | Marital status - Married (ref) | | | | | (ref) | |
| π4i | | Marital status - Common law | | | | | 0.0767 (0.045) | 0.0917 |
| π5i | | Marital status - Single | | | | | -0.0703 (0.034) | 0.0409 |
| π6i | | Marital status - Separated or divorced | | | | | -0.1934 (0.042) | <.0001 |
| π7i | | Marital status - Widowed | | | | | -0.09815 (0.077) | 0.1998 |
| π8i | | Current smoker | | | | | -0.2729 (0.0273) | <.0001 |
| Estimated random effects | σ0² | | 13.625 (0.2840) | <.0001 | 0.2033 (0.016) | <.0001 | 0.07807 (0.015) | <.0001 |
| | σ1² | | 0.2005 (0.006) | <.0001 | 0.2058 (0.006) | <.0001 | 0.2141 (0.007) | <.0001 |
| | σ01 | | -0.08259 (0.030) | 0.0061 | 0.094 (0.007) | <.0001 | 0.125 (0.007) | <.0001 |
| | σε² | | 1.9922 (0.018) | <.0001 | 1.884 (0.016) | <.0001 | 2.1295 (0.021) | <.0001 |

| Goodness of fit | Model 1 | Model 2 | Model 3 |
| --- | --- | --- | --- |
| -2logL | 154417 | 139376 | 112726 |
| Difference in -2logL versus model 1 | | 15041 (df = 4; p<0.0001) | 41691 (df = 19; p<0.0001) |
| Difference in -2logL versus model 2 | | | 26650 (df = 15; p<0.0001) |
| AIC (smaller is better) | 154429 | 139396 | 112776 |
| AIC improvement versus model 1 | | yes | yes |
| AIC improvement versus model 2 | | | yes |
| BIC (smaller is better) | 154469 | 139462 | 112940 |
| BIC improvement versus model 1 | | yes | yes |
| BIC improvement versus model 2 | | | yes |
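The goodness-of-fit entries reported for males in Table 4a can be cross-checked with simple arithmetic. The sketch below assumes the usual definition AIC = -2logL + 2k, where k is the number of estimated parameters; that definition, and the helper names `n_params` and `lr_test`, are assumptions made for illustration rather than details stated in the thesis.

```python
# -2 log-likelihood and AIC values as reported for males in Table 4a.
minus2logl = {"model1": 154417, "model2": 139376, "model3": 112726}
aic = {"model1": 154429, "model2": 139396, "model3": 112776}


def n_params(model):
    """Parameter count implied by AIC = -2logL + 2k (assumed AIC definition)."""
    return (aic[model] - minus2logl[model]) // 2


def lr_test(reduced, full):
    """Return (likelihood ratio statistic, degrees of freedom) for nested models."""
    statistic = minus2logl[reduced] - minus2logl[full]
    df = n_params(full) - n_params(reduced)
    return statistic, df


for reduced, full in [("model1", "model2"), ("model1", "model3"), ("model2", "model3")]:
    statistic, df = lr_test(reduced, full)
    print(f"{full} versus {reduced}: chi-square = {statistic}, df = {df}")
```

Under that assumption, the implied differences in parameter counts reproduce the degrees of freedom printed in the table (4, 19 and 15), and the differences in -2logL reproduce the reported likelihood ratio statistics.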


Table 4b. Results of hierarchical linear model fitting in females. All variables in the model were centered to their overall mean. Model 1

INTERCEPT

WAVE

π0i

π1i

Symbol

Term

γ00

Intercept

γ01

Age

γ02

BMI

γ03

Low Income

γ04

Immigrant

ref

Ethnicity - White (ref)

γ05

Ethnicity -Black

γ06

Ethnicity -Asian

γ07

Ethnicity -Aboriginal

γ08 γ09

Ethnicity -South Asian Ethnicity - Other

γ010

Chronic condition

γ10 γ11

Intercept Age

γ12

BMI

Model 2

Model 3

Fixed coefficient (standard error)

p-value

Fixed coefficient (standard error)

p-value

Fixed coefficient (standard error)

p-value

24.923 (0.063)

<.0001

25.389 ( 0.0169)

<.0001

<.0001

0.00393 ( 0.0012)

0.0008

25.383 (0.019) 0.001825 (0.00138)

0.9488 (0.0034)

<.0001

0.9461(0.004)

<.0001

0.0939 (0.044)

0.0358

-0.1368 ( 0.0504)

0.0066

(ref)

0.0193

0.0505 ( 0.1355)

0.7095

-0.0118 ( 0.1018)

0.9075

-0.0050 (0.1972)

0.9797

0.1434 ( 0.1106)

0.0195

0.2193 (0.1816)

0.2272

0.1815 ( 0.0091)

0.1113 (0.02635) 0.2114 (0.011)

0.0264 <.0001

-0.0073 (0.0006) -0.0184(0.0018)

-0.0078(0.0007) -0.0227 (0.0021)

<.0001 <.0001

-0.0246 ( 0.0262)

0.3467

-0.1772 ( 0.0321)

<.0001

0.2245 (0.010)

<.0001

0.1870

Time-varying covariates π2i

γ20

π3i

γ30

π4i

γ40

Physical Activity Inactive (ref) Physical Activity Moderate Physical Activity Active Marital Status Married (ref) Marital Status -

(ref) -0.0208 (0.054)


0.0193


Common law π5i

γ50

π6i

γ60

π7i

γ70

π8i

γ80

Marital Status Single Marital Status – Separated or divorced Marital Status Widowed Current Smoker

-0.1821 (0.038)

0.6996

-0.2045 ( 0.0425)

<.0001

-0.0808 ( 0.0545)

<.0001

-0.3129 (0.03219)

<.0001

Estimated random effects σ02 σ12 σ01 σε2

22.839 (0.441)

<.0001

0.3026 (0.023)

<.0001

0.1143 (0.023)

<.0001

0.2896 (0.008)

<.0001

0.3128 (0.001)

<.0001

0.3350 (0.010)

<.0001

-0.1850 (0.008)

<.0001

0.1994 (0.001)

<.0001

0.2124 (0.011)

<.0001

2.4579 (0.021)

<.0001

2.3499 (0.020)

<.0001

2.6239 (0.024)

<.0001

Goodness of fit minus2logL p-value

188503

versus model 1

171270 17233

143960 44913

(df =4; p<0.0001)

(df = 20; p <0.0001)

versus model 2

27310 (df = 16; p <0.0001)

AIC Improvement

AIC (smaller is better) versus model 1 188515 versus model 2

BIC Improvement

171290

143973

yes

yes

BIC (smaller is better) yes

versus model 1

188556

versus model 2


171358

144167

yes

yes

Figure 7. Prototypical plot of BMI change over time for quartiles of baseline age in females at population average levels for all other characteristics of the cohort.
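How a prototypical trajectory is derived from fixed effects can be sketched in a few lines. The example below uses, purely for illustration, the Model 3 fixed effects printed for males in Table 4a (Figure 7 itself is based on the female model). It assumes that age is centered in years, that time is counted in units of the WAVE variable starting from 0 at wave 1, and that all other covariates are held at their means; the function name `prototypical_bmi` and its arguments are hypothetical.

```python
# Fixed effects for males, Model 3, as printed in Table 4a.
GAMMA_00 = 25.429    # average BMI at wave 1 (all covariates at their means)
GAMMA_01 = 0.0003    # effect of age on initial BMI
GAMMA_02 = 0.9418    # effect of baseline BMI on initial BMI
GAMMA_10 = 0.1655    # average change in BMI per unit of WAVE
GAMMA_11 = -0.01074  # effect of age on rate of change
GAMMA_12 = -0.0187   # effect of baseline BMI on rate of change


def rate_of_change(age_c, bmi_c):
    """Expected BMI change per unit of WAVE for centered age and baseline BMI."""
    return GAMMA_10 + GAMMA_11 * age_c + GAMMA_12 * bmi_c


def prototypical_bmi(age_c, bmi_c, time):
    """Expected BMI after `time` units of WAVE (time = 0 at wave 1, an assumption)."""
    initial = GAMMA_00 + GAMMA_01 * age_c + GAMMA_02 * bmi_c
    return initial + rate_of_change(age_c, bmi_c) * time


# Compare three hypothetical men at the mean baseline BMI who differ only in age.
for age_c in (-20, 0, 20):
    print(age_c, round(rate_of_change(age_c, 0.0), 4))
```

With these coefficients, a man 20 years younger than the cohort mean has a steeper expected slope than a man at the mean age, and a man 20 years older has a slope close to zero, which matches the pattern described in the text.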

4.5 Discussion

This population-based longitudinal study of adults demonstrates how individual characteristics are related to BMI and BMI change over time. Several demographic and lifestyle factors significantly influence baseline BMI (intercept) and rate of change (slope) over 10 years. Though the factors that influence BMI and BMI change over time were generally similar between men and women, there was an important difference in the influence of income, namely that low income status was protective in men and a risk factor in women. Furthermore, all associations seemed generally stronger in women.

There are several strengths to this study. The sample is population-based with high participation (response rates ≥ 77%) and is representative of the Canadian population (15). Furthermore, the longitudinal design allowed this study to overcome limitations associated with cross-sectional designs or single follow-up cohort designs, which are unable to consider trajectories of growth or predictor variables that vary over time. The multilevel growth model used in this study assessed patterns of growth trajectories, taking into account both within- and between-individual variation. Other approaches to studying change over time, including OLS regression, are unable to model these within-individual variations since the data are collapsed to level 2 (16). An additional strength was the use of time-dependent covariates, which has not been considered in previous studies. This is particularly important for lifestyle variables that are known to change over time, such as physical activity or smoking status. A variable that fluctuates over time but is modeled in its time-invariant form can lead to the regression-to-the-mean phenomenon, leading to misinterpretation of the relationships or an inability to detect a significant effect (17).
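The distinction between time-varying and time-invariant coding of a covariate can be made concrete with a small person-period data set. The sketch below is illustrative only: the records are invented, and the helper names are hypothetical. It shows how freezing a fluctuating covariate (here, physical activity) at its baseline value misclassifies later waves.

```python
# Hypothetical person-period records: (person_id, wave, bmi, physically_active).
person_period = [
    (1, 0, 24.0, True), (1, 1, 24.6, False), (1, 2, 25.1, False),
    (2, 0, 29.0, False), (2, 1, 29.1, True), (2, 2, 28.9, True),
]


def time_invariant_coding(rows):
    """Replace each person's covariate with the value observed at wave 0."""
    baseline = {pid: active for pid, wave, _, active in rows if wave == 0}
    return [(pid, wave, bmi, baseline[pid]) for pid, wave, bmi, _ in rows]


def misclassified_waves(rows):
    """Count person-waves whose true covariate value differs from the frozen one."""
    frozen = time_invariant_coding(rows)
    return sum(1 for true_row, frozen_row in zip(rows, frozen)
               if true_row[3] != frozen_row[3])


print(misclassified_waves(person_period))
```

In this invented example, four of the six person-waves carry the wrong activity status once the covariate is frozen at baseline. The time-varying form keeps one covariate value per person per wave and so avoids this problem.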

A previous analysis of change in self-reported weight in Canada used OLS (7). Orpana et al. did not examine the effects of the factors investigated in this study; however, they did examine how change in BMI was influenced by sex and age group. Consistent with our study, they found that weight gain is higher among young and middle-aged adults than among those older than 65, and that females were more likely than males to gain weight over time. Evidence from cross-sectional studies has generally agreed with the finding that physical activity is inversely associated with BMI (18-21); however, longitudinal studies have been less consistent (22-24), though it is unclear whether methodological differences, the definition or measurement of physical activity, or true differences produced these inconsistencies. No previous study has considered the time-dependent nature of leisure activity, as was done in this study, and this may have strengthened the ability to find a significant inverse association. The association between low BMI and smoking is well documented (25-27) and is confirmed in this study, even when considering the time-varying nature of smoking status, which has not been done in previous studies. Previous studies have also shown, in general, that immigrant status is associated with decreased BMI (28-31). Though acculturation is thought to lessen this effect, and even reverse it after a substantial amount of time, that phenomenon was not observed in the time period of this study.

This study revealed that BMI tends to increase with time for younger people, and even more so for people who start with normal BMI values (< 25 kg/m2), as in previous studies (7, 16, 32). This emphasizes the need to consider population demographics when designing weight intervention strategies; specifically, obesity prevention is particularly important in young adults. It is possible that targeting this group will have the most impact for preventing obesity in the population, since they are at highest risk for weight gain.

An interesting finding from this study was the opposite association between low income status and obesity by sex. In their landmark analysis, Sobal and Stunkard (1989) pointed out that in developed nations SES influences on weight affect men and women differently and are perhaps related to attitudes and cultural influences (33). It has been shown consistently that adult women tend to have a strong inverse relationship between SES indicators and obesity; that is, the higher the SES, the less likely the women were to have a higher BMI and/or gain weight (33-36). Furthermore, the prevalence of obesity and overweight is paradoxically higher in women with food insecurity (37-41).

There are a variety of explanations for these results. Unhealthy patterns in nutrition (high-density/low-nutrient diets) occur in women to a greater extent than in men. Lower income women are less likely to report reducing calories and snacks and limiting meal portions to a healthy size (42). Emotional eating, described as the practice of consuming large quantities of food (usually unhealthy food) in response to feelings instead of hunger, is more likely to affect women than men (43). Low SES women may be more likely to experience emotional eating due to feelings of lack of control over their own lives, depression, emotional stress and poor self-esteem associated with low social class. Women who lack self-esteem and who have high psychological demand and low control in their occupation are less likely to exercise and/or have healthy dietary habits (44). A clear demonstration of the SES effects in women is the contradictory trend of increased obesity among food-insecure women. This is known as the 'hunger-obesity paradox', in which under-nutrition and obesity co-exist (40). In periods of food insecurity, women often change their eating patterns to ensure sufficient food for their children and/or husbands (43). The psychosocial and emotional changes that occur in times of food restriction result in an increased likelihood of binge eating when food is available and of eating high-calorie foods (one of the key reasons why food restriction is an ineffective method of dieting). This phenomenon occurs in a dose-response manner, such that the more severe the food insecurity, the more likely the woman is to have a binge pattern of eating (39).

Men are less likely to be influenced by emotional eating brought on by psychosocial effects of low SES or food insecurity. Furthermore, men are not as susceptible as women to fluctuations in food consumption during times of food insecurity (41). An additional difference is the potential protective effect of low-income labor. Low SES jobs for men tend to be more physically active than high SES jobs (44). This increased exposure to physical activity burns calories and is potentially protective against increased BMI.

Our analysis revealed that relationship status had an impact on future weight, such that those who were married had higher levels of BMI compared to those who were in other types of relationship or not in a relationship. Previous research using an individual-level fixed effects model on a national longitudinal survey from the U.S. also found a significant relationship between marriage and weight gain and proposed explanations related to shared meals, decreased individual physical activities, and social obligations (45).

There are several limitations to consider in this study. First, all factors are self-reported, including height and weight, and are thus possibly subject to measurement error. Non-differential error would lead to an underestimation of the true associations (46, 47). In general, there is a high correlation between self-reported and measured height and weight (48); however, some studies have shown that under- or over-estimation of height can be related to factors such as sex and socioeconomic status (49-51). If present, these self-reporting biases may have affected the direction of association in our study.
Methods to correct for self-reporting error in these health surveys have recently been introduced. Incorporating these correction factors is a subject of future research (52). Second, due to the complex nature of the modeling process, the models were kept relatively simple to ensure convergence; therefore, additional variables related to labor, social support, and specific co-morbid conditions were not considered. Third, additional parameters important to weight gain were not included in the survey's data file. Specifically, there was no detailed information on reproductive factors or dietary habits, which are potentially important aspects of weight and weight gain. With respect to the SES relationship in women, a previous study demonstrated that a large proportion of the gradient was explained by reproductive factors (44). Not controlling for reproductive factors could cause significant confounding in women, since women of lower SES are more likely to have children (and more of them), putting them at increased risk for obesity. Consequently, the inability to control for reproductive factors may lead to overestimation of SES effects in women. Finally, attrition occurs across waves and, though this model can handle missing data, the decrease in participants in later waves may reduce statistical power to detect relationships or higher powers in the functional form for change.

Preventing obesity in the population is an important strategy to reduce future chronic disease burden, particularly for diabetes (53). This study describes the lifestyle and demographic factors associated with BMI and BMI change over time. The study's population-based longitudinal design and multilevel growth model analyses offered a unique opportunity to investigate new aspects of these relationships. This study shows that weight and weight gain are influenced by age, sex and baseline BMI. Of the lifestyle variables, physical activity was protective for BMI, controlling for the effects of time and accounting for the time-dependent nature of this variable. Furthermore, this study shows differential effects of lifestyle and sociodemographic factors on BMI by sex.
This study shows that young and low-income women are at higher risk for increased BMI compared to older women and compared to men of the same ages.

Given that young women are particularly at risk of developing diabetes in the next 10 years (Rosella et al., under review), particular attention should be paid to this sub-group. Though not specifically carried out in this study, this model can be explored for its potential as a tool to predict future BMI in the population, which may be used when generating future estimates of obesity and/or diabetes.

4.6 References

1. Colditz G et al. Weight as a risk factor for clinical diabetes in women. American Journal of Epidemiology 1990;132:501-13.
2. Colditz G et al. Weight gain as a risk factor for clinical diabetes mellitus in women. Annals of Internal Medicine 1995;122:481-6.
3. Perry IJ et al. Prospective study of risk factors for development of non-insulin dependent diabetes in middle aged British men. British Medical Journal 1995;310:555-9.
4. Vanderpump MPJ et al. The incidence of diabetes mellitus in an English community: a 20-year follow-up of the Whickham Survey. Diabetic Medicine 1996;13:741-7.
5. Wilson P et al. Prediction of incident diabetes mellitus in middle-aged adults. Archives of Internal Medicine 2007;167:1068-74.
6. Mokdad AH et al. The continuing epidemics of obesity and diabetes in the United States. JAMA 2001;286:1195-200.
7. Orpana HM, Tremblay MS, Fines P. Trends in weight change among Canadian adults. Health Reports 2007;18:9-16.
8. Singer JD. Using SAS PROC MIXED to fit multilevel models, hierarchical models, and individual growth models. Journal of Educational and Behavioral Statistics 1998;23:323-55.
9. Statistics Canada. National population health survey household component cycle 6 (2005-2006) longitudinal documentation. 2006. Ottawa.
10. Statistics Canada. 1996-7 National Population Health Survey: Derived Variable Specifications. 1999. Ottawa.
11. Yeo D, Mantel H, Lui TP. Bootstrap variance estimation for the National Population Health Survey. 778-783. 1999. Baltimore, American Statistical Association.
12. Demidenko E, Stukel TA. Efficient estimation of general linear mixed effects models. Journal of Statistical Planning and Inference 2002;104:197-219.
13. Stukel TA, Demidenko E. Two-stage method of estimation for general linear growth curve models. Biometrics 1997;53:720-8.
14. Singer JD, Willett JB. Applied Longitudinal Data Analysis: Modeling Change and Event Occurrence. Oxford: Oxford University Press, 2003.
15. Statistics Canada. NPHS Public Use Microdata Documentation. 1999. Ottawa, Canada, Statistics Canada.
16. Heo M et al. Hierarchical linear models for the development of growth curves: an example with body mass index in overweight/obese adults. Statistics in Medicine 2003;22:1911-42.
17. Streiner DL, Norman GR. Measuring Change. Health Measurement Scales. New York: Oxford University Press, 2003:194-212.
18. Haapanen N et al. Association between leisure time physical activity and 10-year body mass change among working-aged men and women. International Journal of Obesity 1997;21:288-96.
19. Hu FB et al. Walking Compared With Vigorous Physical Activity and Risk of Type 2 Diabetes in Women. JAMA 1999;282:1433-9.
20. Rissanen AM et al. Determinants of Weight-Gain and Overweight in Adult Finns. European Journal of Clinical Nutrition 1991;45:419-30.
21. Williamson DF et al. Recreational Physical-Activity and 10-Year Weight Change in A United-States National Cohort. International Journal of Obesity 1993;17:279-86.
22. Gordon-Larsen P et al. Fifteen-year longitudinal trends in walking patterns and their impact on weight change. American Journal of Clinical Nutrition 2009;89:19-26.
23. Petersen L, Schnohr P, Sorensen TIA. Longitudinal study of the long-term relation between physical activity and obesity in adults. International Journal of Obesity 2004;28:105-12.
24. Wilsgaard T, Jacobsen BK, Arnesen E. Determining lifestyle correlates of body mass index using multilevel analyses: The Tromso study, 1979-2001. American Journal of Epidemiology 2005;162:1179-88.
25. Fogelholm M et al. Predictors of weight change in middle-aged and old men. Obesity Research 2000;8:367-73.
26. Meltzer AA, Everhart JE. Unintentional Weight-Loss in the United-States. American Journal of Epidemiology 1995;142:1039-46.
27. Molarius A et al. Smoking and relative body weight: An international perspective from the WHO MONICA project. Journal of Epidemiology and Community Health 1997;51:252-60.
28. Goel MS et al. Obesity among US immigrant subgroups by duration of residence. JAMA 2004;292:2860-7.
29. Park Y et al. Place of birth, duration of residence, neighborhood immigrant composition and body mass index in New York City. International Journal of Behavioral Nutrition and Physical Activity 2008;5.
30. Sanchez-Vaznaugh EV et al. Differential effect of birthplace and length of residence on body mass index (BMI) by education, gender and race/ethnicity. Social Science & Medicine 2008;67:1300-10.
31. Wandell PE et al. Country of birth and body mass index: A national study of 2,000 immigrants in Sweden. European Journal of Epidemiology 2004;19:1005-10.
32. Bjorvell H, Rossner S. A ten year follow-up of weight change in severely obese subjects treated in a behavioural modification program. International Journal of Obesity 1990;14:88.
33. Sobal J, Stunkard AJ. Socioeconomic status and obesity: a review of the literature. Psychological Bulletin 1989;105:260-75.
34. Ball K, Crawford D. Socioeconomic status and weight change in adults: a review. Social Science & Medicine 2005;60:1987-2010.
35. Kuskowska-Wolk A, Bergstrin T. Trends in body mass index and prevalence of obesity in Swedish women 1980-1989. Journal of Epidemiology and Community Health 1993;47:195-9.
36. Winkleby MA et al. Ethnic and socioeconomic differences in cardiovascular disease risk factors - Findings for women from the third national health and nutrition examination survey, 1988-1994. JAMA 1998;280:356-62.
37. Adams E et al. Food insecurity is associated with increased risk of obesity in California women. Journal of Nutrition 2003;133:1070-4.
38. Basiotis P, Lino M. Food insufficiency and prevalence of overweight among adult women. Family Economics and Nutrition Review 2003;15:55-7.
39. Kendall A, Olson C, Frongillo E. Relationship of hunger and food insecurity to food availability and consumption. Journal of the American Dietetic Association 1996;96:1019-24.
40. Olson C. Nutrition and health outcomes associated with food insecurity and hunger. Journal of Nutrition 1999;129:521S-4S.
41. Townsend M et al. Food insecurity is positively related to overweight in women. Journal of Nutrition 2001;131:1738-45.
42. Jeffery RW et al. Socioeconomic differences in health behaviors related to obesity - The Healthy Worker Project. International Journal of Obesity and Related Metabolic Disorders 1991;15:689-96.
43. Parker S, Keim K. Epic perspectives of body weight in overweight and obese women with limited income. Journal of Nutrition Education and Behavior 2004;36:282-9.
44. Wamala S, Wolk A, Orth-Gomer K. Determinants of obesity in relation to socioeconomic status among middle-aged Swedish women. Preventive Medicine 1997;26:734-44.
45. Averett SL, Sikora A, Argys LM. For better or worse: Relationship status and body mass index. Economics and Human Biology 2008;6:330-49.
46. Greenland S. The effect of misclassification in the presence of covariates. American Journal of Epidemiology 1980;112:564-9.
47. Rowland M. Self-reported height and weight. American Journal of Clinical Nutrition 2007;52:1125-33.
48. Nawaz H et al. Self-reported weight and height: implications for obesity research. Journal of Preventive Medicine 2001;20:294-8.
49. Bostrom G, Diderichsen F. Socioeconomic differentials in misclassification of height, weight and body mass index based on questionnaire data. International Journal of Epidemiology 1997;26:860-6.
50. Niedhammer I et al. Validity of self-reported weight and height in the French GAZEL cohort. International Journal of Obesity 2000;24:1111-8.
51. Wardle K, Johnson F. Sex differences in the association of socioeconomic status with obesity. International Journal of Obesity and Related Metabolic Disorders 2002;26:1144-9.
52. Gorber SC et al. The feasibility of establishing correction factors to adjust self-reported estimates of obesity. Health Reports 2009;19.
53. Mayor S. International Diabetes Federation consensus on prevention of type 2 diabetes. International Journal of Clinical Practice 2007;61:1773-5.


5. Thesis Conclusion

Each of the studies presented in this thesis is distinct in its data and methodological approach, but they are linked by their common goal of informing population-based risk prediction for diabetes. Taken together, these studies can inform public health aspects of diabetes and obesity, as well as epidemiological methods.

In section 2, a novel risk tool, the Diabetes Population Risk Tool (DPoRT), was created to estimate the incidence of type 2 diabetes; it can be applied at the population level using publicly available data. Four important goals were achieved with this work. First, an algorithm to predict the incidence of diabetes with good discrimination and accuracy was developed. Secondly, an important policy advantage was achieved by building the tool so that it can be applied to the current risk factor surveillance data (routinely collected survey data) that are publicly available in Canada. This allows DPoRT to be used by a wide audience of health planners to accurately estimate diabetes incidence and quantify the impact of interventions. Thirdly, the rigor of the validation of DPoRT demonstrates a framework that should be applied to the validation of other population-based risk algorithms. Finally, the novel application of a risk algorithm at the population level reveals an important way to understand the distribution of diabetes risk in populations.

The subsequent sections of this thesis were built from specific aspects related to the performance of the risk tool (as affected by measurement) and BMI change in the population over time. One of the key aspects of any epidemiological study is the accurate measurement of exposures and outcomes. To achieve the goals of DPoRT, a balance between data availability and data detail was sought. This choice led to several questions regarding the consequences of using public survey data.
In Chapter 3.1, the impact of measurement error (systematic and random) in self-reported height and weight on the performance of a diabetes risk prediction tool was studied. This study demonstrated that both systematic and random error decreased the calibration of a prediction model, whereas only random error reduced the model's discrimination. One particularly interesting finding of this study was that non-differential error can introduce systematic bias in predicted risk estimates. This study confirms that the level of error in self-reported height and weight in Canada's health surveys is unlikely to affect the performance or validation of DPoRT. Importantly, this study provides a framework to quantify the influence of measurement error on risk prediction using simulation. This work reaffirms that researchers developing and validating risk tools must be aware of the presence of measurement error and its impact on the performance of their risk tools. Further efforts must be made to understand the nature of error in self-reported measurements, and ongoing work to improve the quality of measurements used in risk algorithms will improve model performance. Understanding the consequences of measurement error for risk prediction is important not only for population risk tools; it also provides insight into the influence of measurement properties, which can serve as evidence when making decisions about data utilization for different applications in epidemiology.

The results from the investigation into the impact of ethnicity on diabetes risk (3.2) have several important implications for the application of DPoRT, in addition to providing insight into the independent role of ethnicity in the development of diabetes.
From this study we find that, although focusing on different ethnicities may be important from the individual risk perspective, detailed ethnic information did not significantly improve DPoRT population estimates of new diabetes cases when predicting at the population level and accounting for other risk factors. This indicates that, for health planning purposes, including detailed ethnic information may not improve population estimates of future diabetes risk, which has significant implications for how the tool can be used. It also demonstrates the diminishing return that is often seen in prediction tools, such that maximum levels of discrimination and goodness of fit can often be achieved without detailed information on risk factors. While the clinical appeal of using detailed ethnicity for estimating diabetes risk is significant, the statistical reality is that classifying ethnicity in the broader categories that are available to the public works as well as a model that includes detailed ethnic information in the Canadian population. Conclusions drawn from both studies within the measurement section can be used to inform research on the influence of measurement properties (error and type) on modeling and statistical prediction.

In building the risk tool for diabetes, it was demonstrated that BMI (a relative measure of weight for height) overwhelmingly influences the predictions for developing diabetes in the future. For that reason, clarifying determinants of weight and weight change is essential when developing strategies to prevent or reduce the future diabetes burden. In monitoring trends over time, researchers are often faced with the dilemma of separating trends between individuals from trends within individuals. Multilevel growth models allow both of these aspects to be modeled, which strengthens the ability to model trends that vary between and within individuals. In Chapter 4, predictors of weight and weight change were modeled in a longitudinal sample of Canadians. Specifically, the importance of age for baseline obesity levels and rate of change was quantified. The fact that younger individuals are at greatest risk of increasing obesity reinforces the point that obesity prevention is most important in younger adults.
DPoRT also reinforces that those with high BMI at lower ages will have a higher probability of developing diabetes than those who have higher weight at older ages. The fact that the factors that influence BMI depend substantially on gender (particularly income) reveals that interventions for reducing weight may need to be considered differentially between genders. The multilevel growth model provides a more efficient way of predicting weight change over time and can help inform DPoRT and improve predictive estimates. In addition, weight interventions modeled using DPoRT can be better informed by the findings of this model by targeting interventions at those at highest risk for weight gain. Finally, Chapter 4 demonstrates an important use for multilevel growth models in epidemiology: understanding trends in risk factors or diseases that change over time.

There are several follow-up studies that will be conducted based on the findings of this thesis. Validation of DPoRT will continue as more effective methods to quantify variation in case ascertainment in different populations are developed. As the surveillance of risk factors and diabetes improves, DPoRT can be adapted to become even more accurate while maintaining its accessibility for decision makers. Future research on measurement error will include investigation into differential error with respect to disease status or other characteristics of the population. Though not specifically carried out in this thesis, the multilevel growth model can be used to predict future BMI in the population, which may be used when generating future estimates of obesity and/or diabetes for Canada.

Taken together, this thesis represents a body of literature focused on diabetes risk prediction at the level of populations. Policy makers and health planners can apply these findings to estimate and plan for the upcoming diabetes epidemic in a country such as Canada. The effectiveness of widespread prevention strategies can be improved by knowing which groups to target and how extensive a strategy is needed to stabilize or reduce the number of new cases. In summary, population-based prediction tools are an important aspect of epidemiology and public health.
The methods and data used for DPoRT development and validation can potentially be applied for other diseases in Canada and represent a new way to understand 170

Doctor of Philosophy Epidemiology (PhD) Dissertation Laura C.A. Rosella population health. The results and methods from investigating measurement and obesity are important for DPoRT and to further the practice of epidemiology.


6. Appendix


6.1 Glossary of frequently used terms and acronyms

Measures of Predictive Accuracy: Discrimination and Calibration

Discrimination: Discrimination is the ability to differentiate between those who are at high risk and those who are at low risk or, in this case, between those who will and will not develop diabetes given a fixed set of variables. The receiver operating characteristic (ROC) curve is a good way to measure discrimination. The area under the ROC curve can be obtained by taking all possible pairings of subjects in the sample who do and do not exhibit the outcome and calculating the proportion of correct predictions, which is essentially an index of the resolution of the model. This area under the ROC curve is equal to the C statistic, which can be used to assess the degree of discrimination, with 1.0 being perfect discrimination and 0.5 being no discrimination (1-3). A perfect prediction model would perfectly resolve the population into those who get diabetes and those who do not. Accuracy is unaffected by discrimination, meaning a model can possess good discrimination yet poor calibration.

Calibration (this concept is also known as accuracy and/or reliability in the literature): Calibration is achieved in a prediction model if it is able to predict future risk with accuracy, i.e. if the predicted probabilities closely agree with the observed outcomes. A model that is not reliable will significantly over- or under-estimate risk in the overall population and/or within certain subgroups. A model with good accuracy will maintain reliability across various risk groups and other important subpopulations. Accuracy is not an issue if the purpose of the prediction model is only to rank-order subjects (2).
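The pairwise reading of the C statistic given under Discrimination can be illustrated with a minimal Python sketch. This is not code from the thesis; the function name `c_statistic` and the convention of counting tied pairs as one half are assumptions made for illustration.

```python
def c_statistic(probs, outcomes):
    """Proportion of (case, non-case) pairs in which the case has the
    higher predicted probability. Tied pairs count as 0.5 (an assumed,
    commonly used convention)."""
    cases = [p for p, y in zip(probs, outcomes) if y == 1]
    non_cases = [p for p, y in zip(probs, outcomes) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for p_case in cases:
        for p_non in non_cases:
            if p_case > p_non:
                concordant += 1.0
            elif p_case == p_non:
                concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


# Hypothetical predicted risks and outcomes (1 = developed diabetes)
example = c_statistic([0.30, 0.10, 0.25, 0.05], [1, 0, 1, 0])
```

The double loop is quadratic in sample size, so for large samples a rank-based calculation (as in the Wilcoxon approach used in the SAS code in section 6.3) would be preferable.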

Calibration in the Framingham prediction models has been assessed statistically by a statistic developed by D'Agostino (4). It is calculated by dividing the cohort into deciles of predicted risk and comparing observed versus predicted risk, resulting in a modified version of the Hosmer-Lemeshow χ². Other measures for assessing accuracy include graphical methods and correlations or R² values between observed and predicted estimates (1). The formula for the Hosmer-Lemeshow χ² is:

$$\chi^2_{HL} = \sum_{i=1}^{g} \frac{(O_i - N_i \pi_i)^2}{N_i \pi_i (1 - \pi_i)}$$

where
g = number of groups
N_i = total frequency of subjects in the ith group
O_i = total frequency of event outcomes in the ith group
π_i = average estimated predicted probability of event for the ith group

The Hosmer-Lemeshow statistic is then compared to a chi-square distribution with (g-n) degrees of freedom. As with the other goodness-of-fit (GOF) tests, evidence of lack of fit is demonstrated as the chi-square value increases and the subsequent p-value decreases.
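As an illustration of the Hosmer-Lemeshow calculation described above, the following Python sketch groups subjects by predicted risk and sums the group contributions. It is a simplified sketch rather than the thesis code: the function name `hosmer_lemeshow`, the equal-sized grouping by sort order, and the omission of the p-value are all assumptions.

```python
def hosmer_lemeshow(probs, outcomes, groups=10):
    """Hosmer-Lemeshow chi-square over `groups` equal-sized risk groups.

    For each group i: N_i subjects, O_i observed events and pi_i the mean
    predicted probability; the contribution is
    (O_i - N_i*pi_i)^2 / (N_i*pi_i*(1 - pi_i)).
    Assumes 0 < pi_i < 1 in every group.
    """
    # Sort by predicted probability only (stable for tied predictions)
    pairs = sorted(zip(probs, outcomes), key=lambda t: t[0])
    n = len(pairs)
    chi2 = 0.0
    for i in range(groups):
        chunk = pairs[i * n // groups:(i + 1) * n // groups]
        if not chunk:
            continue
        n_i = len(chunk)
        o_i = sum(y for _, y in chunk)
        pi_i = sum(p for p, _ in chunk) / n_i
        chi2 += (o_i - n_i * pi_i) ** 2 / (n_i * pi_i * (1.0 - pi_i))
    return chi2
```

A well-calibrated model gives a value near zero in each group; large values indicate that observed and predicted event counts diverge.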

Reference List

(1) Harrell FE, Lee KL, Mark DB. Multivariable prognostic models: Issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Statistics in Medicine 1996; 15:361-387.
(2) Harrell FE. Regression Modeling Strategies With Applications to Linear Models, Logistic Regression, and Survival Analysis. New York: Springer, 2001.
(3) Pencina M, D'Agostino RB. Overall C as a measure of discrimination in survival analysis: model specific population value and confidence interval estimation. Statistics in Medicine 2004; 23:2109-2123.
(4) D'Agostino RB, Grundy S, Sullivan LM, Wilson P. Validation of the Framingham Coronary Disease Prediction Scores. JAMA 2001; 286(2):180-187.

Acronyms

DPoRT: Diabetes Population Risk Tool
NPHS: National Population Health Survey
CCHS: Canadian Community Health Survey
ODD: Ontario Diabetes Database
RPDB: Registered Persons Database
NDSS: National Diabetes Surveillance System (for Canada)
BMI: Body Mass Index (kg/m²)
METS: Metabolic Equivalents (expressing the energy cost of physical activities)


6.2 Ethics

For the DPoRT work, approval to link survey data (NPHS 1996/7, CCHS 2000/1) to hospital administrative data held at ICES was received from the Sunnybrook Health Sciences Centre Research Ethics Board in January 2005. The study design and use of hospital administrative data meet the requirements of the ICES research agreement with the Ontario Ministry of Health and Long Term Care. Ethics approval was sought from the University of Toronto Research Ethics Board and the Sunnybrook and Women's College Health Sciences Centre Research Ethics Board and was granted in January 2005. Use of the Manitoba data was approved by the Health Information Privacy Committee and the University of Manitoba Health REB in January 2005. Access to the National Population Longitudinal Health Survey at the Research Data Centre at the University of Toronto was granted by Statistics Canada in June 2008.

6.3 Simulation flow chart and SAS code

Study flow chart for simulation

1. Input start values for simulation (taken from actual data; varying ICC and bias)
2. Apply variance equations (IV) for random error and (V) for bias
3. Simulate ~10,000 BMI values according to input parameters, 500 times
4. Predict probability of diabetes according to prediction algorithm using 'true' BMI
5. Predict probability of diabetes according to prediction algorithm using observed BMI
6. Calculate predicted risk, H-L statistic, and C-statistic for 'true' BMI and observed BMI
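Steps 3 to 5 of the flow chart can be sketched in a few lines of Python. This is a simplified illustration and not a translation of the SAS program: it assumes a classical additive error model for weight only (observed = true + error), treats height as error-free and independent of weight, and uses a single replicate. The logistic coefficients and the means and standard deviations are the input parameters that appear in the SAS listing; the function names are hypothetical.

```python
import math
import random

# Input parameters as they appear in the SAS listing
INTERCEPT, BETA1, BETA2 = -10.8967, 0.4565, -0.00509
MEAN_HEIGHT, SD_HEIGHT = 1.627, 0.069      # metres
MEAN_WEIGHT, SD_WEIGHT = 64.761, 12.320    # kilograms (observed SD)


def predicted_risk(bmi):
    """Logistic prediction with linear and quadratic BMI terms."""
    logit = INTERCEPT + BETA1 * bmi + BETA2 * bmi ** 2
    return 1.0 / (1.0 + math.exp(-logit))


def simulate(n, icc_weight, seed=55):
    """Return (risk from true BMI, risk from observed BMI) for n subjects.

    Assumption: the observed weight variance is split into a true part
    (ICC * variance) and a random-error part ((1 - ICC) * variance).
    """
    rng = random.Random(seed)
    sd_true = SD_WEIGHT * math.sqrt(icc_weight)
    sd_error = SD_WEIGHT * math.sqrt(1.0 - icc_weight)
    results = []
    for _ in range(n):
        height = rng.gauss(MEAN_HEIGHT, SD_HEIGHT)
        weight_true = rng.gauss(MEAN_WEIGHT, sd_true)
        weight_obs = weight_true + rng.gauss(0.0, sd_error)
        bmi_true = weight_true / height ** 2
        bmi_obs = weight_obs / height ** 2
        results.append((predicted_risk(bmi_true), predicted_risk(bmi_obs)))
    return results


risks = simulate(1000, icc_weight=0.8)
```

The paired risks can then be compared with the discrimination and calibration measures from the glossary, which corresponds to step 6.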

SAS code

***** INPUT PARAMETERS *****;
%LET TOTREP = 500;
%LET N = 10618;
%LET BETA1 = 0.4565;
%LET SEED1 = 55;
%LET MEANHEIGHT = 1.627;
%LET MEANWEIGHT = 64.761;
%LET STDHEIGHT_OBS = 0.069;
%LET STDWEIGHT_OBS = 12.320;
%LET SEEDU = 8971;
%LET BETA2 = -0.00509;
%LET PREV = 0.0736;
%let intercept = -10.8967;
%LET SEEDE = 4567;
%LET STDBMIe = 5.00;
%LET SEED2 = 123;
%LET SEED3 = 15;
%LET SEED4 = 2345;
%LET SEED5 = 4513;
%LET SEED6 = 9876;
%LET CORRHW = 0.311;
%LET ICCHEIGHT = 0.9;
%LET ICCWEIGHT = 0.9;
%let b_wt = 0;
%let b_ht = 0;

/* Record the start time */
data time;
  format start time.;
  start = time();
  output;
run;

/* Partition observed variances into true and error components */
DATA TEMP;
  A1 = &STDHEIGHT_OBS;
  B1 = &STDWEIGHT_OBS * &CORRHW;
  B2 = SQRT(&STDWEIGHT_OBS**2 - B1**2);
  if &ICCHEIGHT = 1.0 then varht_e = 0;
  else if &ICCHEIGHT ne 1.0 then VARHT_E = &STDHEIGHT_OBS*&STDHEIGHT_OBS*(1-&ICCHEIGHT);
  VARHT_TRUE = &STDHEIGHT_OBS*&STDHEIGHT_OBS - VARHT_E;
  VAR_OBS_HT = &STDHEIGHT_OBS*&STDHEIGHT_OBS;
  if &ICCwEIGHT = 1.0 then varwt_e = 0;
  else if &ICCwEIGHT ne 1.0 then VARWT_E = &STDWEIGHT_OBS*&STDWEIGHT_OBS*(1-&ICCWEIGHT);
  VARWT_TRUE = &STDWEIGHT_OBS*&STDWEIGHT_OBS - VARWT_E;
  VAR_OBS_WT = &STDWEIGHT_OBS*&STDWEIGHT_OBS;
  corrtrue = (&corrhw)/sqrt(&iccheight * &iccweight);
  A2 = SQRT(VARHT_TRUE);
  B3 = SQRT(VARWT_TRUE) * CORRTRUE;
  B4 = SQRT(VARWT_TRUE - B3**2);
  if &b_wt = 0 then meanweight_true = &MEANWEIGHT;
  else if &b_wt ne 0 then meanweight_true = &MEANWEIGHT - &b_wt;
  if &b_ht = 0 then meanheight_true = &MEANhEIGHT;
  else if &b_ht ne 0 then meanheight_true = &MEANheIGHT - &b_ht;
RUN;

PROC MEANS DATA=TEMP mean;
  VAR meanweight_true meanheight_true;
run;

/* Simulate observed and true height, weight and BMI */
DATA TEMP;
  SET TEMP;
  DO REP = 1 TO &TOTREP;
    DO IND = 1 TO &N;
      Z1 = RANNOR(&SEED1);
      Z2 = RANNOR(&SEED2);
      Z3 = RANNOR(&SEED3);
      Z4 = RANNOR(&SEED4);
      Z5 = RANNOR(&SEED5);
      Z6 = RANNOR(&SEED6);
      U = RANUNI(&SEEDU);
      ERROR = RANNOR(&SEEDE);
      Y1 = A1 * Z1;
      Y2 = B1 * Z1 + B2 * Z2;
      Y3 = A2 * Z1;
      Y4 = B3 * Z1 + B4 * Z2;
      HEIGHT_OBS = &MEANHEIGHT + Y1;
      WEIGHT_OBS = &MEANWEIGHT + Y2;
      BMI_OBS = WEIGHT_OBS/(HEIGHT_OBS*HEIGHT_OBS);
      BMI_OBS_SQ = BMI_OBS*BMI_OBS;
      HEIGHT_TRUE = meanheight_true + Y3;
      WEIGHT_TRUE = meanweight_true + Y4;
      BMI_TRUE = WEIGHT_TRUE/(HEIGHT_TRUE*HEIGHT_TRUE);
      BMI_TRUE_SQ = BMI_TRUE*BMI_TRUE;
      output;
    end;
  end;
RUN;

/* Predicted probability and simulated outcome from true and observed BMI */
DATA TEMP;
  SET TEMP;
  Logit = &intercept + &BETA1 * BMI_TRUE + &BETA2 * BMI_TRUE_SQ;
  elogit = exp(logit);
  prob = elogit / (1 + elogit);
  LOGIT_obs = &intercept + &BETA1 * BMI_obs + &BETA2 * BMI_obs_SQ;
  elogit_obs = exp(logit_obs);
  prob_obs = elogit_obs / (1 + elogit_obs);
  DIABETES = 0;
  IF U LE PROB THEN DIABETES = 1;
  NEWDIABETES = 1 - DIABETES;
  diabetes_obs = 0;
  IF U LE PROB_obs THEN DIABETES_obs = 1;
  NEWDIABETES_OBS = 1 - DIABETES_OBS;
RUN;

proc sort data=temp;
  by rep;
run;

PROC MEANS DATA=TEMP mean median min max noprint;
  VAR PROB prob_obs diabetes diabetes_obs;
  by rep;
  output out=means
    mean = probmean probobsmean diabetestrue diabetesobs
    median = medtrue medobs
    min = mintrue minobs
    max = maxtrue maxobs;
RUN;

proc means data=means mean;
  var probmean medtrue mintrue maxtrue;
  title1 "Females iccheight = &iccheight iccweight = &iccweight";
  title2 "Females weight bias = &b_wt height bias = &b_ht";
run;

/* Sensitivity */
ods output BinomialProp = Sens;
ods listing close;
proc freq data=temp (where = (diabetes = 1));
  by rep;
  tables NEWdiabetes_obs;
  exact binomial;
run;
ods listing;

data Sens1;
  set sens;
  if Name1 = '_BIN_';
  Sensitivity = CValue1 + 0;
run;

proc means data=Sens1;
  Var Sensitivity;
run;

/* Specificity */
ods output BinomialProp = SPEC;
ods listing close;
proc freq data=temp (where = (diabetes = 0));
  by rep;
  tables diabetes_obs;
  exact binomial;
run;
ods listing;

data Spec1;
  set spec;
  if Name1 = '_BIN_';
  Specificity = CValue1 + 0;
run;

proc means data=Spec1;
  Var Specificity;
run;

/* Correlation between probabilities from true and observed BMI */
ods output PearsonCorr = Corr;
ods listing close;
proc corr data=temp;
  by rep;
  var prob prob_obs;
run;
ods listing;

proc means data=Corr (where = (Variable = 'prob'));
  var prob_obs;
run;

/* Hosmer-Lemeshow statistic by decile of predicted risk */
proc rank data=temp out=rankout groups=10;
  var prob_obs;
  ranks deciles;
  by rep;
run;

proc sort data=rankout;
  by rep deciles;
run;

proc means data=rankout sum noprint;
  by rep deciles;
  var prob_obs diabetes;
  output out=hldiab sum=trueprobsum diabsum;
run;

data hlgof;
  set hldiab;
  N = _freq_;
  chi = (N*(diabsum - trueprobsum)**2)/(trueprobsum*(N-trueprobsum));
run;

proc means data=hlgof sum noprint;
  var chi;
  by rep;
  output out=totalchi sum=chi_total;
run;

data totalchi;
  set totalchi;
  chisqcut = 0;
  if chi_total ge 20 then chisqcut = 1;
  pvalue = 1 - probchi(chi_total, 8);
run;

proc means data=totalchi;
  var Chi_total pvalue;
run;

proc freq data=totalchi;
  tables chisqcut;
RUN;

/* C statistic generation from Wilcoxon rank sums */
ods output WilcoxonScores = Wranks;
ods listing close;
proc npar1way data=temp wilcoxon;
  class diabetes;
  var prob_obs;
  by rep;
run;
ods listing;

data Wranks (keep = class N SumofScores numerator denominator C_stat);
  set Wranks;
  if class = 1;
  denominator = N*(&N - N);
  numerator = SumofScores - (N*(N+1)/2);
  C_stat = numerator/denominator;
  by rep;
run;

proc means data=Wranks;
  var C_stat;
run;

/* Record the stop time and report the run time */
data time;
  set time;
  format stop time.;
  stop = time();
  TIMETORUN = stop - start;
  format timetorun mmss.;
  output;
run;

proc print data=time;
  title 'TIME TAKEN TO RUN THE PROGRAM';
RUN;

6.4 Example of ICC related to BMI distributions

ICC (ρ) is defined as

(II) $$\rho = \frac{\sigma^2_{true}}{\sigma^2_{true} + \sigma^2_{error}}$$

When no error exists, $\sigma^2_{error} = 0$ and thus $\rho = \sigma^2_{true} / (\sigma^2_{true} + 0) = 1.0$.

The distribution of BMI in our survey data (for females) is represented below:

[Figure: histogram of observed BMI in the survey data (females); x-axis: BMI_OBS, y-axis: Percent]

In the case where ICC = 1.0 (no error in self-reported weight) the true BMI distribution will look identical:

[Figure: histogram of true BMI for ICC = 1.0; x-axis: BMI_TRUE, y-axis: Percent]

However, if 20% of the variability in observed BMI is due to random error (e.g. non-differential reporting error), then the true distribution of BMI will be:

[Figure: histogram of true BMI for ICC = 0.8; x-axis: BMI_TRUE, y-axis: Percent]

In the case of even more error, it would look as follows:

[Figure: histogram of true BMI for ICC = 0.6; x-axis: BMI_TRUE, y-axis: Percent]

Therefore, an increase in reporting error results in a wider variance and distribution of BMI observed from the survey.
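The relationship between the ICC and the two variance components can be made concrete with a small worked example in Python. This is an illustrative sketch only: the function name `partition_variance` is hypothetical, and it follows the convention used in the SAS code of section 6.3, where the observed variance is held fixed and split into true and error parts according to the ICC. The observed standard deviation of weight (12.320 kg) is the value from the SAS input parameters.

```python
def partition_variance(sd_observed, icc):
    """Split an observed variance into (true, error) parts for a given ICC.

    error variance = observed variance * (1 - ICC)
    true variance  = observed variance - error variance
    """
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC must lie between 0 and 1")
    var_observed = sd_observed ** 2
    var_error = var_observed * (1.0 - icc)
    var_true = var_observed - var_error
    return var_true, var_error


# Observed SD of weight from the SAS input parameters, with ICC = 0.8
var_true, var_error = partition_variance(12.320, 0.8)
icc_check = var_true / (var_true + var_error)
```

With these inputs the observed variance of about 151.78 kg² divides into roughly 121.43 kg² of true variance and 30.36 kg² of error variance, and the ratio in equation (II) returns the ICC of 0.8.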


6.5 Full results from simulations

6.5 a) Results from random error in males

Height ICC | Weight ICC | Predicted probability: Mean | Median | (Min, Max) | rHW | Mean H-L | Mean P-value | % H-L < 20 | rpo | O-t | Sensitivity | Specificity
1.0 | 1.0 | 9.31% | 8.17% | (0.16%, 38.11%) | 0.475 | 9.95 | 0.3689 | 97.0% | 1.0000 | 0% | 100.00% | 100.00%
1.0 | 0.9 | 9.18% | 8.17% | (0.22%, 36.68%) | 0.501 | 11.50 | 0.2919 | 92.6% | 0.9999 | 0.13% | 99.14% | 99.76%
1.0 | 0.8 | 9.05% | 8.17% | (0.32%, 34.92%) | 0.531 | 16.33 | 0.1397 | 71.6% | 0.9997 | 0.27% | 98.12% | 99.52%
1.0 | 0.7 | 8.91% | 8.17% | (0.45%, 32.75%) | 0.568 | 25.21 | 0.0323 | 30.8% | 0.9991 | 0.40% | 96.94% | 99.26%
1.0 | 0.6 | 8.76% | 8.17% | (0.67%, 30.12%) | 0.613 | 40.44 | 0.0023 | 3.4% | 0.9982 | 0.55% | 95.49% | 98.97%
1.0 | 0.5 | 8.62% | 8.17% | (0.10%, 26.85%) | 0.672 | 65.12 | 0.0000 | 0.0% | 0.9965 | 0.70% | 93.70% | 98.65%
0.9 | 1.0 | 9.26% | 8.17% | (0.16%, 37.33%) | 0.501 | 10.16 | 0.3595 | 96.6% | 0.9980 | 0.05% | 98.78% | 99.82%
0.9 | 0.9 | 9.13% | 8.17% | (0.23%, 35.77%) | 0.528 | 12.56 | 0.2519 | 90.6% | 0.9977 | 0.18% | 98.49% | 99.64%
0.9 | 0.8 | 8.99% | 8.17% | (0.32%, 33.87%) | 0.560 | 18.52 | 0.0990 | 62.8% | 0.9971 | 0.32% | 97.62% | 99.41%
0.9 | 0.7 | 8.85% | 8.17% | (0.47%, 31.56%) | 0.598 | 28.95 | 0.0155 | 18.2% | 0.9961 | 0.46% | 96.46% | 99.15%
0.9 | 0.6 | 8.71% | 8.17% | (0.69%, 28.75%) | 0.646 | 46.44 | 0.0006 | 1.2% | 0.9944 | 0.60% | 94.97% | 98.86%
0.9 | 0.5 | 8.56% | 8.18% | (1.11%, 25.28%) | 0.708 | 74.17 | 0.0000 | 0.0% | 0.9915 | 0.75% | 93.11% | 98.53%
0.8 | 1.0 | 9.21% | 8.18% | (0.17%, 36.50%) | 0.531 | 10.79 | 0.3269 | 95.0% | 0.9906 | 0.10% | 97.41% | 99.62%
0.8 | 0.9 | 9.07% | 8.18% | (0.23%, 34.80%) | 0.560 | 14.43 | 0.1875 | 81.8% | 0.9893 | 0.24% | 97.28% | 99.46%
0.8 | 0.8 | 8.94% | 8.18% | (0.33%, 32.76%) | 0.594 | 21.79 | 0.0544 | 47.0% | 0.9873 | 0.37% | 96.57% | 99.25%
0.8 | 0.7 | 8.79% | 8.18% | (0.48%, 30.30%) | 0.635 | 34.26 | 0.0053 | 9.0% | 0.9842 | 0.52% | 95.49% | 99.00%
0.8 | 0.6 | 8.65% | 8.18% | (0.72%, 27.32%) | 0.685 | 54.58 | 0.0001 | 0.2% | 0.9792 | 0.66% | 94.04% | 98.71%
0.8 | 0.5 | 8.50% | 8.18% | (1.11%, 23.65%) | 0.751 | 86.65 | 0.0000 | 0.0% | 0.9703 | 0.81% | 92.10% | 98.38%
0.7 | 1.0 | 9.15% | 8.19% | (0.17%, 35.59%) | 0.568 | 12.28 | 0.2636 | 90.2% | 0.9751 | 0.16% | 95.75% | 99.39%
0.7 | 0.9 | 9.02% | 8.19% | (0.24%, 33.76%) | 0.598 | 17.56 | 0.1137 | 66.6% | 0.9713 | 0.29% | 95.60% | 99.24%
0.7 | 0.8 | 8.89% | 8.19% | (0.34%, 31.58%) | 0.623 | 26.98 | 0.0204 | 23.6% | 0.9660 | 0.29% | 95.03% | 99.04%
0.7 | 0.7 | 8.74% | 8.19% | (0.50%, 28.97%) | 0.678 | 42.31 | 0.0011 | 2.2% | 0.9579 | 0.57% | 94.04% | 98.80%
0.7 | 0.6 | 8.59% | 8.19% | (0.75%, 25.82%) | 0.733 | 66.78 | 0.0000 | 0.0% | 0.9446 | 0.59% | 92.56% | 98.52%
0.7 | 0.5 | 8.44% | 8.19% | (1.12%, 21.95%) | 0.803 | 106.77 | 0.0000 | 0.0% | 0.9187 | 0.74% | 90.48% | 98.17%
0.6 | 1.0 | 9.01% | 8.19% | (0.17%, 34.74%) | 0.613 | 15.42 | 0.1615 | 78.0% | 0.9462 | 0.30% | 93.65% | 99.13%
0.6 | 0.9 | 8.97% | 8.20% | (0.24%, 32.68%) | 0.646 | 23.06 | 0.0423 | 41.0% | 0.9377 | 0.34% | 93.50% | 98.98%
0.6 | 0.8 | 8.83% | 8.20% | (0.35%, 30.36%) | 0.685 | 35.55 | 0.0037 | 7.8% | 0.9254 | 0.48% | 92.85% | 98.78%
0.6 | 0.7 | 8.68% | 8.20% | (0.51%, 27.60%) | 0.733 | 55.64 | 0.0001 | 0.2% | 0.9066 | 0.63% | 91.93% | 98.54%
0.6 | 0.6 | 8.53% | 8.20% | (0.77%, 24.26%) | 0.791 | 87.75 | 0.0000 | 0.0% | 0.8740 | 0.78% | 90.36% | 98.25%
0.6 | 0.5 | 8.38% | 8.21% | (1.24%, 20.15%) | 0.867 | 142.23 | 0.0000 | 0.0% | 0.8035 | 0.93% | 87.90% | 97.88%
0.5 | 1.0 | 9.05% | 8.20% | (0.17%, 33.63%) | 0.672 | 22.22 | 0.0518 | 46.2% | 0.8942 | 0.26% | 91.04% | 98.81%
0.5 | 0.9 | 8.91% | 8.20% | (0.25%, 31.55%) | 0.708 | 33.670 | 0.0052 | 10.4% | 0.8761 | 0.40% | 90.77% | 98.65%
0.5 | 0.8 | 8.77% | 8.21% | (0.35%, 29.09%) | 0.751 | 51.852 | 0.0001 | 0.2% | 0.8496 | 0.54% | 90.13% | 98.45%
0.5 | 0.7 | 8.63% | 8.21% | (0.52%, 26.14%) | 0.803 | 80.593 | 0.0000 | 0.0% | 0.8070 | 0.68% | 88.92% | 98.20%
0.5 | 0.6 | 8.48% | 8.21% | (0.80%, 22.57%) | 0.867 | 128.457 | 0.0000 | 0.0% | 0.7273 | 0.83% | 86.96% | 97.88%
0.5 | 0.5 | 8.33% | 8.21% | (0.13%, 18.07%) | 0.950 | 225.916 | 0.0000 | 0.0% | 0.5130 | 0.99% | 83.39% | 97.42%

6.5 b) Results from random error in females

Height ICC | Weight ICC | Predicted probability: Mean | Median | (Min, Max) | rHW | Mean H-L | Mean P-value | % H-L ≥ 20 | % H-L < 20 | rpo | O-t | Sensitivity
1.0 | 1.0 | 7.20% | 5.82% | (0.04%, 33.49%) | 0.311 | 10.45 | 0.3356 | 3.2% | 96.8% | 1.0000 | 0 | 100%
1.0 | 0.9 | 7.07% | 5.82% | (0.06%, 33.02%) | 0.328 | 11.69 | 0.2861 | 7.2% | 92.8% | 0.9991 | 0.13% | 99.30%
1.0 | 0.8 | 6.95% | 5.82% | (0.83%, 32.38%) | 0.348 | 15.62 | 0.1561 | 25.20% | 74.8% | 0.9996 | 0.25% | 98.46%
1.0 | 0.7 | 6.82% | 5.82% | (0.12%, 31.22%) | 0.372 | 23.00 | 0.0456 | 56.6% | 43.4% | 0.9989 | 0.38% | 97.50%
1.0 | 0.6 | 6.68% | 5.82% | (0.19%, 27.77%) | 0.402 | 34.71 | 0.0043 | 92.60% | 7.4% | 0.9977 | 0.52% | 96.33%
1.0 | 0.5 | 6.54% | 5.82% | (0.29%, 27.83%) | 0.440 | 52.83 | 0.0001 | 99.8% | 0.2% | 0.9955 | 0.66% | 94.89%
0.9 | 1.0 | 7.16% | 5.82% | (0.04%, 33.29%) | 0.328 | 10.50 | 0.3361 | 3.8% | 96.2% | 0.9990 | 0.04% | 99.05%
0.9 | 0.9 | 7.04% | 5.82% | (0.06%, 32.70%) | 0.346 | 12.32 | 0.2617 | 9.80% | 90.2% | 0.9991 | 0.16% | 98.96%
0.9 | 0.8 | 6.91% | 5.82% | (0.08%, 31.84%) | 0.367 | 16.94 | 0.1273 | 30.40% | 69.6% | 0.9991 | 0.29% | 98.29%
0.9 | 0.7 | 6.77% | 5.82% | (0.13%, 30.64%) | 0.382 | 25.05 | 0.0291 | 67.00% | 33.0% | 0.9989 | 0.43% | 97.36%
0.9 | 0.6 | 6.63% | 5.82% | (0.19%, 29.04%) | 0.423 | 37.72 | 0.0021 | 97.0% | 3.0% | 0.9983 | 0.57% | 96.22%
0.9 | 0.5 | 6.49% | 5.82% | (0.30%, 26.94%) | 0.464 | 57.42 | 0.0000 | 100.0% | 0.0% | 0.9972 | 0.71% | 94.75%
0.8 | 1.0 | 7.12% | 5.82% | (0.04%, 33.02%) | 0.348 | 10.83 | 0.3198 | 4.8% | 95.2% | 0.9953 | 0.08% | 97.75%
0.8 | 0.9 | 7.00% | 5.83% | (0.06%, 32.32%) | 0.367 | 13.37 | 0.2219 | 13.0% | 87.0% | 0.9955 | 0.21% | 98.09%
0.8 | 0.8 | 6.87% | 5.83% | (0.08%, 31.33%) | 0.389 | 18.62 | 0.0941 | 35.4% | 64.6% | 0.9955 | 0.33% | 97.68%
0.8 | 0.7 | 6.73% | 5.83% | (0.13%, 29.89%) | 0.416 | 27.72 | 0.0172 | 75.4% | 24.6% | 0.9955 | 0.47% | 96.88%
0.8 | 0.6 | 6.59% | 5.83% | (0.20%, 28.24%) | 0.449 | 41.81 | 0.0009 | 98.2% | 1.8% | 0.9950 | 0.61% | 95.80%
0.8 | 0.5 | 6.45% | 5.83% | (0.31%, 25.99%) | 0.492 | 62.90 | 0.0000 | 100.0% | 0.0% | 0.9942 | 0.75% | 94.38%
0.7 | 1.0 | 7.08% | 5.83% | (0.04%, 32.68%) | 0.372 | 11.48 | 0.2929 | 6.80% | 93.2% | 0.9879 | 0.12% | 96.61%
0.7 | 0.9 | 6.95% | 5.83% | (0.06%, 31.85%) | 0.392 | 19.40 | 0.1726 | 19.4% | 80.6% | 0.9877 | 0.25% | 96.87%
0.7 | 0.8 | 6.82% | 5.83% | (0.09%, 30.73%) | 0.416 | 21.21 | 0.0563 | 47.8% | 52.2% | 0.9873 | 0.38% | 96.65%
0.7 | 0.7 | 6.69% | 5.83% | (0.13%, 29.25%) | 0.444 | 31.47 | 0.0081 | 87.2% | 12.8% | 0.9867 | 0.51% | 96.02%
0.7 | 0.6 | 6.55% | 5.83% | (0.20%, 27.36%) | 0.480 | 47.02 | 0.0002 | 99.8% | 0.2% | 0.9856 | 0.65% | 95.03%
0.7 | 0.5 | 6.40% | 5.83% | (0.32%, 24.95%) | 0.526 | 70.37 | 0.0000 | 100.00% | 0.0% | 0.9839 | 0.80% | 93.65%
0.6 | 1.0 | 7.04% | 5.83% | (0.04%, 32.29%) | 0.402 | 12.80 | 0.2360 | 11.00% | 89.0% | 0.9748 | 0.16% | 95.03%
0.6 | 0.9 | 6.91% | 5.84% | (0.06%, 31.34%) | 0.423 | 17.36 | 0.1116 | 29.40% | 70.6% | 0.9736 | 0.29% | 95.31%
0.6 | 0.8 | 6.78% | 5.84% | (0.09%, 30.09%) | 0.449 | 24.86 | 0.0292 | 68.60% | 31.4% | 0.9720 | 0.42% | 95.20%
0.6 | 0.7 | 6.64% | 5.84% | (0.13%, 28.48%) | 0.480 | 36.84 | 0.0026 | 95.20% | 4.8% | 0.9698 | 0.56% | 94.72%
0.6 | 0.6 | 6.50% | 5.84% | (0.21%, 26.44%) | 0.518 | 54.59 | 0.0000 | 100.00% | 0.0% | 0.9666 | 0.70% | 93.84%
0.6 | 0.5 | 6.36% | 5.84% | (0.33%, 23.88%) | 0.568 | 80.58 | 0.0000 | 100.00% | 0.0% | 0.9616 | 0.84% | 92.51%
0.5 | 1.0 | 7.00% | 5.84% | (0.04%, 31.84%) | 0.440 | 15.28 | 0.1582 | 20.20% | 79.8% | 0.9525 | 0.00% | 93.08%
0.5 | 0.9 | 6.87% | 5.84% | (0.06%, 30.77%) | 0.464 | 21.25 | 0.0547 | 50.00% | 50.0% | 0.9494 | 0.33% | 93.36%
0.5 | 0.8 | 6.74% | 5.84% | (0.09%, 29.38%) | 0.492 | 30.81 | 0.0084 | 85.2% | 14.8% | 0.9452 | 0.46% | 93.31%
0.5 | 0.7 | 6.60% | 5.84% | (0.14%, 27.64%) | 0.526 | 45.26 | 0.0003 | 98.80% | 1.2% | 0.9395 | 0.60% | 92.87%
0.5 | 0.6 | 6.46% | 5.84% | (0.21%, 25.46%) | 0.568 | 66.06 | 0.0000 | 100.00% | 0.0% | 0.9313 | 0.74% | 92.07%
0.5 | 0.5 | 6.31% | 5.84% | (0.34%, 22.76%) | 0.622 | 96.18 | 0.0000 | 100.00% | 0.0% | 0.9186 | 0.89% | 90.80%

Table 6 c) C-statistics under random error for males and females

Height ICC | Weight ICC | Males C-statistic: Mean | StDev | Min | Max | Females C-statistic: Mean | StDev | Min | Max
1.0 | 1.0 | 0.6858 | 0.0082 | 0.6628 | 0.7134 | 0.7179 | 0.0091 | 0.6933 | 0.7424
1.0 | 0.9 | 0.6760 | 0.0091 | 0.6512 | 0.7031 | 0.7097 | 0.0094 | 0.6811 | 0.7358
1.0 | 0.8 | 0.6649 | 0.0094 | 0.6397 | 0.6943 | 0.7002 | 0.0096 | 0.6710 | 0.7277
1.0 | 0.7 | 0.6525 | 0.0097 | 0.6243 | 0.6806 | 0.6894 | 0.0010 | 0.6605 | 0.7202
1.0 | 0.6 | 0.6377 | 0.0099 | 0.6090 | 0.6654 | 0.6770 | 0.0100 | 0.6470 | 0.7035
1.0 | 0.5 | 0.6199 | 0.0102 | 0.5928 | 0.6519 | 0.6625 | 0.0103 | 0.6342 | 0.6920
0.9 | 1.0 | 0.6827 | 0.0090 | 0.6597 | 0.7093 | 0.7158 | 0.0092 | 0.6899 | 0.7401
0.9 | 0.9 | 0.6727 | 0.0092 | 0.6462 | 0.6991 | 0.7074 | 0.0009 | 0.6803 | 0.7316
0.9 | 0.8 | 0.6613 | 0.0095 | 0.6345 | 0.6903 | 0.6978 | 0.0096 | 0.6668 | 0.7271
0.9 | 0.7 | 0.6482 | 0.0097 | 0.6198 | 0.6758 | 0.6868 | 0.0099 | 0.6579 | 0.7166
0.9 | 0.6 | 0.6327 | 0.0100 | 0.6058 | 0.6607 | 0.6743 | 0.0100 | 0.6469 | 0.7033
0.9 | 0.5 | 0.6131 | 0.0102 | 0.5862 | 0.6467 | 0.6592 | 0.0104 | 0.6296 | 0.6869
0.8 | 1.0 | 0.6787 | 0.0091 | 0.6558 | 0.7033 | 0.7131 | 0.0093 | 0.6873 | 0.7381
0.8 | 0.9 | 0.6683 | 0.0094 | 0.6439 | 0.6949 | 0.7044 | 0.0095 | 0.6757 | 0.7315
0.8 | 0.8 | 0.6565 | 0.0096 | 0.6291 | 0.6853 | 0.6949 | 0.0097 | 0.6643 | 0.7246
0.8 | 0.7 | 0.6428 | 0.0098 | 0.6164 | 0.6720 | 0.6836 | 0.0098 | 0.6566 | 0.7139
0.8 | 0.6 | 0.6265 | 0.0101 | 0.5978 | 0.6575 | 0.6706 | 0.0101 | 0.6424 | 0.6989
0.8 | 0.5 | 0.6067 | 0.0101 | 0.5791 | 0.6407 | 0.6553 | 0.0105 | 0.6250 | 0.6841
0.7 | 1.0 | 0.6735 | 0.0092 | 0.6507 | 0.7009 | 0.7097 | 0.0093 | 0.6828 | 0.7353
0.7 | 0.9 | 0.6625 | 0.0095 | 0.6364 | 0.6885 | 0.7009 | 0.0095 | 0.6725 | 0.7291
0.7 | 0.8 | 0.6499 | 0.0096 | 0.6251 | 0.6778 | 0.6909 | 0.0097 | 0.6586 | 0.7220
0.7 | 0.7 | 0.6354 | 0.0098 | 0.6102 | 0.6666 | 0.6795 | 0.0099 | 0.6493 | 0.7090
0.7 | 0.6 | 0.6181 | 0.0101 | 0.5888 | 0.6499 | 0.6661 | 0.0101 | 0.6394 | 0.6980
0.7 | 0.5 | 0.5960 | 0.0102 | 0.5700 | 0.6266 | 0.6502 | 0.0106 | 0.6215 | 0.6795
0.6 | 1.0 | 0.6659 | 0.0094 | 0.6407 | 0.6930 | 0.7053 | 0.0094 | 0.6799 | 0.7339
0.6 | 0.9 | 0.6542 | 0.0095 | 0.6293 | 0.6806 | 0.6960 | 0.0095 | 0.6667 | 0.7242
0.6 | 0.8 | 0.6409 | 0.0098 | 0.6149 | 0.6683 | 0.6858 | 0.0010 | 0.6556 | 0.7157
0.6 | 0.7 | 0.6250 | 0.0099 | 0.5959 | 0.6552 | 0.6741 | 0.0101 | 0.6451 | 0.7017
0.6 | 0.6 | 0.6054 | 0.0102 | 0.5772 | 0.6347 | 0.6601 | 0.0102 | 0.6336 | 0.6913
0.6 | 0.5 | 0.5796 | 0.0104 | 0.5520 | 0.6145 | 0.6435 | 0.0106 | 0.6104 | 0.6719
0.5 | 1.0 | 0.6550 | 0.0095 | 0.6279 | 0.6810 | 0.6993 | 0.0095 | 0.6720 | 0.7286
0.5 | 0.9 | 0.6423 | 0.0099 | 0.6148 | 0.6708 | 0.6898 | 0.0097 | 0.6625 | 0.7202
0.5 | 0.8 | 0.6272 | 0.0098 | 0.6014 | 0.6535 | 0.6790 | 0.0099 | 0.6516 | 0.7068
0.5 | 0.7 | 0.6089 | 0.0103 | 0.5824 | 0.6387 | 0.6664 | 0.0102 | 0.6392 | 0.6947
0.5 | 0.6 | 0.5852 | 0.0103 | 0.5591 | 0.6188 | 0.6518 | 0.0103 | 0.6238 | 0.6804
0.5 | 0.5 | 0.5492 | 0.0108 | 0.5193 | 0.5841 | 0.6344 | 0.0107 | 0.5987 | 0.6616

Table 6d) Results from bias simulations for males

Height ICC | Weight ICC | Bias Height | Bias Weight | Mean | o-t | Median | (Min, Max)
1.0 | 1.0 | 0 | 0 | 9.31% | 0.00% | 8.17% | (0.16%, 38.11%)
1.0 | 1.0 | 0 | Underestimate by 0.5 kg | 9.54% | -0.23% | 8.40% | (0.17%, 38.34%)
1.0 | 1.0 | 0 | Underestimate by 1.0 kg | 9.77% | -0.46% | 8.64% | (0.18%, 38.57%)
1.0 | 1.0 | 0 | Underestimate by 1.5 kg | 10.00% | -0.69% | 8.88% | (0.19%, 38.78%)
1.0 | 1.0 | 0 | Underestimate by 2.0 kg | 10.25% | -0.94% | 9.12% | (0.20%, 38.99%)
1.0 | 1.0 | 0 | Underestimate by 2.5 kg | 10.49% | -1.18% | 9.37% | (0.21%, 39.19%)
1.0 | 1.0 | 0 | Underestimate by 3.0 kg | 10.74% | -1.43% | 9.63% | (0.23%, 39.39%)
1.0 | 1.0 | Overestimate by 0.5 cm | 0 | 9.53% | -0.22% | 8.38% | (0.17%, 38.43%)
1.0 | 1.0 | Overestimate by 1.0 cm | 0 | 9.76% | -0.45% | 8.60% | (0.17%, 38.74%)
1.0 | 1.0 | Overestimate by 1.5 cm | 0 | 10.00% | -0.69% | 8.82% | (0.17%, 39.04%)
1.0 | 1.0 | Overestimate by 2.0 cm | 0 | 10.23% | -0.92% | 9.06% | (0.18%, 39.33%)
1.0 | 1.0 | Overestimate by 2.5 cm | 0 | 10.47% | -1.16% | 9.30% | (0.18%, 39.60%)
1.0 | 1.0 | Overestimate by 3.0 cm | 0 | 10.72% | -1.41% | 9.54% | (0.18%, 39.87%)
1.0 | 1.0 | Overestimate by 2.5 cm | Underestimate by 1.7 kg | 10.22% | -0.91% | 9.09% | (0.20%, 39.01%)
1.0 | 1.0 | Overestimate by 2.5 cm | Underestimate by 3.2 kg | 12.14% | -2.83% | 11.04% | (0.26%, 40.70%)
1.0 | 1.0 | Overestimate by 5.0 cm | Underestimate by 1.7 kg | 12.70% | -3.39% | 11.58% | (0.25%, 41.28%)
1.0 | 1.0 | Overestimate by 5.0 cm | Underestimate by 3.2 kg | 13.56% | -4.25% | 12.50% | (0.29%, 41.62%)
0.9 | 0.9 | Overestimate by 2.5 cm | Underestimate by 1.7 kg | 10.04% | -0.73% | 9.10% | (0.28%, 36.85%)
0.9 | 0.9 | Overestimate by 2.5 cm | Underestimate by 3.2 kg | 11.97% | -2.66% | 11.04% | (0.37%, 39.02%)
0.9 | 0.9 | Overestimate by 5.0 cm | Underestimate by 1.7 kg | 12.52% | -3.21% | 11.59% | (0.35%, 39.83%)
0.9 | 0.9 | Overestimate by 5.0 cm | Underestimate by 3.2 kg | 13.39% | -4.08% | 0.24% | (0.41%, 40.36%)
0.8 | 0.8 | Overestimate by 2.5 cm | Underestimate by 1.7 kg | 9.85% | -0.54% | 9.11% | (0.40%, 34.00%)
0.8 | 0.8 | Overestimate by 2.5 cm | Underestimate by 3.2 kg | 11.77% | -2.46% | 11.05% | (0.53%, 36.56%)
0.8 | 0.8 | Overestimate by 5.0 cm | Underestimate by 1.7 kg | 12.33% | -3.02% | 11.60% | (0.51%, 37.54%)
0.8 | 0.8 | Overestimate by 5.0 cm | Underestimate by 3.2 kg | 13.21% | -3.90% | 12.52% | (0.60%, 38.27%)

Table 6e) Results from bias simulations for females

Height ICC | Weight ICC | Bias Height | Bias Weight | Mean | o-t | Median | (Min, Max)
1.0 | 1.0 | 0 | 0 | 7.20% | 0.00% | 5.82% | (0.04%, 33.49%)
1.0 | 1.0 | 0 | Underestimate by 0.5 kg | 7.42% | -0.22% | 6.03% | (0.05%, 33.57%)
1.0 | 1.0 | 0 | Underestimate by 1.0 kg | 7.64% | -0.44% | 6.26% | (0.05%, 33.64%)
1.0 | 1.0 | 0 | Underestimate by 1.5 kg | 7.87% | -0.67% | 6.49% | (0.05%, 33.70%)
1.0 | 1.0 | 0 | Underestimate by 2.0 kg | 8.10% | -0.90% | 6.72% | (0.05%, 33.76%)
1.0 | 1.0 | 0 | Underestimate by 2.5 kg | 8.33% | -1.13% | 6.96% | (0.06%, 33.80%)
1.0 | 1.0 | Overestimate by 3.0 cm | Underestimate by 3.0 kg | 8.57% | -1.37% | 7.20% | (0.06%, 33.84%)
1.0 | 1.0 | Overestimate by 2.5 cm | 0 | 7.39% | -0.19% | 5.99% | (0.04%, 33.59%)
1.0 | 1.0 | Overestimate by 2.5 cm | 0 | 7.58% | -0.38% | 6.17% | (0.04%, 33.68%)
1.0 | 1.0 | Overestimate by 5.0 cm | 0 | 7.77% | -0.57% | 6.35% | (0.04%, 33.75%)
1.0 | 1.0 | Overestimate by 5.0 cm | 0 | 7.97% | -0.77% | 6.54% | (0.04%, 33.81%)
1.0 | 1.0 | Overestimate by 2.5 cm | 0 | 8.18% | -0.98% | 6.74% | (0.04%, 33.86%)
1.0 | 1.0 | Overestimate by 2.5 cm | 0 | 8.39% | -1.19% | 6.94% | (0.04%, 33.90%)
1.0 | 1.0 | Overestimate by 5.0 cm | Underestimate by 1.7 kg | 8.06% | -0.86% | 6.67% | (0.05%, 33.76%)
1.0 | 1.0 | Overestimate by 5.0 cm | Underestimate by 3.2 kg | 9.79% | -2.59% | 8.42% | (0.07%, 34.00%)
1.0 | 1.0 | Overestimate by 2.5 cm | Underestimate by 1.7 kg | 10.18% | -2.98% | 8.77% | (0.07%, 34.04%)
1.0 | 1.0 | Overestimate by 2.5 cm | Underestimate by 3.2 kg | 11.02% | -3.82% | 9.69% | (0.08%, 34.04%)
0.9 | 0.9 | Overestimate by 5.0 cm | Underestimate by 1.7 kg | 7.89% | -0.69% | 6.68% | (0.07%, 33.22%)
0.9 | 0.9 | Overestimate by 5.0 cm | Underestimate by 3.2 kg | 9.63% | -2.43% | 8.43% | (0.10%, 33.86%)
0.9 | 0.9 | Overestimate by 3.0 cm | Underestimate by 1.7 kg | 10.02% | -2.82% | 8.78% | (0.09%, 33.96%)
0.9 | 0.9 | Overestimate by 2.5 cm | Underestimate by 3.2 kg | 10.87% | -3.67% | 9.70% | (0.11%, 34.00%)
0.8 | 0.8 | Overestimate by 2.5 cm | Underestimate by 1.7 kg | 7.77% | -0.57% | 6.68% | (0.11%, 32.12%)
0.8 | 0.8 | Overestimate by 2.5 cm | Underestimate by 3.2 kg | 9.47% | -2.27% | 8.43% | (0.15%, 33.37%)
0.8 | 0.8 | Overestimate by 5.0 cm | Underestimate by 1.7 kg | 9.85% | -2.65% | 8.79% | (0.14%, 33.66%)
0.8 | 0.8 | Overestimate by 5.0 cm | Underestimate by 3.2 kg | 10.71% | -3.51% | 9.70% | (0.17%, 33.83%)

6e) H-L and C-statistics from bias simulations for males

Height ICC | Weight ICC | Bias Height | Bias Weight | Mean H-L | Mean P-value | % H-L < 20 | rop | Sensitivity | Specificity
1.0 | 1.0 | 0 | 0 | 9.951 | 0.3689 | 97.0% | 1.0000 | 100.00% | 100.00%
1.0 | 1.0 | 0 | Underestimate by 0.5 kg | 10.885 | 0.3239 | 95.4% | 1.0000 | 97.60% | 100.00%
1.0 | 1.0 | 0 | Underestimate by 1.0 kg | 13.036 | 0.2340 | 87.2% | 0.9999 | 95.29% | 100.00%
1.0 | 1.0 | 0 | Underestimate by 1.5 kg | 16.536 | 0.1343 | 73.0% | 0.9998 | 93.01% | 100.00%
1.0 | 1.0 | 0 | Underestimate by 2.0 kg | 21.415 | 0.0562 | 49.0% | 0.9997 | 90.86% | 100.00%
1.0 | 1.0 | 0 | Underestimate by 2.5 kg | 27.790 | 0.0190 | 23.2% | 0.9995 | 88.73% | 100.00%
1.0 | 1.0 | 0 | Underestimate by 3.0 kg | 35.653 | 0.0045 | 7.4% | 0.9993 | 86.68% | 100.00%
1.0 | 1.0 | Overestimate by 3.0 cm | 0 | 10.790 | 0.3274 | 95.4% | 1.0000 | 97.67% | 100.00%
1.0 | 1.0 | Overestimate by 2.5 cm | 0 | 12.801 | 0.2418 | 87.4% | 1.0000 | 95.39% | 100.00%
1.0 | 1.0 | Overestimate by 2.5 cm | 0 | 16.024 | 0.1470 | 75.0% | 0.9999 | 93.16% | 100.00%
1.0 | 1.0 | Overestimate by 5.0 cm | 0 | 20.609 | 0.0677 | 52.4% | 0.9998 | 91.00% | 100.00%
1.0 | 1.0 | Overestimate by 5.0 cm | 0 | 26.615 | 0.0237 | 25.8% | 0.9997 | 88.89% | 100.00%
1.0 | 1.0 | Overestimate by 2.5 cm | 0 | 34.137 | 0.0058 | 9.4% | 0.9995 | 86.82% | 100.00%
1.0 | 1.0 | Overestimate by 2.5 cm | Underestimate by 1.7 kg | 20.757 | 0.0647 | 51.4% | 0.9997 | 91.09% | 100.00%
1.0 | 1.0 | Overestimate by 5.0 cm | Underestimate by 3.2 kg | 106.266 | 0.0000 | 0.0% | 0.9977 | 76.69% | 100.00%
1.0 | 1.0 | Overestimate by 5.0 cm | Underestimate by 1.7 kg | 146.077 | 0.0000 | 0.0% | 0.9971 | 73.31% | 100.00%
1.0 | 1.0 | Overestimate by 2.5 cm | Underestimate by 3.2 kg | 224.760 | 0.0000 | 0.0% | 0.9954 | 68.65% | 100.00%
0.9 | 0.9 | Overestimate by 2.5 cm | Underestimate by 1.7 kg | 21.379 | 0.0574 | 50.2% | 0.9980 | 92.46% | 99.97%
0.9 | 0.9 | Overestimate by 5.0 cm | Underestimate by 3.2 kg | 103.347 | 0.0000 | 0.0% | 0.9969 | 77.84% | 100.00%
0.9 | 0.9 | Overestimate by 5.0 cm | Underestimate by 1.7 kg | 141.070 | 0.0000 | 0.0% | 0.9963 | 74.36% | 100.00%
0.9 | 0.9 | Overestimate by 3.0 cm | Underestimate by 3.2 kg | 220.009 | 0.0000 | 0.0% | 0.9951 | 69.53% | 100.00%
0.8 | 0.8 | Overestimate by 2.5 cm | Underestimate by 1.7 kg | 29.396 | 0.0149 | 23.0% | 0.9896 | 91.86% | 99.71%
0.8 | 0.8 | Overestimate by 2.5 cm | Underestimate by 3.2 kg | 109.846 | 0.0000 | 0.0% | 0.9890 | 79.01% | 100.00%
0.8 | 0.8 | Overestimate by 2.5 cm | Underestimate by 1.7 kg | 145.911 | 0.0000 | 0.0% | 0.9882 | 75.49% | 100.00%
0.8 | 0.8 | Overestimate by 5.0 cm | Underestimate by 3.2 kg | 227.025 | 0.0000 | 0.0% | 0.9880 | 70.48% | 100.00%

6f) H-L from bias simulations for females

Height ICC | Weight ICC | Bias Height | Bias Weight | Mean H-L | Mean P-value | % H-L < 20 | rop | Sensitivity | Specificity
1.0 | 1.0 | 0 | 0 | 10.455 | 0.3356 | 96.8% | 1.0000 | 100.00% | 100.00%
1.0 | 1.0 | 0 | Underestimate by 0.5 kg | 11.659 | 0.2847 | 93.0% | 1.0000 | 97.11% | 100.00%
1.0 | 1.0 | 0 | Underestimate by 1.0 kg | 14.528 | 0.1922 | 80.2% | 0.9998 | 94.29% | 100.00%
1.0 | 1.0 | 0 | Underestimate by 1.5 kg | 19.354 | 0.0961 | 58.6% | 0.9997 | 91.59% | 100.00%
1.0 | 1.0 | 0 | Underestimate by 2.0 kg | 25.948 | 0.0370 | 32.2% | 0.9994 | 89.00% | 100.00%
1.0 | 1.0 | 0 | Underestimate by 2.5 kg | 34.611 | 0.0090 | 11.8% | 0.9991 | 86.50% | 100.00%
1.0 | 1.0 | 0 | Underestimate by 3.0 kg | 45.477 | 0.0014 | 2.0% | 0.9986 | 84.11% | 100.00%
1.0 | 1.0 | Overestimate by 3.0 cm | 0 | 11.340 | 0.2964 | 93.4% | 1.0000 | 97.51% | 100.00%
1.0 | 1.0 | Overestimate by 2.5 cm | 0 | 13.343 | 0.2236 | 85.6% | 0.9999 | 95.07% | 100.00%
1.0 | 1.0 | Overestimate by 2.5 cm | 0 | 16.768 | 0.1395 | 69.4% | 0.9998 | 92.68% | 100.00%
1.0 | 1.0 | Overestimate by 5.0 cm | 0 | 21.661 | 0.0667 | 48.2% | 0.9997 | 90.34% | 100.00%
1.0 | 1.0 | Overestimate by 5.0 cm | 0 | 27.848 | 0.0255 | 25.8% | 0.9995 | 88.10% | 100.00%
1.0 | 1.0 | Overestimate by 2.5 cm | 0 | 35.721 | 0.0068 | 9.2% | 0.9992 | 85.91% | 100.00%
1.0 | 1.0 | Overestimate by 2.5 cm | Underestimate by 1.7 kg | 24.557 | 0.0454 | 34.4% | 0.9995 | 89.42% | 100.00%
1.0 | 1.0 | Overestimate by 5.0 cm | Underestimate by 3.2 kg | 132.033 | 0.0000 | 0.0% | 0.9959 | 73.54% | 100.00%
1.0 | 1.0 | Overestimate by 5.0 cm | Underestimate by 1.7 kg | 168.291 | 0.0000 | 0.0% | 0.9949 | 70.73% | 100.00%
1.0 | 1.0 | Overestimate by 2.5 cm | Underestimate by 3.2 kg | 272.338 | 0.0000 | 0.0% | 0.9917 | 65.31% | 100.00%
0.9 | 0.9 | Overestimate by 2.5 cm | Underestimate by 1.7 kg | 24.011 | 0.0522 | 38.0% | 0.9989 | 91.19% | 100.00%
0.9 | 0.9 | Overestimate by 5.0 cm | Underestimate by 3.2 kg | 127.114 | 0.0000 | 0.0% | 0.9961 | 74.45% | 100.00%
0.9 | 0.9 | Overestimate by 5.0 cm | Underestimate by 1.7 kg | 161.191 | 0.0000 | 0.0% | 0.9951 | 71.82% | 100.00%
0.9 | 0.9 | Overestimate by 3.0 cm | Underestimate by 3.2 kg | 265.706 | 0.0000 | 0.0% | 0.9924 | 66.21% | 100.00%
0.8 | 0.8 | Overestimate by 2.5 cm | Underestimate by 1.7 kg | 28.423 | 0.0299 | 23.6% | 0.9958 | 91.72% | 99.88%
0.8 | 0.8 | Overestimate by 2.5 cm | Underestimate by 3.2 kg | 128.995 | 0.0000 | 0.0% | 0.9938 | 76.05% | 100.00%
0.8 | 0.8 | Overestimate by 2.5 cm | Underestimate by 1.7 kg | 160.670 | 0.0000 | 0.0% | 0.9929 | 73.05% | 100.00%
0.8 | 0.8 | Overestimate by 5.0 cm | Underestimate by 3.2 kg | 267.507 | 0.0000 | 0.0% | 0.9907 | 67.18% | 100.00%


6 g) C-statistics from bias simulations for males and females

                                                                              C-statistic                     C-statistic
Height ICC  Weight ICC  Bias Height                  Bias Weight              Mean    StDev   Min     Max     Mean    StDev   Min     Max
1.0         1.0         0                            0                        0.6858  0.0082  0.6628  0.7134  0.7179  0.0091  0.6933  0.7424
1.0         1.0         0                            Underestimate by 0.5 kg  0.6847  0.0088  0.6623  0.7107  0.7126  0.0090  0.6931  0.7468
1.0         1.0         0                            Underestimate by 1.0 kg  0.6834  0.0087  0.6607  0.7098  0.7145  0.0089  0.6918  0.7383
1.0         1.0         0                            Underestimate by 1.5 kg  0.6821  0.0086  0.6592  0.7081  0.7127  0.0089  0.6898  0.7369
1.0         1.0         0                            Underestimate by 2.0 kg  0.6810  0.0085  0.6556  0.7083  0.7110  0.0088  0.6882  0.7338
1.0         1.0         0                            Underestimate by 2.5 kg  0.6798  0.0084  0.6560  0.7072  0.7093  0.0087  0.6880  0.7336
1.0         1.0         0                            Underestimate by 3.0 kg  0.6786  0.0084  0.6531  0.7066  0.7078  0.0085  0.6846  0.7316
1.0         1.0         Overestimate by [illegible]  0                        0.6856  0.0088  0.6639  0.7118  0.7174  0.0090  0.6940  0.7428
1.0         1.0         Overestimate by [illegible]  0                        0.6852  0.0087  0.6626  0.7116  0.7169  0.0089  0.6944  0.7422
1.0         1.0         Overestimate by [illegible]  0                        0.6848  0.0086  0.6616  0.7120  0.7163  0.0088  0.6928  0.7397
1.0         1.0         Overestimate by [illegible]  0                        0.6844  0.0086  0.6590  0.7126  0.7157  0.0089  0.6926  0.7408
1.0         1.0         Overestimate by [illegible]  0                        0.6841  0.0084  0.6599  0.7111  0.7152  0.0087  0.6929  0.7383
1.0         1.0         Overestimate by [illegible]  0                        0.6838  0.0083  0.6601  0.7095  0.7147  0.0086  0.6918  0.7391
1.0         1.0         Overestimate by 2.5 cm       Underestimate by 1.7 kg  0.6815  0.0085  0.6561  0.7086  0.7118  0.0087  0.6888  0.7353
1.0         1.0         Overestimate by 2.5 cm       Underestimate by 3.2 kg  0.6762  0.0079  0.6519  0.7025  0.7037  0.0080  0.6818  0.7263
1.0         1.0         Overestimate by 5.0 cm       Underestimate by 1.7 kg  0.6776  0.0078  0.6551  0.7039  0.7055  0.0078  0.6845  0.7264
1.0         1.0         Overestimate by 5.0 cm       Underestimate by 3.2 kg  0.6737  0.0076  0.6520  0.6988  0.7000  0.0075  0.6774  0.7204
0.9         0.9         Overestimate by 2.5 cm       Underestimate by 1.7 kg  0.6687  0.0087  0.6428  0.6925  0.7014  0.0089  0.6778  0.7259
0.9         0.9         Overestimate by 2.5 cm       Underestimate by 3.2 kg  0.0066  0.0082  0.6379  0.6926  0.6938  0.0081  0.6715  0.7178
0.9         0.9         Overestimate by 5.0 cm       Underestimate by 1.7 kg  0.0067  0.0079  0.6413  0.6906  0.6958  0.0079  0.6737  0.7173
0.9         0.9         Overestimate by 5.0 cm       Underestimate by 3.2 kg  0.0066  0.0000  0.6368  0.6888  0.6903  0.0077  0.6656  0.7104
0.8         0.8         Overestimate by 2.5 cm       Underestimate by 1.7 kg  0.6528  0.0090  0.6293  0.6791  0.6894  0.0092  0.6621  0.7163
0.8         0.8         Overestimate by 2.5 cm       Underestimate by 3.2 kg  0.6486  0.0084  0.6218  0.6742  0.6821  0.0082  0.6590  0.7036
0.8         0.8         Overestimate by 5.0 cm       Underestimate by 1.7 kg  0.6500  0.0083  0.6244  0.6796  0.6840  0.0080  0.6622  0.7052
0.8         0.8         Overestimate by 5.0 cm       Underestimate by 3.2 kg  0.6467  0.0080  0.6214  0.6771  0.6789  0.0077  0.6547  0.7005
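As a rough illustration of why the combined bias scenarios matter, the sketch below shows how over-reported height and under-reported weight both pull a derived body mass index downward. It assumes that BMI (weight in kg divided by height in metres squared) is the quantity affected by the reporting bias. The function names and the example person (80 kg, 175 cm) are hypothetical and are not taken from the thesis; only the bias magnitudes (2.5 or 5.0 cm, 1.7 or 3.2 kg) come from the scenarios tabulated above.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)


def reported_bmi(true_weight_kg: float, true_height_cm: float,
                 weight_under_kg: float = 0.0,
                 height_over_cm: float = 0.0) -> float:
    """BMI computed from self-reported values with systematic bias applied.

    Weight is under-reported by `weight_under_kg` and height is
    over-reported by `height_over_cm` (both hypothetical parameter names).
    """
    return bmi(true_weight_kg - weight_under_kg,
               true_height_cm + height_over_cm)


if __name__ == "__main__":
    true_weight, true_height = 80.0, 175.0  # hypothetical person
    # (height overestimate in cm, weight underestimate in kg)
    scenarios = [(0.0, 0.0), (2.5, 1.7), (2.5, 3.2), (5.0, 1.7), (5.0, 3.2)]
    for height_over, weight_under in scenarios:
        value = reported_bmi(true_weight, true_height,
                             weight_under_kg=weight_under,
                             height_over_cm=height_over)
        print(f"height +{height_over} cm, weight -{weight_under} kg: "
              f"BMI = {value:.2f}")
```

Because both biases act in the same direction on BMI, the combined scenarios shift the derived value more than either bias alone, which is consistent with the larger losses in calibration and discrimination seen in the combined-bias rows.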
