The New England Journal of Medicine

Review Article

Current Concepts

Microarray Analysis and Tumor Classification

John Quackenbush, Ph.D.

DNA microarray analysis was first described in the mid-1990s as a means to probe the expression of thousands of genes simultaneously1,2 and was quickly adopted by the research community for the study of a wide range of biologic processes. Most of the early studies had a simple and powerful design: to compare two biologic classes in order to identify the differential expression of the genes in them, genes with potential relevance to a wide range of biologic processes, such as the progression of cancer,3-6 the causes of asthma,7-9 heart disease,10-12 and neuropsychiatric disorders,13-17 and the analysis of factors associated with infertility.18-21

Soon after microarrays were introduced, many researchers realized that the technique could be used to find new subclasses in disease states22,23 and identify biologic markers (biomarkers) associated with disease24 and that even the expression patterns of the genes could be used to distinguish subclasses of disease.5,25-27 This realization resulted in a proliferation of searches for patterns of expression that could be used to classify types of tumors28 and predict the outcome29,30 and response to chemotherapy.31,32

An example is the Netherlands breast-cancer study,31 which sought to distinguish between patients who had the same stage of disease but a different response to treatment and a different overall outcome. The study was motivated by the observation that the best clinical predictors of metastasis, including lymph-node status and histologic grade, did not adequately predict clinical outcome, with the result that many patients receive chemotherapy or hormonal therapy regardless of whether they need this additional treatment. The study searched for gene-expression signatures that would indicate which patients would benefit from adjuvant chemotherapy.
Profiling tumors of young patients who had received only surgical treatment, and searching for correlations with clinical outcome, identified a signature of poor prognosis consisting of 70 genes that was predictive of a short interval to distant metastasis in patients with tumors that were lymph-node–negative. The analysis showed that microarray-based signatures could outperform clinically based predictions of outcome in identifying patients who would benefit most from adjuvant therapy.

These initial results led to a more extensive study30 that showed that the 70-gene classification profile was a more powerful predictor of disease outcome in young patients with breast cancer than were standard systems based on clinical and histologic criteria. Although the superiority of expression-based biomarkers, as compared with traditional clinical staging, has been disputed,33 a nationwide clinical trial is under way in the Netherlands in which gene-expression profiles for these 70 classifier genes are being collected from all newly identified consenting patients with breast cancer and used as an adjunct to classic clinical staging. This is an important step in determining the value of the 70-gene signature in the care of patients. It extends the analysis to a broader population than that included in the earlier studies,30,31 making it possible to gauge the extent to which heterogeneity of patients, treatment protocols, variations in the methods of collecting and processing samples, and other factors influence the value of the expression signature as a diagnostic tool. This large-scale trial is a clear sign that the use of expression profiles as biomarkers to predict disease prognosis and outcome is coming of age.

From the Dana–Farber Cancer Institute and Harvard School of Public Health, both in Boston. Address reprint requests to Dr. Quackenbush at the Dana–Farber Cancer Institute, 44 Binney St., Boston, MA 02115, or at [email protected]. N Engl J Med 2006;354:2463-72. Copyright © 2006 Massachusetts Medical Society.

N Engl J Med 354;23, www.nejm.org, June 8, 2006. Downloaded from www.nejm.org on August 5, 2008. Copyright © 2006 Massachusetts Medical Society. All rights reserved.

Many early studies focused on unsupervised approaches to data-mining, such as hierarchical clustering for class discovery, because such studies take an unbiased approach to searching for subgroups in the data. This approach was used by Alizadeh et al.34 to analyze expression data from samples of lymphoma and to identify two previously unknown and transcriptionally distinct subclasses of diffuse large-B-cell lymphoma, each of them related to a different stage of B-cell differentiation. One subclass expressed genes characteristic of germinal B cells and the other expressed those normally induced during in vitro activation of peripheral-blood B cells. The analysis also showed that patients whose tumors were within these subclasses had distinct clinical prognoses.

Although this class discovery is useful, in a clinical setting, the goal is to develop improved diagnostic and prognostic evidence that can be used to direct treatment. Golub et al.25 showed that microarray-expression profiles can be used to classify disease states. They obtained expression profiles of samples of acute lymphoblastic leukemia and acute myeloblastic leukemia and found that tumor groups defined according to patterns of gene expression corresponded to known classes of disease. The ability to partition the samples into distinct groups of either acute lymphoblastic leukemia or acute myeloblastic leukemia suggested that expression profiles could serve as a means of classifying the samples.
With the use of a weighted voting scheme (with individual genes given different weights according to their predictive power) in which the level of expression of each relevant gene contributes to the final classification, the patients’ tumors could be assigned to the appropriate disease class. This study and other studies have made it increasingly clear that disease classification according to expression profiles will become an important area of application for microarrays, proteomics, metabolomics, and other high-throughput genomic techniques.

The question is whether a pattern can be found that can be used to distinguish biologic samples on the basis of some inherent property. To understand how and when this question can be answered, it is useful to review some basic issues related to use of microarray data, including the potential limitations of their use.

Collection, Transformation, and Representation of the Data

The microarray technique uses gene-specific probes that represent thousands of individual genes. The probes are arrayed on an inert substrate and levels of gene expression in a target biologic sample are assayed (Fig. 1). RNA is extracted from tissues of interest, labeled with a detectable marker (typically, a fluorescent dye), and allowed to hybridize to the arrays. Samples of messenger RNA (mRNA) hybridize to complementary gene-specific probes on the array. Images are rendered with the use of confocal laser scanning; the relative fluorescence intensity of each gene-specific probe is a measure of the level of expression of the particular gene. The greater the degree of hybridization, the more intense the signal, implying a higher relative level of expression. There are two basic approaches to generating microarray data. In a two-color array, two samples of RNA, each labeled with a different dye, are simultaneously hybridized to the array (Fig. 2). The sample of interest, or query sample (for example, a sample of breast cancer), is labeled with one dye, and a reference sample (for example, normal breast tissue) is labeled with another dye; the two samples are mixed in an approximate ratio of 1:1 on the basis of incorporation of the dye. Such an assay compares paired samples and reports expression as the logarithm of the ratio of RNA in a query sample to that in a control sample. For single-color arrays, such as the GeneChip (Affymetrix), each sample is labeled and individually incubated with an array (Fig. 2). After nonhybridized material in the sample is removed by washing, the level of expression of each gene is reported as a single fluorescence intensity that represents an estimated level of gene expression. Regardless of the approach or technique, the data used in all subsequent analyses are expression measures for each gene in each sample. After collection, the data are usually normalized to facilitate the comparison between the different hybridization assays. 
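As a rough illustration of the two-color read-out described above, the sketch below computes the logarithm of the query-to-reference ratio for each probe and then applies a simple global normalization. The intensity values are hypothetical, and median-centering is only one assumed choice among the several normalization approaches the article mentions; the function names are introduced here for illustration.

```python
import math
from statistics import median

def log_ratios(query, reference):
    """Return log2(query / reference) for paired, positive probe intensities."""
    return [math.log2(q / r) for q, r in zip(query, reference)]

def median_center(values):
    """Shift the values so that their median is zero.

    This is an assumed, deliberately simple normalization: it presumes that
    most genes are unchanged between the two samples on the array.
    """
    m = median(values)
    return [v - m for v in values]

# Hypothetical fluorescence intensities for four probes on one two-color array.
query = [800.0, 200.0, 1600.0, 400.0]       # e.g. the tumor sample
reference = [400.0, 400.0, 400.0, 400.0]    # e.g. the normal-tissue sample

raw = log_ratios(query, reference)    # [1.0, -1.0, 2.0, 0.0]
normalized = median_center(raw)       # [0.5, -1.5, 1.5, -0.5]
print(raw, normalized)
```

A log ratio of 1.0 corresponds to a twofold higher signal in the query sample, and -1.0 to a twofold lower signal, which is why the logarithm treats up-regulation and down-regulation symmetrically.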
Figure 1. Hybridization with Gene Elements on a Microarray. Central to microarray analysis for determining gene expression is the hybridization of fluorescently labeled RNA with a short strand of complementary DNA that is tethered to a solid surface, such as glass. The natural affinity of two complementary strands of nucleic acid for each other drives the reaction. Complementarity is indicated by color. For example, the stationary orange probe is complementary to the floating orange target. Because the position and sequence of the stationary probe are known, hybridization between probe and labeled target provides an assay of the level of target in solution.

Normalization compensates for differences in labeling, hybridization, and detection methods. There are several approaches to data normalization; the approach that will be the most appropriate will depend on the type of array and assumptions about biases in the data.35-39 Next, the data are often filtered with the use of some set of objective criteria (e.g., elimination of genes with minimal variance in the samples) or statistical analyses to select genes with expression levels that correlate with particular groups of samples. Normalization and filtering transformations must be carefully applied, because they can have a profound effect on the results. Different methods of statistical analysis applied to the same data set may produce different (but usually overlapping) sets of significant genes. Not surprisingly, the best way to deal with high-dimensional data sets, in which there are often more measurements (genes) than there are samples, is an area of active research and debate.

Caution should be exercised in comparing data sets from different laboratories. Comparisons of published lists of genes can produce discordant results,40,41 because they rarely take into account the differences in the methods of analysis of the data. Comparisons that do take the analysis of the data into account usually find good concordance between different laboratories and various types of microarray.42-47 To facilitate comparisons between studies, the Microarray Gene Expression Data Society developed standards for data reporting that are known as “minimal information about a microarray experiment” (MIAME).48-52 The primary DNA sequence databases have developed public repositories for gene-expression data that require that the data meet the MIAME standards, and some publications require that the data be submitted to one of these repositories. (The Journal requires that both raw and transformed data be submitted to one of these repositories.) There is great value in making gene-expression data sets


Figure 2. Overview of DNA Microarray Analysis. In a two-color analysis (Panel A), RNA samples obtained from patients and control subjects are individually labeled with distinguishable fluorescent dyes and hybridized to a single DNA microarray consisting of individual gene-specific probes. Relative levels of gene expression in the two samples are estimated by measuring the fluorescence intensity for each probe; a sample-expression vector summarizes the level of expression of each gene in the sample obtained from a patient (as compared with a sample obtained from a control). A single-color analysis (Panel B), performed with the use of the GeneChip (Affymetrix), hybridizes labeled RNA from each biologic sample to a single array in which a series of gene-specific probes are arrayed. Gene-expression levels are estimated by measuring the hybridization intensity for a series of “perfect match” probes, and the background is measured with the use of a corresponding set of “mismatch” probes. Gene-expression levels are reported for each sample from a patient as a sample-expression vector that summarizes the difference between the signal and background for each gene.

publicly available: independent studies that provide data on the same disease can be used to validate results, and large sample sizes can provide robust data sets for meta-analyses designed to find universal patterns of gene expression that can be associated with a given disease or outcome. The raw data should be made available, because small changes in the mathematical analytic approach can produce measurable differences in the conclusions drawn from the data and in the ability to compare results from studies performed at different laboratories.53-55

After the normalized and filtered expression data are collected, they are typically represented in a matrix in which each row represents a particular gene and each column represents a specific biologic sample (Fig. 3). Each row shows a gene-expression vector: the separate entries are the levels of expression of a specific gene in all the samples. Each column shows a sample-expression vector that records the expression of all the genes in the sample. To ease interpretation of the results of multiple hybridizations, elements of the data in a matrix are often rendered in color, which indicates the level of expression of each gene in each sample (Fig. 3) and yields a visual


Figure 3. Development of an Expression Matrix. Data from each gene in each sample are collected, and these sample-expression vectors are assembled into a single expression matrix. Each column in the matrix represents an individual sample and its measured gene-expression levels (the sample-expression vector); each row represents a gene and its expression levels in all samples (a gene-expression vector). The expression matrix is often shown as a colored matrix (typically, red or green, although other combinations, such as blue or yellow, are also common). In the colored matrix, the color and its intensity represent the relative direction and magnitude of a difference in gene expression.

representation of gene-expression patterns in the samples being analyzed; in the most common approach, the colors used for the genes are based on the log ratio for each sample measured as compared with a control sample; log-ratio values close to zero are rendered in black, those with values greater than zero are rendered in red (indicating up-regulated genes), and those with negative values are rendered in green (down-regulated genes), although there are other color schemes that are kinder to people who have red–green blindness (Fig. 3). The intensity of each element, as compared with the intensities of other elements, indicates the relative expression of the gene that the element represents, and brighter elements indicate a higher level of expression. For any group of samples, the expression matrix generally appears to have no apparent pattern or order (Fig. 3). Programs that perform clustering generally order the rows, the columns, or both, so that the patterns of expression are visualized (Fig. 4).
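The matrix layout, the red/green/black rendering, and the reordering of rows by a clustering program can be sketched together as follows. This is a minimal illustration under stated assumptions: the four genes and their values are hypothetical, the tolerance used for "close to zero" is arbitrary, and the row ordering uses a naive average-linkage procedure with a Pearson correlation distance (one of the similarity measures the article goes on to discuss), not any particular published program.

```python
import math

# Hypothetical log-ratio expression matrix: one row per gene, one column per sample.
genes = ["geneA", "geneB", "geneC", "geneD"]
matrix = [
    [1.0, 2.0, 3.0],     # geneA: rises across the three samples
    [3.0, 2.0, 1.0],     # geneB: falls
    [0.5, 1.0, 1.5],     # geneC: same shape as geneA at half the magnitude
    [-1.0, -2.0, -3.0],  # geneD: falls, like geneB
]

def sample_vector(j):
    """Column j: the expression of every gene in one sample."""
    return [row[j] for row in matrix]

def color_for(log_ratio, tolerance=0.1):
    """Black near zero, red above, green below; the tolerance is an assumption."""
    if abs(log_ratio) <= tolerance:
        return "black"
    return "red" if log_ratio > 0 else "green"

def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def pearson_distance(u, v):
    """1 - Pearson correlation: 0 for identical shapes, 2 for opposite shapes.

    Undefined (division by zero) for a vector with no variation.
    """
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    ss_u = sum((a - mu) ** 2 for a in u)
    ss_v = sum((b - mv) ** 2 for b in v)
    return 1.0 - cov / math.sqrt(ss_u * ss_v)

def average_linkage_order(rows, distance):
    """Repeatedly merge the two closest clusters; return the resulting row order."""
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                d = sum(distance(rows[i], rows[j]) for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return clusters[0]

order = average_linkage_order(matrix, pearson_distance)
ordered_genes = [genes[i] for i in order]
colors = [[color_for(x) for x in matrix[i]] for i in order]
print(ordered_genes)
```

After reordering, geneA and geneC sit next to each other, as do geneB and geneD, because each pair shares a pattern across samples. Note that geneA and geneC have a Pearson distance of zero but a Euclidean distance of about 1.87, since they differ in magnitude but not in shape.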

Identifying Patterns of Expression

In microarray analysis, typically genes are sought that show patterns of expression that correlate with disease states or have similar patterns of expression in multiple samples. Alternatively, samples are sought in which the genes share similar expression profiles. Such an analysis depends on defining a measure of similarity between expression profiles, and each measure can reveal different features in the data. Two measures that are commonly used are Euclidean distance and Pearson’s correlation-coefficient distances. Euclidean distance is best used when the magnitude of the expression level is important, whereas Pearson’s correlation coefficients are useful when the pattern of expression in the genes or samples is more important. Generally, when microarray analysis is used to classify tumors, for example, what is of interest is the pattern of expression in all the samples, so in this case the use of Pearson’s correlation-coefficient distances would generally be appropriate.

After the appropriate data have been recorded, normalized, and filtered and a means of measuring similarity has been chosen, a variety of approaches are available for further analysis. These approaches are generally grouped into two types: supervised and unsupervised methods. Supervised methods depend on prior knowledge about the samples in order to search for genes that correlate with a disease state, and they are useful for classification studies. Unsupervised methods disregard prior knowledge and can be useful for identifying subgroups of samples that may represent hitherto unrecognized disease states.

Class Discovery

Regardless of the goal of a microarray study, the first technique applied is invariably an unsupervised analysis designed to determine whether unexpected but biologically interesting patterns exist in the data. Unsupervised methods do not take into account sample classification, such as whether the samples come from patients with acute lymphoblastic leukemia or acute myeloblastic leukemia. These methods simply group samples (or genes, or both) on the basis of some measure of similarity between the expression profiles. Two widely used unsupervised approaches are hierarchical clustering (Fig. 4) and k-means cluster analysis (Fig. 4). These approaches divide the data into clusters, but determining whether the clusters are meaningful requires expert analysis to relate the clusters to the clinical data, and clearly, newly discovered classes require validation.

Figure 4. Cluster Analysis Applied to Gene Expression in Microarray Data. An unordered data set (Panel A) has been subjected to average linkage hierarchical clustering (Panel B) or k-means cluster analysis (Panel C), which reveals underlying patterns that can help identify subgroups in the data set. Panel titles: A, Unordered Expression Matrix; B, Ordered by Hierarchical Clustering; C, Samples Partitioned by k-Means Clustering.

Classification

Generally, clinically based expression-profiling studies begin with samples obtained from patients in well-defined groups, and such prior knowledge can be useful in analyzing the data. For example, in looking at leukemia samples, the investigator may know that an initial data set was derived from patients with acute lymphoblastic leukemia and patients with acute myeloblastic leukemia. The first need is to identify which genes best distinguish the two classes of patients in the data set. Fortunately, a wide variety of statistical tools can be brought to bear on this question, including t-tests (for two classes) and analysis of variance (for three or more classes), and with the use of these tools, P values are assigned to genes on the basis of whether the genes distinguish the groups of samples. Although these statistical methods are widely used, they suffer from the problem of multiple testing: simply put, because the number of samples typically included in an analysis is in the tens or hundreds and the number of genes is in the thousands, there are generally too few samples to constrain the selection of genes. As a result, even at 95 percent confidence (P≤0.05), on an array of 10,000 elements, 500 significant genes may be found purely by chance. Clearly, greater stringency is needed to establish criteria for gene selection, but it should also be understood that the P values are useful for prioritizing genes for further study.

The problem here is a bit more complex than one might assume. The multiple-testing argument is based on the measurement of a large number of variables that are independent of one another in a population of samples that is small relative to the number of variables. However, measurements in gene expression are clearly not independent, because genes map to networks and pathways in which expression is regulated in a coordinated fashion. Because we do not yet have a full understanding of the relationships among genes and other factors that influence coordinated patterns of expression, appropriate correction for multiple testing remains an area of active research; criteria for selecting particular genes for study need to be established, and it should be understood that the P values are useful in prioritizing genes for investigation.

A collection of genes thus selected can be used for a variety of purposes. Often, such genes are studied to provide insight into the mechanistic aspects of the diseases being studied. However, there has been a great deal of interest in using the selected genes and their patterns of expression as biomarkers for diagnostic and prognostic applications. The question is whether a set of genes and their expression patterns in an initial set of patients can be used to classify disease in new patients. The first step in answering it is to select an algorithm for classification and then “train” it with the use of the available data. The expression vectors (the pattern of gene expression in all the samples) of the discriminatory genes are used to train the selected algorithm in order to optimize its discriminatory power.
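To make the multiple-testing arithmetic in this section concrete, the following sketch reproduces the figure of 500 chance findings on a 10,000-element array and checks it with a small simulation. Two things here are assumptions rather than anything the article prescribes: the simulation treats the tests as independent (which, as the text notes, real genes are not), and the Bonferroni threshold is shown only as one familiar, conservative example of "greater stringency".

```python
import random

def expected_false_positives(n_genes, alpha):
    """Expected number of genes with P <= alpha when no gene truly differs."""
    return n_genes * alpha

def bonferroni_threshold(n_genes, alpha):
    """Per-gene P threshold under the Bonferroni correction (an assumed example
    of a stricter criterion; it is conservative when genes are correlated)."""
    return alpha / n_genes

def simulate_null_hits(n_genes, alpha, seed=0):
    """Count chance 'significant' genes when every P value is drawn uniformly
    from [0, 1], as happens under the null hypothesis for independent tests."""
    rng = random.Random(seed)
    return sum(1 for _ in range(n_genes) if rng.random() <= alpha)

n_genes, alpha = 10_000, 0.05
expected = expected_false_positives(n_genes, alpha)   # 500 genes by chance alone
strict = bonferroni_threshold(n_genes, alpha)         # roughly 0.000005 per gene
observed = simulate_null_hits(n_genes, alpha)         # near 500; varies with the seed
print(expected, strict, observed)
```

The simulated count fluctuates around the expected value from run to run, which is a reminder that a list of "significant" genes of this size carries no evidence by itself and is better read as a ranking for follow-up.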
The result is a computational rule that can be applied to a new sample in order to assign it to one of the biologic classes. Ideally, the trained algorithm is then applied to a test set of samples to assess its sensitivity and specificity.
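A computational rule of this kind, and the assessment of its sensitivity and specificity, might look like the sketch below. It is written in the spirit of the weighted voting scheme mentioned earlier, but the specific weight (difference in class means divided by the sum of class standard deviations) and the midpoint-based vote are assumptions made for illustration, as are the six hypothetical samples. Because no separate test set exists for such a small data set, the sketch uses the leave-one-out procedure described in the Validation section that follows, and it repeats gene selection inside every fold so that the held-out sample never influences which genes are chosen.

```python
from statistics import mean, stdev

def train(samples, labels, n_genes):
    """Pick the n_genes with the largest absolute weight; return the voting rule
    as a list of (gene index, weight, midpoint) tuples."""
    a = [s for s, y in zip(samples, labels) if y == "A"]
    b = [s for s, y in zip(samples, labels) if y == "B"]
    rule = []
    for g in range(len(samples[0])):
        va, vb = [s[g] for s in a], [s[g] for s in b]
        # Assumed weight: separation of class means relative to within-class spread.
        weight = (mean(va) - mean(vb)) / (stdev(va) + stdev(vb))
        midpoint = (mean(va) + mean(vb)) / 2
        rule.append((g, weight, midpoint))
    rule.sort(key=lambda t: abs(t[1]), reverse=True)
    return rule[:n_genes]

def predict(rule, sample):
    """Each gene votes with strength weight * (expression - midpoint)."""
    total = sum(w * (sample[g] - mid) for g, w, mid in rule)
    return "A" if total > 0 else "B"

def leave_one_out(samples, labels, n_genes):
    """Full cross-validation: selection and training are redone for every fold."""
    predictions = []
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        predictions.append(predict(train(train_x, train_y, n_genes), samples[i]))
    return predictions

def sensitivity_specificity(labels, predictions, positive="A"):
    tp = sum(y == positive and p == positive for y, p in zip(labels, predictions))
    fn = sum(y == positive and p != positive for y, p in zip(labels, predictions))
    tn = sum(y != positive and p != positive for y, p in zip(labels, predictions))
    fp = sum(y != positive and p == positive for y, p in zip(labels, predictions))
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical sample-expression vectors (three genes each) for two classes.
samples = [
    [5.0, 1.0, 2.0], [6.0, 1.5, 3.0], [5.5, 0.5, 2.5],   # class A
    [1.0, 5.0, 2.4], [1.5, 6.0, 3.1], [0.5, 5.5, 2.0],   # class B
]
labels = ["A", "A", "A", "B", "B", "B"]

predictions = leave_one_out(samples, labels, n_genes=2)
sensitivity, specificity = sensitivity_specificity(labels, predictions)
print(predictions, sensitivity, specificity)
```

With classes this cleanly separated, every held-out sample is assigned correctly. Selecting the genes once from all six samples and only then splitting them would be the partial cross-validation that the article warns can bias the result in the classifier's favor.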

Validation

To develop and validate a method of classification it is best to have a sufficiently large collection of samples to allow an independent test set and training set. In practice, usually only a limited number of samples are available, and these are needed for building and training the algorithm. An alternative to using an independent test set is the leave-k-out cross-validation method.56 This approach leaves out some subgroup (k) of the initial collection of N samples, develops a classifier with the use of the remaining (N − k) samples, and applies the classifier to the omitted k samples, which are thus used as a test set. Then, a new set of k samples to be left out and classified is chosen, and the process is repeated. The simplest approach is known as “leave-one-out” cross-validation.

Although the cross-validation approach can be useful when an independent test set is unavailable, the method is often applied inappropriately, as a partial rather than a full cross-validation. The distinction between the two methods is the point in the process where k is left out. Some published studies have used an entire data set to select a set of classification genes, and then the samples were divided into k and (N − k), a sample test set and a training set, respectively. Proceeding in this fashion has the potential to bias the results, because the test set and the training set are not independent of each other; inclusion of all the samples in the initial process of gene selection may favorably bias the ultimate success of any classifier that is constructed.

Limitations and Success of Classification

Although there have been attempts to identify the best classification approach, no single method will work in all cases and many methods may work in any one case. Consequently, the limitations of the different approaches need to be understood. Most studies that have been conducted to date have involved relatively few patients, and it is not clear how the results can be generalized to large clinical populations, with the collection of samples at a number of sites making for variation in the collection and handling of them. Often, the gene-expression signatures obtained with the use of microarray analysis are difficult to interpret with respect to the biology of the underlying disease. Ultimately, finding genes that can be linked through their mechanism to disease outcome suggests potential therapeutic interventions. But the failure to provide a biologic interpretation does not diminish the potential clinical usefulness of well-established biomarkers. Many biomarkers, such as prostate-specific antigen and carcinoembryonic antigen, that have unknown functions are useful as diagnostic or prognostic markers for various diseases. It may be useful to consider the lists of genes emerging from classification experiments as sets of biomarkers; insight into biologic mechanism is a bonus.

Although early successes have been realized on the basis of expression profiling with the use of microarrays, there is great promise in the application of similar techniques from proteomics and metabolic profiling (see Box). The lack of large, carefully procured, and well-annotated collections of samples obtained from patients, particularly patients for whom sufficient follow-up data are available, has proved to be a bottleneck. The prospect of collecting such samples, analyzing them, and developing new approaches to diagnostics represents an outstanding opportunity for collaboration among clinical, laboratory, computational, and biostatistical scientists, with the promise of advances in the diagnosis and management of disease.

Dr. Quackenbush reports having received GeneChips donated by Affymetrix. No other potential conflict of interest relevant to this article was reported.

Box. A Proliferation of “-omics”

Because of the Human Genome Project, most of us know that the genome refers to the collection of all the DNA and, by extension, all the genes in an organism. The success of genomics, however, has resulted in a proliferation of “-omics” sciences, resulting in the assignment of new descriptive terms to familiar concepts, with the common theme that -omics approaches attempt to measure all of some biologic entity. A search of the PubMed database (after elimination of such common words as “economics”) returns 110 terms (http://biocomp.dfci.harvard.edu/tgi/omics_count.html) that contain the -omics suffix. Although each of these words might be of some interest to someone, few of them have become sufficiently common to warrant definition.

Genomics is the study of genomes and the complete collection of genes that they contain. A short time ago, this collection might have been limited to protein-coding genes, but genomics has shown that many other elements have important functions in the genome, such as transcription factor binding domains, regions encoding microRNAs and antisense transcripts, and large, evolutionarily conserved regions. The primary technique used in genomics is high-throughput genome sequencing.

Functional genomics, also known as transcriptomics, attempts to analyze patterns of gene expression and to correlate the patterns with the underlying biology. There is a wide range of techniques used, including DNA microarray analysis and serial analysis of gene expression.

Metabolomics, or metabonomics, is a large-scale approach to monitoring as many as possible of the compounds involved in cellular processes in a single assay to derive metabolic profiles. Although metabolomics first referred to the monitoring of individual cells and metabonomics referred to multicellular organisms, these terms are now often used interchangeably. Techniques applied to metabolic profiling include nuclear magnetic resonance and mass spectrometry.

Proteomics approaches examine the collection of proteins to determine how, when, and where they are expressed. Techniques used in this approach include two-dimensional gel electrophoresis, mass spectrometry, and protein microarrays.

Bioinformatics, although not graced with the -omics suffix, remains a key element in collection, management, and analysis of large-scale data sets that are generated by the approaches described here. Bioinformatics uses techniques developed in fields such as computer science and statistics to facilitate understanding how the expression profiles generated are related to the biologic systems being studied.

Proteomics and metabolomics have the advantage of being capable of searching for proteins or metabolites in the blood or urine (rather than in the primary tissue, where a disease might appear), but typically these approaches identify far fewer proteins or metabolites than the number of genes that can be identified with the use of microarray analysis. Because microarray analysis is a more mature technique than the other approaches and because of the relative ease of working with nucleic acids, microarrays remain the -omics technique that is most likely to have early applications in diagnosis or prognosis.

References

1. Lipshutz RJ, Morris D, Chee M, et al. Using oligonucleotide probe arrays to access genetic diversity. Biotechniques 1995;19:442-7.
2. Schena M, Shalon D, Davis RW, Brown PO. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 1995;270:467-70.
3. DeRisi J, Penland L, Brown PO, et al. Use of a cDNA microarray to analyse gene expression patterns in human cancer. Nat Genet 1996;14:457-60.
4. Welford SM, Gregg J, Chen E, et al. Detection of differentially expressed genes in primary tumor tissues using representational differences analysis coupled to microarray hybridization. Nucleic Acids Res 1998;26:3059-65.
5. Khan J, Simon R, Bittner M, et al. Gene expression profiling of alveolar rhabdomyosarcoma with cDNA microarrays. Cancer Res 1998;58:5009-13.
6. Agrawal D, Chen T, Irby R, et al. Osteopontin identified as lead marker of colon cancer progression, using pooled sample expression profiling. J Natl Cancer Inst 2002;94:513-21.
7. Rolph MS, Sisavanh M, Liu SM, Mackay CR. Clues to asthma pathogenesis from microarray expression studies. Pharmacol Ther 2006;109:284-94.
8. Erle DJ, Yang YH. Asthma investigators begin to reap the fruits of genomics. Genome Biol 2003;4:232.
9. Syed F, Panettieri RA Jr, Tliba O, et al. The effect of IL-13 and IL-13R130Q, a naturally occurring IL-13 polymorphism, on the gene expression of human airway smooth muscle cells. Respir Res 2005;6:9.
10. Heymans S, Schroen B, Vermeersch P, et al. Increased cardiac expression of tissue inhibitor of metalloproteinase-1 and tissue inhibitor of metalloproteinase-2 is related to cardiac fibrosis and dysfunction in the chronic pressure-overloaded human heart. Circulation 2005;112:1136-44.
11. Ohki R, Yamamoto K, Ueno S, et al. Gene expression profiling of human atrial myocardium with atrial fibrillation by DNA microarray analysis. Int J Cardiol 2005;102:233-8.
12. Li T, Chen YH, Liu TJ, et al. Using DNA microarray to identify Sp1 as a transcriptional regulatory element of insulin-like growth factor 1 in cardiac muscle cells. Circ Res 2003;93:1202-9.
13. Evans SJ, Choudary PV, Vawter MP, et al. DNA microarray analysis of functionally discrete human brain regions reveals divergent transcriptional profiles. Neurobiol Dis 2003;14:240-50.
14. Pierce A, Small SA. Combining brain imaging with microarray: isolating molecules underlying the physiologic disorders of the brain. Neurochem Res 2004;29:1145-52.
15. Zvara A, Szekeres G, Janka Z, et al. Over-expression of dopamine D2 receptor and inwardly rectifying potassium channel genes in drug-naive schizophrenic peripheral blood lymphocytes as potential diagnostic markers. Dis Markers 2005;21:61-9.
16. Ricciarelli R, d’Abramo C, Massone S, Marinari U, Pronzato M, Tabaton M. Microarray analysis in Alzheimer’s disease and normal aging. IUBMB Life 2004;56:349-54.
17. Loring JF, Wen X, Lee JM, Seilhamer J, Somogyi R. A gene expression profile of Alzheimer’s disease. DNA Cell Biol 2001;20:683-95. [Erratum, DNA Cell Biol 2002;21:241.]
18. Chin KV, Seifer DB, Feng B, Lin Y, Shih WC. DNA microarray analysis of the expression profiles of luteinized granulosa cells as a function of ovarian reserve. Fertil Steril 2002;77:1214-8.
19. Giudice LC, Telles TL, Lobo S, Kao L. The molecular basis for implantation failure in endometriosis: on the road to discovery. Ann N Y Acad Sci 2002;955:252-64, 293-5, 396-406.
20. Zhang X, Jafari N, Barnes RB, Confino E, Milad M, Kazer RR. Studies of gene expression in human cumulus cells indicate pentraxin 3 as a possible marker for oocyte quality. Fertil Steril 2005;83:Suppl 1:1169-79.
21. Horcajadas JA, Riesewijk A, Dominguez F, Cervero A, Pellicer A, Simon C. Determinants of endometrial receptivity. Ann N Y Acad Sci 2004;1034:166-75.
22. Alon U, Barkai N, Notterman DA, et al. Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc Natl Acad Sci U S A 1999;96:6745-50.
23. Perou CM, Jeffrey SS, van de Rijn M, et al. Distinctive gene expression patterns in human mammary epithelial cells and breast cancers. Proc Natl Acad Sci U S A 1999;96:9212-7.
24. Moch H, Schraml P, Bubendorf L, et al. Identification of prognostic parameters for renal cell carcinoma by cDNA arrays and cell chips. Verh Dtsch Ges Pathol 1999;83:225-32. (In German.)
25. Golub TR, Slonim DK, Tamayo P, et al. Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 1999;286:531-7.
26. Bloom G, Yang IV, Boulware D, et al. Multi-platform, multi-site, microarray-based human tumor classification. Am J Pathol 2004;164:9-16.
27. Eschrich S, Yang I, Bloom G, et al. Molecular staging for survival prediction of colorectal cancer patients. J Clin Oncol 2005;23:3526-35.
28. Sorlie T, Perou CM, Tibshirani R, et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications.
Proc Natl Acad Sci U S A 2001;98:10869-74. 29. Beer DG, Kardia SL, Huang CC, et al. Gene-expression profiles predict survival of patients with lung adenocarcinoma. Nat Med 2002;8:816-24. 30. van de Vijver MJ, He YD, van ’t Veer LJ, et al. A gene-expression signature as a predictor of survival in breast cancer. N Engl J Med 2002;347:1999-2009. 31. van ’t Veer LJ, Dai H, van de Vijver MJ, et al. Gene expression profiling predicts clinical outcome of breast cancer. Nature 2002;415:530-6. 32. Kihara C, Tsunoda T, Tanaka T, et al.

n engl j med 354;23

www.nejm.org

Prediction of sensitivity of esophageal tumors to adjuvant chemotherapy by cDNA microarray analysis of gene-expression profiles. Cancer Res 2001;61:6474-9. 33. Eden P, Ritz C, Rose C, Ferno M, Peterson C. “Good Old” clinical markers have similar power in breast cancer prognosis as microarray gene expression profilers. Eur J Cancer 2004;40:1837-41. 34. Alizadeh AA, Eisen MB, Davis RE, et al. Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 2000;403:503-11. 35. Irizarry RA, Bolstad BM, Collin F, Cope LM, Hobbs B, Speed TP. Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res 2003;31:e15. 36. Schadt EE, Li C, Ellis B, Wong WH. Feature extraction and normalization algorithms for high-density oligonucleotide gene expression array data. J Cell Biochem Suppl 2001;37:120-5. 37. Quackenbush J. Microarray data normalization and transformation. Nat Genet 2002;32:Suppl:496-501. 38. Yang IV, Chen E, Hasseman JP, et al. Within the fold: assessing differential expression measures and reproducibility in microarray assays. Genome Biol 2002;3(11): research0062. 39. Yang YH, Dudoit S, Luu P, et al. Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Res 2002;30:e15. 40. Tan PK, Downey TJ, Spitznagel EL Jr, et al. Evaluation of gene expression measurements from commercial microarray platforms. Nucleic Acids Res 2003;31: 5676-84. 41. Kothapalli R, Yoder SJ, Mane S, Loughran TP Jr. Microarray results: how accurate are they? BMC Bioinformatics 2002;3:22. 42. Wang HY, Malek RL, Kwitek AE, et al. Assessing unmodified 70-mer oligonucleotide probe performance on glass-slide microarrays. Genome Biol 2003;4:R5. 43. Kane MD, Jatkoe TA, Stumpf CR, Lu J, Thomas JD, Madore SJ. Assessment of the sensitivity and specificity of oligonucleotide (50mer) microarrays. Nucleic Acids Res 2000;28:4552-7. 44. Hughes TR, Mao M, Jones AR, et al. 
Expression profiling using microarrays fabricated by an ink-jet oligonucleotide synthesizer. Nat Biotechnol 2001;19:342-7. 45. Yuen T, Wurmbach E, Pfeffer RL, Ebersole BJ, Sealfon SC. Accuracy and calibration of commercial oligonucleotide and custom cDNA microarrays. Nucleic Acids Res 2002;30:e48. 46. Barczak A, Rodriguez MW, Hanspers K, et al. Spotted long oligonucleotide arrays for human gene expression analysis. Genome Res 2003;13:1775-85. 47. Carter MG, Hamatani T, Sharov AA, et al. In situ-synthesized novel microarray optimized for mouse stem cell and early

june 8, 2006

Downloaded from www.nejm.org on August 5, 2008 . Copyright © 2006 Massachusetts Medical Society. All rights reserved.

2471

current concepts

developmental expression profiling. Genome Res 2003;13:1011-21. 48. Brazma A, Hingamp P, Quackenbush J, et al. Minimum information about a microarray experiment (MIAME) — toward standards for microarray data. Nat Genet 2001;29:365-71. 49. Ball CA, Brazma A, Causton H, et al. Submission of microarray data to public repositories. PLoS Biol 2004;2:E317. 50. Ball CA, Sherlock G, Parkinson H, et al. The underlying principles of scientific publication. Bioinformatics 2002;18: 1409.


51. Ball C, Brazma A, Causton H, et al. Standards for microarray data: an open letter. Environ Health Perspect 2004;112:A666-A667.
52. Ball CA, Sherlock G, Parkinson H, et al. Standards for microarray data. Science 2002;298:539.
53. Irizarry RA, Warren D, Spencer F, et al. Multiple-laboratory comparison of microarray platforms. Nat Methods 2005;2:345-50. [Erratum, Nat Methods 2005;2:477.]
54. Larkin JE, Frank BC, Gavras H, Sultana R, Quackenbush J. Independence and reproducibility across microarray platforms. Nat Methods 2005;2:337-44.
55. Bammler T, Beyer RP, Bhattacharya S, et al. Standardizing global gene expression analysis between laboratories and across platforms. Nat Methods 2005;2:351-6. [Erratum, Nat Methods 2005;2:477.]
56. Simon R, Radmacher MD, Dobbin K, McShane LM. Pitfalls in the use of DNA microarray data for diagnostic and prognostic classification. J Natl Cancer Inst 2003;95:14-8.
Copyright © 2006 Massachusetts Medical Society.

june 8, 2006

Downloaded from www.nejm.org on August 5, 2008 . Copyright © 2006 Massachusetts Medical Society. All rights reserved.
