
International Journal of Educational Research 92 (2018) 158–172


The concept of an agile curriculum as applied to a middle school mathematics digital learning system (DLS)


Jere Confrey a,⁎, Alan P. Maloney b, Michael Belcher a, William McGowan a, Margaret Hennessey a, Meetal Shah a

a SUDDS (Scaling Up Digital Design Studies) Group, STEM Education Department, College of Education, North Carolina State University, 407 Gorman Street, Raleigh, NC 27607, USA
b The Math Door, 104 Duryer Ct., Cary, NC 27511, USA

ARTICLE INFO

Keywords: Learning Trajectories; Classroom Assessments; Curricular Frameworks; Learning Maps; Middle Grades Mathematics

ABSTRACT

Curricular theory must evolve to keep pace with the implications of the design, use, and effects of deploying and adapting digital curricular resources, especially when placed within digital learning systems (DLS) with rapid feedback and analytic capacity. We introduce an “agile curriculum” framework describing how to use classroom assessment data to regulate teachers’ practices of iteratively adapting curricula. Our DLS, called Math-Mapper 6–8, is introduced as an example with its diagnostic assessments of students’ progress along learning trajectories. An exploratory video study of middle school teachers reviewing, interpreting, and acting on its data, both during instruction (short cycle feedback) and within professional learning communities (long cycle feedback) illustrates how an agile curriculum framework supports data-driven adjustments based on student learning.

1. Introduction Curricular theory must keep pace with the implications of easy access and use of ubiquitous digital curricular resources, especially when they are incorporated into digital learning systems (DLS) with rapid feedback and analytic capacity. Concerns have been raised about how to maintain curricular coherence, when and if teachers supplement their instruction haphazardly by adding in a range of materials of varied quality from the web (Confrey, Gianopulos, McGowan, Shah, & Belcher, 2017; Larson, 2016). At the same time, more and more people are advocating for the customization of curricular offerings instead of insisting on strict fidelity in implementation (Pane, Steiner, Baird, & Hamilton, 2015). What is needed is a means to determine when adaptations to a curriculum are achieving their intended purpose by providing relevant, valid and timely data to teachers. In this paper, we describe one approach to providing such data by using diagnostic assessments built around an explicit theory, that of learning trajectories. To ground our approach in current curricular theory, in Section 2, we trace the evolution of curricular frameworks spanning design, implementation, and outcomes of curriculum and propose and describe a new framework named “the agile curriculum”. We define an agile curriculum as a means to support the ongoing revision and adaptation of teachers’ curricular practices based on providing immediate data about what one’s students are learning. In Section 3, after describing the theoretical components of an agile curriculum and its enacting framework, we introduce and describe briefly the features and affordances of a software application built by the team, designed to support the use of an agile curriculum. It includes a learning map with a foundation in learning trajectories and a set of



Corresponding author. E-mail address: [email protected] (J. Confrey).

https://doi.org/10.1016/j.ijer.2018.09.017 Received 8 April 2018; Received in revised form 20 September 2018; Accepted 24 September 2018 Available online 06 October 2018 0883-0355/ © 2018 Elsevier Ltd. All rights reserved.


Fig. 1. The three phases of curricula.

related digitally-scored diagnostic assessments. In Section 4, in middle schools that use the software in 1-1 enactment, we report on a study of the teachers’ practices reviewing data from the diagnostic assessments on students’ progress on learning trajectories both with their classes and other teachers. Finally, we reflect on the conditions of support needed to implement and refine an agile curriculum. We view the agile curriculum as situated in the broader field of curriculum ergonomics but in a more restricted way. The field as a whole examines the full range of design, implementation, and revision of curricular materials to consider how to take into account the users’ needs. In contrast, the agile curriculum is limited to studying how teachers use and revise existing curriculum materials, where curricular revisions are driven by data from learning-trajectory based classroom assessments. 2. Theoretical views of curricular frameworks 2.1. Curricular frameworks Over the last two decades, frameworks for describing curriculum1 have changed from being oriented to production (input-output) to ones that model use as evolving, iterative, and dynamic. A longstanding framework (Fig. 1) specified that the intended curriculum, reflecting what authors and designers had in mind when the curriculum was originally built, would become the enacted (or implemented) curriculum, encompassing the transformation that occurs as practitioners teach, which would subsequently produce the achieved (or attained) curriculum, describing what students learned (Cuban, 1992; McKnight et al., 1987; Schmidt et al., 1996; Stein, Remillard, & Smith, 2007). Researchers documented how transitions from one component to the next were influenced by teaching conditions, students’ prior knowledge, teacher variables, and varied measures and the student outcomes (Ball & Cohen, 1996; Gehrke, Knapp, & Sirotnik, 1992; Remillard, 1999; Stein, Grover, & Henningsen, 1996; Tarr, Chávez, Reys, & Reys, 2006). Remillard and Heck (2014) offered a different conceptualization for curriculum, policy and enactment, distinguishing “official” from “operational” curriculum mediated by “instructional materials.” Within the operational curriculum, they identified teacherintended curriculum (planning), enacted curriculum, and student outcomes. By placing these three components within an instructional cycle, they conceptualized curriculum “enactment” as interactional and evolving based on a teacher’s interpretative activities and reactions (Remillard & Heck, 2014, p. 711- 716). The role and meaning of implementation fidelity is being revised in light of approaches that legitimize curricular adaptations. Traditionally the degree of adherence to the authors’ intention (implementation fidelity) has been a critical element in evaluating the effectiveness of a curriculum (Huntley, 2009). However, in the last five years, curriculum scholars have reconceptualized the use of materials and human resources to include forms of “re-sourcing” (Pepin, Gueudet, & Trouche, 2013). Remillard (2005) recognized “teachers as ‘active’ designers and users of the curriculum materials (and not as simple transmitters) and analyzed teachers’ usages of resources and interpretation of and participation with the resources.” (Pepin et al., 2013 p. 931). 
This led them to argue therefore that evaluation of a resource’s quality becomes “collective and dynamic,” in that the question is whether a resource is good “for a given context, for a given community, at a given stage of its development,” rather than “of good quality per se” (p. 936). Gueudet and Trouche (2009) use “document” and “documentational genesis” to describe teachers’ ways of adapting and modifying resources. Documentational genesis contains the twin processes of instrumentation (how resources influence teaching) and instrumentalization (how teachers appropriate and reshape resources). These view curriculum as a tool to accomplish instruction and focus on curricular affordances and the process of appropriation in socio-cultural theory. Trouche (2004) introduced instrumental orchestration to analyze how teachers guide students’ instrumental genesis via interactions with a given software. Early experimentation with digital curricula has paved the way for the instrumental orchestrations. E-textbooks began as digital replicas of printed textbooks, but evolved quickly (Chazan & Yerushalmy, 2014; Gueudet, Pepin, & Trouche, 2013). Pepin, Gueudet, Yerushalmy, Trouche, and Chazan (2015) define an e-textbook as “an evolving structured set of digital resources, dedicated to teaching, initially designed by different types of authors, but open for redesign by teachers, both individually and collectively,” (p. 644). Features of these various tools have included: 1) easy revisions and additions, 2) use of a variety of media, and 3) interactivity. Digital affordances can also support jointly-authored digital curricula, and repeated, asynchronous, and distributed revisions (Barquero, Papadopoulos, Barajas, & Kynigos, 2016; Gueudet et al., 2013). These studies demonstrate both the potential and the challenges of distributed joint authorship relative to quality and coherence. Research on the use of open educational resources (OER) reinforces the need to create new frameworks to handle supplementation and modification of curriculum. Since initiation of the Common Core State Standards in Mathematics (CCSS-M) and the severe economic recession (2008–2010) in the U.S., districts and teachers, strapped for funding, turned to the internet for its plethora of (free) resources. 1 We define curriculum as “a plan for the experiences that learners will encounter, as well as the actual experiences they do encounter, that are designed to help them reach specified mathematics objectives” (Remillard & Heck, 2014, p. 707). This definition is broad enough to encompass the underlying learning theory, the curricular materials and classroom assessments themselves, and the modifications that teachers make (with or without intention) during instruction. Later in the paper we propose a framework for the agile curriculum that situates these components into an iterative cycle of improvement.


Research found that 60% of teachers report using the web to supplement instruction (Davis, Choppin, McDuffie, & Drake, 2013). But teachers’ use of the web-based content to assemble curriculum is often distressingly incoherent. Webel, Krupa, and McManus (2015) examined how a group of 5th and 6th grade teachers combined internet materials with other curricular approaches. Asked to evaluate examples of open educational resources (OERs), “the teachers tended to value activities they perceived students would enjoy (e.g., games, online activities, videos), resources with worked examples and opportunities for practicing procedures, activities that they believed students could complete successfully, and, to some extent, resources with multiple representations” (p. 59). Selecting materials for reasons unrelated to their cognitive intentions can undercut curricular coherence: “a collection of educational resources is no more a curriculum than a pile of bricks is a home" (Wiliam, 2018, p. 42). Using standards to locate resources and organize instruction fails on several counts: standards’ highly variable grain-sizes, short ramps to competency, and lack of attention to learning (Confrey et al., 2017). They provide little or no insight into where students are in relation to instructional goals. Clearly, there is a need for a means of monitoring and regulating curricular assembly and modification. In the next section, we suggest an appropriate application of classroom assessment provides a missing piece to how to establish, maintain, and improve curricular quality. 2.2. Classroom assessment Remillard and Heck (2014) situated high-stakes testing as consequential outcomes in the official curriculum, likely acknowledging its direct effects on the designated curriculum, and indirect and limited effects on the enacted curriculum (National Research Council (NRC, 2003). They acknowledge a bidirectional relationship between student outcomes and enacted curriculum, but attribute the bidirectionality to students learning from solving the outcome measures’ tasks. What is as yet underrepresented in the enacted curriculum literature is sufficient attention to “classroom assessments” and features related to them (National Research Council (NRC, 2003; Pellegrino, Chudowsky, Glaser, 2001; Pellegrino, DiBello, & Goldman, 2016) as a means to drive instructional decision-making. Classroom assessments2 are assessments for supporting students while learning by providing relevant, timely, detailed, and actionable feedback on their current progress; it can guide instructional decision-making. This meaning of classroom assessment builds on the foundation of formative assessment (sometimes called “assessment for learning;” Black, Harrison, Lee, Marshall, & Wiliam, 2003). In contrast to assessments that are used to rank, evaluate, or certify some aspect of performance (summative assessments), formative assessment is a systematic process to gather evidence about learning as it is occurring, so as to adapt lessons to increase learners’ likely success in achieving the goals of the lesson (Heritage, 2007). Digital tools are adding to the power and accessibility of formative assessments; for example when student work can be shared in real time through applications such as the STEP formative assessment platform (Olsher, Yerushalmy, & Chazan, 2016). 
A critical element of formative assessment is to strengthen the role of the students as partners in assessment—to collaborate with teachers to assess their own current levels of understanding, and to recognize what they know and what they need to know to succeed, to reduce the “discrepancy between current and desired understanding” (Hattie & Timperley, 2007). Students learn to take an active rather than passive role towards their own learning, strengthening self-regulation strategies, and adapting their learning tactics to meet their own learning needs (Brookhart, 2018; Heritage, 2007) Classroom assessment encompasses formative assessment practices and often involves the application of more formal measurement properties (Confrey, Toutkoushian, & Shah, in press; Confrey & Toutkoushian, in press; Wilson, 2018). It requires an explicit research-based theory of student learning (Pellegrino et al., 2001). Shepard, Penuel, and Pellegrino (2018) argue that classroom assessment should be based on discipline-specific, detailed models of learning, such as developmental models, learning progressions, facets of understanding (Minstrell, 2001), or local instructional theories (Gravemeijer, 1994). The National Research Council (2003) argued that classroom assessments should be dynamic, administered and scored in a timely and immediate way, criterion-referenced, and directly relevant to a student’s current state of learning. Under traditional models, an end-of-unit assessment is used to evaluate students’ levels of achievement before moving to a new topic, whereas classroom assessment drives instructional changes within the unit dynamically and iteratively. Repeatedly leveraging such change necessitates a new curricular framework. In the next section we describe such a framework, the agile curriculum, that uses LT-aligned classroom assessment. 2.3. A framework for an agile curriculum To synthesize the elements of the enacted curriculum with those of classroom assessment, we offer a revised framework (Fig. 2) of curriculum use together with a set of four enactment principles (described below), and call the overall approach “the agile curriculum." The agile curriculum framework is proposed in order to describe a process of continuous revision and improvement based on data gathered during curricular enactment. The term “agile” derives from the agile methods used in software engineering, which focus on setting clear targets for design, creating rapid prototypes for building an application, sharing responsibility among teams for identifying and achieving subgoals, and creating iterative enhancements based on opportunities and weaknesses identified through gathering continuous feedback (Cohen, Lindvall, & Costa, 2003). Analogously, in curricular enactment the focus needs to be on rapidly and flexibly meeting the needs of the students, responding to challenges to and opportunities for learning that arise during the course of instruction. Proposed adaptations of materials and pedagogy need to be evaluated on the basis of feedback data in both a prospective sense and in retrospective analysis. 2 This use of “classroom assessments” contrasts to when used informally by practitioners for any quiz, graded homework, or unit test for grades as a summary of students’ achievement.


Fig. 2. The agile curriculum framework, leveraging two-cycle feedback.

The framework positions the instructional core between the bookends of standards and high stakes testing (as in Confrey and Maloney (2012) and positions curricular materials as the mediators of curricular activity (as in Remillard and Heck (2014). It envisions the instructional core as a cycle and recognizes the role of classroom-based diagnostic assessment data as feedback, first within the classroom modifying instruction and then outside the classroom, among a collective group of practitioners (professional learning communities (PLCs)), modifying subsequent curriculum enactment. Thus, the former requirement for implementation fidelity is exchanged for one in which change is driven by high-quality evidence from classroom assessments. We emphasize that this does not mean we endorse ad hoc curricular assembly—an agile curriculum assumes an initial adoption of carefully-designed materials as its basis but anticipates and welcomes rapid refinements based on evidence. We identify two primary types of feedback cycles. “Short-cycle” feedback operates during instruction episodes such as a curricular unit, using assessment diagnostically to affect learning and instruction during active instruction or prospectively. “Long-cycle” feedback involves retrospective evidence-based deliberations toward revisions of materials and/or sequencing and operates across months and years. Both cycles can involve collective actions among teachers. We further propose four principles of the agile curriculum that meld key features of curricular enactment and classroom assessment. 1.) Explicit, transparent learning theory guides the interpretation of data and enactment. Focusing on student learning is an essential foundation of an agile approach to curriculum. This requires specification of developmentally-appropriate, fine-grained learning theory. We use learning trajectories. 2.) Instructional adjustments and supplementation occur in response to short-cycle feedback during enactment. Curricular revisions occur in response to long-cycle feedback. Both are based on interpretation of multiple sources of data relevant to the curricular aims. Agility implies continuous interactions between curriculum enactment and data. Short term adjustments can be based on student questions, review of student methods or ideas, ways to connect to prior learning, and a need for differentiation among student groups (Pellegrino et al., 2016). Long-cycle adjustments include adoption of an approach or a change in sequencing of topics, based on evidence connected to specific examples across teachers working in professional learning communities (PLCs). 3.) Students are recruited as partners in interpreting and acting on assessment data. Growing evidence acknowledges the importance of strengthening students’ perception of efficacy with regard to their own learning (Heritage, 2010; Heritage, 2007). Therefore, an agile curriculum should provide compelling, immediate, systematic, and actionable learning data, tied to specific curricular goals, to help students identify gaps in their learning and ways forward, and should assist them in developing a “growth mindset” (Dweck, 2006) with regard to their role in the learning process. 4.) Teachers’ roles in instrumental orchestration (Trouche, 2004) are strengthened: they become increasingly skilled in conducting studentcentered instruction while leveraging learning trajectory-based evidence to meet individual and group needs. 
We chose the term “conducting” to emphasize that we are not requiring teachers to be the initial composers of curriculum, but rather recognizing their critical role in refining the compositions, improvising, and adding supplemental elements based on evidence. The conductor is often considered a solitary actor, but we conceptualize teacher orchestration as encompassing both individual and collective action (Pepin et al., 2013). 3. Description of Math-Mapper 6–8 Math-Mapper 6–8 (M-M) was designed and used in this study to support enactment and investigation of the proposed agile curriculum, and its four principles. M-M consists of a learning map, a linked diagnostic assessment system, and various tools for scheduling and organizing instruction, assessments, materials and assignments3, all with learning trajectories as the underlying learning theory (Confrey, 2015). 3 M-M contains a “sequencer” for scheduling the use of materials and assessments, and a “resourcer” that links to curated curricular resources. These were not relevant to the present study.


Fig. 3. “Finding Key Percent Relationships” relational learning cluster from the Math-Mapper 6–8 learning map. The cluster’s three constructs are “Percent as Amount per 100," “Benchmark Percents,” and “Percents as Combination of Other Percents." The LT progress levels (L1-L5) are displayed for the top construct, “Percents as Combination of Other Percents." The progress levels’ grade mapping, based on CCSS-M topic designations, is shown at right.

3.1. Learning map The learning map conveys all the typical middle school math content as the organized territory of what is to be learned, through a hierarchical arrangement of nine big ideas, populated by 24 “relational learning clusters” (RLCs) that contain a total of 62 constructs, each with a related learning trajectory consisting of at least 5 “progress levels.” It is designed intentionally to redirect teachers’ reliance on individual standards toward a focus on big ideas, concepts’ interrelationships, and the underlying learning trajectories4. Learning trajectories (LTs), which form the learning theory underlying M-M, are defined as: …researcher-conjectured, empirically supported description of the ordered network of constructs a student encounters through instruction (i.e., activities, tasks, tools, forms of interaction, and methods of evaluation), in order to move from informal ideas, through successive refinements of representation, articulation, and reflection, toward increasingly complex concepts over time (Confrey, Maloney, Nguyen, Mojica, & Myers, 2009, p. 3). Math-Mapper’s LTs are all based on new syntheses of empirical research on student learning. Expressed as sequences of progress levels, they are, in essence, probabilistic conjectures that identify the landmarks and obstacles students are likely to encounter through instruction, as their thinking develops over time from less sophisticated to more sophisticated (Clements & Sarama, 2004; Confrey, Maloney, Nguyen, & Rupp, 2014; Nguyen & Confrey, 2014; Simon & Tzur, 2004). Understanding and leveraging the learning trajectories towards big ideas is what helps to keep teachers focused on teaching for understanding, drawing on concepts, strategies, procedures. and generalizations. Each RLC contains closely-related constructs, arrayed to suggest an instructional ordering of content: constructs are arrayed vertically to represent more basic (lower) to more sophisticated (higher) constructs, student learning of which is supported by understanding the lower cluster(s). A single RLC, “Finding Key Percent Relationships” comprising three constructs/LTs, is shown in Fig. 3. 3.2. Diagnostic assessment and reporting system Assessments (9–11 items) are administered at the RLC level and require 20–30 min. Each item targets a single LT progress level, is written conceptually to address the level, and is designed to promote classroom discussion. The assessments and their resulting reports are based directly on the LTs. Also available are construct-level practice tests which allow students to select a desired level, work items, and receive immediate feedback. 4

M-M bidirectionally displays relationships between each learning trajectory and the CCSS-M.
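To make the hierarchy of the learning map concrete, the sketch below models it as a set of plain data types. This is an illustrative reading of the structure described in Section 3.1 (big ideas containing relational learning clusters, which contain constructs, each carrying a learning trajectory of progress levels); the class and field names are our own, not Math-Mapper’s actual schema.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ProgressLevel:
    level: int        # 1 = least sophisticated; each trajectory has at least 5 levels
    description: str  # e.g., "Extends benchmark of 100% to scale to 200% or 300%"

@dataclass
class Construct:
    name: str                        # e.g., "Percents as Combination of Other Percents"
    trajectory: List[ProgressLevel]  # the construct's learning trajectory

@dataclass
class RelationalLearningCluster:
    name: str                    # e.g., "Finding Key Percent Relationships"
    constructs: List[Construct]  # arrayed from more basic to more sophisticated

@dataclass
class BigIdea:
    name: str
    clusters: List[RelationalLearningCluster]

# The full map described in Section 3.1: nine big ideas, 24 relational learning
# clusters, and 62 constructs, each with a learning trajectory of >= 5 progress levels.
LearningMap = List[BigIdea]
```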


Fig. 4. A. Top half of a student report for a diagnostic assessment for the cluster “Finding Key Percent Relationships.” Color-coded dials (left side) summarize overall percent correct responses for each construct: percent correct on the assessment (black), additional percent correct due to revision of incorrect items (turquoise). LT-specific color-coded learning ladder (right side) indicates relative levels of correctness for items at each level tested: incorrect (orange); varying percentages correct (shades of blue); untested levels (white). B. Bottom half of a student report for a diagnostic assessment for the cluster “Finding Key Percent Relationships” (“Item matrix”). Item responses are color-coded as in Fig. 4A, with additional information shown in the figure legends at the top and bottom of the figure. Clicking a construct displays the LT; clicking an item-response cell launches an item review panel, which displays the item itself and permits the student to revise response(s) or reveal correct response(s). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article).
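As a concrete reading of the two dial quantities in Fig. 4A, the sketch below computes them from per-item outcomes. The field names and scoring scheme are assumptions made for illustration, not Math-Mapper’s actual report logic.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class ItemOutcome:
    correct_first_try: bool       # answered correctly on the original assessment
    correct_after_revision: bool  # answered correctly only after revising the response

def dial_percentages(items: List[ItemOutcome]) -> Tuple[float, float]:
    """Return (percent correct on the assessment, additional percent gained by revision)."""
    n = len(items)
    first = sum(1 for i in items if i.correct_first_try)
    revised = sum(1 for i in items if not i.correct_first_try and i.correct_after_revision)
    return 100 * first / n, 100 * revised / n

# Example: 6 of 10 items correct initially and 2 more corrected after revision
# yields dials of 60% (black) plus an additional 20% (turquoise).
```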

Upon completion of an assessment, students and teachers receive immediate feedback. Student reports simultaneously provide feedback at several different levels of detail (Fig. 4A, B). Autonomous access to their own assessment data from tests, practice tests, and retests is designed to help students reflect on their current understanding and their learning needs, in relation to the learning targets of constructs and clusters, and to recruit them as partners in acting on their data. Class reports provide teachers with “heatmap” displays that detail the entire class’s assessment responses—by student, item, LT level, and construct (Fig. 5). The reports were designed to support teachers in making instructional adjustments based on data and in conducting student-centered instruction. M-M was designed to support an agile curriculum, with explicit features developed to align with the four principles of the agile curriculum. In the section that follows, we report on a study about the degree to which teachers and students engage agilely in instructional adaptations based on the reports they receive on students’ progress along learning trajectories.


Fig. 5. Heatmap for the cluster “Finding Key Percent Relationships” (T2′s class). Vertical axis/rows: Progress levels for corresponding construct LT (low to high). Horizontal axis/columns: students, ordered from weakest (left) to strongest (right) by overall cluster performance. Cells: student’s performance on individual tested item, color-coded as in student report (Fig. 4A, right, and Fig. 4B). Clicking the left margin of any heatmap reveals the specific item, along with an item analysis tool that (anonymously) specifies all student responses for that item.
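Read as data, the heatmap in Fig. 5 is a matrix of item scores indexed by LT progress level (rows) and student (columns), with columns ordered by overall cluster performance. A minimal sketch of that arrangement follows; the data shapes and function names are our own illustration, not the software’s internals.

```python
from typing import Dict, List, Tuple

# score per student per item: 1.0 = correct, 0.0 = incorrect, values in between = partial credit
Scores = Dict[str, Dict[str, float]]

def column_order(scores: Scores) -> List[str]:
    """Order students from weakest (leftmost column) to strongest (rightmost),
    by overall performance on the cluster, as in Fig. 5."""
    def overall(student: str) -> float:
        items = scores[student]
        return sum(items.values()) / len(items) if items else 0.0
    return sorted(scores, key=overall)

def heatmap_cells(scores: Scores, item_level: Dict[str, int]) -> List[Tuple[int, str, float]]:
    """Return (progress_level, student, score) triples; rows run from low to high LT level."""
    cells = []
    for student in column_order(scores):
        for item, score in scores[student].items():
            cells.append((item_level[item], student, score))
    return sorted(cells, key=lambda cell: cell[0])
```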

4. An exploratory study using Math-Mapper 6–8

A central component of the agile curriculum is the use of data by teachers and students to influence and support learning. Therefore, we chose to examine in detail a sample of classes to explore how teachers reviewed data with their students and used the information from the diagnostic assessments. It requires considerable time for students and teachers to learn to use M-M and to invent relevant practices that integrate its use into instruction. Their practices continue to evolve and change. For this reason, we collected data from participating teachers at multiple points over time. We chose to conduct an exploratory study to identify ways to measure salient instructional factors related to data review (and subsequently link them to student outcomes). The study’s data collection examined: 1) teachers’ proficiency in accessing and using the assessment data, 2) teachers’ application of classroom assessment data to improve learning, and 3) instructional episodes connected with data reviews, for evidence of students’ active participation in the learning process and of teachers’ facilitation of such. To include both short- and long-term cycles of feedback in the framework, we investigated both classroom data review sessions and teachers’ collective (PLC) discussions. The research questions explore how the four principles of an agile curriculum are exercised within the framework:

1 How do teachers review the data from the heatmaps and use student reports to make adjustments in their instruction?
2 What role, if any, do the learning trajectories appear to play in those processes?
3 How do students participate in the review process? Is there evidence that they become more active partners in the assessment process?
4 How do the teachers collectively discuss, interpret, and use their data to adjust, or plan to adjust, instruction?

4.1. Context

Field-testing of M-M has been conducted at three partner schools in two districts. One district, listed as low-performing at the state level, has transitioned to digital resources more recently. The second district is recognized as high-performing and has used digital resources extensively (Table 1). The data for this study consisted of videos of teachers reviewing results with their students in class, and of monthly grade-level meetings of District 1 professional learning communities (Table 2). Seven class sessions were selected to represent contrasting cases across different grade levels, topics, districts, and styles of teaching. One teacher (T2) was selected for study across multiple classes over time to examine the variation in her evolving practice.


Table 1
Research site student and teacher demographics.

                                             District 1    District 2
Population served                            977           1163
African-American (%)                         27            4
Asian (%)                                    1             9
Hispanic and Mixed (%)                       10            8
White (%)                                    53            79
Percent Free and Reduced Lunch               56.9          9.9
Number of years implementing 1-1 computing   3             5
Number of teachers participating             19            33
Number of tests taken                        9,197         21,696

Table 2
Classroom data-review sessions and PLC sessions analyzed.

Grade   District   Teacher/PLC   Cluster                     No. obs’v’ns   Year
6       1          T1            Ratio                       1              1
6       1          T2            Ratio, Percents             3              1-2
7       1          T3            Ratio                       1              2
7       2          T4            Percents                    1              2
7       2          T5            Area of circles             1              2
6       1          Gr. 6 PLC     Percents                    1              2
8       1          Gr. 8 PLC     Bivariate Data, Functions   1              2

Table 3
Categories and codes for video analysis.

Review Data
  Data Sources: Learning Map; Student reports; Heatmaps; Practice
  Norms of Data Interpretation: Dichotomous evaluative feedback (good/bad); Growth Mindset; Low Expectations of Students; Connection of feedback to LT; Student Self Reflection; Student Self-correction

Instructional Actions (decisions)
  Whole-Class: Teacher meta-talk; Test strategies of exploring the options; Leveraging connected knowledge; Establishing classroom norms around data use; Showing evidence of instructional insecurity; Characterizing interaction patterns by task
  Peer-to-Peer: Form groups; Assign to Practice; Provide assistance; Monitor Progress

4.2. Methods and analyses

Several class sessions and PLC meetings were video recorded. The recordings were transcribed and reviewed by research team members. Inductive-contrastive analysis was undertaken (Derry et al., 2010, p. 10). Because this was an exploratory study, a single team member watched all the videos and categorized episodes as: 1) representing actions involving a review of the data or 2) examples of instructional actions. Within the category of reviewing data, the episodes were further labeled as either discussing the meaning and features of specific data sources or as helping the class form norms of data use. The instructional actions fell into two broad categories according to the organization of the class: whole-class or peer-to-peer. Across these four categories, 51 codes were proposed and presented to the team. The team viewed a subset of videos and selected 20 codes as most representative of the frequent and consequential examples of data use (see Table 3). To prepare for coding by the research team, an initial standard was set by applying these codes to a set of 39 episodes. Episodes could be coded with multiple codes. To measure inter-rater reliability, all researchers coded the identified episodes, and a pooled kappa (De Vries, Elliott, Kanouse, & Teleki, 2008) was calculated to measure reviewers’ agreement with the standard. A pooled kappa greater than 0.75 (the criterion for excellent clinical significance per Cicchetti (1994)) was chosen as the minimum requirement for agreement. After a training period, reviewers’ pooled kappa exceeded 0.75 (minimum of 0.79). With inter-rater reliability established, transcripts were distributed, and two researchers independently coded each transcript using the Dedoose (2018) application. Across all transcripts, the two reviewers agreed on all codes applied for 87.8% of the episodes (567 of 646 episodes coded). For the remaining episodes, agreement was established through team discussion. Illustrative examples were selected for use in this paper.
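For readers unfamiliar with the statistic, the sketch below shows one common way to compute a pooled kappa for binary code assignments: observed and chance agreement are computed per code against the standard coding and then pooled (averaged) before the usual kappa ratio is formed. This is an illustrative reading of De Vries et al. (2008), not the team’s actual analysis script.

```python
from typing import Dict, List

def pooled_kappa(standard: Dict[str, List[int]], rater: Dict[str, List[int]]) -> float:
    """Pooled kappa across codes.

    standard[code] and rater[code] are parallel 0/1 vectors over episodes indicating
    whether that code was applied to each episode. Observed and chance agreement are
    averaged over codes before forming kappa = (po - pe) / (1 - pe).
    """
    po_sum = pe_sum = 0.0
    for code, gold in standard.items():
        pred = rater[code]
        n = len(gold)
        po = sum(1 for g, p in zip(gold, pred) if g == p) / n   # observed agreement
        g_rate, p_rate = sum(gold) / n, sum(pred) / n           # marginal rates of applying the code
        pe = g_rate * p_rate + (1 - g_rate) * (1 - p_rate)      # chance agreement
        po_sum, pe_sum = po_sum + po, pe_sum + pe
    po_bar, pe_bar = po_sum / len(standard), pe_sum / len(standard)
    return (po_bar - pe_bar) / (1 - pe_bar)
```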


4.3. Results We present results relating to each research question, describing the relevant components of the theory of action for each question. Specific examples are categorized according to the components of the theory of action (Table 3). 4.3.1. Research Question 1: how do teachers review the data from the heatmaps and use student reports to make adjustments in their instruction? 4.3.1.1. Data Sources. Teachers differed in their facility with and proclivity towards use of M-M. Two teachers did not share the data representations of student reports or a heatmap, and only reviewed the assessment items one after the other, mirroring the conduct of a traditional test review. This contrasted with other teachers who carefully demonstrated the features of the tool and communicated their purpose to students before interpreting the data. For example, during year 1, when T2 introduced an anonymized version of her class’s heatmap for the first time, she told the class how to read the data display, and discussed how to interpret the map’s color-coding: T2: Now, the white spaces on my report mean you didn’t have that question, because we didn’t all have the same question. So, if there’s a white space, it means you didn’t have a question there. And… each one of these columns is an individual student. She asked which color they thought stood out the most. The students replied in unison, “Yellow.”5She asked them if the “yellow popping out” was a “good thing,” and the students replied in the negative. This orientation towards data would be expected in a “performance setting” where a teacher reports on how well the material was learned. However, she moved from evaluating to interpreting the data, using “we” repeatedly to identify the whole class with the heatmap results. In year 2, after the M-M Practice feature became available, she and her class discussed the heatmap data from their recent test on “Key Percent Relationships” and asked: “Which construct would you say would be the first one we need to practice?” They replied, “Construct C,” which had more orange (Fig. 5). The teacher used the features of the heatmaps to guide her instruction. 4.3.1.2. Instructional actions (whole class). The majority of the teachers chose to review the data with the class as a whole, occasionally working individually or in pairs. In the first year of M-M6 use, all teachers reviewed assessment items with the class using a preview of the test, with little reference to the LTs. Most of the review was teacher-directed, repeating past methods rather than reteaching the topic. Students’ participation consisted of short responses to teacher questions, without requests for explanations. The excerpt below is typical of the interactions in T2′s first data review. T2: Yes. Ok, that makes sense. Ok so now I’m using less because 5 is less than 35. 5 divided by 5? S (students): 1 T2: One cup of mango. Is 1 less than 6. S: Yes T2: Ok, that makes sense 15 divided by 5. S: 3 Some teacher explanations were clear, while others revealed evidence of instructional insecurity. Weaker teachers revealed a tendency to work only part of the problem, failing to reach closure. T3 posted the table of values from the problem and then copied two rows of the table to the right to make it look like a ratio box, one row with two values and the other with a missing value. She solved it cursorily, then said: T3: “So, somebody, be honest and tell me. 
Was it the table or was there something else that made you not understand this?” S1: Mine was the wording of the questions T3: the wording of the questions? What, can you be more specific? S1: Like, on some of them, it didn’t make sense that, like, when I asked you what it was asking on the first one, I didn’t get it, like what it was asking, but I knew how to do it. Despite having solicited student opinion, the teacher’s lack of follow-up left the student with little benefit from the exchange. In year 1, while reviewing M-M items, teachers’ references to similar problems not on the assessment were rare and yielded unpredictable outcomes. T1 once abruptly asked“…if 5 golden unicorns cost $40, how does it compare to buying one unicorn for $10?”, leaving the students confused. In contrast, T2 smoothly connected her discussion to a problem on ratio equivalence from the original context: T2: All right if Mary has only 6 cups of watermelon how many of each fruit does she need in order to make less tropical punch that still taste the same. So we're back to Arturo and Monique. Remember, Monique wanted to make the same lemonade but she wanted to make less than Arturo. It appeared to help establish a connection between regular instruction and the assessment review process. 5 6

5 Mostly orange; hues can differ based on the individual display monitor setting.
6 T1 and the first class of T2 occurred in year 1. T3 joined the program this year, so she is also viewed as part of the year 1 group.


4.3.1.3. Instructional actions (peer-to-peer). T5 created small groups and facilitated student-directed discussion of data, based on their performance on a test on the cluster “Measuring Characteristics of Circles.” T5 allowed groups to collaboratively decide where to focus their efforts using the Practice feature to work items aligned to a selected LT level. He encouraged his students to act on their data to gain competency: T5: If you look at your stats, what I want you guys to do now is kind of discuss with one another because I’ve paired you guys together because you guys have very similar questions right and wrong. Look at the stacks, see what level and what construct you guys didn’t get right—you maybe misunderstood—and then you can do some practice within that construct. So why don’t you guys take a look at there and see what level you want to use, talk to each other and then kind of work through the practice problems together. And then you can revise and revisit your questions. One group of students, recognizing they needed to work on levels 2 and 3 of “Area of Circles” (construct B), opted to begin their practice with level 3. S2: So basically when…I see construct A, I see pi and circumference it shows that L2, or meaning Level 2 with estimating pi, by comparing the lengths of circumference to diameter as ratios. It looked like I got that wrong which means that I would need, like, I think we would need more practice in that one. S3: And then Level 1, level 6, level 5, level 4, Level 3 are all blue… S1: Yeah, so we got those right, and then for the top one, construct B. S3: Yeah, we need Level 2 and level 3. By allowing students to direct their review process and work collaboratively to revisit and revise items, T5 empowered students as partners in the assessment process—but not without accountability. Throughout the peer-to-peer session, T5 monitored students’ progress, probed for understanding, and intervened effectively and efficiently when necessary. 4.3.2. Research Question 2: what role, if any, do the learning trajectories appear to play in those processes? 4.3.2.1. Data sources. During the first year, we seldom observed teachers refer to LT levels; it took time for the teachers to recognize their utility, let alone begin to share them with students. In year 2, however, more teachers were observed leveraging the LTs. For example, T4 encouraged her students to view their own reports on their devices and drew their attention to the LTs. Student responses indicated that they comprehended the hierarchy in the levels: T4: Can someone tell me what you think they mean? You’re thinking of ladder. What do you think, [Student 1]? S1: …L1 is the first level. T4: I like that. So, if you think math levels what do you think that should be? What do you think, [S2]? S2: Um, like how, advanced or how like… T4: Ooh, I love that. Very nice. So how advanced, okay, so L1 compared to L5. In this, which one do you think is the easiest level, which one do you think is the most challenging? [S3]. S3: L1 is the easiest and L5 is the hardest. She deftly linked the student reports to practice activity: T4: Orange, ok, so I would start focusing on whichever areas that you have on your reports that are orange, because that’s where you struggled with, and I would do the practice problems in that area. Feeling good? S1: Good, because then if they are going to do like a test, after this, then they know how to do it. S2: So, like study more on that, to like… T4: To study more. 
It identifies which ones you might still need a little bit of help on. By restating Student 2′s response, she emphasized how to focus on levels with which they needed help and strengthened student agency. Teachers also referred to the “revise or reveal” feature of student reports to increase student participation in assessment routines.

4.3.2.2. Instructional actions (whole class). Teachers changed in the way they referred to the LTs, the constructs, and the levels. In year one, T2 related each of the problems to the particular construct in the map by name but she did not refer to the items’ LT levels. T2: “Identifying Ratio Equivalence” was measured by question 1, 2, and 3. So we want to go back to questions 1, 2, and 3 and see what we did wrong. “Finding Base Ratios” was questions 4 and 5. We can go back and figure out what happened. A compelling example of the evolution of a teacher’s ability to make instructional adjustments based on data involved T2 discussing the construct “Combinations of Percents.” During an examination of a heatmap, she asked her class7 to identify where they should focus their attention: T2: Okay, so < student > said Construct C is our problem area and I agree with him…I blame myself for this, because we


Fig. 6. A base ratio assessment item with multiple solution methods.

haven’t done a lot with construct C. But this helps me because we need to go back and do a little bit more, do some practice. So, percents as combinations of other percents. So, knowing that we can take 20% and 1% and put them together to make 21%. And knowing that we can take 100% of something and doubling it. We can have more than 100%. If you have more than what you planned on having, you have more than 100%, ‘cause you got extra. We didn’t do a lot of that, so I blame myself for that… Which one [level] do you think is the most problematic? Class: Level 3
T2 realized that she neglected to teach a Level 3 topic (“Extends benchmark of 100% to scale to, for example, 200% or 300% as 2 times or 3 times as large”), and admitted that to her class. Then, to address that oversight, she pulled up an item from level 3 about a puppy that originally weighed 25 pounds and had grown to 50 lbs., and used it to teach the topic. The item asked what percent of the puppy’s original weight is his current weight. Below is the exchange with students: T: Now in reading the question, well not question, statements. Every statement says “of its original weight.” What is the original weight of the puppy? S: 25 T: 25 percent, so 25 percent, er, 25 pounds is our 100 percent. So, the puppy is now more than 25 pounds, so the puppy is more than 100 percent. You’ve doubled it, we’ve doubled the weight from 25 pounds to 50 pounds, so if I double 100%, I get 200%. So the puppy’s current weight is 200% of its original weight. And again, it goes back to that piece where it says “of its original weight.” We can be more than 100%. Yes?
In the class video, after teaching 200% using the puppy problem from level 3, T2 pulled up a problem from level 5 (“Combines known percents to multiply, divide, add, or subtract to find any percent”). It required the students to find 245% of a $130 water bill using combinations of percents. Throughout this exchange, the teacher leveraged the structure of the map to maintain coherence of ideas and deliver student-centered instruction. T2 trusted the students to take the lead on the problem. She acted as a note taker, recording the percentages students calculated. She tracked the conversation, without suggesting the students solve the problem a particular way. The students confidently proposed four distinct methods for solving the problem, delighting themselves and the teacher.
7 It is worth noting that, within the CCSS-M, it is easy to overlook percents greater than 100 as the standard does not specify different cases of percentages: 6.RP.A.3.c: Find a percent of a quantity as a rate per 100 (e.g., 30% of a quantity means 30/100 times the quantity); solve problems involving finding the whole, given a part and the percent. (CCSS-I, 2011)

4.3.3. Research Question 3: how do students participate in the review process; is there evidence that they become more active partners in the assessment process?

4.3.3.1. Norms for data interpretation. The ways teachers talked to students about interpreting the assessment data fell along a spectrum. On one end was an evaluative, performance-based interpretation, with data not used effectively to inform instruction. At the other was a formative, growth-based orientation that focused on promoting learning (Heritage, 2007), the data being central to identifying areas for growth and instructional decisions. We illustrate this with examples of use of M-M data in the classrooms of two teachers. T1 (6th grade) demonstrated a rigid approach to learning, frequently describing questions as “easy” or “hard,” and not as opportunities to learn. She did not use heatmap data. Several students attempted to describe methods they had used to solve a problem (Fig. 6).
Throughout the exchange, T1 did not follow up on productive comments that would have engaged students as partners in assessment; she instead sought out students who were, like her, “confused.” T2 (6th grade) demonstrated many instances of a growth-based orientation. She had created a Google form as well as a paper reflection sheet for her students to use when reviewing their own reports: We did the Google form before and then we’ve done this reflection sheet, and we’ve done without the reflection sheet. I’m just trying to find…the perfect fit of what works best for us, as far as reflecting on the information that we gathered.


She instructed her students to iteratively cycle through observing data (viewing test results in reports), interpreting data (identifying areas of weak understanding as displayed in the reports), and acting on data (revising missed questions, practicing weak constructs, and eventually taking a retest). On a separate occasion, she offered the following to assist her students in viewing their percent correct scores from a nonevaluative perspective: …I don’t want you to focus so much on the percentage you got on the test, but where you fall in the [proficiency] chart. Ok, so yes, a 63 may not be what you typically think of as a good score. However, if a 63 is showing proficiency, in our chart, then we are doing pretty well. Ok, are you at the top end, at 100? No. That’s ok. I don’t think I ever score 100 on anything. But that just means we can learn from it. [italics added] Later, referring to (anonymized) heatmap data, she said, The great thing about this, it shows us what we need to learn. So we have some work to do. As a whole, as a group. T2 repeatedly exhorted her class that finding out what you don’t know is a good thing, that learning gains result from hard work, and that working together will yield the best outcomes for all. 4.3.4. Research Question 4: how do the teachers collectively discuss, interpret, and use their data to adjust, or plan to adjust, instruction? 4.3.4.1. Norms for data interpretation. Teachers were observed collectively reflecting on their instruction through the lens of their diagnostic assessment data. Over time, we observed teachers adopting more of a growth orientation: toward their use of the tool, their own teaching, and their students’ mathematical aptitude. “Last year, it felt like they were going to fail, and it was going to be terrible. And this year I’ve approached it more as a learning tool and less as a be-all end-all test kind of thing,” said a sixth-grade teacher in a PLC. T2 described above is part of the same PLC, and reflected on how her teaching had changed as a result of M-M: R: What do you think makes the difference between last year and this year, in terms of the results from the students? T2: As far as mine goes, my kids have a stronger math comprehension this year. Coming from elementary school. Last year, I had a lot of 1 s, 2 s and 3 s [on the end-of-grade test]. This year I have mostly 3 s and 4 s. So my kids came in stronger, which I think leads back to this. But I hope that I’m a little bit stronger, you know, using this [M-M] to guide my instruction and decide, ‘What was I missing?’ Like what steps, what stones in the sidewalk did I just forget to put down for the kids? And I mean still, even still, like I said construct C was our lowest, and we talked about that, and the fact that I still, this year missed that step of going past 100%. 4.3.4.2. Instructional actions (whole class). After her in-class discovery that she had overlooked teaching percents greater than 100%, T2 discussed with the 6th grade PLC her oversight and recovery based on the analysis of the data in the heatmap: T2: Yeah, I found with mine, and I discussed it with them today in the video, that was a “me” problem. Like, “I did not discuss with you percentages over 100.″ T6: Yeah, I hadn’t done it. I realized that too with mine. That I hadn’t taught it. 
T2: We hadn’t taught it, and it was kind of a confusing situation, because they felt like “Well you said it was always 100 is like the total, so why are we going over 100 if you said 100 is the total.” So I have to find a way to reorganize that, like we talked about “ok, if your total is this, but you get more than that, you’re going over that total. You’re going over 100.″ She related to her colleagues the multiple ways the students solved the level 5 problem, and then compared it with a problem from level 2 (“Builds up to other percents including using addition or multiplication of benchmark percent of 1%”): T2: And then we looked at the questions for [levels] 3 and 5. We actually did question [level] 5 like four times, in four different ways. That’s where my low kid had his “aha” moment….it’s definitely combining, not so much combining, but going above 100. So they were OK talking about the one where you have to find 23% of 80 [level 2], and two students do it two different ways. One finds 20%, finds 1% and adds. The other finds 25% and 1% and subtracts. And they understood that, they could conceptualize that, for the most part, but going above 100% was just for the most part like, mind boggling to them. So we did spend a lot of time on this one. The episode demonstrates the extent to which T2 learns to respond more agilely to student learning needs, and supports studentdirected learning, all facilitated by her use of M-M affordances. T2: 1) recognized her own mistake in having neglected to teach the topic (percents greater than 100%), and was forthright enough to take responsibility with the students, 2) worked an item from the neglected level, teaching the topic in real time, 3) allowed the students to direct their own approaches to a substantially harder problem from level 5, while taking the role of orchestrator, and 4) reported back on her mistake and adaptation to other teachers, learned they also had missed it, and explained to them how she leveraged the LT to address the issue in real time. The other 6th grade teachers also recognized that they had neglected the topic of percents greater than 100 and made plans to address the topic prior to the end of the unit. 4.4. Discussion We developed a theory of action (Argyris & Schon, 1978; Elmore, 2006) by studying the positive and negative examples of how teachers used the data to review instruction and refine their curricular use. By using the two observed major categories of “data 169


Fig. 7. Initial theory of action for class reviews of classroom M-M assessment data, constituting the short-cycle feedback loop.

review” and “instructional actions” from our coding organization, we were able to elaborate on the short cycle of the overall “agile curriculum” model (Fig. 7). It revealed variations in how teachers used the tool, how they supported students in focusing on their learning and next steps, and how they adjusted their instructional practices to respond to and stimulate the changes. From these analyses, we obtained preliminary answers to our research questions. The first question asked how teachers review the data from the heatmaps and student reports and use it to make adjustments in instruction. Variations immediately were apparent in the extent to which teachers leveraged the tool’s affordances for displaying data. Most of the teachers did share the heatmaps with students, connecting item review with a systematic approach to the data. Once the teachers used the data displays, most used them to decide where to focus attention and how to sequence the items, making data reviews more efficient and coherent. Most teachers also made significant use of the student reports, particularly by encouraging students to revise their responses either on their own, with peers, or after engaging in practice. Question 2 concerned the role and influence of the LTs on classroom practice. All teachers leveraged the LTs to some degree by reviewing the items in order of the levels. At first teachers hesitated to discuss an LT itself, fearing it was too difficult or obtuse to students—although they would use the labels. Over time, teachers and students were both observed to use the LTs to characterize student knowledge and to describe what remained to be learned. In one case, weak performance on items from one level of an LT led to teachers’ awareness that they had neglected an important topic—one missing explicitly from the Standards. The 6th grade PLC discussion of percents greater than 100 therefore provides a concrete example of how the LT structure and corresponding diagnostic assessment data can be used to identify critical learning topics that may be overlooked in a standards-based approach, and support teachers in developing curriculum-context knowledge (CCK) (Choppin, 2009). In this case, CCK is developed not through successive enactments of a lesson, but rather through a professional conversation about student progress on LTs based on an instructional sequence. Recognizing how the LTs can guide instruction is a gradual process, and progress does not come “free.” Active instruction, whether whole-class or through careful monitoring of group work, was a necessary element in achieving such progress. The items provided students opportunities to learn unmastered content, and the LTs provided guidance and direction. But also critical was the teacher’s ability to facilitate that learning using a variety of strategies, and developing those into classroom norms that would establish an agile approach to curricular enactment. Teachers who leveraged the structure of the map were able to maintain coherence in assessment reviews, rather than treating each item as an independent skill. Students were involved in the assessment process in various ways. At first, the teachers recognized a need to help students interpret the lower scores, and worried about students becoming discouraged. Over time, as the engagement with more conceptual problems generated interesting classroom exchanges, and teachers and students both began to focus more on learning instead of evaluation. 
Student involvement also increased as teachers gradually relied less on teacher-directed explanations and allowed more space for students’ choices of problem focus and for their proposals, ideas, and explanations. There was also evidence of subtle shifts in how strongly teachers insisted on particular choices of representations, in the involvement of students in rephrasing the problems, and in the solicitation of multiple methods. These shifts seemed to indicate that many teachers began to trust their students more, to value their contributions and insights more, and to give them a more active role in the assessment process.

5. Conclusions

The context of curriculum ergonomics invites one to propose methods of curricular revisioning based on data. Specifically, it provides the opportunity to integrate disparate fields of scholarship in order to strengthen and guide that data-based revisioning process. To this end, we integrated two fields of scholarship: curricular theory and classroom assessment. Curricular theory has progressed towards recognition of continuous change by communities of practice, and classroom assessment relies on its systematic application to learning as it occurs within instructional settings. Both thrive within digitally supported environments that can support revisioning and related documentation, as well as online assessment with immediate scoring and reporting to diverse users.

In this paper, specifically, we demonstrate a means to guide curricular revisions and adaptations based on data about students’ progress along learning trajectories. As articulated in our agile curricular framework, we envision at least two distinct cycles for applying the feedback from diagnostic assessments: changes during instruction (short-term feedback) and changes in further curricular enactments (long-term feedback) based on collective data discussions and related proposals.

To help the reader understand the concept of an agile curriculum, we provided data on multiple teachers’ review practices and discussed how they related to our theory of action within the classroom. We further reported on how those data were used to guide discussions within professional learning communities when deciding on long-term curricular adjustments. The examples from the study only begin to illustrate the evolution of curriculum envisioned in the agile concept, and they help to explain why the concept of agility is accompanied by four principles. An explicit theory of learning (learning trajectories in our example) was necessary to guide the development and analysis of the assessments. In our study, the examples illustrate how teachers leverage both the fine-grained delineation and the sequential structure of the data from the learning trajectory levels. The study also reported on the ways teachers use relevant and timely feedback to guide immediate discussions and influence next steps. With the example of reasoning about percents greater than 100%, one sees how the data affect short-term instructional decision-making and then, collectively, long-term curricular planning. The example also illustrates how the digital resource depends on a critical role for teachers: how facilely and effectively they use the tool, how they envision the role of classroom assessment in targeting learning, and the degree to which they use the data to promote learner-centered instruction.

Limitations of the study include the relatively small number of schools and teachers and its focus only on the data-review aspects of instruction. Further work is needed to determine how various review practices are linked to changes in student outcomes and to understand how students view the changes in assessment approaches. Finally, we close by recognizing that secure research results on these complex digital learning systems will accrue slowly, and only as researchers, district personnel, and practitioners engage with DLSs that use varied instructional organizations.

Acknowledgement

This work was supported by the National Science Foundation (NSF) under Grant 1621254.

References

Argyris, C., & Schon, D. (1978). Organizational learning: A theory of action perspective. Reading, MA: Addison-Wesley Publishing Company.
Ball, D. L., & Cohen, D. K. (1996). Reform by the book: What is—or might be—the role of curriculum materials in teacher learning and instructional reform? Educational Researcher, 25(9), 6–14.
Barquero, B., Papadopoulos, I., Barajas, M., & Kynigos, C. (2016). Cross-case design in using digital technologies: Two communities of interest designing a c-book unit. Extended paper presented in TSG 36, Task Design, ICME 13.
Black, P., Harrison, C., Lee, C., Marshall, B., & Wiliam, D. (2003). Assessment for learning: Putting it into practice. Buckingham, UK: Open University Press.
Brookhart, S. M. (2018). Learning is the primary source of coherence in assessment. Educational Measurement: Issues and Practice, 37(1), 35–38.
CCSS-I (2011). Mathematics standards. Accessed 17 July 2016. www.corestandards.org/Math.
Chazan, D., & Yerushalmy, M. (2014). The future of mathematics textbooks: Ramifications of technological change. In M. Stochetti (Ed.), Media and education in the digital age: Concepts, assessment and subversions (pp. 63–76). New York: Peter Lang.
Choppin, J. (2009). Curriculum-context knowledge: Teacher learning from successive enactments of a standards-based mathematics curriculum. Curriculum Inquiry, 39(2), 287–320.
Cicchetti, D. V. (1994). Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychological Assessment, 6(4), 284–290.
Clements, D. H., & Sarama, J. (2004). Learning trajectories in mathematics education. Mathematical Thinking and Learning, 6(2), 81–89.
Cohen, D., Lindvall, M., & Costa, P. (2003). Agile software development. Rome, NY: Data and Analysis Center for Software.
Confrey, J. (2015). Some possible implications of data-intensive research in education—The value of learning maps and evidence-centered design of assessment to educational data mining. In C. Dede (Ed.), Data-intensive research in education: Current work and next steps (pp. 79–87). Washington, DC: Computing Research Association.
Confrey, J., & Maloney, A. (2012). A next generation digital classroom assessment based on learning trajectories. In C. Dede & J. Richards (Eds.), Steps toward a digital teaching platform (pp. 134–152). New York: Teachers College Press.
Confrey, J., Gianopulos, G., McGowan, W., Shah, M., & Belcher, M. (2017). Scaffolding learner-centered curricular coherence using learning maps and diagnostic assessments designed around mathematics learning trajectories. ZDM, 49(5), 717–734.
Confrey, J., Maloney, A., Nguyen, K. H., Mojica, G., & Myers, M. (2009). Equipartitioning/splitting as a foundation of rational number reasoning using learning trajectories. Paper presented at the Proceedings of the 33rd Conference of the International Group for the Psychology of Mathematics Education.
Confrey, J., Maloney, A. P., Nguyen, K. H., & Rupp, A. A. (2014). Equipartitioning, a foundation for rational number reasoning: Elucidation of a learning trajectory. In A. P. Maloney, J. Confrey, & K. H. Nguyen (Eds.), Learning over time: Learning trajectories in mathematics education (pp. 61–96). Charlotte, NC: Information Age Publishing.
Confrey, J., Toutkoushian, E., & Shah, M. (in press). A validation argument from soup to nuts: Assessing progress on learning trajectories for middle school mathematics. Applied Measurement in Education.
Confrey, J., & Toutkoushian, E. (in press). A validation approach to middle-grades learning trajectories within a digital learning system applied to the “measurement of characteristics of circles.” In J. Bostic, E. E. Krupa, & J. Shih (Eds.), Quantitative measures of mathematical knowledge: Researching instruments and perspectives. New York, NY: Routledge.
Cuban, L. (1992). Curriculum stability and change. In P. W. Jackson (Ed.), Handbook of research on curriculum (pp. 216–247). New York: Simon & Schuster Macmillan.
Davis, J., Choppin, J., McDuffie, A. R., & Drake, C. (2013). Common Core State Standards for mathematics: Middle school mathematics teachers’ perceptions. Rochester, NY: The Warner Center for Professional Development and Education Reform.
De Vries, H., Elliott, M. N., Kanouse, D. E., & Teleki, S. S. (2008). Using pooled kappa to summarize interrater agreement across many items. Field Methods, 20(3), 272–282.
Dedoose (Version 8.0.35). Los Angeles, CA: SocioCultural Research Consultants, LLC. Retrieved from www.dedoose.com.
Derry, S. J., Pea, R. D., Barron, B., Engle, R. A., Erickson, F., Goldman, R., ... Sherin, B. L. (2010). Conducting video research in the learning sciences: Guidance on selection, analysis, technology, and ethics. Journal of the Learning Sciences, 19(1), 3–53.


Dweck, C. S. (2006). Mindset: The new psychology of success. New York: Random House.
Elmore, R. F. (2006). International perspectives on school leadership for systemic improvement. Politics, (July), 1–28.
Gehrke, N. J., Knapp, M. S., & Sirotnik, K. A. (1992). In search of the school curriculum. Review of Research in Education, 18(1), 51–110.
Gravemeijer, K. (1994). Educational development and developmental research in mathematics education. Journal for Research in Mathematics Education, 25(5), 443–471.
Gueudet, G., & Trouche, L. (2009). Towards new documentation systems for mathematics teachers? Educational Studies in Mathematics, 71(3), 199–218.
Gueudet, G., Pepin, B., & Trouche, L. (2013). Textbooks design and digital resources. In C. Margolinas (Ed.), Task design in mathematics education: Proceedings of ICMI Study 22 (pp. 327–337). Accessed 21 July 2016. https://hal.archives-ouvertes.fr/hal-00834054v2.
Hattie, J., & Timperley, H. (2007). The power of feedback. Review of Educational Research, 77(1), 81–112.
Heritage, M. (2007). Formative assessment: What do teachers need to know and do? Phi Delta Kappan, 89(2), 140–145.
Heritage, M. (2010). Formative assessment and next-generation assessment systems: Are we losing an opportunity? Council of Chief State School Officers.
Huntley, M. (2009). Measuring curriculum implementation. Journal for Research in Mathematics Education, 40(4), 355–362.
Larson, M. (2016). Curricular coherence in the age of open educational resources. Accessed 13 August 2016. https://www.nctm.org/News-and-Calendar/Messages-from-the-President/Archive/Matt-Larson/Curricular-Coherence-in-the-Age-of-Open-Educational-Resources/.
McKnight, C., Crosswhite, J., Dossey, J., Kifer, L., Swafford, J., Travers, K., ... Cooney, T. (1987). The underachieving curriculum: Assessing U.S. school mathematics from an international perspective. A national report on the Second International Mathematics Study. Champaign, IL: Stipes Publishing Co.
Minstrell, J. (2001). Facets of students’ thinking: Designing to cross the gap from research to standards-based practice. In K. Crowley, C. D. Schunn, & T. Okada (Eds.), Designing for science: Implications from everyday, classroom, and professional settings. Mahwah, NJ: Lawrence Erlbaum Associates.
National Research Council (NRC) (2003). Assessment in support of instruction and learning: Bridging the gap between large-scale and classroom assessment. Workshop report. Committee on Assessment in Support of Instruction and Learning, Board on Testing and Assessment, Committee on Science Education K-12, Mathematical Sciences Education Board, Center for Education, Division of Behavioral and Social Sciences and Education. Washington, DC: The National Academies Press.
Nguyen, K. H., & Confrey, J. (2014). Exploring the relationship between learning trajectories and curriculum. In A. P. Maloney, J. Confrey, & K. H. Nguyen (Eds.), Learning over time: Learning trajectories in mathematics education (pp. 161–186). Charlotte, NC: Information Age Publishing.
Olsher, S., Yerushalmy, M., & Chazan, D. (2016). How might the use of technology in formative assessment support changes in mathematics teaching? For the Learning of Mathematics, 36(3), 11–18.
Pane, J. F., Steiner, E. D., Baird, M. D., & Hamilton, L. S. (2015). Continued progress: Promising evidence on personalized learning. RAND Corporation.
Pellegrino, J. W., Chudowsky, N., & Glaser, R. (Eds.). (2001). Knowing what students know: The science and design of educational assessment. Washington, DC: The National Academies Press. https://doi.org/10.17226/10019.
Pellegrino, J. W., DiBello, L. V., & Goldman, S. R. (2016). A framework for conceptualizing and evaluating the validity of instructionally relevant assessments. Educational Psychologist, 51(1), 59–81.
Pepin, B., Gueudet, G., & Trouche, L. (2013). Re-sourcing teachers’ work and interactions: A collective perspective on resources, their use and transformation. ZDM, 45(7), 929–943.
Pepin, B., Gueudet, G., Yerushalmy, M., Trouche, L., & Chazan, D. (2015). E-textbooks in/for teaching and learning mathematics: A disruptive and potentially transformative educational technology. In L. D. English & D. Kirshner (Eds.), Handbook of international research in mathematics education (3rd ed., pp. 636–661). New York, NY: Routledge.
Remillard, J. T. (1999). Curriculum materials in mathematics education reform: A framework for examining teachers’ curriculum development. Curriculum Inquiry, 29(3), 315–342.
Remillard, J. T. (2005). Examining key concepts in research on teachers’ use of mathematics curricula. Review of Educational Research, 75(2), 211–246.
Remillard, J. T., & Heck, D. J. (2014). Conceptualizing the curriculum enactment process in mathematics education. ZDM, 46(5), 705–718.
Schmidt, W. H., Jorde, D., Cogan, L., Barrier, E., Ganzalo, I., Moser, U., ... Wolfe, R. G. (1996). Characterizing pedagogical flow: An investigation of mathematics and science teaching in six countries. Dordrecht, The Netherlands: Kluwer.
Shepard, L. A., Penuel, W. R., & Pellegrino, J. W. (2018). Using learning and motivation theories to coherently link formative assessment, grading practices, and large-scale assessment. Educational Measurement: Issues and Practice, 37(1), 21–34.
Simon, M. A., & Tzur, R. (2004). Explicating the role of mathematical tasks in conceptual learning: An elaboration of the hypothetical learning trajectory. Mathematical Thinking and Learning, 6(2), 91–104.
Stein, M. K., Grover, B. W., & Henningsen, M. (1996). Building student capacity for mathematical thinking and reasoning: An analysis of mathematical tasks used in reform classrooms. American Educational Research Journal, 33(2), 455–488.
Stein, M. K., Remillard, J., & Smith, M. S. (2007). How curriculum influences student learning. Second handbook of research on mathematics teaching and learning, 1(1), 319–370.
Tarr, J. E., Chávez, Ó., Reys, R. E., & Reys, B. J. (2006). From the written to the enacted curricula: The intermediary role of middle school mathematics teachers in shaping students’ opportunity to learn. School Science and Mathematics, 106(4), 191–201.
Trouche, L. (2004). Managing the complexity of human/machine interactions in computerized learning environments: Guiding students’ command process through instrumental orchestrations. International Journal of Computers for Mathematical Learning, 9(3), 281–307.
Webel, C., Krupa, E. E., & McManus, J. (2015). Teachers’ evaluations and use of web-based curriculum resources to support their teaching of the Common Core State Standards for Mathematics. Middle Grades Research Journal, 10(2), 49–64.
Wiliam, D. (2018). How can assessment support learning? A response to Wilson and Shepard, Penuel, and Pellegrino. Educational Measurement: Issues and Practice, 37(1), 42–44.
Wilson, M. (2018). Making measurement important for education: The crucial role of classroom assessment. Educational Measurement: Issues and Practice, 37(1), 5–20.
