What Is Data Warehousing and Why Is It Important? Dr Brian Preston, Nassau BOCES, Massapequa NY
"Data warehousing is the coordinated, architected, and periodic copying of data from various sources, both inside and outside the enterprise, into an environment optimized for analytical and information processing." (Alan Simon, Data Warehousing for Dummies.) Across the country, school districts are being held accountable for improving student achievement on multiple state examinations. School Report Cards are reporting performance to the public in increasingly detailed ways, and on the horizon, student achievement will be disaggregated even further for public accountability. New plans for the provision of services to students in all grade levels who are at-risk of not meeting new graduation standards are now being approved for implementation in the coming school year. The state education departments regularly identify the need to use data as the starting point for virtually all state required plans. Getting at this data is frequently challenging, since in a typical district, the data resides in a student database, a transportation database, a human resources database, financial systems, and filing cabinets and folders. Efficiently collecting all the relevant data on the factors that affect student performance is time consuming, at the very least, and in some cases, impossible, because the data was never collected in the first place. Data warehousing represents a significant solution to this increasing challenge. A data warehouse is, simply put, a single large database that has collected relevant information from several other sources into a single accessible format, from which users in a district can tease out useful information about students that would otherwise remain hidden. Here's an example of using data for planning in the current, cumbersome manner. One Eastern states staff worked with a district to examine patterns of low performance among 4th graders on ELA examinations. Using LEAP data from grade 4 testing, and PEP test data from the same students in grade 3, collected in Excel format, it was difficult to see patterns of lack of success. We looked at ethnicity, mobility, poverty, and ESL status and did not see big differences on scores. The district provided a list they had identified as at-risk, and further analysis of the at-risk population showed that ESL students were doing better on the ELA examination than native speakers of English. That was not making a lot of sense, so we asked the district to identify the language spoken at home for these students. This data had to be culled by hand from paper records in the district. When this was complete, we discovered that Spanish speakers outperformed the native speakers of English, and speakers of other languages were doing significantly worse. We were able to suggest several additional analyses for the district to undertake to determine why this was happening, and they were able to use the information to develop efficient ways to apply limited resources to create interventions appropriate for their low performing students.
Confidential and Proprietary Property of eScholar LLC ©2005. All Rights Reserved.
With conventional means, this process took several weeks, lots of manpower, and considerable skill with Excel. If a data warehouse existed which electronically linked ELA and LEAP data with student demographic data, the relationships could have been identified in a matter of minutes. Here's another perfect application for data warehousing. An elementary teacher meets a new group of students in September, and wants to know the nature of the skill set found within the class. Using a data warehouse that can access student schedules, the teacher can use a simple analysis tool to get individual profiles of each student in the class containing test scores, analyses of strong and weak areas of achievement measured on standardized tests. A collected class profile summarizing the information for the entire group could be created. The profile of a grade 5 class on the 7 key areas of grade 4 math can be derived in a matter of minutes and graphed for the teacher to identify which key areas need the most attention in the teacher's own class. We can do this now by going to individual student profiles on paper, entering the data into a spreadsheet, and graphing the results, but a warehouse makes the process so straightforward that it can become an automatic way to plan for instruction. What kinds of data belong in a school district data warehouse aimed at improving instruction? Here are some basics: Student demographics from an electronic student data base; grades, student schedules, attendance and discipline; standardized test data including item analysis capability; state test results; teacher information; financial data relating to instructional programs; and student extracurricular activities and programs. Using analysis tools now available to all districts, it would be a routine matter to compile information that would lead to quick identification of all students in all grades that should receive high, medium and low intensity Academic Intervention Services, and to follow their progress while receiving services in order to identify when they can be release from AIS programs. Right now most districts will find this a difficult, largely manual, burden. Building the data warehouse requires significant changes in the ways in which many districts track student information at the present time. The advantages of data warehousing will demonstrate the significant advantages of maintaining electronic records on all students from their enrollment in the district until their exit. And it will demonstrate the utility of using all the potential fields in student databases that often are currently left blank. Accountability for student performance means districts will need to be able to make informed decisions about identified groups of students to better understand which students are doing well and which are not meeting with success. Having data readily accessible to administrators and teachers means people can ask good questions, check for data to confirm or disprove their hunches, and proceed to improved decision-making and instructional planning. Are resources allocated to the greatest needs? Are their patterns of improvement that demonstrate the success of planned interventions? If not, what is actually happening, and to which groups of students? In some areas, districts are planning to allow parents to access their own children's grades and homework assignments through a data warehouse and an Internet connection, and this information flow is expected to improve the interactions between schools and parents who are themselves interested in instructional improvement. Once a warehouse is created and the steps to update it regularly are developed as regular, automatic update processes, the opportunities will be expansive. Data warehousing is a tool that can truly make a difference in education. Confidential and Proprietary Property of eScholar LLC ©2005. All Rights Reserved.