Course Preparation Checklist for Course 2781A: Designing Microsoft® SQL Server™ 2005 Server-Side Solutions

It is recommended that you complete the following checklist to help you prepare for a successful delivery of Course 2781A: Designing Microsoft® SQL Server™ 2005 Server-Side Solutions.

Courses

It is highly recommended that you audit the following courses:

• Course 2782: Designing Microsoft SQL Server 2005 Databases
• Clinic 2783: Designing the Data Tier for Microsoft SQL Server 2005

For additional preparation, you should consider auditing the following workshop:

• Workshop 2784: Tuning and Optimizing Queries Using Microsoft SQL Server 2005

Exams

To identify your technical proficiency with the content of this course, it is highly recommended that you pass the following exam:

• Exam 70-441: PRO: Designing Database Solutions by Using Microsoft SQL Server 2005

For additional preparation, you should consider taking the following exam:

• Exam 70-442: PRO: Designing and Optimizing Data Access by Using Microsoft SQL Server 2005

Technical Preparation Activities

It is highly recommended that you complete the following technical preparation activities.

_______ Read the Additional Readings included on the Trainer Materials DVD.

_______ Practice using SQL Server 2005.

_______ Practice setting up the classroom by following the instructions in the “Microsoft Virtual PC Classroom Setup Guide.”

_______ Review the Microsoft SQL Server Web site at http://www.microsoft.com/sql/default.mspx for updated information.

_______ Review the course error log, which is available in the Microsoft Download Center.


Instructional Preparation Activities

It is also recommended that you complete the following instructional preparation activities.

_______ Read the About This Course section at the beginning of the course and the Instructor Notes that precede each module.

_______ Practice presenting each demonstration.

_______ Practice presenting each module and lab:
• Identify the information that students need to complete each lab successfully. Anticipate the questions that students may have.
• Identify the key points for each topic, demonstration, practice, and lab.
• Identify how each demonstration, practice, and lab supports the module topics and reinforces the module objectives.
• Identify examples, analogies, demonstrations, and additional delivery tips that will help to clarify module topics.
• Note any problems that you may encounter during a demonstration, practice, or lab, and determine a course of action for how you will resolve them in the classroom.
• Identify ways to improve a demonstration, practice, or lab to provide a more meaningful learning experience for your specific audience.

_______ Review the Microsoft Certifications for IT Professionals Web site at http://www.microsoft.com/learning/mcp/mcitp/default.asp for updated information about the Microsoft Certified IT Professional program.

Course 2781A: Designing Microsoft® SQL Server™ 2005 Server-Side Solutions

Information in this document, including URL and other Internet Web site references, is subject to change without notice. Unless otherwise noted, the example companies, organizations, products, domain names, e-mail addresses, logos, people, places, and events depicted herein are fictitious, and no association with any real company, organization, product, domain name, e-mail address, logo, person, place, or event is intended or should be inferred. Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.

The names of manufacturers, products, or URLs are provided for informational purposes only, and Microsoft makes no representations or warranties, either express, implied, or statutory, regarding these manufacturers or the use of the products with any Microsoft technologies. The inclusion of a manufacturer or product does not imply endorsement by Microsoft of the manufacturer or product. Links are provided to third-party sites. Such sites are not under the control of Microsoft, and Microsoft is not responsible for the contents of any linked site, any link contained in a linked site, or any changes or updates to such sites. Microsoft is not responsible for webcasting or any other form of transmission received from any linked site. Microsoft is providing these links to you only as a convenience, and the inclusion of any link does not imply endorsement by Microsoft of the site or the products contained therein.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

© 2006 Microsoft Corporation. All rights reserved.

Microsoft, Active Directory, ActiveX, BizTalk, Excel, Microsoft Press, MSDN, MSN, Outlook, PowerPoint, SharePoint, Tahoma, Visio, Visual Basic, Visual C#, Visual SourceSafe, Visual Studio, Windows, and Windows Server are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. All other trademarks are property of their respective owners.

Course Number: 2781A
Part Number: X12-08568
Released: 03/2006

END-USER LICENSE AGREEMENT FOR OFFICIAL MICROSOFT LEARNING PRODUCTS – TRAINER EDITION

PLEASE READ THIS END-USER LICENSE AGREEMENT (“EULA”) CAREFULLY. THIS EULA ACCOMPANIES AND GOVERNS THE USE OF ALL SOFTWARE AND LICENSED CONTENT THAT ACCOMPANIES THIS EULA. BY USING THE CONTENT AND/OR USING OR INSTALLING THE SOFTWARE YOU AGREE TO THE TERMS OF THIS EULA. IF YOU DO NOT AGREE, DO NOT INSTALL OR USE SUCH CONTENT AND/OR SOFTWARE.

1. DEFINITIONS.

1.1. “Authorized Learning Center(s)” means a Microsoft Certified Partner for Learning Solutions location, an IT Academy, or such other entity as Microsoft may designate from time to time (for more information on these entities, please visit www.microsoft.com).

1.2. “Authorized Training Session(s)” means those training sessions authorized by Microsoft and conducted at or through Authorized Learning Centers by an MCT providing training to Students solely on Official Microsoft Learning Products (formerly known as Microsoft Official Curriculum or “MOC”).

1.3. “Device(s)” means a single computer, device, workstation, terminal, or other digital electronic or analog device.

1.4. “Document(s)” means the printed or electronic documentation such as manuals, workbooks, white papers, press releases, datasheets, and FAQs which may be included in the Licensed Content.

1.5. “Licensed Content” means the materials accompanying this EULA. The Licensed Content may include, but is not limited to, the following elements: (i) Trainer Content, (ii) Student Content, (iii) Media Elements, (iv) Software, and (v) Documents.

1.6. “Media Elements” means the photographs, clip art, animations, sounds, music, and/or video clips which may accompany this EULA.

1.7. “Software” means the Virtual Hard Disks, or such other software applications that may be included with the Licensed Content.

1.8. “Student(s)” means students duly enrolled for an Authorized Training Session at an Authorized Learning Center.

1.9. “Student Content” means the learning materials accompanying this EULA that are for Use by Students and Trainers.

1.10. “Trainer(s)” or “MCT(s)” means a) a person who is duly certified by Microsoft as a Microsoft Certified Trainer and b) such other individual as authorized in writing by Microsoft and has been engaged by an Authorized Learning Center to teach or instruct an Authorized Training Session to Students on behalf of the Authorized Learning Center.

1.11. “Trainer Content” means the materials accompanying this EULA that are for Use by Trainers solely for the preparation of and/or Use during an Authorized Training Session.

1.12. “Use”

(a) “Use” by Trainers means the use of the Licensed Content by Trainers and/or Students solely to conduct educational classes, labs or related programs designed to train other Trainers and/or Students in the Use of the Microsoft technology, products or services related to the subject matter of the Licensed Content and/or concepts related to such Microsoft technology, products or services.

(b) “Use” by Students means the use of the Licensed Content by Students solely at an Authorized Training Session solely to participate in educational classes, labs or related programs designed to train Students in the use of the Microsoft technology, products or services related to the subject matter of the Licensed Content and/or concepts related to such Microsoft technology, products or services; and

(c) “Use” under this EULA shall not include the use of the Licensed Content for general business purposes.

1.13. “Virtual Hard Disks” means Microsoft Software that is comprised of virtualized hard disks (such as a base virtual hard disk or differencing disks) that can be loaded onto a single computer or other device in order to allow end-users to run multiple operating systems concurrently. For the purposes of this EULA, Virtual Hard Disks shall be considered “Trainer Content”.

1.14. “You” shall mean Trainer.

2. GENERAL. This EULA is a legal agreement between You (an individual) and Microsoft Corporation (“Microsoft”). This EULA governs the Licensed Content. This EULA applies to updates, supplements, add-on components, and Internet-based services components of the Licensed Content that Microsoft may provide or make available to You (each, a “Component”), provided, however, that if a separate end user license agreement appears upon the installation of a Component (a “Component EULA”), the terms of the Component EULA will control as to the applicable Component. Microsoft reserves the right to discontinue any Internet-based services provided to You or made available to You through the Use of the Licensed Content. This EULA also governs any product support services relating to the Licensed Content except as may be included in another agreement between You and Microsoft. An amendment or addendum to this EULA may accompany the Licensed Content.

3. INSTALLATION AND USE RIGHTS. Subject to Your compliance with the terms and conditions of this EULA, Microsoft hereby grants You a limited, non-exclusive, royalty-free license to Use the Licensed Content as follows:

3.1 Student Content.

(a) You may install and sublicense to individual Students the right to Use one (1) copy of the Student Content on a single Device solely for the Student’s personal training Use during the Authorized Training Session.

(b) You may install and Use one (1) copy of the Student Content on a single Device solely for Your personal training Use in conjunction with and for preparation of one or more Authorized Training Sessions. You are allowed to make a second copy of such Student Content and install it on a portable Device for Your personal training Use in conjunction with and for preparation of such Authorized Training Session(s).

(c) For each Authorized Training Session, Trainers may either (a) install individual copies of the Student Content corresponding to the subject matter of each such Authorized Training Session on classroom Devices to be Used by the Students solely in the Authorized Training Session, provided that the number of copies in Use does not exceed the number of duly enrolled Students for the Authorized Training Session; OR (b) install one copy of the Student Content corresponding to the subject matter of each such Authorized Training Session on a network server, provided that the number of Devices accessing such Student Content on such server does not exceed the number of Students for the Authorized Training Session.

(d) For the purposes of this EULA, any Software that is included in the Student version of the Licensed Content and designated as “Evaluation Software” may be used by Students solely for their personal training outside of the Authorized Training Session.

3.2. Trainer Content.

(a) You may sublicense to individual Students the right to Use one (1) copy of the Virtual Hard Disks included in the Trainer Content on a single Device solely for Students’ personal training Use in connection with and during the Authorized Training Session for which they are enrolled.

(b) You may install and Use one (1) copy of the Trainer Content on a single Device solely for Your personal training Use and for preparation of an Authorized Training Session. You are allowed to make a second copy of the Trainer Content and install it on a portable Device solely for Your personal training Use and for preparation of an Authorized Training Session.

(c) For each Authorized Training Session, Trainers may either (a) install individual copies of the Trainer Content corresponding to the subject matter of each such Authorized Training Session on classroom Devices to be Used by the Students in the Authorized Training Session, provided that the number of copies in Use does not exceed the number of duly enrolled Students for the Authorized Training Session; OR (b) install one copy of the Trainer Content corresponding to the subject matter of each such Authorized Training Session on a network server, provided that the number of Devices accessing such Trainer Content on such server does not exceed the number of Students for the Authorized Training Session. WITHOUT LIMITING THE FOREGOING, COPYING OR REPRODUCTION OF THE LICENSED CONTENT TO ANY SERVER OR LOCATION FOR FURTHER REPRODUCTION OR REDISTRIBUTION IS EXPRESSLY PROHIBITED.

4. DESCRIPTION OF OTHER RIGHTS AND LICENSE LIMITATIONS

4.1 Errors; Changes; Fictitious Names.

(a) You acknowledge and agree that (i) the Licensed Content, including without limitation Documents, related graphics, and other Components included therein, may include technical inaccuracies or typographical errors, and (ii) Microsoft may make improvements and/or changes in the Licensed Content or any portion thereof at any time without notice.

(b) You understand that the names of companies, products, people, characters and/or data mentioned in the Licensed Content may be fictitious and are in no way intended to represent any real individual, company, product or event, unless otherwise noted.

4.2 Software.

Virtual Hard Disks. The Licensed Content may contain versions of Microsoft Windows XP, Windows Server 2003, and Windows 2000 Advanced Server and/or other Microsoft products which are provided in Virtual Hard Disks. No modifications may be made to the Virtual Hard Disks. Any reproduction or redistribution of the Virtual Hard Disks not in accordance with this EULA is expressly prohibited by law, and may result in severe civil and criminal penalties. Violators will be prosecuted to the maximum extent possible.

YOUR RIGHT TO USE THE VIRTUAL HARD DISKS SHALL BE DEPENDENT UPON YOUR EMPLOYING THE FOLLOWING SECURITY REQUIREMENTS: If You install the Licensed Content on any Device(s) at an Authorized Training Session, You will make sure that: a) the Licensed Content, and any components thereof, are removed from said Device(s) at the conclusion of each such Authorized Training Session and b) no copies of the Licensed Content are copied, reproduced and/or downloaded from such Devices.

4.3 Use and Reproduction of Documents. Subject to the terms and conditions of this EULA, Microsoft grants You the right to reproduce portions of the Documents provided with the Licensed Content solely for Use in Authorized Training Sessions. You may not print any book (either electronic or print version) in its entirety. If You choose to reproduce Documents, You agree that: (a) the Documents will not be republished or posted on any network computer or broadcast in any media; and (b) any reproduction will include either the Document’s original copyright notice or a copyright notice to Microsoft’s benefit substantially in the format provided below.

“Form of Notice: © 2006. Reprinted with permission by Microsoft Corporation. All rights reserved. Microsoft and Windows are either registered trademarks or trademarks of Microsoft Corporation in the US and/or other countries. Other product and company names mentioned herein may be the trademarks of their respective owners.”

4.4 Use of Media Elements. You may not modify the Media Elements.

4.5 Use of PowerPoint Slide Deck Templates. The Trainer Content may include Microsoft PowerPoint slide decks. Subject to the terms and conditions of this EULA, Trainers may Use, copy and modify the PowerPoint slide decks solely in conjunction with providing an Authorized Training Session. If You elect to exercise the foregoing rights, You agree: (a) that modification of the slide decks will not constitute creation of obscene or scandalous works, as defined by federal law at the time the work is created; and (b) to comply with all other terms and conditions of this EULA, including without limitation Sections 4.8, 4.9, and 7.

4.6 Use of Components in Trainer Content. Solely in conjunction with providing an Authorized Training Session, and subject to the terms and conditions of this EULA, Trainers may customize and reproduce, for their own purposes, those portions of the Licensed Content that are logically associated with instruction of an Authorized Training Session, including without limitation the labs, simulations, animations, modules, and assessment items for each such Authorized Training Session.

4.7 Use of Sample Code. In the event that the Licensed Content includes sample code in source or object code format (“Sample Code”), subject to the terms and conditions of this EULA, Microsoft grants You a limited, non-exclusive, royalty-free license to Use, copy and modify the Sample Code; if You elect to exercise the foregoing rights, You agree to comply with all other terms and conditions of this EULA, including without limitation Sections 4.8, 4.9, and 7.

4.8 Permitted Modifications. In the event that You exercise any rights provided under this EULA to create modifications of the Licensed Content, You agree that any such modifications: (a) will not be used for providing training where a fee is charged in public or private classes and will not be used for training other than at an Authorized Training Session; (b) You will indemnify, hold harmless, and defend Microsoft from and against any claims or lawsuits, including attorneys’ fees, which arise from or result from Your Use of any modified version of the Licensed Content; and (c) You will not transfer or assign any rights to any modified version of the Licensed Content to any third party without the express written permission of Microsoft.

Your license to the Licensed Content or any of the Software or other materials included therewith does not include any license, right, power or authority to create derivative works of the Software in any manner that would cause the Microsoft Software and/or derivative works thereof, in whole or in part, to become subject to any of the terms of an Excluded License. “Excluded License” means any license that requires, as a condition of use, modification and/or distribution of software subject to the Excluded License, that such software or other software combined and/or distributed with such software be (A) disclosed or distributed in source code form; (B) licensed for the purpose of making derivative works; or (C) redistributable at no charge.

4.9 Reproduction/Redistribution of Licensed Content. Except as expressly provided in this EULA, You may not reproduce or distribute the Licensed Content or any portion thereof (including any permitted modifications) to any third parties without the express written permission of Microsoft.

5. RESERVATION OF RIGHTS AND OWNERSHIP. Microsoft reserves all rights not expressly granted to You in this EULA. The Licensed Content is protected by copyright and other intellectual property laws and treaties. Microsoft or its suppliers own the title, copyright, and other intellectual property rights in the Licensed Content. You may not remove or obscure any copyright, trademark or patent notices that appear on the Licensed Content, or any components thereof, as delivered to You. The Licensed Content is licensed, not sold.

6. LIMITATIONS ON REVERSE ENGINEERING, DECOMPILATION, AND DISASSEMBLY. You may not reverse engineer, decompile, or disassemble the Licensed Content, except and only to the extent that such activity is expressly permitted by applicable law notwithstanding this limitation.

7. LIMITATIONS ON SALE, RENTAL, ETC. AND CERTAIN ASSIGNMENTS. You may not provide commercial hosting services with, sell, rent, lease, lend, sublicense, or assign copies of the Licensed Content, or any portion thereof (including any permitted modifications thereof) on a stand-alone basis or as part of any collection, product or service.

8. CONSENT TO USE OF DATA. You agree that Microsoft and its affiliates may collect and Use technical information gathered as part of the product support services provided to You, if any, related to the Licensed Content. Microsoft may Use this information solely to improve our products or to provide customized services or technologies to You and will not disclose this information in a form that personally identifies You.

9. LINKS TO THIRD PARTY SITES. You may link to third party sites through the Use of the Licensed Content. The third party sites are not under the control of Microsoft, and Microsoft is not responsible for the contents of any third party sites, any links contained in third party sites, or any changes or updates to third party sites. Microsoft is not responsible for webcasting or any other form of transmission received from any third party sites. Microsoft is providing these links to third party sites to You only as a convenience, and the inclusion of any link does not imply an endorsement by Microsoft of the third party site.

10. ADDITIONAL LICENSED CONTENT/SERVICES. This EULA applies to Components that Microsoft may provide to You or make available to You after the date You obtain Your initial copy of the Licensed Content, unless we provide a Component EULA or other terms of Use with such Components. Microsoft reserves the right to discontinue any Internet-based services provided to You or made available to You through the Use of the Licensed Content.

11. U.S. GOVERNMENT LICENSE RIGHTS. All software provided to the U.S. Government pursuant to solicitations issued on or after December 1, 1995 is provided with the commercial license rights and restrictions described elsewhere herein. All software provided to the U.S.
Government pursuant to solicitations issued prior to December 1, 1995 is provided with “Restricted Rights” as provided for in FAR, 48 CFR 52.227-14 (JUNE 1987) or DFAR, 48 CFR 252.227-7013 (OCT 1988), as applicable.

12. EXPORT RESTRICTIONS. You acknowledge that the Licensed Content is subject to U.S. export jurisdiction. You agree to comply with all applicable international and national laws that apply to the Licensed Content, including the U.S. Export Administration Regulations, as well as end-user, end-use, and destination restrictions issued by U.S. and other governments. For additional information see .

13. “NOT FOR RESALE” LICENSED CONTENT. Licensed Content identified as “Not For Resale” or “NFR” may not be sold or otherwise transferred for value, or Used for any purpose other than demonstration, test or evaluation.

14. TERMINATION. Without prejudice to any other rights, Microsoft may terminate this EULA if You fail to comply with the terms and conditions of this EULA. In the event Your status as a Microsoft Certified Trainer a) expires, b) is voluntarily terminated by You, and/or c) is terminated by Microsoft, this EULA shall automatically terminate. Upon any termination of this EULA, You must destroy all copies of the Licensed Content and all of its Component parts.

15. DISCLAIMER OF WARRANTIES. TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW, MICROSOFT AND ITS SUPPLIERS PROVIDE THE LICENSED CONTENT AND SUPPORT SERVICES (IF ANY) AS IS AND WITH ALL FAULTS, AND MICROSOFT AND ITS SUPPLIERS HEREBY DISCLAIM ALL OTHER WARRANTIES AND CONDITIONS, WHETHER EXPRESS, IMPLIED OR STATUTORY, INCLUDING, BUT NOT LIMITED TO, ANY (IF ANY) IMPLIED WARRANTIES, DUTIES OR CONDITIONS OF MERCHANTABILITY, OF FITNESS FOR A PARTICULAR PURPOSE, OF RELIABILITY OR AVAILABILITY, OF ACCURACY OR COMPLETENESS OF RESPONSES, OF RESULTS, OF WORKMANLIKE EFFORT, OF LACK OF VIRUSES, AND OF LACK OF NEGLIGENCE, ALL WITH REGARD TO THE LICENSED CONTENT, AND THE PROVISION OF OR FAILURE TO PROVIDE SUPPORT OR OTHER SERVICES, INFORMATION, SOFTWARE, AND RELATED CONTENT THROUGH THE LICENSED CONTENT, OR OTHERWISE ARISING OUT OF THE USE OF THE LICENSED CONTENT. ALSO, THERE IS NO WARRANTY OR CONDITION OF TITLE, QUIET ENJOYMENT, QUIET POSSESSION, CORRESPONDENCE TO DESCRIPTION OR NON-INFRINGEMENT WITH REGARD TO THE LICENSED CONTENT. THE ENTIRE RISK AS TO THE QUALITY, OR ARISING OUT OF THE USE OR PERFORMANCE, OF THE LICENSED CONTENT AND ANY SUPPORT SERVICES REMAINS WITH YOU.

16. EXCLUSION OF INDIRECT DAMAGES. TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW, IN NO EVENT SHALL MICROSOFT OR ITS SUPPLIERS BE LIABLE FOR ANY SPECIAL, INCIDENTAL, PUNITIVE, INDIRECT, OR CONSEQUENTIAL DAMAGES WHATSOEVER (INCLUDING, BUT NOT LIMITED TO, DAMAGES FOR LOSS OF PROFITS OR CONFIDENTIAL OR OTHER INFORMATION, FOR BUSINESS INTERRUPTION, FOR PERSONAL INJURY, FOR LOSS OF PRIVACY, FOR FAILURE TO MEET ANY DUTY INCLUDING OF GOOD FAITH OR OF REASONABLE CARE, FOR NEGLIGENCE, AND FOR ANY OTHER PECUNIARY OR OTHER LOSS WHATSOEVER) ARISING OUT OF OR IN ANY WAY RELATED TO THE USE OF OR INABILITY TO USE THE LICENSED CONTENT, THE PROVISION OF OR FAILURE TO PROVIDE SUPPORT OR OTHER SERVICES, INFORMATION, SOFTWARE, AND RELATED CONTENT THROUGH THE LICENSED CONTENT, OR OTHERWISE ARISING OUT OF THE USE OF THE LICENSED CONTENT, OR OTHERWISE UNDER OR IN CONNECTION WITH ANY PROVISION OF THIS EULA, EVEN IN THE EVENT OF THE FAULT, TORT (INCLUDING NEGLIGENCE), MISREPRESENTATION, STRICT LIABILITY, BREACH OF CONTRACT OR BREACH OF WARRANTY OF MICROSOFT OR ANY SUPPLIER, AND EVEN IF MICROSOFT OR ANY SUPPLIER HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. BECAUSE SOME STATES/JURISDICTIONS DO NOT ALLOW THE EXCLUSION OR LIMITATION OF LIABILITY FOR CONSEQUENTIAL OR INCIDENTAL DAMAGES, THE ABOVE LIMITATION MAY NOT APPLY TO YOU.

17. LIMITATION OF LIABILITY. NOTWITHSTANDING ANY DAMAGES THAT YOU MIGHT INCUR FOR ANY REASON WHATSOEVER (INCLUDING, WITHOUT LIMITATION, ALL DAMAGES REFERENCED HEREIN AND ALL DIRECT OR GENERAL DAMAGES IN CONTRACT OR ANYTHING ELSE), THE ENTIRE LIABILITY OF MICROSOFT AND ANY OF ITS SUPPLIERS UNDER ANY PROVISION OF THIS EULA AND YOUR EXCLUSIVE REMEDY HEREUNDER SHALL BE LIMITED TO THE GREATER OF THE ACTUAL DAMAGES YOU INCUR IN REASONABLE RELIANCE ON THE LICENSED CONTENT UP TO THE AMOUNT ACTUALLY PAID BY YOU FOR THE LICENSED CONTENT OR US$5.00. THE FOREGOING LIMITATIONS, EXCLUSIONS AND DISCLAIMERS SHALL APPLY TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW, EVEN IF ANY REMEDY FAILS ITS ESSENTIAL PURPOSE.

18. APPLICABLE LAW. If You acquired this Licensed Content in the United States, this EULA is governed by the laws of the State of Washington, and, in respect of any dispute which may arise hereunder, You consent to the jurisdiction of the federal and state courts located in King County, Washington. If You acquired this Licensed Content in Canada, unless expressly prohibited by local law, this EULA is governed by the laws in force in the Province of Ontario, Canada, and, in respect of any dispute which may arise hereunder, You consent to the jurisdiction of the federal and provincial courts sitting in Toronto, Ontario. If You acquired this Licensed Content in the European Union, Iceland, Norway, or Switzerland, then the local law of such jurisdictions applies. If You acquired this Licensed Content in any other country, then local law may apply.

19. ENTIRE AGREEMENT; SEVERABILITY. This EULA (including any addendum or amendment to this EULA which is included with the Licensed Content) is the entire agreement between You and Microsoft relating to the Licensed Content and the support services (if any) and supersedes all prior or contemporaneous oral or written communications, proposals and representations with respect to the Licensed Content or any other subject matter covered by this EULA. To the extent the terms of any Microsoft policies or programs for support services conflict with the terms of this EULA, the terms of this EULA shall control. If any provision of this EULA is held to be void, invalid, unenforceable or illegal, the other provisions shall continue in full force and effect.

Should You have any questions concerning this EULA, or if You desire to contact Microsoft for any reason, please use the address information enclosed in this Licensed Content to contact the Microsoft subsidiary serving Your country or visit Microsoft on the World Wide Web at http://www.microsoft.com.

Contents

Introduction
  Introduction
  Course Materials
  Microsoft Learning Product Types
  Facilities
  Microsoft Learning
  Microsoft Certification Program
  About This Course
  The Process of Designing SQL Server 2005 Server-Side Solutions
  Course Outline
  Setup
  Demonstration: Using Virtual PC
  What Matters Most?
  Introduction to Fabrikam, Inc.

Module 1: Selecting SQL Server Services That Support Business Needs
  Lesson 1: Overview of the Built-In SQL Server Services
  Lesson 2: Evaluating When to Use the New SQL Server Services
  Lesson 3: Evaluating the Use of Database Engine Enhancements
  Lab: Selecting SQL Server Services to Support Business Needs

Module 2: Designing a Security Strategy
  Lesson 1: Overview of Authentication Modes and Authorization Strategies for SQL Server 2005
  Lesson 2: Designing a Security Strategy for Components of a SQL Server 2005 Solution
  Lesson 3: Designing Objects to Manage Application Access
  Lesson 4: Creating an Auditing Strategy
  Lesson 5: Managing Multiple Development Teams by Using the SQL Server 2005 Security Features
  Lab: Designing a Security Strategy

Module 3: Designing a Data Modeling Strategy
  Lesson 1: Defining Standards for Storing XML Data in a Solution
  Lesson 2: Designing a Database Solution Schema
  Lesson 3: Designing a Scale-Out Strategy
  Lab: Designing a Data Modeling Strategy

Module 4: Designing a Transaction Strategy for a SQL Server Solution
  Lesson 1: Defining Data Behavior Requirements
  Lesson 2: Defining Isolation Levels
  Lesson 3: Designing a Resilient Transaction Strategy
  Lab: Designing a Transaction Strategy for a SQL Server 2005 Solution

Module 5: Designing a Notification Services Solution
  Lesson 1: Defining Event Data
  Lesson 2: Designing a Subscription Strategy
  Lesson 3: Designing a Notification Strategy
  Lesson 4: Designing a Notification Delivery Strategy
  Lab: Designing a Notification Services Solution

Module 6: Designing a Service Broker Solution
  Lesson 1: Designing a Service Broker Solution Architecture
  Lesson 2: Designing Service Broker Data Flow
  Lesson 3: Designing Service Broker Solution Availability
  Lab: Designing a Service Broker Solution

Module 7: Planning for Source Control, Unit Testing, and Deployment
  Lesson 1: Designing a Source Control Strategy
  Lesson 2: Designing a Unit Test Plan
  Lesson 3: Creating a Performance Baseline and Benchmarking Strategy
  Lesson 4: Designing a Deployment Strategy
  Lab: Planning for Source Control, Unit Testing, and Deployment

Module 8: Evaluating Advanced Query and XML Techniques
  Lesson 1: Evaluating Common Table Expressions
  Lesson 2: Evaluating Pivot Queries
  Lesson 3: Evaluating Ranking Queries
  Lesson 4: Overview of XQuery
  Lesson 5: Overview of Strategies for Converting Data Between XML and Relational Forms
  Lab: Evaluating Advanced Query and XML Techniques

Index

2781A: Designing Microsoft® SQL Server™ 2005 Server-Side Solutions

Microsoft® Virtual PC Classroom Setup Guide

© 2006 Microsoft Corporation. All rights reserved. Microsoft, MSDN, PowerPoint, Windows, and Windows Server are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. All other trademarks are property of their respective owners.

Course Number: 2781A


Contents

Introducing Microsoft Virtual PC 2004
Setup Overview
Classroom Requirements
Classroom Configuration
Instructor Computer Checklist
Instructor Computer Setup
  1. Install Virtual PC
  2. Install the virtual disk files
  3. Create a desktop shortcut for Virtual PC
  4. Add virtual machines
  5. Activate virtual machines
  6. Create a setup share
  7. Install courseware fonts
  8. Install the PowerPoint slides
Student Computer Checklist
Student Computer Setup
  1. Install Virtual PC
  2. Install the virtual disk files
  3. Create a desktop shortcut for Virtual PC
  4. Add virtual machines
  5. Start the virtual machine


Introducing Microsoft Virtual PC 2004

This course is designed using Microsoft® Virtual PC 2004. Virtual PC is a technology that allows a single computer to act as a host for one or more virtual machines. The virtual machines use a set of virtual devices that might or might not map to the physical hardware of the host computer. The software that is installed onto the virtual machine is unmodified, full-version, retail software that operates exactly as it does when it is installed onto physical hardware.

The following definitions will help you with the remainder of this document:

• Virtual PC: An application from Microsoft that allows you to install and run other operating systems. Virtual PC does not ship with this course, but it can be acquired from your MSDN® subscription, or can be purchased retail.

• Host computer: The physical computer onto which an operating system and the Virtual PC application have been installed.

• Host operating system: The operating system that is running on the physical computer.

• Virtual machine: The computer that is running inside of Virtual PC. In this document, “Virtual PC” refers to the application running on the host, while “virtual machine” refers to the guest operating system and any software that is running inside of the Virtual PC application.

• Guest operating system: The operating system that is running inside the virtual machine.

• Host key: The key that is designated to take the place of the CTRL+ALT combination when logging on to Microsoft Windows®. By default, the host key is the ALT key on the right side of the keyboard, so HOST+DELETE means RIGHT-ALT+DELETE. The host key can be changed by clicking the File menu in the Virtual PC console and selecting Options. See Virtual PC online help for other uses of the host key.

By default, the virtual machine will run inside a window on the host computer’s desktop. However, you can run the virtual machine in full screen mode by pressing HOST+ENTER. Using the same key combination, you can return to a windowed view.

Note Pressing CTRL+ALT+DELETE while working with a virtual machine will display the Windows Security dialog box for the host operating system. If this is not desired, press ESC. To access the Windows Security dialog box for a guest operating system, press HOST+DELETE. This is the only difference in the way the software works in a virtual machine.

You can configure virtual machines to communicate with the host computer, other virtual machines on the same host computer, other host computers, virtual machines on other host computers, other physical computers on the network, or any combination thereof. The setup instructions that you will follow as a part of this classroom setup guide will configure Virtual PC and the virtual machines that will run on the host. Changing any of the configuration settings might render the labs for this course unusable.


Setup Overview

The host computers must be running either Windows 2000 Professional or Windows XP Professional. For the purposes of this course, it is not necessary for the host computers to be able to communicate with each other. However, allowing them to communicate with each other is recommended for ease of setup. You should make note of the administrator user name and password and provide this to the instructor.

Important It is highly recommended that you read the Partner Deployment Guide on the Virtual PC page of the MCT secure site. This document contains valuable information on Microsoft Learning’s virtual machine implementation, activation, troubleshooting, and improving virtual machine performance.

Classroom Requirements

This course requires a classroom with a minimum of one computer for the instructor and one computer for each student. Before the class begins, use the following information and instructions to install and configure all computers.

Hardware

The classroom computers require the following hardware and software configuration.

Hardware Level 5:

• Pentium IV 2.4 gigahertz (GHz)
• PCI 2.1 bus
• 2 gigabytes (GB) of RAM
• 40-GB hard disk, 7200 RPM
• Digital video disc (DVD) player
• Non-ISA network adapter: 10/100 megabits per second (Mbps), full duplex required
• 16-megabyte (MB) video adapter (32 MB recommended)
• Super VGA (SVGA) monitor (17 inch)
• Microsoft Mouse or compatible pointing device
• Sound card with amplified speakers
• Projection display device that supports SVGA 800 x 600, 256 colors

In addition, the instructor computer must be connected to a projection display device that supports SVGA 800 x 600 pixels, 256 colors.


Software

Please note that, unless otherwise indicated, this software is not included on the Trainer Materials DVD. This course was developed and tested on the following software, which is required for the classroom computers:

• Windows XP Professional or Windows 2000 Professional
• Virtual PC 2004
• Microsoft Office PowerPoint® version 2003 (instructor computer only)


Classroom Configuration

Each classroom computer will serve as the host for four virtual machines that will run in Virtual PC 2004. The network configuration of the host computers does not matter. After the completion of the setup, all computers will be configured to run the virtual machine 2781A-MIA-SQL.

Estimated time to set up the classroom: 90 minutes


Instructor Computer Checklist

□ 1. Install Virtual PC.
□ 2. Install the virtual disk files.
□ 3. Create a desktop shortcut for Virtual PC.
□ 4. Add virtual machines.
□ 5. Activate virtual machines.
□ 6. Create a setup share.
□ 7. Install courseware fonts.
□ 8. Install the PowerPoint slides.


Instructor Computer Setup

Use the instructions in the following section to set up the classroom manually. Before starting the installation of the instructor computer, Windows 2000 Professional or Windows XP Professional must be installed on the computer. PowerPoint 2003 must also be installed.

Important The operating systems installed on the virtual machines in this course have not been activated. To receive product keys that will activate the virtual machines, you must contact Microsoft Learning at [email protected], and include your program ID number in your e-mail. It might take up to 24 hours to receive a response. (It is not necessary to contact Microsoft Learning if you have already done so for another course.) You will use the product keys to activate all virtual machines that you receive from Microsoft Learning. You will only need one key for each operating system. For more information, please see the “Virtual PC Deployment Guide” section of the following Microsoft Certified Trainer (MCT) secure site: https://mcp.microsoft.com/mct/vpc/default.aspx.


1. Install Virtual PC

Task Summary: Install Virtual PC.

Note If Virtual PC 2004 is already installed, you may skip this step.

1. Insert the Microsoft Virtual PC 2004 compact disc (CD) into the CD-ROM drive.
2. If autorun is disabled, navigate to the root of the CD, and double-click Setup.exe.
3. On the Welcome to the installation wizard for Microsoft Virtual PC 2004 page, click Next.
4. On the License Agreement page, select I accept the terms in the license agreement, and then click Next.
5. On the Customer Information page, enter a Username, Organization, and the product key for your version of Virtual PC, and then click Next.
6. On the Ready To Install the Program page, click Install.
7. On the InstallShield Wizard Completed page, click Finish.

2. Install the virtual disk files

Task Summary: Install the virtual disks and configuration files by running the self-extracting archive files in the Drives folder on the Trainer Materials DVD.

1. Navigate to the \Setup\Drives folder of the Trainer Materials DVD, and double-click Base05d.exe.

Note If you experience long delays when opening the files from the DVD, copy the files to your local hard disk, and open the files from there. (A scripted version of this copy appears at the end of this section.)

2. In the Official Microsoft Learning Product End-User License Agreement window, click Accept to indicate that you accept the terms in the license agreement.

3. In the WinRAR self-extracting archive window, in the Destination folder text box, ensure that C:\Program Files\Microsoft Learning\Base is listed, and then click Install. Please wait while the base virtual hard disk file is extracted; this might take a few minutes.

4. Double-click the 2781A-MIA-SQL.exe file, and in the Official Microsoft Learning Product End-User License Agreement window, click Accept. Extract the files from the archive to C:\Program Files\Microsoft Learning\2781\Drives.


5. Double-click the 2781A-Allfiles.exe file, and in the Official Microsoft Learning Product End-User License Agreement window, click Accept. Extract the files from the archive to C:\Program Files\Microsoft Learning\2781\Drives.

6. Navigate to C:\Program Files\Microsoft Learning\Base.

7. Right-click the Base05D.vhd file, and then click Properties.

8. Under Attributes, select the Read-only check box, and then click OK.
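If you prefer to script the local copy suggested in the note above and the read-only step, the following command-prompt sketch does both. It assumes D: is the DVD drive and uses C:\Temp\2781Drives as a scratch folder; neither is specified by the official setup, so adjust both to match your hardware.

@echo off
rem Copy the self-extracting setup archives from the DVD to the local
rem hard disk (assumes D: is the DVD drive; C:\Temp\2781Drives is scratch).
xcopy "D:\Setup\Drives\*.exe" "C:\Temp\2781Drives\" /Y

rem After extracting the archives, mark the base virtual hard disk
rem read-only so the differencing disks built on it stay valid.
attrib +R "C:\Program Files\Microsoft Learning\Base\Base05D.vhd"

rem Display the attributes to confirm that the R flag is set.
attrib "C:\Program Files\Microsoft Learning\Base\Base05D.vhd"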

3. Create a desktop shortcut for Virtual PC

Task Summary: Create a shortcut for Virtual PC on the desktop.

1. Navigate to C:\Program Files\Microsoft Virtual PC.
2. Right-click and drag Virtual PC.exe to the desktop.
3. From the context menu, select Create Shortcuts Here.

4. Add virtual machines

Task Summary: Use the New Virtual Machine Wizard to add the virtual machines to the Virtual PC console.

1. Double-click the Microsoft Virtual PC shortcut on the desktop.
2. On the Welcome to the New Virtual Machine Wizard page, click Next. (If the wizard does not automatically start, click New.)
3. On the Options page, select Add an existing virtual machine, and then click Next.
4. In the Name and location box, type C:\Program Files\Microsoft Learning\2781\Drives\2781A-MIA-SQL-01.vmc, and then click Next.
5. On the Completing the New Virtual Machine Wizard page, verify that When I click Finish, open Settings is selected, and then click Finish.
6. In the Settings for 2781A-MIA-SQL-01 dialog box, select Networking and verify that the value of the Number of network adapters setting is 1 and that Local only is selected in the Adapter 1 list. Then verify that 2781A-MIA-SQL.vhd is selected as Hard Disk 1 and that 2781A-Allfiles-01.vhd is selected as Hard Disk 2, and click OK.

Important Do not change the RAM allocation for the virtual machine. Doing so might cause the lab exercises or practices to become unstable or to cease functioning.

7. Repeat steps 2–6 for the following virtual machines, using the appropriate 2781A-Allfiles-0X.vhd virtual hard disk file for each virtual machine (a scripted check that all of these files extracted correctly appears after this list):
• 2781A-MIA-SQL-02.vmc
• 2781A-MIA-SQL-03.vmc
• 2781A-MIA-SQL-04.vmc
• 2781A-MIA-SQL-05.vmc
• 2781A-MIA-SQL-06.vmc
• 2781A-MIA-SQL-07.vmc
• 2781A-MIA-SQL-08.vmc
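The following optional sketch confirms that every .vmc file and differencing disk named above is present before you work through the wizard eight times. The file names and the path are taken from the steps above; the script only reads the folder and does not change any Virtual PC settings.

@echo off
setlocal
set DRIVES=C:\Program Files\Microsoft Learning\2781\Drives

rem Check each virtual machine configuration file and its matching
rem Allfiles differencing disk.
for %%V in (01 02 03 04 05 06 07 08) do (
    if not exist "%DRIVES%\2781A-MIA-SQL-%%V.vmc" echo MISSING: 2781A-MIA-SQL-%%V.vmc
    if not exist "%DRIVES%\2781A-Allfiles-%%V.vhd" echo MISSING: 2781A-Allfiles-%%V.vhd
)

rem The shared disk used as Hard Disk 1 should also be present.
if not exist "%DRIVES%\2781A-MIA-SQL.vhd" echo MISSING: 2781A-MIA-SQL.vhd
echo Check complete.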


5. Activate virtual machines

Important Potential virtual machine blue screen. In some rare cases, a virtual machine might display a blue screen sometime between its first launch and its first shutdown. This is due to a known issue in the interaction between Virtual PC 2004 and newer processors. If this occurs, close the virtual machine, and select Turn Off and Save Changes, ensuring that the Commit Changes to the Virtual Hard Disk check box is selected. The problem will rectify itself and will not recur the next time that the virtual machine is started.

Note When the 2781A-MIA-SQL machine is started for the first time, a warning appears that the differencing drive appears to have been modified. In the warning message box, click OK.

We recommend that, after activating the virtual machines, you save them so that, in the future, you can set up the classroom without needing to activate them again.

Note This section requires the use of the product keys supplied by Microsoft Learning. For instructions on obtaining these product keys, see the Important note at the beginning of the Instructor Computer Setup section.

Task Summary

Activate Windows operating systems within the virtual machines.

1. In the Virtual PC console, select 2781A-MIA-SQL-01, and then click Start.
2. Log on to the virtual machine as Administrator with a password of Pa$$w0rd.

Note Pressing CTRL+ALT+DELETE while working with a virtual machine will display the Windows Security dialog box for the host—not the guest—operating system. To log on to the guest operating system running in the virtual machine, press RIGHT-ALT+DELETE.

3. In the Windows Product Activation alert box, click Yes.
4. On the Let’s activate Windows page, select the Yes, I want to telephone a customer service representative to activate Windows radio button, and then click Next.
5. On the Activate Windows by phone page, click the Change Product Key radio button.

Note You might need to scroll down the window to see these radio buttons.

6. On the Change Product Key page, enter the course-specific product key provided by Microsoft Learning, and then click Update.
7. On the Activate Windows by phone page, in the Step 1 drop-down list box, select your location.
8. Dial the telephone number that is displayed in Step 2.
9. Follow the telephone instructions to activate Windows. This will take a few minutes.


10. After logon is completed, in the Virtual PC window, from the Action menu, select Close.
11. In the Close window, select Shut down Windows Server 2003 and save changes, verify that Commit changes to the virtual hard disk is selected, and then click OK.

6. Create a setup share

Task Summary

Share virtual machine files for installing on student computers.

1. In Windows Explorer, right-click C:\Program Files\Microsoft Learning\Base, and then click Sharing (on Windows 2000 Professional) or Sharing and Security (on Windows XP).
2. In the Base Properties window, on the Sharing tab, click Share this folder. In the Share name text box, type Base_Drives, and then click OK.
3. In Windows Explorer, right-click C:\Program Files\Microsoft Learning\2781\Drives, and then click Sharing (on Windows 2000 Professional) or Sharing and Security (on Windows XP).
4. On the Sharing tab, select Share this folder, type 2781_Drives in the Share name text box, and then click OK.

7. Install courseware fonts

Task Summary

Install courseware fonts by running fonts.exe.

1. Click Start, and then click Run.
2. In the Run text box, type x:\setup\fonts.exe (where x is the drive letter of your DVD-ROM drive), and then click OK.
3. In the Courseware fonts dialog box, click Yes.
4. In the Courseware fonts message box, click OK.

8. Install the PowerPoint slides

Task Summary

Install PowerPoint slides by running 2781_ppt.msi.

1. Click Start, and then click Run.
2. In the Run text box, type x:\setup\2781_ppt.msi (where x is the drive letter of your DVD-ROM drive), and then click OK.


Student Computer Checklist

1. Install Virtual PC.
2. Install the virtual disk files.
3. Create a desktop shortcut for Virtual PC.
4. Add virtual machines.
5. Start the virtual machines.


Student Computer Setup

To set up the student computers, complete the items in the Student Computer Checklist. Before starting the installation of the student computers, Windows 2000 Professional or Windows XP Professional must be installed on the computers.

Caution These instructions assume that there is no network connectivity between the instructor computer and the student computers. We recommend copying the activated virtual machines to the student computers (via a burned DVD or USB drive, for example) to avoid the need to activate the virtual machines on each student computer. If you use the original virtual machines from the Trainer Materials DVD, you will need to activate them on each student computer.

1. Install Virtual PC

Note If Virtual PC 2004 is already installed, you may skip this step.

1. Insert the Microsoft Virtual PC 2004 compact disc (CD) into the CD-ROM drive.
2. If autorun is disabled, navigate to the root of the CD, and double-click Setup.exe.
3. On the Welcome to the Installation Wizard for Microsoft Virtual PC 2004 page, click Next.
4. On the License Agreement page, select I accept the terms in the license agreement, and then click Next.
5. On the Customer Information page, enter a User Name, Organization, and the product key for your version of Virtual PC, and then click Next.
6. On the Ready to Install the Program page, click Install.
7. On the InstallShield Wizard Completed page, click Finish.

2. Install the virtual disk files

1. Copy all the files from the Base_Drives share on the instructor’s computer to C:\Program Files\Microsoft Learning\Base.
2. Copy all the files from the 2781_Drives share on the instructor’s computer to C:\Program Files\Microsoft Learning\2781\Drives.

3. Create a desktop shortcut for Virtual PC

1. Navigate to C:\Program Files\Microsoft Virtual PC.
2. Right-click and drag Virtual PC.exe to the desktop.
3. From the Context menu, select Create Shortcuts Here.


4. Add virtual machines

1. Double-click the Microsoft Virtual PC shortcut on the desktop.
2. On the Welcome to the New Virtual Machine Wizard page, click Next. (If the wizard does not automatically start, click New.)
3. On the Options page, select Add an existing virtual machine, and then click Next.
4. In the Name and location box, type C:\Program Files\Microsoft Learning\2781\Drives\2781A-MIA-SQL-01.vmc, and then click Next.
5. On the Completing the New Virtual Machine Wizard page, verify that When I click Finish, open Settings is selected, and then click Finish.
6. In the Settings for 2781A-MIA-SQL-01 dialog box, select Networking, verify that the value of the Number of network adapters setting is 1, verify that Local only is selected in the Adapter 1 list, verify that 2781A-MIA-SQL.vhd is selected as Hard Disk 1, verify that 2781A-Allfiles-01.vhd is selected as Hard Disk 2, and then click OK.

Important Do not change the RAM allocation for the virtual machine. Doing so might cause the lab exercises or practices to become unstable or to cease functioning.

7. Repeat steps 2–6 for the following virtual machines, using the appropriate 2781A-Allfiles-0X.vhd virtual hard disk file for each virtual machine:
• 2781A-MIA-SQL-02.vmc
• 2781A-MIA-SQL-03.vmc
• 2781A-MIA-SQL-04.vmc
• 2781A-MIA-SQL-05.vmc
• 2781A-MIA-SQL-06.vmc
• 2781A-MIA-SQL-07.vmc
• 2781A-MIA-SQL-08.vmc

5. Start the virtual machines

Task Summary

Start the virtual machines.

1. Click Start, and point to All Programs.
2. Click Microsoft Virtual PC.
3. On the Virtual PC Console, click 2781A-MIA-SQL-01, and then click Start.

Note Starting the virtual machine is for the purpose of classroom use. This does not need to be performed until the class is about to start.


Module 0

Introduction

Contents:

Introduction
Course Materials
Microsoft Learning Product Types
Facilities
Microsoft Learning
Microsoft Certification Program
About This Course
The Process of Designing SQL Server 2005 Server-Side Solutions
Course Outline
Setup
Demonstration: Using Virtual PC
What Matters Most?
Introduction to Fabrikam, Inc.

Information in this document, including URL and other Internet Web site references, is subject to change without notice. Unless otherwise noted, the example companies, organizations, products, domain names, e-mail addresses, logos, people, places, and events depicted herein are fictitious, and no association with any real company, organization, product, domain name, e-mail address, logo, person, place, or event is intended or should be inferred. Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.

The names of manufacturers, products, or URLs are provided for informational purposes only, and Microsoft makes no representations and warranties, either expressed, implied, or statutory, regarding these manufacturers or the use of the products with any Microsoft technologies. The inclusion of a manufacturer or product does not imply endorsement by Microsoft of the manufacturer or product. Links are provided to third-party sites. Such sites are not under the control of Microsoft, and Microsoft is not responsible for the contents of any linked site or any link contained in a linked site, or any changes or updates to such sites. Microsoft is not responsible for webcasting or any other form of transmission received from any linked site. Microsoft is providing these links to you only as a convenience, and the inclusion of any link does not imply endorsement by Microsoft of the site or the products contained therein.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

©2006 Microsoft Corporation. All rights reserved.

Microsoft, Active Directory, ActiveX, BizTalk, Excel, Microsoft Press, MSDN, MSN, Outlook, PowerPoint, SharePoint, Tahoma, Visio, Visual Basic, Visual C#, Visual SourceSafe, Visual Studio, Windows, and Windows Server are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. All other trademarks are property of their respective owners.

Introduction

Course Materials

Course kit

The following materials are included with your kit:

■ Student workbook. The student workbook contains the material covered in class, in addition to the hands-on lab exercises.
■ Student Materials compact disc. The Student Materials compact disc contains the Web page that provides links to resources pertaining to this course, including additional reading, review and lab answers, lab files, multimedia presentations, and course-related Web sites. To open the Web page, insert the Student Materials compact disc into the CD-ROM drive, and then in the root directory of the compact disc, double-click Autorun.exe or Default.htm.
■ Course evaluation. You will have the opportunity to provide feedback about the course, training facility, and instructor by completing an online evaluation near the end of the course.

Document conventions

The following conventions are used in course materials to distinguish elements of the text.

Bold: Represents commands, command options, and syntax that must be typed exactly as shown. It also indicates commands on menus and buttons, and indicates dialog box titles and options, and icon and menu names.

Italic: In syntax statements or descriptive text, indicates argument names or placeholders for variable information. Italic is also used for introducing new terms, for book titles, and for emphasis in the text.

Title Capitals: Indicate domain names, user names, computer names, directory names, and folder and file names, except when specifically referring to case-sensitive names. Unless otherwise indicated, you can use lowercase letters when you type a directory name or file name in a dialog box or at a command prompt.

ALL CAPITALS: Indicate the names of keys, key sequences, and key combinations — for example, ALT+SPACEBAR.

try/Try: Keywords in Microsoft® Visual C#® and Microsoft Visual Basic® .NET are separated by a forward slash when casing differs.

monospace: Represents code samples or examples of screen text.

[ ]: In syntax statements, enclose optional items. For example, [filename] in command syntax indicates that you can choose to type a file name with the command. Type only the information within the brackets, not the brackets themselves.

{ }: In syntax statements, enclose required items. Type only the information within the braces, not the braces themselves.

|: In syntax statements, separates an either/or choice.

► (procedure arrow): Indicates a procedure with sequential steps.

...: In syntax statements, specifies that the preceding item may be repeated. It also represents an omitted portion of a code sample.
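
For example, a syntax statement that combines several of these conventions might look like the following. This is a generic Transact-SQL illustration, not a statement taken from the course materials:

    BACKUP DATABASE { database_name | @database_name_var }
        TO <backup_device> [ ,...n ]
        [ WITH { INIT | NOINIT } ]

Here, BACKUP DATABASE, TO, and WITH must be typed as shown; database_name is a placeholder; the braces enclose a required either/or choice; the brackets enclose optional items; and ,...n indicates that the preceding item can be repeated.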

Providing feedback

To provide additional comments or feedback about the course, send e-mail to [email protected]. To ask about the Microsoft Certification Program, send e-mail to [email protected].


Microsoft Learning Product Types

Microsoft Learning product types

Microsoft Learning offers four instructor-led Official Microsoft Learning Product (OMLP) types. Each type is specific to a particular audience and level of experience. The various product types also tend to suit different learning styles. These types are as follows:

■ Courses are for information technology (IT) professionals and developers who are new to a particular product or technology and for experienced individuals who prefer to learn in a traditional classroom format. Courses provide a relevant and guided learning experience that combines lecture and practice to deliver thorough coverage of a Microsoft product or technology. Courses are designed to address the needs of learners engaged in the planning, design, implementation, management, and support phases of the technology adoption life cycle. They provide detailed information by focusing on concepts and principles, reference content, and in-depth, hands-on lab activities to ensure knowledge transfer. Typically, the content of a course is broad, addressing a wide range of tasks necessary for the job role.

■ Workshops are for knowledgeable IT professionals and developers who learn best by doing and exploring. Workshops provide a hands-on learning experience in which participants can use Microsoft products in a safe and collaborative environment based on real-world scenarios. Workshops are the learning products in which students learn by doing through scenarios and through troubleshooting hands-on labs, targeted reviews, information resources, and best practices, with instructor facilitation.

■ Clinics are for IT professionals, developers, and technical decision makers. Clinics offer a detailed presentation that may describe the features and functionality of an existing or new Microsoft product or technology, provide guidelines and best practices for decision making, and/or showcase product demonstrations and solutions. Clinics focus on how specific features will solve business problems.

■ Stand-alone hands-on labs provide IT professionals and developers with hands-on experience with an existing or new Microsoft product or technology. Hands-on labs provide a realistic and safe environment to encourage knowledge transfer by learning through doing. The labs provided are completely prescriptive so that no lab answer keys are required. There is very little lecture or text content provided in hands-on labs, aside from lab introductions, context setting, and lab reviews.


Facilities



Microsoft Learning

Introduction

Microsoft Learning develops Official Microsoft Learning Products for computer professionals who use Microsoft products and technologies to design, develop, support, implement, or manage solutions. These learning products provide comprehensive, skills-based training in instructor-led and online formats.

Related courses

Each course relates in some way to another course. A related course might be a prerequisite, a follow-up course in a recommended series, or a course that offers additional training. Other related courses might become available in the future, so for up-to-date information about recommended courses, visit the Microsoft Learning Web site.

Microsoft Learning information

For more information, visit the Microsoft Learning Web site at http://www.microsoft.com/learning/.


Microsoft Certification Program

Introduction

Microsoft Learning offers a variety of certification credentials for developers and IT professionals. The Microsoft Certification Program (MCP) is the leading certification program for validating your experience and skills, keeping you competitive in today’s changing business environment.

Related certification exams

This course helps students to prepare for:

■ Exam 70-441: Designing Database Solutions by Using Microsoft SQL Server 2005

MCP certifications

The Microsoft Certification Program includes the following certifications:

MCITP. The new Microsoft Certified IT Professional (MCITP) credential allows IT professionals to distinguish themselves as experts in their specific area of focus. There is a direct upgrade path from the Microsoft Certified Database Administrator (MCDBA) certification to the new MCITP credentials. There are currently three IT Professional certifications in database administration, database development, and business intelligence:

■ Microsoft Certified IT Professional: Database Developer
■ Microsoft Certified IT Professional: Database Administrator
■ Microsoft Certified IT Professional: Business Intelligence Developer

MCPD. The Microsoft Certified Professional Developer (MCPD) credential highlights developer job roles, featuring specific areas of expertise. There is a direct upgrade path from the Microsoft Certified Application Developer (MCAD) and Microsoft Certified Solution Developer (MCSD) for Microsoft .NET certifications to the new MCPD credentials. There are three MCPD certification paths in development for Microsoft Windows®, Web applications, and enterprise applications:

■ Microsoft Certified Professional Developer: Web Developer
■ Microsoft Certified Professional Developer: Windows Developer
■ Microsoft Certified Professional Developer: Enterprise Applications Developer

MCTS. The Microsoft Certified Technology Specialist (MCTS) credential enables professionals to target specific technologies and distinguish themselves by demonstrating in-depth knowledge of and expertise in the technologies with which they work. There are currently five MCTS certifications:

■ Microsoft Certified Technology Specialist: .NET Framework 2.0 Web Applications
■ Microsoft Certified Technology Specialist: .NET Framework 2.0 Windows-Based Applications
■ Microsoft Certified Technology Specialist: .NET Framework 2.0 Distributed Applications
■ Microsoft Certified Technology Specialist: Microsoft SQL Server™ 2005
■ Microsoft Certified Technology Specialist: Microsoft BizTalk® Server 2006

MCDST on Microsoft Windows. The Microsoft Certified Desktop Support Technician (MCDST) certification is designed for professionals who successfully support and educate end users and troubleshoot operating system and application issues on desktop computers running the Windows operating system.

MCSA on Microsoft Windows Server™ 2003. The Microsoft Certified Systems Administrator (MCSA) certification is designed for professionals who implement, manage, and troubleshoot existing network and system environments based on the Windows Server 2003 platform. Implementation responsibilities include installing and configuring parts of systems. Management responsibilities include administering and supporting systems.

MCSE on Microsoft Windows Server 2003. The Microsoft Certified Systems Engineer (MCSE) credential is the premier certification for professionals who analyze business requirements and design and implement infrastructure for business solutions based on the Windows Server 2003 platform. Implementation responsibilities include installing, configuring, and troubleshooting network systems.

MCAD for Microsoft .NET. The Microsoft Certified Application Developer (MCAD) for Microsoft .NET credential provides industry recognition for professional developers who use Microsoft Visual Studio® .NET and Web services to develop and maintain department-level applications, components, Web or desktop clients, or back-end data services, or who work in teams developing enterprise applications. The credential covers job tasks ranging from developing to deploying and maintaining these solutions.

MCSD for Microsoft .NET. The Microsoft Certified Solution Developer (MCSD) for Microsoft .NET credential is the top-level certification for advanced developers who design and develop leading-edge enterprise solutions, using Microsoft development tools and technologies as well as the Microsoft .NET Framework. The credential covers job tasks ranging from analyzing business requirements to maintaining solutions.

MCDBA on Microsoft SQL Server 2000. The Microsoft Certified Database Administrator (MCDBA) credential is the premier certification for professionals who implement and administer SQL Server 2000 databases. The certification is appropriate for individuals who derive physical database designs, develop logical data models, create physical databases, create data services by using Transact-SQL, manage and maintain databases, configure and manage security, monitor and optimize databases, and install and configure SQL Server.

MCP. The Microsoft Certified Professional (MCP) credential is for individuals who have the skills to successfully implement a Microsoft product or technology as part of a business solution in an organization. Hands-on experience with the product is necessary to successfully achieve certification.

MCT. Microsoft Certified Trainers (MCTs) demonstrate the instructional and technical skills that qualify them to deliver Official Microsoft Learning Products through a Microsoft Certified Partner for Learning Solutions (CPLS).

Certification requirements

Certification requirements differ for each certification category and are specific to the products and job functions addressed by the certification. To earn a certification credential, you must pass rigorous certification exams that provide a valid and reliable measure of technical proficiency and expertise.

For More Information See the Microsoft Learning Web site at http://www.microsoft.com/learning/. You can also send e-mail to [email protected] if you have specific certification questions.

Acquiring the skills tested by an MCP exam

Official Microsoft Learning Products can help you develop the skills that you need to do your job. They also complement the experience that you gain while working with Microsoft products and technologies. However, no one-to-one correlation exists between Official Microsoft Learning Products and MCP exams. Microsoft does not expect or intend for the courses to be the sole preparation method for passing MCP exams. Practical product knowledge and experience are also necessary to pass MCP exams. To help prepare for MCP exams, use the preparation guides that are available for each exam. Each Exam Preparation Guide contains exam-specific information, such as a list of the topics on which you will be tested. These guides are available on the Microsoft Learning Web site at http://www.microsoft.com/learning/.


About This Course

Description

The purpose of this course is to teach database developers working in enterprise environments to identify and place database technologies during a project’s design phase to achieve a suitable solution that meets the needs of an organization. The developer will learn to consider the solution from a system-wide view instead of from a single database or server perspective.

Audience

The intended audience for this course consists of experienced professional database developers who perform the following tasks by using Microsoft SQL Server 2005:

■ Designing and implementing programming objects.
■ Designing databases, at both the conceptual and logical levels.
■ Implementing databases at the physical level (that is, the physical implementation of database objects—for example, table and index creation).
■ Gathering business requirements.

Course prerequisites

This course has the following prerequisites:

■ Experience reading user requirements and business-need documents, such as development project vision/mission statements or business analysis reports.
■ An understanding of Transact-SQL syntax and programming logic.
■ An understanding of Extensible Markup Language (XML). Specifically, you must be familiar with XML syntax, what elements and attributes are, and how to distinguish between them.
■ An understanding of security requirements. Specifically, you must understand how unauthorized users can gain access to sensitive information and be able to plan strategies to prevent access.

■ Some experience with professional-level database design. Specifically, you must:
  ● Fully understand third normal form (3NF).
  ● Be able to design a database to 3NF (normalization) and know the trade-offs when backing out of a normalized design (denormalization).
  ● Be able to design a database for performance and business requirements.
  ● Be familiar with design models, such as star and snowflake schemas.
■ Basic monitoring and troubleshooting skills.
■ Basic knowledge of the operating system and platform—that is, how the operating system integrates with the database, what the platform or operating system can do, and how interaction between the operating system and the database works.
■ Basic knowledge of application architecture—that is, how to design applications by using three layers, what applications can do, how interaction between the application and the database works, and how the interaction between the database and the platform or operating system works.
■ Knowledge about how to use a data modeling tool.
■ Some experience with a reporting tool.
■ Exposure to the new features of Microsoft SQL Server 2005. Specifically, you must have:
  ● Used the development tools provided in SQL Server 2005.
  ● Used Transact-SQL enhancements in SQL Server 2005 to perform database development tasks.
  ● Developed XML-based solutions by using SQL Server 2005.
  ● Built message-based services by using Service Broker.
  ● Implemented Web services by using native Hypertext Transfer Protocol (HTTP) endpoints.
  ● Built Notification Services applications.
  ● Implemented database functionality with managed (.NET) code.
  ● Built client applications with the Microsoft .NET Framework.
  ● Built administrative applications with SQL Server Management Objects (SMO).

Course objectives

After completing the course, you will be able to:

■ Select SQL Server services to support an organization’s business needs.
■ Design a security strategy for a SQL Server 2005 solution.
■ Design a data modeling strategy.
■ Design a transaction strategy for a SQL Server solution.
■ Design a Notification Services solution.
■ Design a Service Broker solution.
■ Plan for source control, unit testing, and deployment to meet an organization’s needs.
■ Evaluate advanced query techniques.
■ Evaluate advanced XML techniques.


The Process of Designing SQL Server 2005 Server-Side Solutions

The process of designing SQL Server 2005 server-side solutions

The process of designing a SQL Server 2005 server-side solution can require you to perform the following tasks:

1. Select the necessary SQL Server services to support the business needs. This task ascertains the technologies you will use for your solution and determines the scope of the remaining steps.
2. Design a security strategy for the SQL Server solution. This is a fundamental task that impacts the design of the entire system.
3. Design a data modeling strategy. This task determines how and where you store data and the services you will provide for accessing this data.

When you have completed the initial steps, you can perform the following tasks as required by the solution:

■ Design a transaction strategy for the SQL Server 2005 solution. This task helps to ensure the integrity of the data and concurrent data access.
■ Design a Notification Services solution. This is an optional task if you determine that your solution should use Notification Services.
■ Design a Service Broker solution. This optional task should be performed if you determine that your solution should use Service Broker.
■ Plan for source control, unit testing, and deployment. This task defines the iterative process for maintaining, testing, and deploying your solution.

One other task that is important to the design process is evaluating the advanced query and XML techniques available in SQL Server 2005. You can perform this task at any time and as often as necessary throughout the planning and development process.


Course Outline

Course outline



■ Module 1, “Selecting SQL Server Services That Support Business Needs,” provides an overview of SQL Server 2005 architecture and the considerations to take into account when choosing SQL Server services to include in a solution. The module also provides information about the use of the database enhancements in SQL Server 2005. You will use this knowledge to translate business requirements into SQL Server services and present this solution to nontechnical users, such as business decision makers. At the end of the module, you will present the proposed solution to the rest of the class.

■ Module 2, “Designing a Security Strategy,” provides the considerations to take into account when designing a security strategy for the various components of a SQL Server 2005 solution. This includes considerations for choosing an authentication and authorization strategy for your solution as well as designing security for solution components such as Notification Services and Service Broker. The module also provides guidelines for designing objects to manage application access. Examples of objects include SQL Server 2005 roles and schemas, stored procedures, and views. In addition, the module provides the knowledge required to create an auditing strategy for your solution. Last, the module teaches you how to manage multiple development teams. At the end of the module, you will defend the security strategy to the rest of the class.

■ Module 3, “Designing a Data Modeling Strategy,” provides the considerations and guidelines to take into account when defining standards for storing XML data in a solution. The module also provides the knowledge required to design a database schema. To enable you to do this, the module provides information about the considerations to take into account when implementing online transaction processing (OLTP) and online analytical processing (OLAP) functionality, determining normalization levels, and creating indexes. Last, the module covers the considerations to take into account when designing a scale-out strategy for a solution. Considerations include choosing multiple data stores, designing for performance and redundancy, and integrating data stores in the solution.

■ Module 4, “Designing a Transaction Strategy for a SQL Server 2005 Solution,” teaches the considerations and guidelines that you need to know when defining a transaction strategy for a solution. Defining this strategy includes defining data behavior requirements, defining isolation levels for data stores, and designing a resilient transaction strategy. This module also enables you to defend the proposed transaction strategy to the rest of the class.

■ Module 5, “Designing a Notification Services Solution,” teaches the guidelines and processes that you need to know when designing a Notification Services solution as part of a SQL Server 2005 solution. Design tasks include defining event data and how this data will be stored, designing a subscription strategy, designing a notification strategy, and designing a notification delivery strategy. In addition to design tasks, at the end of the module you will also execute a Notification Services solution to see this type of solution design in action.

■ Module 6, “Designing a Service Broker Solution,” teaches the guidelines and processes you need to know to design a Service Broker solution as part of a SQL Server 2005 solution. Design tasks include designing the solution architecture, data flow, and availability. At the end of the module, you will execute a Service Broker solution to see this type of solution design in action.

■ Module 7, “Planning for Source Control, Unit Testing, and Deployment,” teaches the considerations and guidelines you need to know when planning for source control, unit testing, and deployment during the design of a SQL Server 2005 solution. Design tasks include designing a source control strategy, designing a unit testing plan, creating a performance baseline and benchmarking strategy, and designing a deployment strategy. At the end of the module, you will work in teams to perform design tasks and then present and defend your design decisions to your peers.

■ Module 8, “Evaluating Advanced Query and XML Techniques,” provides an opportunity for you to evaluate and practice using advanced query and XML techniques, which you might use on the job when designing a SQL Server 2005 solution. Query-related tasks include evaluating common table expressions, pivot queries, and ranking techniques. XML-related tasks include defining standards for storing XML data, evaluating the use of the XML Query Language (XQuery), and creating a strategy for converting data between XML and relational formats. As part of your evaluation tasks, you will write advanced queries and compare the results of the new techniques to the results of previous techniques.

Note For in-depth coverage of topics not discussed in this course, see other professional-level SQL Server courses that deal with database administration, database development, and business intelligence.


Setup

Virtual PC configuration

In this course, you will use Microsoft Virtual PC 2004 to perform the hands-on practices and labs.

Important When performing the hands-on activities, if you make any changes to the virtual machine and do not want to save them, you can close the virtual machine without saving the changes. This will return the virtual machine to the most recently saved state. To close a virtual machine without saving the changes, perform the following steps:
1. On the virtual machine, on the Action menu, click Close.
2. In the Close dialog box, in the What do you want the virtual machine to do? list, click Turn off and delete changes, and then click OK.

The following table shows the virtual machines used in this course and their roles.

Virtual machine: 2781A-MIA-SQL
Role: A server computer running Windows Server 2003 Service Pack 1 (SP1) and SQL Server 2005 Enterprise Edition

Software configuration

The classroom computers use the following software:

■ Microsoft Windows XP Professional with Service Pack 2 (SP2)
■ Microsoft Virtual PC 2004

Course files

There are files associated with the demonstrations, practices, and labs in this course. The files are located on each student computer, in the folder E:\Labfiles.


Classroom setup

Each computer in the classroom will have the same virtual machine configured in the same way. The virtual machines use the following software:

■ Microsoft Windows Server 2003 Enterprise Edition with Service Pack 1 (SP1)
■ Microsoft SQL Server 2005 Enterprise Edition
■ Microsoft Visual Studio Developer Edition
■ Microsoft Office Excel® 2003
■ Microsoft Office Word 2003
■ Microsoft Office Visio® 2003

Each virtual machine runs an instance of SQL Server named MIA-SQL\SQLINST1. This instance hosts a copy of the example AdventureWorks database. The installation includes SQL Server Integration Services, SQL Server Notification Services, SQL Server Service Broker, and SQL Server Workstation components.
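
If you want to confirm that a virtual machine is configured as expected, you can connect to the MIA-SQL\SQLINST1 instance and run a quick Transact-SQL check. This is an optional sanity check, not part of the official setup steps:

    -- Confirm the instance name reported by SQL Server
    SELECT SERVERPROPERTY('ServerName') AS ServerName;

    -- Confirm that the example AdventureWorks database is present
    SELECT name FROM sys.databases WHERE name = 'AdventureWorks';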

Course hardware level

To ensure a satisfactory student experience, Microsoft Learning recommends a minimum configuration for trainer and student computers in all Microsoft Certified Partner for Learning Solutions (CPLS) classrooms in which Official Microsoft Learning Products are used. This course requires that you have a computer that meets or exceeds hardware level 5, which specifies a minimum 2.4 gigahertz (GHz) Pentium 4 or equivalent CPU, at least 2 gigabytes (GB) of RAM, 16 megabytes (MB) of video RAM, and a 7,200 RPM 40-GB hard disk.


Demonstration: Using Virtual PC

Virtual PC demonstration

In this demonstration, your instructor will help familiarize you with the Virtual PC environment in which you will work to complete the practices and labs in this course. You will learn:

■ How to start Microsoft Virtual PC.
■ How to start a virtual machine.
■ How to log on to a virtual machine.
■ How to switch between full-screen and window modes.
■ How to distinguish between the virtual machines that are used in the practices for this course.
■ That the virtual machines can communicate with each other and with the host computer but that they cannot communicate with computers that are outside the virtual environment. (For example, no Internet access is available from the virtual environment.)
■ How to close Virtual PC.

Keyboard shortcuts

While working in the Virtual PC environment, you might find it helpful to use keyboard shortcuts. All Virtual PC shortcuts include a key that is referred to as the HOST key or the RIGHT-ALT key. By default, the HOST key is the ALT key on the right side of your keyboard. Some useful shortcuts include:

■ RIGHT-ALT+DELETE to log on to the virtual machine.
■ RIGHT-ALT+ENTER to switch between full-screen and window modes.
■ RIGHT-ALT+RIGHT ARROW to display the next virtual machine.

For more information about using Virtual PC, see Virtual PC Help.


What Matters Most?

What matters most in this course

This table captures the most important information that you should take away from this course.

Most important conceptual knowledge and understanding:

■ The SQL Server 2005 services and when and how they can be used for a solution.
■ The ways to integrate the SQL Server 2005 security model into a solution design and the security implications of different services.
■ The benefits and drawbacks of complex query techniques in achieving different performance objectives.
■ The ways to integrate testing and deployment strategies into the design of a solution.

Most important problems for students to solve or skills to perform in the classroom:

■ Given a set of business requirements, choose the best mix of server solutions and associated coding strategies and then defend your decisions.
■ Based on the benefits and drawbacks of the new features available to write queries (such as XML and recursive queries), choose one and then defend your choice.
■ Design a solution test and deployment plan that includes the performance benchmarks and baselines.
■ Design security for a solution and then defend your choices.

Most important products to create during the course:

■ A design plan for Fabrikam, Inc., that includes:
  ● Location of services.
  ● Security strategy.
  ● Data stores and flow between the data.
  ● Object access strategy.
  ● Testing and deployment strategies.
■ Transact-SQL scripts that use SQL Server 2005 language enhancements.

Most important dispositions (attitudes, interests, beliefs) that will contribute to success on the job:

■ An attitude of professionalism and openness. You must be able to explain concepts to nontechnical people but also be able to listen to those people, value what they are saying, understand their needs, and then translate those needs into a technical solution.
■ A willingness to embrace new technologies and accept the challenge of incorporating new technology into new or existing solutions. You must also possess an ability to keep learning and to not fall back on only your standard areas of expertise.
■ An ability and willingness to be flexible in evaluating solutions and trying alternative approaches. Because finding the right tool for the job is not often straightforward, you must possess a willingness to look for multiple answers before choosing one and to not always stop with the first answer.

Tips for getting the most out of this course:

If, as the course progresses, you feel that you have not adequately learned something mentioned in this table, ask questions of the instructor and your peers until you are satisfied that you understand the concepts and know how to perform these important tasks. After class each day, review the materials, highlight key ideas and notes, and create a list of questions to ask the next day. Your instructor and peers will be able to suggest additional, up-to-date resources for more information. Ask them about additional resources, and record their ideas so that you can continue learning after this course is over.

As soon as possible after this course is over, share this information with your manager, and discuss next steps. Your manager should understand that the sooner you apply these skills on the job, the more likely you will be to remember them long term. However, a person cannot learn everything that there is to know about this complex job task in a three-day course, so schedule some additional time to continue your learning by reading the supplementary materials mentioned throughout the course, offered by your instructor and peers during the course, and included on the Student Materials compact disc.


Introduction to Fabrikam, Inc.


Fabrikam, Inc., is a wireless telephone company that uses a solution that was built in 1995. They have a global system (Application A) that manages all of the customer accounts. In addition, they have regional sales offices that interact with third-party owned and operated storefronts that sell Fabrikam services. Each storefront uses a local system (Application B) to manage the accounts created at that location. Application A is running at maximum capacity, and any new features being added need to be built in a scaled-out fashion. For most storefronts, data is sent and received between Application A and Application B via nightly batch updates. Some stores do not have Internet access and are sent weekly updates through the post. As a result, out-of-date data is frequently found in Application A. In addition, the total cost of ownership (TCO) of the billing process has increased over the last few years, and alternatives are being considered.

Your role at Fabrikam, Inc.

You will help Fabrikam solve its problems by completing the labs in Modules 1 through 8. For each lab, you will use specific input to produce the required output, as described in the following table.

Note Module 8, “Evaluating Advanced Query and XML Techniques,” does not teach explicit job tasks with regard to the Fabrikam solution. However, the tasks in the module are things that experienced database developers do as a typical part of their jobs. The module provides you with an opportunity to see advanced query and XML techniques, practice using them, and evaluate their use in your organizations.

Lab 1: Selecting SQL Server Services to Support Business Needs
Input:
■ A Visio diagram with some of the existing sites and databases for the scenario
Output:
■ The Visio diagram augmented with information about services mapped to the business requirements
■ Transact-SQL scripts and common language runtime (CLR) code for several stored procedures

Lab 2: Designing a Security Strategy
Input:
■ The Visio diagram answer for the Module 1 lab
■ A test database with several tables that store sample customer information
■ Stored procedures to insert and modify the customer information
■ Table diagrams
Output:
■ An auditing strategy document
■ Transact-SQL scripts for auditing
■ Updated table diagrams that contain information about the database schemas
■ A Word document with information about roles and some prototype objects to access the database

Lab 3: Designing a Data Modeling Strategy
Input:
■ An example of a bill (Word document)
■ The Visio diagram answer for the Module 2 lab
Output:
■ The Visio diagram augmented with information about locations of data stores mapped to the business requirements
■ An additional Visio page with a prototype reporting schema
■ Modified Visio diagram showing data store configuration
■ A document defining the order of object access

Lab 4: Designing a Transaction Strategy for a SQL Server 2005 Solution
Input:
■ The Visio diagram answer for the Module 3 lab
■ Customers and Orders database containing tables specified in the scenario
■ Stored procedures defined in the scenario
Output:
■ A document defining the transaction strategy

Lab 5: Designing a Notification Services Solution
Input:
■ The Visio diagram answer for the Module 4 lab
■ A Word document that contains the following sections:
  ● Hardware Configurations
  ● Instance of Notification Services Deployment Strategy
  ● Nonhosted Event Provider Deployment Strategy
  ● Subscription Management Deployment Strategy
  ● Key Areas of Administration to the Database Administrator (DBA)
Output:
■ Modified Visio diagram with Notification Services components
■ Completed Word document for the Notification Services solution
■ The Instance Configuration File (ICF) for this solution

Lab 6: Designing a Service Broker Solution
Input:
■ The Visio diagram answer for the Module 5 lab
■ A Visio template to document where the brokers and conversations are and to identify the queues between the brokers
Output:
■ Modified Visio diagram with Service Broker components

Lab 7: Planning for Source Control, Unit Testing, and Deployment
Input:
■ The Visio diagram answer for the Module 6 lab
■ The Notification Services document answer for the Module 5 lab
■ Description of project teams
■ Template to document the strategies
Output:
■ A document containing source control, unit testing, and deployment standards and strategies

Lab 8: Evaluating Advanced Query and XML Techniques
Input:
■ Rollup queries that need to be executed
■ Pivot and unpivot queries that need to be executed
■ Ranking queries that need to be executed
■ Tables against which the queries need to run
■ XML data in a table

Module 1

Selecting SQL Server Services That Support Business Needs

Contents:

Lesson 1: Overview of the Built-In SQL Server Services
Lesson 2: Evaluating When to Use the New SQL Server Services
Lesson 3: Evaluating the Use of Database Engine Enhancements
Lab: Selecting SQL Server Services to Support Business Needs



Module objectives

After completing this module, students will be able to:

■ Evaluate the use of the built-in SQL Server services.
■ Evaluate the use of the new SQL Server services.
■ Evaluate the use of Database Engine enhancements.

Introduction

Designing a solution requires that you assess the business requirements of the organization and suggest a design that does not exceed the available budget. A good understanding of the new functionality and features available in Microsoft® SQL Server™ 2005 is essential to help you make scalable and secure design decisions. This will help you plan business solutions and select the best components that meet your organization’s requirements.

In this module, you will learn how to select the appropriate SQL Server services to meet a particular set of business requirements. This module provides an overview of the SQL Server 2005 architecture and the features that you should consider when choosing the SQL Server services to include in a solution. You will use this knowledge to map your business requirements to the SQL Server services, choose the best mix of server solutions and associated coding strategies, and justify your decisions.


Lesson 1: Overview of the Built-In SQL Server Services

Lesson objectives

After completing this lesson, students will be able to:

■ Describe the architecture of SQL Server 2005 components.
■ Describe the enhancements to the Full-Text Search service in SQL Server 2005.
■ Evaluate the scenarios in which you can use HTTP endpoints.
■ Evaluate the scenarios in which you can use SQL Server Replication.
■ Evaluate the scenarios in which you can use SQL Server Agent.
■ Evaluate the scenarios in which you can use Database Mail.

Introduction

Earlier versions of SQL Server provided a number of additional services beyond those of the SQL Server Database Engine. SQL Server 2005 extends and enhances many of these services and adds several new services, such as HTTP endpoints and Database Mail. It is important for you to have a good understanding of these new and updated services to enable you to select the appropriate service to satisfy the requirements of a given business scenario. This lesson describes considerations for using these services.


The Architecture of SQL Server 2005

Introduction

SQL Server 2005 is more than a simple database management system. The architecture of SQL Server 2005 includes multiple components and services that constitute a comprehensive platform for enterprise applications. Gaining a thorough understanding of the SQL Server 2005 architecture will enable you to make the best use of its services in your solutions.

Components of SQL Server 2005

The following table provides a brief description of the various components that make up the SQL Server 2005 architecture.

Database Engine: The SQL Server Database Engine is responsible for storing, processing, and securing data. The SQL Server 2005 Database Engine includes many new features, such as common language runtime (CLR) integration, support for Extensible Markup Language (XML), and support for Web services.

Replication: SQL Server Replication is a set of technologies that enable you to distribute and copy data and objects between geographically distributed databases. SQL Server ensures consistency by using robust and customizable database synchronization features.

Full-Text Search: SQL Server Full-Text Search enables you to perform full-text queries against text data stored in a SQL Server database. You can include words and phrases or multiple forms of a word or a phrase in a full-text query.

Service Broker: SQL Server Service Broker uses message queuing to enable you to build asynchronous, loosely coupled, reliable, and scalable database applications.

SQL Server Agent: SQL Server Agent enables you to schedule and execute jobs, configure alerts, and send messages to operators. By using SQL Server Agent, you can automate many administrative tasks for SQL Server.

Analysis Services: SQL Server Analysis Services provides online analytical processing (OLAP) and data mining functionality for business intelligence applications. Analysis Services supports multidimensional structures and data-mining models.

Integration Services: SQL Server Integration Services (SSIS) is a platform for data integration and consolidation. It enables you to define and automate comprehensive processes for performing Extraction, Transformation, and Load (ETL) procedures.

Reporting Services: SQL Server Reporting Services is a server-based reporting platform that enables you to create and manage reports. These reports can contain data from relational and multidimensional data sources. Users can view and manage reports over a Web-based connection.

Notification Services: SQL Server Notification Services is a platform that enables you to create applications that generate and send notifications based on specified events to subscribers. You can use Notification Services to send messages to a variety of devices.


The Full-Text Search Service

Introduction

Enterprise applications often require the ability to perform search operations for keywords and phrases in large datasets. The Full-Text Search service enables you to perform efficient full-text queries and execute linguistic searches over text data. The Full-Text Search service uses specialized full-text indexes to search for text efficiently. The Full-Text Search service was available in earlier versions of SQL Server, but SQL Server 2005 has introduced some new features.


Enhancements to the Full-Text Search service in SQL Server 2005

The following list provides a brief explanation of the enhancements made to the Full-Text Search service:

■ New data definition language (DDL) statements. You can create, modify, and implement full-text catalogs and indexes by using DDL statements. In earlier versions of SQL Server, you had to use stored procedures to accomplish these tasks. (A sketch of these statements appears after this list.)

■ Ability to include an arbitrary number of columns in a CONTAINS clause. In earlier versions of SQL Server, you could include only one column or specify all columns by using the wildcard operator (*).

■ Support for XML data. You can create full-text indexes for the XML data type and perform full-text queries against XML data.

■ Improvement in scalability. Each instance of the Database Engine runs its own instance of the Full-Text Search service.

■ Ability to issue full-text queries against linked servers. SQL Server 2005 enables you to run full-text queries against linked servers, if the tables involved in the query have the appropriate indexes.

■ Support for locales. You can specify the language to be used for word breaking, stemming, thesaurus lookups, and noise-word processing.

■ Improvements in performance when populating full-text indexes and running queries. The Full-Text Search service uses improved algorithms when gathering data and performing searches, making the full-text indexing process more efficient. Full-text queries no longer need to wait for the merge process to complete before retrieving results.

■ Improvement in the backup process. Full-text indexes are now backed up when you perform a database backup.
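
For example, the new DDL statements reduce full-text setup to a few statements. The following sketch creates a catalog and an index on the AdventureWorks Production.ProductDescription table and then queries it. The catalog name is illustrative, and the KEY INDEX name assumes the default AdventureWorks primary key index, so adjust both to match your schema.

-- Create a full-text catalog and a full-text index by using the new DDL.
-- ProductFTCatalog is an illustrative name; the KEY INDEX must be a
-- unique, single-column, non-nullable index on the table.
CREATE FULLTEXT CATALOG ProductFTCatalog AS DEFAULT;

CREATE FULLTEXT INDEX ON Production.ProductDescription (Description)
    KEY INDEX PK_ProductDescription_ProductDescriptionID
    ON ProductFTCatalog
    WITH CHANGE_TRACKING AUTO;

-- Run a full-text query against the indexed column.
SELECT ProductDescriptionID, Description
FROM Production.ProductDescription
WHERE CONTAINS(Description, N'"lightweight" OR "aluminum"');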


Considerations for Using HTTP Endpoints

Introduction

Developing service-oriented and XML-based solutions that are not based on Microsoft Windows® clients often requires a more open communications protocol than the Tabular Data Stream (TDS) protocol used by SQL Server. Prior to SQL Server 2005, TDS was the only data transmission protocol available for SQL Server. SQL Server 2005 provides HTTP endpoints to support the situations in which TDS is not suitable. For example, client applications that use SOAP to access services over a corporate intranet are better implemented by using HTTP endpoints. Other technologies, such as Microsoft ASP.NET Web services, provide similar functionality.

Note: HTTP endpoints enable you to expose SQL Server 2005 data to the network in a manner similar to implementing a Web service. SQL Server 2005 can also consume Web services by using the new Microsoft .NET Framework common language runtime (CLR) integration features. Lesson 3 describes using the CLR in more detail.

Considerations for using HTTP endpoints

You should consider these factors when deciding whether to use HTTP endpoints:

■ Interoperability. HTTP endpoints use the SOAP protocol. If your solution requires that clients running operating systems other than Microsoft Windows® be able to access SQL Server 2005, using HTTP endpoints is an obvious choice.

■ Performance. HTTP endpoints use the http.sys kernel-mode driver, which provides better performance than Internet Information Services (IIS)–based solutions such as ASP.NET Web services.

■ Security. You should not use HTTP endpoints to connect the database directly to an Internet application, because doing so exposes the database to the Internet. This increases the risk of attack. If you must provide access to the database over the Internet, you should select more secure technologies, such as ASP.NET Web services and SQLXML.

■ Scaling out. HTTP endpoints do not scale out because they use the Database Engine directly. If you need to scale out your system, you should select another technology, such as ASP.NET Web services or SQLXML, which can run on multiple servers.

Scenarios suitable for using HTTP endpoints

The following list describes some scenarios in which you should consider using HTTP endpoints:

■ Generating reports for internal use. You can quickly create stored procedures that retrieve the required data and expose them through HTTP endpoints. You can provide internal users who need this data with the URL of the HTTP endpoint. These users can then use a Web browser such as Microsoft Internet Explorer to connect to this URL and view the data output from the stored procedure. (A minimal endpoint definition is sketched after this list.)

■ Using XML. You might have applications that generate and process data as XML instead of using the relational format used by SQL Server. You can use an HTTP endpoint to send and receive data as XML documents between an application and SQL Server 2005.

■ Implementing a Service-Oriented Architecture (SOA). The implementation of HTTP endpoints conforms to the SOA, providing programmers with a consistent means of exposing and consuming data services.
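
The following sketch shows roughly what such an endpoint definition looks like in Transact-SQL. The endpoint name, path, site, stored procedure, and namespace are all hypothetical, and the options shown are only a subset of those available.

-- Expose a stored procedure as a SOAP Web method over HTTP.
-- All names, the path, and the namespace are hypothetical.
CREATE ENDPOINT InternalReportEndpoint
    STATE = STARTED
AS HTTP (
    PATH = '/sql/reports',
    AUTHENTICATION = (INTEGRATED),
    PORTS = (CLEAR),
    SITE = 'MIA-SQL'
)
FOR SOAP (
    WEBMETHOD 'GetInternalReport'
        (NAME = 'AdventureWorks.dbo.GetInternalReport'),
    BATCHES = DISABLED,   -- do not allow ad hoc SQL batches
    WSDL = DEFAULT,       -- generate WSDL so that clients can bind easily
    DATABASE = 'AdventureWorks',
    NAMESPACE = 'http://Adventure-Works/Reports'
);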

Scenarios where you should not use HTTP endpoints

The following list describes some scenarios in which it is not appropriate to use HTTP endpoints:

■ Using Windows 2000 or earlier operating systems. HTTP endpoints use the http.sys kernel driver. This driver is available only on Microsoft Windows Server™ 2003 and Windows XP Service Pack 2.

■ Performing real-time transaction processing. HTTP endpoints cannot provide the same response time as other data access technologies, such as connecting directly to the database server by using Microsoft ADO.NET. Do not use HTTP endpoints when you require mission-critical response times.

■ Using large objects (LOBs). XML and HTTP endpoints are not the most efficient mechanism for transporting LOBs to and from a database server. The serialization required to convert LOB data to XML can consume considerable processing power and significant network bandwidth.


Considerations for Using SQL Server Replication

Introduction

SQL Server replication technology enables you to build low-overhead distributed solutions. SQL Server 2005 provides several enhancements for replicating data compared to earlier versions. You can use peer-to-peer transactional replication to enable different sites to transmit changes to each other as they perform transactions on replicated data, and you can perform merge replication across the Internet over an HTTP connection. You can also integrate with databases from other vendors.

Considerations for using SQL Server Replication

The following list describes some considerations that you should take into account when deciding to use replication to build distributed solutions:

■ Replicating data over the Internet. You can use HTTP merge replication to synchronize sites connected over the Web by using the Hypertext Transfer Protocol Secure (HTTPS) protocol and IIS. HTTP merge replication also supports SQL Server Mobile Edition subscribers. You can implement transactional replication across the Internet by using a virtual private network (VPN) connection to SQL Server.

■ Restricting the operations performed by subscribers. You have full control over the operations available to subscribers over the replicated data. You can configure subscribers to be read-only as well as read/write.

■ Supporting different connection models. Subscribers do not need to be permanently connected to the replication publisher and distributor. Subscribers can operate autonomously; they need to connect only to perform synchronization. However, in this scenario, the data held by a subscriber can quickly become out-of-date, and updates made by other subscribers can cause conflicts. Permanently connected systems can propagate data changes with little delay, enabling fast synchronization and consistency between a publisher and subscribers.

■ Performing DDL operations. SQL Server 2005 enables you to replicate schema changes easily. You no longer need to use special stored procedures to add and drop columns. Additionally, SQL Server 2005 Replication supports a much broader range of schema changes. You can execute ordinary DDL statements on the publisher, and the changes are automatically propagated to all subscribers as part of the synchronization process.

■ Supporting heterogeneous replication. SQL Server 2000 Replication supports Oracle and IBM DB2 snapshot and transactional subscribers. SQL Server 2005 also enables you to incorporate an Oracle 8.0.4 or later database as a publisher, making it easy to integrate data from an Oracle database into a SQL Server solution.

■ Improving performance and scalability. SQL Server 2005 Replication introduces several performance and scalability enhancements. These improvements include parallel processing for merge and distribution agents and the ability to use precomputed partitions for filtered merge publications.


Considerations for Using SQL Server Agent

Introduction

You can use SQL Server Agent for scheduling the execution of administrative tasks and routine maintenance operations. SQL Server Agent can monitor the SQL Server Database Engine and raise alerts if an unusual event occurs or a particular threshold is crossed. SQL Server Agent can send information about alerts to operators.

Note: SQL Server Agent is disabled by default. You can enable SQL Server Agent by using the SQL Server 2005 Surface Area Configuration tool.

Considerations for using SQL Server Agent

You should consider using SQL Server Agent to perform the following tasks:

■ Performing regular maintenance operations. You can use SQL Server Agent to schedule regular maintenance operations such as performing backups and reorganizing or rebuilding indexes. (A minimal job definition is sketched after this list.)

■ Responding to alerts. SQL Server Agent can recognize and respond to an alert by notifying an operator and executing a job. Depending on the nature of the alert, you can define jobs to handle and recover from a number of potentially serious conditions. For example, you can detect whether the transaction log for a database is close to full and trigger a transaction log backup before the database stops functioning. SQL Server 2005 also enables SQL Server Agent to respond to Windows Management Instrumentation (WMI) events and SQL Server performance conditions.

■ Scheduling regular business tasks. Some businesses need to perform regular processes on specified dates, such as performing end-of-period consolidation of accounts. These processes might require a number of database operations and transactions, some of which could be time-consuming and resource intensive. You can use SQL Server Agent to schedule these tasks to run within the appropriate time frame and at off-peak hours to minimize disruption to other daily operations.
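
As an illustration, the following sketch creates a nightly index-maintenance job entirely in Transact-SQL by using the SQL Server Agent stored procedures in the msdb database. The job name, command, and schedule are illustrative.

-- Create a job that rebuilds the indexes on one table every night at 23:00.
-- The job name, target table, and schedule are illustrative.
USE msdb;

EXEC dbo.sp_add_job
    @job_name = N'Nightly index maintenance';

EXEC dbo.sp_add_jobstep
    @job_name = N'Nightly index maintenance',
    @step_name = N'Rebuild Person.Address indexes',
    @subsystem = N'TSQL',
    @database_name = N'AdventureWorks',
    @command = N'ALTER INDEX ALL ON Person.Address REBUILD;';

EXEC dbo.sp_add_jobschedule
    @job_name = N'Nightly index maintenance',
    @name = N'Every night at 23:00',
    @freq_type = 4,               -- daily
    @freq_interval = 1,           -- every day
    @active_start_time = 230000;  -- 23:00:00

-- Target the local server so that the job actually runs.
EXEC dbo.sp_add_jobserver
    @job_name = N'Nightly index maintenance';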


Considerations for Using Database Mail

Introduction

Database Mail is a reliable, scalable, and secure solution for sending e-mail messages from the Database Engine. The messages can contain query results and attachments. Database Mail is a lightweight alternative to SQL Mail, which was included in earlier versions of SQL Server. It consumes fewer resources than SQL Mail, but it provides less functionality.

Note: Database Mail is disabled by default. You can enable Database Mail by using the SQL Server 2005 Surface Area Configuration tool.

Considerations for using Database Mail

The following list describes some considerations that will help you to decide when to use Database Mail instead of another e-mail solution:

■ Operating with the SQL Server Database Engine. You can use Database Mail to send e-mail messages from Transact-SQL batches, stored procedures, and triggers. (A sketch appears after this list.) You cannot use Database Mail to receive and process e-mail messages. In contrast, you can use SQL Mail to receive and process e-mail messages; however, SQL Mail is a deprecated component that you should not use in new development.

■ Using SMTP instead of MAPI. Database Mail uses Simple Mail Transfer Protocol (SMTP) to communicate with mail servers and does not need any additional e-mail software. Other solutions, such as SQL Mail, require a MAPI-enabled client application such as Microsoft Office Outlook®.

■ Implementing isolation and robustness. Database Mail runs in its own process, separate from the SQL Server Database Engine. If the Database Mail process fails, it does not affect the Database Engine. Database Mail is activated by Service Broker when clients issue requests to send e-mail messages. The separation of Database Mail from the Database Engine relieves some of the load on the Database Engine. SQL Server can continue to queue e-mail messages even if the Database Mail process stops or fails; Database Mail transmits the queued requests when it restarts successfully.

■ Supporting clusters and 64-bit servers. Database Mail is cluster-aware and is fully supported on 64-bit installations of SQL Server 2005. This improves the scalability and reliability of Database Mail.
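
Once a Database Mail profile exists, sending a message from Transact-SQL is a single procedure call. In the following sketch, the profile name and recipient address are hypothetical.

-- Send the results of a query as an e-mail message.
-- 'OpsProfile' and the recipient address are hypothetical; a Database Mail
-- profile must already be configured on the server.
EXEC msdb.dbo.sp_send_dbmail
    @profile_name = N'OpsProfile',
    @recipients = N'dba@adventure-works.com',
    @subject = N'Low-stock products',
    @body = N'Products below the illustrative stock threshold are attached.',
    @query = N'SELECT ProductID, Name, SafetyStockLevel
               FROM AdventureWorks.Production.Product
               WHERE SafetyStockLevel < 100;',
    @attach_query_result_as_file = 1;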


Lesson 2: Evaluating When to Use the New SQL Server Services

Lesson objectives

After completing this lesson, students will be able to:

■ Evaluate the scenarios in which you can use Notification Services.

■ Evaluate the scenarios in which you can use Service Broker.

■ Evaluate the scenarios in which you can use Reporting Services.

■ Evaluate the scenarios in which you can use SQL Server Integration Services.

Introduction

SQL Server 2005 includes several services, such as Service Broker, that provide new functionality. Data Transformation Services (DTS) has been replaced by SQL Server Integration Services (SSIS). SSIS has many new features added specifically for use with SQL Server 2005. Reporting Services and Notification Services are available as separate downloads for SQL Server 2000, but they are now an integral part of SQL Server 2005. It is important to understand the purpose and features of these services to enable you to select the appropriate services to meet the requirements of a given business scenario.


Considerations for Using Notification Services

Introduction

Building event-based notification and alerting subsystems often requires significant development effort. Notification Services can greatly reduce that development time and effort, and it can interoperate with other application components.

Considerations for using Notification Services

You should consider the following points when determining whether to use Notification Services for your solution:

■ Capturing and responding to events. You can use Notification Services for sending notifications based on events. Notification Services uses notification-generation rules that enable you to focus on the notification definition instead of the implementation details.

■ Supporting scalability. Notification Services uses a set-oriented processing model and is designed to support a large number of subscribers.

■ Scaling out. Unlike some of the alternative technologies such as Database Mail, Notification Services can run on a different SQL Server instance. This provides you with the ability to scale out your solution across multiple computers.

■ Supporting different types of devices. Notification Services applications can send notifications to a wide variety of devices. These devices include PDAs, instant messenger clients, cell phones, pagers, and e-mail clients. You can also use Notification Services to send notifications by using technologies such as Short Message Service (SMS).

■ Building custom components. The Notification Services application programming interface (API) is highly extensible. You can create custom components, including custom event providers, formatters, and delivery protocols.

Scenarios in which you should not use Notification Services

Although Notification Services operates asynchronously, it is not a general-purpose asynchronous processing mechanism. In scenarios in which you need to perform arbitrary asynchronous processing, you should use an alternative technology such as Service Broker.


Considerations for Using Service Broker

Introduction

Service Broker is a new subsystem of the Database Engine in SQL Server 2005. You can use it for building reliable, scalable, and loosely coupled distributed applications. Service Broker uses the built-in, asynchronous message-queuing infrastructure of SQL Server. Asynchronous queuing helps you to develop solutions that can respond to more requests than the platform might be able to handle immediately.

Considerations for using Service Broker

To decide when to use Service Broker, you should consider the following:

■ Security. Unlike Microsoft Message Queuing (MSMQ), which stores messages in memory or in the file system, Service Broker stores messages in hidden tables in a SQL Server database. The messages are as secure as your database, and they are backed up when you back up the database.

■ Reliable messaging. Service Broker is an integral component of SQL Server 2005. Messages are stored in hidden tables in the database and are automatically subject to transaction management. If a transaction fails, any messages processed during that transaction are automatically returned to the message queue and can be processed again by a new transaction.

■ Large messages. Unlike technologies such as COM+ queued components, in which the message length cannot exceed 4 MB, Service Broker can efficiently and dynamically handle messages up to 2 GB in size.

■ Heterogeneous data. You can use Service Broker for scenarios involving one or more SQL Server databases. Service Broker is not suitable for accessing heterogeneous systems. In such scenarios, you should consider other technologies, such as COM+ components.

Scenarios where using Service Broker is appropriate

You can use Service Broker in scenarios when you need to:

■ Scale out the system. Server resources can soon become exhausted when processing is performed on a large scale. When using Service Broker, you can divide the workload among multiple SQL Server instances. One instance can perform part of the processing and send messages to other instances by using Service Broker. These other instances can read the messages from the queue and perform other parts of the processing.

■ Improve the reliability and availability of the system. If you build a distributed system using several servers and you perform distributed transactions, using Service Broker can improve the reliability and availability of the system. If one server fails, Service Broker can send messages to a working server to continue the processing.

■ Improve the response time of applications. Using Service Broker, you can arrange for lengthy transactions to be processed asynchronously. A client application can send a message requesting the transaction and does not have to wait for the transaction to be completed. This can improve the response time of applications.

■ Build an auditing system. You can use the Event Notifications feature of Service Broker instead of DDL triggers to build an auditing system.

■ Maintain up-to-date cached data. Some applications need to maintain data in a cache, and it is important that this data is accurate and up-to-date. You can use Query Notifications to maintain cached data by enabling an application to request a notification from SQL Server when the results of a query change. The application can then refresh its cache with the new data.

Note Other technologies, such as COM+ queued components and MSMQ, provide similar functionality to Service Broker. Based on your requirements, you should evaluate the advantages and disadvantages of using these technologies versus Service Broker.
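
To make the moving parts concrete, the following sketch sets up a one-way conversation between two services in the same database and sends a message. All of the object names are hypothetical, and a complete design would also define a reply message type and a procedure that receives messages from the target queue.

-- Define the Service Broker objects. All names are hypothetical.
CREATE MESSAGE TYPE BookingRequest
    VALIDATION = WELL_FORMED_XML;

CREATE CONTRACT BookingContract
    (BookingRequest SENT BY INITIATOR);

CREATE QUEUE BookingInitiatorQueue;
CREATE QUEUE BookingTargetQueue;

CREATE SERVICE BookingClientService
    ON QUEUE BookingInitiatorQueue;

CREATE SERVICE BookingProcessingService
    ON QUEUE BookingTargetQueue (BookingContract);

-- Send a message on a new conversation.
DECLARE @dialog uniqueidentifier;

BEGIN DIALOG CONVERSATION @dialog
    FROM SERVICE BookingClientService
    TO SERVICE 'BookingProcessingService'
    ON CONTRACT BookingContract
    WITH ENCRYPTION = OFF;

SEND ON CONVERSATION @dialog
    MESSAGE TYPE BookingRequest
    (N'<booking customer="42" flight="AA123" />');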


Considerations for Using Reporting Services

Introduction

Reporting is a common requirement in many applications. However, generating reports can require running lengthy queries and performing complex calculations and formatting. This additional workload can impact the performance of the database server. You can use SQL Server 2005 Reporting Services to offload some of the work to a different server.

Scenarios where using Reporting Services is appropriate

The following list describes some scenarios in which it is appropriate to use Reporting Services:

■ There is no requirement for data updates. Most Web applications are designed to provide read-only access to information and do not provide any update capabilities. In most of those cases, using the interactivity features of Reporting Services can provide the user with a rich experience without requiring direct access to the data source.

■ Reports query data that is not often updated. Examples of this type of report include weekly or monthly summary reports. If you have a significant number of users requiring reports like this, you can give them access to the information they need by using Reporting Services.

■ Reports contain the same information grouped or sorted in different ways. You can consolidate several reports into one by using grouping and sorting parameters. You can then use the interactive features of Reporting Services to allow users to display the data sorted or grouped as required.

■ Information is required in a specific format. The database administrator can use Reporting Services to generate reports during off-peak hours. The administrator can distribute the reports to users when appropriate, without requiring continuous access to the source databases.

Issues that can arise when using Reporting Services

During design, you should keep in mind the following issues that might arise if you choose to deploy Reporting Services as part of your solution:

■ Inappropriate security settings. Reporting Services has its own security settings, which are independent of the security of SQL Server. Ensure that you plan user access to reports carefully to avoid granting inadvertent access to other data.

■ Unbalanced workload distribution between the database extraction and report rendering stages. Careful planning is necessary to achieve the correct balance between the load placed on the database and the use of Reporting Services data caching for each solution.

■ Increased storage requirements. Access to cached reports or reports stored as snapshots requires additional storage space and backup devices. Sometimes, the amount of space required is significant.

Scaling out Reporting Services

You can scale out Reporting Services functionality in the following ways:

■ Separate the database query and reporting functionality. You can extract the data from a database running on one server and format the report by using Reporting Services running on another server. This is the easiest solution, but it might require that you buy additional licenses.

■ Use another database as the Reporting Services data source instead of the production database. You can copy the required data from the production database to a new database designed to support efficient queries. You can use a technology such as SQL Server Replication or SQL Server Integration Services (SSIS) to transfer the data from the production system to the new database. You could also consider setting up a data warehouse, although this is not a simple task.

■ Use multiple instances of Reporting Services. Several Reporting Services instances can share the same Reporting Services database. You can use this configuration to run Reporting Services in a clustered environment. You must have access to the necessary tools and software for creating and managing the server cluster, because Reporting Services does not provide this functionality itself.


Considerations for Using SQL Server Integration Services

Introduction

SSIS provides functionality that enables you to perform extraction, transformation, and loading of data into a SQL Server database. However, you can also perform many of the same operations by using other tools. SSIS is a powerful tool, but performing complex transformations across a large data set can impose a significant load on the database server.

Considerations for using SQL Server Integration Services

The following list describes some considerations for determining whether you should use SSIS in your solution:

■ Automating data loading operations. SSIS can operate with SQL Server Agent to perform scheduled data loading operations. This is useful if you need to load data on a periodic basis.

■ Merging data from heterogeneous data sources. You might need to consolidate information from different repositories. SSIS provides easy access to data sources such as Microsoft Office Excel®, Microsoft Office Access®, and Oracle databases; XML files; and many other sources, using either Open Database Connectivity (ODBC) or OLE DB. Additionally, SSIS supports custom extensions, enabling you to create data adapters to address specific needs.

■ Splitting out data into different destinations according to business rules. You can use SSIS to examine data and send it to different databases based on a set of business rules.

■ Populating data warehouses and generating summary data. SSIS includes support for advanced operations such as loading data into a data warehouse and performing business intelligence tasks. This includes support for Slowly Changing Dimensions in OLAP databases.

■ Cleaning and transforming data. SSIS provides features that you can use to clean data as you load it. The available features include the ability to perform lookups in reference tables and advanced mechanisms such as fuzzy lookups, which you can use to identify and clean data even when there is no perfect match. SSIS also enables you to transform data as you load it.


Practice: Evaluating When to Use the New SQL Server 2005 Services

Introduction

In this practice, you will work in small groups to analyze several scenarios and then identify the business requirements for each scenario. Read the questions that follow, and consider how different SQL Server 2005 services can meet the various business requirements.

Scenario 1

You run a Web site for automobile collectors. Users of this Web site participate in discussions, submit photographs of automobiles, and post comments. You want to modify the Web site as follows:

■ The Web site needs to display information about regional car shows. The Web site users want to be informed about any car shows in their local areas.

■ Manufacturers of automobile parts should be able to use this Web site to advertise information about their products that are specific to particular automobiles.

■ You want to duplicate the Web site to another Internet service provider (ISP) in a different geographical location. However, you need to ensure that the new Web site displays the same data as the original Web site.

Scenario 2

You work for an airline company. This company has made agreements with travel agents so that customers can book flights through these agents. Currently, travel agents accept bookings by telephone and through a Web application. At the end of each day, the travel agents send all bookings as an e-mail attachment to your company. A system at your company processes this data automatically. On the next business day, the travel agents receive a message confirming or canceling the bookings. You have been asked to modify the system design to meet the following requirements:

■ Travel agents should be able to send booking requests at any time.

■ Bookings must be confirmed or canceled more quickly, ideally within an hour of receiving them from the travel agent. The results should be sent to the travel agent as soon as possible.


Scenario 3

You work for a company that manufactures toys. Some of the required components for the toys are built in factories located in several different cities, but the toys are assembled in the central factory. The central factory accepts orders from customers. The central factory keeps a stock of components received from other factories. When the stock of any component in the central factory falls below a specific threshold, you place an order with the appropriate factory by using a manual process. When the stock of any component in a remote factory falls below a specific threshold, the factory increases production of that component to meet demand. This company requires a new stock control system that performs order processing and work order processing to meet the following requirements:

■ It must be possible to query the stock level and work order status of all components from the central factory.

■ When any component reaches its minimum threshold in the central factory, the system should automatically send an order to the appropriate factory.

■ When any component reaches its minimum threshold in a remote factory, the system should automatically produce the work order to increase production.

■ Product shipment and work order processing must be executed every week.

Discussion questions

1. Which SQL Server services should you use to support the requirements of each of the scenarios? Why?

For scenario 1:

Requirement: The Web site needs to display information about regional car shows. The Web site users want to be informed about any car shows in their local areas.
SQL Server services: Users probably access the Web site by using a browser. Therefore, for this scenario, you do not need to create HTTP endpoints. The Web server can retrieve the data provided by the SQL Server Database Engine and generate Web pages, which users can display in their browsers. You could consider using Notification Services to inform users about car shows because of the nature of the events and the scalability of Notification Services.

Requirement: Manufacturers of automobile parts should be able to use this Web site to advertise information about their products that are specific to particular automobiles.
SQL Server services: Provide an ASP.NET Web service that manufacturers can use to submit their advertisements. Using a Web service provides a simple interface, enabling manufacturers to integrate the interface into their own applications. It also maintains security by avoiding direct access to SQL Server.

Requirement: You want to duplicate the Web site to another ISP in a different geographical location. However, you need to ensure that the new Web site displays the same data as the original Web site.
SQL Server services: Consider using replication to copy the SQL Server data to the new ISP and keep it synchronized.


For scenario 2:

Requirement: Travel agents should be able to send booking requests at any time.
SQL Server services: Implement an ASP.NET Web service for the travel agents to use for sending booking requests at any time. Using HTTP endpoints is inappropriate, for security reasons.

Requirement: Bookings must be confirmed or canceled more quickly, ideally within an hour of receiving them from the travel agent. The results should be sent to the travel agent as soon as possible.
SQL Server services: Use Service Broker to process the booking requests and to send confirmation or rejection messages, because messages need not be sent immediately. The ASP.NET Web service can receive the booking requests from the travel agents. This Web service can call a stored procedure, which places the booking request in a queue. You can schedule a second stored procedure to read and process the booking requests from the queue. This stored procedure can then send the confirmation or cancellation message for the bookings to the travel agents by using Database Mail.

For scenario 3:

Requirement: It must be possible to query the stock level and work order status of all components from the central factory.
SQL Server services: Consider using HTTP endpoints and a private network. For security reasons, you should also use Secure Sockets Layer (SSL) and implement Windows Authentication for accessing SQL Server.

Requirement: When any component reaches its minimum threshold in the central factory, the system should automatically send an order to the appropriate factory.
SQL Server services: Use Notification Services to send an order to the appropriate factory when a component reaches its minimum threshold in the central factory. Notification Services is appropriate because of the nature of the events.

Requirement: When any component reaches its minimum threshold in a remote factory, the system should automatically produce the work order to increase production.
SQL Server services: Use Notification Services to automatically produce the work order when a component reaches its minimum threshold in a remote factory. Again, Notification Services is appropriate because of the nature of the events.

Requirement: Product shipment and work order processing must be executed every week.
SQL Server services: Use Service Broker to process product shipments and work orders. Using Service Broker enables you to schedule the processing of the shipments and work orders.

2. For the automobile Web site, if you were to duplicate the Web site database on a different server to ensure that the data remains available to the Web server if the primary database goes offline unexpectedly, would you change your approach? Why?

In this case, you might use database mirroring, because it provides better availability than replication.

3. In which scenarios would you choose to use HTTP endpoints instead of Web services? Why?

You should deploy HTTP endpoints only for internal use by an organization or across a private network. Using HTTP endpoints is more efficient than using a Web service, but HTTP endpoints are more difficult to protect, so you should use them only in environments in which you can control their exposure and vulnerability.


Lesson 3: Evaluating the Use of Database Engine Enhancements

Lesson objectives

After completing this lesson, students will be able to:

■ Describe the Transact-SQL enhancements.

■ Evaluate the use of the integrated common language runtime.

■ Evaluate the use of SQLXML.

Introduction

In this lesson, you will learn how to evaluate the use of the Database Engine enhancements in SQL Server 2005. SQL Server 2005 has introduced several features that enable you to create efficient code. These enhancements provide greater functionality and employ simpler coding techniques to help you build optimal solutions. This lesson concentrates on showing you how to evaluate the new Transact-SQL statements and syntax available in SQL Server 2005, explaining the issues you need to consider when using the new common language runtime (CLR) integration features, and demonstrating when to use SQLXML to work with XML data.


Transact-SQL Enhancements

Introduction

SQL Server 2005 provides several new features that extend the functionality of the Transact-SQL language you can use for performing queries, as well as for stored procedures, triggers, and user-defined functions. These new features include the following:

■ Common Table Expressions (CTEs)

■ Ranking functions

■ The XML data type

■ A unified large object programming model

■ New statement operators

■ Structured error handling

■ New triggers

The following sections provide more information about these features.

Common Table Expressions

CTEs are syntactic constructs that operate as views. However, they are short-lived and exist only while the statements in which they are defined are executing. Using CTEs, you can more easily create complex recursive queries that otherwise would require you to define temporary tables. CTEs are especially useful for analyzing self-referencing tables, where a row in the table is related to other rows in the same table.
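
As an illustration, the following recursive CTE walks a manager–employee hierarchy. The Employees table, with EmployeeID, ManagerID, and Name columns, is hypothetical.

-- A recursive CTE over a self-referencing table (hypothetical schema).
WITH OrgChart (EmployeeID, Name, Level)
AS
(
    -- Anchor member: employees with no manager.
    SELECT EmployeeID, Name, 0
    FROM Employees
    WHERE ManagerID IS NULL

    UNION ALL

    -- Recursive member: direct reports of the previous level.
    SELECT e.EmployeeID, e.Name, o.Level + 1
    FROM Employees AS e
    JOIN OrgChart AS o ON e.ManagerID = o.EmployeeID
)
SELECT EmployeeID, Name, Level
FROM OrgChart
ORDER BY Level, Name;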

Ranking functions

You can use ranking functions to obtain positional information about rows in a result set. Ranking functions return a ranking value for each row in a partition. The ranking functions available in SQL Server 2005 are RANK, DENSE_RANK, NTILE, and ROW_NUMBER.
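
For example, ROW_NUMBER can number each customer's orders from most recent to oldest. The query below is a sketch against the AdventureWorks Sales.SalesOrderHeader table.

-- Number each customer's orders, most recent first.
SELECT CustomerID,
       SalesOrderID,
       OrderDate,
       ROW_NUMBER() OVER (PARTITION BY CustomerID
                          ORDER BY OrderDate DESC) AS OrderSequence
FROM Sales.SalesOrderHeader;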


The XML data type

You can store XML documents and fragments in a database. You can define table columns and variables by using the XML data type. You can also associate XML values with an XML schema to perform validation. You can query XML data by using the XML Query language (XQuery), and you can modify XML data by using SQL Server extensions to the XQuery language.
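
The following minimal sketch declares an xml variable and extracts values from it; the document content is illustrative.

-- Store an XML fragment and query it with the xml type's XQuery methods.
DECLARE @order xml;
SET @order = N'<order id="42"><item sku="BK-101" qty="2" /></order>';

-- value() extracts a scalar; exist() tests whether a node is present.
SELECT @order.value('(/order/@id)[1]', 'int') AS OrderID,
       @order.exist('/order/item[@sku = "BK-101"]') AS HasItem;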

Unified large object programming model

In earlier versions of SQL Server, the large object types (text and image) had limited functionality and required you to handle them in a different way from the ordinary data types. SQL Server 2005 adds the types varchar(max) and nvarchar(max), which can both hold up to 2 GB of data. You can manipulate these types by using the same statements that you use for the ordinary data types.
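
For instance, a varchar(max) column accepts the ordinary string functions and can also be modified in place with the .WRITE clause of UPDATE, which the old text type did not support. The Documents table in this sketch is hypothetical.

-- Append text to a varchar(max) column by using .WRITE.
-- The Documents table is hypothetical; a NULL offset appends to the end.
UPDATE Documents
SET Body.WRITE(' (revised)', NULL, 0)
WHERE DocumentID = 1;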

New statement operators

SQL Server 2005 introduces the following statement operators:

■ OUTPUT. The OUTPUT operator enables you to retrieve the original or new values of rows affected by a data modification operation (INSERT, UPDATE, or DELETE) and store them in a table or table variable (see the sketch after this list).

■ APPLY. The APPLY operator enables you to invoke a table-valued function for each row returned by an outer-table expression of a query.

■ PIVOT/UNPIVOT. The PIVOT and UNPIVOT operators enable you to transform column values for a table-valued expression into columns, and vice versa. You can use the PIVOT operator to generate cross-tabulation reports to summarize data. PIVOT provides syntax that is simpler and more readable than the equivalent complex series of SELECT...CASE statements. In earlier versions of SQL Server, you had to write complex Transact-SQL code to implement similar functionality.

■ TOP (expression). The TOP (expression) operator enables you to return only the topmost rows of a query by specifying a numeric expression instead of an integer constant. Earlier versions of SQL Server required you to generate dynamic SQL to achieve the same functionality.
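
The following sketch, against a hypothetical dbo.Orders table, shows OUTPUT capturing deleted rows and TOP limiting a result by an expression.

-- Capture rows removed by a DELETE into a table variable.
-- dbo.Orders is hypothetical.
DECLARE @archived TABLE (OrderID int, OrderDate datetime);

DELETE FROM dbo.Orders
OUTPUT deleted.OrderID, deleted.OrderDate INTO @archived
WHERE OrderDate < '20040101';

-- TOP now accepts an expression rather than only an integer constant.
DECLARE @rowsToReturn int;
SET @rowsToReturn = 10;

SELECT TOP (@rowsToReturn) OrderID, OrderDate
FROM dbo.Orders
ORDER BY OrderDate DESC;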

Error handling

SQL Server 2005 introduces a structured exception-handling mechanism through the TRY...CATCH construct. TRY...CATCH provides a uniform way to handle the errors that can occur in a block of Transact-SQL code at run time. You no longer have to test the success or failure of each individual statement.
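
A minimal sketch of the construct follows.

-- Handle a run-time error without testing each statement individually.
BEGIN TRY
    BEGIN TRANSACTION;

    UPDATE Person.Address
    SET AddressLine1 = '123 Main Street'
    WHERE AddressID = 1;

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;

    -- The ERROR_* functions describe the failure inside the CATCH block.
    SELECT ERROR_NUMBER() AS ErrorNumber,
           ERROR_MESSAGE() AS ErrorMessage;
END CATCH;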

New trigger functionality

SQL Server 2005 adds the following new trigger functionality:

■ DDL triggers. DDL triggers fire when a user executes a data definition language (DDL) statement, such as CREATE, ALTER, or DROP. You can use them to audit the operations performed by database developers, to help enforce business rules, and to prevent dangerous DDL operations, such as dropping the table containing the organization's financial accounting information (a short sketch follows this list).

■ INSTEAD OF triggers. Ordinary data manipulation language (DML) triggers run after an INSERT, UPDATE, or DELETE operation. You can use an INSTEAD OF trigger to replace the INSERT, UPDATE, or DELETE operation and perform your own custom processing. You can also define INSTEAD OF triggers for views, extending and controlling the range of operations that a user can perform through a view.
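
The following minimal sketch shows a database-scoped DDL trigger that prevents tables from being dropped; the trigger name is illustrative.

-- Prevent DROP TABLE statements in the current database.
CREATE TRIGGER PreventTableDrops
ON DATABASE
FOR DROP_TABLE
AS
BEGIN
    RAISERROR ('Tables cannot be dropped in this database.', 16, 1);
    ROLLBACK TRANSACTION;
END;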


Considerations for Using the Integrated Common Language Runtime

Introduction

In earlier versions of SQL Server, if you needed to perform operations that interacted with the environment, such as running a script of operating system commands, you needed to use extended stored procedures. An extended stored procedure that malfunctioned could compromise the Database Engine and even stop SQL Server from running. Using the Microsoft .NET Framework common language runtime (CLR) integration in SQL Server 2005 provides a high degree of performance without compromising SQL Server stability. CLR integration with the SQL Server Database Engine provides developers with much functionality not available in earlier versions of SQL Server. Transact-SQL code is well suited to performing database tasks, but it can be suboptimal for procedural operations, so you should consider using the CLR for those tasks.

Note: The integrated CLR is disabled by default. You can enable CLR integration by using the SQL Server 2005 Surface Area Configuration tool, or by using the sp_configure stored procedure:

sp_configure 'clr enabled', 1

You must also execute the RECONFIGURE command.

Considerations for using integrated CLR objects

You should consider the following points when evaluating the use of integrated CLR objects to meet your business requirements:

■ Storing assemblies. CLR objects do not have external dependencies, because assemblies are stored in the database. Alternative technologies such as Component Object Model (COM) objects or extended stored procedures use external dynamic-link libraries (DLLs) to implement their functionality.

■ Ensuring performance. CLR objects run in the same process as the SQL Server Database Engine. The Database Engine does not need to run an additional process to execute your CLR objects.

■ Controlling security. You can control the resources that CLR objects can access by using code access security (CAS). It is more difficult to restrict the access available to code written by using COM objects or extended stored procedures.

■ Maintaining stability. Although CLR objects run in the same process as the SQL Server Database Engine, the CLR carefully manages these objects. Unless you explicitly configure the CLR to bypass the built-in type safety features, CLR objects cannot compromise the Database Engine. A CLR object that fails will not cause the Database Engine to stop.

■ Restricting access. You can restrict objects that run by using the integrated CLR to a subset of the functionality available in a trusted subset of assemblies in the Microsoft .NET Framework. This trusted subset does not contain any classes or methods that enable your code to access resources outside the SQL Server environment. You can relax this restriction for selected objects, but you should carefully consider the need to do so and verify that these objects cannot compromise the security of your database or the computer it is running on.

■ Retaining compatibility. Earlier versions of SQL Server do not support the CLR. If your solution must be compatible with earlier versions of SQL Server, you cannot use the integrated CLR.

Scenarios in which using the integrated CLR is appropriate

You should consider using the integrated CLR in the following scenarios:

■ Performing complex server-side processing. The CLR is better suited than Transact-SQL for writing code that performs complex logic and computationally intensive server-side processing.

■ Accessing external resources. CLR integration enables you to access external resources such as Web services, network resources, and the file system from stored procedures and other programmable objects in a more secure way than using COM objects and extended stored procedures. However, you must understand how the .NET Framework implements CAS before writing CLR code that accesses external resources.

■ Extending Database Engine functionality. You can extend the Database Engine functionality by adding functions, data types, and aggregates to solve problems that are difficult or inefficient to implement by using Transact-SQL. For example, you could define a matrix data type and provide operations that perform matrix addition and multiplication.

Scenarios in which using the integrated CLR is not appropriate

You should avoid using the integrated CLR in the following scenarios:

■ Performing database queries. Do not use the CLR to run code that simply wraps database queries. The Transact-SQL language is much better suited to these types of operations than the procedural languages supported by the .NET Framework.

■ Performing row-by-row processing of data. If you can use a Transact-SQL UPDATE statement to perform the same task, it will perform more efficiently and consume fewer resources than using the CLR.

■ Calculating aggregate values. If you need to perform calculations such as evaluating the maximum, minimum, or average value in a set, use Transact-SQL. Transact-SQL provides aggregation functions that can perform these operations much more efficiently than using the CLR.

Additionally, remember that Transact-SQL is the native programming language supported by SQL Server. When you write code by using the CLR, you must open another connection to the database and create command objects for performing SQL operations.
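
After you compile a CLR object into an assembly, you register it and expose its methods with Transact-SQL statements such as the following. The assembly path, class name, and method name are all hypothetical.

-- Register a compiled assembly and expose one of its methods as a
-- scalar function. The path, class, and method names are hypothetical.
CREATE ASSEMBLY StringUtilities
FROM 'E:\Assemblies\StringUtilities.dll'
WITH PERMISSION_SET = SAFE;   -- SAFE code cannot access external resources
GO

CREATE FUNCTION dbo.MyInsertString
(
    @searchIn nvarchar(4000),
    @searchFor nvarchar(100),
    @occurrences int,
    @insertText nvarchar(100)
)
RETURNS nvarchar(4000)
AS EXTERNAL NAME StringUtilities.[StringFunctions].InsertString;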


Practice: Evaluating Transact-SQL and the CLR for Implementing Functionality

Introduction

In this practice, you will work individually to evaluate the implementation of four pieces of functionality. Each item has been implemented by using Transact-SQL code and by using the CLR. You will evaluate which implementation is the most appropriate in each case. The four pieces of functionality are:

■ A procedure that updates the address of a row in the Person.Address table.

■ A function that displays the hierarchy of managers and the employees who report to them from the HumanResources.Employee table.

■ A function that searches for the first n occurrences of a fragment of text in a string and inserts a specified sequence of characters into the search string immediately before each occurrence found. The string to search, the text to search for, the number of occurrences, and the text to insert are all supplied as parameters.

■ A procedure that splits the data in the AddressLine1 column in the Person.Address table. If the AddressLine1 column contains a house number and a street, the procedure splits the string and moves the street into the AddressLine2 column. If AddressLine2 already contains information, it is appended to the street name.

Preparation

1. Start the SQL Server Surface Area Configuration tool.

2. In the SQL Server Surface Area Configuration tool, click Surface Area Configuration for Features.

3. In the Surface Area Configuration For Features window, click CLR Integration, click Enable CLR integration, and then click OK.

4. Close the SQL Server Surface Area Configuration tool.

5. Start SQL Server Management Studio. Connect to the MIA-SQL\SQLINST1 Database Engine by using Windows Authentication.


6. In SQL Server Management Studio, open the file E:\Practices\Setup T-SQL Objects.sql. Connect to the MIA-SQL\SQLINST1 server by using Windows Authentication when prompted.

7. Execute the script. The script should run without any errors.

8. In the Object Explorer window, verify that the following objects exist in the AdventureWorks database:

   ● Stored procedure Person.UpdateAddress
   ● Stored procedure HumanResources.GetDependentEmployees
   ● Scalar-valued function dbo.InsertString
   ● Stored procedure Person.SplitAddress

9. Open the file E:\Practices\Setup CLR Objects.sql. Connect to the MIA-SQL\SQLINST1 server by using Windows Authentication.

10. Execute the script. The script should run without any errors.

11. In the Object Explorer window, verify that the following objects exist in the AdventureWorks database:

   ● Stored procedure Person.ClrUpdateAddress
   ● Stored procedure HumanResources.ClrGetDependentEmployees
   ● Scalar-valued function dbo.ClrInsertString
   ● Stored procedure Person.ClrSplitAddress

12. Leave SQL Server Management Studio open to perform the remaining tasks for this practice.

Evaluate the stored procedure and function implementations

1. In SQL Server Management Studio, open the file E:\Practices\Test.sql. Connect to the MIA-SQL\SQLINST1 server by using Windows Authentication when prompted.

2. Examine the script. The script calls each implementation of the stored procedures and functions a number of times, passing the same parameter values to each implementation. Do not run the script yet.

3. Start SQL Server Profiler, and create a new trace. Connect to the MIA-SQL\SQLINST1 Database Engine by using Windows Authentication when prompted.

4. In the Trace Properties dialog box, click the Events Selection tab, and then select the following event and columns (remove all other events and columns):

   ● TSQL – SQL:BatchCompleted. Select the TextData, SPID, Duration, Reads, and Writes columns.

5. Click Run.

6. Return to SQL Server Management Studio, and then execute the Test.sql script.

7. Wait for the script to complete, and then return to SQL Server Profiler.

Note: The final query will generate the message "The query has exceeded the maximum number of result sets that can be displayed in the results grid." You can safely ignore this message.

8. Record the values in the Reads, Writes, and Duration columns for each Transact-SQL and CLR batch in the Test.sql script. (Look for the rows containing -- BATCH comments.)

9. Run the script three or four more times (as time permits) to obtain an average value for the Reads, Writes, and Duration columns for each batch.

10. Stop the trace, and then close SQL Server Profiler.

11. Close SQL Server Management Studio.

Discussion questions

1. Which functionality showed better performance when implemented by using the CLR? Which functionality was better implemented by using Transact-SQL?

The Transact-SQL implementations of batches 1, 2, and 4 show a shorter duration than the CLR implementations. The CLR implementations of batches 2 and 4 also perform more I/O than the Transact-SQL implementations. However, batch 3 shows a much shorter duration (two orders of magnitude) for the CLR implementation than for the Transact-SQL implementation.

Note: You might notice that the first run of the Transact-SQL implementations for batches 1 and 2 is slower than the CLR implementations. This is because SQL Server has to retrieve the data from disk the first time these procedures run. In subsequent runs, the data is cached in memory, so the stored procedures operate much more quickly.

2. What conclusions can you draw from your findings?

Batches 1, 2, and 4 query and update the database, making heavy use of Transact-SQL statements. This type of functionality is best implemented as ordinary Transact-SQL stored procedures and functions. Batch 3 performs complex string manipulation in memory. This does not require access to the database, and the CLR code is much more efficient than the Transact-SQL equivalent.


Multimedia: Considerations for Using SQLXML

Introduction

Microsoft SQL Server 2005 provides extensive support for XML data processing. You can store XML values in an XML column in a table. The XML data type optionally enables you to specify a collection of XML schemas that SQL Server will use to validate the data in a column. SQL Server 2005 also extends the SQLXML, FOR XML, and OpenXML features available in SQL Server 2000. With the newly added data types and enhancements to existing features, SQL Server 2005 provides a powerful platform for developing applications that need to store and process XML data.

Discussion questions

1. What approaches have you used when working with XML data in your applications?

Possible answers include:

■ Using the XML classes in the Microsoft .NET Framework to write code that processes XML data.

■ Using Microsoft BizTalk® or similar tools that can transform XML.

2. How does SQL Server implement client-side support for handling XML data?

SQL Server uses the SQLXML managed assembly to provide client-side support for handling XML data. You must deploy this assembly with client or middle-tier applications that need to display or manipulate XML data stored in a SQL Server database.


Lab: Selecting SQL Server Services to Support Business Needs

Introduction

This lab contains two exercises:
■ In Exercise 1, “Translating Business Requirements into SQL Server Services,” you will work in groups to translate the business requirements of Fabrikam, Inc., into SQL Server services. Based on the services identified, you will modify the supplied Microsoft Visio document, which represents the database solution architecture that Fabrikam, Inc., currently uses, to reflect the services to be included.
■ In Exercise 2, “Analyzing the Needs of Real Organizations,” you will discuss your own experiences of using services in your current or past organizations.

Scenario

Fabrikam, Inc., is a wireless telephone company with a database solution that was built in 1995. The organization has a global system, Application A, that manages all of the customer accounts. In addition, the organization has regional offices that sell its services. Through Application B, regional offices can manage all the accounts that were created at that particular location. The global system is running at maximum capacity, and any new features will need to be built in a scaled-out fashion. All of the regional offices had dial-up connections to the Internet at the time the original solution was built, but they now have direct, dedicated Internet access. Data is synchronized between regional offices and the central office by using SQL Server merge replication.


Fabrikam, Inc., is facing the following issues:
■ Application A is running at maximum capacity.
■ Regional offices print their own bills. They use a third-party solution that automates printing, envelope stuffing, and mailing. The total cost of ownership (TCO) of this printing solution has increased over the past few years, and new alternatives are being considered.
■ The current payment process is manual.
■ New services are being offered to automate the payment process. The bill-paying companies providing these services have defined a common Web service–based interface. To enable these companies to notify Fabrikam of the results of the payment process, Fabrikam must provide a Web service that implements this interface.

The system should be changed to meet the following requirements:
■ Customers should be permitted to choose an automatic payment method.
■ A new Web site should be created to permit the customers to perform the following operations:
  ● Update their contact address and billing information.
  ● Request new services.
  ● View their bills.
■ When a customer requests a new service by using the Web application, the following tasks should be performed:
  ● The Web application should inform the customer that the request has been registered and that the customer will receive a notification with the request result later.
  ● The request must be approved by following a number of predefined business rules.
  ● After the request has been approved or rejected, the customer should receive a message with a response to the request.
■ A cheaper solution for printing bills should be provided.

Preparation

Ensure that the virtual machine for the computer 2781A-MIA-SQL-01 is running. Also, ensure that you have logged on to the computer. If you need to log on to the computer, use the following credentials:
■ Username: Student
■ Password: Pa$$w0rd


Exercise 1: Translating Business Requirements into SQL Server Services

Introduction

In this exercise, you will identify the SQL Server services needed to support the business requirements for the Fabrikam, Inc., scenario. You will also review the Visio document describing the current solution and, based on the services that you choose, modify the document to reflect the proposed solution.

Select SQL Server services to suit the business requirements

Summary
1. Analyze the business requirements stated by Fabrikam, Inc.
2. Review the initial solution architecture of Fabrikam, Inc.
3. Identify SQL Server services.

Detailed Steps
1. Examine and discuss the shortcomings and requirements of Fabrikam, Inc., stated in the scenario.
2. Open the Fabrikam Inc.vsd Visio document located in the E:\Labfiles\Starter folder, and review it with your partners.
3. For each requirement, select the appropriate SQL Server service and update the Visio diagram.

Modify the solution architecture to meet the business requirements

Summary
■ Modify the solution architecture to meet the business requirements.

Detailed Steps
1. Open the Fabrikam Inc.vsd Visio document located in the E:\Labfiles\Starter folder.
2. Based on the SQL Server services that you chose in the preceding task, modify the Visio document to reflect the proposed solution.

Discussion questions

1. When can you use Notification Services or Service Broker instead of replication?
Notification Services is not adequate for distributing and synchronizing data between sites; it is intended to send notifications to devices and people. Use Notification Services when you need to send notifications based on the occurrence of events. Although Service Broker can be used to distribute and synchronize data between sites, it is not designed for such tasks. If you used Service Broker for distributing and synchronizing data, you would be re-creating features already provided by replication. However, if you need to perform further processing beyond simply distributing and synchronizing data, you should consider using Service Broker.


2. What are some of the best practices that you would follow for presenting the proposed solution to nontechnical business decision makers?
Some suggestions include:
■ Avoid using technical terms that nontechnical people do not understand.
■ Use slides, graphics, and multimedia to help nontechnical people easily understand the solution.
■ In addition to explaining how the solution works, explain the business benefits that it provides.
■ Explain the current business issues that the proposed solution addresses.


Exercise 2: Analyzing the Needs of Real Organizations

Introduction

In this exercise, you will discuss your own experiences of using services in your current or past organizations.

Discussion questions

1. In your current or past organizations, what services did you deploy, and why?
Answers will vary.

2. Have you ever planned to use a service during the design phase and then decided not to use it during deployment? If so, why?
Answers will vary.

3. Have you used (or are you using) these services in a way in which they are not commonly used? If so, how?
Answers will vary.

Important After the discussion, shut down the virtual machine for the computer 2781A-MIA-SQL-01. Do not save the changes.

Module 2
Designing a Security Strategy

Contents:
Lesson 1: Overview of Authentication Modes and Authorization Strategies for SQL Server 2005    2-2
Lesson 2: Designing a Security Strategy for Components of a SQL Server 2005 Solution    2-9
Lesson 3: Designing Objects to Manage Application Access    2-25
Lesson 4: Creating an Auditing Strategy    2-33
Lesson 5: Managing Multiple Development Teams by Using the SQL Server 2005 Security Features    2-39
Lab: Designing a Security Strategy    2-46

Information in this document, including URL and other Internet Web site references, is subject to change without notice. Unless otherwise noted, the example companies, organizations, products, domain names, e-mail addresses, logos, people, places, and events depicted herein are fictitious, and no association with any real company, organization, product, domain name, e-mail address, logo, person, place, or event is intended or should be inferred. Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.

The names of manufacturers, products, or URLs are provided for informational purposes only, and Microsoft makes no representations and warranties, either expressed, implied, or statutory, regarding these manufacturers or the use of the products with any Microsoft technologies. The inclusion of a manufacturer or product does not imply endorsement by Microsoft of the manufacturer or product. Links are provided to third-party sites. Such sites are not under the control of Microsoft, and Microsoft is not responsible for the contents of any linked site, any link contained in a linked site, or any changes or updates to such sites. Microsoft is not responsible for webcasting or any other form of transmission received from any linked site. Microsoft is providing these links to you only as a convenience, and the inclusion of any link does not imply endorsement by Microsoft of the site or the products contained therein.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

©2006 Microsoft Corporation. All rights reserved.

Microsoft, Active Directory, ActiveX, BizTalk, Excel, Microsoft Press, MSDN, MSN, Outlook, PowerPoint, SharePoint, Tahoma, Visio, Visual Basic, Visual C#, Visual SourceSafe, Visual Studio, Windows, and Windows Server are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.

All other trademarks are property of their respective owners.

Module 2: Designing a Security Strategy


Module objectives


After completing this module, students will be able to:
■ Select the authentication mode and authorization strategy for a SQL Server 2005 solution.
■ Design a security strategy for components of a SQL Server 2005 solution.
■ Design objects to manage application access.
■ Create an auditing strategy.
■ Manage multiple development teams by using the SQL Server 2005 security features.

Introduction

This module describes considerations for designing a security strategy for the various components of a Microsoft® SQL Server™ 2005 solution, including considerations for choosing the authentication and authorization strategy and for designing security for solution components such as Notification Services and Service Broker. The module also presents guidelines for designing objects that control database access, such as SQL Server 2005 roles and schemas, stored procedures, and views. In addition, it provides the knowledge required to create an auditing strategy for a solution. Finally, the module teaches you how to manage multiple development teams.


Lesson 1: Overview of Authentication Modes and Authorization Strategies for SQL Server 2005

Lesson objectives


After completing this lesson, students will be able to:
■ Describe the threat posed by SQL injection attacks.
■ Evaluate the considerations for choosing an authentication mode.
■ Describe the process of defining an authorization strategy.

Introduction

Security is a primary consideration when designing and managing database environments. SQL Server 2005 introduces a number of changes to the security features of previous versions and provides new security features that you can implement in a database application. As a developer, you must be familiar with these security features so that you can choose the appropriate security mechanism for each service in a SQL Server 2005 solution.

In this lesson, you will learn about the security threat posed by applications that transmit SQL requests to a database server across a network. You will also learn how to select an appropriate authentication mode and authorization strategy for a SQL Server 2005 solution to reduce the possibility of unauthorized access.


SQL Injection Attacks

Introduction

SQL injection is one of the most commonly exploited security vulnerabilities in database applications. The term SQL injection refers to the technique of manipulating query parameters to obtain unauthorized access to database objects. Security barriers such as firewalls and antivirus software do not protect the database against SQL injection attacks. You must take additional precautions to protect your SQL Server environment from these attacks.

An example

A typical SQL injection attack takes advantage of dynamic SQL statements when the application fails to check the validity of any user input incorporated in the statement. For example, suppose that your Microsoft ASP.NET application implements a simple custom authentication mechanism that expects users to provide their user IDs and passwords to log in. The application uses ASP.NET Forms authentication to collect the user ID and password from the user, and then stores this information in the userId and password variables. Finally, the application uses these variables to dynamically generate the following SQL statement:

"SELECT UserName FROM ApplicationUsers WHERE UserId = '" + userId + "' AND Password = '" + password + "'"

This statement returns the user name corresponding to the user. If the user has supplied an invalid user ID or password, the statement does not return any rows. The application traps this situation and prevents the user from continuing until he or she provides a valid user ID and password. As long as the application generates a valid value for the userId variable, this statement functions as expected. However, an attacker aware of this restriction can supply the following value for the userId variable:

%' OR 1=1; --

The value uses the percentage wildcard (%) to retrieve all users, a semicolon (;) to indicate that the statement has ended, and double dashes (--) to comment out the remaining part of the statement. Assuming that the password is “1234,” the following statement would be generated:

SELECT UserName FROM ApplicationUsers WHERE UserId = '%' OR 1=1; -- AND Password = '1234'

This statement now returns every row in the ApplicationUsers table. If the application simply examines the first row retrieved and assumes that this is the name of the user, the attacker has successfully gained access as this user.

Reducing the scope for SQL injection attacks

This statement now returns every row in the ApplicationUsers table. If the application simply examines the first row retrieved and assumes that this is the name of the user, the attacker has successfully gained access as this user. Reducing the scope for SQL injection attacks

The following list describes some security best practices that will help you to reduce the scope for SQL injection attacks: ■

Check all user input. You should never trust any user input. You must check whether the user provides information in the correct format and that it does not contain any special characters such as %,’--, and so on.



Catch system error messages. Attackers using SQL injection can sometimes obtain information about the structure of a database and the information that it contains by generating malformed queries and examining the resulting error messages. You should catch all database error messages and provide feedback that does not disclose sensitive database details to the user.



Use parameterized queries and stored procedures. You can reduce the scope for a SQL injection attack if you use parameterized queries and stored procedures instead of building your queries dynamically. By using type-safe parameters, you can also ensure that the information sent to SQL Server has the correct data type. Type-safe parameters are parameters that provide type checking and length validation.
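For example, the following minimal sketch rewrites the login lookup from the earlier example as a stored procedure with type-safe parameters. The procedure name follows the hypothetical ApplicationUsers example above and is an assumption, not part of the course files.

CREATE PROCEDURE dbo.GetUserName
    @UserId   nvarchar(50),
    @Password nvarchar(50)
AS
BEGIN
    -- The parameter values are never concatenated into the statement text,
    -- so input such as %' OR 1=1; -- is treated as literal data, not as SQL.
    SELECT UserName
    FROM dbo.ApplicationUsers
    WHERE UserId = @UserId
      AND Password = @Password;
END;

The application then calls this procedure through a parameterized command instead of building the SELECT statement by string concatenation.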

For More Information For more information about SQL injection attacks and the threats they pose, see the article “Stop SQL Injection Attacks Before They Stop You” in the September 2004 issue of MSDN Magazine. This article is also available online on the MSDN® Web site.


Multimedia: Considerations for Choosing Authentication Modes

Introduction

In any organization, you can categorize the users who require access to database resources. Different categories of users might require different levels of access to databases and their contents. In some cases, you might also need to grant access to business partners and customers who are not members of your organization.

For example, consider a Web site that supports online shopping. Customers access the database to make purchases, and business partners access the database to track orders and stock levels. Customers must not be able to amend data in the database directly, and business partners should not have access to any confidential details about the customers. You must use an authentication mode that enables you to maintain database security while granting appropriate access rights to users who might not be members of your organization.

Tip If you are building a solution that includes Internet Information Services (IIS), you can store encrypted ASP.NET user names and passwords in the registry by using the aspnet_setreg tool.

Discussion questions

Read the following questions and discuss your answers with the class:

1. Does your organization implement authentication standards? If so, what benefits and problems have you observed in trying to work with these authentication standards? If not, what benefits and problems have you observed in trying to work without authentication standards?
Answers will vary. In general, you will probably find that organizations that implement a Microsoft Windows® Authentication standard run into fewer authentication-related issues than organizations using a Mixed Mode Authentication standard. You might also find that a student's answer is mixed. For example, authentication standards might vary from one project to the next, depending on the size of the organization and the nature of the projects. This can sometimes lead to users (either within or outside of the organization) having unauthorized access to resources or not having the access they need. Standards might also vary during the development life cycle, which can create problems for applications that rely on a consistent authentication standard from the development phase through implementation.

2. Describe how schemas can help you choose the best authentication mode.
In previous versions of SQL Server, legacy applications used Mixed Mode Authentication. Users in the organization could be granted access to objects based on Windows credentials. External users needed to supply a valid SQL Server user name and password. To make objects accessible to internal and external users, you would typically make these objects belong to the dbo user. In SQL Server 2005, you can use schemas to group database objects. By using Windows Authentication mode and schemas, you can obtain the same functionality along with the benefits of schema separation.

3. Can you use an EXECUTE AS impersonation statement to execute a query between two SQL Server instances with the authentication mode set to Mixed Mode?
For the impersonation to work, you must create logins on both instances of SQL Server. If the logins are SQL Server logins (as opposed to Windows logins), you must synchronize the passwords on both instances.


The Process of Defining an Authorization Strategy

Introduction

Using authenticated logins is the first step to prevent unauthorized users from accessing SQL Server resources, such as databases, HTTP endpoints, or queues. However, users cannot access the SQL Server resources without authorization. Defining an authorization strategy helps you to secure the organization's database resources. You should authorize users by following a policy of least privilege: grant access only to objects and data that users actually need, and keep everything else inaccessible. You can define an authorization strategy by following these steps:
1. Identify the objects and data you need to secure (the securables).
2. Identify the scope of the securables to help identify the users and applications that require access to them.
3. Identify the users and applications that can access the securables (the principals).

Identifying the securables

SQL Server 2005 manages a hierarchical collection of objects known as securables. A securable is an object that users access by using the database engine. The database engine controls and regulates access to these objects. You can secure these resources by granting and denying permissions. The most significant securables are servers, databases, and schemas, but securables also include smaller items such as tables and queues. The following steps will help you identify securables:
1. Start with the database services used by the application, such as Service Broker, Notification Services, or the SQL Server Database Engine. Consider the endpoints used by these services and the interaction between them. Then, identify all the logins created on your server.
2. List the databases and the objects in each database, such as users, roles, and schemas, that are present in your solution.
3. Analyze the objects present in each schema, such as tables, views, and functions.

For example, if you are using Service Broker in your solution, you must first list all the endpoints created for Service Broker, then list all the Service Broker databases, and, finally, list all the objects, such as queues, contracts, and message types, used in these databases.

Identifying the scope of securables

Some securables act as containers for others. For example, databases contain schemas, and schemas contain tables. You can therefore think of securables as defining a set of hierarchies, known as securable scopes. This grouping and nesting occurs automatically. (You cannot create a table that does not belong to a schema, and you cannot define a schema that is not part of a database, for example.) You can apply security at different scopes. The securable scopes are:
■ Servers. These hold securables such as endpoints and databases.
■ Databases. These hold securables such as users, roles, messages, and schemas.
■ Schemas. These hold securables such as tables, procedures, queues, and XML schemas.

It is important that you identify the server and database scopes used by your applications to help identify the principals that need access to them.

Identifying the principals

Principals are the individuals, groups, applications, and service accounts that can request access to SQL Server securables. The scope of influence of a principal depends on the scope of its definition (Windows, server, or database) and whether it is indivisible or a collection. An example of an indivisible principal is a Windows login. An example of a collection principal is a Windows group. To identify the principals that you need to create, you must identify the following items:
1. Users that need access to your solution and the implemented authentication mode. You can then create the logins at the Windows or SQL Server level as needed.
2. Databases that each user needs to access. You can then create a database user principal for each login.
3. Permissions required by each database user. You can then group users into database roles.

Note If your Transact-SQL code needs access to external resources, you can map a SQL Server login or a database user to a credential to authenticate the external access.
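The following Transact-SQL sketch walks through the three numbered steps above for a single principal. All names (the domain account, database, and role) are hypothetical placeholders.

-- Step 1: create a login for an authenticated Windows account (hypothetical name).
CREATE LOGIN [CONTOSO\SalesApp] FROM WINDOWS;
GO
-- Step 2: create a database user principal for the login.
USE SalesDb;   -- assumed database
GO
CREATE USER SalesAppUser FOR LOGIN [CONTOSO\SalesApp];
GO
-- Step 3: group users into a role and grant the role only the permissions it needs.
CREATE ROLE SalesReaders;
GO
GRANT SELECT ON SCHEMA::dbo TO SalesReaders;
EXEC sp_addrolemember 'SalesReaders', 'SalesAppUser';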


Lesson 2: Designing a Security Strategy for Components of a SQL Server 2005 Solution

Lesson objectives


After completing this lesson, students will be able to:
■ Apply security best practices when using HTTP endpoints.
■ Apply appropriate Code Access Security (CAS) for common language runtime (CLR) code integrated into the database.
■ Apply the guidelines for designing security for Reporting Services.
■ Apply the guidelines for designing security for Notification Services.
■ Apply the guidelines for designing security for SQL Server Integration Services.
■ Apply the guidelines for designing security for SQL Server replication.
■ Apply the guidelines for security best practices when using SQL Server Agent and Database Mail.

Introduction

As information becomes more strategic and valuable, the systems storing it become targets for security breaches. Therefore, after choosing the right services for your solution, it is important that you plan for securing those services. By including security policies in the design phase, you can ensure that these policies are implemented during the development process.

In this lesson, you will learn the security considerations and guidelines that you should follow when designing a solution that incorporates the various services available with SQL Server 2005. After completing this lesson, you will be able to design an appropriate security strategy for the components of a SQL Server 2005 solution.


Best Practices for Protecting HTTP Endpoints

Introduction

The ability to provide native HTTP access to the database engine is a powerful SQL Server 2005 feature that makes the server available to a large number of users worldwide. Some of those users might be malicious. Therefore, you should secure access to HTTP endpoints to minimize the risk of attacks on your server.

Best practices for using HTTP endpoints

Keep in mind the following best practices when you use HTTP endpoints:
■ Use Kerberos authentication. You can define an endpoint with one or more of the following authentication types: Basic, Digest, NTLM, Kerberos, and Integrated (NTLM and Kerberos). Kerberos is the most secure because it uses stronger encryption algorithms than the other authentication types, and it identifies both the server and the client. To use Kerberos, you must associate a service principal name (SPN) with the account under which SQL Server is running.
■ Limit endpoint access by using the Connect permission. You should limit endpoint access only to those users or groups that need it. You can accomplish this by granting or denying the SQL Server 2005 Connect permission, as the sketch following this list shows.
■ Use Secure Sockets Layer (SSL) encryption to exchange sensitive data. If you are planning to exchange sensitive data by using HTTP endpoints, use SSL to help ensure the confidentiality of that data.
■ Place SQL Server behind a firewall. If you need to provide HTTP access to SQL Server for Internet users, do not do so directly. Configure SQL Server behind a firewall that protects your communications.
■ Disable the Windows Guest account on the server. The Guest account, when enabled, allows any user to log on to the local computer without having to provide a password.
■ Enable HTTP endpoints only as needed. You should not enable an endpoint if it is not required or currently in use.
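As a minimal sketch of the first two practices, the following statements create a SOAP endpoint that accepts only Kerberos authentication over SSL and then restrict who can connect to it. The endpoint name, path, site, Web method, database, and Windows group are all hypothetical assumptions, not objects from the course files.

CREATE ENDPOINT Orders_Endpoint
STATE = STARTED
AS HTTP (
    PATH = '/sql/orders',
    AUTHENTICATION = (KERBEROS),   -- the most secure of the available types
    PORTS = (SSL),                 -- require encrypted connections
    SITE = 'MIA-SQL'
)
FOR SOAP (
    WEBMETHOD 'GetOrder' (NAME = 'Sales.dbo.GetOrder'),
    BATCHES = DISABLED,            -- do not allow ad hoc Transact-SQL batches
    WSDL = DEFAULT,
    DATABASE = 'Sales'
);
GO
-- Grant access only to the group that needs it.
CREATE LOGIN [CONTOSO\OrderProcessing] FROM WINDOWS;
GRANT CONNECT ON ENDPOINT::Orders_Endpoint TO [CONTOSO\OrderProcessing];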


Best Practices for Using Code Access Security

Introduction

CLR integration is disabled by default in SQL Server 2005. Before enabling CLR integration, you should develop a Code Access Security (CAS) policy to control the operation of managed code used by the databases.

Best practices for using Code Access Security

The following list summarizes best practices for using CAS with CLR code and SQL Server:
■ Enable CLR integration only when needed. You can enable and disable CLR integration by using the Surface Area Configuration tool. You should enable CLR integration only if you plan to use managed code.
■ Set the appropriate CAS policy. Set the CAS policy to meet the following objectives:
  ● User code must not compromise the integrity and stability of SQL Server. Grant only the privileges that an assembly actually requires.
  ● User code should run under the context of the user session that invokes it and with the correct privileges for that context. This helps prevent code from gaining unauthorized access to sensitive data in the database.
  ● Restrict user code from accessing resources outside of the server. Allow code to access data only in local databases.
■ Configure the appropriate SQL Server host policy permission sets. SQL Server supports the following permission sets:
  ● SAFE. The most restrictive permission set. With the SAFE permission set, you permit user code to access only local data and perform only internal computation. It is the recommended permission set.
  ● EXTERNAL_ACCESS. Allows access to external resources such as files and networks.
  ● UNSAFE. Allows unrestricted access to resources. It is the least restrictive permission set.
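The following sketch shows how these choices appear in Transact-SQL: CLR integration is enabled explicitly, and the assembly is cataloged with the most restrictive permission set. The assembly name and file path are hypothetical.

-- Enable CLR integration only when managed code is actually needed.
EXEC sp_configure 'clr enabled', 1;
RECONFIGURE;
GO
-- Catalog the assembly with the SAFE permission set (hypothetical name and path).
CREATE ASSEMBLY StringUtilities
FROM 'C:\Assemblies\StringUtilities.dll'
WITH PERMISSION_SET = SAFE;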


Guidelines for Designing Security for Reporting Services

Introduction

Deploying a secure, distributed enterprise reporting solution is a challenging process. To securely authenticate and authorize users in your reporting environment, you must make critical decisions, such as how to grant access to reports and to data sources that supply important or sensitive data. The guidelines provided in this topic will help you protect your reporting solution.

Guidelines for setting up security for Reporting Services

When designing the security for Reporting Services, follow these guidelines:
■ Use Windows authentication. SQL Server 2005 does not include an authentication component for Reporting Services. By default, Reporting Services uses Internet Information Services (IIS) and Windows security to authenticate user access to a report server. Of the options available, Windows Authentication provides the most security, and you do not need to configure IIS to allow anonymous access.

For More Information If you have business requirements that force you to use custom authentication, you should follow ASP.NET security best practices, including using SSL. For more information about ASP.NET security, refer to the topic “Building Secure ASP.NET Applications: Authentication, Authorization, and Secure Communication” on the MSDN Web site.

■ Use the Reporting Services role-based security model. To manage security effectively, start by using the default security configuration. Grant a minimum set of role assignments to provide access to users, and then follow the principle of setting security by exception, that is, changing or adding security to accommodate special cases.
■ Use SSL to protect sensitive data and login credentials. Reporting Services executes queries against specified data sources as it generates reports, and it requires you to supply login credentials for these data sources. In the following scenarios, you should use SSL to secure this information:
  ● Using custom authentication. If you use custom authentication, users must provide their credentials to access the server running Reporting Services. In this case, follow ASP.NET security best practices, including using SSL, to ensure the privacy of the credentials.
  ● Connecting to external data sources. If your reports connect to external data sources that require additional credentials, you must ensure the privacy of these credentials by using SSL encryption.
  ● Accessing Reporting Services from the Internet. If you access a Reporting Services server from the Internet, using SSL encryption ensures the privacy of your credentials and protects the information provided by the reports.
■ Disable integrated security in report data sources. If you use integrated security for the data source of a report, Reporting Services executes some components by using the privileges of the user generating the report. As a result, a user's security token can be passed to an external data source without the user being notified. In addition, a malicious script can compromise or damage a server.
■ Back up encryption keys. Encryption keys are used to secure stored credentials and connection information for reports. Encryption keys are created either during the initialization of the report server as part of the setup or during the configuration of the report server. It is important to create a backup of the symmetric key and to know when and how to use the backup when deploying or recovering a Reporting Services server.


Guidelines for Designing Security for Notification Services

Introduction

Notification Services implements security by using database roles and restricted database user accounts. The login accounts used by the Notification Services engine and by client applications use either Windows Authentication or Mixed Mode Authentication to access SQL Server. Logins gain database access through database user accounts and then obtain the necessary permissions on instance and application databases through membership in Notification Services database roles. This topic covers guidelines for designing Notification Services security and recommended practices for improving the security of your Notification Services applications.

Guidelines for designing security for Notification Services

When designing the security for Notification Services, follow these guidelines:
■ Run the Notification Services engine under a low-privileged domain or local account. Do not use the Local System, Local Service, or Network Service accounts, and never use any account in the Administrators group. Note, however, that a delivery protocol might require additional privileges for the account under which the service runs.
■ Ensure that each engine has only the permissions it needs and no more. If you run Notification Services as a single-server deployment, the engine runs all the hosted event providers, generators, and distributors. You should add the account used by the engine to the NSRunService database role. If you scale out the hosted event providers, generators, and distributors across multiple servers, you should:
  ● Add the accounts used by the event providers to the NSEventProvider role.
  ● Add the accounts used by the generators to the NSGenerator role.
  ● Add the accounts used by the distributors to the NSDistributor role.
  The NSRunService role is a superset of the NSEventProvider, NSGenerator, and NSDistributor roles.
■ Secure files and folders. Notification Services uses several files and folders to store configuration information, application definition data, and instance configuration.


Although you must grant the Notification Services engine access to these files and folders, you must also limit access to them so that malicious users cannot compromise security by modifying configuration information or submitting bogus event data. You can secure these files and folders at the operating system level by using one or both of the following methods:
  ● Using NTFS permissions to restrict access to the folders and all of their files to only the users that need to access them.
  ● Using the Encrypting File System (EFS) to encrypt specific files and folders and prevent unauthorized access to data.
■ Validate all user input. If your application uses subscriber, subscriber device, or subscription information in a protocol field, you must validate user input, because Notification Services cannot validate protocol header fields. If you do not validate this input, you expose the Notification Services applications to SQL injection vulnerabilities.
■ Follow good security practices for custom application component user names and passwords. If you need to develop a custom application component such as an event provider, keep general good security practices in mind. For example, use the Data Protection Application Programming Interface (DPAPI) to encrypt sensitive information, secure all custom component source files and binary files, and adopt the principle of least privilege when assigning permissions. You can also store encrypted user names and passwords in the registry of the server computer.


Guidelines for Designing Security for SQL Server Integration Services

Introduction

SQL Server 2005 Integration Services implements new security features to ensure the privacy of sensitive data in packages, such as user names and passwords. These new security features protect the integrity of packages and help secure the operational environment. When you design security for Integration Services, you should keep in mind considerations such as whether to encrypt your packages and how to use the new database roles. This topic explains the new security features in SQL Server Integration Services and the considerations that you need to keep in mind when implementing them.

Guidelines for designing security for Integration Services

You should consider the following points when designing security for SQL Server Integration Services:
■ Digitally sign packages. SQL Server 2005 provides the ability to digitally sign SQL Server Integration Services packages. By using digital signing, you can identify and prevent the loading of altered packages. You should digitally sign your packages to prevent malicious or inadvertent changes, and use digital signing when you deploy packages to help ensure that they are not altered after deployment.
■ Encrypt entire packages. You can encrypt your packages by using a password or a user key. Use a user key to protect a package that only you will run. For example, if you are developing a package, using a key enables you to work with the package without needing to repeatedly type the password. The drawback of encrypting packages with a key is that other users cannot open the package unless they log on with your credentials. Using a password is a more flexible approach when several users must share the package. For example, if you have a group of developers who need to work on the same package, the developers can share a single password to access the package.
■ Selectively encrypt sensitive data. You can encrypt the entire package to secure all objects created in the package. SQL Server Integration Services also provides the option of marking objects in a package as sensitive by setting the ProtectionLevel property. This gives you an easy way to selectively encrypt objects in a package. You should protect all sensitive information inside packages, such as logins and passwords supplied to connections.
■ Control access to packages stored in msdb by using new database roles. If you decide to store a SQL Server Integration Services package in the msdb database, you can control access to the package by using three new database roles: db_dtsadmin, db_dtsltduser, and db_dtsoperator.
■ Secure the operational environment. If you choose to save SQL Server Integration Services packages in the file system as XML (.dtsx) files, you should secure the folders and the package files.


Guidelines for Designing Security for SQL Server Replication

Introduction

It is important to understand the considerations for designing security for replication to protect the data and business logic in your applications. When you design security for a replication environment, you should take into account such considerations as the authentication mode to use, the new Replication Agent security model, and how to protect replicated data transmitted across the Internet.

Guidelines for designing security for replication

You should consider the following items when designing a security strategy for replication:
■ Select an appropriate authentication mode. As with the SQL Server database engine, you have two authentication modes available for replication: Windows Authentication and Mixed Mode Authentication. Windows Authentication is the recommended mode, and you should use it when all the servers included in replication are in the same trusted-domain structure. When some of the servers involved in replication are not in a trusted domain, you should use Mixed Mode Authentication.
■ Use Replication Agent security. SQL Server 2005 provides a new replication agent security model that gives you more control over the replication agent accounts used in a replication scenario. You can associate each replication agent with a different security account, thereby providing each agent with only the permissions needed to perform its operations.
■ Use a publication access list (PAL). You should consider using a PAL to manage access to publications. By using a PAL, you ensure that users and replication agent accounts have the necessary privileges to access publications. For scenarios in which you need to manage a large number of users in a PAL, consider creating a Windows group: add all the users to this Windows group, and then add the group to the PAL. (See the sketch after this list.)
■ Protect the snapshot folder. The snapshot folder holds the replicated data and is external to SQL Server. This data is as sensitive as the data in the database, so you should protect this folder and its contents to prevent unauthorized access. The Replication Merge Agent and the Replication Distribution Agent need read access, and the Replication Snapshot Agent needs write access. Keep in mind that if you change the snapshot folder location, you must apply these access rights to the new location.
■ Protect data replicated over the Internet. When you need to replicate data over the Internet, you should use a security technology that ensures the reliability and confidentiality of the data. In SQL Server 2005, you can accomplish this in two ways: by creating a virtual private network (VPN) between the replication servers or by using Web synchronization. When you use merge replication, you should use Web synchronization because it provides SSL encryption through HTTPS. In transactional or snapshot scenarios, you should create a VPN between the replication servers.
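As a minimal sketch of the PAL guideline, the following statements add a hypothetical Windows group (which must already have a login on the publisher) to the publication access list of a hypothetical publication. The statements run in the publication database.

USE SalesDb;   -- assumed publication database
GO
-- Add a Windows group containing the accounts that need access to the publication.
EXEC sp_grant_publication_access
    @publication = N'SalesPub',
    @login = N'CONTOSO\ReplicationUsers';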


Guidelines for Designing Security for SQL Server Agent and Database Mail

Introduction

SQL Server 2005 provides a new security environment for SQL Server Agent. You can execute each SQL Server Agent job step in a different security context by using subsystems and proxy accounts. SQL Server 2005 also provides a new Simple Mail Transfer Protocol (SMTP) mail component called Database Mail. This component has different security requirements from SQLMail, which was available with previous versions of SQL Server.

Guidelines for designing security for SQL Server Agent

Consider the following points when specifying security for SQL Server Agent:
■ Configure the SQL Server Agent service account. The Windows account used to start SQL Server Agent should meet the following requirements:
  ● It must be a member of the sysadmin role.
  ● It must have the following Windows permissions:
    ● Adjust memory quotas for a process
    ● Act as part of the operating system
    ● Bypass traverse checking
    ● Log on as a batch job
    ● Log on as a service
    ● Replace a process level token

Important The account used by SQL Server Agent does not need to belong to the Administrators group.

■ Use proxy accounts and subsystems. In SQL Server 2005, you can run each SQL Server Agent job step with a different security context. You can create proxy accounts and map these accounts to Windows credentials to access external resources. Transact-SQL job steps are always executed in the context of the job owner. You can limit proxy account access by using subsystems. Subsystems are predefined objects that represent a set of functionality available to a proxy account. Examples of subsystems are Operating System and Microsoft ActiveX® Script. You should keep proxy account permissions to a minimum.
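The following sketch shows this proxy pattern with hypothetical names throughout: a credential wraps a Windows account, a proxy maps to the credential, and the proxy is granted only one subsystem.

-- Create a credential that stores the Windows account the job step will run as.
CREATE CREDENTIAL FileShareCredential
    WITH IDENTITY = 'CONTOSO\AgentFileOps',   -- hypothetical account
    SECRET = '<password>';                    -- placeholder; supply the real password
GO
USE msdb;
GO
-- Create the proxy and limit it to the operating system (CmdExec) subsystem.
EXEC dbo.sp_add_proxy
    @proxy_name = N'FileShareProxy',
    @credential_name = N'FileShareCredential';
EXEC dbo.sp_grant_proxy_to_subsystem
    @proxy_name = N'FileShareProxy',
    @subsystem_name = N'CmdExec';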

Guidelines for designing security for Database Mail

Keep the following considerations in mind when specifying a Database Mail environment:
■ Enable Database Mail only if you need it. Enable Database Mail only if you need an SMTP mail subsystem. Database Mail is disabled by default, but you can enable it by using the Surface Area Configuration tool.
■ Use private Database Mail profiles. If you define a public Database Mail profile, all users of the msdb database can access the profile. When you define a private profile, you can select which database users can gain access to the profile. You must also make these users members of the DatabaseMailUserRole database role in msdb.
■ Restrict attachment size and file extensions. Consider limiting the size of attachments that a Database Mail profile can send by configuring the attachment size governor. You can also maintain a list of prohibited file extensions, which prevents Database Mail from sending attachments with those extensions.
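For example, the attachment size governor and the prohibited extension list can be set with the sysmail_configure_sp system procedure in msdb; the values shown here are illustrative only.

USE msdb;
GO
-- Limit attachments to roughly 1 MB (the value is in bytes).
EXEC dbo.sysmail_configure_sp
    @parameter_name = N'MaxFileSize',
    @parameter_value = N'1000000';
-- Refuse to send attachments with these extensions (example list).
EXEC dbo.sysmail_configure_sp
    @parameter_name = N'ProhibitedExtensions',
    @parameter_value = N'exe,dll,vbs,js';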


Practice: Designing a Security Strategy for Components of a SQL Server 2005 Solution

Introduction

In this practice, you will identify the security configuration of the SQL Server services required to support the business requirements presented in the following three scenarios. Read each scenario and the requirements in its table, and then summarize in the Solution column how you will meet each requirement.

Scenario 1

Your ASP.NET application supports four types of users: registered users, anonymous users, employees, and business partners. Each user group requires a different level of access to the organization's data. The data is stored on three SQL Server database servers, with one database on each server. The first database supports anonymous and registered users, the second database supports the employees, and the third database supports the business partners. The business partner database and the employee database include stored procedures that access data in the first database.

The following table contains a list of requirements that you must consider for this scenario. You must design the necessary solutions to meet these requirements. For each requirement, provide a summary of your solution.

Requirement: You must authenticate employees through your Microsoft Active Directory® directory service domain and provide them with access without prompting them for additional credentials.
Solution:

Requirement: You must provide the stored procedures with secure access to the first database.
Solution:


The following table shows a possible solution.

Requirement: You must authenticate employees through your Active Directory domain and provide them with access without prompting them for additional credentials.
Solution: Use Windows Authentication, and create logins on the employee server for each employee's Windows account. Using Windows Authentication causes employees to be authenticated by Active Directory, so employees do not need to provide additional credentials. For the other two database servers, use Mixed Mode Authentication, and create SQL Server logins to authenticate users.

Requirement: You must provide the stored procedures with secure access to the first database.
Solution: Use the EXECUTE AS clause in the stored procedures to implement impersonation. Create a user in the first database, and configure it with only the privileges needed by the stored procedures. Specify this user in the EXECUTE AS clause.
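A minimal sketch of that second solution follows. The user, table, and procedure names are hypothetical, and the details of cross-database impersonation (such as database trust settings) are omitted.

-- In the first database: a low-privileged user with no login of its own.
CREATE USER ProcExecutor WITHOUT LOGIN;
GRANT SELECT ON dbo.CustomerAccounts TO ProcExecutor;   -- hypothetical table
GO
-- The procedure runs with ProcExecutor's permissions, not the caller's.
CREATE PROCEDURE dbo.GetAccountSummary
WITH EXECUTE AS 'ProcExecutor'
AS
BEGIN
    SELECT AccountId, AccountName
    FROM dbo.CustomerAccounts;
END;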

Scenario 2

You work for an Internet bookstore that enables customers to submit orders by using an online Web application. The application does not confirm the order immediately. Instead, customers receive a confirmation e-mail after the application has processed the order. The application uses Notification Services to send the confirmation e-mails. You have installed Notification Services on one server in the same internal domain as SQL Server.

The following table contains a list of requirements that you must consider for this scenario. You must design the necessary solutions to meet these requirements. For each requirement, provide a summary of your solution.

Requirement: You must send the e-mail notifications through an external SMTP server.
Solution:

Requirement: You must provide the least privilege needed to the Notification Services service account.
Solution:

The following table shows a possible solution.

Requirement: You must send the e-mail notifications through an external SMTP server.
Solution: Use Windows Authentication for Notification Services because it is in the same domain as SQL Server. Notification Services can access the databases by using the service account. Verify that the account has the minimum privileges necessary to send e-mails through the external SMTP server. If you use the local IIS SMTP service, you must use a service account that is a member of the local Administrators group.

Requirement: You must provide the least privilege needed to the Notification Services service account.
Solution: Add the service account to the NSRunService database role.


Scenario 3

You work for a finance firm that has acquired a travel agency. You must import data from a SQL Server 2005 database on the travel agency server to your database on a scheduled basis. The travel agency's domain and your domain do not trust each other. You plan to create an Integration Services package on the travel agency server to export the data.

The following table contains a list of requirements that you must consider for this scenario. You must design the necessary solutions to meet these requirements. For each requirement, provide a summary of your solution.

Requirement: You must provide privacy for the Integration Services package.
Solution:

Requirement: You must determine an authentication strategy.
Solution:

Requirement: You must avoid encrypting the entire package.
Solution:

The following table shows a possible solution.

Requirement: You must provide privacy for the Integration Services package.
Solution: Use password encryption because the package is stored on the travel agency server.

Requirement: You must determine an authentication strategy.
Solution: Use Mixed Mode Authentication because the domains are not trusted. Create a SQL Server login on the server in your domain to provide access for the package.

Requirement: You must avoid encrypting the entire package.
Solution: Use the ProtectionLevel property to encrypt only the credential data in the package.

Discussion questions

1. You work with several databases that use Mixed Mode Authentication. In what circumstances would you continue using Mixed Mode Authentication, and in what circumstances would you switch to Windows Authentication?
Answers will vary. The purpose of this question is to discuss scenarios in which Mixed Mode Authentication is a valid option. For example, if the client application and the database server are in trusted domains, use Windows Authentication. If the application and database server reside in separate, non-trusted domains, use Mixed Mode Authentication.

2. You deploy Notification Services with the distributors scaled out across multiple servers. To which role should you add the distributor service accounts?
Add the service accounts to the NSDistributor role (rather than to the NSRunService role). The distributor needs permissions only to execute stored procedures that read and update the notification and distribution work tables. Making the service accounts members of the NSRunService role opens the database server to unnecessary vulnerabilities. By adding the accounts to the NSDistributor role, you are applying the principle of least privilege.

3. You are deploying an Integration Services package for an internal user in your organization. What steps should you take to ensure the security of the package?
Digitally sign the package to ensure that it cannot be modified, and use the ProtectionLevel property to protect sensitive data in the package.


Lesson 3: Designing Objects to Manage Application Access

Lesson objectives


After completing this lesson, students will be able to:
■ Apply the guidelines for using SQL Server 2005 roles and schemas.
■ Explain the considerations for using the code execution context.
■ Apply the security guidelines for using stored procedures.
■ Apply the security guidelines for using views.
■ Apply the security guidelines for protecting columns.

Introduction

When you design a security strategy for your SQL Server services, you must take into account the objects that your applications will access. For example, you should consider how you will design schemas and roles to group data objects and users into manageable units. You should also look at how you can use views and stored procedures to provide secure data access so that you do not have to grant permissions directly on base objects.

In this lesson, you will learn how to design database objects to manage application access, along with guidelines for using roles, schemas, stored procedures, and views. Finally, you will learn the considerations for managing the execution context of code in the database.


Guidelines for Using SQL Server 2005 Roles and Schemas

Introduction

SQL Server databases can support a large number of users. As a result, you must know how to design your solution to effectively manage user access while maintaining a secure environment. SQL Server 2005 provides roles and schemas to help you to achieve this. A role is a container for grouping multiple user accounts. A schema is a set of database objects that form a namespace. This topic covers guidelines for using roles and schemas in your security strategy.

Guidelines for using SQL Server 2005 roles and schemas

Use the following guidelines when working with roles and schemas:
■ Use roles to access SQL Server databases. When planning roles, use the following guidelines:
  ● Use roles for all access. When you use roles, you can manage a large number of users easily because you can group multiple users together. Using roles, you can easily specify the privileges that a number of users have within a database. For example, suppose you have 1,000 users who need to access the same data. If all users require the same type of access, you can create one role and assign permissions to that role instead of managing user permissions individually.
  ● Use container roles that contain child roles. You can nest roles; a role can be a member of another role. This feature enables you to create subgroups of users who are part of a larger group.

■ Classify applications by access requirements. SQL Server 2005 provides roles and schemas for managing the access requirements of applications. To make full use of schemas and application roles, classify your applications by their access requirements, and select the object that is best suited to your requirements:
  ● Schemas. You can create a schema for each application. Create a role that acts as the schema owner, and then make the application users members of this role. You can assign permissions for this schema.
  ● Application roles. You can use application roles to limit access to specific data to users who connect through a particular application. Unlike fixed database roles, application roles contain no members and are inactive by default. Use application roles in environments in which security requirements are the same for all users.

Note You should not use application roles if you need to implement an audit strategy. With application roles, you cannot audit individual user activity, only the activity of the application.

■ Classify users by access requirements. Group users into categories based on their access requirements. When appropriate, determine which groups can be nested within larger groups. For example, users from the Finance group and the Human Resources group can be nested into the Employees group. Create the appropriate database roles based on your groups.

■ Document roles and role membership information. Maintain a list of the roles defined in each database and the users who belong to those roles.
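The following Transact-SQL sketch illustrates the grouping and nesting guidelines above. All role names, the Sales schema, and the password value are hypothetical assumptions, not objects defined by the course.

CREATE ROLE Employees;
CREATE ROLE Finance;
CREATE ROLE HumanResources;

-- Nest the departmental roles inside the larger Employees role.
EXEC sp_addrolemember 'Employees', 'Finance';
EXEC sp_addrolemember 'Employees', 'HumanResources';

-- Grant shared permissions once, at the container-role level.
GRANT SELECT ON SCHEMA::Sales TO Employees;

-- An application role for connections made through one specific application.
CREATE APPLICATION ROLE FinanceApp
    WITH PASSWORD = 'P@ssw0rd!', DEFAULT_SCHEMA = Sales;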


Multimedia: Considerations for Using the Code Execution Context

Introduction

In SQL Server 2005, you can control the execution context of stored procedures and user-defined functions (UDFs) by using the EXECUTE AS clause in the module definition. The clause specifies the user account that SQL Server uses to validate permissions on objects referenced by the code in the module. As a result, you can permit users to run a module as if they were authenticated as different users. This way, you can avoid ownership chaining and exposing base objects to unintended access.

Discussion questions

1. List some of the scenarios in which you can use the EXECUTE AS clause.

You can use the clause to create custom permission sets. Consider a situation in which you want to delegate the ability to truncate a table to user A. However, the ability to truncate is not a grantable permission. The closest permission is ALTER, but ALTER increases the scope of permitted tasks beyond what is needed. Instead, create a stored procedure that truncates the table, add the EXECUTE AS clause to the stored procedure, and specify a user account that has the ALTER permission. Then grant the EXECUTE permission to user A. (A minimal sketch of this scenario follows question 2 below.) You can also use the EXECUTE AS clause in situations in which you want to avoid ownership chaining and avoid granting permissions on referenced objects.

2. What is the difference between an ownership chain and an execution context?

SQL Server uses ownership chains to verify permissions only when you run SELECT, INSERT, UPDATE, DELETE, and EXECUTE statements. If you run CREATE TABLE, TRUNCATE TABLE, BACKUP DATABASE, or EXEC (dynamic SQL) statements, ownership chains do not apply, and SQL Server uses the execution context to determine whether you have the necessary permissions.
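The following sketch illustrates the custom permission set described in question 1. The table, procedure, and user names are hypothetical; the script assumes that the user TableOwnerUser already holds ALTER permission on the table.

-- The procedure runs as a user who holds ALTER permission on the table.
CREATE PROCEDURE dbo.Truncate_OrderStaging
WITH EXECUTE AS 'TableOwnerUser'
AS
    TRUNCATE TABLE dbo.OrderStaging;
GO

-- UserA can now truncate the table only through the procedure.
GRANT EXECUTE ON dbo.Truncate_OrderStaging TO UserA;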


Guidelines for Using Stored Procedures

Introduction

When developing an application that requires access to SQL Server, you must decide whether the application should communicate with the database by using stored procedures or by using SQL code embedded in the application. From a security viewpoint, the use of stored procedures has more benefits than the use of embedded SQL code. For example, stored procedures can help you avoid SQL injection attacks, prevent unauthorized access, and manage execution contexts. In this topic, you will learn guidelines for using stored procedures to protect the data in a database.

Guidelines for using stored procedures

Use the following guidelines when working with stored procedures:

■ Use stored procedures for all application data access. You should not let your applications use SELECT, INSERT, UPDATE, or DELETE statements to query and manipulate your database directly. Instead, you should provide stored procedures that perform the required operations. There are several reasons for using this approach, including:

  ● Making it easier to protect the data in the database. You can avoid assigning permissions directly on base tables and views.

  ● Providing you with more control over the operations that applications can perform.

  ● Decoupling applications from the database schema, enabling you to modify the database schema without having to update the applications. You just need to update the stored procedures.

  ● Encapsulating operations that otherwise might expose sensitive data that should be hidden from the user or application.

  ● Making query optimization easier. You can tune the stored procedures rather than having to tune the SQL statements used by each application.

  ● Reducing network traffic by encapsulating logic in the server rather than in applications.


■ Use stored procedures to reduce the risk of SQL injection attacks. Any client-side application that builds SQL statements from user input makes the server vulnerable to injection attacks. You can reduce this risk by using stored procedures or parameterized queries for all data access. However, applications should still validate all user input for data passed as parameters to stored procedures.



■ Use encrypted stored procedures. You can encrypt stored procedure definitions by specifying the WITH ENCRYPTION clause in the CREATE PROCEDURE or ALTER PROCEDURE statement. Before you encrypt a stored procedure, be sure to save its definition in a SQL Server project as part of a version control process.

■ Document owners and groups that use stored procedures. Document the ownership of stored procedures to provide information about the namespace that developers should use to access them. Also, document the users and groups that have permissions to run the stored procedures. This provides you with a reference document that you can use to help resolve permissions issues.

■ Document dependencies between stored procedures and database objects. List all stored procedures, tables, views, and functions used by a stored procedure. You can use this list to assess the impact of changes made to the definition of a stored procedure or the objects that it uses. This list also provides the information needed to determine the correct execution context for a stored procedure.
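As a minimal sketch of several of these guidelines, the following hypothetical procedure uses a parameter instead of concatenated SQL, encrypts its definition, and is exposed to a role instead of the base table. All object names are assumptions.

CREATE PROCEDURE Sales.Customer_GetByCity
    @City nvarchar(50)   -- parameterized input, not concatenated SQL
WITH ENCRYPTION          -- hides the definition; keep the source under version control
AS
    SELECT CustomerID, CustomerName
    FROM Sales.Customer
    WHERE City = @City;
GO

-- Grant access to the procedure only, not to the base table.
GRANT EXECUTE ON Sales.Customer_GetByCity TO SalesRole;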


Guidelines for Using Views

Introduction

By using views, you can prevent users from directly accessing tables while still allowing users to work with data. You can also restrict the rows and columns that are available to end users. You can use views to hide database complexity, filter data, and restrict users from accessing base tables. In this topic, you will learn guidelines for using views to protect the data in a database.

Guidelines for using views

Use the following guidelines when working with views:

■ Limit operations on data made available through a view. Use views to limit the access an application has to data. For example, you can specify that a view supports only SQL SELECT statements and that users and applications cannot update data by using the view. You can also use the WITH CHECK OPTION clause in the view definition to force all data manipulation language (DML) statements that run against the view to adhere to the criteria of the SELECT statement that defines the view. Define views to support only the access that applications and users require.

■ Present only the data required. Limit the columns and rows returned by a view to the information actually required by applications. If different applications require different sets of columns, define multiple views.

■ Hide database complexity, and insulate applications against schema changes. Use views to join tables together and to calculate summary values. By using views, you can avoid exposing the table schema to the application. If you modify the table definitions, you do not need to modify the applications that use them, provided that the data exposed by the views retains the same structure.

■ Encrypt view definitions. Encrypt the view definition by specifying the WITH ENCRYPTION clause in the CREATE VIEW or ALTER VIEW statement. After you encrypt a view, you need access to the original source code to change the view, so be sure to document and archive the definitions of encrypted views.

■ Avoid defining views based on other views. Defining views based on other views can lead to performance problems and generate an excessive number of temporary tables. It can also lead to fragility, because a change to one view might require you to regenerate other views that reference it.
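The following sketch applies these guidelines: the hypothetical view exposes only the required columns and rows, encrypts its definition, and uses WITH CHECK OPTION so that DML through the view must keep rows visible to the view. The table, column, and role names are assumptions.

CREATE VIEW Sales.ActiveContracts
WITH ENCRYPTION                    -- protect the view definition
AS
    SELECT ContractID, CustomerID, ContractStatus
    FROM Sales.Contract
    WHERE ContractStatus = 'Active'
WITH CHECK OPTION;                 -- DML must satisfy the WHERE clause
GO

GRANT SELECT ON Sales.ActiveContracts TO FinanceRole;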


Guidelines for Protecting Columns

Introduction

SQL Server 2005 allows you to protect data at the column level by encrypting information stored in that column. However, encrypting data in a column can have a significant impact on performance when applications need to access the data in that column.

Guidelines for protecting data

The following is a list of guidelines for encrypting data at the column level:

■ Encrypt sensitive data only. Encrypt data at the column level only when you need to provide special security protection for that data. For example, you might need to encrypt data at the column level because of business or regulatory requirements.

■ Evaluate limitations when encrypting data at the column level. There are some limitations for data encrypted at the column level:

  ● You cannot sort, filter, or index encrypted data.

  ● You should store encrypted data in VARBINARY columns that are large enough to hold the ciphertext (for example, VARBINARY(128) for short values).

■ Evaluate performance implications. Each time you select from, insert into, or update an encrypted column, the application must perform additional computations to encrypt and decrypt data. You should test encryption to determine its impact on performance.
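The following sketch shows one way to apply these guidelines by using a symmetric key. It assumes that a certificate named AccountsCert already exists in the database and that the plaintext column is nvarchar; the table, column, and key names are hypothetical.

-- Create a symmetric key protected by an existing certificate.
CREATE SYMMETRIC KEY CardKey
    WITH ALGORITHM = TRIPLE_DES
    ENCRYPTION BY CERTIFICATE AccountsCert;
GO

OPEN SYMMETRIC KEY CardKey DECRYPTION BY CERTIFICATE AccountsCert;

-- Encrypt on the way in; the target column is VARBINARY.
UPDATE Sales.Customer
SET CardNumberEnc = EncryptByKey(Key_GUID('CardKey'), CardNumber);

-- Decrypt on the way out; note the extra computation on every access.
SELECT CONVERT(nvarchar(25), DecryptByKey(CardNumberEnc)) AS CardNumber
FROM Sales.Customer;

CLOSE SYMMETRIC KEY CardKey;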


Lesson 4: Creating an Auditing Strategy

Lesson objectives


After completing this lesson, students will be able to:

■ Apply the guidelines for designing an auditing strategy.

■ Evaluate the considerations for choosing a technique for storing auditing information.

■ Apply the guidelines for protecting auditing information.

Introduction

Corporate data must be audited regularly to protect a business from the risks of noncompliance, fraud, and diminished reputation. By managing these risks, you can increase the confidence that clients have in the organization. A database auditing strategy can help organizations manage these risks by identifying suspicious data access. The auditing strategy can provide a comprehensive audit trail to assess the effectiveness of application and system controls. The auditing strategy can also help to verify that security measures are working by creating reports that internal auditors can use to monitor corporate use. In this lesson, you will learn guidelines for designing an auditing strategy for SQL Server 2005. You will evaluate the various considerations for choosing a technique to store auditing information. Finally, you will see guidelines for protecting auditing information.


Guidelines for Designing an Auditing Strategy

Introduction

It is common to find databases in which auditing is switched off to boost database performance. In such cases, if the database is altered, there are no records available to determine the source of any unauthorized activity in the database. However, this does not imply that you should record every activity that occurs in the database (unless regulations or the law require it). Instead, you should define an intelligent auditing strategy that takes into account the nature of the system. You should design the auditing strategy to catch security breaches while maintaining optimal database performance.

Important In all cases, legal and regulatory requirements take precedence over considerations of optimal performance.

Guidelines for designing an auditing strategy

Use the following guidelines when designing an auditing strategy:

■ Determine the types of events to be audited. Excessive auditing often degrades system performance, and the volume of data can make it more difficult to analyze audit trails. You should create an audit statement to help you define the events you need to audit, the level of detail that the audit needs to capture, and the resources (logins, services, databases, and so on) that have to be audited. You should audit only these events and resources. When creating audit statements, you should document only those events that are relevant to the business goals of the organization or that match the technical requirements of the application.


■ Identify your event requirements. Use the following list to help determine which events you should audit:

  ● Identify regulatory requirements. Regulatory requirements such as the Health Insurance Portability and Accountability Act (HIPAA) or the C2 security rating defined in the U.S. Department of Defense Trusted Computer System Evaluation Criteria mandate specific events that you should audit. For example, C2 auditing requires information that goes beyond server-level events such as shutdown or restart. It also extends to successful and failed use of permissions when accessing individual database objects and executing all data definition language (DDL), data control language (DCL), and data manipulation language (DML) statements.

  ● Identify your business requirements. Some business processes require auditing of specific user operations. For example, a bank might want to track employees’ wire transfer operations to ensure that the bank’s funds are not used illegally. This business requirement could indicate that this information should be part of the auditing strategy. To identify the business requirements necessary to create the auditing strategy, you should interview all department managers in your organization. Managers will provide the auditing requirements for each department.

  ● Identify your security requirements. You might need to include security requirements beyond the regulatory and business requirements in the auditing strategy. For example, network security staff might require you to audit the location of all attempted accesses to the database. Your organization might have a security policy to audit all account lockouts. To accomplish this, you should audit these events in SQL Server.

■ Obtain stakeholder approval. It is important to gain and maintain senior management support for creating an auditing strategy for your organization. You must obtain a clear statement of support before you start creating the strategy. You also need to keep senior management informed and involved as you progress with the task. Without such support, employees are less likely to participate in the audit creation process or observe policies.

■ Document how your auditing solution meets the requirements. After identifying the events that you need to audit, you should document the auditing solution. This documentation describes how the auditing solution meets the security, business, and regulatory requirements. You might also detect variations in the auditing requirements of different systems. Ideally, you should try to avoid such variations. If you cannot avoid them, you must record the reason for each variation and have the stakeholders sign off on it.

■ Design an audit review process. You need to examine audit logs regularly to investigate whether any security-breaching incident is in progress or has already occurred. Audit logs provide the evidence of a security violation. When designing an audit review process, you should be able to answer the following questions:

  ● Who will be responsible for managing and analyzing an event?

  ● How often will you analyze events?

  ● How will you report specific incidents of a security breach to management?

  ● How will you preserve the chain of evidence in an audit log?

Note In an actual compliance audit, the auditors will want proof that the reviews were conducted in a timely manner. This is known as a control. Controls must exist; however, audit logs are not good controls unless someone is reviewing the logs.


Considerations for Choosing a Technique for Storing Auditing Information

Introduction

After designing an auditing strategy, you must select an auditing information storage technique that meets all of your auditing requirements. The tools that you use for generating auditing information and the way in which you store audit information can have an impact on the performance of your database.

Techniques for storing auditing information

You can store auditing information in the following ways:

■ Using the tables being audited. You can modify your tables to add a column for storing auditing information in each row. Using this technique, you can easily track the events that occur to each row in that table. For example, you can find information about the user who last modified a record or the time it was last modified. However, the amount of data you can record is limited, and maintaining a full history of all audit records can be difficult.

■ Using a separate table. You can add an audit table (or set of tables) to the database for holding audit information. This technique enables you to store historical audit information at the cost of the additional storage required. Retrieving the audit information for a particular row in a table is more complex than with the previous technique. You can implement triggers to record audit information and provide stored procedures for querying this audit information.


■ Using SQL Profiler. You can audit events that SQL Server generates by using SQL Profiler. You can store auditing information in a table or in a file without writing any code. However, SQL Profiler is a resource-intensive tool. The enhancements made to SQL Profiler in SQL Server 2005 enable you to trace all SQL Server 2005 database engine activities as well as events generated by SQL Server 2005 Analysis Services.

■ Using Service Broker. You can send server events to a Service Broker queue by using trace event notifications. You can use trace events to perform asynchronous auditing for all events that SQL Profiler can capture. You can use this technique to capture most DDL events and many SQL Trace events. You can save the audit event information to a table, and you can arrange for processes to determine the criticality of the event and initiate any predetermined response.

■ Using a third-party auditing solution. A number of third-party auditing tools are available for SQL Server. These solutions help you to store your auditing information without writing audit procedures.
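The following sketch illustrates the separate-table technique described above, using a trigger to record status changes. The audit table, trigger, and column names are hypothetical assumptions.

CREATE TABLE dbo.ContractAudit (
    AuditID    int IDENTITY(1,1) PRIMARY KEY,
    ContractID int,
    OldStatus  nvarchar(20),
    NewStatus  nvarchar(20),
    ChangedBy  sysname  DEFAULT SUSER_SNAME(),
    ChangedAt  datetime DEFAULT GETDATE()
);
GO

CREATE TRIGGER trg_Contract_AuditStatus ON dbo.Contract
AFTER UPDATE
AS
    -- Record one audit row per contract whose status actually changed.
    INSERT INTO dbo.ContractAudit (ContractID, OldStatus, NewStatus)
    SELECT d.ContractID, d.ContractStatus, i.ContractStatus
    FROM deleted AS d
    JOIN inserted AS i ON i.ContractID = d.ContractID
    WHERE i.ContractStatus <> d.ContractStatus;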

Selecting a technique to store auditing information

You need to consider the following points when choosing a technique for storing auditing information:

■ Determining events to audit. When choosing a technique, examine the type of events you need to record. Based on these events, the amount of auditing information storage required should become clear. For example, if you need to track information about only the date and time of the most recent operation on a row and the user who performed it, you can store the information in the table containing the row. In this case, only a small amount of auditing information is stored. If you need to maintain historical data, you can store the audit information in a separate table.

Note If you need to audit server events (such as logins, logouts, or lock information), you must use SQL Profiler, event notifications, or a third-party tool.



■ Changing the database schema to support auditing. You might need to modify the database schema, either by adding columns to tables or by creating additional tables, to accommodate your auditing strategy. However, some situations do not support these schema changes. For example, if you are auditing a legacy application or a vendor-supported third-party application in which you do not have control over the database, you cannot modify table or database schemas. In these scenarios, you should use SQL Profiler or Service Broker event notifications to capture the audit trail.

■ Assessing the impact on performance. Using SQL Profiler to trace events can have a negative impact on your server performance. Using Service Broker event notifications can consume fewer resources and have a smaller effect on performance. You should evaluate a third-party auditing solution to determine its performance impact before using it in a production system.
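As a sketch of the asynchronous, Service Broker-based technique mentioned above, the following script sets up a queue and service and then subscribes to table DDL events. The queue, service, and notification names are hypothetical; the bracketed contract is the one SQL Server 2005 defines for event notifications.

-- Queue and service that will receive the event messages.
CREATE QUEUE AuditQueue;

CREATE SERVICE AuditService
    ON QUEUE AuditQueue
    ([http://schemas.microsoft.com/SQL/Notifications/PostEventNotification]);

-- Send all table DDL events in this database to the service, asynchronously.
CREATE EVENT NOTIFICATION AuditTableDDL
    ON DATABASE
    FOR DDL_TABLE_EVENTS
    TO SERVICE 'AuditService', 'current database';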


Guidelines for Protecting Auditing Information

Introduction

Business, security, or regulatory provisions frequently require you to store audit information for a specific time period. You must design a secure archival strategy for auditing information.

Guidelines for protecting auditing information

Follow these guidelines for protecting auditing information:

■ Prevent unauthorized access to the audit information. To protect auditing information, you must implement a restrictive access policy for the audit logs. The technique that you use for implementing this will depend on the strategy you use for storing your audit logs. For example, if you store this information in the database, you must ensure that only authorized users can query this data. If you store this information in files outside of a database, you must use the operating system to protect the files and folders that comprise the audit log. You must document in your audit statement the users who can access auditing information. You must also document when users access the auditing information or backup tapes.

■ Determine the retention period for auditing information. The auditing information needs to be stored for the time period defined in your auditing policy. This auditing information must be made read-only to prevent modifications. When the time period expires, a destruction mechanism should be executed to destroy the auditing information (unless the law requires that the records be permanently saved).

■ Store external archives in a protected location. If you need to store the audit information on external devices such as tape cartridges, you must select an appropriate storage location. You should also ensure that the storage location can withstand fire and flooding.
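A minimal sketch of the first two guidelines, assuming audit rows live in the hypothetical dbo.ContractAudit table shown earlier:

-- Make the audit trail effectively read-only for ordinary users.
-- (A dbo-owned auditing trigger can still insert through the
-- ownership chain, because permissions are not rechecked there.)
DENY INSERT, UPDATE, DELETE ON dbo.ContractAudit TO public;

-- Grant read access only to a dedicated auditors' role.
CREATE ROLE Auditors;
GRANT SELECT ON dbo.ContractAudit TO Auditors;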


Lesson 5: Managing Multiple Development Teams by Using the SQL Server 2005 Security Features

Lesson objectives


After completing this lesson, students will be able to:

■ Explain the common challenges that you face when working with multiple development teams.

■ Apply the guidelines for managing developer access by using schemas.

■ Apply the guidelines for auditing development by using DDL triggers.

Introduction

Developing large and complex enterprise applications frequently requires cooperation and collaboration between multiple development teams. In some environments, multiple development teams work within the same databases. Databases can contain data critical to the organization, and different teams might need different levels of access to data. Managing developer security privileges and keeping track of the objects that belong to a particular application can become complex tasks. This complexity increases when implementing database changes. These changes can cause system instability if different development teams attempt to introduce database changes that overlap and that are not adequately tested in an integrated environment. SQL Server 2005 provides several features, such as schemas and DDL triggers, to help you organize and manage these issues. These features can assist you in creating a robust and versatile environment for multiple development teams. In this lesson, you will learn how to use security features, such as SQL Server schemas, DDL triggers, and permissions, to manage multiple development teams.


The Common Challenges of Working with Multiple Development Teams

Introduction

When working with multiple development teams, it is difficult to ensure that each developer has only those privileges that are strictly necessary to perform his or her tasks. Dealing with multiple development teams poses certain challenges. You should be aware of these challenges and possible solutions to be able to design a secure system. SQL Server 2005 provides the support needed to define a shared, overarching security model for development teams. SQL Server 2005 also provides schema separation that isolates an object’s name and security context for future changes in a team and provides the ability to assign each team only those privileges that are strictly necessary to perform its tasks.

Planning access security for multiple development teams

When planning access security for different development teams, you need to consider the following points:

■ Controlling different security contexts. If you do not plan developer access security carefully, each group of developers can end up with a different security context. The code written by developers in each group will run in these different security contexts, resulting in an incoherent security model for your application. Moving code from one context to another might cause the code to fail. For example, if one developer group creates database tables and another developer group creates views, you might not be able to use ownership chaining.

■ Managing different object schemas. In earlier versions of SQL Server, object names are qualified with the name of the user who created them, in the form owner.object. In SQL Server 2005, each object is qualified by the schema that owns it, independent of the user who created the object.

Note You will learn how to use schemas in the topic “Guidelines for Managing Developer Access by Using Schemas,” later in this lesson.


■ Planning privileges for developer teams. In many development environments, developers have full administrative access to the computers and databases they are using. Consequently, they create code that runs with full administrative rights but does not work in a more restricted security environment. When the application is deployed, users then need to be granted full administrative rights to run the application successfully. This is clearly a security risk. It is very important to plan the code privilege needs of developer teams before starting development. SQL Server 2005 has a permission hierarchy that enables you to assign only those permissions that are needed by each developer group to perform its tasks.


Guidelines for Managing Developer Access by Using Schemas

Introduction

Managing developer access is essential to protect the information in the database. SQL Server 2005 introduces schemas to help you manage and control access to objects. Schemas enable you to separate objects from the database user that created them: Objects are owned by schemas rather than by users. You can use SQL Server 2005 schemas to manage developer access by grouping objects together and providing a common namespace for related objects.

Guidelines for managing developer access by using schemas

Consider the following guidelines for using schemas to manage developer access to a SQL Server 2005 database:

■ Identify the different developer groups. Analyze the application security needs of each developer group. Then define application groups made up of one or more developer groups, based on the different groupings of application security needs.

■ Map developer groups to schemas. Map the developer groups you defined to schemas. This mapping ensures that all of the objects created by each developer group will be in the corresponding schema. You can then assign permissions to each schema.

■ Use descriptive names to define schemas. When you reference an object in SQL Server 2005, you should qualify the object with its schema, as schema.object. Use descriptive names so that developers can easily identify object groupings by the schema name.




■ Use shared schemas. Multiple users can own a single schema through membership in roles or Windows groups. Using a shared schema for each developer group gives your application a structured namespace for database objects. You can change the owners of a schema at any time with little or no impact on the application components that reference database objects.

■ Use default schemas. Use default schemas to enable developers to store shared objects in a schema created for a specific application, rather than in the dbo schema. Assigning user-created default schemas affects the object name resolution algorithm that the SQL Server query engine uses to locate objects. (See the sketch that follows the notes below.)

Note A default schema is available for globally shared objects within a database. Abstracting users from schemas provides a good approach to managing developer access to database objects.

Note Earlier releases of SQL Server use a lookup precedence that first searches the schema that exactly matches the current user name and then searches the dbo schema. In SQL Server 2005, you can set the default schema to a user-created schema that has only the privileges needed, resulting in a more secure database.
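The following sketch applies these guidelines: a role-owned shared schema with a descriptive name, and a developer whose default schema points at it rather than at dbo. All names are hypothetical.

-- A shared schema owned by a role, so ownership survives team changes.
CREATE ROLE BillingDevs;
GO
CREATE SCHEMA Billing AUTHORIZATION BillingDevs;
GO

-- A developer mapped to the group: the default schema is Billing, not dbo,
-- so unqualified objects the developer creates land in Billing.
CREATE USER DevUser1 FOR LOGIN DevUser1
    WITH DEFAULT_SCHEMA = Billing;
EXEC sp_addrolemember 'BillingDevs', 'DevUser1';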


Guidelines for Auditing Development by Using DDL Triggers

Introduction

Auditing the development process is a good security practice when managing the creation of objects in a database. Auditing also records information about the developers who create objects in the database. In SQL Server 2005, you can audit the operations that developers perform by using DDL triggers that run in response to DDL events.

Creating DDL triggers

You can create DDL triggers by using the CREATE TRIGGER statement. In the following example, the DDL trigger will run whenever a DROP TABLE or ALTER TABLE event occurs in the AdventureWorks database. The DDL trigger will display a message and disallow the requested operation.

USE AdventureWorks
GO
CREATE TRIGGER safety
ON DATABASE
FOR DROP_TABLE, ALTER_TABLE
AS
    PRINT 'You must disable Trigger "safety" to drop or alter tables!'
    ROLLBACK

The following example uses the DDL_LOGIN_EVENTS event group. The trigger displays a message if any CREATE LOGIN, ALTER LOGIN, or DROP LOGIN event occurs on the current server instance. It uses the EVENTDATA function to retrieve the text of the corresponding Transact-SQL statement.

CREATE TRIGGER ddl_trig_login
ON ALL SERVER
FOR DDL_LOGIN_EVENTS
AS
    PRINT 'Login Event Issued.'
    SELECT EVENTDATA().value(
        '(/EVENT_INSTANCE/TSQLCommand/CommandText)[1]', 'nvarchar(max)')


For More Information For more information about how to use EVENTDATA with DDL triggers, see the topic “Using the EVENTDATA Function” in SQL Server 2005 Books Online.

You can create DDL triggers in the master database, and they function just like triggers created in user-defined databases. You can obtain information about database-scoped DDL triggers by querying the sys.triggers catalog view. To obtain information about server-scoped DDL triggers, query the sys.server_triggers catalog view.

Guidelines for auditing development by using DDL triggers

When building database applications, you should monitor the objects and code being created to ensure that items conform to development standards. Development standards can cover topics such as naming conventions, standardizing primary key and foreign key definitions in tables, and defining indexes. You can use DDL triggers to monitor the objects that developers create in the database. Use the following guidelines for auditing development (a combined sketch follows this list):

■ Evaluate whether DDL triggers are appropriate for auditing. You must first establish whether using DDL triggers is appropriate to your development environment. You should familiarize yourself with the events that you can control and monitor by using DDL triggers. In general, you can define a DDL trigger that fires whenever a user executes any DDL statement in the scope of a server or a database.

■ Determine the trigger scope. Different DDL statements have different scopes, and the scope of the trigger depends on the event. Examples of server-level events include login events such as CREATE LOGIN, ALTER LOGIN, and DROP LOGIN. Examples of database-level events include table events such as CREATE TABLE, DROP TABLE, and ALTER TABLE. There are many other DDL events available. A DDL trigger created to respond to a CREATE TABLE event will run whenever a CREATE TABLE event occurs in the database. A DDL trigger created to respond to a CREATE LOGIN event will run whenever a CREATE LOGIN event occurs on the server. Note that database-scoped DDL triggers are stored as objects in the database in which they are created, whereas server-scoped DDL triggers are stored as objects in the master database.

Note DDL triggers respond only to DDL events specified by Transact-SQL DDL statements; they do not respond to DDL-like operations executed by system stored procedures.

■ Identify the events to audit. You must be able to quickly identify the events for each entry in an audit log and be able to search the audit log to locate a specific event. Many audit strategies fail because the design team records an excessively large number of audit events, making the auditing information difficult to analyze. In SQL Server 2005, you can write a trigger that responds to a group of events. There are predefined event groups, such as DDL_TABLE_EVENTS, that cover statements such as CREATE TABLE, ALTER TABLE, and DROP TABLE.

■ Specify the operations to allow. If you use DDL triggers, you can decide whether to allow the change that raised the trigger to become permanent or to undo it. To prevent an operation, the DDL trigger can present an error message to the developer and then roll back the DDL statement. This is a good approach for avoiding unintentional modification or deletion of database objects.
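As a combined, minimal sketch of these guidelines, the following hypothetical trigger logs table DDL instead of blocking it, using the DDL_TABLE_EVENTS group and the EVENTDATA function. The log table name and layout are assumptions, not part of the course files.

CREATE TABLE dbo.DdlEventLog (
    EventID   int IDENTITY(1,1) PRIMARY KEY,
    PostTime  datetime DEFAULT GETDATE(),
    LoginName sysname  DEFAULT SUSER_SNAME(),
    EventData xml
);
GO

CREATE TRIGGER audit_table_ddl ON DATABASE
FOR DDL_TABLE_EVENTS   -- covers CREATE TABLE, ALTER TABLE, and DROP TABLE
AS
    -- EVENTDATA() includes the full Transact-SQL command text, so ALTER
    -- statements are captured together with the new entity definition.
    INSERT INTO dbo.DdlEventLog (EventData)
    VALUES (EVENTDATA());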

For More Information For more information about predefined event groups, see the topic “Event Groups for Use with DDL Triggers” in SQL Server 2005 Books Online.


Lab: Designing a Security Strategy

Objectives




■ Evaluating the security trade-offs of SQL Server services

■ Designing a database to enable auditing

■ Designing schemas and objects to meet security requirements

■ Defending security decisions

Scenario

You are in the process of designing a security strategy for Fabrikam, Inc. Currently, Fabrikam is facing the following security issues:

■ There is no security strategy to audit developer activities. In addition, developers have administrative rights. As a result, the organization faces unexpected security problems when database objects are created or when applications change or are deployed.

■ No auditing process is implemented; therefore, when a security issue occurs, there is inadequate information to accurately identify the source and resolve the security issue.

■ Applications connect to SQL Server by using the SQL Server built-in sa login without a password.

In your design, you must take the following steps:

■ Apply the principle of least privilege to users and development teams.

■ Create a developer group for each application.

■ Allow the Finance department to generate bills and manage customer information in the Accounts database, the Sales department to create contracts and manage customer information, and the Marketing department to manage services information.


■ Define a standard for creating objects that access data. The standard must meet the following requirements:

  ● It must minimize the possibility of an SQL injection attack.

  ● It must hide database complexity from end users.

  ● It must encrypt object definitions whenever possible.

■ Track and maintain a history of updates to the PhoneNumber and the ContractStatus columns. Include in the history the name of the user making a change and the date and time the change is made. Store the information in an audit table.

■ Capture information about who creates and modifies database objects. For ALTER statements, capture the new entity definition. Store the auditing information in another audit table.

■ Maintain the existing table schemas.

Preparation

Ensure that the virtual machine for the computer 2781A-MIA-SQL-02 is running. Also, ensure that you have logged on to the computer. If you need to log on to the computer, use the following credentials:

■ Username: Student

■ Password: Pa$$w0rd


Exercise 1: Evaluating the Security Trade-Offs of SQL Server Services

Introduction

Analyze the SQL Server services used by Fabrikam, Inc.

In this exercise, you will analyze the SQL Server services used by Fabrikam, Inc. and evaluate the security requirements for each service. You will also discuss the security approach with your group members.

Summary

Analyze the Fabrikam, Inc. proposed solution Visio diagram, and then answer the following questions.

Note This document is a copy of the solution to the lab Selecting SQL Server Services to Support Business Needs.

1. How can you use schemas to reduce the security exposure of linked server calls from the Service Broker database to the master Accounts database?

2. The Web server accepts requests for new services from Internet users and processes payment receipts. What approach would you use to ensure that no one is able to exploit the Web server or is able to spoof a payment receipt from the Internet connection?

3. What security options will you use in the HTTP endpoint between the ASP.NET application and the Web database?

4. How do you prevent SQL injection attacks in Notification Services?

5. Merge replication between the central and remote offices uses the Internet. How do you protect this replication?

6. To print bills in remote offices, reports should access the local Accounts databases at that office. What security mechanisms should you use to secure these connections?

Detailed Steps

Open the Fabrikam Inc.vsd Microsoft Office Visio® document located in the E:\Labfiles\Starter folder. Use the model represented in the diagram to evaluate and discuss the security options and implications of different services.

Discussion questions

1. How could you use schemas to reduce the security exposure of linked server calls from the Service Broker database to the master Accounts database?

Possible answers include:

■ Use staging tables to audit changes sent from Service Broker.

■ Allow only activities that include explicitly approved changes (contact info and address) when data is processed from the staging table to the account table.

■ Use stored procedures or parameterized queries to prevent SQL injection attacks. The procedure could use the EXECUTE AS clause to gain access to modify the required tables.


2. The Web server accepts requests for new services from Internet users and processes payment receipts. What approach would you use to ensure that no one is able to exploit the Web server or spoof a payment receipt from the Internet connection?

Possible answers include:

■ Limit endpoint connection permissions to specific users or groups.

■ Use SSL to exchange sensitive data.

■ Verify that the Windows Guest account is disabled on the server.

■ Control and update the endpoint state as needed.

■ Use secure endpoint defaults whenever possible.

3. What security options do you use in the HTTP endpoint between the ASP.NET application and the database?

Possible answers include:

■ Use Kerberos authentication.

■ Use SSL to encrypt communications.

4. How do you prevent an SQL injection attack in Notification Services?

To prevent an SQL injection attack, validate all user input for subscriptions in Notification Services. Notification Services cannot validate protocol header fields. Therefore, if you do not validate this input, you expose your Notification Services applications to SQL injection vulnerabilities.

5. Merge replication between central and remote offices can use the Internet as a medium. How do you protect this replication?

Possible answers include:

■ By using a virtual private network (VPN), you can replicate SQL Server data over the Internet.

■ By using Web synchronization through IIS, you can configure merge replication and specify the Web replication option, which provides the ability to replicate data by using the HTTPS protocol.

6. To print bills in remote offices, reports should access the local Accounts databases at that office. What security mechanisms should you use to secure these connections?

Possible answers include:

■ Use Windows Authentication to authenticate users accessing reports.

■ Disable integrated security in the report data sources.


Exercise 2: Designing a Database to Support Auditing

Introduction

In this exercise, you will design and create tables to implement auditing requirements for the Accounts database. You will generate scripts for your auditing solutions, including DDL triggers that audit ALTER statements. You will also create a summary document that provides an overview of your scripts and auditing strategy.

Create the Accounts database

The remaining exercises in this lab use the Accounts database implemented by Fabrikam, Inc. Perform the following steps to create this database:

1. On the Start menu, click Command Prompt.
2. Move to the folder E:\Labfiles\Starter\Preparation.
3. Execute the script setup.bat.
4. Close the Command Prompt window.

Design and create auditing tables

Summary

1. Review the requirements in the scenario at the start of the lab and the Fabrikam Auditing Strategy.doc document.
2. Complete the Events to Audit section in the document.
3. Review the database tables and stored procedures.
4. Design and create the necessary audit tables.
5. Complete the Audit Schema Tables section in the document.

Detailed Steps

1. Open the Fabrikam Auditing Strategy.doc Microsoft Office Word document located in the E:\Labfiles\Starter folder and review the information in the Solution Overview and Objectives sections.
2. Using SQL Server Management Studio, connect to the MIASQL\SQLINST1 server.
3. Review the Contracts table and the stored procedures that are used to insert, modify, and delete customer information in the Accounts database.
4. Design and implement tables that can be used to implement the auditing requirements stated in the document. Information about the changes to customer contract information and objects modified by using ALTER statements should be stored in separate tables.

Design and create a DDL trigger for auditing

Summary

1. Design and implement a DDL trigger to perform the auditing required by Fabrikam, Inc. whenever database objects are created or modified.
2. Complete the DDL Triggers section in the Fabrikam Auditing Strategy.doc file.


Detailed Steps

1. You need to create a DDL trigger that captures information when database objects are created and modified. For ALTER statements, you need to capture the new entity definition. The DDL trigger will provide data to one of the tables created in the previous task.
2. In SQL Server Management Studio, design and create a DDL trigger to audit DDL statements in the Accounts database.

Generate scripts that update stored procedures for your auditing solution

Summary

1. Generate scripts for your auditing solution that modify the Contracts_update and Contracts_updateStatus stored procedures to audit changes to customer telephone numbers and contract status.
2. Complete the Auditing Scripts section in the Fabrikam Auditing Strategy.doc file.


Detailed Steps

1. Using SQL Server Management Studio, modify the stored procedures Contracts_updateStatus and Contracts_update to track update changes in the PhoneNumber and ContractStatus columns.
2. Execute and save the scripts generated by SQL Server Management Studio for modifying the stored procedures.

Discussion questions

1. What is your experience with implementing functionality similar to DDL triggers without using DDL triggers?

Answers will vary. Students might have used SQL Profiler to audit DDL changes and experienced performance-related issues and difficulty in defining adequate filters. In addition to these problems, students might have experienced difficulties in scanning and parsing the Profiler output for specific events. Students might also mention the use of various third-party applications to perform DDL audits.

2. Could you develop audit scripts from stored procedures?

Yes, it is possible to use triggers and the deleted and inserted tables to access the current and updated data. There is an added risk in trying to audit by using a stored procedure, because there is no fail-proof mechanism to protect data or objects from changes that circumvent the stored procedure that contains the audit instructions.


Exercise 3: Designing Objects to Manage Multiple Applications

Introduction

In this exercise, you will work in small groups to design the objects that manage application access for users and developers. You will use the Accounts database.

Design schemas to group tables by usage

Summary

1. Review the security requirements in the scenario at the start of the lab.
2. Group tables in the Accounts database according to usage.
3. Design and create schemas to group tables together.
4. Design schemas to achieve scenario requirements.

Detailed Steps

1. Open the Fabrikam ERD.vsd Visio document located in the E:\Labfiles\Starter folder, and then group tables according to usage. For more information, see the scenario at the start of the lab.
2. Design schemas based on the groups of tables created.
3. Update the Fabrikam ERD.vsd Visio document with the schema information.

Design roles for objects for application access

Summary

1. Design database roles for objects for application access.
2. Complete the Roles section in the Fabrikam Roles.doc file.

Detailed Steps

1. Based on the schemas defined in the previous task, design the database roles needed to support the Accounts database.
2. Open the Fabrikam Roles.doc document located in the E:\Labfiles\Starter folder. In the Roles section of the document, define the roles to support developer groups.

Define standards for creating objects that access data

Summary

1. Review the requirements in the scenario at the start of the lab and the Fabrikam Security Access.doc document.
2. Define standards for creating views that access data.
3. Define standards for creating stored procedures that access data.
4. Update the Views and Stored Procedures sections of the document with your standards for defining views and stored procedures that help to manage security access.

Detailed Steps

1. Open the Fabrikam Security Access.doc Microsoft Office Word document located in the E:\Labfiles\Starter folder and review the information in the Solution Overview section.
2. In the same document, define the standards for creating objects to meet the requirements in the scenario.

Discussion questions

1. Considering the databases that you currently work with on a daily basis, how would the environment be improved if multiple schemas were implemented? Answers will vary.

2. Assuming that you have identified the databases that would benefit from multiple schemas, what challenges do you foresee in moving those databases to a multiple-schema configuration?

Answers will vary. Possible answers include:

■ Stored procedures, application code, and database references by tools and internal applications within the organization, which are not easily identified from SQL Server, might need to be identified and modified to support the new schemas.

■ Developers must receive adequate training to work with a multiple-schema configuration.

3. After you create the schemas for each developer group, how do you match developer groups with their corresponding schemas?

By using default schemas. Each developer has a default schema that corresponds to the schema associated with the developer group. Using the default dbo schema is not a best practice, because it grants users high privileges. Following the principle of least privilege leads to a practice of using, as the default schema, a user-created schema that has only ordinary rights in the database.


Exercise 4: Defending Security Decisions

Introduction

In this exercise, a representative from each student group from Exercise 3 will present the security decisions made by the group to the rest of the class. When presenting the solution, you should refer to the solution documents that you created in the previous exercises.

Present the proposed solution

Summary

1. Present an overview of the security decisions of the existing solution of Fabrikam, Inc.
2. Provide the reasons for choosing a particular option to meet the business requirements.

Detailed Steps

1. While presenting an overview of the shortcomings of the current solution, refer to the scenario for the requirements and expectations set by the business decision makers.
2. Refer to the Fabrikam Auditing Strategy.doc file that you created in Exercise 2.
3. Refer to the schema modifications in the Fabrikam ERD.vsd file that you modified in Exercise 3.
4. Refer to the Fabrikam Roles.doc file that you created in Exercise 3.
5. Refer to the Fabrikam Security Access.doc file that you created in Exercise 3.

Discussion questions

1. Which portions of your solution can be saved as templates for reuse?

Possible answers are:

■ Audit DDL triggers

■ Audit tables

■ Prototype stored procedures and views

2. Did the presenter make a good case for the proposed changes and explain the walkthrough in a manner that would be easily understood by non-technical business people? Answers will vary.

3. What problems and limitations were identified by you and the members of your group as you went through the exercise of defining a security strategy? Answers will vary.

4. Does your organization have an enterprise auditing strategy? Is your SQL Server auditing strategy included in this enterprise strategy? Answers will vary.


5. In this lab, you used stored procedures and triggers to implement a synchronous auditing strategy. Does SQL Server 2005 provide any technique to implement an asynchronous auditing strategy? Yes. You can use Service Broker and send server events to a Service Broker queue by using trace event notifications. By using this technique, you can audit most DDL events and most SQL Trace events that can be monitored by SQL Profiler.

Important After the discussion, shut down the virtual machine for the computer 2781A-MIA-SQL-02. Do not save the changes.

Module 3

Designing a Data Modeling Strategy

Contents:

Lesson 1: Defining Standards for Storing XML Data in a Solution    3-2
Lesson 2: Designing a Database Solution Schema    3-11
Lesson 3: Designing a Scale-Out Strategy    3-24
Lab: Designing a Data Modeling Strategy    3-33



Module objectives


After completing this module, students will be able to:

■ Define standards for storing XML data in a solution.

■ Design a database solution schema.

■ Design a scale-out strategy for a solution.

Introduction

In this module, you will learn the various considerations and guidelines for defining standards for storing Extensible Markup Language (XML) data in a solution. The module also provides you with information about implementing online transaction processing (OLTP) and online analytical processing (OLAP) functionality, determining normalization levels, and creating indexes. Finally, this module covers the various considerations for designing a scale-out strategy for a solution, concentrating on using multiple data stores, designing a scale-out solution for performance and redundancy, and integrating data stores in a scale-out solution.


Lesson 1: Defining Standards for Storing XML Data in a Solution

Lesson objectives


After completing this lesson, students will be able to:

■ Evaluate the considerations for storing data as XML.

■ Apply the guidelines for storing XML and relational data redundantly.

■ Evaluate the considerations for choosing a column type to store XML data.

Introduction

XML is the standard format for data interchange between applications. Modern applications and databases frequently use data structured as XML documents. You must take the special requirements of XML data into account when planning for archiving, auditing, temporary storage, and other data management aspects. Additionally, storing XML data requires building XML documents and parsing them whenever you retrieve or insert them into the database. Sometimes applications or Web services can perform these tasks, but you need to assess whether these options are best suited for your solution. Microsoft® SQL Server™ 2005 introduces new XML features, such as the XML data type and SQLXML functionality. These features give you additional options for managing XML data. However, you need to understand the trade-offs and best practices for integrating XML data and documents in your solutions. In this lesson, you will learn about XML storage issues, and you will discuss how to address and solve them.


Considerations for Storing Data as XML

Introduction

XML is useful as a standard data format to transfer information between applications and databases. You should carefully evaluate the need to store data as XML documents in a SQL Server database. You need to assess the benefits and drawbacks that arise when using XML data so that you can make an informed decision about whether to store data in this format in a database. The following sections describe some common scenarios in which you might use XML data, focusing on when to use or not use this format.

Storing data as XML

The following list describes the considerations you should take into account when deciding whether to store data as XML in a SQL Server database:

■ Understanding the structure of the XML data. If the data has a recursive or hierarchical structure, you can parse it to convert it to a self-referenced table by using some Transact-SQL coding, stored procedures, and functions. However, this can become a difficult task that consumes considerable processing power when retrieving and updating data.

■ Using XML schema definition (XSD). If an XML document uses a schema that includes rules such as MinOccurs or Choice or uses attributes extensively, you might find it difficult to split this data into rows in relational tables and to reconstruct the XML data from the relational tables. Some XSD options do not have a matching declarative relational integrity feature, which means that you have to provide additional code to implement this functionality.

■ Supporting <any> elements. If an XML document contains <any> elements, you cannot anticipate the structure of this data. This makes it difficult to plan a table structure for holding this data.

■ Sending data as XML to client applications. If you need to send the same XML repeatedly to client applications, it is more efficient to store the data as XML than to construct the data for every incoming request.

■ Splitting and combining XML data. If client applications expect the data set as a whole rather than a subset of the data or data that is joined with other information, you should consider storing the data as XML.


■ Updating XML data. If you need to store XML data for archiving purposes only and applications will not modify the data, you should retain the original XML document structure.

■ XML is a non-binding transfer medium. XML is an open standard and, as such, is technology independent. It is ideal when you need to transfer data between organizations, because they will often have heterogeneous systems. It is not ideal when changes to separate systems need to be instantaneous; it is better suited to loosely coupling applications.

When is it beneficial to store data as XML?

The following list describes some scenarios in which storing data as XML is more convenient than using a relational format:

■ The order of elements is important. When maintaining element order is important, it is simpler to store the original XML document than to store this information in a separate column in a table, because you must also maintain information about the sequence of the data.

■ The data is based on a volatile XML schema. If the XML schema for a document changes frequently, it can be difficult to update the structure of the data if you store it as columns in a table. You might also need to maintain different versions of the same document with different schemas.

■ Documents require validation. If you must use an XML schema to validate a document before storing it in the database, you will find it more convenient to store the data in its original XML format.

When is it detrimental to store data as XML?

The following list describes some scenarios in which you should not store data as XML:

■ The data contains elements that are queried or updated frequently. If you store data as an XML document in a SQL Server database and applications frequently query or modify only certain elements of that data, the SQL Server engine will need to look for that data in the XML document, load it, and parse it by using the XML Query language (XQuery). Using XQuery is many times more expensive than performing a relational query.

■ The data requires indexes. If you need to optimize query performance for XML data, you will probably need to create indexes over XML columns. This is an expensive operation and adds significant overhead when any of the XML data changes.

■ The documents must support concurrent access. Many different users might attempt to access the same XML data in the database simultaneously. The rules for locking XML data are the same as for relational data. The smallest piece of data that SQL Server can lock is a row. If a row contains an XML document, the entire XML document will be locked, and concurrency issues could arise.


Guidelines for Storing Redundant XML Data

Introduction

After you have determined that storing data as XML is feasible for your solution, you should consider how to optimize this data for query purposes. Querying XML data is not an easy task, and you should avoid using such queries whenever possible. Sometimes you will find it beneficial to introduce redundancy to improve the performance of queries. In this topic, you will learn some guidelines for storing redundant XML data for optimizing queries.

Adding a redundant column to hold frequently queried XML elements

If you regularly need access to a specific element in an XML document stored in a column in a SQL Server table, you should consider adding an additional column to the table to hold a copy of this element. This will make it easier to query this information without needing to perform complex XQuery search operations through the entire XML document. The cost of this approach is the need to extract the element from the XML document whenever you insert or update the XML column, although you can use a trigger to automate this process. You should also remember that an XML document can contain repeating elements, and you need to consider how to represent multiple element values for a single XML document.
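
For example, the following Transact-SQL sketch adds a redundant PartnerCode column and populates it from the stored documents. The ServiceRequests table, the RequestXml column, and the /Request/PartnerCode path are hypothetical names used for illustration only; the same UPDATE logic could be wrapped in the trigger described above to keep the column synchronized automatically.

-- Hypothetical table: PartnerCode redundantly stores an element
-- extracted from the XML document.
CREATE TABLE dbo.ServiceRequests
(
    RequestID   int IDENTITY(1,1) PRIMARY KEY,
    RequestXml  xml NOT NULL,
    PartnerCode nvarchar(20) NULL
);
GO

-- Populate (or refresh) the redundant column from the XML documents.
UPDATE dbo.ServiceRequests
SET PartnerCode = RequestXml.value('(/Request/PartnerCode)[1]', 'nvarchar(20)');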

Storing nonrepeating XML elements in computed columns

An alternative strategy for handling nonrepeating elements is to add a computed column to the table holding the XML data. You can specify or create a function that extracts the element data from the original XML column and then create an index on this computed column. This strategy reduces the amount of redundancy at the cost of the additional processing required when querying data.
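
The following sketch, which reuses the hypothetical ServiceRequests table from the previous example, illustrates this approach. Because the xml data type methods cannot be called directly in a computed column definition, the extraction logic is wrapped in a schema-bound scalar function.

-- Schema-bound function that extracts a single, nonrepeating element.
CREATE FUNCTION dbo.fn_ExtractPartnerCode (@doc xml)
RETURNS nvarchar(20)
WITH SCHEMABINDING
AS
BEGIN
    RETURN @doc.value('(/Request/PartnerCode)[1]', 'nvarchar(20)');
END;
GO

-- Computed column based on the function, plus an index to support searches.
ALTER TABLE dbo.ServiceRequests
ADD PartnerCodePromoted AS dbo.fn_ExtractPartnerCode(RequestXml);
GO

CREATE INDEX IX_ServiceRequests_PartnerCodePromoted
ON dbo.ServiceRequests (PartnerCodePromoted);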

Storing repeating XML elements in a separate table

If an XML document contains repeating elements, you should create an additional table for holding this data. Add a column to this table for storing the XML element, and add a foreign key that references the row containing the original XML data. You can create triggers on the table holding the original XML data to maintain the data in the new table whenever XML values are modified, inserted, or deleted.
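
A minimal sketch of this design follows, again using hypothetical names; a repeating <Item> list inside each request is shredded into a child table with a foreign key back to the original row. In production, the INSERT logic shown here would typically run inside the triggers described above.

-- Child table: one row per repeating element in the document.
CREATE TABLE dbo.ServiceRequestItems
(
    RequestID int NOT NULL
        REFERENCES dbo.ServiceRequests (RequestID),
    ItemCode  nvarchar(20) NOT NULL
);
GO

-- Shred the repeating /Request/Items/Item elements into the child table.
INSERT INTO dbo.ServiceRequestItems (RequestID, ItemCode)
SELECT sr.RequestID,
       item.node.value('(ItemCode)[1]', 'nvarchar(20)')
FROM dbo.ServiceRequests AS sr
CROSS APPLY sr.RequestXml.nodes('/Request/Items/Item') AS item(node);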


Considerations for Choosing a Column Type to Store XML Data

Introduction

You can store XML data in a SQL Server 2005 database by following several different strategies. SQL Server 2005 provides several mechanisms that you can use for transforming XML data into relational data and back again. Each strategy is suitable for different situations, and each has different benefits and drawbacks. This topic concentrates on two common techniques that you can use.

Storing XML data as an xml data type column

SQL Server 2005 provides the xml data type. You can use this data type for a column in a table. This data type has the following benefits:

■ xml data type columns support typed and untyped XML data. If you create a typed XML column, you can provide a schema that SQL Server will use to validate data in this column. If the data does not match the schema, SQL Server will not store it.

■ SQL Server 2005 provides several methods and functions specifically for acting on the xml data type. These include the query method for performing an XQuery operation, the modify method for changing the contents of an XML document, and the nodes method for shredding an XML document into a more relational format.

■ SQL Server 2005 enables you to create XML indexes over XML columns, improving the performance of some types of query operations.

The main drawback with using the xml data type concerns the overhead of performing validations. Validation occurs every time you insert or modify data in a typed XML column. If you have a set of complex schemas, the processing required can be considerable. Additionally, the xml data type does not guarantee byte-by-byte consistency with respect to the original document, so you might find that the document you retrieve from an XML column is not identical to the original document that you inserted.
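
The following sketch pulls these pieces together on a hypothetical ContractRequests table: an untyped xml column, a primary XML index, and the query, value, and modify methods. (A typed column would additionally reference an XML schema collection; all names and paths here are illustrative assumptions.)

CREATE TABLE dbo.ContractRequests
(
    RequestID  int IDENTITY(1,1) PRIMARY KEY,
    RequestXml xml NOT NULL  -- untyped; a typed column would name a schema collection
);
GO

-- A primary XML index can speed up XQuery operations on the column.
CREATE PRIMARY XML INDEX PXML_ContractRequests_RequestXml
ON dbo.ContractRequests (RequestXml);
GO

-- query() returns an XML fragment; value() extracts a scalar.
SELECT RequestID,
       RequestXml.query('/Request/Customer') AS CustomerFragment,
       RequestXml.value('(/Request/@type)[1]', 'nvarchar(20)') AS RequestType
FROM dbo.ContractRequests;

-- modify() changes the stored document in place by using XML DML.
UPDATE dbo.ContractRequests
SET RequestXml.modify('replace value of (/Request/@type)[1] with "upgrade"')
WHERE RequestID = 1;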

Storing XML data as an nvarchar or a varchar column

Prior to SQL Server 2005, you could store XML data only by using columns based on the long data types, such as nvarchar and varchar. Although SQL Server 2005 now adds the xml data type, using an nvarchar or varchar column to store XML data might still be suitable for the following reasons:

■ Storing an XML document as an nvarchar or a varchar value involves little overhead and thus provides the best performance when storing and retrieving the entire document.

■ You can use the long data types varchar(max) and nvarchar(max) to hold up to 2 gigabytes (GB) of data. Prior to SQL Server 2005, long columns required special handling. SQL Server 2005 now enables you to treat values in these columns like ordinary values, and you can query them and compare them by using ordinary SQL statements.

■ If most queries use only a limited number of elements, you can use property promotion to promote these elements into relational columns while the bulk of the data remains in the xml data type. The promoted column could be a computed column based on the XML data but stored as varchar or nvarchar.

Note that using an nvarchar or a varchar column does not provide you with any additional XML functionality. You should use this approach only if you simply need to store and retrieve an XML document. If you need to perform any validation or additional processing, use the xml data type.
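
For example, the following sketch (hypothetical names) stores documents verbatim in an nvarchar(max) column. The text round-trips exactly as received, can be searched with ordinary string predicates, and can still be cast to xml on the rare occasions that XQuery is needed.

CREATE TABLE dbo.RequestArchive
(
    RequestID  int IDENTITY(1,1) PRIMARY KEY,
    RequestDoc nvarchar(max) NOT NULL  -- document preserved exactly as received
);
GO

-- Ordinary SQL statements work against the long column...
SELECT RequestID
FROM dbo.RequestArchive
WHERE RequestDoc LIKE N'%<PartnerCode>FB01</PartnerCode>%';

-- ...and an explicit cast exposes XQuery when it is occasionally required.
SELECT CAST(RequestDoc AS xml).value('(/Request/@type)[1]', 'nvarchar(20)')
FROM dbo.RequestArchive;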


Practice: Defining Standards for Storing XML Data in a Solution

Introduction

In this practice, you will examine three scenarios, analyze the requirements of each scenario, and select a strategy for storing XML data. Each scenario includes a number of questions that you will discuss with the rest of the class.

Scenario 1

Fabrikam, Inc., receives service requests from different partners. Before accepting service requests, Fabrikam follows an approval workflow. Managers process requests sorted by the date and time of receipt, the partner code, and the type of service requested. This scenario has the following requirements:

■ Prior to approval, the original document containing the request must be accessible by the managers. Document consistency is not a necessity, but a solution that preserves the document is preferred.

■ Managers must be able to sort and filter requests by the date and time of receipt, the partner code, and the type of service requested.

■ On approval, the data from the request is entered into the service planning system. There is no need to retain the original requests for tracking or archiving purposes.

Discussion questions

1. How should you store the request data in the solution database?

You should store the data as XML in the solution database, to enable managers to retrieve the original document when requested.

2. If you plan to store data as XML in the solution database, what kind of storage will you use? Why will you use this kind of storage?

The best option is to use a varchar(max) column, because the scenario prefers a solution that preserves the original document exactly. However, using the xml data type is also acceptable, because strict document consistency is not a requirement.

Scenario 2

The application at Fabrikam, Inc., receives call logs from the real-time call tracking system. Some customer contracts state that customers should pay only for the longest call placed every day to each person in a customer-selected group of mobile telephone numbers (pal-calls); additional pal-calls are free of charge. For example, Bob has his girlfriend, his mother, and his friend Paul on his pal-call list. Therefore, he will pay for the longest call placed on a given day to his girlfriend, but not for the other calls placed to his girlfriend. Similarly, he will pay only for the longest call placed on a given day to his mother or Paul. This scenario has the following requirements:

■ Data comes from the call-tracking system, and the call-tracking system is responsible for performing auditing.

■ Call logs are submitted in daily batches, one file per call log per day. Call logs are split by sales region, and you receive as many files as the number of sales regions.

■ Customer pal-call lists are stored in a table in the solution database.

Discussion questions

1. How should you store the call log data in the solution database?

You should not store data in XML format within the solution because there is no requirement to maintain the data submitted in the original format. Also, storing XML data in the database does not provide any benefit. Because this data is frequently accessed and used to compute billing information, it needs to be stored in a format that supports those operations. In such a case, XML will be used as a data transport mechanism only.

2. How can you check for the longest calls to each person on a pal-call list?

You can deserialize the XML data into a temporary table within the database and construct a stored procedure to check the data against the pal-call lists in the solution database.
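
A minimal sketch of this approach follows. The call log format, the #Calls temporary table, and the PalCalls table (CustomerID, Callee) are hypothetical assumptions; the final query keeps only the longest call per customer, pal, and day.

DECLARE @log xml;
SET @log = N'<CallLog>
  <Call customer="C1" callee="5550001" date="2006-01-15" durationSec="300" />
  <Call customer="C1" callee="5550001" date="2006-01-15" durationSec="120" />
</CallLog>';

-- Shred the XML batch into a temporary table.
SELECT c.x.value('@customer',    'nvarchar(10)') AS CustomerID,
       c.x.value('@callee',      'nvarchar(20)') AS Callee,
       c.x.value('@date',        'datetime')     AS CallDate,
       c.x.value('@durationSec', 'int')          AS DurationSec
INTO #Calls
FROM @log.nodes('/CallLog/Call') AS c(x);

-- The longest daily call to each pal is the only billable one.
SELECT c.CustomerID, c.Callee, c.CallDate,
       MAX(c.DurationSec) AS BillableSec
FROM #Calls AS c
JOIN dbo.PalCalls AS p
  ON p.CustomerID = c.CustomerID
 AND p.Callee     = c.Callee
GROUP BY c.CustomerID, c.Callee, c.CallDate;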

Scenario 3

Fabrikam, Inc., receives contract-related requests from business partners. After the requests are accepted, they are stored as XML documents in a file share on the corporate local area network (LAN). Some customers try to use the Internet and call centers to request the same procedures (such as upgrading a contract or requesting a copy of a bill), and you need to find duplicate requests to avoid duplication of processing. The requirements of the solution are:

■ Managers must be able to quickly identify duplicate requests.

■ The systems that receive requests accept or reject them and perform any necessary auditing tasks.

■ Requests are received in batches. Duplicate requests can arrive in the same XML document or in previously received documents.


Discussion questions

1. Should data be stored as XML in the solution database? Why or why not?

Yes, because you need to keep track of requests.

2. If you plan to store data as XML in the solution database, what kind of storage will you use?

If document consistency is not critical, the xml column data type is the best option because XQuery and XML data indexing can be used.

3. How can you check for duplicate requests?

To find and resolve duplicate requests, you can deserialize the XML into a temporary table and execute a Transact-SQL query to look for duplicate data.


Lesson 2: Designing a Database Solution Schema

Lesson objectives

After completing this lesson, students will be able to:

■ Describe the preparation tasks that you must perform before designing a database solution schema.
■ Evaluate the considerations for implementing OLAP with OLTP.
■ Evaluate the considerations for determining the level of database normalization required to meet your business needs.
■ Evaluate the considerations for creating indexes.
■ Evaluate the considerations for creating clustered indexes.

Introduction

Designing a database solution schema by using entity-relationship modeling is a well-known technique used by most database developers. However, in some cases a strict use of relational theory and normal forms is not the best solution. In this lesson, you will be introduced to several aspects of designing a database schema that go beyond the entity-relationship model. These techniques take the process of designing database schemas closer to real-world solutions, incorporating performance requirements and the intended use of the data into the design process.


Preparation Tasks for Designing a Database Solution Schema

Introduction

The structure of data is one of the most important aspects of an application. However, when defining a database schema, paying exclusive attention to the way in which the data is structured to suit a specific application can lead to a decrease in database performance in the future. The application might not be the only item that uses this data. Other applications and users might be dependent on the same data and could place even more workload demands on the database. Therefore, you must take these consumers’ activities into account when designing the database schema.

Analyzing the application as a whole

It is common to begin database schema design by examining the entities and their properties and looking at the relationships between entities. However, having an overview of the general use of the data can be helpful when tuning database performance and storage. Therefore, you should resist generating entity-relationship models until you have gained a broader understanding of the application and the processing it performs. You will need to talk to the application developers and users to obtain this information.

Understanding how applications will use the data

Understanding how applications use the data is a key factor for designing a good data access strategy. Usually, it is possible to identify three kinds of usage:

■ Core usage. This is the everyday usage for an application when fulfilling the central application business objective.

■ Ancillary application usage. This usage satisfies complementary business objectives. This type of usage often grows with time as the system contains more and more data.

■ Data flow to and from other applications. Data transfer operations tend to put heavy workloads on systems. You must understand how other applications and data feeds affect the flow of data and the volume of traffic that they generate.


You should determine the users, data volume, and peak hours for these systems. If there are overlapping peak hours for multiple applications, a greater load will be placed on the system.

Verifying the usage balance

When you examine how applications will use the data, you should understand the balance between core usage and ancillary usage. If ancillary usage and data flow workload are estimated as being greater than core usage, consider designing a scale-out solution during the design phase.

Grouping application functionality

You can group application functionality into categories, such as “online” or “analytical.” Online functionality typically requires access to up-to-date data, whereas analytical functionality usually operates on historical data, which does not need to be as up-to-date. Understanding these groupings will offer you some insight into how and when to introduce scale-out features. Functionality that does not require real-time data is a candidate for scaling out. Data transfer technologies such as replication can synchronize data in varying time periods, and you should include estimates concerning how up-to-date the data is expected to be when performing the usage analysis.


Considerations for Implementing OLAP with OLTP

Introduction

Most databases need to support OLTP functionality of some sort, and you will most likely consider the OLTP requirements of any system as part of the initial database design. If the data usage analysis indicates that applications will make significant use of the data for analytical purposes, you must decide how to implement this OLAP functionality alongside any existing OLTP requirements. For example, a retail company will have many inserts and updates as orders are processed, but will also want to run spanning queries to analyze sales and look for trends. Running these two processes on one system will be detrimental to both. Not every application will require a complete OLAP solution, so it is often worth considering alternative strategies for supporting the OLAP requirements of the organization.

Considerations for implementing OLAP functionality with OLTP

SQL Server offers several technologies to help build an OLAP solution to meet varying requirements. If you need to support OLAP functionality in an OLTP database, you should consider the following solutions:

■ Scheduling OLAP usage for off-peak hours. Although this solution is the simplest one, it is not the most user-friendly and might not be feasible.

■ Denormalizing tables that are subject to OLAP queries. A fully normalized database contains many tables that you must join together to reconstruct data. This processing can noticeably increase OLAP processing times. Denormalizing where appropriate can reduce the number of joins required, at the expense of redundancy and the need to maintain duplicate data.

■ Adding indexes designed to match OLAP functionality. You can add indexes designed to support the most common OLAP queries, including indexes on computed columns and indexed views. The cost of this is the increased processing required when updating indexed data, and this could be critical to the OLTP requirements of your database.

■ Building a Unified Dimensional Model (UDM). If you need a complete OLAP solution, you should consider using SQL Server 2005 Analysis Services to construct and maintain OLAP data.

■ Moving OLAP functionality to a separate system. You can perform scheduled data transfers and use technologies such as SQL Server replication or SQL Server Integration Services and SQL Server Agent to copy data from the OLTP system to the OLAP database. However, this requires careful design and testing, and it introduces additional security and administrative concerns.


Considerations for Normalizing a Database

Introduction

Database designers frequently attempt to normalize their tables as much as possible when designing a schema. A fully normalized schema reduces the degree of redundancy in the database and helps to eliminate the anomalies that can arise with duplicate data. It is important that you consider the integrity of the data in the database when designing tables; otherwise, the entire schema might be useless. When you are confident that your schema will meet your functional requirements, you can consider denormalizing selected tables to help improve performance.

The impact of normalization

In the relational model, the rules defining the normal forms reduce data redundancy at the cost of fragmenting tables into smaller and smaller chunks. This has the following impact on the database:

■ Executing queries that need to join data from several tables requires additional processing. The tables in a fully normalized database might contain only a few columns, and queries will often need to retrieve data from several related tables to construct the information needed by an application. These join operations consume processing power and can impact the response time of the database. This degradation in performance can become particularly noticeable in an OLAP environment. You can define indexes to speed up join operations, but these can adversely affect the effort required by the database engine to maintain the data in the tables if you frequently modify or delete rows.

■ Maintaining integrity between tables adds overhead. If you have many tables, the database engine needs to perform more work to maintain the integrity of the relationships between them. You can create primary and foreign keys to define the relationships between tables and prevent orphans from occurring (such as orders for a customer with no corresponding customer in the database). However, when updating data, the database engine has to perform additional integrity checks to ensure that relationships are not violated. This can impact performance.

■ Retrieving small records requires fewer resources than retrieving large records. The fewer columns a table has, the smaller the record length, and the less I/O required to read records from disk. Denormalized tables with a large record length require more I/O. Large records also consume more cache memory. If you do not normalize a table but your application queries only a small subset of columns in a table most of the time, consider splitting the table into the columns that are used most of the time and those that are used infrequently.

The choice of whether to normalize or denormalize a database greatly depends on the pattern of data access of the applications that use it. This emphasizes the need for you to analyze the way in which applications use data before finalizing the database design.


Considerations for Creating Indexes

Introduction

You should pay careful attention to the indexes when defining a database schema. The SQL Server database engine uses indexes extensively to resolve queries and quickly locate data. However, defining too many indexes adds an undesired overhead when updating information. SQL Server 2005 automatically generates indexes when creating primary keys. You can also define your own indexes to meet the performance requirements of critical queries in your applications. SQL Server 2005 supports clustered and nonclustered indexes. Clustered indexes physically sort the rows in a table to match the order specified by the index key. Nonclustered indexes contain pointers to rows in the table and do not reorganize the rows.

Defining nonclustered indexes to match query requirements

You can define different types of indexes to meet different application query requirements. You should consider creating nonclustered indexes to support the following types of queries:

■ Queries using JOIN or GROUP BY clauses. You should create indexes over the columns involved in the join operations or columns subject to grouping. This can reduce the time taken by the database engine to locate and sort data.

■ Queries that return a small proportion of the rows in a table. Indexes are useful for quickly identifying rows that match query criteria. However, if an application executes queries that regularly return more than a small proportion of the rows in a table, it might actually be more efficient for the database engine to scan the entire table for these rows rather than use an index. In these cases, the index simply adds overhead.

■ Queries that frequently return a specific set of columns from a table. If your applications frequently require the data in the same subset of columns from a table, consider creating a covering index that contains all of these columns. Order the columns in the index according to the search criteria used by these queries. The database engine can then locate and retrieve the data by using the index rather than the underlying table. In SQL Server 2005, you can also define indexes with included columns. These columns are not referenced in the internal nodes of the index, but they are included in the leaf nodes. (See the index sketch after this list.)
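
For instance, the following index sketch covers queries that filter on CustomerID and OrderDate but return only TotalDue; the Orders table and its columns are hypothetical names. The key columns match the search criteria, and the included column is carried in the leaf nodes only.

CREATE NONCLUSTERED INDEX IX_Orders_Customer_Date
ON dbo.Orders (CustomerID, OrderDate)
INCLUDE (TotalDue);

-- This query can be answered entirely from the index.
SELECT TotalDue
FROM dbo.Orders
WHERE CustomerID = 42
  AND OrderDate >= '20060101' AND OrderDate < '20060201';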

You should not create indexes over columns that contain relatively few distinct values compared to the number of rows in the table.

Temporarily deleting or disabling indexes to improve insert, update, and delete operations

Indexes can improve the response time of queries but slow the performance of insert, update, and delete operations. If you are performing a bulk operation that modifies a large number of rows in an indexed table, consider temporarily removing or disabling the indexes on that table. You should rebuild the index when the bulk operation has been completed. However, the rebuild process can consume considerable processor, memory, and disk resources, so you should arrange for this task to be performed at off-peak hours.
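
A sketch of this pattern, assuming the hypothetical index from the previous example:

-- Disable the nonclustered index before the bulk operation; the index
-- definition is kept, but its data is no longer maintained.
ALTER INDEX IX_Orders_Customer_Date ON dbo.Orders DISABLE;

-- ... perform the bulk insert, update, or delete here ...

-- Rebuilding re-enables the index; schedule this for off-peak hours.
ALTER INDEX IX_Orders_Customer_Date ON dbo.Orders REBUILD;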

Specifying an appropriate fill factor to postpone page splits and reduce index overheads

When an index page is full, it splits into two pages. This process is known as a page split and causes an increased overhead on some insert and update operations. You can reduce the impact of an index on insert, update, and delete operations by specifying an appropriate fill factor when you create the index. The fill factor causes the leaf pages of the index to be only partially full. However, the fill factor is not maintained as data changes, so this technique only postpones issues of space management and index reorganization, and you should periodically rebuild indexes defined in this way.
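
For example, the following hypothetical index leaves 20 percent of each leaf page empty. Because the fill factor is applied only when the index is created or rebuilt, the periodic rebuild repeats the setting.

CREATE NONCLUSTERED INDEX IX_Orders_OrderDate
ON dbo.Orders (OrderDate)
WITH (FILLFACTOR = 80);

-- A periodic rebuild reapplies the free space in the leaf pages.
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders
REBUILD WITH (FILLFACTOR = 80);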

Creating indexed views

You can create an indexed view by defining a clustered index over a view. SQL Server maintains a copy of the data in the index. Using indexed views can improve the response time for queries that join tables together or that perform aggregate operations. They are especially useful in query-intensive OLAP environments. However, the database engine must maintain the data in an indexed view, so using these views can be detrimental to OLTP systems.
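
The following sketch shows the pattern for a hypothetical daily sales summary over the same illustrative Orders table. An indexed view must be schema-bound, must use two-part object names, and must include COUNT_BIG(*) when it contains GROUP BY; the unique clustered index is what materializes the data.

CREATE VIEW dbo.vw_DailySales
WITH SCHEMABINDING
AS
SELECT OrderDate,
       SUM(TotalDue) AS DailyTotal,  -- assumes TotalDue is NOT NULL
       COUNT_BIG(*)  AS NumOrders    -- required for indexed views with GROUP BY
FROM dbo.Orders
GROUP BY OrderDate;
GO

-- Creating the unique clustered index materializes the view's data.
CREATE UNIQUE CLUSTERED INDEX IX_vw_DailySales
ON dbo.vw_DailySales (OrderDate);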


Considerations for Creating Clustered Indexes

Introduction

SQL Server creates a clustered index automatically whenever you define a primary key on a table. A clustered index causes the data in the table to be ordered by the values in the indexed columns. For this reason, you can create only one clustered index on a table.

Defining clustered indexes to optimize data access

A clustered index provides fast access to an entire row. Clustered indexes can be particularly useful for queries that identify a small number of rows, or a single row, in a large table. This is why SQL Server automatically creates them over primary key columns. However, the primary key is not always the best candidate for a clustered index. The following list describes some considerations for selecting the columns for a clustered index:

■ Understanding how applications retrieve data. If your applications use a different access pattern and frequently access the data in a table through other columns, you might find it beneficial to drop the clustered index and create it over a different column or set of columns. For example, if your applications frequently retrieve data in the same order or retrieve rows that fall into a specified range, you should consider creating the clustered index over the column specified as the sort key or the range key. The database engine can quickly identify the starting point for a range of values, and it can then scan sequentially through the subsequent rows, automatically returning data in the correct order.

■ Understanding how applications modify data. If your application frequently inserts rows into a table with a clustered index, you should define the clustered index to match the order of the insertions. In this way, new rows will be appended to the end of the index, rather than being inserted in the middle, reducing the number of page split operations and minimizing the maintenance overhead of the index.

■ Understanding when a clustered index is not useful. Do not create a clustered index over columns that undergo frequent changes. You should also be careful when creating a clustered index over columns that can vary in length or that can become wide, such as varchar or nvarchar columns. The overhead of maintaining these indexes, and the space required to hold them, can quickly outweigh any benefits that they provide.
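
For example, if range queries on an order date dominate, the clustered index can be moved off the primary key, as in this sketch (the table, constraint, and column names are hypothetical):

-- Re-create the primary key as nonclustered to free the clustered slot.
ALTER TABLE dbo.Orders DROP CONSTRAINT PK_Orders;
ALTER TABLE dbo.Orders ADD CONSTRAINT PK_Orders
    PRIMARY KEY NONCLUSTERED (OrderID);
GO

-- Cluster on the column used as the sort or range key, so that range
-- scans read rows in physical order and new rows tend to append at the end.
CREATE CLUSTERED INDEX CIX_Orders_OrderDate
ON dbo.Orders (OrderDate);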


Practice: Designing a Database Solution Schema

Introduction

In this practice, you will work in groups to examine scenarios, analyze requirements, and design a database solution schema that meets the business requirements of two different scenarios. After you have designed a solution, you will discuss your design with the rest of the class.

Scenario 1

You are building a prototype database, and the application development team has developed a prototype application with data entry screens. These screens enable you to collect customer and sales data. Following are the requirements for this scenario:

■ You must minimize the storage space required by the database.

■ Every time a query is run to generate weekly sales summaries, the query should have the fastest response time possible.

Scenario 2

Using the prototype application described in Scenario 1, you need to eliminate any delays between inserting the details in the sales data and displaying the weekly sales summary. The previous requirement of minimizing storage space has been relaxed, but the queries must still have a fast response time. Synchronization between both data stores is the main priority now.

Discussion questions

1. Will you maintain a weekly sales summary table?

Answers may vary. To make the summary query as fast as possible, students will need to calculate summaries in advance and store the results in a summary table such as WeeklySales. However, the overhead involved in generating this summary information could be significant, especially if this data changes frequently. This overhead could impact the response time of the database engine. In this case, it might be better to design an appropriate indexing strategy and calculate the summary information only when required by a query.


2. Which synchronization mechanism could you use to maintain the weekly summary information, without using additional SQL Server services?

Answers will vary. If students generate and store the weekly summary information in a separate table, they will need to use triggers or an equivalent mechanism to maintain this data. Alternatively, they could schedule a periodic job that regenerates all of the summary data during off-peak hours. However, this solution will not guarantee that the data is up to date. If students choose to calculate summary data dynamically instead of generating and storing it in a temporary table, synchronization is not an issue.


Lesson 3: Designing a Scale-Out Strategy

Lesson objectives

After completing this lesson, students will be able to:

■ Evaluate the considerations for choosing multiple data stores.
■ Evaluate the considerations for choosing a scale-out solution for performance.
■ Evaluate the considerations for choosing a scale-out solution for redundancy.
■ Evaluate the considerations for integrating data stores in a scale-out solution.

Introduction

Scaling out is a common practice when designing solutions to meet requirements such as performance or availability. Scaling out involves dividing the database workload over different resources and applying specific optimizations for every task where necessary. To scale out a solution, you need additional hardware and software resources. This adds complexity to the planning, setup, administration, and maintenance of your solution. This lesson introduces considerations for scaling out solutions to meet distinct goals. These considerations help you to make better-informed decisions concerning any particular strategy you adopt.


Considerations for Choosing Multiple Data Stores

Introduction

Before considering scaling out to multiple servers, you should tune the current solution. You should start by reviewing how applications and users access data, validating the database schemas, examining the indexing strategy, and verifying the appropriate database settings. You can then identify the most appropriate strategy for scaling out your solution, as described in the following sections.

Scaling out to multiple SQL Server databases in the same instance

The simplest method of scaling out is to create several databases on the same SQL Server instance. This configuration enables you to perform optimizations such as the following:

■ Grouping similar requirements by database. You can then optimize each schema for best performance. For example, you can tune the schema of one database to support OLAP by defining multiple indexes over tables and tune the same schema in another database to support OLTP by restricting the number of indexes.

■ Separating data. You can split historical data from current data. This keeps the size of active tables and indexes small. However, if you need to access historical and current data, your queries will be considerably more complex.

■ Optimizing I/O performance. You can optimize storage performance by exploiting files and filegroups in SQL Server and balancing the load over multiple disk devices. However, this can increase the complexity of backup and restore operations. Using multiple databases instead can avoid this complexity.

Scaling out to multiple SQL Server instances

Scaling out to multiple SQL Server instances enables you to adjust instance settings independently from the databases. You can configure different instances to meet specific requirements—for example, by changing the memory settings or the CPU usage. You can also configure each instance with its own security settings.


Scaling out to multiple database servers

You can scale out to multiple database servers. This reduces the load on a specific server and enables you to select and tune hardware to meet specific requirements. You can partition the workload and direct applications that perform one specific type of processing to a particular server with the appropriate capacity. An alternative strategy is to spread the load evenly across multiple servers. Following this strategy incurs additional hardware and software costs and increased management and maintenance effort. Keeping databases synchronized in a multiple-server environment also poses its own challenges. You must consider the security and infrastructure implications of scaling out. You will need to secure the channel between the servers, and you will require additional bandwidth between them. Certificate servers will be required if certificates are used for encryption and validation, and the complexity and risks will increase if the data is distributed, particularly if it is distributed across the Internet.


Considerations for Scaling Out for Performance

Introduction

You might need to scale out your solution onto multiple servers if it needs to support high-performance OLTP. However, you should first examine your current solution and identify any further optimizations that you can implement before committing to this type of solution, because of the time and costs involved. You should consider a scale-out strategy only if the hardware is currently performing close to its maximum capacity.

When scaling out onto multiple servers, it is not always necessary to use the same hardware for each server. You might find that cheaper hardware will meet your solution requirements. However, you should verify that this is the case before committing to the purchase of additional hardware. When you scale out to multiple servers, you can copy the entire database to each computer, you can partition the database across computers, or you can use a hybrid combination of both of these approaches.

Note: The term partitioning used in this topic simply means to split the database into pieces and host each piece on a different server. SQL Server also uses the term partitioning to refer to the way in which it can organize the physical structure of a database. SQL Server partitioning is just one example of the technologies that you can use to split data. The next topic describes considerations for scaling out by using redundant copies of data.


Partitioning data

If you follow a partitioning strategy, you should be aware of the different approaches you can use for dividing the data. You should bear in mind the following considerations:

■ Understanding the requirements of the application. The requirements of the application should dictate any partitioning scheme. You could split the data according to the geographical location of the server and the data (branch offices will probably benefit if the data they use is held local to the branch, for example), the use of the data (historical and active data), or many other factors.

■ Matching hardware to the workload. You must be careful to match the power of the computer hosting each partition with the expected workload assigned to it.

■ Keeping most data access operations local. Partitioning data requires providing some means to direct changes and queries to the appropriate server. You can use distributed partitioned views and linked servers to redirect requests between servers, but you should carefully assess the network traffic that this approach generates and the time taken for queries that need to be resolved remotely. If you partition your databases carefully, you should find that the vast majority of queries and updates will occur on a server local to the user or application, minimizing the network overhead involved. (See the sketch after this list.)
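
A minimal sketch of a distributed partitioned view follows; all names, including the Server2 linked server, are hypothetical. Each member table's CHECK constraint on the partitioning column lets the query processor route requests to the correct server.

-- On the local server: member table for region 1.
CREATE TABLE dbo.Customers_Region1
(
    CustomerID int NOT NULL,
    RegionID   int NOT NULL CHECK (RegionID = 1),
    Name       nvarchar(100) NOT NULL,
    CONSTRAINT PK_Customers_Region1 PRIMARY KEY (CustomerID, RegionID)
);
GO

-- The view unions the local member with a member on a linked server.
CREATE VIEW dbo.Customers
AS
SELECT CustomerID, RegionID, Name
FROM dbo.Customers_Region1
UNION ALL
SELECT CustomerID, RegionID, Name
FROM Server2.SalesDB.dbo.Customers_Region2;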

Other considerations

Partitioning a database has an extensive impact on the solution design:

■ Additional servers and hardware introduce more potential points of failure.

■ Decentralization adds complexity at all levels, from application development to maintenance.

■ A geographically distributed design introduces even more complexity because of the key role of the communications infrastructure. However, if every location is self-sufficient, this architecture offers higher local availability.


Considerations for Scaling Out with Redundancy

Introduction

Scaling out with redundancy is a strategy for maintaining multiple copies of data. Distributed organizations typically adopt this approach, transporting data to multiple sites and enabling applications to access data locally. You can also use this strategy to isolate and optimize different applications that have different requirements but require access to the same data. This strategy has several benefits but also some significant costs, as described in the following sections.

Advantages of scaling out

The following list describes some of the benefits of following a scale-out strategy:

■ Optimizes the database to support specific functionality. You can tune each database to match the specific requirements of an application. For example, you can define different instances of the database to support OLTP and OLAP processing. You are not concerned with other applications that need to share the same data.

■ Exploits local autonomy and availability. Keeping redundant copies of data locally enables different locations to act independently. Sites can follow their own maintenance procedures without directly affecting other sites. If one site becomes unavailable, applications and users at other sites can continue functioning.

■ Implements load balancing. You can use a pool of servers holding copies of the same database to implement load-balancing schemes. You can remove a server from the load-balancing pool if you need to perform maintenance, without losing availability.

■ Tolerates failure. If a server fails, you might be able to fail over to another running server. You can exploit having multiple copies of data available as a form of backup. You can use one server as the source of data when recovering another.

■ Distributes data and minimizes latency. The principal concern when maintaining redundant copies of data is the complexity and cost of the mechanism you use to distribute data and keep it up to date. Most forms of distribution incur some form of latency. You can use SQL Server replication to transport data from server to server, but you need to consider carefully the topology and distribution mechanism you use. Frequent synchronization can result in considerable network traffic. Infrequent synchronization can result in a location not having up-to-date data and making a wrong business decision. Other solutions are available as well.

Disadvantages of scaling out

Two principal costs are associated with following a scale-out strategy:

■ Stores multiple copies of data. Storing redundant data requires additional data storage. Depending on the amount of data you are copying from server to server, this could result in a significant additional cost.

■ Manages additional complexity and security. Local sites can operate autonomously, but there is still a requirement to manage the solution as a whole. This could involve liaising with a number of geographically dispersed sites and ensuring that specific maintenance tasks are performed on a routine basis. Managing security in a distributed environment becomes extremely important. You must avoid one site becoming a weak link, enabling an attacker to penetrate your system.


Considerations for Integrating Data Stores

Introduction

Implementing a scale-out strategy that involves distributing redundant data requires you to establish a mechanism for distributing and synchronizing data. This data can originate from different databases and data stores. You should attempt to make the distribution and synchronization mechanism as transparent as possible, making the distributed database appear as a single, integrated item. The following sections describe the issues that you need to consider when planning operations for integrating data stores in a scale-out solution.

Determining the latency period

All synchronization mechanisms, except transactional ones, incur some delay between updating the source database and updating the redundant copies. This period can range from minutes to several hours or even days. Depending on the mechanism that you use for distributing the data, you have a degree of control over how long this period is. You must balance the need to have up-to-date information at a particular location with the workload of the synchronization procedure.

Transferring and transforming data

In some cases, the redundant data is not an exact copy of the original data but has been transformed in some way to meet requirements. You need to include this data transformation when considering how to distribute the data. Depending on the volume of data involved, you could copy the original data to a staging site by using a fast data transfer mechanism and then transform and copy the data from this site to the remote locations. Using this technique has two benefits. First, system resources are locked for a shorter time. Second, an exact copy of source data is available for fast backup and failover purposes. The drawback is the additional complexity of this solution and the additional storage space required (which could be considerable).

Using alternative SQL Server services

SQL Server 2005 offers several tools and services that you can use to transform and synchronize data. These features include database mirroring, SQL Server Integration Services, Service Broker, Notification Services, HTTP endpoints, replication, and log shipping. Each technology is suited to different scenarios, and each has its own benefits and drawbacks.


Handling synchronization failure and recovering

You should remember that the synchronization mechanism is also a candidate for failure. Therefore, you must have a plan for resolving issues that can arise when synchronization stops working. The recovery strategy must include different scenarios based on the amount of time that synchronization has been unavailable. The longer the synchronization mechanism has been unavailable, the more data you will need to recover. In some cases, you might need to discard all redundant data and populate the database with a fresh copy of the current data.


Lab: Designing a Data Modeling Strategy

Objectives

In this lab, you will:

■ Design a database solution schema.
■ Design an integrated schema for multiple data stores.

Scenario

Fabrikam, Inc., currently has a LocalAccounts database at each regional office. This database contains all customer account information in a non-summarized format. When bills are printed, a customer receives a bill for each contract that exists at a regional office, and each bill contains detailed usage information. To reduce the amount of money spent on printing bills, and to provide a better level of service to customers, Fabrikam would like to make some modifications to its current process to implement the following business requirements:

■ Consolidate all customer contract bills into a single bill. Currently, a customer who has several contracts will receive several bills—one for each contract.

■ Summarize monthly usage information for each customer.

■ Include functionality to provide customers and bill payment agents with access to detailed usage information through the Web service implemented at the Fabrikam central office. Information about agents’ requests must be stored for auditing purposes.

■ Store information about customers’ requests for new services at the Fabrikam central office.

You have been asked to review the current solution and propose an updated database model that supports these requirements.

Currently, Fabrikam uses merge replication for distributing data to the regional offices. However, this might not be a long-term solution because new business requirements could arise in the future.

Preparation

Ensure that the virtual machine for the computer 2781A-MIA-SQL-03 is running. Also, ensure that you have logged on to the computer. If you need to log on to the computer, use the following credentials:

■ Username: Student
■ Password: Pa$$w0rd


Exercise 1: Designing a Database Solution Schema

Introduction

In this exercise, you will analyze the current bill printing solution. You will determine whether the existing solution design is still valid or requires updating. You will also design a database schema that supports generating and printing bills.

Create the Accounts database

The exercises in this lab use the Accounts database implemented by Fabrikam, Inc. Perform the following steps to create this database:

1. On the Start menu, click Command Prompt.
2. Move to the folder E:\Labfiles\Starter\Preparation.
3. Execute the script setup.bat.
4. Close the Command Prompt window.

Analyze the current bill printing solution

Summary

1. Review the lab scenario and the existing solution architecture document Fabrikam Inc.vsd.
2. Specify any additional databases required to support printing bills in the regional offices, and specify a mechanism to transfer summarized data to the regional offices.
3. Modify the solution architecture to meet the new business requirements at the regional offices.

Detailed Steps

1. Open the file Fabrikam Inc.vsd, located in the E:\Labfiles\Starter folder.
2. Add a database to the Visio solution to store and manage the data required for printing bills in the regional offices.
3. Add a process to perform bill processing in the regional offices.
4. Add a data integration mechanism to synchronize the LocalAccounts database with the bill printing database.


Create a database schema to support bill printing

Summary

1. Review the current database schema that supports billing operations.
2. Design a schema for the Bill Printing database that provides the necessary summarized information to enable bill printing.

Note: The file Fabrikam Bill Sample.doc in the E:\Labfiles\Starter folder contains a sample bill.

Detailed Steps

1. Open the Visio database schema document Fabrikam ERD.vsd, located in the E:\Labfiles\Starter folder.
2. Locate and examine the tables holding billing information.
3. Create a new database model diagram, and define a schema and tables for storing and summarizing phone line service usage. The simplest way to do this is to reverse engineer the schema from the relevant tables in the existing Accounts database, and then make any modifications required to support bill printing in the regional offices.

Note: To reverse engineer a schema from an existing database using Visio, you should create a new data source referring to the required SQL Server database.

Discussion question

■ How will the solution change if you decide to implement billing processing in the central office and transmit summarized data to the regional offices?

This change will have a significant impact on several operations. Processing billing information in the central office means that:

■ The server at the central office will require additional capacity unless you can eliminate an equivalent workload from the replication process.

■ There is less data transfer to regional offices, and the ServicesUsage table is no longer required at each regional office.

■ There is no need for a Billing database in the regional offices because the new summarized information can be included in the LocalAccounts database.

■ Regional offices will be less autonomous because they will not have immediate access to the complete customer data.


Exercise 2: Designing Integration of Multiple Data Stores

Introduction

In this exercise, you will analyze the current database store integration solution. To meet the needs of Fabrikam, Inc., you will identify an appropriate integration mechanism for generating and storing summarized billing information in the solution design. You will also modify the database schema to support the required Web service functionality.

Analyze solution architecture integration mechanisms

Summary

■ Review and update the existing solution architecture document Fabrikam Inc.vsd to specify how summarized billing information will be generated and stored in the Bill Printing database in each regional office.

Detailed Steps

1. Using Visio, review the Fabrikam Inc.vsd file that you modified in Exercise 1.
2. Identify the appropriate mechanism for generating and storing summarized billing information.

Modify the database schema to enable users to access usage information through a Web service

Summary

1. Review the requirements in the lab scenario concerning storing information about customers’ requests and providing customers and bill payment agents with access to detailed usage information through the Web service.
2. Extend the database schema for the Accounts database in the Fabrikam Central office to support this functionality.

Detailed Steps

1. Create a new database model diagram.
2. Define a table for storing information about customers' requests for new services.
3. Define a table for storing information about agents’ requests for information about service usage. (Usage requests are different from new service requests.)
4. Ensure that the schema enables you to determine that an agent requesting information about a customer’s usage is valid for that customer. (The customer specifies who his or her agent is—other agents should not be able to request information about a customer’s usage.)

Discussion questions

1. This exercise uses a different table for payment agent requests and customer requests. When would choosing to store requests in a separate table be a valid design choice?

The design choice would depend on the structure of the data for each type of request, as well as on the processing load and business requirements. If the data for each request has the same structure, it should be stored in a single table that includes a column to designate the type of request. If the data for each request does not have the same structure, each request type should be stored in a different table or, at a minimum, a different column. This avoids mixing XML document types in the same column.

Designing the data structure to store each type of request in a separate table also makes it easier to scale the process in the future. For example, the customer request processing could be moved to a separate server for added capacity by relocating the corresponding table to that server.

2. If you selected a different strategy for storing XML data for customer and payment agent requests, can you provide a reason for such a design decision?

Customer requests come from internal systems, so you must maintain binary consistency to be able to trace potential application problems or bugs. Because payment agent requests are informational, binary consistency is not necessary for them.

Important After the discussion, shut down the virtual machine for the computer 2781A-MIA-SQL-03. Do not save the changes.

Module 4

Designing a Transaction Strategy for a SQL Server Solution

Contents:

Lesson 1: Defining Data Behavior Requirements 4-2
Lesson 2: Defining Isolation Levels 4-8
Lesson 3: Designing a Resilient Transaction Strategy 4-17
Lab: Designing a Transaction Strategy for a SQL Server 2005 Solution 4-32

Information in this document, including URL and other Internet Web site references, is subject to change without notice. Unless otherwise noted, the example companies, organizations, products, domain names, e-mail addresses, logos, people, places, and events depicted herein are fictitious, and no association with any real company, organization, product, domain name, e-mail address, logo, person, place, or event is intended or should be inferred. Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.

The names of manufacturers, products, or URLs are provided for informational purposes only, and Microsoft makes no representations and warranties, either expressed, implied, or statutory, regarding these manufacturers or the use of the products with any Microsoft technologies. The inclusion of a manufacturer or product does not imply endorsement of Microsoft of the manufacturer or product. Links are provided to third party sites. Such sites are not under the control of Microsoft, and Microsoft is not responsible for the contents of any linked site or any link contained in a linked site, or any changes or updates to such sites. Microsoft is not responsible for webcasting or any other form of transmission received from any linked site. Microsoft is providing these links to you only as a convenience, and the inclusion of any link does not imply endorsement of Microsoft of the site or the products contained therein.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

©2006 Microsoft Corporation. All rights reserved.

Microsoft, Active Directory, ActiveX, BizTalk, Excel, Microsoft Press, MSDN, MSN, Outlook, PowerPoint, SharePoint, Tahoma, Visio, Visual Basic, Visual C#, Visual SourceSafe, Visual Studio, Windows, and Windows Server are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. All other trademarks are property of their respective owners.


Module objectives

After completing this module, students will be able to:

■ Define data behavior requirements.
■ Define isolation levels for the data store.
■ Design a resilient transaction strategy.

Introduction

As the number of application users increases, so does the number of users who access the database concurrently. As a result, the application must be able to support higher transaction rates while maintaining a high level of performance and a low response time. You can meet these requirements, at least in part, by planning an effective transaction strategy during the application design phase. This module describes the considerations and guidelines that you should take into account when designing your transaction strategy. The module teaches you how to define such a strategy by identifying current data behavior requirements, defining isolation levels for the data store, and designing a resilient transaction strategy that meets the data behavior requirements.


Lesson 1: Defining Data Behavior Requirements

Lesson objectives

After completing this lesson, students will be able to:

■ Describe the process of identifying data stores and data behavior requirements.
■ Apply the guidelines for identifying data behavior requirements.
■ Apply the guidelines for defining data behavior requirements.

Introduction

When designing a transaction strategy, you should first define your data behavior requirements. These requirements specify how data will be formatted, stored, manipulated, and used within the application. They also specify how data must behave in terms of concurrency, performance, and consistency. In this lesson, you will learn how to identify data stores and data behavior requirements, and apply the appropriate guidelines when defining these requirements.


The Process of Identifying Data Stores and Data Behavior Requirements

Introduction

This topic describes the process that you should follow when identifying data stores and data behavior requirements. By following a well-defined process, you can design effective transaction strategies for your data applications.

Identifying data stores and defining behavior requirements

The process of identifying data stores and defining data behavior requirements includes the following steps:

1. Identify all existing data stores for your solution. You should identify every data source that supports your solution, including text files, binary flat files, XML files, Web services, databases, directory services, or any other store that contains source data. For each data source, identify the type of data being stored. This information is crucial to an effective transaction strategy. For example, you must know whether to convert data, join partitioned tables, or use Integration Services to import data.

2. Engage database administrators (DBAs) to understand the behavior of existing data. DBAs work with the data every day. They can tell you how data is currently stored, how much data exists, how frequently users access the databases, where bottlenecks might exist, and other useful information about the current data structure. You can use this information to make decisions pertinent to your transaction strategy. For example, you can identify how your transactions might affect the current system, how the current system might affect your transactions, or how you might be able to make use of the current system.

3. Identify planned data storage usage. Business processes that cross systems can affect your transactions in ways that are not always obvious. If your transactions send data to or receive data from external partners, you must understand how the underlying systems function and understand the users who use these systems. Even if your business processes do not cross systems, you still must understand the underlying system and its users. To design an effective transaction strategy, you must understand how the data is being used.

4. Define data behavior for all data stores. You should define the data behavior to meet the performance, concurrency, and consistency requirements of your transactions. This includes determining the appropriate format for storing the data and the mechanisms necessary to eliminate redundancies among data sources. Efficient transactions rely on well-planned data stores.


Guidelines for Identifying Data Behaviors

Identifying data behaviors

When determining your data behavior requirements, you should identify the following items:

■ Business processes. Identify and document all business processes. Most of these processes correspond to transactions. Keep in mind that a database transaction is a technical implementation of a business process. It is absolutely imperative to have the full cooperation of the business stakeholders to ensure that transactions properly implement business processes. These processes can include current and planned internal business processes and processes that span applications and systems.

■ Business rules. With the active participation of the business stakeholders, identify and document all business rules. These can constrain the way that you define the data.

■ Operational requirements. Identify and document operational requirements, such as transaction rates and acceptable delays, when processing transactions.

■ Access-related issues. Identify issues related to performance, concurrency, and consistency to anticipate issues such as unacceptable delay, contention, logical inconsistencies, and deadlocking. Data may have to be received from or sent to external partners, requiring security privileges to be well thought out.

■ Data access frequency. Determine how often the data is accessed. Frequent access might cause contention and result in unacceptable delay times.

Note You can sometimes resolve issues related to excessive delay times and data contention by optimizing queries or upgrading hardware.

■ Transaction-related issues. Identify how the data behaves and what happens when the data is being accessed (read or written) within a unit of work. Using this information, you can identify issues related to performance, concurrency, and consistency. You should consider the effects of locking and blocking, index usage, concurrency control, wait states, data access modes, page splits, hot spots, declarative referential integrity, use of tempdb, and generating query execution plans.


Discussion: Guidelines for Defining Data Behavior Requirements

Introduction

Well-defined data behavior requirements are essential to resolve performance, concurrency, and consistency issues in your system. This topic provides guidelines that you can use when defining the requirements that support the data behaviors previously identified.

Discussion questions

For the first two questions, you will review each one and then participate in the class discussion. For the last question, you will break into small groups, discuss the question within the group, and then participate in a class-wide discussion. The instructor will lead you through this process for the following questions:

1. Have you worked on a project in which you assumed that you had access to data and later discovered that you could not include that data in your solution? If so, describe the situation.

Answers will vary.

2. Have you worked on a project in which you implemented read-only data for subsets of data? If so, how did you implement the solution? Did you have standards for updating that data?

Possible answers include:

■ Using a duplicate copy of the database to store read-only data but permitting write operations on the primary database

■ Using views for read-only operations to access a subset of data but permitting write operations on the primary database

■ Using views for read-only and write operations to access a subset of data

■ Using the same database for read-only and write operations

■ Using a separate database for a subset of duplicated, read-only data, and updating that data in the database by firing a trigger whenever the primary table is modified

■ Using row versioning to provide snapshots of the read-only data


3. What are some of the essential guidelines for defining data behavior requirements?

Possible answers include:

■ Specify static data as read-only. This step can improve performance and consistency.

■ Use snapshots where appropriate. In many situations, you do not need to work with the most recent data and can retrieve data from snapshots. However, you should specify a validity period for the snapshot and take into account any data-aging issues that might affect business requirements.

■ Use last-committed data when appropriate. In some scenarios, you do not need completely up-to-date data; using the last-committed version is sufficient.

■ Define an optimistic concurrency control strategy. You should evaluate concurrency strategies and choose the best approach for your application. Optimistic control supports greater concurrency, whereas pessimistic control provides greater data consistency. Your strategy should also take into account the columns to be updated, the values included in the WHERE clause of UPDATE and DELETE statements, and the response action for a concurrency conflict.

■ Evaluate index usage. Index usage is one of the most important factors related to performance and concurrency. Badly designed indexes can cause transactions to perform poorly and can require Microsoft® SQL Server™ to lock more data than necessary. Transactions that run for a long time can potentially lock large volumes of data and affect performance and concurrency.

■ Use filegroups effectively. Filegroup settings can help in situations in which data has fundamentally different behaviors. For example, you might have read-only data and read-write data in the same database. You can store the read-only data in one filegroup and the read-write data in another filegroup. (A sketch follows this list.)

■ Specify the operations to be performed in transactions. To preserve the integrity of your business logic, you must specify the operations to be performed in each transaction.
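To make the read-only and filegroup guidelines above concrete, here is a minimal Transact-SQL sketch. The database name (SalesDB), the lookup table, and the file path are hypothetical; adapt them to your environment.

-- Create a separate filegroup for static reference data.
ALTER DATABASE SalesDB ADD FILEGROUP ReferenceData;

ALTER DATABASE SalesDB
ADD FILE
(
    NAME = 'SalesDB_Ref',
    FILENAME = 'E:\Data\SalesDB_Ref.ndf',
    SIZE = 100MB
)
TO FILEGROUP ReferenceData;

-- Place a static lookup table on the new filegroup; the ON clause
-- determines where the table (its clustered index) is stored.
CREATE TABLE dbo.Countries
(
    CountryCode CHAR(2) NOT NULL PRIMARY KEY,
    CountryName NVARCHAR(100) NOT NULL
) ON ReferenceData;

-- After the data is loaded, mark the filegroup read-only. SQL Server then
-- no longer needs write locks on this data, which can improve concurrency.
ALTER DATABASE SalesDB MODIFY FILEGROUP ReferenceData READ_ONLY;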


Lesson 2: Defining Isolation Levels

Lesson objectives

After completing this lesson, students will be able to:

■ Apply the guidelines for choosing an isolation level for a database.
■ Describe the behavior of transactions that use the Snapshot isolation level.
■ Evaluate the considerations for offloading concurrent activities.
■ Apply the guidelines for accessing data that spans multiple data stores.

Introduction

Managing multi-user systems is a complex task. In such systems, you should take into account the following considerations:

■ Ensuring a high level of performance
■ Maintaining consistency
■ Avoiding concurrency problems

To achieve these goals, you should choose the appropriate isolation level for each transaction and carefully design any processes that access data spread across different servers. In some cases, you might also need to enhance the system performance by other means, such as by offloading concurrent activities to another server. This lesson provides guidelines for choosing the appropriate isolation levels and for accessing data that spans multiple data sources. The lesson also describes considerations for offloading concurrent activities.


Guidelines for Choosing an Isolation Level for a Database

Introduction

Ideally, a database would execute transactions serially, one at a time, to ensure that they do not interfere with each other and to guarantee logical consistency. However, this approach is usually not practical because most databases must support multiple users concurrently.

Concurrent access can cause conflicts in your application. For example, two users might try to update the same record at the same time. If their transactions are not isolated from each other, the changes made by one user might accidentally overwrite changes made by the other. To avoid these conflicts, you must isolate transactions in such a way as to ensure logical consistency, while processing the maximum number of simultaneous transactions in the shortest possible time.

Choosing an isolation level

You should consider the following guidelines when choosing an isolation level:

■ Determine the minimum isolation level that meets the required consistency. The isolation level can affect the response time and the volume of system resources required to manage locks. To determine the minimum isolation level, consider the concurrency issues associated with each isolation level and the type of locks established by each isolation level when accessing the data.

■ Use alternatives to restrictive isolation levels. Instead of using a restrictive isolation level, such as Serializable, to ensure data integrity, consider following an alternative strategy, such as:

● Redesigning the Transact-SQL logic and tables.

● Redesigning how the application accesses data.

● Overriding the default locking strategy for the isolation level by specifying locking hints and application locks.

■ Use row versioning isolation levels. Transactions that result in readers blocking writers and vice versa affect performance and concurrency. If an application does not require the most recently committed data, consider using row versioning. Unlike the Read Committed isolation level (in its default state), row versioning does not require the database engine to acquire shared locks over data and block concurrent write operations when reading data. And unlike the Read Uncommitted isolation level, which also avoids shared locks, row versioning does not experience dirty reads, nonrepeatable reads, or phantom reads. This is because row versioning uses the last committed data when reading from the database. Concurrent transactions that might be modifying data have no impact on transactions that use row versioning because these transactions read a snapshot of the data rather than the most current data. If you implement row versioning, keep in mind that there is increased overhead in the tempdb database for creating and managing row versions.

Note Snapshot transactions do not experience concurrency problems such as inconsistent analysis and phantom reads. However, Snapshot transactions do not prevent other transactions from making changes to the data read by the Snapshot transactions. Therefore, the result of Snapshot transactions might be logically inconsistent when they update tables, and those updates are based on data from other tables. SQL Server does not detect these conflicts.
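As a minimal Transact-SQL sketch of how row versioning is enabled and used, the following assumes the AdventureWorks sample database. Note that setting READ_COMMITTED_SNAPSHOT requires that there be no other active connections to the database.

-- Allow transactions to request the Snapshot isolation level.
ALTER DATABASE AdventureWorks SET ALLOW_SNAPSHOT_ISOLATION ON;

-- Optionally make the default Read Committed level use row versions
-- instead of shared locks (statement-level read consistency).
ALTER DATABASE AdventureWorks SET READ_COMMITTED_SNAPSHOT ON;

-- A transaction can then opt in to Snapshot isolation explicitly.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
SELECT Name, ListPrice
FROM Production.Product
WHERE ProductID = 1;
-- Reads in this transaction see a transactionally consistent snapshot
-- of the data, without blocking concurrent writers.
COMMIT TRANSACTION;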

Connection pooling

When you create connection pools that have different transaction isolation levels, be sure to configure the appropriate isolation level for each transaction. If an application does not reset a connection when retrieving it from a connection pool, the connection retains its existing isolation level setting. To avoid this problem, set the Connection Reset property to True in the connection string, as shown in the following Microsoft Visual Basic® .NET example:

strConn = "Server=Domain01\Server01;" _
        + "Integrated Security=SSPI;" _
        + "pooling=yes;" _
        + "connection reset=true;" _
        + "Database=AdventureWorks;"

You can also run the following SQL statement immediately after allocating a connection, replacing isolation_level with the isolation level that you want to use:

SET TRANSACTION ISOLATION LEVEL isolation_level


Multimedia: The Behavior of Transactions That Use Snapshot Isolation

Introduction

Some systems designed for SQL Server 2000 can experience contention and blocking problems. You can often resolve these problems by using the new row versioning isolation levels introduced in SQL Server 2005. Row versioning can help to ensure absolute accuracy in multi-statement read operations without blocking other concurrent transactions. Earlier versions of SQL Server require you to use more restrictive isolation levels to achieve the same result. This topic describes the behavior of transactions that use row versioning isolation levels.

Discussion questions

Review the following questions and then discuss your answers with the rest of the class:

1. When should you not use the Snapshot isolation level when updating data?

Possible answers include:

■ When there is a high probability of causing an update conflict. When an update conflict occurs, SQL Server rolls back one of the conflicting transactions. If this occurs frequently, you are simply loading the database engine with useless work.

■ When Snapshot isolation might not be able to provide the required consistency. For example, suppose you are using the MAX function to implement strict sequential row numbering. (In this case, you would use the function to return the highest value from the specified column and then add 1 to that value.) Snapshot isolation does not prevent other transactions from inserting rows during your transaction, which can result in duplicate values. You can use the Serializable isolation level, but this can cause multiple deadlocks. You can also use application locks or generate the sequence in the application layer. (A sketch of the application lock approach follows this list.)
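For the sequential-numbering example above, an application lock is one way to serialize access without raising the isolation level. The following sketch is hypothetical: the Invoices table and the lock name are illustrative, and the approach assumes that every code path that generates a number acquires the same lock.

BEGIN TRANSACTION;

-- Serialize number generation: only one transaction at a time can hold
-- this named application lock (it is released at COMMIT or ROLLBACK).
DECLARE @result INT;
EXEC @result = sp_getapplock
    @Resource = 'InvoiceNumberGenerator',
    @LockMode = 'Exclusive',
    @LockTimeout = 5000;

IF @result >= 0
BEGIN
    DECLARE @NextNumber INT;
    SELECT @NextNumber = ISNULL(MAX(InvoiceNumber), 0) + 1
    FROM dbo.Invoices;   -- hypothetical table

    INSERT INTO dbo.Invoices (InvoiceNumber) VALUES (@NextNumber);
    COMMIT TRANSACTION;
END
ELSE
BEGIN
    ROLLBACK TRANSACTION;  -- the lock could not be acquired in time
END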

2. Which isolation level provides the highest accuracy in multi-statement, read-only operations—Snapshot or Serializable?

The Snapshot isolation level provides the highest accuracy when performing multi-statement, read-only operations. A transaction using the Serializable isolation level does not prevent other transactions from modifying a data set before the Serializable transaction reads it. In addition, a Serializable transaction attempts to use the current version of rows. Therefore, a multi-statement, read-only operation can generate inconsistent results. In contrast, when you use the Snapshot isolation level, the results of the operation will be consistent, corresponding to a well-defined point in time. This point in time is the moment when SQL Server assigns the transaction sequence number (TSN) to the Snapshot transaction.

3. What are the drawbacks of using row versioning isolation levels?

Creating and managing row versions increases costs in disk storage, memory, and processing. SQL Server stores row versions in tempdb. In addition, to select the appropriate row version, SQL Server must scan the version chain until it finds the required row version.


Considerations for Offloading Concurrent Activities

Introduction

In some cases, a good database design, appropriate isolation levels, and a good transaction strategy are not sufficient to achieve the required level of performance. In such cases, you must make other design decisions, such as whether to offload some concurrent activities to another server. Offloading concurrent activities improves performance by reducing the contention for data between conflicting processes. However, you might need to maintain multiple copies of data, and this can greatly increase the administrative overhead of your solution.

Considerations for offloading activities

You should consider offloading concurrent activities when you want to achieve the following goals:

■ Avoid contention and blocking problems. Two different activities running at different times or on different SQL Server instances cannot block each other.

■ Improve system performance. By running fewer processes on an instance of SQL Server, you can reduce the number of resources and the time required to run these processes, resulting in improved performance.

Methods for offloading activities

You can offload concurrent activities by scaling out your system, creating a data warehouse, or rescheduling activities for a different time:

■ Scale out your system. Consider scaling out your system if your primary concurrency conflicts are caused by lack of adequate system resources such as memory, disk space, or CPU time. Running some processes on another server reduces the workload of the existing server and can improve the overall system performance.

■ Create a data warehouse. If you have an online analytical processing (OLAP) application querying an online transaction processing (OLTP) system, consider creating a data warehouse with its own copy of the data. Data warehouses are better suited for OLAP queries than are OLTP databases.

■ Reschedule activities. Consider rescheduling activities in the following situations:

● You do not require an immediate response. Some activities can be scheduled to run at off-peak hours.

● You can divide a process into smaller processes. When you divide a process into smaller processes, you can sometimes run the first process immediately and schedule the other processes to run at a later time. For example, suppose that your system needs to process a customer order when it is placed and then send a message to the procurement department if the related inventory drops below a defined minimum level. You can process the customer order immediately but defer checking the inventory level and sending the message until later.


Guidelines for Accessing Data Across Multiple Data Stores

Introduction

Enterprise applications often require access to multiple SQL Server instances. Accessing distributed data can impose a considerable burden on overall system performance. This topic provides guidelines for accessing data across multiple data stores while minimizing the effect on overall performance.

Accessing data across multiple data stores

Follow these guidelines when designing a solution that needs to access data across multiple data stores:

■ Determine where query processing occurs. Consider the following items when determining the best place to perform query processing logic:

● Data location. Typically, you should process a distributed query on the server where most of the data is stored. When you minimize network traffic, the query performs better. If the WHERE clause of the query filters a large data set, consider using the OPENQUERY function to create a pass-through query to fetch the filtered data from the remote server. (A sketch appears at the end of this topic.)

● Server workload. If the server where most of the data is stored is heavily loaded, consider running the distributed query on a different server.

● Client applications. You should not use a connection from an application to a server to run pass-through queries for another server. The application should connect directly to the server on which the data is stored. In some cases, you can perform some processing on the application side rather than on the data side. For example, you should consider application-side processing for operations that require cursors on small data sets, operations that require formatted results, or operations that cannot be performed by using set-oriented processing. Also note that you might not be able to run a distributed query from within SQL Server if the transaction involves other database systems; you must perform the distributed query directly from the application.


■ Plan for unavailable data stores. When your applications access data from multiple systems, data stores can sometimes be unavailable. Consider the following options when planning an availability strategy:

● Retrying transactions. Add logic to your application to retry a transaction if it fails the first time. In some cases, the transaction will succeed on the second attempt. For example, your application might try to perform a transaction just as a cluster node fails over. If the application retries the transaction, the failover node might be operational.

● Using Service Broker. Use Service Broker as a failover mechanism for distributed transactions. Service Broker does not require all participating systems to be available simultaneously.

Note If you implement a Service Broker solution, you must design and implement compensating transactions.

● Storing changes locally. Store changes locally when a system is unavailable, and send those changes later when the system is back online. However, you should consider this approach only if conflicting updates are unlikely to occur.

■ Consider security across the system. You must protect data from unauthorized access and ensure the privacy of sensitive data across the entire system. When planning security, consider the following questions:

● Which credentials are forwarded from one server to another?

● How do servers authenticate to each other?

● What Microsoft Windows® identities are used for middle-tier applications?

● Should data be encrypted during transit?

Note Module 2 of this course provides more information about security considerations. Course 2787: Designing Security for Microsoft SQL Server 2005 provides a complete overview of security.
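As a sketch of the pass-through approach mentioned in the Data location guideline above, the following query filters the data on the remote server before it crosses the network. RemoteSQL is an assumed linked server name, and the table and columns are hypothetical.

-- The query inside OPENQUERY executes entirely on the linked server, so
-- only the filtered rows are returned across the network.
SELECT CustomerID, OrderTotal
FROM OPENQUERY(RemoteSQL,
    'SELECT CustomerID, OrderTotal
     FROM Sales.Orders
     WHERE OrderDate >= ''20060101''');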


Lesson 3: Designing a Resilient Transaction Strategy

Lesson objectives

After completing this lesson, students will be able to:

■ Describe best practices for designing a resilient transaction strategy.
■ Apply the guidelines for designing an object access strategy.
■ Apply the guidelines for performing transaction rollback.
■ Evaluate the considerations for designing large transactions.
■ Apply the guidelines for using hints.

Introduction

A good transaction strategy must be resilient and efficient, and it must ensure consistency. You must design your transactions to minimize the scope for errors and provide error-handling logic. This lesson describes the best practices, guidelines, and considerations for designing a resilient transaction strategy. You will also learn how to evaluate whether your transactions adhere to a specified object access strategy.


Discussion: Best Practices for Designing a Resilient Transaction Strategy

Introduction

A resilient transaction strategy minimizes errors, improves performance, and increases concurrency. In this topic, you will discuss several options for designing a resilient transaction strategy.

Discussion questions

Review the first question, and then discuss it with the rest of the class. For the remaining questions, you will break into small groups, discuss each question with the group, and then discuss the question with the class as a whole. Your instructor will guide you through this process for each of the following questions:

1. How should you design transactions to avoid errors?

Possible answers include:

■ Avoid deadlocking errors by enforcing a consistent order in which applications access objects.

■ Perform a logical delete against a table instead of a physical delete. To perform a logical delete, set the delete flag in the row to false rather than actually deleting the row. This enables you to logically delete the row without having to lock all parent and child rows.

■ Prevent ad hoc query access to the database, and require that all applications access tables by using stored procedures.

2. How should you handle exceptions?

Possible answers include:

■ Design your application to retry operations interrupted by a system exception, such as a deadlock, without returning an error message to the user.

■ Build an error-handling routine that returns a user-friendly message to the user and allows the user to decide how to resolve the issue.

■ Roll back the transaction, and log it in an exception table for analysis at a later time.


3. What process should you follow for each occurrence of an error?

Use the following process for each occurrence of an error:

1. Design a recovery process that automatically corrects the error rather than informing the user about the error.

2. For errors that cannot be fixed automatically, return a user-friendly error message to the application so that the user can take the appropriate action.

3. Add error logging to capture the circumstances that caused the error.

4. Analyze the error logs to determine the root cause of repeated or related exceptions.

5. Utilize the analysis performed in step 4 as input for future application enhancements.

4. What steps can you take to avoid transaction rollback caused by application errors?

Answers will vary. One of the most common reasons for a rollback is that the transaction violates a validation rule. By moving simple validation (such as data formatting, time constraints, and list membership) to the application layer, you can validate data before it is submitted to SQL Server.

5. When should you consider using Service Broker as part of your resilient transaction strategy?

Consider using Service Broker when your transactions do not need to be processed immediately. For example, an order submission process can include many steps, some of which are manual. In this case, you should run only those parts of the process that must run immediately. For those parts of the process that can wait or that are dependent on other activities, you can use Service Broker to facilitate asynchronous processing.

You can also use Service Broker to communicate with systems that are only occasionally connected or that are unreliable. Service Broker includes built-in features that ensure the reliable delivery of messages even if it must wait for application availability. In addition, Service Broker can queue up transactions when the server receives them faster than it can process them.


Guidelines for Designing an Object Access Strategy

Introduction

A resilient transaction must avoid errors and deadlocks that can result in rollbacks. One way that you can avoid deadlocks is to design an effective object access strategy. A well-planned strategy ensures that all transactions access objects in the same order. This topic provides guidelines that you should follow when planning your strategy.

Designing an object access strategy

When designing an object access strategy, use the following guidelines:

■ Design an object access order based on usage. Whenever possible, define the order in which applications should access your objects. For example, consider a database that includes the Customers, Orders, OrderDetails, and Shipments tables. An application accesses the Customers table more frequently than the other tables and accesses the Shipments table least frequently. As a result, you should specify that transactions access the Customers table first and the Shipments table last.

■ Manage conflicts when accessing objects. In some cases, you cannot design the object access order based on usage. In such cases, you should consider the following options:

● Design transactions to be as short as possible. Define transactions to access the least amount of data necessary, and ensure that you have defined effective indexes to reduce the time required to locate this data.

● Use logical deletes instead of physical deletes. If you physically delete data, you should do so in a specific order. For example, if you delete a customer, you should first delete rows for any orders for that customer. This will reduce the scope for conflict and the need for the database engine to roll back your work if you attempt to break referential integrity. However, you can also use logical deletes to prevent this type of error. A logical delete relies on a flag that can be set for each row in a table. Rather than physically deleting the row, you set the flag to False, indicating that it has been deleted. Then you use the flag value when querying the table to retrieve only those rows whose flags are set to True. (A sketch appears at the end of this topic.)

■ Communicate the object access order to the development team, and enforce compliance. Along with an object access strategy, you should also implement quality assurance processes that enforce compliance.

Note When communicating the object access order to other developers, explain the database structure, the projected usage statistics, and the rationale for the object access strategy. You should also document how developers can request exceptions to this strategy.

■ Discourage ad hoc queries. Users executing ad hoc queries introduce a high level of unpredictability within a database. You should discourage their use wherever possible. If users must perform their own queries for reporting or analysis purposes, consider copying the data to a separate database designed to support OLAP processing and granting users access to this database instead.
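As an illustration of the logical delete guideline above, the following Transact-SQL sketch adds a flag column to a hypothetical Sales.Customers table and exposes only undeleted rows through a view. The names are illustrative.

-- Add a flag column; existing rows default to "active".
ALTER TABLE Sales.Customers
ADD IsActive BIT NOT NULL CONSTRAINT DF_Customers_IsActive DEFAULT (1);
GO

-- A "delete" only flips the flag, so parent and child rows stay untouched.
UPDATE Sales.Customers
SET IsActive = 0
WHERE CustomerID = 'ALFKI';
GO

-- Applications read through a view that filters out logically deleted rows.
CREATE VIEW Sales.ActiveCustomers
AS
SELECT CustomerID, CompanyName, Address, City
FROM Sales.Customers
WHERE IsActive = 1;
GO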


Practice: Determining Whether Transactions Adhere to Object Access Strategies

Introduction

In this practice, you will evaluate a set of transactions to determine whether they adhere to the object access strategy. After you have completed the exercises, your instructor will divide the class into small groups, and each group can discuss the questions at the end of the practice.

Create the PracticeDB database

This practice uses the PracticeDB database. You must create this database by using the following steps:

1. Start SQL Server Management Studio. Connect to the MIA-SQL\SQLINST1 server using Windows authentication.

2. On the File menu, point to Open, and then click File. Open the file PracticeDB.sql in the E:\Practices folder.

3. Connect to the MIA-SQL\SQLINST1 server using Windows authentication when prompted.

4. Execute the script.

5. In the Object Explorer window, expand the Databases folder and verify that the PracticeDB database has been created.

6. Leave SQL Server Management Studio running to perform the rest of the practice.

Scenario

You have developed an object access strategy for an application that uses data in the PracticeDB database. Your strategy defines the following object access order:

1. Customers
2. Employees
3. Categories
4. Products
5. Orders
6. OrderDetails


For this practice, you will evaluate the following three transactions. Concurrent applications frequently perform these transactions:

Note The code for these transactions is available in the file Practice.sql in the E:\Practices folder. The statement numbers included in the comments are for reference purposes only.

Transaction A (inserting an order)

USE PracticeDB
GO
/*1*/ BEGIN TRANSACTION
/*2*/ DECLARE @UnitPrice MONEY
/*3*/ SELECT @UnitPrice = UnitPrice
      FROM Sales.Products
      WHERE ProductID = 1
/*4*/ DECLARE @OrderID INT
/*5*/ INSERT INTO Sales.Orders
      ( CustomerID, EmployeeID, OrderDate, RequiredDate, ShippedDate,
        Freight, ShipName, ShipAddress, ShipCity, ShipRegion,
        ShipPostalCode, ShipCountry )
      SELECT 'ALFKI' AS CustomerID,
             1 AS EmployeeID,
             GETDATE() AS OrderDate,
             DATEADD(dd, 15, GETDATE()) AS RequiredDate,
             NULL AS ShippedDate,
             36.71 AS Freight,
             CompanyName AS ShipName,
             Address AS ShipAddress,
             City AS ShipCity,
             Region AS ShipRegion,
             PostalCode AS ShipPostalCode,
             Country AS ShipCountry
      FROM Sales.Customers
      WHERE CustomerID = 'ALFKI'
/*6*/ SELECT @OrderID = SCOPE_IDENTITY()
/*7*/ INSERT INTO Sales.OrderDetails
      ( OrderID, ProductID, UnitPrice, Quantity, Discount )
      SELECT @OrderID, 1, UnitPrice, 2, 0
      FROM Sales.Products
      WHERE ProductID = 1
/*8*/ COMMIT

Transaction B (modifying the quantity in an order)

/*1*/ BEGIN TRANSACTION
/*2*/ DECLARE @OrderID INT
/*3*/ SET @OrderID = 2
/*4*/ DECLARE @ProductID INT
/*5*/ SET @ProductID = 1
/*6*/ DECLARE @NewQuantity INT
/*7*/ SET @NewQuantity = 12
/*8*/ DECLARE @PreviousQuantity INT
/*9*/ SELECT @PreviousQuantity = Quantity
      FROM Sales.OrderDetails
      WHERE OrderID = @OrderID AND ProductID = @ProductID
/*10*/ UPDATE Sales.OrderDetails
       SET Quantity = @NewQuantity
       WHERE OrderID = @OrderID AND ProductID = @ProductID
/*11*/ UPDATE Sales.Products
       SET UnitsInStock = UnitsInStock - @NewQuantity + @PreviousQuantity,
           UnitsOnOrder = UnitsOnOrder + @NewQuantity - @PreviousQuantity
       WHERE ProductID = @ProductID
/*12*/ COMMIT

Transaction C (modifying the price of a product)

/*1*/ BEGIN TRAN
/*2*/ DECLARE @percentIncrement MONEY
/*3*/ SET @percentIncrement = 5
/*4*/ DECLARE @ProductID INT
/*5*/ SET @ProductID = 1
/*6*/ DECLARE @NewUnitPrice MONEY
/*7*/ UPDATE Sales.Products
      SET @NewUnitPrice = UnitPrice = UnitPrice * (1 + @percentIncrement/100)
      WHERE ProductId = 1
/*8*/ UPDATE OD
      SET UnitPrice = @NewUnitPrice
      FROM Sales.OrderDetails OD
      JOIN Sales.Orders O ON OD.OrderID = O.OrderID
      WHERE O.ShippedDate IS NULL AND OD.ProductID = @ProductID
/*9*/ COMMIT

Practice Tasks

1. For each transaction, write the name of the objects used in the order in which they are accessed.

Transaction    Accessed objects
A
B
C

Answers:

Transaction    Accessed objects
A              1. Products
               2. Customers
               3. Orders
               4. OrderDetails
B              1. OrderDetails
               2. Products
C              1. Products
               2. Orders or OrderDetails

2. In the following table, write Yes or No in the second column to indicate whether the transaction follows the accepted object access order.

Transaction    Follows the established object access order?
A
B
C

Answers: A = No; B = No; C = Yes


3. Are any of the transactions likely to cause deadlocks? If so, which transactions might cause deadlocks, and which statements might produce those deadlocks? If not, why not?

Transactions B and C are likely to produce deadlocks if other transactions are also running. The statements causing deadlocks are statements 10 and 11 in Transaction B, and statements 7 and 8 in Transaction C.

4. How would using the Snapshot isolation level affect the transactions in this scenario?

Snapshot isolation might cause problems because Transaction B is updating the same rows as Transaction C. For example, suppose that changing the quantity in an order results in a pricing adjustment (as a result of a quantity discount or special offer scenario). The price update in Transaction B might override the pricing adjustment and charge the customer the wrong price.

Although Snapshot isolation can be a useful tool, you must take into consideration the implications of using a non-blocking mechanism to manage concurrency. In many situations, locking might be preferred over row versioning.

Discussion questions

Review the following questions with your small group and then discuss your answers with the rest of the class:

1. Would you consider changing the object access strategy? If so, what would you change? If not, why not?

There are several possible reasons to change the object access strategy:

■ The strategy is not appropriately defined, such as when it becomes difficult to write efficient transactions or ensure consistency.

■ New business requirements change the way a system is utilized.

■ User access patterns change significantly, causing bottlenecks in the system.

2. Transaction B reads information about an order and then updates it. Can this transaction cause inconsistency problems? If so, how can you solve them? If not, why not?

Yes, the transaction can cause problems. For example, assume that Transaction B runs under the default isolation level. This means that the transaction does not prevent other transactions from modifying the order information before Transaction B updates it. This can result in incorrect stored data (an incorrect number of units in stock or units on order).

Using the Repeatable Read or Serializable isolation level can solve the inconsistency problem because these isolation levels maintain a shared lock on the order information, thereby preventing other transactions from modifying the row. However, using these isolation levels can cause deadlocking when two connections attempt to modify the same order quantity simultaneously.

Using the Snapshot isolation level can solve the inconsistency problem because the concurrency conflict will be detected; that is, if another transaction modifies the same order information, SQL Server raises a concurrency conflict error and rolls back the transaction.

Using a locking hint such as XLOCK or UPDLOCK in the read operation is a better solution than using Snapshot isolation because the hint prevents row modifications. As a result, SQL Server does not roll back the transactions.

The best solution is to use the OUTPUT clause to read the previous quantity at the same time that the quantity is updated. In this way, there is no opportunity for other transactions to modify the order information because it is locked exclusively by the update operation. In addition, this approach simplifies the code because only one statement is necessary to read and update the row.
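A minimal sketch of the OUTPUT approach described in the answer above, using the variables from Transaction B; the read of the previous quantity and the update happen in a single statement:

DECLARE @OrderID INT, @ProductID INT, @NewQuantity INT;
SELECT @OrderID = 2, @ProductID = 1, @NewQuantity = 12;

-- Capture the pre-update quantity while the row is exclusively locked.
DECLARE @Previous TABLE (PreviousQuantity INT);

UPDATE Sales.OrderDetails
SET Quantity = @NewQuantity
OUTPUT deleted.Quantity INTO @Previous
WHERE OrderID = @OrderID AND ProductID = @ProductID;

-- @Previous now holds the quantity as it was before the update, with no
-- window for another transaction to change the row in between.
SELECT PreviousQuantity FROM @Previous;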


Guidelines for Transaction Rollback

Introduction

To maintain a database in a consistent state, an application must sometimes roll back a transaction that encounters an error. However, a rollback is a costly operation, and you should avoid rollbacks as much as possible. Even so, there are times when rollbacks cannot be avoided. This topic provides information about how you can define a strategy to minimize the occurrence of rollbacks and to prepare for those situations in which they do occur.

Planning a rollback strategy

Use the following guidelines when planning for transaction rollback:

■ Minimize errors. Errors cause rollbacks. Each error you eliminate can reduce the number of rollbacks.

■ Pre-validate data. You can eliminate many situations that cause rollbacks by validating data in different layers of your system. However, moving validation processes to the application layer can create additional overhead for the application. One solution is to divide the validation process among the layers. For example, the application layer should apply any special formatting and data entry validation (such as checking that the user enters the correct type and range of data), and the data layer can perform more complex business validations, such as checking credit card numbers.

■ Monitor for rollbacks. Design client applications to monitor for rollbacks. If rollbacks occur, take the appropriate actions.

■ Define a rollback strategy. Define a rollback strategy in collaboration with application developers. The strategy should define the actions that the system must take if a rollback occurs. These actions might include the following:

● Rerunning the transaction. In some cases, a transaction fails because of a cluster failover, mirror failure, or deadlock. In situations such as these, the transaction will often succeed on the second attempt. (A retry sketch appears at the end of this topic.)

● Logging rollback information. Record as much information about the rollback as possible. You can then use this information to determine the root cause of the error later.

Note If you do not log information about errors, you undermine the process of designing an effective rollback strategy.

■ Inform end users when a transaction is rolled back. If a transaction is rolled back, the application should inform the user about the situation and give the user a chance to correct the error. However, in cases in which the application can retry the transaction automatically and correct the situation, you do not need to notify the end user of the error.

Note Giving the user the opportunity to correct an error should be part of a larger error-handling strategy. The user must also be able to understand the error messages that your application displays, and these messages must be relevant.
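The rerun guideline above can be implemented with the TRY...CATCH construct introduced in SQL Server 2005. The following is a minimal sketch that retries only when the transaction was chosen as a deadlock victim (error 1205); the transactional work itself is a placeholder.

DECLARE @retries INT;
SET @retries = 3;

WHILE @retries > 0
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;
        -- ... the transactional work goes here ...
        COMMIT TRANSACTION;
        BREAK;  -- success: leave the retry loop
    END TRY
    BEGIN CATCH
        IF XACT_STATE() <> 0
            ROLLBACK TRANSACTION;

        IF ERROR_NUMBER() = 1205 AND @retries > 1
            SET @retries = @retries - 1;  -- deadlock victim: try again
        ELSE
        BEGIN
            -- Any other error (or exhausted retries): rethrow for logging.
            DECLARE @msg NVARCHAR(2048);
            SET @msg = ERROR_MESSAGE();
            RAISERROR(@msg, 16, 1);
            BREAK;
        END
    END CATCH
END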


Considerations for Designing Large Transactions

Introduction

Some business needs, such as end-of-period operations, require large transactions. You must design these transactions carefully; otherwise, they can negatively affect system performance and cause concurrency issues.

Designing large transactions

When designing large transactions, you should take into account the following considerations:

■ Impact on system performance. Large transactions can affect your system for several reasons:

● They hold locks for longer periods of time, potentially blocking other concurrent transactions.

● They require more server resources.

● They require more space in the transaction log.

■ Cost associated with rollbacks. Large transactions can result in more rollbacks because of the increased amount of affected data. If possible, use savepoints to partially roll back transactions and reduce the need to completely roll back. (A savepoint sketch appears at the end of this topic.)

Handling large transactions

You can use the following strategies to reduce the impact of large transactions:

■ Split large transactions into smaller ones. If possible, split large transactions into smaller ones. Smaller transactions can reduce or eliminate performance and concurrency issues.

■ Schedule large transactions to run at off-peak hours. If you cannot avoid a large transaction, you should see if it is possible to execute it at a time when it is less likely to cause contention and affect other users.
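The savepoint technique mentioned above looks like the following sketch. The tables are hypothetical; the point is that a failure in the optional step does not undo the main work.

BEGIN TRANSACTION;

-- Main work that must succeed or fail as a unit.
UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountID = 1;
UPDATE dbo.Accounts SET Balance = Balance + 100 WHERE AccountID = 2;

-- Mark a point that a partial rollback can return to.
SAVE TRANSACTION BeforeAudit;

BEGIN TRY
    INSERT INTO dbo.AuditTrail (AccountID, Amount) VALUES (1, -100);
END TRY
BEGIN CATCH
    -- Undo only the audit step; the balance updates remain pending.
    ROLLBACK TRANSACTION BeforeAudit;
END CATCH

COMMIT TRANSACTION;  -- the main work is committed either way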


Guidelines for Using Locking Hints

Introduction

You might encounter situations in which you need finer control over locking behavior than that provided by isolation levels. Locking hints provide this control. However, you should use them carefully, as overriding the locking strategy used by SQL Server can have a detrimental impact on performance. This topic provides guidelines for using locking hints to control data access, to avoid deadlocks, and to use lower isolation levels.

New SQL Server 2005 locking hints

SQL Server 2005 includes new locking hint functionality:

■ XLOCK at row level. In earlier versions of SQL Server, you can specify an XLOCK hint only at the page and table levels. In SQL Server 2005, you can also specify an XLOCK hint at the row level, which helps to minimize the number of locked rows.

■ READPAST. You can use the READPAST locking hint to specify that the database engine does not read rows and pages locked by other transactions. Instead, the database engine skips past the rows and pages rather than blocking the current transaction. You can specify the READPAST locking hint only for Read Committed and Repeatable Read transactions. The hint causes both row-level and page-level locks to be skipped. (A sketch follows this list.)

■ READCOMMITTEDLOCK. You can use the READCOMMITTEDLOCK hint to specify that read operations behave the same as the Read Committed isolation level. When you use this hint, the database engine sets shared locks when reading the data and releases those locks when the read operation ends.
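A common use of READPAST is a table that acts as a work queue, where each reader should take the next unlocked row instead of blocking behind other readers. The following sketch assumes a hypothetical dbo.WorkQueue table.

BEGIN TRANSACTION;

-- UPDLOCK keeps the selected row reserved for this transaction; READPAST
-- makes the query skip rows already locked by other queue processors;
-- ROWLOCK keeps the locking granularity at the row level.
SELECT TOP (1) QueueID, Payload
FROM dbo.WorkQueue WITH (UPDLOCK, READPAST, ROWLOCK)
WHERE Processed = 0
ORDER BY QueueID;

-- ... process the row and then mark it Processed = 1 ...

COMMIT TRANSACTION;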

Using locking hints

You can use locking hints to implement the following functionality:

■ Provide finer locking control. Locking hints can provide more precise locking control than isolation levels.

■ Reduce deadlocking. Repeatable Read and Serializable transactions, which first read and then write, can escalate deadlocking. You can sometimes avoid deadlocking by using the appropriate locking hint in the read operation. You must specify a hint that is stronger than the one shared by other operations.

■ Lower the isolation level. Often, only a few statements in a transaction require a high isolation level. You can run the rest at lower levels. In these situations, you should consider running the transaction under a lower isolation level and then specifying locking hints for the statements that require higher isolation.

Important Use locking hints only when absolutely necessary. Isolation levels abstract the internal mechanism used for transaction management, which is why they are the preferred way to manage transactions. Isolation levels are part of the ANSI SQL standard, whereas locking hints are proprietary extensions within SQL Server. Locking hints add complexity to Transact-SQL code, which makes writing and maintaining the code more difficult. Using a locking hint can sometimes be indicative of a design flaw in your system.


Lab: Designing a Transaction Strategy for a SQL Server 2005 Solution

**************************************** Illegal for non-trainer use *************************************** Objectives

Introduction



■ Determine the database isolation level.
■ Determine the order for object access.
■ Design transactions.
■ Defend your transaction strategy.

Introduction

In this lab, you will design a transaction strategy that addresses the requirements of the Fabrikam, Inc., solution. The solution includes several processes initiated through stored procedures. You will analyze these stored procedures, identify any problems, and provide a solution to solve the identified problems. The lab includes four exercises, as follows:



■ Exercise 1. In this exercise, you will examine the requirements and determine the isolation level for the transactions. You will also decide whether to enable row versioning on the database.

■ Exercise 2. In this exercise, you will determine the object access order to avoid deadlocking problems.

■ Exercise 3. In this exercise, you will redesign several processes to solve the identified problems.

■ Exercise 4. In this exercise, you will present and justify your solution to the class.

Scenario

The Fabrikam, Inc., solution includes processes for performing the following operations:

■ Registering service usage
■ Assigning new telephone numbers
■ Managing contracts


Each process is implemented by using a stored procedure. Refer to the Stored Procedures.sql script file in the E:\Labfiles folder to view the procedure definitions. The following sections describe these processes in more detail.

The Fabrikam, Inc., solution is experiencing contention, performance degradation, and other transaction-related problems. To solve these problems, you must redesign the transaction strategy.

Registering service usage

Every time a customer uses one of the Fabrikam, Inc., services (such as making a phone call), the Fabrikam, Inc., wireless telephone system sends the following data to the head office:

■ PhoneNumber
■ ServiceID
■ Time
■ Duration
■ CalledPhoneNumber

When this data arrives at the head office, the Accounts application uses the RegisterServiceUsage stored procedure to insert one row into the ServicesUsage table. The application queries data from and inserts data into the ServicesUsage table at a very high rate. As a result, SQL Server is experiencing contention issues and is blocking other operations. However, this process is critical to Fabrikam, Inc., because it serves as the basis for creating customer bills. You must ensure that this process does not negatively affect other operations, and that other processes do not interfere with registering service usage.

Assigning new telephone numbers

When the Accounts application receives a request for a basic telephone service, it uses the AcceptBasicService stored procedure to assign a telephone number to the customer. Fabrikam, Inc., has a limited range of telephone numbers, so no telephone numbers can be wasted. Fabrikam, Inc., distributes the numbers among the regional offices. When the application runs the stored procedure, deadlocks occur if two or more basic services are accepted simultaneously for the same regional office. You must ensure that this process avoids deadlocking and allows more than one basic service to be accepted simultaneously for the same regional office.

Managing contracts

The Accounts application uses the following stored procedures to manage contracts:

■ AddNewServiceContract. This stored procedure adds new service contracts.
■ CancelContract. This stored procedure cancels service contracts.
■ GetContractStatus. This stored procedure provides reports on contract status.

The stored procedures are blocking each other. You must determine the appropriate isolation level to avoid such blocking.

Preparation

Ensure that the virtual machine for the computer 2781A-MIA-SQL-04 is running. Also, ensure that you have logged on to the computer. If you need to log on to the computer, use the following credentials:

■ Username: Student
■ Password: Pa$$w0rd


Exercise 1: Determining the Database Isolation Level

Introduction

In this exercise, you will determine the most appropriate database isolation level to use to meet the business requirements.

Create the Accounts database

The exercises in this lab use the Accounts database implemented by Fabrikam, Inc. Perform the following steps to create this database:

1. On the Start menu, click Command Prompt.
2. Move to the folder E:\Labfiles\Starter\Preparation.
3. Execute the script setup.bat.
4. Close the Command Prompt window.

Determining the database isolation level

Summary

1. Review the Registering service usage, Assigning new telephone numbers, and Managing contracts sections in the scenario.
2. Examine the existing database schema in the Accounts ERD.vsd file.
3. Analyze the definitions of the contract management stored procedures.
4. Implement the appropriate database isolation level.

Detailed Steps

1. Review the Accounts ERD.vsd file in the E:\Labfiles folder.
2. Open the Stored Procedures.sql script file in the E:\Labfiles folder, and identify any blocking problems.
3. Choose the appropriate isolation level for the database.
4. Execute the statements necessary to enable the selected isolation level in the Accounts database.
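If your analysis leads you to enable row versioning, the statements involved look like the following sketch. Which option to enable (if either) is precisely the decision this exercise asks you to make, so treat this as syntax reference, not as the answer.

-- Enable snapshot isolation so transactions can request it explicitly.
ALTER DATABASE Accounts SET ALLOW_SNAPSHOT_ISOLATION ON;

-- Or make Read Committed use row versioning database-wide.
ALTER DATABASE Accounts SET READ_COMMITTED_SNAPSHOT ON
WITH ROLLBACK IMMEDIATE;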


Exercise 2: Determining the Order of Object Access

Introduction

In this exercise, you will determine the most appropriate object access order to use in transactions to reduce or eliminate deadlocking problems.

Identify the objects accessed in each transaction, and determine an appropriate object access order

Summary

1. Analyze the definitions of the contract management stored procedures.
2. For each stored procedure, determine the order in which the objects are accessed.
3. Establish an appropriate object access order to reduce or eliminate deadlocking.

Detailed Steps

1. Review the Stored Procedures.sql script file in the E:\Labfiles\Starter folder.
2. For each stored procedure in the file, list the objects being accessed.
3. List the objects in the order in which they should be accessed to reduce or eliminate deadlocking. Take into account the objects accessed most frequently.


Exercise 3: Designing Transactions

Introduction

In this exercise, you will redesign the following two processes that are causing problems:

■ Service usage registration process. The ServicesUsage table receives a high number of queries and insertions. These two activities interfere with each other. You should propose ways to offload the insertion activity to off-peak hours.

■ Telephone number assignment process. The AcceptBasicService stored procedure implements this process. The stored procedure uses the Serializable isolation level as well as the MAX+1 technique for generating sequence numbers for telephone number records. The stored procedure can cause deadlock errors when two or more connections try to run the stored procedure simultaneously.

You should redesign the transactions used by these processes to meet the following business requirements:

■ The established object access order must be followed.
■ All errors must be handled by using a TRY…CATCH construct.
■ All errors must be registered in an ErrorLog table.
■ The client application must be notified about errors.

Creating objects to support general requirements

Summary

■ Create the objects necessary to support the error-logging requirements.

Detailed Steps

1. Create a table called ErrorLog to store the following data:
   ● Error number
   ● Error message
   ● Error procedure
   ● Windows user
   ● Time

2. Create a stored procedure called LogErrorAndReRaise. This stored procedure must register error information in the ErrorLog table and reraise the original error with as much fidelity as possible.
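A minimal sketch of these two objects follows. Column types and the re-raise technique are assumptions, not the official lab answer. In SQL Server 2005, RAISERROR cannot re-raise a system error number, so the sketch re-raises the original message text with the original severity and state.

CREATE TABLE dbo.ErrorLog
(
    ErrorLogID     int IDENTITY(1,1) PRIMARY KEY,
    ErrorNumber    int,
    ErrorMessage   nvarchar(4000),
    ErrorProcedure sysname NULL,
    WindowsUser    sysname,
    ErrorTime      datetime DEFAULT (GETDATE())
);
GO

-- Call this procedure from inside a CATCH block.
CREATE PROCEDURE dbo.LogErrorAndReRaise
AS
BEGIN
    DECLARE @Number int, @Message nvarchar(4000),
            @Severity int, @State int, @Procedure sysname;

    SELECT @Number    = ERROR_NUMBER(),
           @Message   = ERROR_MESSAGE(),
           @Severity  = ERROR_SEVERITY(),
           @State     = ERROR_STATE(),
           @Procedure = ERROR_PROCEDURE();

    INSERT INTO dbo.ErrorLog (ErrorNumber, ErrorMessage, ErrorProcedure, WindowsUser)
    VALUES (@Number, @Message, @Procedure, SUSER_SNAME());

    -- Re-raise with as much fidelity as RAISERROR allows.
    RAISERROR (@Message, @Severity, @State);
END;
GO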

Redesign the service usage registration process

Summary

■ Redesign the service usage registration process so that it interferes as little as possible with other operations in the database. Note that real-time availability of service usage information is not required.

Detailed Steps

1. Examine the RegisterServiceUsage stored procedure.
2. Choose an appropriate technique to meet the business requirements.
3. Write the Transact-SQL code needed to create the objects and databases to support the selected technique.

Redesign the telephone number assignment process

Summary

■ Redesign the telephone number assignment process to meet the following requirements:
   ● Avoid deadlocking problems.
   ● Allow more than one phone number assignment simultaneously.
   ● Avoid wasting phone numbers.

Detailed Steps

1. Examine the AcceptBasicService stored procedure.
2. Choose an appropriate technique to meet the business requirements.
3. Write the Transact-SQL code needed to create the objects and databases to support the selected technique.

Discussion questions

Review the following questions, and then discuss your answers with the rest of the class:

1. Describe another way to offload the service usage registration process.

Possible answers:

■ Set up an application to listen for incoming service usage information. When the information arrives, the application appends it to a text file, which is kept open for performance reasons. On a scheduled basis, the application bulk loads the text file into the ServicesUsage table.

■ Set up an application to listen for incoming service usage information. When this information arrives, the application stores it in a Service Broker queue. On a scheduled basis, Service Broker inserts the information from the queue into the ServicesUsage table. (A brief Service Broker sketch follows this list.)
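The Service Broker answer can be sketched as follows. All object names are illustrative, the XML payload is simplified, and the scheduled or activated procedure that drains the queue into the ServicesUsage table is omitted; this is one possible shape, not the lab's official solution.

-- One-time setup of the queuing objects.
CREATE MESSAGE TYPE ServiceUsageMessage VALIDATION = WELL_FORMED_XML;
CREATE CONTRACT ServiceUsageContract (ServiceUsageMessage SENT BY INITIATOR);
CREATE QUEUE ServiceUsageQueue;
CREATE SERVICE ServiceUsageService ON QUEUE ServiceUsageQueue (ServiceUsageContract);
GO

-- Enqueue one usage record instead of inserting it directly.
DECLARE @dialog uniqueidentifier;

BEGIN DIALOG CONVERSATION @dialog
    FROM SERVICE ServiceUsageService
    TO SERVICE 'ServiceUsageService'
    ON CONTRACT ServiceUsageContract
    WITH ENCRYPTION = OFF;

SEND ON CONVERSATION @dialog
    MESSAGE TYPE ServiceUsageMessage
    (N'<usage phone="555-0100" serviceid="1" duration="120" />');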

2. If you remove the requirement for simultaneous phone number assignment, what solution would you propose?

You can use application locks to serialize the process of assigning new telephone numbers, as the sketch below illustrates.
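A minimal sketch of the application-lock approach, assuming a transaction-owned lock and an illustrative resource name; the actual MAX+1 number-assignment logic is omitted.

BEGIN TRAN;

DECLARE @result int;

-- Only one session at a time can hold this exclusive application lock.
EXEC @result = sp_getapplock
     @Resource    = 'AssignPhoneNumber',
     @LockMode    = 'Exclusive',
     @LockOwner   = 'Transaction',
     @LockTimeout = 5000;

IF @result >= 0
BEGIN
    -- Safe to compute MAX(PhoneNumber) + 1 and assign the next number here.
    COMMIT TRAN;   -- committing releases the transaction-owned lock
END
ELSE
    ROLLBACK TRAN; -- could not acquire the lock within the time-out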


Exercise 4: Justifying the Transaction Strategy

Introduction

In this exercise, the instructor will choose some of you to present and justify your proposed transaction strategy.

Present and justify your transaction strategy

Summary

■ Explain the technical reasons for your proposed design decisions.

Detailed Steps

1. Present and justify the isolation level.
2. Present and justify the object access order.
3. Present and justify the design of the service usage registration process.
4. Present and justify the design of the telephone number assignment process.

Important After the discussion, shut down the virtual machine for the computer 2781A-MIA-SQL-04. Do not save the changes.

Module 5

Designing a Notification Services Solution

Contents:

Lesson 1: Defining Event Data                         5-2
Lesson 2: Designing a Subscription Strategy           5-8
Lesson 3: Designing a Notification Strategy           5-18
Lesson 4: Designing a Notification Delivery Strategy  5-23
Lab: Designing a Notification Services Solution       5-28



Module objectives

After completing this module, students will be able to:

■ Define event data and how this data will be stored.
■ Design a subscription strategy for a Notification Services solution.
■ Design a notification strategy.
■ Design a notification delivery strategy.

Introduction

In this module, you will learn the guidelines for and processes of designing a Notification Services solution. You will learn how to define event data and how to determine how this data will be stored. The module also provides you with the information you need to design a subscription strategy, a notification strategy, and a notification delivery strategy. You will execute a Notification Services solution to see its design in action.

This presentation shows an overview of a Notification Services solution, including the events, a subscription to an event, and the notification.

Tip To view this presentation later, open the Web page on the Student Materials compact disc, click Multimedia, and then click the title of the presentation.

Discussion questions

Read the following questions and discuss your answers with the class: 1. Why is it necessary to break down the Notification Services architecture into events, subscriptions, and notifications? Subscribers will want to select notifications based on individual circumstances. Events are the trigger, and notifications are the actual message. Splitting the process makes it more scalable and configurable.

2. Why does Notification Services not send all notifications in real time? Many notifications are not required instantly, and attempting to send them immediately would cause a massive load on the system.


Lesson 1: Defining Event Data

Lesson objectives

After completing this lesson, students will be able to:

■ Evaluate the considerations for choosing a data store.
■ Evaluate the considerations for designing an event schema.
■ Evaluate the considerations for implementing custom indexes for an event.
■ Evaluate the considerations for using event chronicles.

Introduction

Events are the trigger for notification generation. Events can be periodic or sporadic and can contain small or large amounts of data. Understanding the nature of your events is crucial for good design. As with every Notification Services component, events are stored in Microsoft® SQL Server™ tables, and general SQL Server optimization will apply to them as well.

Events might have different behaviors, from sporadic and non-time-critical events, such as new book publishing, to regular and time-sensitive events, such as delivering stock exchange information. Therefore, understanding event behavior in your solution is critical. Aspects such as event quantity, lifetime, and urgency, in combination with event data size, are key points for solution correctness with regards to event data.

In this lesson, you will look at considerations for choosing a data store for Notification Services data and then focus on considerations for event schemas, event indexing, and event chronicles.


Considerations for Choosing a Data Store

Introduction

When you deploy an instance of Notification Services, it creates the Notification Services database. You do not configure the tables of the Notification Services databases directly; instead, you define them by using the parameters of the Application Definition File (ADF). You will still need to configure the SQL Server instance for optimal performance, and you must consider several factors when using Notification Services. Each notification solution shows a different balance between workloads generated by event management, notification generation, notification formatting, and notification delivery. Therefore, you must evaluate all of these workloads at design phase.

Storing Notification Services data

Inside the ADF, the event class fields provide the schema for the event table in SQL Server. The design of this table is as important as that of any traditionally created table in SQL Server. The event table is likely to have several indexes, and this can increase the size of the table considerably. You should take care when considering indexing candidates, and you should consider the data type of indexed fields.

Notification Services handles events in batches. After a batch has been processed, Notification Services treats the event data as expired and later removes it. Sometimes you do not want it to remove this data. For example, this data might be required later to provide a version history. You can create event chronicle rules to load the data into event chronicle tables. These tables can increase the storage requirement considerably.

Notification Services will use tempdb heavily, so tempdb should have a large initial capacity to avoid the cost of resizing.

A Full recovery model will enable point-in-time restore operations. This will increase functionality but will result in larger log files. Log files should have a large initial size to avoid the cost of resizing. Locating the log files on a separate volume from the data files will reduce concurrent disk access and improve performance.

The data store should use Microsoft Windows® Authentication. Windows Authentication has many benefits over SQL Server Authentication, whereas there is no benefit from using SQL Server Authentication.

Estimating workload

Notification Services can place a heavy workload on server resources, and you must consider this at design time. You will find that in some cases, a single-processor system is not enough to cope with expected workload. Hence, a scale-up or scale-out strategy might be necessary.


Considerations for Designing an Event Schema

Introduction

Event data drives the entire notification process, so gathering enough event information to fulfill business requirements is critical. Events can arrive frequently and need processing quickly, so keeping information size to a minimum is essential. In this topic, you will consider how the event data schema relates to these competing requirements.

Event classes

When you define an application, Notification Services uses the event class properties to create the event tables, views, and procedures. Each event class defined in Notification Services becomes a table in the Notification Services application database, and the event class fields define the columns in the event tables. The more event classes there are, the more work the generator module must do to create notifications, so it is sensible to consolidate initial event classes as much as possible. You define event classes per event, but events might overlap and enable consolidation.

For each event, decide which values need capturing for your notifications. Keep this information to a minimum, and consider storing a reference to extended information. Conversely, the extra overhead of looking up extended information in other tables or databases needs balancing against the performance improvement of notifications. You define event classes through Application Definition Files (ADFs) or, programmatically, through Notification Services Management Objects (NMO).

Note A Notification Services database is like any other database, but Notification Services automatically manages it. When configuring a Notification Services application, you can choose to use an independent database or an existing database.
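To make the generated objects concrete, the following sketch submits a batch of events through the Transact-SQL procedures that Notification Services generates for an event class. The StockEvents class name, provider name, and field names are assumptions; the generated procedures follow the pattern NSEventBeginBatch<EventClassName>, NSEventWrite<EventClassName>, and NSEventFlushBatch<EventClassName>, so verify the exact names and parameters in your application database.

DECLARE @BatchId bigint;

-- Open a batch for the (hypothetical) StockEvents event class.
EXEC dbo.NSEventBeginBatchStockEvents
     @ProviderName = N'SQLProvider',
     @EventBatchId = @BatchId OUTPUT;

-- Write one event; there is one parameter per event class field.
EXEC dbo.NSEventWriteStockEvents
     @EventBatchId = @BatchId,
     @StockSymbol  = N'FAB',
     @StockPrice   = 23.75;

-- Close the batch so that the generator can process it.
EXEC dbo.NSEventFlushBatchStockEvents
     @EventBatchId = @BatchId;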


Considerations for Event Indexing

Introduction

Indexing will have contrasting effects on system performance. Indexes can dramatically speed up read operations while slowing down inserts, updates, and deletes. It is important to consider both types of queries to choose the optimal indexing strategy. If events fire quickly, there will be many inserts into the event table, which would normally suggest minimal indexing. On the other hand, fast event generation can benefit from indexing, as indexing speeds up event processing. For example, stock market price change notifications would have many events firing quickly. The data is time sensitive, so indexing is worthwhile because of the faster event processing. A weather data system might also have many inserts, but the data is less time sensitive and the columns are likely to be larger in size and quantity, causing indexing to be less essential and more costly. These conflicting requirements will require careful monitoring to ensure the optimal balance of insert performance against query performance.

Defining custom indexes

Notification Services creates indexes on the EventID and EventBatchID system columns. You can also create custom indexes on other columns by using the Application Definition File (ADF). Candidate columns for custom indexing will be those used in generator module queries—specifically, in JOIN and WHERE clauses. Covering indexes cover all columns required by a query. This type of index will improve the performance of a query, as the query can run entirely from the index, without accessing the underlying data.
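In the ADF, you declare a custom event index as a raw Transact-SQL CREATE INDEX statement inside the event class definition. The following statement is a sketch; the table and column names are illustrative, and the actual event table name depends on your event class definition.

-- A covering index for a generator query that filters on StockSymbol
-- and selects StockPrice.
CREATE INDEX IX_StockEvents_StockSymbol
ON dbo.StockEvents (StockSymbol)
INCLUDE (StockPrice);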

Analyzing performance

Any usage analysis performed before deployment can only simulate notification table content. You should therefore analyze performance once the system is in production to verify that the system functions in a timely manner. You can use the usual SQL Server performance analysis tools.


Considerations for Using Event Chronicles

Introduction

When event-driven subscriptions process event data, Notification Services removes old values from regular event tables. Therefore, if you need to access previous event values, or any other kind of persistent event information, you need another mechanism. Event chronicles provide this archiving ability.

Event chronicles

When scheduled subscriptions are processed, Notification Services sends the most recent data to the event chronicle tables. This allows historical data to be stored and can act as a filter for notifications. For example, a user might require a notification whenever there is a change or a notification every hour even if data remains unchanged. Event chronicle tables can provide this hourly data. They can also limit the number of notifications sent by checking the last notification time in the event chronicle table and sending a new notification only if a certain period of time has passed. The event chronicle tables should contain a timestamp column if version control is required and a datetime column if time-specific notification is required. You can create multiple event chronicle tables for each event table to avoid storage of dissimilar content in a denormalized structure.
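The following sketch shows the typical shape of an event chronicle rule that keeps the latest value per key: update rows that already exist in the chronicle, and insert the rest. The StockEvents view and StockEventChron table names and columns are assumptions for illustration.

-- Update chronicle rows for symbols that already exist.
UPDATE c
SET    c.StockPrice = e.StockPrice,
       c.EventTime  = e.EventTime
FROM   dbo.StockEventChron AS c
JOIN   dbo.StockEvents     AS e ON e.StockSymbol = c.StockSymbol;

-- Insert chronicle rows for symbols seen for the first time.
INSERT INTO dbo.StockEventChron (StockSymbol, StockPrice, EventTime)
SELECT e.StockSymbol, e.StockPrice, e.EventTime
FROM   dbo.StockEvents AS e
WHERE  NOT EXISTS (SELECT 1
                   FROM dbo.StockEventChron AS c
                   WHERE c.StockSymbol = e.StockSymbol);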


Lesson 2: Designing a Subscription Strategy

Lesson objectives

After completing this lesson, students will be able to:

■ Evaluate the considerations for designing a subscription strategy.
■ Evaluate the considerations for designing a subscription schema.
■ Evaluate the considerations for implementing chronicles for a subscription class.
■ Evaluate the considerations for designing a subscription rules strategy.
■ Evaluate the considerations for implementing subscription indexing.
■ Evaluate the considerations for designing a subscription management strategy.

Introduction

Users and applications specify which events they are interested in by using subscriptions. Notification Services uses event data and subscription data to decide whether it should send a particular notification to a specific user or application. Notification Services implements subscriptions by using database tables to store subscription data and by using rules to specify when a particular user or application should be notified of an event based on this subscription data. You can apply general SQL Server optimization principles to these tables, but an understanding of the nature of subscriptions is required to optimize your system effectively.

In this lesson, you will learn about the process of designing a subscription strategy and details about different aspects of that process, such as subscription schemas, subscription rules, indexing, subscription chronicles, and subscription management.


Considerations for Designing a Subscription Strategy

Introduction

Subscription-related issues, in conjunction with notification delivery, are the most complex components of Notification Services. Rules are critical in the overall solution performance, and the subscription design phase includes defining event rules and scheduled subscription rules, among other aspects. A good subscription strategy will ensure the correct subscription configuration and execution.

Designing a subscription strategy

You should consider the following issues when designing a subscription strategy:

■ Event-driven or scheduled subscriptions. Notification Services uses both event-driven and scheduled subscriptions. These subscriptions are different in nature and will require a different strategy. It is therefore important to decide which subscription types you will use based on business requirements.

■ Subscription load. You will need to understand the quantity and size of events, subscribers, subscriptions, and notification delivery channels to plan for the anticipated load of the subscriptions.

■ Additional information. The subscription schema holds the subscription information. Notification Services will generate some required fields, but knowledge of additional required information is necessary to plan the schema.

■ Historical information. If you need to store any kind of historical information about subscriptions, subscription chronicles will be required.

■ Rules usage. Notification Services uses rules to decide when it must notify a subscriber. Applying rules is the most critical Notification Services task, because general performance and notification generation depend on this step. Planning rules usage carefully is crucial.

■ Indexing. Each time a subscription rule fires, Notification Services queries the subscription views. These queries join tables and filter data based on WHERE clauses. Using an appropriate indexing strategy will improve system performance.

■ Subscription management. Subscriptions have a life cycle, from creation to obsolescence. A subscription strategy should consider subscription management.


Considerations for Designing the Subscription Schema

Introduction

Each application has one or more subscriptions, and Notification Services generates notifications by querying event and subscription tables. The subscription schema will hold the information about subscriptions. Notification Services will generate the basic required fields, but often these fields do not include enough detail, and additional information will be required. When you have decided what, if any, additional information is required, you should decide whether to store that information in the subscription schema. When you define an application, Notification Services uses subscription class properties to create the subscription tables, views, and procedures. The subscription class fields define columns in subscription tables; you create these fields by using the ADF or Notification Services Management Objects (NMO). In this topic, you will learn about considerations to apply when designing a subscription schema.

Designing the subscription schema

You should consider the following issues when designing a subscription schema:

■ Allowing user customization. Identify your subscription and decide what values you will enable users to provide in their subscriptions, for example, a city for a weather application. Some subscriptions will have no user customization and will simply enable a user to subscribe or not.

■ Specifying the locale. If your application contains only one locale, provide that locale in your Transact-SQL rule to generate notifications. If locale information is used, it can be selected by the user or taken from another source, such as a user profile table.

■ Identifying the device. If your application supports only a single device, provide that device in your Transact-SQL rule to generate notifications. If device information is used, it can be selected by the user or provided dynamically.


Considerations for Implementing Subscription Chronicles

Introduction

Subscription chronicles store data about previous notifications. Applications generating notifications can use subscription chronicles to determine whether a subscriber has already received a similar notification and, if so, when. For example, you could query subscription chronicles to limit the number of notifications that a subscriber receives within a given time period.

Subscription chronicles

The database developer must evaluate the business needs and identify the need for subscription chronicles as appropriate. Following is a list of situations in which you should consider using a subscription chronicle:

■ Storing a subscription history. You should use a subscription chronicle when you need to maintain some sort of archive or subscription history. For example, if you need to send notifications for value changes only if the change surpasses a certain threshold, you can store the last notified value in the subscription chronicle.

■ Timestamping the last notification. You should use a subscription chronicle when you need to know when the last notification was sent. For example, if a subscriber does not want to receive more than one notification per hour, you can check in the subscription chronicle the last time a notification was sent.

■ Archiving specific values. You should use a subscription chronicle when you need to archive specific values from the subscription table.

Unlike other subscription tables, Notification Services does not automatically rename subscription chronicle tables when you update an application. Therefore, you should skip or drop subscription chronicle tables when you update an application; otherwise, you will receive errors when the application update attempts to create the subscription chronicle tables.


Considerations for Designing a Subscription Rules Strategy

Introduction

Notification Services uses rules to decide when to notify a subscriber. Applying rules is the most critical Notification Services task, because general performance and final notification generation depend on this step. Planning rules usage carefully is critical to the performance of your application.

Designing a subscription rules strategy

You should consider the following issues when designing a subscription rules strategy:

■ Joining event data with subscription data. The primary purpose of subscription rules is to generate notifications by joining event data with subscription data, as the sketch after this list illustrates. Do not use subscription rules to update event or subscription tables.

■ Scheduling subscriptions. The load that notification events place on a system can be limited by scheduling subscription processing. However, business requirements do not always enable you to schedule notifications, so this is not always possible. For example, the business could enable subscribers to dictate when event notifications take place. The resource demands from this type of requirement could be limited by allowing users to specify a time frame for delivery rather than an exact time.

■ Avoiding event-driven subscriptions. You should avoid using event-driven subscriptions unless business requirements truly demand them and users need to receive notifications immediately. You have no control over how often events will arrive and, therefore, how much workload they will generate. Notification Services includes a safety mechanism that will skip notifications to keep system processing up to date, and if the workload thresholds are surpassed, you will lose notifications.

■ Filtering by using condition actions. If you need to enable subscribers to filter by an event field, use condition actions. Use condition actions only when action rules, in combination with parameters, are not able to express the filter condition. Condition actions have an impact on performance and require the creation of a login and the granting of permissions to that login.

■ Testing. Have a testing and troubleshooting strategy that includes using the provided stored procedures.
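The sketch below shows the typical shape of an event-driven rule action: an INSERT into the notification view built from a join of the event and subscription views. All view and column names are hypothetical; Notification Services generates the real names from the event, subscription, and notification classes defined in your ADF.

INSERT INTO dbo.StockNotifications
       (SubscriberId, DeviceName, SubscriberLocale, StockSymbol, StockPrice)
SELECT s.SubscriberId,
       s.DeviceName,
       s.SubscriberLocale,
       e.StockSymbol,
       e.StockPrice
FROM dbo.StockEvents AS e
JOIN dbo.StockSubscriptions AS s
    ON s.StockSymbol = e.StockSymbol
WHERE e.StockPrice >= s.TriggerPrice;   -- subscriber-supplied threshold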


Considerations for Implementing Subscription Indexing

Introduction

Subscription and event tables are usually the largest tables in terms of the number of records. Hence, you might consider using indexes to speed up SQL statements operating on these large tables. Each time subscription rules execute, Notification Services queries the subscription views. These rule queries include JOIN and WHERE clauses, which benefit from appropriate indexing.

Designing subscription indexing

You should consider the following issues when designing a subscription indexing strategy:

■ Using indexes to speed up read operations. Notification Services automatically generates indexes on primary keys. Because subscription tables are queried by using nonprimary key columns as filters when subscription rules fire, the database developer must consider whether to add indexes to those filtering columns.

■ Analyzing subscription table usage. Subscription data is queried in several processes, such as JOIN and WHERE clauses when generating notifications, or in subscription rule queries. The subscription management application could query subscription data to check for previous identical subscriptions and avoid duplication. You should analyze column usage in these queries and evaluate whether indexes will speed up operations.

■ Using covering indexes for subscription rules. If subscription rules rely on several values to perform notification generation, consider adding all of the corresponding columns to the index. You can do this by creating a covering index or, alternatively, by including columns in the index (with the INCLUDE keyword). For example, in a stock exchange scenario, a subscriber wants to receive a notification when a certain stock reaches a certain value. Adding both the stock name and its value to the covering index will improve the subscription rule query performance. (A sketch follows this list.)

■ Using SQL Server tuning tools. The recommended process for tuning indexes is to use the tools provided by SQL Server, such as Profiler and Database Engine Tuning Advisor, in your particular system.
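For the stock example in the covering-index consideration above, such an index might look like the following sketch; the table and column names are assumptions.

-- Filter column first, with the remaining rule columns included so the
-- subscription rule query can run entirely from the index.
CREATE INDEX IX_StockSubscriptions_Symbol
ON dbo.StockSubscriptions (StockSymbol)
INCLUDE (TriggerPrice, SubscriberId, DeviceName, SubscriberLocale);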


Considerations for Designing a Subscription Management Strategy

Introduction

You should provide a mechanism to enable users and applications to manage their Notification Services subscriptions.

Subscription management applications

SQL Server 2005 provides the Notification Services subscription management application programming interface (API) to enable developers to define user-friendly interfaces and components for managing subscriptions. By using the Notification Services subscription management API, developers can build applications that enable users to easily query, add, update, and delete their subscriptions.

Depending on your requirements, you might find it useful to automate subscription management rather than relying on users to do so. For example, you might require that subscriptions expire after a certain date, or you might need to suspend or cancel subscriptions that cause frequent delivery problems. Your subscription management strategy should include mechanisms to deal with these situations.

The interface that you define to handle these situations can vary from scenario to scenario. Public subscribers are likely to require a Web-based interface, while corporate users might require functionality integrated into a graphical Windows-based application. Components that perform automated subscription management will not usually require a graphical front end and might simply run as part of a service or background process.


Practice: Designing a Subscription Management Strategy

Introduction

In this practice, you will analyze a notification solution from the subscription management point of view. You will identify the Notification Services components required by the Subscription Management System and consider the scaling-out possibilities of the Subscription Management System.

Scenario

An Internet content provider offers financial data alerts, including stocks, bonds, and funds information, for 600 exchanges in all time zones. It provides alerts by e-mail, text message, and Short Message Service (SMS). Multiple clients use the alerting system, with some clients having millions of subscribers. The system delivers about a million alerts per day. Alert volume is very heavy during market hours for the North American, European, and Asian markets, and especially heavy in the hour surrounding market opening and market closing for the NASDAQ/NYSE, FTSE, DAX, HSI, and NIKKEI exchanges.

The business rules require that the application should remove alerts that have not generated a notification in six months and where the subscribing user has not accessed the subscription Web site in six months. The application should notify the subscriber of the removal. Subscribers manage subscription enrollment and modifications, including removing their own alerts, through the Internet. However, an internal application that is not accessible from the Internet will perform the subscription purge.

1. List the Notification Services components required by a Subscription Management System for each of the business rules in the following table.

Business rule                                                              Component required

Alerts have not generated a notification in six months.
The subscribing user has not accessed the subscription Web site in six months.
The application removes the alerts from the system.
The subscriber will be notified of the removal.
Subscribers modify subscription enrollment and modifications through the Internet.
An internal application will perform the subscription purge.

The following table shows a possible solution.

Business rule: Alerts have not generated a notification in six months.
Component required: A subscription chronicle class is needed, containing the last notification delivery. If subscription chronicles are used, there should be a rule to update this information. Alternatively, you could adjust expiration ages in notification tables. Vacuuming should be configured and enabled.

Business rule: The subscribing user has not accessed the subscription Web site in six months.
Component required: A place where Web site user activity is stored, such as the user profile system. The Subscription Management System internal application will need to access this information.

Business rule: The application removes the alerts from the system.
Component required: The Subscription Management System internal application will use objects in the Notification Services API to remove subscriptions. Access subscription class.

Business rule: The subscriber will be notified of the removal.
Component required: This requirement suggests putting a notification solution in place if subscription turnaround is significant. Alert expiration and Web site access age can be handled as events by a custom event provider. Alternatively, e-mail notification is an option, because this notification is not urgent.

Business rule: Subscribers modify subscription enrollment and modifications through the Internet.
Component required: The Subscription Management System public application will use objects in the Notification Services API to manage subscriptions. Access subscription class.

Business rule: An internal application will perform the subscription purge.
Component required: The Subscription Management System internal application will use objects in the Notification Services API to remove subscriptions. Access subscription class.


2. List which candidate processes to scale out for the Subscription Management System within a Notification Services design in the following table.

Scale-out candidate                          Notes

Subscription removal application
Subscription removal notification
Users subscription management application

The following table shows a possible solution.

Scale-out candidate: Subscription removal application
Notes: This automated application can run on another system, although it is accessing the notification application database.

Scale-out candidate: Subscription removal notification
Notes: The notification can run on another system.

Scale-out candidate: Users subscription management application
Notes: This application should run on another system, although it is accessing the notification application database.

Discussion questions

Discuss the Notification Services components required by the Subscription Management System as compared to those required by the generator, the distributor, an event provider, and a nonhosted event provider.

Answers will vary. Use this question to engage discussion by looking at the provided scenario from the perspective of the notification application.


Lesson 3: Designing a Notification Strategy

Lesson objectives

After completing this lesson, students will be able to:

■ Describe the process of defining notifications.
■ Evaluate the considerations for designing the notification schema.
■ Evaluate the considerations for implementing notification indexing.
■ Apply the guidelines for designing notification delivery.

Introduction

Notifications are the actual content that subscribers will receive. Therefore, notifications must include relevant and timely information. Notifications are made up of tables and formatting procedures defined in Extensible Stylesheet Language Transformations (XSLT) files. General SQL Server optimization practices apply to the Notification Services tables, but having a specific understanding of the nature of notifications will help in choosing the appropriate design changes to apply to them.


The Process of Defining Notifications

Introduction

Setting up notifications involves several steps. Each step has different requirements. This topic summarizes those steps to provide a general understanding of the whole process.

The process of defining notifications

This process consists of several design phases:

■ Determine what type of notifications you will support. You should identify the kind of information subscribers require, how frequently the information is required, and subscribers’ requirements for accessing additional details.

■ Design the notification schema. Notification differs from other components in that accessing external information sources is more frequent. Locate these sources and decide how to access them.

■ Decide whether to use notification indexing. You can improve the performance of notifications by using appropriate indexing; inappropriate indexing will be detrimental.

■ Design the notification delivery strategy. After data generation has taken place, notification formatting and delivery complete the process. Delivery is the most common bottleneck and therefore requires an effective strategy.


Considerations for Designing the Notification Schema

Introduction

Notifications differ from events and subscriptions in that accessing external information sources is more frequent. However, accessing external information will have an impact on notification formatting, and therefore a balanced approach is desirable.

Specifying a filegroup

Notification generation and formatting is a disk-intensive process. You should consider moving notification data to a specific filegroup, and preferably to a separate physical disk, for system performance reasons. On more intensively used systems, you should consider a storage area network (SAN).

Considering computed fields

Consider using computed fields to enable the distributor to compute notification data immediately before passing it to the content formatter. Although this will add more processing requirements at distribution time, it might reduce processing at the client and will have lower storage requirements than storing pre-calculated values in the tables.


Considerations for Implementing Notification Indexing

Introduction

You can improve the performance of notification management by using an appropriate indexing strategy.

Indexing computed values

If you have a notification that includes computed values, consider indexing the corresponding columns in the table. These can be included as part of a covering index or by using the INCLUDE keyword.

Indexing large tables

If your notification table has a large number of records, consider indexing. The larger your table, the more it will benefit from indexing. Conversely, excessive indexes will have a much more noticeable effect on large tables, so the correct indexing strategy is essential as tables get larger.


Guidelines for Designing Notification Delivery

Introduction

After data generation has taken place, notification formatting and delivery will complete the process. Delivery is the most common bottleneck, and therefore requires an effective strategy.

Formatting for device and locale

Device and locale information is mandatory for content formatting. Consider what devices and locales your notification solution will use. Each device-locale combination will require a separate formatter file.

Third-party formatters

Instead of starting to develop your formatter from scratch, leverage existing formatters to speed up development times.

Using a provided delivery protocol

The standard delivery protocols consist of the File delivery protocol and the Simple Mail Transfer Protocol (SMTP) delivery protocol. The File delivery protocol is primarily for testing purposes; where possible, you should use the SMTP delivery protocol. If you want to output to a file, and especially if you want to use several files, consider using custom delivery protocols.

Adjusting batch size

You can deliver notifications in batches. Those batches can adopt a multicast form or a digest form:

■ Multicast form is efficient at speeding up delivery protocols, such as SMTP, especially when the same notification is sent to several subscribers, because the notification is formatted only once. If one system is distributing your notifications, you can increase batch size as much as your protocol allows. If you scale out distribution over multiple servers, a large batch size will reduce the ability to balance work among the distributors. Each distributor selects one batch at a time, so you should adjust the batch size to enable different distributors to assume equal parts of the workload.

■ Digest form is effective when the application generates several notifications for a single subscriber within the same notification batch. Digest delivery combines the notifications and sends them as a single message to the subscriber.


Lesson 4: Designing a Notification Delivery Strategy

Lesson objectives

After completing this lesson, students will be able to:

■ Apply the guidelines for choosing a delivery protocol.
■ Apply the guidelines for defining delivery protocol execution settings.
■ Apply the guidelines for defining application execution settings.

Introduction

This lesson covers both notification delivery and application execution settings. Up to now, design considerations have focused on the Notification Services components themselves; notification delivery instead concerns communications optimization rather than internal storage and optimization. The lesson also looks at application execution settings: you can adjust execution behavior by using several configuration mechanisms, and you will learn about the most relevant of them.


Guidelines for Choosing a Delivery Protocol

Introduction

In Notification Services, you can use the standard SMTP and File delivery protocols or create your own custom delivery protocol. Each approach has advantages in different situations.

SMTP delivery protocol

Using a standard delivery protocol will reduce development time, and SMTP is the most widely used delivery protocol. SMTP is widely supported; it routes notifications for delivery to a broad set of devices by using an SMTP service such as Microsoft Exchange. You can use multicasting to improve performance when multiple subscribers are receiving the same data.

File delivery protocol

You should normally use the File delivery protocol only for application testing purposes. This protocol produces a single text file with a plain text header. You should consider custom delivery protocols if you want to produce file output, as they can be much more functional.

Custom delivery protocols

Custom delivery protocols implement the message creation and transport abilities of a network protocol. Creating a custom delivery protocol involves implementing one of the two interfaces provided in Notification Services:

■ IHttpProtocolProvider. The IHttpProtocolProvider interface makes it simple to create Hypertext Transfer Protocol (HTTP) messages. Minimal code is required, and you can use IHttpProtocolProvider to connect to most Web services.

■ IDeliveryProtocol. You can use the IDeliveryProtocol interface to implement other protocols, such as File Transfer Protocol (FTP). You can also use it for HTTP when IHttpProtocolProvider does not provide the necessary configuration options.


Guidelines for Defining Delivery Protocol Execution Settings

Introduction

Communications systems will always have the possibility of failure, and you should anticipate this in your design. Notification Services provides several methods for dealing with failures.

Retry

When notifications have a delivery failure, you would normally want to retry delivery rather than give up. Many factors can cause delivery failures, such as busy networks, busy servers, or full inboxes, so a retry is quite likely to succeed even though you have not modified any properties. A retry schedule will have one or more retry delay values. The retry delay values specify the time delay before another delivery attempt. Retries will continue until they have used all retry delays, or until the notification has expired.

Fail

Failing a delivery is useful if the information is time sensitive and would be out of date if retried. You can also fail a delivery to prevent overloading a network with excessive retry attempts. The Windows application log holds failure information. You can configure the number of failures that can occur before they generate a log entry and also the minimum amount of time between entries. This is useful to prevent the application log from filling up or an overload of the server.

Time-out

When a work item arrives at the distributor, it calls the content formatter to process the notifications. If this process takes too long, it could prevent the distributor from processing other notifications, so you should define a work item time-out. At the configured value, the time-out will fail the notification or, if implemented to do so, retry it.


Guidelines for Defining Application Execution Settings

Introduction

Notification Services includes several application execution settings that control how your application functions. These include how often Notification Services processes data, how it processes events, the maximum amount of data it can send and receive, and how often it removes old data.

Quantum duration

The generator does not run continuously; instead, it fires at the end of each quantum. A quantum is a unit of time; the duration setting specifies the length of a quantum. The generator will process any event and subscription rules that are contained in the quantum. If the quantum duration is too short, the generator will consume too many server resources; if the duration is too long, subscribers will experience too much delay.

Quantum limits

Quantum limits balance the requirements of timely notifications against the value of processing old data. The generator clock uses quanta to measure time, and you can configure how many quanta behind real-time the generator is allowed to be. If rules fall beyond this threshold, the generator ignores them. For example, if a limit of two quanta is set and the generator is five quanta behind real-time, it will ignore the oldest three quanta and process only the most recent two quanta. Keep in mind that the generator does not report skipped events, but setting quantum limits enables it to process newer events in a more timely way.

Event processing order

There are two options for event processing order: subquantum sequencing processes event batches in order, whereas quantum sequencing processes all of the batches in a quantum as if they were one batch. Quantum sequencing is more efficient, but you should balance its improved performance against the strict data correctness of subquantum sequencing.
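Quantum duration, quantum limits, and the event processing order are all declared in the ApplicationExecutionSettings section of the ADF. The following sketch uses illustrative values; the element names assume the SQL Server 2005 Notification Services ADF schema:

<ApplicationExecutionSettings>
  <!-- Fire the generator once a minute -->
  <QuantumDuration>PT1M</QuantumDuration>
  <!-- Allow the generator to fall at most one day of quanta behind -->
  <ChronicleQuantumLimit>1440</ChronicleQuantumLimit>
  <SubscriptionQuantumLimit>1440</SubscriptionQuantumLimit>
  <!-- false selects quantum sequencing; true selects subquantum sequencing -->
  <ProcessEventsInOrder>false</ProcessEventsInOrder>
</ApplicationExecutionSettings>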

Performance query interval

Updating the Notification Services performance counters requires querying the application database and calculating the data. This consumes application database resources, which you can manage by changing the query interval. The longer the interval, the fewer resources are consumed, but the less up to date the counters will be.


Throttling

An application error or a malicious user with excessive privileges could insert a large number of spurious events or subscriptions, causing a denial of service. To handle denial-of-service situations, you can specify the maximum number of events per event class that the generator processes within a quantum period. If the number of events exceeds this limit, Notification Services writes an error to the Windows application log and stops all processing for that quantum. A value of zero turns off the event throttle. The default value of 1,000 might be too small; you should adjust this value according to the requirements of your application.
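Throttles are also declared in the ApplicationExecutionSettings section of the ADF. A hedged sketch, assuming the EventThrottle, SubscriptionThrottle, and NotificationThrottle element names from the SQL Server 2005 Notification Services ADF schema (the values are illustrative; 0 disables a throttle):

<ApplicationExecutionSettings>
  <!-- Stop processing a quantum if a class exceeds these counts -->
  <EventThrottle>5000</EventThrottle>
  <SubscriptionThrottle>5000</SubscriptionThrottle>
  <NotificationThrottle>5000</NotificationThrottle>
</ApplicationExecutionSettings>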

Logging

The Notification Services distributor logging options for delivery status and notification text are LogBeforeDeliveryAttempt, LogStatusInfo, and LogNotificationText. By default, they are all set to true, but before deploying an application you should consider changing some or all of them to false, because logging everything can consume too many resources and too much database space. Even if all of these options are set to false, the distributor still logs delivery failures.

Data removal configuration

Data accumulates in the event, notification, and distribution tables as well as in the control tables, and this can cause the database to grow too large and affect performance. To avoid this, you should configure an automatic data removal process called vacuuming. Notification Services disables vacuuming by default, but you can enable it by defining a data removal schedule.
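A data removal schedule is also defined in the ApplicationExecutionSettings section of the ADF. The following sketch, with illustrative values, assumes the Vacuum element structure from the SQL Server 2005 Notification Services ADF schema; it removes data older than three days during a nightly two-hour window:

<ApplicationExecutionSettings>
  <!-- other settings -->
  <Vacuum>
    <RetentionAge>P3D</RetentionAge>
    <VacuumSchedule>
      <Schedule>
        <StartTime>23:00:00</StartTime>
        <Duration>PT2H</Duration>
      </Schedule>
    </VacuumSchedule>
  </Vacuum>
</ApplicationExecutionSettings>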


Lab: Designing a Notification Services Solution

Introduction

In this lab, you will design a Notification Services solution and then discuss your results with the rest of the class.

Scenario

Fabrikam, Inc., manages new contract approvals and requests for changes of contract status at its central office. It needs a mechanism to notify customers of changes to their contracts, including approval, rejection, and whether they need additional information. Customers can choose whether to receive notification of these status changes by telephone text message or by e-mail.

Preparation

Ensure that the virtual machine for the computer 2781A-MIA-SQL-05 is running. Also ensure that you have logged on to the computer. If you need to log on to the computer, use the following credentials:

■ Username: Student

■ Password: Pa$$w0rd


Exercise 1: Defining Event Data

Introduction

In this exercise, you will design the event class required to meet the business requirements, and define its schema. You will also write code to handle an event chronicle for storing the history of this event.

Design the event class and fields

Summary

1. Design the event class needed to notify customers of changes to their contracts.
2. Identify the event fields needed by this class.

Detailed Steps

1. You need to track events concerning changes to the contract status. You only need to create a single event class. Name this class AccountContractEvents.
2. Examine the schema for the Accounts database in the Accounts ERD.vsd file, and identify the fields needed for the event by analyzing the definition of the Contracts table.
3. Record your design decisions in a Microsoft Office Visio® diagram.

Write code to store an event chronicle

Summary

■ Write the Transact-SQL code for an event chronicle to store the event history for the AccountContractEvents event.

Detailed Steps

1. Using SQL Server Management Studio, create a new query.
2. Add Transact-SQL statements to store information about the AccountContractEvents event in a table called AccountContractStatusHistory.

Note Do not try to run your query, as the corresponding tables do not yet exist. They will be created later by using an Application Definition File (ADF).

3. Save the query.
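Your query might contain a statement along the lines of the following minimal sketch. The field names are hypothetical (you derive the real ones from the Contracts table in this exercise), and the AccountContractEvents view and AccountContractStatusHistory table do not exist until the ADF creates them:

-- Copy each contract status event into the chronicle table so that the
-- status history survives after the events themselves are vacuumed
INSERT INTO dbo.AccountContractStatusHistory
    (ContractID, CustomerID, ContractStatus, StatusChangeDate)
SELECT e.ContractID, e.CustomerID, e.ContractStatus, e.StatusChangeDate
FROM dbo.AccountContractEvents AS e;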

Discussion questions

1. How would the event class definition differ if Notification Services used an existing database rather than creating a new database for the Notification Services application?

There would be no differences. Using an existing database or a Notification Services–generated database affects only the ability to perform database configuration tasks such as using filegroups or changing the data file location.


2. Should you use an event chronicle in this solution?

Whether to use an event chronicle always depends on the requirements. If you have requirements such as sending the status once a day until the contract is approved or rejected, you would use scheduled subscriptions, which benefit from chronicles, although the use of chronicles is not mandatory. If you need to notify customers immediately and also notify the corresponding regional office in several batches a day, the event must be read twice; without an event chronicle, events are deleted after the first read.

3. What is the relationship between the generator and chronicle processing in the event class?

The generator fires rules in the following order: chronicle rules, event rules, and scheduled rules. Chronicle rules are used to update event chronicle tables. Processed events are deleted when the vacuuming process fires, according to the configuration file.

4. What standard event providers are available in Notification Services?

SQL Server 2005 Notification Services includes three standard providers: the File System Watcher Event Provider, the SQL Server Event Provider, and the Analysis Services Event Provider. More information about these providers is available in SQL Server Books Online.


Exercise 2: Designing a Subscription Strategy

Introduction

Customers can choose to receive contract status change notifications. Therefore, you need to create a subscription class for customers. In this exercise, you will design the subscription class required to meet the business requirements, and define its schema. You will also define a subscription event rule.

Design the subscription class and fields

Summary

■ Identify the subscription class and fields needed for this solution.

Detailed Steps

1. Define a subscription class named AccountContractSubscriptions.
2. Identify the fields required by this subscription class.
3. Record your design decisions in the Fabrikam Notifications.vsd Visio diagram.

Write code for a subscription event rule

Summary

■ Write Transact-SQL code for a subscription event rule that will send notifications of contract status changes.

Detailed Steps

1. Using SQL Server Management Studio, create a new query.
2. Add Transact-SQL statements to store information about the AccountContractEvents event and the AccountContractSubscriptions subscription in a table called AccountContractAlerts.

Note Do not try to run your query, as the corresponding tables do not yet exist. They will be created later by using an Application Definition File (ADF).

3. Save the query.
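A minimal sketch of the kind of event rule your query might contain is shown below. The join column and the non-standard fields are hypothetical; in Notification Services, inserting matched rows into the notification class view is what generates the notifications:

-- Match incoming contract events to customer subscriptions and
-- generate one notification row per match
INSERT INTO dbo.AccountContractAlerts
    (SubscriberId, DeviceName, SubscriberLocale, ContractID, ContractStatus)
SELECT s.SubscriberId, s.DeviceName, s.SubscriberLocale,
       e.ContractID, e.ContractStatus
FROM dbo.AccountContractEvents AS e
JOIN dbo.AccountContractSubscriptions AS s
    ON s.CustomerID = e.CustomerID;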

Discussion questions

1. What are the differences between event-driven subscription rules and scheduled subscription rules? When would you use each?

Event rules fire with little delay, when events occur and the next generator quantum arrives. Scheduled rules fire according to their schedule, in the corresponding generator quantum. You should use event rules when notifications must be sent shortly after events occur. You should use scheduled rules when you or your users want to specify when to receive notifications.


2. How does the quantum duration affect scheduled rule processing?

The time allotted to a quantum for processing can be too short, so the quantum clock can fall behind the real-time clock. Usually, the quantum clock resynchronizes after some quanta have elapsed, if some of those quanta are not fully utilized. However, if a configurable number of quanta pass without achieving synchronization, Notification Services skips firing the scheduled rules that should have been processed in that quantum. Notification Services logs this exception, so administrators can track it.

3. Why is the event rule contained in the subscription class and not in the event class?

To offer a greater degree of flexibility. By relating rules to subscriptions, room is left to subscribe to combinations of event classes, instead of to only one. Additionally, subscription classes can define condition actions that allow subscribers to express their own subscription condition preferences.

4. How is subscription rule processing optimized?

By indexing the columns used in event subscription matching. You should consider other optimizations, but indexing is the most important one.


Exercise 3: Designing a Notification Strategy

Introduction

In this exercise, you will design the notification class for the information to be sent to customers, and define its schema. You will also identify the content formatters needed to meet the business requirements.

Design the notification class and fields

Summary

■ Identify the notification class and fields needed for this solution.

Detailed Steps

1. Although customers can receive notifications by using two different device types, you need to send the same information to both types of device. Therefore, you only need to create one notification class. Name this class AccountContractAlerts.
2. Identify the fields required by this notification class.
3. Record your design decisions in the Fabrikam Notifications.vsd Visio diagram.

Identify the content formatters required for receiving notifications

Summary

■ Identify the content formatters needed for this solution.

Detailed Steps

1. Determine the different devices customers can use to receive notifications.
2. Verify whether each device needs its own formatter, or whether different devices can use the same formatter.

Discussion questions

1. Is digest or multicast delivery mode appropriate for this solution? Why?

Digest delivery mode can be used to group several notifications for the same subscriber, device, and locale. If a customer has several contract status changes occurring close together in time, digest mode can be useful. However, this situation is improbable, and digest implies processing overhead. Therefore, digest is not appropriate for this solution. Multicast delivery mode can be used to send an identical message to multiple subscribers. Because contracts are related to only one customer, multicast is not appropriate for this solution.

2. Should you use a notification expiration for this solution? Why?

You use the notification expiration setting to specify when it is no longer necessary to keep trying to deliver a notification that has failed, or when it is not useful to deliver a notification because it is too old and probably obsolete. In this solution, contract status changes are not likely to become obsolete, and therefore notification expiration should not be used.


3. Is a notification batch size necessary for this solution? Why?

The answer depends on the system resources available. You use the notification batch size to organize work batches when several processing threads are available. You can configure processing threads in the ADF. The need for a notification batch size also depends on the number of notifications generated. Because you use the notification batch size to set a maximum number of notifications per work item, with few notifications a batch might never reach the specified size.


Exercise 4: Executing a Notification Services Solution

Introduction

In this exercise, you will review the Instance Configuration File (ICF) and the Application Definition File (ADF) for a Notification Services implementation. You will then register the edited configuration files and start the Notification Services solution.

Review the ADF

Summary

1. Open the ADF for the Notification Services solution in the FabrikamNS.ssmssln SQL Server Management Studio solution file.
2. Review the definitions for the event class, the subscription class, and the notification class and schema.

Detailed Steps

1. Using SQL Server Management Studio, open the solution file located at E:\Labfiles\Starter\FabrikamNS.ssmssln.
2. Open the solution ADF XML file, FabrikamNSADF.xml, and locate the event class. Compare it to your definition from Exercise 1.
3. Locate the subscription class definition in the ADF file, and compare it to your definition from Exercise 2.
4. Locate the notification class definition in the ADF file, and compare it to the notification class and notification schema from Exercise 3.


Configure and run the Notification Services solution

Summary

1. Create the Notification Services instance.
2. Register the Notification Services instance.
3. Configure security for the Notification Services instance.
4. Enable the Notification Services instance.
5. Verify that the Notification Services instance has been configured and is running.

Detailed Steps

1. Create a new Notification Services instance by using the E:\Labfiles\Starter\FabrikamNSICF.xml file.
2. Register the Notification Services instance and run the service by using the MIA-SQL\SQLService account, with the password Pa$$w0rd.
3. Grant permissions to the service account to access the Notification Services application databases with the NSRunService role.
4. Enable the Notification Services instance and start running the service.
5. Execute the NSSnapshotApplications stored procedure to verify that the service is running.
6. Execute the NSAdministrationHistory stored procedure to view the activity that has occurred.
7. Review the objects created by Notification Services.

Important After the lab, shut down the virtual machine for the computer 2781A-MIA-SQL-05. Do not save the changes.

Module 6

Designing a Service Broker Solution

Contents:

Lesson 1: Designing a Service Broker Solution Architecture 6-3
Lesson 2: Designing Service Broker Data Flow 6-14
Lesson 3: Designing Service Broker Solution Availability 6-24
Lab: Designing a Service Broker Solution 6-29

Information in this document, including URL and other Internet Web site references, is subject to change without notice. Unless otherwise noted, the example companies, organizations, products, domain names, e-mail addresses, logos, people, places, and events depicted herein are fictitious, and no association with any real company, organization, product, domain name, e-mail address, logo, person, place or event is intended or should be inferred. Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.

The names of manufacturers, products, or URLs are provided for informational purposes only and Microsoft makes no representations and warranties, either expressed, implied, or statutory, regarding these manufacturers or the use of the products with any Microsoft technologies. The inclusion of a manufacturer or product does not imply endorsement of Microsoft of the manufacturer or product. Links are provided to third party sites. Such sites are not under the control of Microsoft and Microsoft is not responsible for the contents of any linked site or any link contained in a linked site, or any changes or updates to such sites. Microsoft is not responsible for webcasting or any other form of transmission received from any linked site. Microsoft is providing these links to you only as a convenience, and the inclusion of any link does not imply endorsement of Microsoft of the site or the products contained therein.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

©2006 Microsoft Corporation. All rights reserved.

Microsoft, Active Directory, ActiveX, BizTalk, Excel, Microsoft Press, MSDN, MSN, Outlook, PowerPoint, SharePoint, Tahoma, Visio, Visual Basic, Visual C#, Visual SourceSafe, Visual Studio, Windows, and Windows Server are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.

All other trademarks are property of their respective owners.


Module objectives

After completing this module, students will be able to:

■ Design a Service Broker solution architecture.

■ Design the Service Broker data flow.

■ Design Service Broker solution availability.

Introduction

One of the most valuable components introduced in Microsoft® SQL Server™ 2005 is Service Broker, an asynchronous messaging system integrated into the database engine. Service Broker provides reliable messaging services by guaranteeing that it will deliver each message exactly one time, deliver messages in the order they are sent, and ensure fault tolerance so that no message is lost.

With Service Broker, you can design asynchronous processes within your solution to help you scale out that solution. Service Broker enables you to defer and distribute application tasks that might otherwise require synchronous processing. This module describes the considerations that you should take into account when designing a Service Broker solution. You will learn how to design the solution architecture, data flow, and solution availability.


An example of solution processes

Throughout this module, you will see references to a list of processes that are based on an example application. The application, which enables customers to order products online, includes the following processes:

1. Entering the order
2. Processing the payment
3. Fulfilling the order
4. Packaging the order
5. Shipping the order
6. Tracking the shipment
7. Replacing the inventory
8. Processing rebates

At first, it might appear that these processes must be performed in the order listed here. However, some of these processes can occur independently of each other, so they are not necessarily restricted to a certain order. For example, replacing the inventory can occur right after fulfilling the order and can occur at the same time as packaging the order, independent of tracking the shipment or processing rebates. In other words, you can process some of the tasks asynchronously and in parallel with each other. A solution that can support the asynchronous and parallel processing of tasks is a good candidate for Service Broker.


Lesson 1: Designing a Service Broker Solution Architecture

Lesson objectives

After completing this lesson, students will be able to:

■ Evaluate the considerations for implementing a Service Broker solution.

■ Evaluate the considerations for identifying Service Broker services.

■ Evaluate the considerations for identifying Service Broker conversations.

■ Evaluate the considerations for designing dialog standards.

■ Evaluate the considerations for designing queue usage.

Introduction

The first step in designing a Service Broker solution is to design the solution architecture. The solution architecture provides the structure for defining the components of a Service Broker application. This lesson describes the considerations that you should take into account when designing the solution architecture. Specifically, the lesson teaches you how to evaluate the considerations for implementing Service Broker, identifying Service Broker services and conversations, and designing dialog standards and queue usage.


Considerations for Implementing a Service Broker Solution

Introduction

When designing a data-driven solution, you might find that the processes within the solution are made up of tasks that can occur independently of and in parallel with each other. In addition, these processes might rely on data that is stored in multiple databases or rely on applications and data stores outside your solution. Prior to SQL Server 2005, your ability to coordinate these processes was restricted to using a solution outside of SQL Server. However, with the introduction of Service Broker, you can now implement a service that supports the asynchronous messaging requirements of your enterprise solutions. Service Broker delivers reliable, asynchronous messaging that enables process integration and provides a fault-tolerant environment for those processes. This topic describes the Service Broker characteristics that support these capabilities and discusses the considerations that you should take into account for implementing a Service Broker solution.

Understanding Service Broker

Service Broker enables you to distribute and defer many of your solution's processes. To support this capability, Service Broker has the following characteristics:

■ Reliability. Because Service Broker components are part of SQL Server 2005 databases, they share the same fault tolerance as the databases. For example, when you back up your database, you are backing up the Service Broker components.

■ Guaranteed delivery. Service Broker delivers each message exactly one time. If a message fails to reach its destination, Service Broker holds the message in the sender's queue until it can be delivered.

■ Guaranteed delivery order. Service Broker delivers messages in the order sent. This can be critical in many application processes. For example, a payment for an order should be processed before that order is fulfilled.

■ Asynchronous/parallel processing. Many applications contain processes that are independent of one another and can therefore be performed asynchronously and in parallel. Service Broker supports asynchronous and parallel processing. Once the initiator (sender) sends a message and the target (recipient) receives it, the initiator process is complete. The target then processes the message while the initiator continues with other tasks. For example, an application can replace inventory at the same time it initiates packaging and shipping.

Implementing a Service Broker solution

When determining whether to implement a Service Broker solution, you should take into account the following considerations:

■ Processing tasks asynchronously and in parallel. If you can process tasks independently of each other or at the same time, your application is a good candidate for Service Broker.

■ Processing tasks in a specific order. When processes must be performed in a specific order, the messages that coordinate those processes must be sent and received in a specific order. Service Broker delivers messages in the order they are sent. If your application processes require this type of support, your application is a good candidate for Service Broker.

■ Waiting for another application before processing tasks. In some cases, an application must wait for another application before starting or completing a process. In these cases, the applications are often very integrated, and a synchronous process is more efficient. This type of application is usually not a good candidate for Service Broker.

■ Configuring databases as broker instances. A broker instance is a SQL Server database on which you enable Service Broker. A broker instance acts as a public interface that permits the database to send and receive Service Broker messages. You must enable Service Broker on each database that you want to act as a broker instance. The more databases that are part of a Service Broker solution, the greater the number of broker instances. A large number of broker instances can increase network traffic and require additional resources, but they can also provide an effective way to scale out your application.

■ Ensuring fault tolerance. Some messaging systems cannot persist messages in the event of server failure. If you must ensure the fault tolerance of your messaging system, your application is a good candidate for Service Broker.


Considerations for Identifying Services

Introduction

A number of components make up a Service Broker solution, including services, conversations, and queues. This topic explains what a service is and provides considerations for identifying potential services. The remaining topics in this lesson cover conversations and queues.

Understanding Service Broker services

A service is a named interface, or endpoint, within a Service Broker solution that provides a structure for a set of business tasks. Services use queues and conversations to send messages to each other. Each process that participates in a Service Broker solution must be exposed through a service. When you create a service, you should identify the associated queue and, optionally, one or more contracts. A contract specifies the types of messages that the service can send or receive. (Lesson 2 discusses contracts.) For example, suppose that your application includes a process that allocates inventory. You can set up a service for the tasks associated with that process. You can then use contracts to specify that the service will accept only specific types of messages from other services.

Service Broker stores services at the database level. You can define one or more services within a database. Some solutions use multiple services in a single database, while others use services in multiple databases on multiple servers.

If a service initiates a message, it must guarantee the delivery of that message. Once the message has been delivered, the initiator service is no longer responsible for the message. If a message cannot be delivered, the initiator service must store the message locally until the message can be delivered to its target. If a service is the target of a message, the service must store the message until it is ready for processing. The service stores the message in a mapped queue. Multiple services can map to a single queue, or each service can map to a queue dedicated to that service.
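The following Transact-SQL sketch shows the relationship between a queue and a service. The object names are illustrative, and the built-in [DEFAULT] contract is used so that the example is self-contained:

-- The queue must exist before the service that uses it
CREATE QUEUE dbo.InventoryQueue;
GO

-- Expose the inventory process as a service bound to the queue;
-- listing a contract allows the service to be the target of conversations
CREATE SERVICE [//Example/InventoryService]
    ON QUEUE dbo.InventoryQueue ([DEFAULT]);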

Identifying Service Broker services

When identifying Service Broker services, you should take into account the following considerations:

■ Processing tasks independently of each other. Service Broker enables services to be processed independently of each other. For example, you can replace inventory, ship orders, and process rebates independently of each other. You should consider creating a service for each of these processes. Even if the processes access data stored in the same database, they should be considered as candidates for services.

■ Processing tasks in parallel with each other. Service Broker enables services to be processed in parallel with each other. For example, to fulfill an order, you must first process the payment. However, you can fulfill other orders whose payments have already been processed. Processing payments and fulfilling orders can all be performed asynchronously as long as you process the payment for a specific order before fulfilling that order. As a result, both processes are good candidates as Service Broker services.

Returning to the application processes example, you can see that all processes are good candidates for services. Each one can run independently of other processes, in parallel with other processes, or both.


Considerations for Identifying Conversations

Introduction

To enable communication between services, the application associated with a service must create a conversation, the mechanism that provides the structure for this communication. The application, through its related service, initiates the conversation and specifies a target service and contract. This topic provides information about conversations and explains the considerations that you should take into account when identifying conversations.

Understanding Service Broker conversations

Service Broker services send and receive messages within the context of conversations established between initiator services and target services. A conversation is a named, two-way dialog between two services. To initiate a conversation, an application must issue a BEGIN DIALOG CONVERSATION statement. The following syntax shows the basic components of that statement:

BEGIN DIALOG [CONVERSATION] @dialog_handle
    FROM SERVICE initiator_service_name
    TO SERVICE 'target_service_name'
    [ON CONTRACT contract_name]
    [LIFETIME = dialog_lifetime]

As you can see, you should specify the initiator service, the target service, and, optionally, a contract and conversation lifetime. Once the application establishes the conversation, either service can send a message. To send a message, the application associated with the service must issue a SEND statement. The following syntax shows the basic components of a SEND statement:

SEND ON CONVERSATION conversation_handle
    [MESSAGE TYPE message_type_name]
    [(message_body_expression)]


The SEND statement identifies the name of the conversation and, optionally, the message type and the message body expression, which represents the actual message body.
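For example, the following sketch begins a dialog and sends a message on it. The service names are illustrative and assume services created as shown earlier in this lesson; the built-in [DEFAULT] contract and message type keep the example self-contained:

DECLARE @dialog_handle UNIQUEIDENTIFIER;

BEGIN DIALOG CONVERSATION @dialog_handle
    FROM SERVICE [//Example/ShippingService]
    TO SERVICE '//Example/TrackingService'
    ON CONTRACT [DEFAULT]
    WITH ENCRYPTION = OFF;

-- The [DEFAULT] message type performs no validation, so any body is accepted
SEND ON CONVERSATION @dialog_handle
    MESSAGE TYPE [DEFAULT]
    (N'<Shipment Id="42" Status="Shipped" />');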

Identifying Service Broker conversations

When identifying Service Broker conversations, you should take into account the following considerations:

■ Establishing communication between services. You should create a conversation whenever one service requires one-to-one communication with another service. Take, for instance, the application processes example. Once an order is shipped, the application initiates a conversation through the service associated with order shipping. The conversation specifies the target of the conversation as the service associated with the shipment tracking process. By initiating a conversation, the order shipping service can communicate with the track shipping service.

■ Initiating a dialog between services. Some messaging services support both dialog and monolog conversations. A dialog conversation supports two-way conversations between the initiator service and the target service, enabling the initiator and the target to participate in a one-on-one conversation. A monolog conversation supports only one-way conversations between the initiator service and one or more target services; the initiator can send out messages through that conversation but cannot receive them. Service Broker supports only dialog conversations.


Considerations for Designing Dialog Standards

Introduction

Dialog conversations, also referred to as dialogs, support two-way messaging between exactly two services. If your solution includes more than two services, each conversation between any two services is unique. For instance, in the ordering system example, the conversation between the order fulfillment process and the packaging process is different from the conversation between the fulfillment process and the inventory replacement process. Each conversation enables the participating services to initiate a message. When a target service receives a message, the service issues an acknowledgement message to the initiator service. The initiator service places outgoing messages in a special transmission queue until the target acknowledges the message. (Acknowledgement messages do not appear in any queue.) This topic provides considerations that will help you design dialog conversations.

Designing Service Broker dialogs

When designing Service Broker dialogs, you should take into account the following considerations:

■ Delivering messages in the correct order. Service Broker delivers messages in the order they are sent. Take this into account when planning your processes and their order.

■ Ending conversations manually. You can use the END CONVERSATION statement to end a conversation manually (see the sketch after this list). Both services must call this statement to end the conversation. Once this is done, Service Broker removes all messages associated with that conversation from the queue, even if retention is turned on for an associated queue. Remember that your applications must include the code necessary to end conversations.




■ Setting the time-out period. You can define a conversation to time out after a specified amount of time. If the conversation times out, a service cannot send messages. However, the conversation continues to support existing messages, and it must be ended manually by both services. Keep time-outs in mind when planning your conversations so that conversations do not time out prematurely.

■ Maintaining conversation sessions. Once a dialog has been created between two services, it remains open, in a stateless fashion, until the services end the dialog. For example, the order entry service sends a message to the fulfillment service. The fulfillment service determines that some of the items are not available (they are back-ordered) and sends a message back to the order entry service. The order entry service sends a message back asking the fulfillment service to send another message when the item becomes available. This, of course, may take days or even weeks. The dialog remains open during this period, although it is not sending any messages. When the item does become available, the fulfillment service sends a message back to the order entry service. Finally, the order entry service ends the conversation, and the fulfillment service in turn ends the conversation, completing (closing) the dialog. You should take into account a conversation's session length when planning that conversation. Maintaining numerous conversations for lengthy periods of time might affect performance and resource availability.
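The following sketch, with illustrative service names, shows a lifetime being set when the dialog begins and the conversation being ended manually. Remember that the other endpoint must also issue END CONVERSATION for the dialog to close completely:

DECLARE @dialog_handle UNIQUEIDENTIFIER;

-- The dialog errors out on both sides if it is still open after one hour
BEGIN DIALOG CONVERSATION @dialog_handle
    FROM SERVICE [//Example/OrderEntryService]
    TO SERVICE '//Example/FulfillmentService'
    ON CONTRACT [DEFAULT]
    WITH LIFETIME = 3600, ENCRYPTION = OFF;

-- ... exchange messages ...

-- End this side of the conversation
END CONVERSATION @dialog_handle;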


Considerations for Designing Queue Usage

Introduction

A queue is at the core of any messaging service. In Service Broker, you cannot create a service without specifying a queue, so the queue must exist before you can create the service. Queues persist the messages that are sent from one service to another. When an initiator service sends a message, it places the message in an outgoing queue, and the target service receives the message in an incoming queue. This topic describes the considerations for designing queue usage.

Designing queue usage

When designing how queues are going to be used in your Service Broker solution, you should take into account the following considerations:

■ Storing messages for a service. A queue stores received messages until the service processes those messages. A queue stores sent messages until the target service acknowledges receipt or until the services terminate the conversation. When designing queue usage, keep in mind storage requirements and any impact on performance. You can specify which filegroup to use and plan your hardware requirements based on the anticipated number of messages.

■ Enabling message retention. By default, a service deletes an outgoing message from a queue when the target service acknowledges receipt of that message. However, you can override this behavior by enabling the RETENTION option in the queue definition (see the sketch after this list). When the option is enabled, Service Broker retains all messages in a conversation until the initiator and target services terminate the conversation. You should consider enabling message retention in the following situations:

● Auditing transactions. Enable retention when the service needs to maintain a record of all messages received and sent. Design the application to retrieve a copy of all messages in the conversation prior to ending that conversation.

● Compensating transactions. Enable retention when a service needs to be able to undo a series of individual transactions. Design the application so that it has the capability to step backwards through the messages and undo the work for each message.




■ Processing groups of messages. You can group messages into a single transaction so that the service can process them together. This approach reduces the number of transactions that your application must generate. However, you should not group too many messages together. Otherwise, you might overload your transactions and they will take too long to run, possibly causing locking contention.

■ Disabling unused queues. You can enable or disable a queue as necessary. By disabling a queue, you can prevent a service from receiving messages before the application is ready.
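The retention and status options described above are queue settings, as the following sketch shows (the queue name is illustrative):

-- Retain messages until both sides end the conversation
ALTER QUEUE dbo.InventoryQueue WITH RETENTION = ON;

-- Temporarily prevent the queue from receiving messages
ALTER QUEUE dbo.InventoryQueue WITH STATUS = OFF;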

Receiving messages

When a service receives a message, it places that message in the receiving queue. The application must then manually retrieve the message from the queue. Depending on whether retention is enabled on the queue, the service will either delete the message or update the message to indicate that it has been processed. To retrieve a message from the queue, your application must use a RECEIVE statement. The following syntax shows the basic components of the statement:

RECEIVE [TOP ( n )] <column_specifier> [ ,...n ]
    FROM <queue_name>
    [INTO table_variable]
    [WHERE { conversation_handle = conversation_handle
           | conversation_group_id = conversation_group_id }]
    [, TIMEOUT timeout]

(The TIMEOUT clause is available when the RECEIVE statement is wrapped in a WAITFOR statement.)

As the syntax shows, you must specify the columns to include in the result set and the name of the queue. You can use the RECEIVE statement to retrieve messages one at a time or to retrieve a group of messages. If you must process a series of messages together, you should consider processing them as a group. Keep in mind, however, that processing a group of messages increases the length of time that a transaction is open, which increases the potential for locking contention.

Note You can use a SELECT statement to view the contents of the messages within a queue. However, you cannot use an INSERT, UPDATE, DELETE, or TRUNCATE statement to manage those messages.
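The following sketch, using the illustrative queue from earlier in this lesson, retrieves up to ten messages from the next available conversation group, waiting up to five seconds for messages to arrive:

DECLARE @messages TABLE (
    conversation_handle UNIQUEIDENTIFIER,
    message_type_name   SYSNAME,
    message_body        VARBINARY(MAX));

WAITFOR (
    RECEIVE TOP (10)
        conversation_handle,
        message_type_name,
        message_body
    FROM dbo.InventoryQueue
    INTO @messages
), TIMEOUT 5000;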


Lesson 2: Designing Service Broker Data Flow

Lesson objectives

After completing this lesson, students will be able to:

■ Describe the process of designing the Service Broker data flow.

■ Evaluate the considerations for identifying conversation groups.

■ Evaluate the considerations for identifying data-staging locations.

■ Evaluate the considerations for identifying service routes.

■ Evaluate the considerations for identifying service activation methods.

Introduction

Once you determine which components your Service Broker solution will include, you should establish how the information is going to flow among those components. When defining this flow, you should identify the types of messages that can be sent to and from various services, whether to group conversations, and how messages might need to be routed. This lesson describes the process of designing data flow. Specifically, the lesson provides you with the considerations to take into account when identifying conversation groups, data-staging locations, service routes, and service activation methods.


The Process of Designing Service Broker Data Flow

Introduction

After you design the Service Broker solution architecture, you should consider other factors, such as how components will communicate with each other and where they will be located. You must also determine what types of messages can be sent to and from the various services. You must then determine whether the service should process orders serially or in parallel. This topic describes the procedures that you should follow when designing the Service Broker data flow.

Designing Service Broker data flow

The process of designing Service Broker data flow includes the following steps:

1. Determine how Service Broker validates messages. When you create a message type, you specify how Service Broker validates the message body of messages based on that type. You can choose one of the following validation options when you define a message type (see the code sketch after this list):

● NONE. Service Broker performs no validation. The message can contain any type of data, or it can be NULL.

● EMPTY. The message must be NULL.

● WELL_FORMED_XML. The message must contain well-formed XML.

● VALID_XML. The message must contain XML that conforms to the specified schema collection.

2. Determine contract specifications. A contract specifies which message types can be used within a conversation. For each message type listed in the contract, the contract also specifies whether an initiator service, a target service, or both services can send a message based on that type. For example, the order entry service might send a message containing the specifications of the order to the fulfillment service. Once the order has been fulfilled, the fulfillment service might send a message back to the order entry service. Both messages can be based on the same message type. However, the shipping process might send a message to the tracking process with information about the delivery. The tracking process does not need to return a message to the shipping process, so the message can be based on a message type that supports only initiator messages.


3. Specify the message order. Service Broker delivers messages in the order they are sent. As a result, you must determine how tasks should be processed. For example, payments must be processed before orders are filled, so the order entry service must notify the payment processing service before the payment processing service notifies the fulfillment process.

4. Identify messages that can be processed in parallel. Although Service Broker guarantees that messages will be delivered in order, not all messages need to be processed in order. For example, multiple orders can be placed in the order entry service and sent to the payment processing service. The payment processing service can process these messages in parallel.

5. Specify the lifetime of all messages. Not only do you need to establish how messages flow within various dialog conversations, but you also need to set how long they remain viable. The most effective way to manage a lifetime requirement is to specify a conversation's lifetime.
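The validation options and contract specifications from steps 1 and 2 translate into Transact-SQL as shown in the following sketch; the names are illustrative:

-- Both message bodies must contain well-formed XML
CREATE MESSAGE TYPE [//Example/OrderRequest]
    VALIDATION = WELL_FORMED_XML;

CREATE MESSAGE TYPE [//Example/OrderConfirmation]
    VALIDATION = WELL_FORMED_XML;

-- The contract specifies which side may send each message type
CREATE CONTRACT [//Example/OrderContract] (
    [//Example/OrderRequest]      SENT BY INITIATOR,
    [//Example/OrderConfirmation] SENT BY TARGET
);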


Considerations for Identifying Conversation Groups

Introduction

A conversation provides a structure for the exchange and processing of messages. Conversations help to ensure that messages are processed in the correct order. By default, Service Broker associates each conversation with one conversation group and associates each conversation group with a specific service. The conversations within a group support only messages that are sent to or from the specified service. Service Broker uses conversation groups to group together related conversations to allow applications to more easily coordinate those conversations. When you create a conversation, you can specify a conversation group. You can also move an existing conversation to another conversation group. This topic describes the considerations for grouping conversations.

Identifying conversation groups

When identifying conversation groups, you should take into account the following considerations:

■ Determining message order. Sometimes, multiple conversations need messages to be processed serially as a single session. For example, suppose that someone places an order through your online application. Shortly after placing the order, the user amends it. The order entry service might or might not have begun a dialog with the shipping service, so the amended order is communicated to the shipping service through a separate dialog. However, you must ensure that the original order is processed before you can process the amended order. Conversation groups solve this problem. You can place both conversations in the same group (see the sketch after this list). Service Broker uses conversation group locks to lock all messages within the group (as needed) to control the flow of these messages. By using a conversation group, you can treat the messages as a single unit.

■ Applying locks to conversation groups. By default, Service Broker uses conversation group locking even if only one conversation exists per group. Every time an application issues a command against a conversation in a conversation group (such as BEGIN DIALOG CONVERSATION or RECEIVE), SQL Server locks all conversations in that group for the duration of the transaction. Most conversations exist alone within their default groups, so locking occurs at the conversation level by default. However, you can see the full effect of conversation group locking when a group includes more than one conversation.
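The following sketch groups two dialogs, using illustrative service names; the WITH RELATED_CONVERSATION clause places the second dialog in the same conversation group as the first:

DECLARE @order_dialog UNIQUEIDENTIFIER, @amendment_dialog UNIQUEIDENTIFIER;

-- The first dialog gets a new conversation group by default
BEGIN DIALOG CONVERSATION @order_dialog
    FROM SERVICE [//Example/OrderEntryService]
    TO SERVICE '//Example/ShippingService'
    ON CONTRACT [DEFAULT]
    WITH ENCRYPTION = OFF;

-- Place the amendment in the same group so that the two
-- conversations are locked and processed as a single unit
BEGIN DIALOG CONVERSATION @amendment_dialog
    FROM SERVICE [//Example/OrderEntryService]
    TO SERVICE '//Example/ShippingService'
    ON CONTRACT [DEFAULT]
    WITH RELATED_CONVERSATION = @order_dialog, ENCRYPTION = OFF;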


Considerations for Identifying Data-Staging Locations

Introduction

When you design a Service Broker solution, you must identify where each service will be located and how you will reference these services. Although small-scale solutions usually reside within a single database on a single server instance, enterprise solutions are often located on multiple servers throughout a network. This topic discusses the considerations for identifying data-staging locations.

Identifying data-staging locations

When identifying data-staging locations, you should take into account the following considerations:

■ Naming services across the enterprise. Service names must be unique within a database. Even if you locate services on different servers, the target service names are logically cataloged on each server that uses them. This allows the target service to move from server to server without the initiator service needing to know the location of the target service. As a result, you should design service names to be unique across the enterprise. In some cases, service names might be repeated, as in the following circumstances:

● Development, test, and production environments use the same set of service names. You distinguish between services through their route addresses. When you change environments, you must modify the route addresses as appropriate.

● Multiple servers host the same set of services to distribute the workload. Although you should try to use unique service names across the enterprise, you can use the same name (not in the same database) because names are uniquely identified by their routes. For example, suppose that you must send a message to two services named Sales on different servers. The route address for the first service is tcp://Server22.AdventureWorks.com:4022/, and the route address for the second service is tcp://Server29.AdventureWorks.com:4022/. As you can see, the route addresses distinguish one service from the other. The next topic discusses routes.

■ Determining the physical location for services. To support distributed processing, you can implement a service on multiple instances of Service Broker throughout the network. This configuration enables you to distribute processing so that you can balance the workload of tasks that require more processing. For example, you might want to distribute the payment processing service across multiple servers but run the shipment tracking service on a single server.


Considerations for Identifying Service Routes

Introduction

Once you have identified the data locations for your Service Broker solution, you should determine how services on independent servers will communicate. Often, the services can communicate directly with each other. However, in cases in which hosting servers cannot connect directly, you must set up intermediary servers to forward messages. Whenever services must communicate from one server to another, you must identify service routes. This topic provides considerations for identifying service routes for your Service Broker solution.

Identifying service routes

When identifying service routes for your solution, you should take into account the following considerations:

■ Storing routing information in multiple places. Service Broker stores routing information in the msdb database and in the originating database:

● The msdb database contains routing information for all services on the SQL Server instance and contains information about requests coming from outside the server.

● The originating database contains routing information for its local services. All local route requests use this information to send messages to the correct local service.

■ Using service route names. Service Broker uses unique names to identify routes. A route name includes the service name, instance identifier, and network name. In some cases, you might support multiple services with the same name on your network. For example, a service with the same name might run in the development, test, and production environments. In these cases, you must ensure that you use the correct service route name when calling a specific service.

■ Exposing Service Broker endpoints. Service Broker endpoints are TCP/IP-based interfaces exposed by each instance of SQL Server. Each instance can support only one Service Broker endpoint, regardless of the number of services running on that instance. When you run services on different instances of SQL Server, you must define a Service Broker endpoint on each instance (see the sketch at the end of this topic).


■ Using intermediary broker instances to forward messages. You can use intermediary broker instances to forward messages. You should consider using forwarding broker instances in the following situations:

● The initiator and target services cannot connect directly to each other. You can use a broker instance as a gateway between the two services.

● Many initiator services communicate with a single target. You can set up your initiator services to connect to a forwarding broker instance. The broker instance then communicates directly with the target service, thus reducing the number of connections to the target.

Broker instances that forward messages do not persist those messages but act as an intermediary for them. The initiator service sends a message that passes through a forwarder and on to a target.

■ Using routing to scale out your applications. Routes provide a reliable way to scale out your application. By using routes, you can spread your services throughout the enterprise. In addition, you can change the location of your applications with little impact on your Service Broker solution. You need only to change the routing information.

■ Securing your routes. When planning your Service Broker routes, you should take into account dialog security and transport security:

● Dialog security provides security between the initiator and target server, regardless of how many forwarding servers are between them. Service Broker does not secure the connection between the servers; it encrypts the actual message. Dialog security, which is certificate-based, is more difficult to implement than transport security.

● Transport security provides security between servers, including the forwarding servers. Service Broker secures the connection between servers, but does not secure the message itself. Transport security is less difficult to implement than dialog security, but it requires that you set up security on every server. Transport security can be Windows-based or certificate-based.
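The following sketch shows the two statements involved: an endpoint on each instance, and a route in the sending database that maps a service name to the network address of the instance hosting it. The names, port, and address are illustrative:

-- On each instance, expose the single Service Broker endpoint
CREATE ENDPOINT BrokerEndpoint
    STATE = STARTED
    AS TCP (LISTENER_PORT = 4022)
    FOR SERVICE_BROKER (AUTHENTICATION = WINDOWS);

-- In the sending database, route messages for the target service
CREATE ROUTE ShippingRoute
    WITH SERVICE_NAME = '//Example/ShippingService',
         ADDRESS = 'TCP://Server29.AdventureWorks.com:4022';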


Considerations for Identifying Service Activation Methods

Introduction

As part of the process of designing Service Broker data flow, you should identify the methods that you will use to activate your queues. Activation enables you to control how a service processes messages in the queue.

Many messaging applications use continuous polling to verify whether the queue has received messages. If the application finds messages, it begins processing them. However, as the number of queues grows, this solution becomes less scalable. Some messaging applications use triggers that fire each time a message arrives. However, triggers do not scale well to a high rate of messages.

Service Broker offers a third option. You can activate a queue so that it runs a stored procedure or an external application when messages are received. Activation scales well because it can be used in any queue, no matter how many there are, and it can process any number of messages. As a result, you should plan your activation strategy when designing data flow. This topic explains the considerations for identifying service activation methods.

Identifying service activation methods

When identifying service activation methods, you should take into account the following considerations:

■ Determining whether to activate the queues. Service Broker creates a queue monitor for each queue. The monitor checks regularly for new messages and other events. If you activate a queue, Service Broker initiates the associated stored procedure or raises an event when messages arrive. Normally, you should not activate a queue that supports an application that must meet either of the following requirements:

● The application requires quick responses to infrequent messages.

● The application requires substantial resources during startup.

Despite these limitations, activating a queue provides a number of benefits, such as scalability, improved performance, and a finer degree of control over message processing. You should consider activating a queue in the following situations (see the sketch at the end of this topic):

● The service should scale dynamically to accommodate incoming traffic. If a queue is activated, Service Broker verifies whether the associated stored procedure is running. If it is not running, the service starts the stored procedure. If it is running, the service determines whether the stored procedure is keeping up with the number of incoming messages. If the stored procedure is not keeping up, the service runs another instance of the stored procedure. This process enables the service to scale dynamically. However, you can specify the maximum number of activated stored procedures permitted to run concurrently.

● The service traffic varies unpredictably. When you activate a queue, Service Broker starts the associated application only when there are messages in the queue. As a result, you do not need to use resources simply to support an idle application.

Note You can use activation to forward messages. To do so, create an intermediary service. Activate the service's queue, and associate the queue with a stored procedure that forwards messages. Configure the initiator service to send messages to the intermediary service.

■ Activating the queues externally. If you plan to activate a queue, you will most likely activate it internally, which means that you will associate a stored procedure with the queue. However, you can also activate a queue externally. External activation is event-based: when you implement external activation, Service Broker generates an event whenever a message arrives in the specified queue. Service Broker sends an event notification to a second service, which is associated with an external program that reads and processes the messages in the original queue. However, an external process can be slower to start and to establish connections with the database. As a result, you should consider using external activation only in the following situations:

● You improve performance by processing queues externally. For example, you might see performance gains when the rate of arriving messages varies significantly.

● You want to use one external application to monitor a large number of queues.

Important If you use external activation, you must remove the assigned stored procedure from the queue definition, if one has been specified. You can assign only one activator to a queue, regardless of whether it is a stored procedure or an external process. The sketch that follows illustrates both activation styles.
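The following Transact-SQL sketch shows both activation styles. The queue, procedure, and service names are hypothetical and are not part of the course scenario.

-- Internal activation: Service Broker starts the associated procedure
-- when messages arrive and runs up to five readers concurrently.
CREATE QUEUE dbo.OrderQueue
    WITH STATUS = ON,
    ACTIVATION (
        STATUS = ON,
        PROCEDURE_NAME = dbo.ProcessOrders,
        MAX_QUEUE_READERS = 5,
        EXECUTE AS SELF);

-- External activation, as an alternative: create the queue without an
-- ACTIVATION clause, and instead raise a QUEUE_ACTIVATION event
-- notification that an external program subscribes to.
CREATE EVENT NOTIFICATION OrderQueueActivation
    ON QUEUE dbo.OrderQueue
    FOR QUEUE_ACTIVATION
    TO SERVICE 'ExternalActivationService', 'current database';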


Lesson 3: Designing Service Broker Solution Availability

Lesson objectives

After completing this lesson, students will be able to:

■ Evaluate the considerations for designing Service Broker fault tolerance.

■ Evaluate the considerations for designing a Service Broker backup strategy.

Introduction

As you design your Service Broker solution, you should plan your solution's availability so that you are prepared for such events as hardware failure, power failure, or loss of connectivity. Your goal should be to prevent any data loss and to ensure that your system can readily return to a normal state once availability-related issues have been addressed. This lesson describes the considerations that you should take into account when planning your solution's availability. Specifically, the lesson discusses the considerations for designing fault tolerance and a backup strategy.


Considerations for Designing Service Broker Fault Tolerance

Introduction

Service Broker includes a number of built-in features that help to prevent the loss of data, such as transactional messaging and message persistence. However, you must take into account other considerations when planning your system’s fault tolerance. For example, messages within a database can be lost if you do not ensure fault tolerance on the database itself or on the server on which SQL Server runs. This topic discusses the considerations for designing fault tolerance to support your Service Broker solution.

Designing Service Broker fault tolerance

When designing your solution's fault tolerance, you should take into account the following considerations:

■ Setting up database mirrors. When you implement database mirroring, SQL Server maintains two copies of a database. If the principal database fails, the system fails over to the mirror. Clients can access only one of the two databases at any given time. Service Broker is tightly integrated with SQL Server mirroring, thus providing application failover. You can create a route that targets both the principal and mirrored databases: in the route definition, specify the principal address as normal, and also specify the mirror address. When you send a message over that route, Service Broker first tries the principal database. If it cannot deliver the message to the principal database, it sends the message to the mirrored database (see the sketch at the end of this list). The application does not have to open a second connection or take any action to transition to the mirrored database. If Service Broker cannot send the message to either database, SQL Server rolls back the message transaction and holds the message in the queue until the service can receive it.

Important Currently, Microsoft does not support the mirroring functionality in SQL Server 2005, and SQL Server disables this functionality by default. Although you can enable mirroring, Microsoft recommends that you do not implement mirroring in a production environment. See SQL Server 2005 Books Online and visit the SQL Server home page on the Microsoft Web site for more information.

■ Creating server clusters. A cluster is a set of two or more computers that provide failover services in the event of application or server failure. Clustering requires fewer network and disk resources than mirroring, but it is more difficult to implement, and failover takes longer should a failure occur. You must also allocate more memory to clustering than to mirroring because you need enough memory to support failover.



■ Assessing the effects on performance. Implementing a highly available solution can reduce system performance and use up resources. When determining the degree of fault tolerance that you want to implement, you should take into account how much data loss is tolerable. If processes are easily repeated and data easily replaced, you do not need to invest nearly as much as when you cannot afford any downtime or data loss. For example, certain RAID configurations are more fault tolerant than others, but they can be more expensive and can affect performance.
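As a sketch, the route described above might look like the following; the service and server names are hypothetical.

-- If Service Broker cannot deliver to the principal address, it
-- automatically retries delivery against the mirror address.
CREATE ROUTE OrderServiceRoute
    WITH SERVICE_NAME = '//Fabrikam/OrderService',
         ADDRESS = 'TCP://principal.fabrikam.com:4022',
         MIRROR_ADDRESS = 'TCP://mirror.fabrikam.com:4022';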


Considerations for Designing Service Broker Backup Strategy

Introduction

Even after you implement a fault-tolerant solution, events can occur that can potentially result in the loss of data. An important strategy in protecting your data—in addition to ensuring fault tolerance—is to back up your database regularly. Your strategy should include plans for restoring your database should that become necessary. This topic describes the considerations for designing a Service Broker backup strategy.

Designing a Service Broker backup strategy

When designing a backup strategy, you should take into account the following considerations:

■ Backing up your databases. SQL Server stores Service Broker metadata and regular data in the application's database. When you back up the database, you also back up the following objects:

● Service Broker objects, including services, queues, message types, contracts, and routes

● XML schema collections, including those used by message types

● Stored procedures, including those associated with activated queues

● Certificates, including certificates used for dialogs

● Users, including those used within an execution context

■ Restoring your databases. When designing your backup strategy, you should also plan for recovering data should you need to restore a database. Your recovery plan should take into account the following considerations:

● Losing messages. When restoring a Service Broker database, you might lose some messages, which can create synchronization issues among the processes supported by the solution. The less frequently you back up your database, the more messages you might lose and the more messages you might need to roll back on other servers to resynchronize the system.

● Associating users with logins. If you restore your database to the same instance of SQL Server, SQL Server preserves the system metadata (as opposed to the database metadata contained in the database that you restore). However, if you restore the database to a different instance of SQL Server, you must reassociate users with the appropriate logins. For information about reassociating users, see the topic "Troubleshooting Orphaned Users" in SQL Server 2005 Books Online.

● Reconfiguring routes. You must modify service routes on the various servers as necessary to ensure accurate message delivery.

● Creating endpoints. If you restore your database to the same instance of SQL Server, SQL Server preserves the Service Broker endpoint. If you restore the database to a different instance, you must re-create the endpoint.

● Enabling Service Broker. After you restore a database, you might need to enable Service Broker on that database. You can enable Service Broker by issuing an ALTER DATABASE statement (see the sketch at the end of this list).

● Creating Transact-SQL script files. SQL Server enables you to create script files based on object definitions. For example, you can create script files for each endpoint, login, and route. The script files provide the statements necessary to create the objects. You should create script files for any object that you cannot easily preserve in a database backup. Creating script files should be part of a larger backup-and-restore strategy that encompasses all components in your SQL Server installation. You can then use the script files to generate the objects on different instances of SQL Server. After creating the script files, you should back them up as necessary to ensure their availability and reliability.

● Protecting external applications. Service Broker solutions often integrate with application components outside of SQL Server. Your backup strategy should include plans for backing up and restoring those components.
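A minimal sketch of the re-enabling step follows; the database name is hypothetical. ENABLE_BROKER preserves the database's existing broker identifier, whereas SET NEW_BROKER assigns a new one.

ALTER DATABASE FabrikamOrders SET ENABLE_BROKER;

-- Verify the setting:
SELECT name, is_broker_enabled, service_broker_guid
FROM sys.databases
WHERE name = N'FabrikamOrders';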


Lab: Designing a Service Broker Solution

Objectives

■ Design a Service Broker solution architecture.

■ Design the Service Broker data flow.

■ Execute a Service Broker solution.

Introduction

In this lab, you will design a Service Broker solution. The solution must take into account the solution architecture and data flow. After you design your solution, you will implement it by using an instance of SQL Server. The lab contains the following three exercises:

■ Exercise 1: Designing a Service Broker Solution Architecture. In this exercise, you will identify the broker instances and conversations necessary to support your Service Broker solution. You will then create a Microsoft Office Visio® diagram based on your solution. You will perform this exercise in small groups and appoint one person in the group to create the diagram.

■ Exercise 2: Designing a Detailed Service Broker Solution. In this exercise, you will identify the services, queues, stored procedures, and data flow necessary to support your Service Broker solution. You will then create a Visio diagram based on your solution. You will perform this exercise in small groups and appoint one person in the group to create the diagram.

■ Exercise 3: Executing a Service Broker Solution. In this exercise, you will implement your Service Broker solution. You will work individually to complete this exercise.


Scenario

Fabrikam, Inc. wants to ensure that addressing and pricing information is synchronized between the central office and the branch offices. Processes should be implemented to handle the following events:

■ Customer address information changes at a branch office. A stored procedure in the branch office database should write the changes to that database and then send a message (with the updated address information) to the central office database. A stored procedure in the central office database should receive the message, update that database, and then send a message (with the updated address information) to the other branch office databases.

■ Item pricing information changes at the central office. A stored procedure in the central office database should write the changes to that database and then send a message (with the updated item pricing information) to the branch office databases.

Preparation

Ensure that the virtual machine for the computer 2781A-MIA-SQL-06 is running. Also, ensure that you have logged on to the computer. If you need to log on to the computer, use the following credentials:

■ Username: Student

■ Password: Pa$$w0rd


Exercise 1: Designing a Service Broker Solution Architecture

Introduction

In this exercise, you will design the architecture for a Service Broker solution. You will identify the Service Broker instances and conversations necessary to support your solution. You will create a Visio diagram to model your solution. You will perform this exercise in small groups.

Identifying the Service Broker instances

Summary

■ Identify the Service Broker instances needed to implement a solution that meets the business requirements. For the purposes of the exercise, assume that there are only two branch offices.

Detailed Steps

1. Identify the Service Broker instances necessary to enable changes to customer information to be propagated between branch offices and the Fabrikam, Inc. central office, and to enable changes to item prices to be propagated from the central office to the branch offices.

2. Create a new Visio diagram by using the shapes in the Service Broker.vss stencil in the E:\Labfiles\Starter folder. Add the instances of Service Broker that you have identified to this diagram, and name them.

3. Add the databases for each location to the Visio diagram.

4. Save the diagram as Service Broker Conversations.vsd in the E:\Labfiles\Starter folder. (You will add information about Service Broker conversations to this diagram in the next task.)

Identifying the conversations

Summary

■ Identify the conversations required to propagate changes between the databases at the central office and the branch offices.

Detailed Steps

1. When identifying the conversations, take into account the Service Broker instances and databases that you identified in the previous task. Your solution should include all conversations between each branch office and the central office.

2. Add the identified conversations to your Visio diagram.


Exercise 2: Designing a Detailed Service Broker Solution

Introduction

In this exercise, you will design the detailed Service Broker solution that meets the business requirements of Fabrikam, Inc. You will identify the services, queues, stored procedures, and data flow necessary to support your solution. You will create a Visio diagram to document your solution. You should perform this exercise in the same groups as before.

Identifying the services and queues

Summary

■ Identify the Service Broker services and queues needed to send data between the branch offices.

Detailed Steps

1. Identify the services necessary to support the customer update process and the price update process.

2. Identify the queues necessary to support the services.

3. Create a Visio diagram to model your solution by using the Service Broker.vss stencil in the E:\Labfiles\Starter folder.

Note Include only the CentralOffice and BranchOffice1 databases in the diagram. The requirements for BranchOffice2 are the same as for BranchOffice1.

4. Add the services that will receive messages about updated pricing and addressing information. For this exercise, do not include any services that send messages but do not receive them.

5. Add the appropriate queue to each service.

6. Save this diagram as Service Broker Data Flow.vsd in the E:\Labfiles\Starter folder. You will extend this diagram in the following task in this exercise.

Identifying the stored procedures and data flow

Summary

■ Identify the stored procedures needed to update the customer and item information in each database and keep the databases synchronized.

Detailed Steps

1. You should identify two types of stored procedures:

● Stored procedures that are activated by receiving Service Broker messages. These stored procedures update the database to keep it synchronized, using the data sent as part of the message.

● Stored procedures that applications use to update the data locally and initiate a Service Broker conversation to transmit the updated data to other sites.

2. Add the stored procedures to the Visio diagram that you created in the previous task.

3. Add connectors that indicate which messages in which queues activate stored procedures, and which stored procedures send messages to update remote databases.

4. Add arrows to indicate which table each stored procedure updates.

5. Save the diagram.

Discussion questions

1. Is it necessary to add multiple queues to the branch office databases?

You can use a single queue in each branch office database, but this approach complicates your application. You would have to create a stored procedure that can distinguish between the types of messages and take the appropriate action for each. This approach results in extra processing and more complicated code, both of which are unnecessary.

2. Is there a way to prioritize price update messages over address update messages in the branch offices?

Because the branch office databases each contain two queues, Service Broker processes the messages in parallel. As a result, you can update prices at the same time that you update customer information, which might satisfy your application's requirements. You can also consider setting the maximum number of concurrent stored procedure instances to 1 on the CustomerUpdateQueue queue. That way, you guarantee that this queue never uses more resources than necessary to perform one update at a time.


Exercise 3: Executing a Service Broker Solution

Introduction

In this exercise, you will implement your Service Broker solution. You will work individually to complete this exercise.

Creating the databases and Service Broker objects

Summary

1. Create the databases.

2. Set database options to support Service Broker.

3. Create tables and stored procedures.

4. Create Service Broker objects.

Detailed Steps

■ Open SQL Server Management Studio, and then open the Build Solution.sql script in the E:\Labfiles\Starter folder.

■ The SQL script includes a number of comments. Replace the comments with the code necessary to create and configure the required objects.

■ Run the SQL script. For this exercise, you will use a single instance of SQL Server.

Testing the solution

Summary

■ Run the test script to validate the Service Broker solution.

Detailed Steps

■ In SQL Server Management Studio, open the TestSolution.sql script file.

■ Run the script and verify that the results are what you would expect. Refer to the comments in the script for information about the test.

Discussion questions

1. How do you troubleshoot Service Broker problems?

The method you use depends on the type of problem. Most issues result from incoming messages not being processed correctly. In this case, you should disable the stored procedure activation on the receiving queue and try to process each message manually. Use PRINT statements or a specially developed script for debugging.

2. How do you back up a Service Broker configuration?

When you back up your databases, you also back up the Service Broker components.

3. How do you resubmit a message that failed because of connectivity issues?

You do not have to resubmit the message. If a service cannot deliver a message, Service Broker holds the message in the sending database's transmission queue and delivers it automatically when connectivity is reestablished, unless both services end the conversation first.

Important After the discussion, shut down the virtual machine for the computer 2781A-MIA-SQL-06. Do not save the changes.

Module 7

Planning for Source Control, Unit Testing, and Deployment

Contents:

Lesson 1: Designing a Source Control Strategy 7-2
Lesson 2: Designing a Unit Test Plan 7-10
Lesson 3: Creating a Performance Baseline and Benchmarking Strategy 7-20
Lesson 4: Designing a Deployment Strategy 7-28
Lab: Planning for Source Control, Unit Testing, and Deployment 7-36

Information in this document, including URL and other Internet Web site references, is subject to change without notice. Unless otherwise noted, the example companies, organizations, products, domain names, e-mail addresses, logos, people, places, and events depicted herein are fictitious, and no association with any real company, organization, product, domain name, e-mail address, logo, person, place, or event is intended or should be inferred. Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.

The names of manufacturers, products, or URLs are provided for informational purposes only, and Microsoft makes no representations and warranties, either expressed, implied, or statutory, regarding these manufacturers or the use of the products with any Microsoft technologies. The inclusion of a manufacturer or product does not imply endorsement by Microsoft of the manufacturer or product. Links are provided to third-party sites. Such sites are not under the control of Microsoft, and Microsoft is not responsible for the contents of any linked site, any link contained in a linked site, or any changes or updates to such sites. Microsoft is not responsible for webcasting or any other form of transmission received from any linked site. Microsoft is providing these links to you only as a convenience, and the inclusion of any link does not imply endorsement by Microsoft of the site or the products contained therein.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

©2006 Microsoft Corporation. All rights reserved.

Microsoft, Active Directory, ActiveX, BizTalk, Excel, Microsoft Press, MSDN, MSN, Outlook, PowerPoint, SharePoint, Tahoma, Visio, Visual Basic, Visual C#, Visual SourceSafe, Visual Studio, Windows, and Windows Server are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. All other trademarks are property of their respective owners.


Module objectives

After completing this module, students will be able to:

■ Design a source control strategy.

■ Design a unit test plan.

■ Create a performance baseline and benchmarking strategy.

■ Design a deployment strategy.

Introduction

In this module, you will learn the guidelines and considerations that you need to know to plan for source control, unit testing, and deployment during the design of a Microsoft® SQL Server™ 2005 solution. Tasks include designing a source control strategy, designing a unit test plan, creating a performance baseline and benchmarking strategy, and designing a deployment strategy. At the end of the module, you will work in teams to perform design tasks and then present and defend your design decisions to your peers.


Lesson 1: Designing a Source Control Strategy

Lesson objectives

After completing this lesson, students will be able to:

■ Explain the need for source control in your organization.

■ Explain how to integrate source control with Management Studio.

■ Explain the process for creating a source control plan.

■ Evaluate the considerations for encrypting source code.

Introduction

A database is more than a set of files that store data accessed by users through applications. For a database developer, a database starts as a set of scripts that must be developed and run to create and configure the database and all of the objects that it contains. These scripts are code, and as with any development process, this code is subject to change over time. It is also likely that multiple developers participate in creating the scripts, so managing these files in a shared-access environment can become an important management task. This situation is not very different from the standard application development process, and as such, it can benefit from source control. In this lesson, you will see how to use source control techniques in database applications and why these techniques are important for any database system.


Discussion: The Need for Source Control

Introduction

In this discussion, you will assess the advantages of source control. You will also discuss some of the problems that you can avoid through proper use of source control.

Discussion questions

1. Have you worked on projects for which you did not implement source control? What are some of the issues that you faced because of this?

Answers will vary. Without source control, there is no way to find differences between code versions; moreover, it is not possible to track changes between versions. If you have worked with other developers on the project, some code might be lost because another developer might overwrite your changes. For example, you cannot recover accidentally deleted objects unless you have source control in place. Source control not only prevents accidental loss of code, but it also provides comparison between versions to help track down bugs introduced by new versions of the code.

2. Have you worked on a project for which you implemented development code in production by mistake? What steps could you have taken to avoid this?

Answers will vary. In many projects, development code ends up deployed to production. The causes vary from one project to another: in some cases, the root of the problem is improper version control; in other cases, it is inadequate control of dependencies between software components. For example, changes might be applied to a component without a full regression test to make sure that those changes do not affect other components of the system, and deploying such insufficiently tested code changes can lead to errors in production. The way to avoid this issue is through proper regression testing of all dependent components before deciding to deploy to production.


Demonstration: Integrating Source Control with Management Studio

Introduction

In this demonstration, you will see the integration of Microsoft SQL Server Management Studio and Microsoft Visual SourceSafe®. You will learn how to configure and use some of the most important features of Microsoft Visual SourceSafe and how to use these features in a SQL Server project.

Preparation

Ensure that the virtual machine 2781A-MIA-SQL-07 is running and that you are logged on as Student. If the virtual machine has not been started, perform the following steps:

1. Close any other running virtual machines.

2. Start the virtual machine. In the Log On to Windows dialog box, complete the logon procedure by using the user name Student and the password Pa$$w0rd.

Procedure for configuring integration between SQL Server Management Studio and Microsoft Visual SourceSafe

Perform the following steps to configure integration between SQL Server Management Studio and Microsoft Visual SourceSafe:

1. Click Start, point to All Programs, point to Microsoft SQL Server 2005, and then click SQL Server Management Studio.

2. In the Connect to Server dialog box, specify the values in the following table and then click Connect.

   Property          Value
   Server type       Database Engine
   Server name       MIA-SQL\SQLINST1
   Authentication    Windows Authentication

3. In SQL Server Management Studio, on the File menu, point to Source Control and then click Launch Microsoft Visual SourceSafe.


4. On the Welcome to the Add SourceSafe Database Wizard page, click Next.

5. On the Database Selection page, select Create a new database and then click Next.

6. On the New Database Location page, enter E:\Democode\SampleDB for the Location and then click Next.

7. On the Database Connection Name page, enter SampleDB for the Name and then click Next.

8. On the Team Version Control Model page, review the explanation of the two options, select Lock-Modify-Unlock Model, and then click Next.

9. On the Completing the Add SourceSafe Database Wizard page, click Finish.

10. In the Log On to Visual SourceSafe Database dialog box, specify a user name of Student and a SourceSafe password of Pa$$w0rd, and then click OK.

11. In the SampleDB – Visual SourceSafe Explorer window, right-click the $/ project to show the available commands.

12. Close the SampleDB – Visual SourceSafe Explorer window.

Procedure for creating a new Database Engine project and adding it to source code control

Perform the following steps to create a new Database Engine project and add it to source code control:

1. In SQL Server Management Studio, on the File menu, point to New and then click Project.

2. In the New Project dialog box, enter the following values and then click OK.

   Property                         Value
   Name                             SampleProject
   Location                         E:\Democode
   Solution Name                    SampleSolution
   Create directory for solution    Selected
   Add to Source Control            Selected

3. In the Log On to Visual SourceSafe Database dialog box, specify a user name of Student and a SourceSafe password of Pa$$w0rd, and then click OK.

4. In the Add to SourceSafe dialog box, accept the default location and then click OK.

5. Click Yes to create Project $/SampleSolution.root.

6. In Solution Explorer, right-click the Connections folder and then click New Connection. Accept the default values to connect to MIA-SQL\SQLINST1 by using Microsoft Windows® Authentication and then click OK.

7. In Solution Explorer, right-click the new connection and then click New Query.

8. In Solution Explorer, in the Queries folder, right-click the SQLQuery1.sql query and then click Rename. Enter SampleQuery.sql for the new name.

9. In the Query Editor window, add the following comment to SampleQuery:

/* Version 1 */

10. On the File menu, click Save All to save all files to disk. If a warning about character encoding appears, click OK.

11. In the Pending Checkins pane, click Check In. In the message box, click Check In. Notice that your files are marked with a lock symbol in Solution Explorer.

12. In the Query Editor window, modify the comment in the SampleQuery query as follows:

/* Version 2 */

This is version 2 of the query.

13. In the Pending Checkins pane, click Check In. In the message box, click Check In.

14. Right-click the SampleQuery.sql file in Solution Explorer and then click View History.

15. In the History Options dialog box, accept the default options and then click OK. A window displaying the version history of files appears.

16. Select both versions of the file, click Diff, and then click OK to show the differences between versions.

17. Close the Differences between SampleQuery.sql version 1 and SampleQuery.sql version 2 window and the History of $/SampleSolution.root/SampleSolution/SampleProject/SampleQuery.sql window.

Testing source code conflict prevention

1. On the Start menu, right-click SQL Server Management Studio and then click Run as.

2. In the Run As dialog box, click The following user. Enter Administrator for the user name, enter Pa$$w0rd for the password, and then click OK.

3. In the Connect to Server dialog box, specify the values in the following table and then click Connect.

   Property          Value
   Server type       Database Engine
   Server name       MIA-SQL\SQLINST1
   Authentication    Windows Authentication

4. On the File menu, point to Open and then click Project/Solution.

5. In the Open Project window, open the E:\Democode\SampleSolution folder and open the solution SampleSolution.ssmssln.

6. In the Log On to Visual SourceSafe Database dialog box, specify a user name of Student and a SourceSafe password of Pa$$w0rd, and then click OK.

Important The default user name is Administrator. Make sure you change it to Student.

7. In the Queries folder in Solution Explorer, double-click the SampleQuery.sql file.

8. In the Query Editor window, modify the comment in the SampleQuery query as follows:

/* Version 3 */

9. Switch to the original instance of SQL Server Management Studio and then change the version number of the query to 4. Notice that you cannot check in this copy, because it is already checked out by the process running the other instance of SQL Server Management Studio.

10. Close all instances of SQL Server Management Studio without saving any files.


The Process for Creating a Source Control Plan

Introduction

Source control integration in SQL Server Management Studio is a new feature for database developers (DBDs); however, source control is a very common tool for application developers. In this topic, you will learn the principal steps to set up the source control tool to store your database scripts. For enterprise-wide source control, you should consider Microsoft Visual Studio® Team System (VSTS). This is an enhanced version control system that includes features such as reporting and links to Microsoft Project and Microsoft SharePoint® Team Services. VSTS is a suite of products including Visual Studio Team Foundation Server, which provides collaboration functionality, and Visual Studio Team Test Load Agent, which generates test loads to simulate multiple users.

Creating folders

Creating a folder in the source control tool allows you to organize your source code into projects, where each project can store many related files. Bear in mind that inside this folder, you can store any type of file—not just Transact-SQL, MDX, or XMLA queries, but also documentation about the project, photos, digital images, and so on. However, you can effectively only compare text files to previous versions. This can be a very effective way to organize your work, allowing you to maintain all files related to the same project in the same source control folder.

Creating scripts

You can store the script to create each object in a separate file. However, some objects are dependent on other objects, and it might make more sense to keep their scripts together in the same file. You could even consider keeping the entire database script in one file, which certainly solves the dependency problem but makes it more difficult to have effective source control. Bear in mind that any changes by any developer to any object in the database require a new version of this unique file. Group scripts in files in a way that makes sense and provides the functionality that the development team requires.

7–8

Module 7: Planning for Source Control, Unit Testing, and Deployment

Checking out documents

You must check out documents if you need to edit them. After you complete your edits, you then check in the documents. It is crucial to always check out the latest version and not assume that a local copy is up to date. By checking in the document after each significant modification, it is easy to compare versions. SourceSafe includes a Show Differences utility that will highlight the changes between document versions, allowing you to quickly find the cause of problems.

Keyword expansion

You should provide comments with every edit in a document. This allows reviewers to see your changes and your justification for each change. When developers write SQL scripts, they can add keywords that cause Visual SourceSafe to insert comments automatically. The most common keyword insertion is:

--$History: $

This has no effect on the original code, but Visual SourceSafe will add the user, the date and time, and any comments at this point in the script. The Visual SourceSafe .ini file, srcsafe.ini, allows you to specify keyword comments so that these insertions will not affect the execution of the SQL script. Keyword expansion is off by default, but you can enable it in Visual SourceSafe Administrator or in srcsafe.ini.
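For example, you might place the keyword inside a comment block at the top of a script. This is a sketch: the procedure is hypothetical, and the exact text that Visual SourceSafe inserts at the keyword depends on your srcsafe.ini settings.

/*
$History: $
*/
-- Because the keyword sits inside a comment block, whatever Visual
-- SourceSafe expands it into remains commented out in Transact-SQL.
CREATE PROCEDURE dbo.GetCustomers
AS
SELECT CustomerId, CustomerName
FROM dbo.Customers;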

Creating baselines

When a project reaches a deliverable milestone, you should label it and create a baseline from this version before any more development takes place. A baselined version will not have any changes applied to it; all new changes are stored in a new folder. This gives you the option to revert to any version of the entire set of files at any time while making sure that the files are kept together, maintaining their compatibility, without affecting future versions. Labeling versions properly gives you the chance to work with different versions of the system, fixing bugs in an old version while still working on the new version. Once you have deployed a version, you should baseline it, and all new development should occur in a new folder, allowing you to compare production and development versions.


Considerations for Encrypting Source Code

Introduction

You often need to secure data to prevent undesired access from unknown sources because your data has a very high value to your organization. For the same reason, you should secure your source code. Source code implements many business rules, and these rules can be a key factor for business success and are part of the intellectual property of the organization. Anyone can learn about your company by reading the source code of its applications, and this is not the type of information that you want to share with strangers. Therefore, you should consider encrypting your source code, which will allow access only to authorized members of your organization.

Consider using the encrypting file system (EFS)

SQL Server Encryption encrypts only data stored in the database, but Visual SourceSafe stores files in the file system. By using EFS, you can encrypt the entire folder structure that supports the project in SourceSafe. EFS does not prevent undesired physical access to the media, but an attacker cannot read the code in the files it has encrypted. In an enterprise environment, you must consider granting the minimum set of permissions on these files for every user that might need access to Visual SourceSafe.
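As a sketch, you could encrypt the folder tree that holds a SourceSafe database from a command prompt; the path here is a hypothetical example.

rem Encrypt the directory and all of its subdirectories with EFS.
cipher /e /s:"D:\SourceSafe\SampleDB"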

Consider using Secure Sockets Layer (SSL)

You can configure Visual SourceSafe to provide Hypertext Transfer Protocol (HTTP) access to the source files. If you do not use Secure Sockets Layer (SSL) in this environment, the data in the source files is accessible directly from the packets that transport it across the network. If you need to access your files over the Internet, SSL is the main way to avoid undesired access to your source code.

Check whether existing encryption strategies suit your encryption requirements

EFS prevents undesired read access to encrypted files, but you must ensure that legitimate access to the files is not affected and that only authenticated and authorized users can access the directories in the file system that support Visual SourceSafe. You can use certificates to ensure that only appropriate users can access SourceSafe projects. Pay attention to isolating different applications by using different Visual SourceSafe databases: you configure access permissions and encryption at the folder level, and if all applications share a single database, you cannot control the physical storage of each application's source code files separately. If you need to manage different applications in an isolated way, you must create different SourceSafe databases.


Lesson 2: Designing a Unit Test Plan

Lesson objectives

After completing this lesson, students will be able to:

■ Describe the problems that functional testing does not identify.

■ Explain what unit testing is.

■ Explain how to perform a unit test.

■ Evaluate the considerations for creating a unit test plan.

Introduction

In this lesson, you will learn about unit test plans. Advantages of unit testing include the following:

■ Well-tested code works better. If you design and develop an automatic unit test for all of your objects, you can test them at any time, particularly before deployment.

■ You must test every object many times. In a standard development process, you need to run tests on all components at every milestone to ensure that the application works correctly. It could take a long time to manually test each object multiple times, but automating testing can shorten the time required.


Discussion: Problems Not Identified by Functional Testing

Introduction

Functional testing is a process to determine whether each function of an application works as intended. In this discussion, you will consider some of the problems that functional testing does not identify.

Discussion questions

1. Have you worked on projects for which you did not implement unit tests? What are some of the issues faced because of this?

If you use only functional testing or do not test at all, you will not verify that all code and objects perform correctly and consistently. In a database system, performance is a key factor because many concurrent users can access the same object. If performance and concurrency are not tested, applications might not work in a production environment. On projects without unit testing, there is no easy way to determine whether changes affect other parts of the application. To determine whether changes affect other processes, you must test the entire application, and in many cases, this is an impossible task. One stored procedure can produce locks and blocks with others, and testing objects independently cannot detect these kinds of conflicts.

2. What was the best environment that you worked in with respect to testing code? What are some of the best practices that you followed?

Answers will vary. When thinking about how to test a stored procedure or function, you should think of ways to make it fail. You can find varied and unusual issues by writing tests this way. Testing your own code will reduce the number of bugs delivered to test teams and minimize the amount of time needed to test the application.

3. In your experience, what are some examples in which code was syntactically correct but did not function correctly for your solution?

Answers will vary. A query can often be syntactically correct but not return the correct results. For example, you might need to return the names of your customers and the number of invoices they have this year. If you use an INNER JOIN to connect the Customers table with the Invoices table, customers who do not have any invoices will not appear in the list. The code is syntactically correct, but it does not meet the requirements.
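A Transact-SQL sketch of this pitfall follows; the table and column names are illustrative.

-- Syntactically correct, but silently drops customers with no invoices:
SELECT c.CustomerName, COUNT(i.InvoiceId) AS InvoiceCount
FROM Customers AS c
INNER JOIN Invoices AS i ON i.CustomerId = c.CustomerId
WHERE YEAR(i.InvoiceDate) = YEAR(GETDATE())
GROUP BY c.CustomerName;

-- Meets the requirement: the outer join keeps customers with zero
-- invoices, and the date filter moves into the ON clause so that it
-- does not turn the outer join back into an inner join.
SELECT c.CustomerName, COUNT(i.InvoiceId) AS InvoiceCount
FROM Customers AS c
LEFT JOIN Invoices AS i ON i.CustomerId = c.CustomerId
    AND YEAR(i.InvoiceDate) = YEAR(GETDATE())
GROUP BY c.CustomerName;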


What Is Unit Testing?

Introduction

Unit testing is a series of tests that you develop to test objects and pieces of functionality. These tests individually pass in all accepted inputs and then validate the output. They should test both inputs that should cause an error and inputs that should succeed. Unit testing also enables you to detect problems caused by changes in one component that affect other components.

Unit testing enables you to detect and resolve bugs earlier, minimizing the time spent finding and solving them, and it allows you to change your code while guaranteeing proper functionality after the changes. A previously fixed component that stops working unexpectedly is a regression bug, and unit testing will help you to see what caused the regression. It is possible that the solution is fragile or that you have made a similar mistake in a new component.

The goal of unit testing is to allow you to run tests that will easily show the parts of the application that are broken before or after you introduce new features. In many circumstances, these new features are the principal reason for unexpected failures of applications. Microsoft Visual Studio Team System includes support for unit testing.

Automatic testing

You can write applications to perform automatic testing or use tools such as Microsoft Visual Studio Team Edition for Software Testers to run automated tests. Automatic testing allows you to run unattended tests rather than waiting while manual tests run. Automatic testing also makes it less likely that you might overlook important tests.


Unit testing compared to application testing

Unit testing tests the individual units of an application to isolate errors early in the development process. If you tested the whole application at the end of development, it would be nearly impossible to isolate the cause of some failures, as there are so many interconnected components. Unit testing cannot replace application testing, as you must still test the entire solution together. When testing the whole application, you can use products such as Visual Studio Team Test Load Agent with Microsoft Visual Studio Team Edition for Software Testers to simulate many users at one time. Manually simulating the number of users required might be difficult and expensive, if not impossible. It is sensible to use unit testing while an application is in development, but test the whole application at each significant step, whether this is an alpha version or a beta version, or prior to deployment.

Repeatability of tests

The act of testing itself changes the environment. For example, any successful test of a data modification component will necessarily change data. Because of this, you must restore the environment to its original state after tests have run; otherwise, you cannot accurately repeat the tests. You can do this by using backup and restore, by using virtual machine images that you can copy and test, or by creating undo disks to roll back changes.

Benefits of unit testing

■ Better code quality. All applications have bugs, but you should resolve these bugs before deploying your application. By detecting and solving bugs earlier, the code quality is improved.

■ Easier changes. Changes introduce unexpected errors. Unit testing helps to resolve these errors, resulting in easier changes to your application.

■ Shorter development cycle. Even though you need to spend time writing unit test code, this time is less than the time it would take to resolve bugs after deployment.

Limitations of unit testing

Unit testing is not a replacement for stress tests or functional tests. All of these tests are necessary to ensure that an application will work properly in the production environment. Unit testing provides an automatic level of testing necessary when writing high-quality scripts, but stress and functional testing are still necessary to ensure development quality.


Demonstration: Performing a Unit Test

Introduction

In this demonstration, you will learn about the importance of performing comprehensive unit testing. This demonstration uses the following scenario: You are the developer of a procedure that inserts orders into a sales database. You need to test the procedure and resolve any bugs you find. This unit test is simple to enable you to quickly understand the logic. Unit tests are often much more complex. This demonstration does not return the database to its original state, but in the real world, you should add code to perform this step as well. Additionally, this demonstration does not show you how to fix the error that occurs; it only serves to show you how important it is to cover all eventualities when performing unit testing.

Preparation

Ensure that virtual machine 2781A-MIA-SQL-07 is running and that you are logged on as Student. If the virtual machine has not been started, perform the following steps:

1. Close any other running virtual machines.

2. Start the virtual machine. In the Log On to Windows dialog box, complete the logon procedure by using the user name Student and the password Pa$$w0rd.

Procedure for running a unit test

You must perform the following steps to install and run a unit test:

1. Click Start, point to All Programs, point to Microsoft SQL Server 2005, and then click SQL Server Management Studio.

2. In the Connect to Server dialog box, specify the values in the following table and then click Connect.

   Property          Value
   Server type       Database Engine
   Server name       MIA-SQL\SQLINST1
   Authentication    Windows Authentication

3. On the File menu, point to Open and then click Project/Solution.

4. In the Open Project window, open the E:\Democode\Orders folder and open the solution Create and Test Orders.ssmssln.

5. In Solution Explorer, expand the Queries folder and then double-click the Create Orders.sql query.

6. Examine the code for this query. This code creates a database called Orders containing a table called OrderHeader, and a stored procedure called InsertOrderHeader that adds a new row to the OrderHeader table. Notice that the InsertOrderHeader stored procedure sets the transaction isolation level to READ COMMITTED and generates an OrderId for the OrderHeader row by calculating the current maximum OrderId value and adding 1 to it.

7. Click Execute to create the database, table, and stored procedure.

8. In Solution Explorer, in the Queries folder, double-click the Create Log.sql query.

9. Examine the code for this query. This code creates a table called LogTestOrders for logging the results of unit tests run against the stored procedure.

10. Click Execute to create the LogTestOrders table.

11. In Solution Explorer, in the Queries folder, double-click the UT Create Orders.sql query.

12. Examine the code for this query. This code performs a unit test of the InsertOrderHeader stored procedure. It repeatedly executes the InsertOrderHeader stored procedure (1500 times by default) with some generated test data. The code then verifies that the OrderHeader table contains the expected number of rows. Any errors are recorded in the LogTestOrders table.

13. Click Execute, and verify that the Results window displays a single row containing the text Success.

14. In Object Explorer, expand the Databases folder, expand Orders, expand Tables, right-click dbo.LogTestOrders, and then click Open Table. Verify that this table contains no record of any errors, and then close the table.

15. On the Start menu, click Windows Explorer. Open the folder E:\Democode\Orders.

16. Right-click the file RunUTCreateOrders.cmd and then click Edit. Examine the code. This is a command file that uses sqlcmd to connect to SQL Server and run the UT Create Orders.sql script. Close the file.


17. In Windows Explorer, right-click the file RunConcurrentCreateOrders.cmd and then click Edit. Examine the code. This is a command file that runs two concurrent instances of the RunUTCreateOrders.cmd command file, simulating two users repeatedly running the InsertOrderHeader stored procedure at the same time. Close the file.

18. In Windows Explorer, double-click the file RunConcurrentCreateOrders.cmd to run the command file. Two Command Prompt windows will appear, both using sqlcmd to run the UT Create Orders.sql script. Both scripts run, but at least one (possibly both) reports the message Test run with errors. Close both Command Prompt windows.

19. Return to SQL Server Management Studio. In Object Explorer, right-click the LogTestOrders table and then click Open Table. This table will now contain a number of rows. Scroll the window to display the ErrorMessage column, and examine its contents in any row. It will contain a message indicating that the scripts attempted to insert rows containing duplicate key values. Close the table. The error was caused by using an inappropriate isolation level in the InsertOrderHeader stored procedure. You should report this error back to the developer, together with the circumstances that caused it to occur.

20. Close SQL Server Management Studio without saving any files.
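With hindsight, the race is easy to see: under READ COMMITTED, two sessions can read the same MAX(OrderId) and both attempt to insert that value plus 1. The following sketch shows one conventional fix; it is illustrative only (the procedure name and columns are assumptions), not necessarily the correction the courseware intends.

CREATE PROCEDURE dbo.InsertOrderHeaderSafe
AS
BEGIN
    SET NOCOUNT ON;
    BEGIN TRANSACTION;
        DECLARE @NextId int;
        -- UPDLOCK and HOLDLOCK take a held range lock on the rows read,
        -- so concurrent callers queue here instead of racing.
        SELECT @NextId = ISNULL(MAX(OrderId), 0) + 1
        FROM dbo.OrderHeader WITH (UPDLOCK, HOLDLOCK);

        INSERT INTO dbo.OrderHeader (OrderId, OrderDate)
        VALUES (@NextId, GETDATE());
    COMMIT TRANSACTION;
END;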


Considerations for Creating a Unit Test Plan

Test the entire solution

Make sure that you test the entire solution. The test scenario must be complete and must pay special attention to the most critical parts. Design a wide variety of scenarios to reproduce known errors and to test the most common data. Review the development schedules so that you can coordinate source code check-ins and perform unit testing on them.

Design tests for query performance

Run tests against your critical queries and compare the results with response times that you define. Check the performance while simulating stress on the server. You can download and use tools such as SQLIOStress and SQLIO from the Microsoft Web site to simulate pressure on the resources of your server and ensure that your application works properly in this scenario.
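As a sketch, you can capture per-query timings and I/O in SQL Server Management Studio before comparing them with your goals; the query shown reuses the OrderHeader table from the earlier demonstration and is illustrative only.

SET STATISTICS TIME ON;
SET STATISTICS IO ON;

-- Compare the reported elapsed time and logical reads with the
-- response-time goal defined for this query.
SELECT OrderId, OrderDate
FROM dbo.OrderHeader
WHERE OrderDate >= '20060101';

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;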

Design tests for data consistency

Ensure that sample sizes and sources vary sufficiently to depict meaningful outputs. Pay attention to the size of fields and number of rows involved in each query.

Design tests for application security

When testing, pay special attention to application security. As suggested in Writing Secure Code (Microsoft Press, 2002) by Michael Howard and David C. LeBlanc, you should assume that "the user is evil." Try to enter dangerous values in fields, such as quotation marks (') or comment sequences (--). When a user enters values in a field on a form, this data usually forms part of a SQL string. SQL injection is the practice of placing SQL code in these text boxes, and it is a threat to all SQL-based database products. For example, a user could type the following value in a search field:

'; UPDATE Products SET UnitPrice = 0.01 WHERE ProductId = 1--

The application appends this value to its original SQL string, which begins as follows:

SELECT * FROM Products WHERE ProductName='

The resulting SQL statement would be:

SELECT * FROM Products WHERE ProductName=''; UPDATE Products SET UnitPrice = 0.01 WHERE ProductId = 1--

In this example, the SQL injection modifies the price of a product, but if the attacker knows the structure of the database and you have not protected your system against SQL injection, the implications could be far worse.
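One defense that your security tests should verify is that the application passes user input as a parameter instead of concatenating it into the SQL string. The following is a minimal sketch, assuming a hypothetical Products table; because sp_executesql treats the parameter value as data, the injected text cannot alter the statement:

-- A minimal sketch, assuming a hypothetical Products table.
DECLARE @search nvarchar(100);
SET @search = N'''; UPDATE Products SET UnitPrice = 0.01 WHERE ProductId = 1--';

-- The parameter value is treated as literal data, not as executable SQL.
EXEC sp_executesql
    N'SELECT * FROM Products WHERE ProductName = @name',
    N'@name nvarchar(100)',
    @name = @search;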

Design tests for system resource utilization

When testing, collect data about resources from System Monitor, such as CPU use and I/O queues. You should also collect client statistics, such as read and write values and network traffic, from the Query menu in Microsoft SQL Server Management Studio. Establish a goal, and ensure that your results meet the goal in every test.
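In SQL Server 2005, you can also snapshot the server's performance counters from Transact-SQL by querying the sys.dm_os_performance_counters dynamic management view. The following is a minimal sketch; the counter names shown are common examples, and the set available varies by instance:

-- A minimal sketch: snapshot selected SQL Server performance counters.
SELECT object_name, counter_name, instance_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name IN (N'Batch Requests/sec', N'Page life expectancy');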

Lesson 3: Creating a Performance Baseline and Benchmarking Strategy

Lesson objectives

After completing this lesson, students will be able to:

■ Describe the process of creating a performance baseline and benchmarking strategy.
■ Apply the guidelines for identifying performance objectives.
■ Apply the guidelines for establishing performance benchmarks.
■ Apply the guidelines for measuring performance.
■ Apply the guidelines for creating a performance change strategy.

Introduction

In this lesson, you will learn how to create a performance baseline and a benchmarking strategy. You will learn how to identify the performance objectives to achieve. You will also learn to define a performance baseline and how to measure performance over time. Because systems are in continuous evolution, you should upgrade your performance strategies for new changes in the system. You will learn how to address these changes and learn guidelines to upgrade your system baseline.

Multimedia: The Process of Creating a Performance Baseline and Benchmarking Strategy

Introduction

This presentation describes the process of creating a performance baseline and benchmarking strategy.

Tip: To view this presentation later, open the Web page on the Student Materials compact disc, click Multimedia, and then click the title of the presentation.

Discussion questions

Read the following questions and discuss your answers with the class:

1. Have you ever had a baseline for your solution and later upgraded hardware? If so, how did you update your baseline?

Answers will vary. Possible answers include: You need to upgrade the server with new processors, new memory, new disk devices, and so on. You run SQLIOStress to benchmark the maximum capacity of the system. After this benchmarking, you install all the required software for running your applications, and then you run the current performance test on a regular basis. After upgrading the server, you run the performance test analysis in the same manner as before. You detect improvements in the performance of the server, but no further information about the maximum capacity of the new server is revealed.

2. What situations other than server hardware or network changes would require you to re-create your baselines?

Answers will vary. Possible answers include: Every change in the hardware configuration of the system affects the overall performance of the system. On the other hand, performing a maximum capacity analysis is time consuming, so you should decide when to perform a full analysis of the new scenario and when to perform only part of the analysis. Changes in the business process also require you to adapt the performance baseline. For example, adding database mirroring to the solution would affect the performance of the system. The performance would even differ depending on which database mirroring mode you apply (safety on or safety off). You should create specific baselines to analyze the performance of each member of the database mirroring infrastructure. Another example that requires adapting the performance baseline is using distributed partitioned views across servers; you should analyze each server specifically to detect potential problems in the overall solution.

3. Have you ever automated capturing baseline performance data throughout a solution? If so, what were some best practices that you followed?

Answers will vary. Possible answers include: No. Or: Yes. You have user-defined performance counters for your business processes—for example, in your .NET Framework code, you measure the number of orders, the number of account transactions, and the number of requests. Accessing these performance counters in your .NET Framework code is relatively easy and provides a greater level of detail than performing the analysis by aggregating data from SQL Profiler traces.

Guidelines for Identifying Performance Objectives

Introduction

The objective of this phase is to define the response time of the solution components. The business requirements define which performance components to analyze. You must determine the acceptable response time for each associated solution component.

Review the business requirements and prioritize performance objectives accordingly

You should work with business managers to create a list of performance objectives. The purpose of these meetings is to identify the high-level processes to improve and analyze.

Review the solution components that map to the business requirements

During this phase, you should map each business requirement to the appropriate solution component. The solution component is the application, task, job, or process that performs the business requirement identified in the previous step. You should determine whether several business processes use the same components. For example, one business process could be "invoice to customer," which performs a credit card charge. Another business process, such as "refund invoice amount to customer," would use the same credit card charge component. Optimizing this component should improve the performance of both business processes.

Determine acceptable response times for the solution components

After identifying the components associated with the business requirements, you should define the desired response time for each component. You could use historical information about the performance of the processes and then analyze the impact that improving some components has on the rest of the components. For example, creating an indexed view that accesses several tables usually improves the performance of queries that are resolved by using the indexed view, but it could also negatively affect data modifications on the associated tables.
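For example, the following sketch creates an indexed view over the AdventureWorks Sales.SalesOrderDetail table. Queries that aggregate sales by product can then be resolved from the materialized index, but every data modification on the base table must also maintain that index, which is the trade-off to measure against your response-time objectives. The view and index names are illustrative, and creating indexed views requires specific SET options (such as QUOTED_IDENTIFIER ON):

-- A minimal sketch against the AdventureWorks sample database.
USE AdventureWorks;
GO
CREATE VIEW Sales.vProductOrderTotals
WITH SCHEMABINDING
AS
SELECT ProductID,
    SUM(UnitPrice * OrderQty) AS TotalSales,
    COUNT_BIG(*) AS OrderCount
FROM Sales.SalesOrderDetail
GROUP BY ProductID;
GO
-- The unique clustered index materializes the view's result set.
CREATE UNIQUE CLUSTERED INDEX IX_vProductOrderTotals
    ON Sales.vProductOrderTotals (ProductID);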

Guidelines for Establishing Performance Benchmarks

Introduction

This topic gives you guidelines for establishing performance benchmarks. You should capture data by using multiple techniques, such as SQL Profiler traces, tools such as System Monitor, and system functions such as fn_virtualfilestats. You can use the new Database Engine Tuning Advisor tool to analyze captured traces and get tuning suggestions.

Add query plan outputs from stored procedures, views, and functions to source control

You should add the query plan output and the STATISTICS PROFILE set option results to the object definitions. Adding this information to your source control repository allows you to analyze the evolution of the execution plan, the object statistics, and the physical and logical operations performed by calls to the object. The values will differ between the development, quality assurance, and production environments. Because of these differing results, consider maintaining different baseline values for the different environments.
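For example, you can generate plan output to check in by enabling the STATISTICS PROFILE option before executing the object and then saving the results with the object's script. The following is a minimal sketch that uses a simple AdventureWorks query as a stand-in for a stored procedure, view, or function call:

-- A minimal sketch: capture plan profile output for source control.
USE AdventureWorks;
GO
SET STATISTICS PROFILE ON;
GO
SELECT TOP 10 ContactID, LastName
FROM Person.Contact
ORDER BY LastName;
GO
SET STATISTICS PROFILE OFF;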

Measure the effect of each client application on the database

You should measure how different client applications coexist using the same database or different databases. By dividing the load for each client application, you can create a picture of what percentage of resources each application uses. This is valuable for server consolidation processes.

Work with a DBA to establish the system performance benchmark under expected load conditions

You should create the system performance benchmark by simulating the closest scenario to reality. Work with a database administrator (DBA) to establish this scenario. You should use tools like System Monitor to audit performance counters such as memory, processor, network use, and disk I/O. By using the system function ::fn_virtualfilestats, you can audit physical I/O requests. This function returns I/O-related information for each database file.
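The following is a minimal sketch that returns per-file I/O statistics for the current database; passing NULL as the second parameter requests all files:

-- A minimal sketch: audit physical I/O for each file in the current database.
SELECT DbId, FileId, NumberReads, BytesRead, NumberWrites, BytesWritten
FROM ::fn_virtualfilestats(DB_ID(), NULL);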

Use SQL Profiler to measure performance

SQL Profiler captures the SQL Server instance activity. You should create trace templates for the performance benchmark using the appropriate filters to avoid performance problems on the server. You can save the trace file in Extensible Markup Language (XML) format for later use in Database Engine Tuning Advisor.

Use Database Engine Tuning Advisor

The new Database Engine Tuning Advisor gives recommendations based on a workload. The workload typically comes from SQL Profiler trace results, which can be an .xml file, a .trc file, or table based. Database Engine Tuning Advisor analyzes the trace information and creates recommendations for improving the performance of the selected objects. Ensure that the trace is representative of the normal system workload. Database Engine Tuning Advisor generates analysis reports that you can view later, and you can use the XML output in other utilities.

Guidelines for Measuring Performance

Introduction

In this topic, you will learn guidelines to measure system performance. You should take performance measurements following company standards, and you must define the level of detail required when measuring the performance of queries.

Identify business performance standards

If they are available, familiarize yourself with the company performance standards. Companies define these standards to avoid performance problems in the system. You should violate these standards only in crisis scenarios. Carefully analyze alternatives that follow the company performance standards and, if there is no alternative, clearly inform your superiors of the deviation and the performance measurements that justify it.

Specify the level of detail when measuring queries

You should decide how detailed the query analysis should be. You can use tools such as Showplan, SQL Profiler (Stmt events), or dynamic management views for measuring the performance of queries. You should not include the Showplan event in SQL Profiler traces because of its high use of resources.
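For example, the sys.dm_exec_query_stats dynamic management view accumulates statistics for cached query plans, which lets you measure query performance without the overhead of a Showplan trace. The following is a minimal sketch that lists the most expensive statements by average worker time:

-- A minimal sketch: top cached statements by average worker time.
SELECT TOP 10
    qs.total_worker_time / qs.execution_count AS avg_worker_time,
    qs.execution_count,
    qs.total_logical_reads,
    st.text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY avg_worker_time DESC;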

Regularly snapshot and measure against baseline metrics

Establish the frequency of performance measurements. You should select the optimal point that avoids missing significant information due to not capturing data frequently enough and also avoids causing performance problems by capturing data too frequently.

Identify peak periods and peak trends

You should identify and analyze the peak periods. Performance is likely to differ across time ranges. For example, a system might have large numbers of data modifications during the day with high concurrency. At night, reports could run with large numbers of aggregations that have very low concurrency but higher processor usage. You should adapt the measuring scripts to capture the information needed for each time period.

SQL Server Health and History Tool

SQL Server Health and History is a tool that helps you to gather SQL Server instance information automatically. It is a separate tool that you can download from the Microsoft Web site. The tool collects server, instance, and performance information. You can configure it to capture performance counter values with a custom frequency. The application collects the performance counter values in a text file and regularly incorporates the data into a SQL Server database.

Guidelines for Creating a Performance Change Strategy

Introduction

In this topic, you will identify guidelines to update the performance strategies. You should analyze the impact on the current monitoring processes, test the changes, and evaluate the new performance baseline.

Update baselines after all significant system changes

You should update the system performance baselines after all significant system changes. By re-creating the baseline, you will have a starting point for analyzing trends in the system. You should analyze what kinds of changes are more significant for the system—for example, a new data archiving strategy for large tables, data partitioning, server upgrades, and so on.

Review automated monitoring

Because of the change in the system, you should evaluate the impact of the new configuration. For example, you should identify the new limitations of the system and redesign the automated monitoring processes accordingly. If the system incorporates new data partitioning strategies, the fillfactor analysis for the affected tables will be different, and you will have to update it.

Evaluate performance issues in a test environment

You should analyze the new performance of the system in a test environment. Ideally, the test environment should be as similar to production as possible, but in many cases, it is difficult to produce the same environment. You should use Database Engine Tuning Advisor to analyze the new scenario and then make changes to optimize the solution. You can also use a trace replay from SQL Profiler.

Tune queries

If necessary, you will need to rewrite queries and perform tests until the process works as expected. Although not ideal, you can use query hints to force index usage. You can also use plan guides, which allow you to attach hints to specific queries or stored procedures.
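For example, the following sketch attaches an OPTIMIZE FOR hint to a query whose text you cannot change. The guide name, statement, and values are hypothetical, and this form applies to parameterized batches that the application submits through sp_executesql:

-- A minimal sketch, with hypothetical names and values.
EXEC sp_create_plan_guide
    @name = N'Guide_ContactByLastName',
    @stmt = N'SELECT * FROM Person.Contact WHERE LastName = @name',
    @type = N'SQL',
    @module_or_batch = NULL,
    @params = N'@name nvarchar(50)',
    @hints = N'OPTION (OPTIMIZE FOR (@name = N''Smith''))';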

Update the monitoring and change strategies

Ensure that you update your monitoring strategies to incorporate the new changes to the system. You should adapt the performance strategy to the new system requirements, capabilities, or features. For example, if you add a feature such as database mirroring, you should create the appropriate monitoring tasks.

Lesson 4: Designing a Deployment Strategy

Lesson objectives

After completing this lesson, students will be able to:

■ Explain the need for a deployment strategy.
■ Describe the process of designing a deployment strategy.
■ Apply the guidelines to design a deployment strategy.
■ Evaluate the considerations for deploying Notification Services.

Introduction

As a database administrator or database developer, part of your job is to write database routines to provide data to applications. Just as with application code, you must deploy database code when it is time to install the application. Unfortunately, this step is easy to overlook during the development process, resulting in applications that are difficult or impossible to deploy. This lesson focuses on methods and processes to help design deployment strategies, which can help avoid deployment problems. The first three topics are a general overview of deployment strategy, and the final topic focuses on deployment of Notification Services solutions.

Discussion: The Need for a Deployment Strategy

Introduction

Discussion questions

Database administrators (DBAs) and database developers (DBDs) often give deployment insufficient priority. Often, deadlines and pressure to complete projects can get in the way of thinking about how the project will actually get to its final destination. Unfortunately, this means that many projects experience failure when someone tries to actually install and use the application. Such a failure can be catastrophic, because you have already spent a large amount of time and energy to create the software. No one can use an application that cannot be properly deployed.

1. Have you ever been in a situation in which deployment was the final task for the project? What were some issues that you faced?

Answers will vary. Generally, students will explain that when it came time to deploy the application, there was no tested plan and some or all of the deployment process failed, resulting in loss of time or money.

2. How could you have addressed deployment issues during the development process?

Answers will vary. The important idea to steer students toward is that deployment should be a tested, proven process that you consider throughout development, instead of something left until the end.

3. What types of items do you need to deploy in a typical database application?

The answer should include items such as application code, databases, database code, data, logins and other credentials, and additional components such as extended stored procedures, as required by the data tier.

4. Are there any special considerations that you need to address when deploying databases? What is the best way to deploy a database?

Answers will vary. The common methods of deployment are by using detach/attach, by using backup/restore, or by scripting changes. Consider which methods you think are best in your work circumstances and why.

5. How should you test deployment?

Answers will vary. One possible answer is that deployment can be tested when moving code from development environments to quality assurance environments. If developers and quality assurance engineers work to ensure that every deployment to quality assurance environments mirrors the type of conditions encountered during real deployments, the deployment will be well tested by the time a live deployment is necessary.

The Process of Designing a Deployment Strategy

Introduction

Designing a deployment strategy is a simple process, centered on the idea of documenting the system so that deployment needs will be obvious. You will need a new deployment strategy for each application, especially for new applications and version updates. As you define a deployment, ask yourself the "what," "why," and "how" questions at every step: What do you need to deploy? Why do you need to deploy it? (Is it a prerequisite for deploying something else?) And how should you deploy it? By following and documenting the process, you can make your deployment strategy as robust as necessary to allow repeated successful deployments of databases and related software.

Procedure for defining a deployment strategy

1. Determine the items that need to be deployed:

● Data and database code. These include scripts, detached databases, backed-up databases, and flat files.
● Credentials. These include server names, logins and passwords, and database access rights.
● External resources. These include extended stored procedures, SQL Server Integration Services (SSIS) packages, and components that reside on the data tier.

2. Determine the conditions under which deployment will occur:

● New install of code and data. There is no need to worry about overwriting any existing data.
● Update to an existing database. You should consider how to update data and code without causing problems.

3. Determine what constitutes a successful deployment:

● Can you use self-testing processes or checksums to verify deployment? (A minimal checksum sketch follows this list.)
● Can you use the same criteria for new installs as well as updates?

4. Define a rollback strategy in case of problems:

● In the case of a new install, performing a rollback should remove anything that the deployment process added to the system.
● In the case of an update, performing a rollback should leave the system in exactly the same state it was in before the deployment process started.
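As a simple verification technique for the checksum item in step 3, you can compute an aggregate checksum over a deployed table and compare it with the value computed on the source system. The following is a minimal sketch; the table name is illustrative, and BINARY_CHECKSUM ignores some data types, such as text and image:

-- A minimal sketch: compare this value with the one computed at the source.
SELECT CHECKSUM_AGG(BINARY_CHECKSUM(*)) AS TableChecksum
FROM dbo.OrderHeader;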

Guidelines for Designing a Deployment Strategy

Introduction

The main goal of a deployment strategy is to produce a method of installing code and data that can be reproduced as many times as necessary in production environments, even by people who are unfamiliar with the process. The main factor in designing a successful strategy is ensuring that documentation is complete and well-written, as well as kept up to date at every stage of the development process.

Guidelines

Consider the following guidelines when designing a deployment process:

■ Start early in the development process. The earlier in the development process that considerations for deployment begin, the easier it will be to ensure that a consistent, well-thought-out strategy is in place. If possible, begin considering how deployment should occur during the design phase to minimize the possibility of including difficult-to-deploy components in the data tier.

■ Do not modify the code. If a failure in deployment occurs because of an error in the application, you should go back to development. You should not fix even the smallest error at the deployment phase just to get a successful deployment. If possible, developers should not be part of the deployment process, thereby removing the temptation to apply small fixes.

■ Document items needed to deploy. During the development process, maintain a central document that describes the items that you need to deploy. Be sure to keep this document up to date throughout development, and include how the application should be deployed, which other components need to be deployed first or depend on it, and how to roll back in case of problems. This document will eventually become the main deployment manual for your databases.

■ Consider managed code. SQL Server can contain managed common language runtime (CLR) code. In Visual Studio, you can click Deploy on the Build menu to deploy and register the assemblies to the instance of SQL Server that you defined when you created the project. It is important to check that this is the test server for deployment. You can also compile assemblies with the command-line compilers (csc for Microsoft Visual C#® and vbc for Microsoft Visual Basic®) and then deploy them by using Transact-SQL code. (A Transact-SQL deployment sketch follows this list.)

For More Information: See the MSDN® article "Deploying CLR Database Objects."

■ Avoid missing seemingly unimportant items. In focusing on deployment of the larger components of the data tier, such as the databases themselves, it is easy to overlook smaller items such as credentials and server information. Be sure to document every part of the process, creating a complete guide that will help you, or a colleague, to deploy the database even with no prior knowledge of the project.

■ Do a dry run before rolling changes to production. Make sure to test your deployment strategy one or more times in a development or quality assurance environment before attempting to use it in a production environment. Test both successful deployments and failures to determine whether your rollback strategy is sufficient. To ensure that the documentation is complete, it is advisable for someone unfamiliar with the deployment strategy to perform the test deployment.
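As mentioned in the Consider managed code guideline, you can deploy a compiled assembly by using Transact-SQL instead of the Visual Studio Deploy command. The following is a minimal sketch; the assembly name, file path, and class name are placeholders for your own build output:

-- A minimal sketch with placeholder names and paths.
CREATE ASSEMBLY OrderRoutines
FROM 'C:\Deploy\OrderRoutines.dll'
WITH PERMISSION_SET = SAFE;
GO
-- Expose a method of the assembly as a stored procedure.
CREATE PROCEDURE dbo.ProcessOrders
AS EXTERNAL NAME OrderRoutines.StoredProcedures.ProcessOrders;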

Considerations for Deploying Notification Services

Introduction

Many options are available when deploying Notification Services, such as where and how you deploy various components and which tools you use to conduct the deployment tasks. It is important to understand and consider the impact of various decisions when working with Notification Services deployments.

Considerations

When designing a Notification Services deployment, consider the following:

■ Determine placement of service components. Design a topology for the event provider, generator, and distributor services. You can host all of these on the same server or on different servers, depending on the scalability needs of the installation. You can enable these services by using the nscontrol command-line utility, which you will have to script with the correct options for the deployment.

■ Determine placement of database components. When defining deployment, determine whether the notification databases should reside on the same servers or instances as the production databases used by the application for transactional purposes. Also determine how you should deploy the databases. You can create them from configuration files by using the nscontrol command-line utility, which will require scripting, or you can create the databases in advance on a development server and then move them to the destination server. Determine which method is appropriate for your application's needs.

■ Determine authentication. Determine the credentials that the service components will need to run and to log on to the databases. Determine how to deploy these credentials to the production system. Transact-SQL scripts or the use of SQL Server Management Studio might be needed to create the necessary database server credentials.

■ Determine placement of definition and other configuration files. Definition files are necessary to configure and run Notification Services instances, but you must secure them to minimize the risk of attackers using them as a vector for attacking the system. Determine the best way to flexibly deploy the instance while keeping configuration files secure.

Lab: Planning for Source Control, Unit Testing, and Deployment

Introduction

In this lab, you will work in small groups to design a source control strategy, a unit test plan, and a deployment strategy for supporting the development teams building solutions for Fabrikam, Inc. You will then present your design to the rest of the class.

Scenario

The Fabrikam, Inc., development group consists of separate teams developing databases, Notification Services solutions, Reporting Services solutions, and Service Broker solutions. To centralize the versioning of the project source code, Fabrikam, Inc. has decided to use Visual SourceSafe. Some source code contains sensitive information, which must be encrypted. Different teams will have different project milestones when code freezes occur. Currently most teams implement functional testing. Because of the scope of the system, you have been asked to define a unit testing strategy. You must test the integration of solutions developed by the different teams. You must also take security and performance issues into consideration. You have also been asked to design a deployment strategy that will coordinate deployment efforts, taking into consideration any dependencies between the different components that comprise the complete system.

Preparation

Ensure that the virtual machine for the computer 2781A-MIA-SQL-07 is running. Also, ensure that you have logged on to the computer. If you need to log on to the computer, use the following credentials:

■ Username: Student
■ Password: Pa$$w0rd

Exercise 1: Designing a Source Control Strategy

Introduction

In this exercise, you will design a source control strategy based on Visual SourceSafe. The strategy will specify how to use Visual SourceSafe to manage code for SQL Server 2005 solutions.

Design a source control strategy

Summary
■ Design a source control strategy using Visual SourceSafe to manage the code for SQL Server 2005 solutions and multiple teams of developers.

Detailed Steps
1. Using Microsoft Office Word, open the file Source Control Strategy.doc in the E:\Labfiles\Starter folder. This document contains hints and points that you should address in your design. However, you should feel free to document additional guidelines as you see necessary.
2. Define the folder structure and naming conventions.
3. Define the Visual SourceSafe usage standards.

Discussion question

■ Have you ever had to share or branch projects? Describe your experiences.

Answers will vary. The main problem is how to maintain different versions of the same project. At times, you need to share a file with no changes, and sharing a pinned version of the file is a good solution in this case. Labeled pinned versions allow you to access a specific version of the project. If you need to make changes, unpin the version, apply the fix, and then pin it again. This way, both versions are up to date. However, if the file has changes, this process can become more complex, and branching would be a better solution. For this reason, you normally branch versions, and only label and share files for less important milestones.

Exercise 2: Designing a Unit Testing Plan

Introduction

In this exercise, you will design a plan for integrating unit testing into the development cycle. This plan can be useful when the development teams find and resolve bugs. The design should be an abstract design, explaining all of the steps necessary to ensure a well-defined unit test strategy without focusing on any particular scenario.

Design a unit testing plan

Summary
■ Design a high-level unit testing plan that addresses testing code and managing the dependencies between solution components developed by different teams.

Detailed Steps
1. Using Microsoft Word, open the file Unit Test Plan.doc in the E:\Labfiles\Starter folder. This document contains hints and points that you should address in your design. However, you should feel free to document additional guidelines as you see necessary.
2. Specify the security requirements for testing code.
3. Specify the strategy for performance testing of code.
4. Specify the policy for defining and updating test plans.

Exercise 3: Designing a Deployment Strategy

Introduction

In this exercise, you will design a strategy for deploying SQL Server 2005 solutions developed by the teams at Fabrikam, Inc. You will document the high-level considerations for defining a deployment strategy, without focusing on any particular scenario.

Design a deployment strategy

Summary
■ Design a high-level deployment strategy for Fabrikam, Inc., that specifies how to deploy SQL Server 2005 solutions.

Detailed Steps
1. Using Microsoft Word, open the file Deployment Strategy.doc in the E:\Labfiles\Starter folder. This document contains hints and points that you should address in your design. However, you should feel free to document additional guidelines as you see necessary.
2. Identify the solution components.
3. Identify the dependencies between components.
4. Define an order to deploy the components.
5. Define a deployment method.
6. Define a performance baseline and benchmarking strategy.

Discussion questions

1. In your experience, what are the benefits and drawbacks of automating the deployment of solutions?

Answers will vary. Automating deployment by using tools such as Microsoft Application Center helps you clone environments, avoiding human error. However, with this kind of deployment, you might have a lower level of control over errors and other issues affecting the deployment of your applications.

2. How would you test whether the deployment plan was successful?

Answers will vary. Every deployment strategy must include specific checks to ensure that the deployment has been successful. You write these checks during development.

Exercise 4: Justifying the Source Control, Unit Test, and Deployment Strategies

Introduction

In this exercise, a representative from each student group will present the strategies designed by the group to the rest of the class. When presenting the solution, you should refer to the solution documents that you created in the previous exercises.

Present the source control strategy, unit test plan, and deployment strategy

Summary
1. Present an overview of the source control, unit testing, and deployment strategies designed to support development in Fabrikam, Inc.
2. Provide the reasons for choosing a particular strategy to meet the business requirements.

Detailed Steps
1. While presenting your designs, refer to the scenario for the requirements at the start of the lab.
2. Refer to the source control strategy design document that you created in Exercise 1.
3. Refer to the unit test plan design document that you created in Exercise 2.
4. Refer to the deployment strategy design document that you created in Exercise 3.

Discussion question

■ What are the strengths and weaknesses of the proposed strategies?

Answers will vary.

Important: After the discussion, shut down the virtual machine for the computer 2781A-MIA-SQL-07. Do not save the changes.

Module 8

Evaluating Advanced Query and XML Techniques

Contents:

Lesson 1: Evaluating Common Table Expressions    8-2
Lesson 2: Evaluating Pivot Queries    8-16
Lesson 3: Evaluating Ranking Queries    8-30
Lesson 4: Overview of XQuery    8-39
Lesson 5: Overview of Strategies for Converting Data Between XML and Relational Forms    8-53
Lab: Evaluating Advanced Query and XML Techniques    8-63

Information in this document, including URL and other Internet Web site references, is subject to change without notice. Unless otherwise noted, the example companies, organizations, products, domain names, e-mail addresses, logos, people, places, and events depicted herein are fictitious, and no association with any real company, organization, product, domain name, e-mail address, logo, person, place, or event is intended or should be inferred. Complying with all applicable copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express written permission of Microsoft Corporation.

The names of manufacturers, products, or URLs are provided for informational purposes only, and Microsoft makes no representations and warranties, either expressed, implied, or statutory, regarding these manufacturers or the use of the products with any Microsoft technologies. The inclusion of a manufacturer or product does not imply endorsement by Microsoft of the manufacturer or product. Links are provided to third-party sites. Such sites are not under the control of Microsoft, and Microsoft is not responsible for the contents of any linked site or any link contained in a linked site, or any changes or updates to such sites. Microsoft is not responsible for webcasting or any other form of transmission received from any linked site. Microsoft is providing these links to you only as a convenience, and the inclusion of any link does not imply endorsement by Microsoft of the site or the products contained therein.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual property rights covering subject matter in this document. Except as expressly provided in any written license agreement from Microsoft, the furnishing of this document does not give you any license to these patents, trademarks, copyrights, or other intellectual property.

©2006 Microsoft Corporation. All rights reserved.

Microsoft, Active Directory, ActiveX, BizTalk, Excel, Microsoft Press, MSDN, MSN, Outlook, PowerPoint, SharePoint, Tahoma, Visio, Visual Basic, Visual C#, Visual SourceSafe, Visual Studio, Windows, and Windows Server are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.

All other trademarks are property of their respective owners.

Module objectives

After completing this module, students will be able to:

■ Evaluate the use of common table expressions.
■ Evaluate the use of pivot queries.
■ Evaluate the use of ranking queries.
■ Evaluate the use of XQuery.
■ Evaluate strategies for converting data between XML and relational formats.

Introduction

This module provides an opportunity for you to evaluate and practice using advanced query and XML techniques that you might use when designing a Microsoft® SQL Server™ 2005 solution. Query tasks include evaluating common table expressions (CTEs), pivot queries, and ranking techniques. XML tasks include defining standards for storing XML data, evaluating the use of XQuery, and creating a strategy for converting data between XML and relational formats. As part of your evaluation tasks, you will write advanced queries, and you will evaluate techniques for converting XML data into relational data.

Lesson 1: Evaluating Common Table Expressions

Lesson objectives

After completing this lesson, students will be able to:

■ Explain the functionality of a CTE.
■ Explain the syntax for writing a nonrecursive CTE query.
■ Explain the syntax for writing a recursive CTE query.
■ Explain how to write a CTE query for a multi-parent hierarchy.
■ Explain the common issues that you face when querying hierarchical data.

Introduction

A CTE is a temporary result set that you can access within the execution scope of a SELECT, INSERT, UPDATE, DELETE, or CREATE VIEW statement. By using CTEs, you can divide complex queries into logical blocks of code that are easier to build and maintain. This lesson describes CTEs and explains how you can use them in your expressions to simplify code and improve reliability and performance. You will learn how to write various types of CTEs and how to use CTEs to work with hierarchical data.

What Is a CTE?

Introduction

One of the most important enhancements to Transact-SQL in SQL Server 2005 is the addition of CTEs, which are part of the ANSI SQL:1999 standard. A CTE defines a temporary result set that you can reference multiple times within the execution scope of a SELECT, INSERT, UPDATE, DELETE, or CREATE VIEW statement. You should consider using CTEs in your query when you want to reference a result set multiple times, define hierarchies based on the source data, or eliminate the storage of unnecessary view definitions within your database. This topic describes CTEs and how they compare to views and derived queries. The topic also introduces you to the two types of CTEs: nonrecursive and recursive.

Comparing CTEs with views and derived tables

A CTE is similar to a view or derived query (subquery), although there are a number of differences. The following table lists many of these similarities and differences.

Database object: Views

Similarities to CTEs:
■ You define the view before calling it from within your query. You define a CTE before calling it from within your query. The SELECT statements in views and CTEs must follow the same requirements.
■ You can reference views and CTEs multiple times within the query that calls them.

Differences from CTEs:
■ Views are permanent objects; CTEs are not.
■ A view can reference other views. A CTE can reference other CTEs only if they are defined within the same WITH clause.
■ You can create a CTE within a stored procedure, view, or trigger. You cannot create a view within these objects.

Database object: Derived queries

Similarities to CTEs:
■ Both CTEs and derived tables are temporary objects.
■ You can create CTEs and derived tables within stored procedures, views, or triggers.

Differences from CTEs:
■ You define a derived query directly within the outer query. You define a CTE before calling it from within your query.
■ A CTE can reference itself, but a derived query cannot.
■ A CTE can reference other CTEs within the same WITH clause. A derived query cannot reference other derived queries. You must redefine the derived query.
■ You can reference a CTE multiple times within the calling query. You cannot reference a derived query. You must repeat the derived query each time you need it.

Types of CTEs

SQL Server supports the following two categories of CTEs:

■ Nonrecursive CTEs. A nonrecursive CTE is any CTE that does not include a reference to itself within the CTE query definition. However, you can reference the nonrecursive CTE as often as necessary in the referencing query. By using nonrecursive CTEs, you can simplify advanced queries, particularly those that contain derived queries that must be referenced multiple times.

■ Recursive CTEs. A recursive CTE is any CTE that includes a reference to itself. Recursive CTEs simplify the task of creating recursive queries that retrieve hierarchical data such as an organizational chart or a bill of materials. A recursive CTE works by retrieving an initial set of data and then retrieving subsets of data, repeating the process for each level within the hierarchy. For example, a recursive CTE might begin by retrieving information about a single employee. The CTE then retrieves information about those individuals who report to that employee. Next, the CTE retrieves the third level of employees, the ones who report to those in the second level. This process continues until the CTE retrieves the entire hierarchy into a result set. You can then reference that result set from within your primary query. In the past, this sort of operation required temporary tables, cursors, or flow control statements, but CTEs can make this entire process easier and more efficient.

The Syntax for a Nonrecursive CTE Query

Introduction

As you learned in the previous topic, you can use nonrecursive CTEs in your queries to provide named temporary result sets that you can reference within the execution scope of that query. This topic describes—and provides an example of—the basic syntax that makes up a nonrecursive CTE. The topic also compares that CTE to a view definition and a derived query.

The syntax of a nonrecursive CTE

To create a nonrecursive CTE, you must specify a WITH clause prior to the query that will reference the CTE. The WITH clause defines the name of the CTE and, optionally, the name of the CTE columns. You must then define the query that returns the temporary result set. The following syntax shows the main components of a CTE:

WITH CTE_name [(column_name [,...n])]
AS
(
    CTE_query_definition
)
referencing_query

As the syntax shows, you should specify the WITH keyword, the CTE name, the column names if applicable, the AS keyword, and the CTE query definition within parentheses. For example, the following statement defines a CTE containing a WITH clause that includes only one CTE:

USE AdventureWorks;
GO
WITH ContactFullName (ContactID, FullName)
AS
(
    SELECT ContactID,
        FirstName + COALESCE(' ' + MiddleName + ' ', ' ') + LastName
    FROM Person.Contact
)
SELECT * FROM ContactFullName;

The CTE in this example is named ContactFullName and includes two columns: ContactID and FullName. The CTE query definition includes a SELECT statement that retrieves data from the Contact table in the AdventureWorks database. The SELECT statement that follows the CTE query definition references the CTE within the FROM clause, as it would a table or view. You can reference the CTE in the referencing query as often as necessary. This, of course, is a very simple example of a CTE, but it does demonstrate the fundamental components that you should include in a nonrecursive CTE. You might have noticed that the CTE definition is similar to that of a view definition. In fact, SQL Server imposes the same limitations on a CTE SELECT statement as it does a view SELECT statement. The following view definition returns the same result set as the CTE in the previous example:

CREATE VIEW Person.ContactFullName (ContactID, FullName)
AS
SELECT ContactID,
    FirstName + COALESCE(' ' + MiddleName + ' ', ' ') + LastName
FROM Person.Contact;
GO
SELECT * FROM Person.ContactFullName;

You can also use a derived query to generate the same result set as the CTE and the view. The following SELECT statement uses a derived query in the FROM clause to retrieve data from the Contact table:

SELECT *
FROM (SELECT ContactID,
          FirstName + COALESCE(' ' + MiddleName + ' ', ' ') + LastName AS FullName
      FROM Person.Contact) AS ContactFullName;

As these examples demonstrate, you can achieve the same results through various means. However, CTEs enable you to achieve these results without creating unnecessary views. And CTEs can help to streamline your code if you find that you must repeat the derived query numerous times within the same statement.
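For example, the following sketch, which runs against the AdventureWorks database, references the same CTE twice in one statement: once in the FROM clause and once in a subquery. A derived table cannot do this without repeating its definition:

-- A minimal sketch: the OrderTotals CTE is referenced twice below.
USE AdventureWorks;
GO
WITH OrderTotals (CustomerID, TotalDue) AS
(
    SELECT CustomerID, SUM(TotalDue)
    FROM Sales.SalesOrderHeader
    GROUP BY CustomerID
)
SELECT CustomerID, TotalDue
FROM OrderTotals
WHERE TotalDue > (SELECT AVG(TotalDue) FROM OrderTotals);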

The Syntax for a Recursive CTE Query

Introduction

One of the main advantages of a CTE is that you can define it to reference itself. Because of this feature, you can create recursive queries that are difficult to create by using other Transact-SQL elements. This topic describes the basic syntax that makes up a recursive CTE and walks you through an example of how to create one.

The syntax of a recursive CTE

The syntax of a recursive CTE is similar to that of a nonrecursive CTE except that the CTE query element must include two queries—the anchor query and the recursive query—that you connect with a UNION ALL operator, as shown in the following syntax:

WITH CTE_name [(column_name [,...n])]
AS
(
    CTE_anchor_query
    UNION ALL
    CTE_recursive_query
)
referencing_query

The anchor query provides the CTE with the base row or rows to populate the result set. The recursive query then uses this base data as a starting point for looping through each level of the hierarchy. The anchor query never includes a self-reference to the CTE, and the recursive query always includes a self-reference to the CTE.

The best way to illustrate how this works is with an example. The following recursive CTE retrieves hierarchical data from the Employee table in the AdventureWorks database:

--CTE declaration
WITH EmpHierarchy (EmpID, MgrID, EmpLevel)
AS
(
    -- CTE anchor query
    SELECT EmployeeID, ManagerID, 1 AS EmpLevel
    FROM HumanResources.Employee
    WHERE ManagerID IS NULL
    UNION ALL
    -- CTE recursive query
    SELECT e.EmployeeID, e.ManagerID, EmpLevel + 1
    FROM HumanResources.Employee AS e
    INNER JOIN EmpHierarchy AS d -- self-reference
        ON e.ManagerID = d.EmpID
)
--referencing query
SELECT MgrID, COUNT(EmpID) AS EmpTotal, EmpLevel
FROM EmpHierarchy
WHERE MgrID IS NOT NULL
GROUP BY MgrID, EmpLevel
ORDER BY EmpLevel, MgrID;

In this example, the EmpHierarchy CTE retrieves the employee IDs and their managers' IDs from the Employee table. At the same time, the CTE assigns a level that corresponds to the employee's level within the hierarchy. The referencing query then groups together the results from the EmpHierarchy CTE to provide the number of employees who report to each manager and the level of those employees. The following result set shows a sample of data that the query returns:

MgrID       EmpTotal    EmpLevel
----------- ----------- -----------
109         6           2
6           8           3
12          1           3
42          7           3
140         4           3
148         4           3
273         3           3
3           7           4
21          22          4
30          5           4
44          4           4
71          1           4
139         7           4

As you can see, six employees at level 2 report to manager 109, eight employees at level 3 report to manager 6, and so on. In total, the query returns 47 rows of data. Notice also that the EmpLevel column does not contain the value 1 in the result set, although 1 is specified in the first CTE query. This is because the referencing query returns only those rows whose MgrID value is not NULL. To better understand how the query works, you can break the query down into small components, beginning with the following CTE declaration:

--CTE declaration
WITH EmpHierarchy (EmpID, MgrID, EmpLevel) AS

In this part of the query, you simply specify the WITH keyword, the name of the CTE (EmpHierarchy), the name of the columns within parentheses, and the AS keyword. In the next part of the query (after the opening parenthesis), you specify the following anchor query:

-- CTE anchor query
SELECT EmployeeID, ManagerID, 1 AS EmpLevel
FROM HumanResources.Employee
WHERE ManagerID IS NULL

An anchor query is made up of one or more SELECT statements. The anchor query cannot self-reference the CTE, but it can include multiple statements joined together by UNION, UNION ALL, EXCEPT, or INTERSECT operators. The anchor query in this example retrieves one EmployeeID value and the associated ManagerID value and assigns the value 1 to EmpLevel. The value indicates that this employee is at the top level of the hierarchy, and it represents the starting point from which the CTE defines subsequent levels. The anchor query returns the following result set:

EmployeeID  ManagerID   EmpLevel
----------- ----------- -----------
109         NULL        1

The result set from the anchor query provides the initial values for the CTE result set. The recursive query uses these results as a base from which to start looping through the hierarchy. The UNION ALL operator must follow the anchor query, the recursive query must follow the operator, and the recursive query must always contain a self-reference to the CTE. The following recursive query joins the CTE to the Employee table:

-- CTE recursive query
SELECT e.EmployeeID, e.ManagerID, EmpLevel + 1
FROM HumanResources.Employee AS e
INNER JOIN EmpHierarchy AS d -- self-reference
    ON e.ManagerID = d.EmpID

The join between the Employee table and the CTE is based on the results generated by the anchor query. In this case, the ManagerID values in the Employee table join to the current EmpID value in the CTE, which is 109. Based on this join, the recursive query generates additional results for the next level in the hierarchy and adds the results to the CTE result set. The recursive query then runs again, but this time, the join is based on the last values added to the result set. This process continues for each level of the hierarchy until the query returns no additional rows. Once the CTE generates the final result set, you can reference the CTE one or more times in your referencing query. The referencing query in this example uses the following GROUP BY construction to group the data by MgrID values and EmpLevel values:

-- Referencing query
SELECT MgrID, COUNT(EmpID) AS EmpTotal, EmpLevel
FROM EmpHierarchy
WHERE MgrID IS NOT NULL
GROUP BY MgrID, EmpLevel
ORDER BY EmpLevel, MgrID;

The query uses the COUNT aggregate function to find the total number of employees within each group. As you can see, the query references the EmpHierarchy CTE as it would any table or view in the database. However, once you run the query, you can no longer reference the CTE. It exists only within the execution scope of the query.

Demonstration: Writing a CTE Query for a Multi-Parent Hierarchy

Introduction

In the last topic, you saw an example of a recursive CTE. The example CTE in that topic generates a single-parent hierarchy. In other words, the anchor query generates a single row, which forms the basis for the hierarchy. In this demonstration, you create a multi-parent hierarchy. A multi-parent hierarchy is based on multiple rows returned by the anchor query. The goal of the CTE is to produce a list of materials that currently support product assembly 753.

Preparation

Ensure that the virtual machine 2781A-MIA-SQL-08 is running and that you are logged on as Student. If a virtual machine has not been started, perform the following steps:
1. Close any other running virtual machines.
2. Start the 2781A-MIA-SQL-08 virtual machine.
3. In the Log On to Windows dialog box, complete the logon procedure by using the user name Student and the password Pa$$w0rd.

To review the BillOfMaterials table definition

To review the BillOfMaterials table definition, perform the following steps:
1. On the Start menu, point to All Programs, point to SQL Server 2005, and then click SQL Server Management Studio.
2. In the Connect to Server dialog box, use Microsoft Windows® Authentication to log on to the MIA-SQL\SQLINST1 server, and then click Connect.
3. In Object Explorer, expand Databases, expand AdventureWorks, and then expand the Tables node.
4. Right-click the Production.BillOfMaterials table node and then click Modify.

5. In the Table – Production.BillOfMaterials window, review the information for the columns listed in the following table.

Column              Description
------------------  -----------------------------------------------------------------------
ProductAssemblyID   Parent product identification number. Foreign key to Product.ProductID.
ComponentID         Component identification number. Foreign key to Product.ProductID.
EndDate             Date that the component stopped being used in the assembly item.
PerAssemblyQty      Quantity of the component needed to create the assembly.

6. Close the Table – Production.BillOfMaterials window.

To create the CTE declaration

To create the CTE declaration, perform the following steps:
1. In SQL Server Management Studio, click New Query to open a new query window.
2. In the query window, type the following USE statement and CTE declaration:

USE AdventureWorks;
GO
WITH BOMHierarchy (ProdID, Qty, ProdLevel)
AS
(

When you create the declaration, be sure to include the opening parenthesis to enclose the anchor and recursive queries.

To create the CTE anchor query and the recursive query

To create the CTE anchor and recursive queries, perform the following steps:
1. Add the following anchor query to the code you created in the last procedure:

SELECT ComponentID, PerAssemblyQty, 1
FROM Production.BillOfMaterials
WHERE ProductAssemblyID = 753
    AND EndDate IS NULL

The anchor query is a simple SELECT statement that populates the CTE result set with its initial values. The result set is limited to those materials with a ProductAssemblyID value of 753 and whose EndDate value is NULL. The following results show a partial list of the data returned by the anchor query:

ComponentID  PerAssemblyQty  ProdLevel
------------ --------------- -----------
519          1.00            1
721          1.00            1
807          1.00            1
813          1.00            1
820          1.00            1

In all, the anchor query returns 14 rows.

2. Add the following recursive query to the code you created previously:

UNION ALL
SELECT bom.ComponentID,
  CAST(bh.Qty * bom.PerAssemblyQty AS DECIMAL(8,2)),
  bh.ProdLevel + 1
FROM BOMHierarchy AS bh
JOIN Production.BillOfMaterials AS bom
  ON bom.ProductAssemblyID = bh.ProdID
WHERE bom.EndDate IS NULL
)

The recursive query must start with the UNION ALL operator and end with the closing parenthesis. To retrieve the various levels of the product hierarchy, the recursive query uses a join to self-reference the CTE.

To create the CTE referencing query

To create the referencing query, perform the following steps:
1. Add the following referencing query after your CTE:

SELECT bh.ProdID, p.Name, bh.Qty, bh.ProdLevel
FROM BOMHierarchy AS bh
JOIN Production.Product AS p
  ON p.ProductID = bh.ProdID
ORDER BY bh.ProdLevel, p.ProductID;

2. Click Execute to run your query. The following result set shows part of the results you should receive from the query:

ProdID  Name                     Qty    ProdLevel
------  -----------------------  -----  ---------
519     HL Road Seat Assembly    1.00   1
721     HL Road Frame - Red, 56  1.00   1
807     HL Headset               1.00   1
813     HL Road Handlebars       1.00   1
820     HL Road Front Wheel      1.00   1
828     HL Road Rear Wheel       1.00   1
894     Rear Derailleur          1.00   1
907     Rear Brakes              1.00   1
940     HL Road Pedal            1.00   1
945     Front Derailleur         1.00   1
948     Front Brakes             1.00   1
951     HL Crankset              1.00   1
952     Chain                    1.00   1
996     HL Bottom Bracket        1.00   1
1       Adjustable Race          1.00   2
3       BB Ball Bearing          10.00  2
4       Headset Ball Bearings    8.00   2

Notice that the result set includes the various product levels. The result set should include 89 rows and 4 product levels.
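For reference, the statements from the preceding procedures assemble into the following complete script, which you can run as a single batch:

USE AdventureWorks;
GO
WITH BOMHierarchy (ProdID, Qty, ProdLevel) AS
(
  -- Anchor query: the 14 components used directly in assembly 753
  SELECT ComponentID, PerAssemblyQty, 1
  FROM Production.BillOfMaterials
  WHERE ProductAssemblyID = 753
    AND EndDate IS NULL
  UNION ALL
  -- Recursive query: the components of each component found so far
  SELECT bom.ComponentID,
    CAST(bh.Qty * bom.PerAssemblyQty AS DECIMAL(8,2)),
    bh.ProdLevel + 1
  FROM BOMHierarchy AS bh
  JOIN Production.BillOfMaterials AS bom
    ON bom.ProductAssemblyID = bh.ProdID
  WHERE bom.EndDate IS NULL
)
SELECT bh.ProdID, p.Name, bh.Qty, bh.ProdLevel
FROM BOMHierarchy AS bh
JOIN Production.Product AS p
  ON p.ProductID = bh.ProdID
ORDER BY bh.ProdLevel, p.ProductID;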


Discussion: Common Issues When Querying Hierarchical Data

Introduction

Working with hierarchical data can be a complex process. The data might include anything from organizational information to accounting-related information to a bill of materials. This topic provides an opportunity for you to discuss your experiences working with complex hierarchical data and how you solved any issues related to that process.

Discussion questions

To discuss these questions, you will first divide into small groups. After you have discussed the questions within your groups, you will participate in a class-wide discussion. The instructor will lead you through this process for the following questions:

1. What are some of the issues that you have experienced when working with complex hierarchical data?

Answers will vary, but might include the following:
■ Performance problems related to using cursors
■ Performance problems related to using temporary tables
■ Complex Transact-SQL syntax
■ Wrong results caused by logic errors (a consequence of complexity)

2. Would any Transact-SQL statements help you solve these issues? If so, how? If not, what issues would you still have, and what guidelines would you propose?

Answers will vary, but might include the following:
■ You can use a data definition language (DDL) statement to denormalize parts of the database to help improve performance. However, denormalization can sometimes impact data integrity.
■ You can use user-defined functions to help simplify complex hierarchical queries, as shown in the sketch that follows this list.
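As a minimal sketch of the second point, a recursive CTE can be wrapped in an inline table-valued function so that callers do not have to repeat the traversal logic. The function name and parameter below are illustrative only; the sketch assumes the EmployeeID and ManagerID columns of the HumanResources.Employee table:

-- Hypothetical helper: returns all direct and indirect reports of a manager
CREATE FUNCTION dbo.ufnGetAllReports (@MgrID INT)
RETURNS TABLE
AS
RETURN
(
  WITH Reports AS
  (
    -- Anchor: employees who report directly to @MgrID
    SELECT EmployeeID, ManagerID
    FROM HumanResources.Employee
    WHERE ManagerID = @MgrID
    UNION ALL
    -- Recursion: employees who report to anyone already found
    SELECT e.EmployeeID, e.ManagerID
    FROM HumanResources.Employee AS e
    JOIN Reports AS r
      ON e.ManagerID = r.EmployeeID
  )
  SELECT EmployeeID, ManagerID FROM Reports
);

A caller can then write SELECT * FROM dbo.ufnGetAllReports(21); without knowing how the hierarchy is traversed.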


Practice: Writing CTE Queries

Introduction

In this practice, you will create two queries that use CTEs to retrieve data from the Employee table of the AdventureWorks database. The first query returns a count of the number of employees who report directly to each manager. The second query returns a count of the number of employees who report directly and indirectly to each manager. Both queries should return only two columns: the manager ID and the number of employees who report to that manager.

Creating a CTE to count the number of employees who report directly to each manager

■ Create a query that includes a CTE that retrieves data from the Employee table. The query should list the number of employees who report directly to each manager. Name the CTE DirectReports. Name the columns in the result set MgrID and NumEmp.

The following result set shows part of the results that you should receive:

MgrID  NumEmp
-----  ------
3      7
6      8
7      6
12     1
14     6
16     12
18     8
21     22

The query should return a total of 47 rows.

Creating a CTE to count the number of employees who report directly and indirectly to each manager

■ Create a query that includes a CTE that retrieves data from the Employee table. The query should list the number of employees who report directly and indirectly to each manager. Name the CTE TotalReports. Name the columns in the result set MgrID and NumEmp.

The query should return a total of 47 rows.
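The following is one possible solution, not the only correct one; it assumes the ManagerID column of the HumanResources.Employee table:

-- DirectReports: a nonrecursive CTE; a simple GROUP BY is enough
WITH DirectReports (MgrID, NumEmp) AS
(
  SELECT ManagerID, COUNT(*)
  FROM HumanResources.Employee
  WHERE ManagerID IS NOT NULL
  GROUP BY ManagerID
)
SELECT MgrID, NumEmp
FROM DirectReports;

-- TotalReports: a recursive CTE that pairs every employee with each
-- of his or her managers (direct and indirect), then counts per manager
WITH TotalReports (MgrID, EmpID) AS
(
  SELECT ManagerID, EmployeeID
  FROM HumanResources.Employee
  WHERE ManagerID IS NOT NULL
  UNION ALL
  SELECT e.ManagerID, t.EmpID
  FROM HumanResources.Employee AS e
  JOIN TotalReports AS t
    ON t.MgrID = e.EmployeeID
  WHERE e.ManagerID IS NOT NULL
)
SELECT MgrID, COUNT(*) AS NumEmp
FROM TotalReports
GROUP BY MgrID;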

Discussion questions

Review the following questions, and then discuss your answers with the class:

1. How difficult was it to create the DirectReports query?

Answers will vary. In general, creating a nonrecursive query should be similar to creating a view.

2. How difficult was it to create the TotalReports query?

Answers will vary. In general, students probably had a more difficult time creating the second query. However, it was probably easier than some of the solutions they have used in the past when working with hierarchical data.

3. What other methods can you use to produce the same results produced by the TotalReports query?

Answers will vary, but solutions might include user-defined functions, temporary tables, and queries.


Lesson 2: Evaluating Pivot Queries

Lesson objectives

After completing this lesson, students will be able to:
■ Explain the functionality of a pivot query.
■ Explain the syntax of a pivot query.
■ Explain how to write a pivot query on an open schema.
■ Explain how to write an unpivot query.

Introduction

In this lesson, you will learn about using the PIVOT and UNPIVOT operators to create crosstab queries. A crosstab query restructures and calculates data to support reporting and analytical requirements. This type of query generally aggregates groups of normalized data to present that data in a meaningful way, similar to what you might see in a Microsoft Office Excel® spreadsheet. The crosstab query might also convert spreadsheet-like data to a more normalized form. When using versions of SQL Server prior to SQL Server 2005, creating crosstab queries often resulted in cumbersome or complex code. However, the PIVOT and UNPIVOT operators, introduced in SQL Server 2005, simplify the task of creating crosstab queries and presenting data in a meaningful way.


What Is a Pivot Query?

Introduction

When retrieving data to support summary reports, you must sometimes write crosstab queries that rotate the data and convert rows into columns. When using versions prior to SQL Server 2005, writing these summaries could often prove to be a cumbersome task, resulting in long, complex queries. However, the PIVOT and UNPIVOT operators can help you greatly simplify your queries. In this topic, you will learn about each of these operators. The topic provides various examples of queries that do not use PIVOT or UNPIVOT and then provides queries that use them, thus demonstrating how they can simplify the process of creating crosstab queries.

The pivot query

A pivot query uses the PIVOT operator to retrieve normalized data and display it in a way that is similar to how data appears in a spreadsheet. For example, the SalesOrderDetail and SalesOrderHeader tables in the AdventureWorks database store information about the various sales transactions. For each sale, you can retrieve the product ID for the item sold, the dollar amount that the product sold for, and the salesperson who sold that product.

Suppose that you want to view the total dollar amount of each product sold for one or more salespeople. To do so, you must create a column for each salesperson and provide a sales summary for each product. One way to do this is to use a derived query in the SELECT clause for each salesperson whom you want to include in the results, as shown in the following example:

SELECT s.ProductID,
  (SELECT SUM(d.LineTotal)
   FROM Sales.SalesOrderDetail AS d
   JOIN Sales.SalesOrderHeader AS h
     ON d.SalesOrderID = h.SalesOrderID
   WHERE h.SalesPersonID = 268
     AND d.ProductID = s.ProductID) AS [268],
  (SELECT SUM(d.LineTotal)
   FROM Sales.SalesOrderDetail AS d
   JOIN Sales.SalesOrderHeader AS h
     ON d.SalesOrderID = h.SalesOrderID
   WHERE h.SalesPersonID = 275
     AND d.ProductID = s.ProductID) AS [275],
  (SELECT SUM(d.LineTotal)
   FROM Sales.SalesOrderDetail AS d
   JOIN Sales.SalesOrderHeader AS h
     ON d.SalesOrderID = h.SalesOrderID
   WHERE h.SalesPersonID = 276
     AND d.ProductID = s.ProductID) AS [276]
FROM Sales.SalesOrderDetail AS s
GROUP BY s.ProductID;

If you were to include a large number of salespeople in this query, it would become very cumbersome. However, the logic behind the query does return the necessary data. The following result set shows a sample of the information that this query returns:

ProductID  268          275           276
---------  -----------  ------------  ------------
707        1576.269067  6324.728180   10859.642388
708        1420.189500  6570.446787   11677.535352
709        102.600000   572.397800    875.211250
710        NULL         57.000000     45.600000
711        1401.752000  8014.996351   12041.731684
712        413.701042   2809.074670   3747.460484
714        1427.979224  7518.769392   12237.426893
715        2566.296592  15337.620841  21303.943045
716        492.594000   6599.836400   10159.200880
717        2576.700000  51485.483700  33562.295000
718        3435.600000  51820.553400  28890.273200
719        NULL         10306.800200  14991.709200
722        1177.204400  25430.438200  17606.814200

As you can see, the query results include a row for each product and a total value for each salesperson. In all, the query returns 266 rows (unless you have added data to or deleted data from these tables). You can achieve the same results by using the CASE clause to retrieve data about each salesperson, as shown in the following statement:

SELECT d.ProductID,
  SUM(CASE WHEN h.SalesPersonID = 268
    THEN d.LineTotal ELSE NULL END) AS [268],
  SUM(CASE WHEN h.SalesPersonID = 275
    THEN d.LineTotal ELSE NULL END) AS [275],
  SUM(CASE WHEN h.SalesPersonID = 276
    THEN d.LineTotal ELSE NULL END) AS [276]
FROM Sales.SalesOrderDetail AS d
JOIN Sales.SalesOrderHeader AS h
  ON d.SalesOrderID = h.SalesOrderID
GROUP BY d.ProductID;


Although this statement is not quite as complex as the previous example, it can still become quite cumbersome if you want to retrieve data about many salespeople. However, SQL Server 2005 enables you to return summarized data in a much simpler fashion by using the PIVOT operator, as shown in the following example:

SELECT *
FROM (SELECT h.SalesPersonID, d.ProductID, d.LineTotal
      FROM Sales.SalesOrderDetail AS d
      JOIN Sales.SalesOrderHeader AS h
        ON d.SalesOrderID = h.SalesOrderID
      WHERE h.SalesPersonID IS NOT NULL) AS Sales
PIVOT (SUM(LineTotal)
  FOR SalesPersonID IN ([268], [275], [276])) AS Pvt;

This statement returns similar results but is much simpler to write and much easier to expand to include many more salespeople. The following results show part of the result set returned by this statement:

ProductID  268          275           276
---------  -----------  ------------  ------------
707        1576.269067  6324.728180   10859.642388
708        1420.189500  6570.446787   11677.535352
709        102.600000   572.397800    875.211250
710        NULL         57.000000     45.600000
711        1401.752000  8014.996351   12041.731684
712        413.701042   2809.074670   3747.460484
714        1427.979224  7518.769392   12237.426893
715        2566.296592  15337.620841  21303.943045
716        492.594000   6599.836400   10159.200880
717        2576.700000  51485.483700  33562.295000
718        3435.600000  51820.553400  28890.273200
719        NULL         10306.800200  14991.709200
722        1177.204400  25430.438200  17606.814200

Note that this statement returns fewer rows than the previous two statements because it does not return rows when the LineTotal value is NULL for all specified salespeople. In the next topic, you will learn more about the syntax of a pivot query and how you can use it to effectively summarize your data.

The unpivot query

The UNPIVOT operator is similar to the PIVOT operator in that it enables you to rotate data; however, UNPIVOT turns columns into rows. You should use this operator to retrieve data that is already stored in a way similar to a spreadsheet and display it in a more normalized fashion. For example, suppose that the AdventureWorks database includes the AccountBalances table. The following code shows the AccountBalances table definition and sample data:

USE AdventureWorks;
GO
CREATE TABLE Sales.AccountBalances (
  AcctID INT PRIMARY KEY,
  Jan MONEY, Feb MONEY, Mar MONEY, Apr MONEY,
  May MONEY, Jun MONEY, Jul MONEY, Aug MONEY,
  Sep MONEY, Oct MONEY, Nov MONEY, Dec MONEY);
INSERT INTO Sales.AccountBalances VALUES
  (1234, 37.42, 22.22, 57.39, 18.22, 0, 97.67, .03, 47.00, 77.44, 45.45, 56.56, 67.67);
INSERT INTO Sales.AccountBalances VALUES
  (1274, 7.42, 2.22, 7.39, 1.22, 7.00, 9.67, 7.03, 4.00, 7.44, 145.45, 156.56, 667.67);
INSERT INTO Sales.AccountBalances VALUES
  (1288, 337.42, 322.22, 357.39, 318.22, 30.00, 397.67, 3.03, 347.00, 377.44, 345.45, 356.56, 367.67);
INSERT INTO Sales.AccountBalances VALUES
  (1347, 437.42, 422.22, 457.39, 418.22, 440.00, 497.67, 4.03, 447.00, 477.44, 445.45, 456.56, 467.67);

The table stores the dollar amount owed by each customer account at the end of each month. If you were to retrieve all rows from this table, the result set would be similar to a spreadsheet. However, suppose that you want to display data for each account for each month, in a manner that is more normalized. To do so, you can use UNION ALL operators to join together multiple SELECT statements, as shown in the following example:

SELECT AcctID, Jan AS SalesAmount, 'Jan' AS SalesMonth FROM Sales.AccountBalances
UNION ALL
SELECT AcctID, Feb AS SalesAmount, 'Feb' AS SalesMonth FROM Sales.AccountBalances
UNION ALL
SELECT AcctID, Mar AS SalesAmount, 'Mar' AS SalesMonth FROM Sales.AccountBalances
UNION ALL
SELECT AcctID, Apr AS SalesAmount, 'Apr' AS SalesMonth FROM Sales.AccountBalances
UNION ALL
SELECT AcctID, May AS SalesAmount, 'May' AS SalesMonth FROM Sales.AccountBalances
ORDER BY AcctID;

Each SELECT statement retrieves data for a specified month. Because the statement specifies five months, the query results return one row for each account for each month. The following result set shows a sample of the retrieved data:

AcctID  SalesAmount  SalesMonth
------  -----------  ----------
1234    37.42        Jan
1234    22.22        Feb
1234    57.39        Mar
1234    18.22        Apr
1234    0.00         May
1274    7.42         Jan
1274    2.22         Feb
1274    7.39         Mar
1274    1.22         Apr
1274    7.00         May

In all, the statement returns 20 rows. You can achieve the same results by using the UNPIVOT operator, as shown in the following example:

SELECT *
FROM (SELECT AcctID, Jan, Feb, Mar, Apr, May
      FROM Sales.AccountBalances) AS AcctBal
UNPIVOT (SalesAmount
  FOR SalesMonth IN ([Jan], [Feb], [Mar], [Apr], [May])) AS Unpvt;

As you can see, the UNPIVOT operator enables you to create statements that are much simpler than the method used in the previous example. Rather than creating a SELECT statement for each month, you simply specify all the months in a single clause. Later in this lesson, you will learn more about the UNPIVOT operator.


The Syntax of a Pivot Query

Introduction

In the last topic, you were introduced to the PIVOT operator and how it can be used to create crosstab queries. This topic continues with that discussion by describing the syntax of a pivot query and explaining the components of a sample statement that uses the PIVOT operator.

The syntax of a pivot query

A pivot query is one that includes the PIVOT operator. The PIVOT operator is an extension to the FROM clause of a SELECT statement, as shown in the following syntax:

SELECT *
FROM table_source
PIVOT (pivot_clause) [AS] table_alias

pivot_clause ::=
  aggregate_function(value_column)
  FOR pivot_column
  IN (column_list)

When you create a pivot query, you must use the asterisk (*) wildcard in the SELECT clause. If you must limit the columns returned by your statement, you should include a derived query in the FROM clause as your table source. The derived query must include exactly those columns that should be returned by the outer SELECT statement. After you specify the table source in your FROM clause, you must specify the PIVOT keyword, then the pivot clause enclosed in parentheses, and finally a table alias.

The pivot clause includes an aggregate function, the column to be aggregated, and the FOR clause. The column to be aggregated contains the values that the query will summarize based on the specified aggregate function. The FOR clause must contain the name of the pivot column. The pivot column includes the values that act as column names in the result set. The FOR clause also includes an IN operator that specifies the list of values from the pivot column that should be used as columns in the result set.

To better understand the syntax, take a look at the following example:

SELECT *
FROM (SELECT h.SalesPersonID, d.ProductID, d.LineTotal
      FROM Sales.SalesOrderDetail AS d
      JOIN Sales.SalesOrderHeader AS h
        ON d.SalesOrderID = h.SalesOrderID
      WHERE SalesPersonID IS NOT NULL) AS Sales
PIVOT (SUM(LineTotal)
  FOR SalesPersonID IN ([268], [275], [276])) AS Pvt;

This is the same example of a pivot query that you saw in the last topic. As you can see, the SELECT clause includes the asterisk (*) wildcard, and the FROM clause includes a derived query. The FROM clause also includes the PIVOT operator, the pivot clause, and the table alias. The pivot clause specifies the SUM aggregate function, the LineTotal column as the aggregated column, and the SalesPersonID column as the pivot column. This means that the statement will add together the values in the LineTotal column based on how the statement pivots the results. Because the statement specifies the SalesPersonID column as the pivot column, the statement will use values in this column as columns in the query results. The IN operator specifies that the values 268, 275, and 276 should be used as columns. The following results show a part of the result set returned by the query:

ProductID  268          275           276
---------  -----------  ------------  ------------
707        1576.269067  6324.728180   10859.642388
708        1420.189500  6570.446787   11677.535352
709        102.600000   572.397800    875.211250
710        NULL         57.000000     45.600000
711        1401.752000  8014.996351   12041.731684
712        413.701042   2809.074670   3747.460484
714        1427.979224  7518.769392   12237.426893
715        2566.296592  15337.620841  21303.943045
716        492.594000   6599.836400   10159.200880
717        2576.700000  51485.483700  33562.295000
718        3435.600000  51820.553400  28890.273200
719        NULL         10306.800200  14991.709200
722        1177.204400  25430.438200  17606.814200

As you can see, the results include the three pivoted column values: 268, 275, and 276. In all, the pivot query returns 266 rows.


Demonstration: Writing a Pivot Query on an Open Schema

Introduction

Pivot queries are especially useful when querying data stored in tables defined with an open schema. Open schemas provide a flexible mechanism for storing data that describes the attributes of an object. For example, suppose that your database stores information about the products that your company sells. There are several approaches that you can take to store the data. You might include all product information in one table. This approach can work well if the attributes that describe the products are consistent from product to product. For instance, if each product includes the same attributes, such as size, color, style, weight, and so on, you can store this data in one table. However, if different products have different attributes, a single-table solution can result in numerous NULL values and complex redesign if you add products that have attributes that are different from those that already exist.

One solution to this problem is to use an open schema design to create your tables. In an open schema, you store the basic product information in one table and the attributes in another table. For example, you might store the product ID and the product name in one table and the product ID, attribute name, and attribute description in another table. This approach enables you to store different attributes about different products and allows you to add and remove attributes as necessary. As a result, you do not have to include the product’s attributes as part of the table schema but rather as rows that you can add, modify, or delete, resulting in a fully normalized, yet flexible, structure.

You can use the PIVOT operator to retrieve data from tables with an open schema. You use the operator in much the same way that you use it for other types of pivot queries. In this demonstration, you will create a pivot query that retrieves data from two tables designed with an open schema. You will first create and populate these tables and then write the query.


Preparation

Ensure that the virtual machine 2781A-MIA-SQL-08 is running and that you are logged on as Student. If a virtual machine has not been started, perform the following steps:
1. Close any other running virtual machines.
2. Start the 2781A-MIA-SQL-08 virtual machine.
3. In the Log On to Windows dialog box, complete the logon procedure by using the user name Student and the password Pa$$w0rd.

To create the Products and ProdAttributes tables

To complete this demonstration, you must create the Products table and the ProdAttributes table. You will create these tables in the AdventureWorks database by using the Production schema. To create these tables, perform the following steps:
1. On the Start menu, point to All Programs, point to SQL Server 2005, and then click SQL Server Management Studio.
2. In the Connect to Server dialog box, use Windows Authentication to log on to the local server and then click Connect.
3. In SQL Server Management Studio, click New Query to open a new query window.
4. If the Connect to Server dialog box appears, use Windows Authentication to log on to the local server and then click Connect.
5. In the query window, type the following statements:

USE AdventureWorks;
GO
CREATE TABLE Production.Products (
  ProdID INT PRIMARY KEY,
  ProdName VARCHAR(200) NOT NULL
);
CREATE TABLE Production.ProdAttributes (
  ProdID INT NOT NULL,
  AttrName VARCHAR(100) NOT NULL,
  AttrValue VARCHAR(100) NOT NULL,
  CONSTRAINT [PK_Products_ProdID_AttrName]
    PRIMARY KEY CLUSTERED (ProdID, AttrName),
  CONSTRAINT [FK_ProdAttr_Products_ProdID]
    FOREIGN KEY (ProdID) REFERENCES Production.Products (ProdID)
);
INSERT INTO Production.Products VALUES (101, 'Mtn bike - 100');
INSERT INTO Production.Products VALUES (102, 'Mtn seat/saddle');
INSERT INTO Production.Products VALUES (103, 'Jersey');
INSERT INTO Production.Products VALUES (104, 'Brake cable');
INSERT INTO Production.ProdAttributes VALUES (101, 'Color', 'Silver');
INSERT INTO Production.ProdAttributes VALUES (101, 'Size', '38');
INSERT INTO Production.ProdAttributes VALUES (101, 'Frame', 'Aluminum');
INSERT INTO Production.ProdAttributes VALUES (102, 'Material', 'Gel/steel rails');
INSERT INTO Production.ProdAttributes VALUES (102, 'Gender', 'Mens');
INSERT INTO Production.ProdAttributes VALUES (103, 'Material', 'Polyester/spandex');
INSERT INTO Production.ProdAttributes VALUES (103, 'Style', 'Long sleeve');

INSERT INTO Production.ProdAttributes VALUES (103, 'Color', 'Blue');
INSERT INTO Production.ProdAttributes VALUES (103, 'Size', 'Large');
INSERT INTO Production.ProdAttributes VALUES (104, 'Length', '66');

You can find the code to create and populate these tables in the Products_ProdAttributes_tables.sql file in the E:\Demos folder. As you can see, you can modify attributes in the ProdAttributes table as needed. The open schema design provides you with substantial flexibility with this type of data.

6. Click Execute to run the query. You should receive messages indicating that the objects have been successfully created and the rows have been inserted.

To create the pivot query

To create the pivot query, perform the following steps:
1. In the query window, type the following statement:

SELECT *
FROM (SELECT p.ProdID, p.ProdName, a.AttrName, a.AttrValue
      FROM Production.Products AS p
      INNER JOIN Production.ProdAttributes AS a
        ON p.ProdID = a.ProdID) AS Attr
PIVOT (MAX(AttrValue)
  FOR AttrName IN ([Material], [Style], [Color], [Size])) AS Pvt
WHERE ProdID IN (SELECT ProdID FROM Production.Products
                 WHERE ProdName = 'Jersey');

Be sure that you correctly identify the values for the IN operator of the FOR clause.

Note: Notice that you must still specify an aggregate function. When retrieving data from an open schema in this way, you should use the MAX or MIN aggregate function.

2. Click Execute to run your query. The following result set shows the results you should receive from the query:

ProdID  ProdName  Material           Style        Color  Size
------  --------  -----------------  -----------  -----  -----
103     Jersey    Polyester/spandex  Long sleeve  Blue   Large

(1 row(s) affected)

Notice that the result set includes only those attributes that apply to product 103.


Demonstration: Writing an Unpivot Query

Introduction

In the first topic in this lesson, “What Is a Pivot Query?” you saw an example of an unpivot query. As you will recall, the unpivot query basically normalizes data that exists in a spreadsheet-like format. In this demonstration, you will retrieve data through the vSalesPersonSalesByFiscalYears view in the AdventureWorks database. The view lists the total sales amount for salespeople for three consecutive years (2002, 2003, and 2004). The view includes a column for each yearly total. To retrieve the data, you will create an unpivot query that displays one row per salesperson per year. This means that your query results will include three rows for each salesperson, one row per year.

Preparation

Ensure that the virtual machine 2781A-MIA-SQL-08 is running and that you are logged on as Student. If a virtual machine has not been started, perform the following steps:
1. Close any other running virtual machines.
2. Start the 2781A-MIA-SQL-08 virtual machine.
3. In the Log On to Windows dialog box, complete the logon procedure by using the user name Student and the password Pa$$w0rd.

To review the vSalesPersonSalesByFiscalYears view

To review the vSalesPersonSalesByFiscalYears view definition, perform the following steps:
1. On the Start menu, point to All Programs, point to SQL Server 2005, and then click SQL Server Management Studio.
2. In the Connect to Server dialog box, use Windows Authentication to log on to the local server and then click Connect.
3. In Object Explorer, expand Databases, expand AdventureWorks, and then expand Views.
4. Right-click Sales.vSalesPersonSalesByFiscalYears and then click Open View.


5. In the View – Sales.vSalesPersonSalesByFiscalYears window, review the following columns and their data.

Column         Description
-------------  ----------------------------------------
SalesPersonID  Employee ID (primary key)
2002           Total sales for that employee for 2002
2003           Total sales for that employee for 2003
2004           Total sales for that employee for 2004

Each row in the view provides information about the salesperson and the total amount of sales for the years 2002, 2003, and 2004.

6. Close the View – Sales.vSalesPersonSalesByFiscalYears window.

To create the unpivot query

You will create an unpivot query that retrieves data from the vSalesPersonSalesByFiscalYears view. Each row of the result set should include the salesperson’s name and total sales for each year. In other words, you must retrieve one row for each salesperson for each year. To create the unpivot query, perform the following steps:
1. In SQL Server Management Studio, click New Query to open a new query window.
2. In the query window, type the following USE and SELECT statements:

USE AdventureWorks;
GO
SELECT *
FROM (SELECT FullName, [2002], [2003], [2004]
      FROM Sales.vSalesPersonSalesByFiscalYears) AS TotalSales
UNPIVOT (SalesTotal
  FOR SalesYear IN ([2002], [2003], [2004])) AS Unpvt;

3. Click Execute to run your query. The following result set shows part of the results that you should receive from the query:

FullName               SalesTotal    SalesYear
---------------------  ------------  ---------
Michael G Blythe       1951086.8256  2002
Michael G Blythe       4743906.8935  2003
Michael G Blythe       4557045.0459  2004
Linda C Mitchell       2800029.1538  2002
Linda C Mitchell       4647225.4431  2003
Linda C Mitchell       5200475.2311  2004
Jillian Carson         3308895.8507  2002
Jillian Carson         4991867.7074  2003
Jillian Carson         3857163.6331  2004
Garrett R Vargas       1135639.2632  2002
Garrett R Vargas       1480136.0065  2003
Garrett R Vargas       1764938.9857  2004
Tsvi Michael Reiter    3242697.0127  2002
Tsvi Michael Reiter    2661156.2418  2003
Tsvi Michael Reiter    2811012.715   2004
Pamela O Ansman-Wolfe  1473076.9138  2002
Pamela O Ansman-Wolfe  900368.5797   2003
Pamela O Ansman-Wolfe  1656492.8626  2004

Notice that the result set includes a row for each salesperson for each of the three years. The result set should include 35 rows total, 3 for each salesperson.


Practice: Writing Pivot Queries That Use Aggregation

Introduction

In this practice, you will create a crosstab query that retrieves data from the SalesOrderHeader table in the AdventureWorks database. Your query results should include data only from the SalesPersonID, TerritoryID, and SubTotal columns in the SalesOrderHeader table. Keep in mind that a crosstab query often denormalizes data, so you might see a high rate of NULL values in your query results.

Practice tasks

■ Create a crosstab query that retrieves data from the SalesOrderHeader table. The query should retrieve data from the SalesPersonID, TerritoryID, and SubTotal columns. The result set should include one column for the SalesPersonID values and one column for each of the following four territories: 1, 2, 3, and 4. The data should include one row for each salesperson ID. Your code should look similar to the following query:

SELECT *
FROM (SELECT SalesPersonID, TerritoryID, SubTotal
      FROM Sales.SalesOrderHeader
      WHERE SalesPersonID IS NOT NULL) AS soh
PIVOT (SUM(SubTotal)
  FOR TerritoryID IN ([1], [2], [3], [4])) AS Pvt;

The following result set shows part of the results that your query should return:

SalesPersonID  1             2             3             4
-------------  ------------  ------------  ------------  ------------
284            NULL          NULL          NULL          NULL
281            936689.47     NULL          621258.9399   6371216.5566
278            NULL          NULL          NULL          NULL
275            NULL          3547034.3961  3127053.0254  4230221.0496
276            2562534.431   NULL          2225871.1494  7859324.2476
287            2814958.8942  NULL          NULL          NULL
279            NULL          NULL          NULL          NULL
290            NULL          NULL          NULL          NULL
288            NULL          NULL          NULL          NULL
285            NULL          NULL          NULL          NULL
268            279997.5919   100165.0377   43824.0227    498966.7766

The query should return a total of 17 rows. Notice the large number of NULL values that this sort of query produces.

Discussion questions

Review the following questions, and then discuss your answers with the class:

1. How difficult was it for you to create the crosstab query? How does this method compare to other methods that you might have used?

Answers will vary. In general, students should find it easier to create a crosstab query by using the PIVOT or UNPIVOT operators, once they understand the syntax. Methods that use derived queries or CASE clauses can be more complex and cumbersome.

2. What issues might arise when creating a crosstab query such as the one in this practice?

Answers will vary. One of the most likely issues to arise is that students fail to use a derived query in the FROM clause to retrieve only the columns that should be included in the result set. Too many columns can result in skewed aggregated values. Students will also run into problems if they do not specify the correct column in the aggregate function, the correct column in the FOR clause, or the correct values in the IN operator.


Lesson 3: Evaluating Ranking Queries

Lesson objectives

After completing this lesson, students will be able to:
■ Explain the different types of ranking queries.
■ Explain the syntax of a ranking query.
■ Apply the guidelines for using ranking queries.

Introduction

SQL Server 2005 includes a set of functions that enable you to assign values to the rows returned in your result sets. These functions rank the rows based on the values in one or more specified columns. In this lesson, you will learn how you can use these ranking functions to simplify and improve the performance of your queries. This lesson describes the different types of ranking functions, explains the syntax of these functions, and provides guidelines for using the functions.


Types of Ranking Queries

Introduction

SQL Server 2005 includes a new set of functions referred to as ranking functions. Ranking functions are different from other types of functions because a ranking function takes into account the entire result set and each row’s position within that result set. Other functions have a single-row perspective and are unaware of the other rows in the result set. A ranking function assigns a sequential integer to a row based on that row’s position compared to other rows. The function bases that assignment on the returned values in one or more specified columns. This topic describes these ranking functions and provides an example SELECT statement that includes each function.

Types of ranking functions

SQL Server 2005 supports the following ranking functions:

■ ROW_NUMBER. This function returns a sequential integer for each row, starting at 1 and incrementing the number by 1 for each successive row. This function disregards duplicate values in the specified column. For example, if your query returns three rows, the ROW_NUMBER function assigns a 1 to the first row, a 2 to the second row, and a 3 to the third row. Even if the rows are identical, each row receives a unique, sequential number.

■ RANK. This function returns a sequential integer for each row, starting at 1 and increasing that number according to the number of duplicate values in the specified column. For example, if the first two rows return identical values and the third and fourth rows return values that are identical to each other but different from the first two rows, the RANK function assigns 1 to the first two rows and 3 to the third and fourth rows. If the specified column contains no duplicate values, the RANK function assigns the same integer values as the ROW_NUMBER function.

■ DENSE_RANK. This function returns a sequential integer for each row, starting at 1 and increasing that number according to the number of duplicate values in the specified column. However, unlike the RANK function, the DENSE_RANK function does not skip integer values when assigning values to a row. For example, if the first two rows return identical values and the third and fourth rows return values that are identical to each other but different from the first two rows, the DENSE_RANK function assigns 1 to the first two rows and 2 to the third and fourth rows. If the specified column contains no duplicate values, the DENSE_RANK function assigns the same integer values as the ROW_NUMBER function.

■ NTILE. This function returns a sequential integer for each row, starting at 1 and incrementing that number by 1 for each successive group of rows, as identified by the integer passed as an argument with the NTILE function. Unlike other ranking functions, the NTILE function requires an argument. You must specify an integer value that defines the number of groups to use when dividing the result set. For example, suppose your result set includes 12 rows. If you specify NTILE(3), the function divides the result set into three groups. It then assigns 1 to the first four rows, 2 to the next set of four rows, and 3 to the last set of four rows.

Using ranking functions in a query

The following SELECT statement includes all four ranking functions:

SELECT SalesOrderID, SubTotal,
  ROW_NUMBER() OVER(ORDER BY SubTotal DESC) AS RowNumber,
  RANK() OVER(ORDER BY SubTotal DESC) AS Rank,
  DENSE_RANK() OVER(ORDER BY SubTotal DESC) AS DenseRank,
  NTILE(3) OVER(ORDER BY SubTotal DESC) AS NTile
FROM Sales.SalesOrderHeader
WHERE SalesOrderID IN (51885, 51886, 52031, 56410, 56411,
                       72585, 65101, 65814, 73976)
ORDER BY SubTotal DESC;

The statement retrieves data from the SalesOrderHeader table in the AdventureWorks database. As you can see, each column definition that contains a ranking function also contains an OVER clause, and each OVER clause includes an ORDER BY clause. Also notice that the NTILE function contains an integer passed as an argument to the function. (The next topic describes the syntax of ranking functions in more detail.) The preceding example returns the following result set:

SalesOrderID  SubTotal  RowNumber  Rank  DenseRank  NTile
------------  --------  ---------  ----  ---------  -----
73976         7.28      1          1     1          1
65101         6.28      2          2     2          1
65814         6.28      3          2     2          1
72585         4.99      4          4     3          2
56410         3.99      5          5     4          2
56411         3.99      6          5     4          2
51885         2.29      7          7     5          3
51886         2.29      8          7     5          3
52031         2.29      9          7     5          3

(9 row(s) affected)

As you can see, the functions assign different values depending on the value in the SubTotal column. For example, the fifth and sixth rows include the same SubTotal value (3.99). The ROW_NUMBER function assigns 5 and 6 because they are the fifth and sixth rows. However, the RANK function assigns 5 to each row because the sixth row contains the same SubTotal value as the fifth row. On the other hand, the DENSE_RANK function assigns 4 to both rows because they contain the same SubTotal value and 4 is the next sequential number in the series of assigned integers. Finally, the NTILE(3) function assigns 2 to both rows. The function divides the result set into three groups. The fifth and sixth rows are part of the second group, so the function assigns a 2 to each row.


The Syntax of a Ranking Query

Introduction

The syntax for all ranking functions is identical. The only exception is the NTILE function, which requires an argument that defines the number of groups to use when dividing the result set. You must always include an integer value whenever using this function. This topic describes the syntax for all ranking functions.

The syntax of a ranking function

All ranking functions use the following syntax:

ranking_function OVER(over_clause)

over_clause ::=
  [PARTITION BY value_expression [,...n]]
  ORDER BY order_by_expression [ASC|DESC] [,...n]

As you can see, you must specify the function, then the OVER keyword, and finally, in parentheses, the OVER clause. The OVER clause must include an ORDER BY clause that specifies the column or columns on which the ranking should be based. You can also specify a PARTITION BY clause in the OVER clause. The PARTITION BY clause groups the result set according to the values in the specified columns. The ranking function then assigns values within the scope of each partition. For example, the following SELECT statement uses a PARTITION BY clause in a DENSE_RANK function:

SELECT CustomerID, SalesOrderID, SubTotal,
  DENSE_RANK() OVER(PARTITION BY CustomerID
    ORDER BY SubTotal DESC) AS DenseRank
FROM Sales.SalesOrderHeader
WHERE CustomerID IN (427, 428, 429)
ORDER BY CustomerID, SubTotal DESC;

The PARTITION BY clause groups the result set by the values in the CustomerID column. The ORDER BY clause specifies that the function rank the rows based on the SubTotal values and in descending order. The following result set shows how the DENSE_RANK function ranks each row:

CustomerID  SalesOrderID  SubTotal    DenseRank
----------  ------------  ----------  ---------
427         51732         440.1742    1
427         57048         29.9626     2
428         65267         13311.8424  1
428         59020         10605.138   2
428         71900         7287.5592   3
428         53581         5119.5004   4
429         61180         4208.8078   1
429         51087         4119.673    2
429         67264         2096.9376   3
429         55240         1068.984    4

(10 row(s) affected)

The query groups rows first by CustomerID values and then by SubTotal values. Each set of unique customer IDs forms a PARTITION BY group, and the DENSE_RANK function assigns a value to each row within each group. As a result, the function begins with 1 for each new group. For example, the result set includes two rows for customer 427. The function assigns 1 and 2 to these rows. The function then starts the numbering over and assigns a 1 to the first row for customer 428. This way, you can rank the total sales for each customer, rather than for all customers combined, without having to return a result set for each customer. If the query did not include the PARTITION BY clause, it would return the following results:

CustomerID  SalesOrderID  SubTotal    DenseRank
----------  ------------  ----------  ---------
428         65267         13311.8424  1
428         59020         10605.138   2
428         71900         7287.5592   3
428         53581         5119.5004   4
429         61180         4208.8078   5
429         51087         4119.673    6
429         67264         2096.9376   7
429         55240         1068.984    8
427         51732         440.1742    9
427         57048         29.9626     10

Without the PARTITION BY clause, the DENSE_RANK function ranks the sales totals based on all SubTotal values, not on subgroups of values.

Note: The ORDER BY clause used in the OVER clause of the ranking function specifies how rows should be ranked. However, you must still use an ORDER BY clause within the outer query if you want to return the rows in a specific order.


Guidelines for Using Ranking Queries

Introduction

Ranking functions can help you simplify your code and improve performance. For example, you can use ranking functions to support paging your result sets or to remove duplicate data. This topic describes several of the guidelines that you should take into account when using ranking functions.

Using ranking functions in your queries

You should consider the following guidelines when using ranking functions:

■ Use ranking functions to assign values to rows. Ranking functions, at their most basic, enable you to assign ranking values to the rows in your result set. In situations in which you simply want to know how specified values rank against each other, ranking functions provide an efficient method for achieving this. For example, you might want to rank employee sales or product purchases. Ranking functions make this easy, and unlike other Transact-SQL methods, which can complicate your queries and affect transaction processing, ranking functions do not affect performance.

■ Use ranking functions to support the paging of large result sets. In cases in which your result sets include a large amount of data, you might decide to use paging to improve performance. Paging refers to the process of returning only partial result sets in order to work with more manageable sets of data. You might choose to implement this functionality at the front-end tier. However, ranking functions provide a useful method for implementing paging at the data tier. For example, suppose that you want to return only a portion of the sales stored in the SalesOrderHeader table of the AdventureWorks database. You can use a common table expression (CTE) that includes a ranking function to define your result set. You can then specify which rows to return based on the values assigned by the ranking function. This way, you do not need to know the data values themselves, as shown in the following example:

WITH PagedOrders AS
(
  SELECT SalesOrderID, CustomerID, SubTotal,
    ROW_NUMBER() OVER(ORDER BY SalesOrderID) AS RowNumber
  FROM Sales.SalesOrderHeader
)
SELECT SalesOrderID, CustomerID, SubTotal
FROM PagedOrders
WHERE RowNumber BETWEEN 11 AND 21;

As you can see, the ROW_NUMBER function makes it easy to return a specific set of records, as the following result set illustrates:

SalesOrderID  CustomerID  SubTotal
------------  ----------  ----------
43669         578         881.4687
43670         504         7344.5034
43671         200         9760.1695
43672         119         7345.0284
43673         618         4474.5179
43674         83          3149.2584
43675         670         6835.9493
43676         17          17040.8246
43677         679         9338.7639
43678         203         11775.9248
43679         480         1572.5395

(11 row(s) affected)

If you want to return a different set of rows, you simply modify the WHERE clause of the referencing query.

Note: You cannot use ranking functions in the WHERE clause. As a result, you must use a device such as a CTE to assign the ranking values. You can also use views, derived queries, or user-defined functions (UDFs).

■ Use ranking functions to remove duplicate data. When performing such tasks as migrating applications or building data warehouses, you might discover that some of the source data includes duplicate rows. Ranking functions provide a useful way to determine whether you have duplicate values in one or more columns. For example, suppose that you create a table named SalesOrderHeader2 in the AdventureWorks database that is a copy of the SalesOrderHeader table, except without any constraints. You then run the following INSERT statements against the SalesOrderHeader2 table:

INSERT INTO Sales.SalesOrderHeader2
  SELECT * FROM Sales.SalesOrderHeader WHERE SalesOrderID < 50000;
INSERT INTO Sales.SalesOrderHeader2
  SELECT * FROM Sales.SalesOrderHeader WHERE SalesOrderID < 44000;
INSERT INTO Sales.SalesOrderHeader2
  SELECT * FROM Sales.SalesOrderHeader WHERE SalesOrderID < 43800;

You can now use ranking functions to determine which rows are duplicates and which rows are not. For example, the ROW_NUMBER function sequentially numbers rows regardless of duplicates, but the RANK function takes into account duplicates. By subtracting RANK values from ROW_NUMBER values, you can determine whether a row is a duplicate, as shown in the following example:

WITH OrderDuplicates AS
(
  SELECT SalesOrderID,
    ROW_NUMBER() OVER(ORDER BY SalesOrderID) AS RowNumber,
    RANK() OVER(ORDER BY SalesOrderID) AS Rank,
    ROW_NUMBER() OVER(ORDER BY SalesOrderID) -
      RANK() OVER(ORDER BY SalesOrderID) AS IsDuplicate
  FROM Sales.SalesOrderHeader2
)
SELECT * FROM OrderDuplicates
WHERE IsDuplicate > 0;

When the ROW_NUMBER value matches the RANK value, the IsDuplicate value is 0. If the values do not match, the IsDuplicate value is 1 or higher. You can then use the IsDuplicate value in the WHERE clause of the referencing query to return values with an IsDuplicate value greater than 0. The following result set shows part of the results that this query returns:

SalesOrderID  RowNumber  Rank  IsDuplicate
------------  ---------  ----  -----------
43659         2          1     1
43659         3          1     2
43660         5          4     1
43660         6          4     2
43661         8          7     1
43661         9          7     2
43662         11         10    1
43662         12         10    2
43663         14         13    1
43663         15         13    2
43664         17         16    1
43664         18         16    2
43665         20         19    1
43665         21         19    2
43666         23         22    1
43666         24         22    2
43667         26         25    1
43667         27         25    2
43668         29         28    1
43668         30         28    2
43669         32         31    1
43669         33         31    2

In all, the query returns 482 rows, all with IsDuplicate values greater than 0. From this information, you can then take the action necessary to eliminate duplicates.
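If your goal is to delete the extra copies directly, SQL Server 2005 also lets you delete through a CTE. The following is a minimal sketch only (it is not part of the demonstration steps); it keeps one row per SalesOrderID in SalesOrderHeader2 and removes the rest:

WITH NumberedOrders AS
(
  SELECT SalesOrderID,
    ROW_NUMBER() OVER(PARTITION BY SalesOrderID
      ORDER BY SalesOrderID) AS RowNumber
  FROM Sales.SalesOrderHeader2
)
-- Rows with RowNumber > 1 are extra copies of an existing SalesOrderID
DELETE FROM NumberedOrders
WHERE RowNumber > 1;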

■ Add tie-breaker columns to your ORDER BY clause. When using a ranking function in your queries, you must include an ORDER BY clause in the function definition. However, you do not need to specify columns that contain unique values. When a column contains duplicate values, the query is nondeterministic. In other words, the query might produce different results each time you run it. If you require deterministic results, you should add additional columns to your ORDER BY clause to act as tie-breakers in cases in which you want to guarantee the same results whenever you run the query. For instance, in the previous example, you used ranking functions to determine whether duplicates exist. You might consider adding the SalesOrderNumber, PurchaseOrderNumber, OrderDate, DueDate, or ShipDate columns to create a deterministic query. The sketch that follows shows this pattern.
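As a minimal sketch of this guideline, the following query numbers orders by OrderDate; because many orders share the same date, SalesOrderNumber (which is unique in SalesOrderHeader) is added as a tie-breaker so that the numbering is the same on every run:

SELECT SalesOrderID, OrderDate, SalesOrderNumber,
  -- OrderDate alone contains duplicates; SalesOrderNumber breaks the ties
  ROW_NUMBER() OVER(ORDER BY OrderDate, SalesOrderNumber) AS RowNumber
FROM Sales.SalesOrderHeader;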


Lesson 4: Overview of XQuery

Lesson objectives

After completing this lesson, students will be able to:
■ Explain the syntax for writing queries that use XQuery.
■ Evaluate the considerations for parameterization of XQueries.
■ Apply the guidelines for using path expressions.
■ Apply the guidelines for using XQuery in the WHERE clause of a data manipulation language (DML) statement.

Introduction

The XML Query (XQuery) language is an industry-wide standard used to access XML data. Transact-SQL includes a subset of the XQuery language that you can use to access data stored in a column defined with the XML data type. In this lesson, you will learn about the various XQuery elements that Transact-SQL supports. The lesson also provides guidelines for parameterizing XQueries, using path expressions, and using XQuery in the WHERE clause of your data manipulation language (DML) statements.


The Syntax for Writing Queries Using XQuery

Introduction

SQL Server 2005 supports the XML data type, which enables you to store XML documents and fragments in columns and variables defined with that data type. To access the XML data, you can use the XQuery extensions to Transact-SQL to maneuver through and select from the data hierarchy. This topic introduces you to the syntax used for creating XQuery statements in SQL Server and explains the components necessary to define the various types of expressions that make up these statements.

Important: This topic explains the basic syntax for creating XQuery statements. The topic is intended to provide you with only an overview of XQuery. For specific details about XQuery and its elements, refer to SQL Server 2005 Books Online.

Primary expressions

A primary XQuery expression can be made up of literals, variables, path expressions, context item expressions, functions, and constructors. The following list describes and provides an example of each of these elements.

■ Literal. A numerical or string value. You must enclose string values in quotation marks.
  Example: SalesYear="2002"

■ Variable. A reference to a specified value.
  Example: SalesGoal="{sql:variable("@goal")}"

■ Path expression. Describes the location of elements and data within an XML hierarchy. Path expressions can be relative or absolute. (The topic “Guidelines for Using Path Expressions” later in this lesson describes path expressions in more detail.)
  Example: AnnualSales="{/sv:StoreSurvey/sv:AnnualSales}"

■ Context item expression. A method used to reference a value within the current element in XML data.
  Example: /sv:StoreSurvey/sv:Specialty[.="Road"]

■ Function. Any XQuery function that can be used when accessing data in an XML column or variable.
  Example: StoreName="{sql:column("Name")}"

■ Constructor. An XQuery component that enables you to define XML structures. You can create direct or computed constructors. You must enclose computed constructors in braces.
  Examples: SalesType="{ descendant::sv:BusinessType }" and StoreID="{sql:column("CustomerID")}"
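To see how several of these components can work together, the following sketch combines a literal, a variable reference, and the sql:column function inside computed constructors. The StoreInfo element and its attribute names are illustrative only; the query assumes the Demographics XML column of the Sales.Store table, which other examples in this lesson also use:

DECLARE @goal INT;
SET @goal = 300000;
SELECT Demographics.query
  ('declare namespace sv=
    "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey";
    <StoreInfo SalesYear="2002"
      StoreName="{sql:column("Name")}"
      SalesGoal="{sql:variable("@goal")}">
      {/sv:StoreSurvey/sv:AnnualSales}
    </StoreInfo>') AS StoreInfo
FROM Sales.Store
WHERE CustomerID = 1;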

Throughout this topic and in topics throughout the rest of this lesson, you will see examples of each of these components. Refer back to this list as necessary for a description of these components.

XQuery operators

As with other languages, XQuery supports a number of operators, including logical, comparison, and arithmetic operators. The logical operators include the and operator and the or operator. The comparison operators include the following:

■ Equal (=)
■ Not equal (!=)
■ Less than (<)
■ Greater than (>)
■ Less than or equal to (<=)
■ Greater than or equal to (>=)
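For example, the following minimal sketch applies comparison and logical operators in a WHERE clause by using the exist() method; the threshold values are illustrative, and the query assumes the Demographics XML column of the Sales.Store table:

SELECT CustomerID, Name
FROM Sales.Store
WHERE Demographics.exist
  ('declare namespace sv=
    "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey";
    /sv:StoreSurvey[sv:AnnualSales >= 1000000
      and sv:NumberEmployees > 10]') = 1;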

In addition to the logical and comparison operators, XQuery supports the following arithmetic operators:

■ Addition (+)
■ Subtraction (-)
■ Multiplication (*)
■ Division (div)
■ Modulus (mod)

The following example demonstrates the division and multiplication operators:

SELECT Name, Demographics.value
  ('declare namespace sv=
    "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey";
    (/sv:StoreSurvey/sv:AnnualSales)[1] div
    (12*(/sv:StoreSurvey/sv:NumberEmployees)[1])',
   'numeric(11,2)') AS MonthlyEmpSalesRate
FROM Sales.Store
WHERE CustomerID = 1;

The query first multiplies the NumberEmployees value by 12. It then divides that total into the AnnualSales value.

XML constructors

XML constructors are the XQuery components that you use to create an XML structure. XQuery supports two types of constructors: direct and computed. In a direct constructor, you specifically define the XML elements and attributes. In a computed constructor, which is enclosed in braces, you define the specifics necessary to generate the XML dynamically. The following fragment shows the syntax for direct and computed constructors:

SELECT ...
  ('declare namespace sv=
    "http://schemas.microsoft.com/sqlserver/Adventure-works/Store";
    <StoreInfo xmlns:sv=
      "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey"
      SalesYear="2002"
      SalesType="{ descendant::sv:BusinessType }">
      <Sales>
        {descendant::sv:AnnualSales}
        {descendant::sv:AnnualRevenue}
    ...
  ') AS StoreInfo
FROM Sales.Store

Notice that the code includes the SalesYear="2002" constructor. This is a direct constructor because it specifically defines the attribute. However, this is followed by SalesType="{ descendant::sv:BusinessType }", which is a computed constructor based on a relative path expression; the XML is generated dynamically. Notice that the example also includes an XML comment constructor and the cdata directive to escape the block of text so that it is not interpreted as XML.

Sequence expressions

An XQuery sequence expression is an ordered collection of values or XML nodes. You must use a comma to separate the list; however, you can repeat values and nest one sequence inside another. The following table provides examples of sequence expressions.

Sequence type  Example
-------------  ------------------------------------------------------------------
Values         (1, 2, (1, 3), 4, 5)
Nodes          ((/sv:StoreSurvey/sv:Specialty), (/sv:StoreSurvey/sv:AnnualSales))
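Because XQuery sequences never nest, you can verify the flattening behavior directly. The following quick check runs the nested value sequence from the table against an empty XML variable:

DECLARE @x XML;
SET @x = '';
-- Returns the flattened sequence: 1 2 1 3 4 5
SELECT @x.query('(1, 2, (1, 3), 4, 5)');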

FLWOR statements and iterations

FLWOR (pronounced flower) is an acronym based on "for-let-where-order by-return." An XQuery FLWOR statement creates iterations, or cycles. For example, the following statement uses a FLWOR construction to generate a simple HTML page:

SELECT JobCandidateID, Resume.query
('declare namespace rs=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
<table>
{ for $e in /rs:Resume/rs:Employment
order by $e/rs:Emp.JobTitle
return
<tr><td>{fn:string($e/rs:Emp.JobTitle)}</td></tr>
}
</table>
') AS EmpHistory
FROM HumanResources.JobCandidate
WHERE JobCandidateID = 1;

As you can see, the statement includes logic to create multiple rows based on the data retrieved from the XML column. The statement returns the following results:

<table>
<tr><td>Assistant Machinist</td></tr>
<tr><td>Lead Machinist</td></tr>
<tr><td>Machinist</td></tr>
</table>


Of course, your statement would normally include the logic necessary to produce more complex HTML, but this example still provides all the basic elements of a FLWOR construction.
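SQL Server 2005 implements the for, where, order by, and return clauses of FLWOR; the let clause is not supported. The following is a minimal sketch of the where clause, using a hypothetical order document of my own devising; it returns a BigOrder element only for the orders whose Total exceeds 400:

DECLARE @orders XML;
SET @orders = '<Orders>
<Order ID="1" Total="250" />
<Order ID="2" Total="900" />
<Order ID="3" Total="475" />
</Orders>';
-- Returns BigOrder elements for orders 2 and 3 only
SELECT @orders.query('
for $o in /Orders/Order
where $o/@Total > 400
return <BigOrder ID="{ $o/@ID }" />');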


Conditional expressions

XQuery also supports the conditional expression, which is made up of an if-then-else construction, as shown in the following example:

SELECT Name, Demographics.query
('declare namespace sv=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey";
if (fn:number((/sv:StoreSurvey/sv:AnnualRevenue)[1]) > 250000)
then "Sales greater than $250,000"
else "Annual sales less than or equal to $250,000"
') AS TotalSales
FROM Sales.Store
WHERE CustomerID = 1;

Quantified expressions

Quantified expressions enable you to compare two sequences and return a Boolean value. XQuery supports two types of quantified expressions:

■ Existential. If any item in the first sequence expression has a match in the second sequence expression, XQuery returns the value True.
■ Universal. If every item in the first sequence expression has a match in the second sequence expression, XQuery returns the value True.

A quantified expression takes the following syntax:

(some | every) variable in expression (,...) satisfies expression

An existential expression uses the some operator, and a universal quantified expression uses the every operator. For instance, the following SELECT statement uses an existential expression to determine whether any Location element within the Instructions column contains a MachineHours attribute:

SELECT ProductModelID, Instructions.value
('declare namespace man=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ProductModelManuInstructions";
some $m in //man:root/man:Location satisfies $m/@MachineHours',
'nvarchar(10)') AS Result
FROM Production.ProductModel
WHERE Instructions IS NOT NULL;

If any Location element contains a MachineHours attribute, the statement returns a value of true; otherwise, the statement returns false, as shown in the following result set:

ProductModelID Result
-------------- ----------
7              true
10             true
43             false
44             false
47             true
48             true
53             false
66             false
67             false

(9 row(s) affected)


If you change the statement to use a universal expression, all Location elements for each record must contain a MachineHours attribute to return the value True; otherwise, the expression returns False. As the following result set shows, no Instructions value in any row contained the MachineHours attribute in all Location elements:

ProductModelID Result
-------------- ----------
7              false
10             false
43             false
44             false
47             false
48             false
53             false
66             false
67             false

(9 row(s) affected)

Sequence type expressions

A sequence type expression is one in which you verify a value's dynamic type or in which you convert a value's type. To verify a value's dynamic type, you must use the instance of operator. The dynamic type refers to the value's type at run time. The instance of operator takes the following syntax:

expression instance of SequenceType[?]

The optional question mark is the occurrence indicator. If specified, the expression can return 0 or 1 item. In other words, the instance of operator returns true when the expression type matches the specified SequenceType, whether the expression returns a singleton or an empty sequence. If you do not specify the question mark, instance of returns true only when the expression contains a singleton and the expression type matches the specified SequenceType. The sequence type expression in the following example verifies whether the value Book is an integer:

DECLARE @x XML;
SET @x = '';
SELECT @x.query('"Book" instance of xs:integer');

The SELECT statement returns false because Book is a string value. You can also convert a value's type by using the cast as operator, as shown in the following syntax:

expression cast as AtomicType?

The AtomicType placeholder refers to the type to which the expression is converted. The question mark must always follow the type name. For example, the following cast as expression casts the integer 123 as a string:

DECLARE @x XML;
SET @x = '';
SELECT @x.query('123 cast as xs:string?');


Guidelines for Parameterizing XQueries

Introduction

When working with XML and relational data, you must sometimes combine the two types of data. The two primary ways to do this are by using the sql:column and sql:variable XQuery functions. In this topic, you will learn what considerations to take into account when using both these functions.

Creating parameterized XQueries

You should consider the following guidelines when parameterizing XQuery statements:

■ Use the sql:column function to combine XML and non-XML columns. The sql:column function enables you to retrieve values from a non-XML column from within your XQuery script. You can then use the value along with values from the XML column to combine those values in a single XML structure. You can retrieve values from any column except columns defined as XML, DATETIME, SMALLDATETIME, TEXT, NTEXT, SQL_VARIANT, IMAGE, or CLR user-defined types. For example, the following statement retrieves both XML and character data from the Store table in the AdventureWorks database:

SELECT Name, Demographics.query
('declare namespace sv=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey";
<Store StoreName="{sql:column("Name")}"
AnnualSales="{/sv:StoreSurvey/sv:AnnualSales}" />
') AS StoreSales
FROM Sales.Store
WHERE CustomerID = 1;

The statement retrieves data from the Name column and the AnnualSales element of the Demographics column. Notice that the StoreName constructor uses the sql:column function to retrieve the Name value. The statement assigns that value and the AnnualSales value to attributes in the Store element of the XML output structure, as shown in the following results:

<Store StoreName="A Bike Store" AnnualSales="300000" />


■ Use the sql:variable function to retrieve variable values. The sql:variable function retrieves the current value from a Transact-SQL variable. You can retrieve data from a variable of any type except from variables defined as XML, DATETIME, SMALLDATETIME, TEXT, NTEXT, SQL_VARIANT, IMAGE, or CLR user-defined types. The following example expands on the previous example to demonstrate the use of the sql:variable function:

DECLARE @goal MONEY
SET @goal = 400000
SELECT Name, Demographics.query
('declare namespace sv=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey";
<Store StoreName="{sql:column("Name")}"
AnnualSales="{/sv:StoreSurvey/sv:AnnualSales}"
SalesGoal="{sql:variable("@goal")}" />
') AS StoreSales
FROM Sales.Store
WHERE CustomerID = 1;

As you can see, the statement now includes an additional attribute in the Store element. The SalesGoal attribute retrieves data from the @goal variable, which is defined as a MONEY type. The XML structure produced by this statement now includes data from a non-XML column and variable, as well as from an XML element, as shown in the following results:

<Store StoreName="A Bike Store" AnnualSales="300000" SalesGoal="400000" />


Guidelines for Using Path Expressions

Introduction

XQuery path expressions enable you to locate elements and attributes within XML columns and XML variables. XQuery supports relative and absolute path expressions.

Relative path expressions are made up of one or more steps of the XML hierarchy, separated by single (/) or double (//) forward slashes. The steps are relative to the node that is currently being processed. Relative paths are often preceded by axis steps that define the relative position. For example, the relative path descendant::sv:AnnualRevenue includes the descendant axis step, the sv schema alias, and the AnnualRevenue step. The descendant axis matches the sv:AnnualRevenue element at any level below the node currently being processed.

The absolute path provides the full path for locating the XML elements and attributes. Absolute path expressions begin with one or two forward slashes and are followed by a relative path, as in /sv:StoreSurvey/sv:AnnualRevenue.

Whether you use relative or absolute path expressions, parsing XML data can affect query performance and the maintainability of your code. This topic discusses guidelines for using path expressions in your XQuery statements.
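As a minimal sketch of the difference, assume a hypothetical untyped document with no namespace; both queries below return 1, but the first spells out every step from the root, while the second matches the element at any depth:

DECLARE @doc XML;
SET @doc = '<Store><Survey><AnnualRevenue>80000</AnnualRevenue></Survey></Store>';
-- Absolute path: every step from the root is specified
SELECT @doc.exist('/Store/Survey/AnnualRevenue');
-- Relative path with the descendant axis: matches at any depth
SELECT @doc.exist('descendant::AnnualRevenue');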

Using path expressions


You should consider the following guidelines when using path expressions to access XML data:

■ Create a user-defined function to add a computed column to a table. The XML data type supports methods such as the value and query methods that you can use to create XQueries that access the XML data. You must use these methods to specify path expressions that access specific elements and attributes within the XML data. However, you cannot use these methods to create computed columns. To work around this limitation, you should create a user-defined function that contains the path expressions necessary to access the XML data, as shown in the following example:

CREATE FUNCTION Sales.StoreRevenue (@p1 XML)
RETURNS NUMERIC(18,2)
AS
BEGIN
DECLARE @Result NUMERIC(18,2)
SET @Result = @p1.value
('declare namespace sv=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey";
(/sv:StoreSurvey/sv:AnnualRevenue)[1]',
'numeric(18,2)')
RETURN @Result
END;

The function uses the value method to define an XQuery that includes the necessary path expression. You can then use the function to create the computed column:

ALTER TABLE Sales.Store
ADD Revenue AS Sales.StoreRevenue(Demographics);

■ Create a user-defined function to add a constraint to a table. You cannot use the XML methods in a table-level or column-level constraint definition. As a result, you must implement a solution similar to what is shown in the preceding example. Create a function that contains the necessary logic, and then use that function in your constraint definition.
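For example, the following is a minimal sketch of this pattern (the function and constraint names are invented for illustration); it rejects any row whose Demographics value lacks an AnnualRevenue element:

CREATE FUNCTION Sales.HasAnnualRevenue (@p1 XML)
RETURNS BIT
AS
BEGIN
-- exist returns 1 when the element is present
RETURN @p1.exist
('declare namespace sv=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey";
/sv:StoreSurvey/sv:AnnualRevenue')
END;

ALTER TABLE Sales.Store
ADD CONSTRAINT CK_Store_AnnualRevenue
CHECK (Sales.HasAnnualRevenue(Demographics) = 1);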



■ Use XML indexes effectively. You should create a primary index on your XML column if you retrieve data from the column frequently, or if your XML data is large but the retrieved parts are small. You should also consider implementing any of the following secondary XML indexes:

● Value index. Create a value index on an XML column if you must query XML data often but do not know the element or attribute names. For example, if you use wildcards to specify all sub-elements (as in sv:StoreSurvey/*), you should consider a value index.

● Path index. Create a path index on an XML column if your queries use path expressions significantly. A path index can be especially helpful when your queries often contain an XML exist method in the WHERE clause, as shown in the following example:

SELECT CustomerID, Name
FROM Sales.Store
WHERE Demographics.exist
('declare namespace sv=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey";
/sv:StoreSurvey/sv:Specialty[.="Road"]') = 1
ORDER BY CustomerID;


● Property index. Create a property index on an XML column if your queries often use path expressions to retrieve multiple values from individual XML instances. For example, your SELECT clause might include multiple XQuery expressions, such as the query in the following example:

SELECT Name,
Demographics.value
('declare namespace sv=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey";
(/sv:StoreSurvey/sv:AnnualSales)[1] div
(12*(/sv:StoreSurvey/sv:NumberEmployees)[1])',
'numeric(11,2)') AS MonthlyEmpSalesRate,
Demographics.value
('declare namespace sv=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey";
(/sv:StoreSurvey/sv:AnnualRevenue)[1] div
(12*(/sv:StoreSurvey/sv:NumberEmployees)[1])',
'numeric(11,2)') AS MonthlyAnnualRevenue
FROM Sales.Store
WHERE CustomerID = 1;

Notice that the SELECT clause accesses different elements from the Demographics column. If you create many queries of this sort, you should consider configuring this column with a property index.

Note If you create indexes on your XML column, you must first create a primary index and then create any secondary indexes, as the sketch after this note shows.
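A minimal sketch of that ordering on the Demographics column, using index names of my own choosing:

CREATE PRIMARY XML INDEX PXML_Store_Demographics
ON Sales.Store (Demographics);

-- Each secondary index must reference the primary XML index
CREATE XML INDEX IXML_Store_Demographics_Path
ON Sales.Store (Demographics)
USING XML INDEX PXML_Store_Demographics FOR PATH;

CREATE XML INDEX IXML_Store_Demographics_Value
ON Sales.Store (Demographics)
USING XML INDEX PXML_Store_Demographics FOR VALUE;

CREATE XML INDEX IXML_Store_Demographics_Property
ON Sales.Store (Demographics)
USING XML INDEX PXML_Store_Demographics FOR PROPERTY;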


Guidelines for Using XQuery in the WHERE Clause of a DML Statement

Introduction

The XML data type supports several methods that you can use to create XQueries. You can use three of these methods—query, value, and exist—in the WHERE clause of DML statements such as SELECT, DELETE, UPDATE, or INSERT. However, when using XQuery in a WHERE clause, you should take into account issues related to query performance so that you are using XQuery effectively in your DML statements.

Using XQuery in WHERE clauses

You should consider the following guidelines when using XQuery in the WHERE clause of a DML statement:

■ Use the exist method when possible. In some cases, you can rewrite your queries so that the WHERE clause uses the exist method, rather than the query or value method. The exist method can often use indexes more effectively and improve query performance. For example, the following SELECT statement uses the query method in the WHERE clause:

SELECT CustomerID, Name
FROM Sales.Store
WHERE CAST(Demographics.query
('declare namespace sv=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey";
data(/sv:StoreSurvey/sv:AnnualRevenue)') AS VARCHAR(24)) = '80000';

You can achieve the same results by rewriting the statement to use the value method in the WHERE clause, as shown in the following example:

SELECT CustomerID, Name
FROM Sales.Store
WHERE Demographics.value
('declare namespace sv=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey";
data((/sv:StoreSurvey/sv:AnnualRevenue)[1])',
'int') = 80000;

Using the value method does not result in a significant improvement in performance over the query method. However, you can also rewrite the query to use the exist method in the WHERE clause:

SELECT CustomerID, Name
FROM Sales.Store
WHERE Demographics.exist
('declare namespace sv=
"http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/StoreSurvey";
/sv:StoreSurvey/sv:AnnualRevenue[.=80000]') = 1;

Although all three statements produce the same result set, the last statement results in the greatest improvement in query performance.

■ Create a path index on the XML column. Create a path index when your queries often contain an XML exist method in the WHERE clause of your DML statement.

■ Do not create an XML computed column if you must index that column. You cannot create an XML index on a computed XML column. If your WHERE clauses access data from an XML column frequently, or if your XML data is large but the retrieved parts are small, you should create a regular XML column, index it as necessary, and then use that column in your WHERE clause.

Note See the topic "Guidelines for Using Path Expressions" for more information on XML indexes and computed columns.


Lesson 5: Overview of Strategies for Converting Data Between XML and Relational Forms

Lesson objectives

After completing this lesson, students will be able to:

■ Explain the process for creating a data conversion strategy.
■ Apply the guidelines for converting relational data into XML data.
■ Apply the guidelines for converting XML data into a rowset.

Introduction

As XML is implemented across a wider spectrum of applications, the need to convert data between relational and XML data sources becomes increasingly critical to a data management strategy. Not only must you plan how to convert XML data to relational data, but also how to convert relational data to XML data. In this lesson, you will learn how to plan a data conversion strategy and how to apply guidelines for converting relational data into XML data and XML data into relational data.


The Process of Creating a Data Conversion Strategy

Introduction

Today's applications often rely on the ability to convert XML data into a relational format and to convert relational data into an XML format. As part of the application planning process, you must take into account issues related to converting data between relational and XML data sources. This topic describes the process for creating a data conversion strategy for these types of applications.

Creating a data conversion strategy

The process for creating a data conversion strategy entails the following steps:

1. Evaluate the application requirements for the data format. As part of the application planning process, you must determine how the data will be used and presented to the users. Part of that process should include determining the required format of the data. For example, if you will be retrieving data from a Web service, you will be working with XML data. However, if your application also supports online transactions, you will be working with relational data. Your application might also access data from XML documents or need to generate XML documents from data stored in the database. As a result, your first step in planning a data conversion strategy is to ensure that you know the format requirements for all data that will be moving into, out of, and through your application.

2. Determine the most efficient database storage methods. Once you know the types of data that your application will support, you must determine how you will store that data. For example, if you retrieve data from XML documents, you can choose to store those documents in an XML column, or you can convert the data and store it in a relational format. You can also choose to convert relational data into XML data. The decision for how to store the data depends on your application requirements.

The relational model uses a highly structured schema that divides the data from the metadata. This model uses normal forms to minimize redundancies and to ensure the integrity of the data. The relational model also supports efficient transactional processing and frequent data access and modification.

Module 8: Evaluating Advanced Query and XML Techniques

8–55

On the other hand, the XML model supports situations in which you require a more flexible schema. This is because the XML model stores the data and metadata together, allowing for a very adaptable model. XML is self-describing and document-based, unlike the relational model, which is entity-based. As a result, XML handles complete documents in a much simpler manner than the relational model. This, combined with the flexible schema, makes it well suited for quick access to data across heterogeneous platforms, without the overhead of data conversion. XML is platform-independent and portable. The XML model is also useful if you do not know the structure of the data, if the structure will change significantly, or if you want to be able to query or modify only part of the data. However, XML data is not well suited to transaction processing and frequent data access and modification. XML also requires more storage space than relational data and contains a large amount of redundant data. In addition, the XML model does not ensure data integrity as efficiently as the relational model.

Note With regard to the topic of planning data storage, the process of planning XML data storage refers only to storing data within an XML column in a SQL Server database, not to storing data in separate XML document files.

3. Determine data access and modification methods. When planning a data conversion strategy, you must take into account data access and modification and the need for different applications to access the same data. Your data access and modification requirements can determine when and how you convert your data. For example, suppose that your application retrieves XML data from a Web service and you store the information in your database as XML data. Now suppose you use that data in transactions that require relational data. Each time a transaction retrieves the data, it must convert it from XML to relational. However, if you convert the data before you insert it into the database, you reduce the work that your transactions have to perform. In some cases, you might have to choose one format to support multiple applications. For example, if multiple applications access your database and most of them require that the data be in an XML format, the remaining transactional applications will have to convert the data as they retrieve it. However, if performance issues of the transactional applications outweigh the issues of the other applications, you might be forced to store the data as relational data and convert it to XML as the other applications retrieve it. You can also consider storing the data as both relational and XML, but you must be able to do so in a way that ensures that the data is always in sync and is accurate. This strategy is better suited for read-only scenarios than for scenarios in which data is frequently updated. However, in any application, the decision of when to convert data depends on the application requirements and the system supporting the application.

4. Determine whether to convert data on the server side or client side. As part of the process of planning a data conversion strategy, you should consider whether you will convert data on the server side or on the client side. You should take into account issues related to performance, available transport models, and system capabilities. For example, if you convert XML data to relational data on the server side, you need to send less data over the network. However, if a greater consideration is the load on the server, you can send the XML data over the network and allow the client to do the work. In addition, the client might require that the data be sent via the SQL Server Web service, in which case you must convert the data on the client side.


Guidelines for Converting Relational Data into XML Data

Introduction

As part of your application requirements, you might need to convert relational data into an XML format. You can convert the data on the server or on the client. Converting it on the server is a very straightforward process: you simply include the FOR XML clause in your query. Converting data on the client is generally a more complex process, but you usually have more control over the conversion process, and you incur less network overhead. In addition, a client solution often provides greater scalability. However, this topic is concerned with converting relational data on the server. Specifically, it provides guidelines that you should take into account when using the FOR XML clause to convert SQL Server relational data into XML.

Note This topic provides only an introduction to the FOR XML clause. For more details about the clause, see the topic "Constructing XML Using FOR XML" in SQL Server 2005 Books Online.

Converting relational data into XML data

To convert relational data in a SQL Server database, you can add the FOR XML clause to your SELECT statement. You define the clause based on the type of XML results that you want to return. When planning to use the FOR XML clause to convert the data, you should take into account the following guidelines:

■ Use RAW mode to generate one XML element per row. The database engine renders each row returned by the query as an XML element. The following example shows a FOR XML clause that includes the ELEMENTS and ROOT options:

SELECT header.SalesOrderID, header.AccountNumber,
detail.ProductID, detail.OrderQty, detail.UnitPrice
FROM Sales.SalesOrderHeader header
JOIN Sales.SalesOrderDetail detail
ON header.SalesOrderID = detail.SalesOrderID
WHERE header.SalesOrderID < 43662
FOR XML RAW('Order'), ELEMENTS, ROOT('SalesOrders');


The argument to the RAW keyword specifies the name of the primary element in your XML structure. This is the element that defines the structure for each row returned by the query. If you specify the ELEMENTS option, the query returns one subelement for each column in the row. If you do not specify the ELEMENTS option, the query returns one attribute for each column in the row. The ROOT option indicates that the results will use SalesOrders as the root element. The following results show part of the result set returned by the statement:

<SalesOrders>
  <Order>
    <SalesOrderID>43659</SalesOrderID>
    <AccountNumber>10-4020-000676</AccountNumber>
    <ProductID>776</ProductID>
    <OrderQty>1</OrderQty>
    <UnitPrice>2024.9940</UnitPrice>
  </Order>
  <Order>
    <SalesOrderID>43659</SalesOrderID>
    <AccountNumber>10-4020-000676</AccountNumber>
    <ProductID>777</ProductID>
    <OrderQty>3</OrderQty>
    <UnitPrice>2024.9940</UnitPrice>
  </Order>
  ...

The XML includes one element for each row returned by the result set. The root element is <SalesOrders>.

■ Use AUTO mode to nest the elements based on the relational schema. AUTO mode is similar to RAW mode except that in AUTO mode, the database engine uses the table names and column names to render the XML structure. The following example shows the same SELECT statement as in the preceding example, except that it uses the AUTO option:

SELECT header.SalesOrderID, header.AccountNumber,
detail.ProductID, detail.OrderQty, detail.UnitPrice
FROM Sales.SalesOrderHeader header
JOIN Sales.SalesOrderDetail detail
ON header.SalesOrderID = detail.SalesOrderID
WHERE header.SalesOrderID < 43662
FOR XML AUTO, ELEMENTS, ROOT('SalesOrders');

When the database engine renders the query results, it creates an extra level in the hierarchy that takes into account the relationship between the two tables. The following results show part of the result set returned by this statement:

<SalesOrders>
  <header>
    <SalesOrderID>43659</SalesOrderID>
    <AccountNumber>10-4020-000676</AccountNumber>
    <detail>
      <ProductID>776</ProductID>
      <OrderQty>1</OrderQty>
      <UnitPrice>2024.9940</UnitPrice>
    </detail>
    <detail>
      <ProductID>777</ProductID>
      <OrderQty>3</OrderQty>
      <UnitPrice>2024.9940</UnitPrice>
    </detail>
  </header>
  ...


As you can see, the XML structure now shows the <header> and <detail> elements, which are based on the table aliases defined in the SELECT statement. Notice also that the <detail> element includes subelements that reflect the relationship between the SalesOrderHeader and SalesOrderDetail tables.

■ Use EXPLICIT mode to shape the XML data. EXPLICIT mode provides you with more control over how you shape your XML structure. You can map columns to specific attributes and elements and specify how the hierarchy should be created. To use EXPLICIT mode, you must take into account the following guidelines:

● Use separate queries for each level of the XML hierarchy. Use the UNION ALL operator to join the queries. Specify NULL values for columns not used in a level.

● Create two metadata columns—Tag and Parent. The Tag column contains an integer value that identifies the current element. The Parent column contains an integer value that identifies the parent element. Set the initial Tag value to 1. Set the initial Parent value to NULL.

● When specifying column names, use the format ElementName!TagNumber!AttributeName!Directive.

● Use the ORDER BY clause to specify the order of the XML elements.

EXPLICIT mode is the most complex of the FOR XML modes. The following example shows the elements necessary to create a query that uses EXPLICIT mode:

SELECT 1 AS Tag,
NULL AS Parent,
h.SalesOrderID AS [Order!1!OrderID!ID],
h.AccountNumber AS [Order!1!AccountNumber!element],
NULL AS [OrderDetail!2!ProductID!ID],
NULL AS [OrderDetail!2!Quantity!Element],
NULL AS [OrderDetail!2!Price!Element]
FROM Sales.SalesOrderHeader h
WHERE h.SalesOrderID < 43662
UNION ALL
SELECT 2, 1, d.SalesOrderID, NULL, d.ProductID, d.OrderQty, d.UnitPrice
FROM Sales.SalesOrderDetail d
WHERE d.SalesOrderID < 43662
ORDER BY 3, 5
FOR XML EXPLICIT;

The statement includes two queries, one for each level of the XML hierarchy. The first level of the hierarchy is <Order>, and the second level is <OrderDetail>. Notice that the ORDER BY clause sorts the result set by columns 3 and 5 (OrderID and ProductID). This sorting enables you to present the XML data in a logical order. The following result shows a portion of the result set returned by this statement:


<Order OrderID="43659">
  <AccountNumber>10-4020-000676</AccountNumber>
  <OrderDetail ProductID="709">
    <Quantity>6</Quantity>
    <Price>5.7000</Price>
  </OrderDetail>
  <OrderDetail ProductID="711">
    <Quantity>4</Quantity>
    <Price>20.1865</Price>
  </OrderDetail>
  <OrderDetail ProductID="712">
    <Quantity>2</Quantity>
    <Price>5.1865</Price>
  </OrderDetail>
  ...

As you can see, the <Order> element is at the top of the hierarchy, and all other elements are sub-elements of the top-level element. The XML is sorted first by the OrderID values and then by the ProductID values.

■ Use PATH mode to combine elements and attributes. The PATH mode provides the flexibility of the EXPLICIT mode but is easier to use. The PATH mode uses the source columns to create the XML hierarchies, as shown in the following example:

SELECT h.SalesOrderID AS "@OrderID",
h.AccountNumber AS "AccountNumber",
d.ProductID AS "OrderDetail/@ProductID",
d.OrderQty AS "OrderDetail/@Quantity",
d.UnitPrice AS "OrderDetail/@Price"
FROM Sales.SalesOrderHeader h
JOIN Sales.SalesOrderDetail d
ON h.SalesOrderID = d.SalesOrderID
WHERE h.SalesOrderID < 43662
FOR XML PATH('Order'), ROOT('SalesOrders');

You must specify an AS alias for each column. The alias identifies the attribute or element as it should appear in the XML structure. Notice that you separate hierarchy levels with a forward slash (/) and you precede each attribute name with the at (@) symbol. The following results show part of the XML returned by this statement:

<SalesOrders>
  <Order OrderID="43659">
    <AccountNumber>10-4020-000676</AccountNumber>
    <OrderDetail ProductID="776" Quantity="1" Price="2024.9940" />
  </Order>
  <Order OrderID="43659">
    <AccountNumber>10-4020-000676</AccountNumber>
    <OrderDetail ProductID="777" Quantity="3" Price="2024.9940" />
  </Order>
  <Order OrderID="43659">
    <AccountNumber>10-4020-000676</AccountNumber>
    <OrderDetail ProductID="778" Quantity="1" Price="2024.9940" />
  </Order>
  ...

As the results show, the XML hierarchy maps to the column definitions within the SELECT statement.


Guidelines for Converting XML Data to a Rowset

Introduction

As likely as it is that you will need to convert relational data to XML data in your applications, it is just as likely that you will need to convert XML data to relational data. When using versions of SQL Server prior to SQL Server 2005, the primary way that you could convert XML data was by using the OPENXML option in your query. However, now you can also use the XML nodes method to convert data. This topic provides guidelines for using Transact-SQL to convert XML data to relational data.

Converting XML data into a rowset

You should consider the following guidelines when converting XML data into a rowset:

■ Use the OPENXML option to convert XML data. The OPENXML option provides an in-memory rowset view of the XML data that you can query as if it were a table. To use the option, follow these guidelines:

● Before running the SELECT statement that contains the OPENXML option, call the sp_xml_preparedocument stored procedure to parse the XML and create the document handle.

● Specify the OPENXML option in the FROM clause of the SELECT statement. The OPENXML option takes two parameters. Use the document handle returned by the stored procedure for the first parameter. Specify an XPath expression as the second parameter. An XPath expression is similar to an XQuery path expression.

● Include a WITH clause with the SELECT statement. In the WITH clause, specify the column names and their data types for the result set.


The following statements show how to use the OPENXML option to convert XML data into a rowset:

DECLARE @docHandle INT, @xmlDoc XML;
SET @xmlDoc = '
<Orders>
  <Order OrderID="4568">
    <OrderDate>2005-01-25T00:00:00</OrderDate>
    <Item>
      <ProductID>12</ProductID>
      <Price>12.35</Price>
    </Item>
    <Item>
      <ProductID>3</ProductID>
      <Price>28.15</Price>
    </Item>
  </Order>
  <Order OrderID="4569">
    <OrderDate>2005-01-26T00:00:00</OrderDate>
    <Item>
      <ProductID>6</ProductID>
      <Price>5.45</Price>
    </Item>
    <Item>
      <ProductID>8</ProductID>
      <Price>19.99</Price>
    </Item>
  </Order>
</Orders>';
EXEC sp_xml_preparedocument @docHandle OUTPUT, @xmlDoc;
SELECT *
FROM OPENXML(@docHandle, N'//Orders/Order')
WITH (OrderID INT '@OrderID',
OrderDate SMALLDATETIME 'OrderDate');

In this example, the DECLARE and SET statements define the XML text to be converted. The example next calls the sp_xml_preparedocument stored procedure, which uses the XML text for the input parameter and returns the document handle. The SELECT statement that follows uses the document handle and specifies an XPath expression (//Orders/Order). The statement returns the following results:

OrderID     OrderDate
----------- -----------------------
4568        2005-01-25 00:00:00
4569        2005-01-26 00:00:00

(2 row(s) affected)


■ Use the XML nodes method to convert XML data. In addition to using the OPENXML option to convert XML data, you can use the XML nodes method. To use the method, follow these guidelines:

● Use the nodes method within the FROM clause of the SELECT statement. Specify an XQuery expression as an argument to the method. The FROM clause returns an unnamed result set, so you must specifically define an alias that includes a table name and a column name.

● Use the XML value method or query method in the column definitions to expose the XML elements and attributes. Specify an XQuery expression that identifies the source attribute or element as the method's first argument. For the value method, specify a data type as the second argument.

You can filter the result set by using the following methods:

● Add a predicate to the XQuery expression in the nodes method.

● Use the exist method in the WHERE clause.

● Use the CROSS APPLY or OUTER APPLY operators.

The following example, which depends on the same DECLARE and SET statements as in the previous example, uses the nodes method to convert the XML data:

SELECT Doc.Ord.value('@OrderID', 'INT') AS OrderID,
Doc.Ord.value('OrderDate[1]', 'SMALLDATETIME') AS OrderDate
FROM @xmlDoc.nodes('/Orders/Order') AS Doc(Ord);

The statement returns the same results as the previous example. However, you do not need to first call the sp_xml_preparedocument stored procedure.
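When the XML lives in a table column rather than a variable, you can pair the nodes method with the CROSS APPLY operator listed above. The following is a minimal sketch using a table variable and a document of my own devising; the table and alias names are invented for illustration:

DECLARE @Orders TABLE (OrderDoc XML);
INSERT INTO @Orders
VALUES ('<Orders><Order OrderID="4568"><OrderDate>2005-01-25T00:00:00</OrderDate></Order></Orders>');
-- nodes runs once per row; CROSS APPLY joins each row to its shredded nodes
SELECT Shred.Ord.value('@OrderID', 'INT') AS OrderID
FROM @Orders
CROSS APPLY OrderDoc.nodes('/Orders/Order') AS Shred(Ord);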

■ Use a user-defined function to encapsulate conversion logic. You can use user-defined functions to hide the details of the XML parsing. A user-defined function enables you to encapsulate the code to reuse it and to make it easier to maintain.
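For example, the following is a minimal sketch of an inline table-valued function (the function name is invented for illustration) that hides the nodes and value calls behind a relational interface:

CREATE FUNCTION dbo.OrdersFromXml (@doc XML)
RETURNS TABLE
AS
RETURN
(
-- Callers see an ordinary rowset; the XML shredding stays inside the function
SELECT Doc.Ord.value('@OrderID', 'INT') AS OrderID,
Doc.Ord.value('OrderDate[1]', 'SMALLDATETIME') AS OrderDate
FROM @doc.nodes('/Orders/Order') AS Doc(Ord)
);

A caller can then run SELECT * FROM dbo.OrdersFromXml(@xmlDoc); without knowing anything about the shape of the underlying XML.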


Lab: Evaluating Advanced Query and XML Techniques

Objectives



■ Evaluate the use of CTEs.
■ Evaluate the use of pivot queries.
■ Evaluate the use of ranking queries.
■ Evaluate the different ways to convert XML data into relational data.

Introduction

In this lab, you will create several queries based on the information covered in this module. The queries all retrieve data from the AdventureWorks sample database included with SQL Server 2005. You will create Transact-SQL queries that incorporate common table expressions (CTEs), use pivot operators and ranking functions, and convert XML data into a relational format.

Preparation

Ensure that the virtual machine for the computer 2781A-MIA-SQL-08 is running. Also, ensure that you have logged on to the computer. If you need to log on to the computer, use the following credentials:

■ Username: Student
■ Password: Pa$$w0rd


Exercise 1: Evaluating Common Table Expressions

Introduction

In this exercise, you will evaluate and run a rollup query, rewrite the query to include a CTE, run the new query, and then compare the estimated execution plans for both queries.

Evaluating and creating a CTE

Summary

1. Examine the original rollup query.
2. Run the rollup query.
3. Rewrite the query to use a common table expression.
4. Run the new query.
5. Compare the execution plans for the two queries.

Specifications

1. In SQL Server Management Studio, open the SalesRollup.sql file in the E:\LabFiles\Starter folder.
2. Evaluate and run the Transact-SQL code in this file.
3. Open a new query window, create a CTE query to achieve the same results as the original query, and then run the new query.
4. Generate the execution plan for each query, and then compare these plans.

Discussion questions

1. How does the CTE query affect performance compared to the manual rollup query?

The CTE query is substantially faster, up to four times faster than the manual rollup query.

2. What do the execution plans indicate to be the reasons for the differences in performance?

The CTE query requires only a single clustered index scan on the SalesOrderHeader table. The manual rollup query requires four scans.

3. Besides performance, what other factors should you consider when deciding which method to use to create a recursive query?

The CTE query is easier to create and maintain than the manual rollup query. The CTE dynamically determines the number of levels in the hierarchy to return. For the rollup query, you must manually enter each level.


Exercise 2: Evaluating Pivot Queries

Introduction

In this exercise, you will evaluate and run a manual pivot query, rewrite the query as a pivot query, run the new query, and then compare the estimated execution plans for both queries.

Evaluating and creating a pivot query

Summary

1. Evaluate the original manual pivot query.
2. Run the original query.
3. Rewrite the query to use the PIVOT operator.
4. Run the new query.
5. Compare the execution plans for the two queries.

Specifications

1. In SQL Server Management Studio, open the SalesByMonth.sql file in the E:\LabFiles\Starter folder.
2. Evaluate and run the Transact-SQL code in this file.
3. Open a new query window, and then create a query that uses the PIVOT operator to achieve the same results as the original query. Run the new query.
4. Generate the execution plan for each query, and then compare these plans.

Discussion questions

1. How does using the PIVOT operator affect performance compared to the manual pivot query?

The performance of both queries is essentially the same.

2. Besides performance, what other factors should you consider when deciding which method to use to create a pivot query?

The pivot query is much easier to write and maintain.


Exercise 3: Evaluating Ranking Queries

Introduction

In this exercise, you will evaluate and run a paging query, rewrite the query to include a ranking function, run the new query, and then compare the estimated execution plans for both queries.

Important Paging queries can take a substantial time to run. Your instructor will tell you whether you have time to run the query. If you do not have enough time, skip any steps in which you run the query.

Evaluating and creating a ranking query

Summary

1. Evaluate the original paging query.
2. Run the paging query (optional).
3. Rewrite the query to use a ranking function.
4. Run the new query.
5. Compare the estimated execution plans for the two queries.

Specifications

1. In SQL Server Management Studio, open the PagedOrders.sql file in the E:\LabFiles\Starter folder.
2. Evaluate and run the Transact-SQL code in this file.

Note This query can take a long time to execute, so be prepared to cancel it if time runs short.

3. Open a new query window, and then create a query that uses a ranking function to achieve the same results as the original query. Run the new query.
4. Generate the execution plan for each query, and then compare these plans.

Discussion questions

1. How does the ranking query affect performance compared to the paging query?

The ranking query is substantially faster than the paging query.

2. What do the execution plans indicate to be the reasons for the differences in performance?

The ranking query requires only a single clustered index scan on the SalesOrderHeader table. The paging query requires a nested loop that must run millions of times.

3. What alternatives can you use other than a ranking query?

You can also use a server-side cursor or a temporary table; however, a ranking query is substantially faster and simpler to create.


Exercise 4: Evaluating Techniques for Converting XML into Relational Data

Introduction

In this exercise, you will create two queries that convert XML data into a relational format. One query should use the OPENXML option, and the other query should use the XML nodes method, thereby enabling you to compare these two methods for converting data.

Converting XML data into a relational format

Summary

1. Review the XML data.
2. Create an OPENXML query that converts the XML data into a relational format.
3. Use the XML nodes and value methods to create a query that converts the XML data into a relational format.

Specifications

1. In SQL Server Management Studio, open the XMLDocument.sql file in the E:\LabFiles\Starter folder.
2. Examine the XML data assigned to the @XMLDepartments variable in this file.
3. Open a new query window, and then create an OPENXML query that converts the XML into a relational format. Run the new query.
4. Open another new query window, and then use the XML nodes and value methods to create a query that converts the XML data into a relational format. Run the new query.

Important After the discussion, shut down the virtual machine for the computer 2781A-MIA-SQL-08. Do not save the changes.


Course Evaluation

Your evaluation of this workshop will help Microsoft understand the quality of your learning experience. Please work with your training provider to access the workshop evaluation form. Microsoft will keep your answers to this survey private and confidential and will use your responses to improve your future learning experience. Your open and honest feedback is very valuable.

