Web Application Testing: A Methodology with Tools and Considerations
Department of Transportation and Logistics, Chalmers, Göteborg, 2001-02-06

Authors:
Jesper Rydén
Pär Svensson

Tutor at Chalmers:
Ola Hultkrantz

Tutors at Sigma nBiT:
Peter Nielsen
Lars Patriksson

Examiner at Chalmers:
Kenth Lumsden
Preface
It has been a challenge working in a new field, web application testing, and it has been great to see the striking resemblance between the software testing process and the development process of any project. Planning is certainly important in any kind of project. Most projects, at some point, require the hands of several persons to progress. So did this one. We would therefore like to thank the persons who helped us at these points. First we would like to thank our tutors, and project owners, at Sigma nBiT AB in Gothenburg, Sweden, Peter Nielsen and Lars Patriksson, for their help in getting the process started and for sharing their insights throughout the project. We would also like to thank all the people who helped evaluate the Test Priority Sheet at different stages of its development: Jesper Almström, Anders Averö, Magnus Edvardsson, Dan Jonsson, Helena Olsson and Kenny Rubinsson. Finally, we would like to thank our tutor at the Department of Transportation and Logistics, Chalmers, Ola Hultkrantz, for his guidance in the last stages of this Master's Thesis.
Gothenburg, February 2001
Jesper Rydén & Pär Svensson
Sammanfattning (Summary)
This report is a Master's Thesis carried out at the Department of Transportation and Logistics at Chalmers. The assignment was to develop a methodology for testing web applications and to evaluate tools for this purpose. The result of the work is the Test Priority Sheet, a matrix consisting of the areas where a need for testing may exist and the factors we consider to govern the need for testing of web applications. We have grouped these areas to test under different so-called test types. The test type describes where the focus lies when testing these areas, and thereby what should be considered overall when testing web applications, and to a large extent also when testing other software. These test types are:
• Functionality
• Usability
• Server side interface
• Compatibility
• Performance
• Security
The factors we consider to govern the need for testing, and thus where the testing emphasis should be placed, are complexity and aim. The complexity of a web application tells us how it is built, that is, which components and technologies are used. Knowing this, we know what there is to test. Aim is a combination of purpose and target group, what one wants to achieve with the application. There are strong parallels between our factors, complexity and aim, and the factors used in risk analysis: the likelihood of an error and the effect (cost) of an error. We also use our factors in the same way as in risk analysis, i.e. we multiply the factors by each other and obtain a risk, or priority, value. To obtain numerical values, and to highlight what we consider important when assessing the different areas, we have formulated two questions to answer for each area, one for each factor. As for the tools for automated testing, we evaluated two types of tools from two different vendors. These tools are:
• Rational Robot (from Rational Software)
• Rational TestManager
• Astra QuickTest (from Mercury Interactive)
• Astra LoadTest
Robot and QuickTest are tools for functional testing, while TestManager and LoadTest are tools for performance testing. Rational TestManager is in fact a tool that is also used for requirements management and similar tasks, but it now also contains the functionality of the previously stand-alone tool Rational LoadTest. The Astra series from Mercury Interactive is developed specifically for testing web applications, whereas Rational's products can also be used for testing other software. We have evaluated how these tools work and also compared them with each other. Our conclusions can briefly be summarized as follows:
• The user interface must not undergo major changes for the tools to be useful
• An obvious area of use is data-driven testing, where input data can be parameterized
• Using tools for functional testing may cause the tester to concentrate on finding errors in automatically created scripts or in the tool itself, rather than in the application under test
• Training, experience and programming skills are required to get full use out of the tools
• Performance tests should in most cases be automated because of the difficulty of performing them manually.
Abstract
The process of establishing a methodology for web application testing resulted in the Test Priority Sheet. This methodology helps determine the most important areas to test in any web application, and also serves as a tool for prioritizing when time is short. The methodology is kept general so that it is useful for any web application. For this kind of tool to be used, it must be short and easy to use. The Test Priority Sheet is all this. Testing is never really completed. Testing can only show the presence of errors, not the absence of them. Due to the characteristics of the Internet, time is often short, creating a need to prioritize the testing efforts. To be able to do this you need to know which factors set the need for testing. The factors are:
• Complexity
• Aim
Complexity is a factor considering the architecture of a web application and which components it is built from. Aim is a combination of purpose and target group, considering what you mean to achieve with the application. Today, web applications are sometimes categorized based on their interactivity. For testing, this categorization is not useful. Knowing the complexity of a site does not mean that we know what to test. The reason for this is that the similarities between two web applications of different complexity categories are sometimes greater than within a single category. Two applications that look the same might in fact differ greatly in how they are built. Thus, even when combining complexity and aim there is no way to tell beforehand what needs to be tested. Instead, every part of an application needs to be considered separately. All these areas where errors might occur are grouped under the appropriate test type. The test types are:
• Functionality
• Usability
• Server side interface
• Client side compatibility
• Performance
• Security
To consider separately all areas where errors might occur, we apply a risk-based approach, where complexity is the likelihood of an error occurring and aim is the impact, or cost, of an error. For each area under the test types there are two questions to answer, one for each factor. For example, Forms is an area under functionality where the following questions are to be answered:
• Are forms present? To what extent and how advanced?
• How critical are forms for the site and how sensitive are users to problems with forms?
Numerical answers to questions such as these are multiplied by each other, creating a risk-based order of priority when compared to other values established the same way. All this is compiled into a matrix, where the rows consist of the areas to test and the columns of the factors to consider. This is the Test Priority Sheet. Further, tools for automating tests have been evaluated. Two types of tools were evaluated: first, tools for mainly functional tests, and second, tools for performance tests. The tools are used for script creation, script execution and result analysis. Functional testing tools are mainly used for regression tests, since once a test has been made, it can be rerun with minimal manual effort. Performance testing tools are used to evaluate how the tested application reacts to high numbers of simultaneous users. The tool simulates a number of actual users and captures and displays data on the performance of the site. Factors for evaluation were:
• Test functions or possibilities offered by the tools
• Usability of the tools
• Editing of scripts
• Maintainability of scripts
• Error analysis possibilities
• Other functionality (such as interfaces towards other software for enhanced testing)
Having used the tools, the following conclusions were established:
• Since web applications make obvious use of a graphical interface, and the tools depend on it, the GUI should have reached a rather high level of stability before these tools are used
• An obvious use of the tools is when performing data-driven tests. By parameterizing the input to the application, the test effort may be greatly reduced. These tests are of interest even if the GUI is not finished, because of the resources saved
• Straight replay of recorded tests results in a low rate of bug detection. This further implies the need for regression testing or data-driven tests when using these tools
• Using the tools often focuses testers on weaknesses in the tool rather than in the tested application
• The tools are not easily learned. In order to take full advantage of them and create useful and maintainable test cases, experience and training on the tools are needed, as well as good programming skills. The architecture of the tested application should also be well understood
• Performance testing tools are of obvious use, since performing performance tests manually is extremely difficult.
Table of Contents
Preface
Sammanfattning (Summary)
Abstract
Table of Contents
Introduction
  Background
  Goal
  Problem definition
  Method
  Target group
  Delimitations
1 Web Applications
  1.1 The recent evolution of Internet
  1.2 The present state of the web
  1.3 Classification of web sites
  1.4 Web applications
  1.5 User Issues
2 Fundamentals for Testing
  2.1 The Process
    2.1.1 Test planning
    2.1.2 Test case design & implementation
    2.1.3 Test execution & evaluation
    2.1.4 Test phases
    2.1.5 Test types
  2.2 Structuring Test Types
  2.3 Test Approaches
    2.3.1 Walkthroughs and inspections
    2.3.2 White-box testing
    2.3.3 Black-box testing
    2.3.4 Gray-box testing
  2.4 Prioritizing
    2.4.1 Why prioritize?
    2.4.2 Risk-based analysis
  2.5 Challenges
3 Establishing the Need for Testing
  3.1 Factors Affecting the Testing Need
  3.2 Areas to test
4 The Methodology
  4.1 Test Priority Sheet
    Aim
    Usability
    Client side Compatibility
    Performance
    Security
  4.3 Differentiation of Focus
  4.4 When and How
  4.5 Evaluation
    4.5.1 Discussing pros and cons
  4.6 Conclusion
    4.6.1 Discussion
    4.6.2 Where might we go from here?
5 The Possibility of Automating Testing
  5.1 The Possibilities with Tools
  5.2 General Considerations
  5.3 Benefits and Problems
    5.3.1 Benefits
    5.3.2 Problems
  5.4 Automated Testing Tools Evaluation
    5.4.1 What is offered?
    5.4.2 Evaluation of tools
    5.4.3 Conclusion
Appendix A: Test Lists
Appendix B: Test Areas
Appendix C: The Test Priority Sheet
  Complexity
  Aim
  Usability
  Client side Compatibility
  Performance
  Security
  User Manual
    Step-by-Step
    Answer the Questions
    Result
    The questions
Appendix D: Methodology Evaluation Scores
  Complexity
  Aim
  Usability
  Client side Compatibility
  Performance
  Security
References
Introduction
This report is a Master's Thesis at the Department of Transportation and Logistics at Chalmers, Gothenburg. The assignment was given by Sigma nBiT AB in Gothenburg and the work was done at Sigma nBiT AB. The Master's Thesis covers software testing in general and web application testing in particular.
Background
The Internet is today an important competitive tool. In this environment it is an absolute must that your web site performs the way it is supposed to, in order to maintain customer satisfaction. To be confident in one's web application, thorough testing is needed before releasing the application on the web. Sigma nBiT AB has, as one of many areas, software testing as a key area of expertise. The company wishes to broaden its knowledge in the growing area of web application testing, and this Master's Thesis is a step taken in that direction.
Goal
The assignment can be divided into two parts. The goal of the first part was to establish a web application test methodology that would simplify web application testing. The methodology should be general in character and easy to use but still comprehensive, covering all important aspects of web application testing. The methodology should serve as a means for prioritizing where to allocate testing resources during the development of a web application. The goal of the second part was to gain knowledge on automated tools for web application testing. Sigma nBiT wishes to broaden its knowledge of this type of tool as well as build up experience with them within the company. This report is an early step in that direction, evaluating some tools offered on the market. The goal is to recommend if, where and when tools of this type may be used in the test process, as well as which manufacturer offers the best product.
Problem definition
In order to reach the goals, certain sub-problems have been defined. They can be presented as follows:
• Categories of web sites: How do web sites differ in how they are used? Is the site aimed at the general public or internal co-workers?
• Technologies used in modern web site construction: What builds up a web site today? What technologies and techniques are used in web design?
• How does the process of testing web applications differ from testing of traditional software?
• Are tools for automated tests useful when testing web applications? What kinds of tests are the tools useful for, and when throughout the process may they be used?
• Which manufacturer offers the best tools? Do they differ in functionality?
Method
The area of web application testing is relatively new, since web applications in themselves are a relatively new occurrence. Because of this, the number of books written on the subject is limited. Instead, much of the information needed to complete this Master's Thesis was gathered using the medium in question, the Internet. Of course, books were still a big source of information. Besides the search for information in books and on the Internet, some interviews were conducted and a number of persons were involved in evaluating our first steps towards a useful methodology. These evaluations were mainly informative, giving many good insights on the way to the final result. As for the automated testing tools, evaluation copies were obtained when needed. Tools from different vendors were compared by testing the same web applications.
Target group
This report is mainly aimed at personnel in an organization with software testing as an activity. It is intended to serve as background material for testing of web-based applications. The reader of this report is assumed to have basic knowledge of elements connected to information technology and web components. If terms concerning either of these two areas are unfamiliar to the reader, information may be found at www.webopedia.com.
Delimitations
Activities such as test management are not covered in this report. The methodology to be established covers all important areas to test. Under security issues, though, we only consider the general need for security. Security is a large area where much has been written, and it will not play a major part in this report. We will not cover areas connected to the web, such as WAP applications, in this report.
Chapter One: Web Applications
The Internet is still relatively new, which creates a need to discuss certain issues before tackling the aim of the project: the making of a web application test methodology.
1.1 The recent evolution of Internet
The Internet has grown, and is growing, rapidly. Between 1980 and 1994 the growth rate was close to one hundred percent per year, and sometimes even above it. The Internet is still growing fast and today the growth rate is about sixty percent per year. This growth is measured in number of hosts (Robert Zakon, 2000).
Picture 1.1. The growth of the Internet. The graph shows the growth after 01/1995.
Many think of the World Wide Web as the Internet. Today, when you are on the net you are likely to be visiting a WWW site, but the WWW was not released until 1991. The market for any given offering is limited, making it necessary for new companies to have something different or better to offer. Together with the fact that the web now reaches almost all possible target groups, this makes it inevitable that new businesses enter the web. New businesses demand new features. All this adds to the growing complexity of the sites on the web. Not long ago most sites did not offer much interactivity. Today the possibilities for interactivity are endless. The development of new techniques for the Internet makes it easier every year to turn your ideas for your site into reality. Special features run in your browser and live events are broadcast over the world, through the web. The possibility of real-time publishing in many cases sets the pace for web site development. With the growing complexity and demands for rapid deployment, web site development tends to lack testing effort even though the need for it, in fact, increases. The growth of the Internet, together with the increasing number of personal computers in the world, makes for an increase in accessibility, meaning the Internet is now available to more or less everyone. This makes the Internet even more interesting to new companies, which in turn means that it will keep growing.
1.2 The present state of the web
E-commerce is the wave we are riding at the moment. E-commerce creates a need for expanding the web. If it ever was, the web is no longer an isolated market, separate from other businesses. E-commerce shows the need for incorporating the web idea into your overall business idea. To combine fast-paced web solutions with the idea of worldwide sales, which is a natural step, there is a need for effective logistics behind the scenes. Many companies, wanting a piece of the freshly baked cake of the web, have had serious economic difficulties due to logistical problems. They have lacked knowledge of how to run a company and have focused only on the web site. We are seeing many bankruptcies because of overconfidence in the web's ability to carry a business on its own. The Internet is a good platform for spreading your message and for expanding your market, but just being on the net is not enough of a foundation for a successful business. The Internet is also a good platform for simplifying and speeding up business transactions and communications. Within any given company, fast transactions have long been known to be a must for effective logistics and thereby for saving money, and the same is true for inter-company transactions. Client/server relationships have been used for many years for this purpose, but basing them on the web makes it easy to expand on the client side, since this only requires an Internet connection and a browser. The connection between the Internet and logistics will not be further discussed in this report.
1.3 Classification of web sites
When publishing a web site, the construction and design are of course based upon what you hope to achieve with the site. Depending on this, the site may be classified as a certain type of site. There are a number of different types of sites published on the web, and they have been categorized by a number of authors. We have chosen two different classifications that we believe clearly show the different angles from which to view the sites. The first classification is based on the different business purposes of a commercial web site. James Ho (Evaluating the World Wide Web: A Global Study of Commercial Sites, 1997) classifies these purposes of commercial web sites into three categories:
• Promotion of products and services
• Provision of data and information
• Processing of business transactions
Promotion is information about products and services that are part of the company's business, whereas provision is information about, for instance, an environmental care program the company may sponsor. Processing refers to regular business transactions. Although this classification is meant to show the purposes of a single commercial web site, we believe it can also be used to categorize the main purpose of a web site. For instance, a company's on-line catalogue would be a promotional site, a private person's homepage may be considered a provisional site and, of course, a web site for banking services may be considered a site for processing. Another classification is based on the degree of interactivity the web site offers. Thomas A. Powell (1998) classifies web sites into five categories:
• Static Web Sites: The most basic web site. Presentation of HTML documents. The only interactivity offered is the choice of page by clicking links.
• Static with Form-Based Interactivity: Forms are used to collect information from the user, including comments or requests for information. The main purpose is document delivery, with limited emphasis on data collection mechanisms.
• Sites with Dynamic Data Access: The web site is used as a front-end for accessing a database. Via the web page, users can search a catalogue or perform queries on the content of a database. The results of the search are displayed as HTML documents.
• Dynamically Generated Sites: The wish to provide customized pages for every user creates a need to take a step away from the static web site. The page presented may be static, providing no interactivity, but the way it was created has similarities with the way screens are created in software applications.
• Web-Based Software Applications: Web sites that are part of the business process often have more in common with other client/server applications than with static web sites. This could be an inventory tracking system or a sales force automation tool.
This classification is derived from the need for methodology during the development of web sites. The classification is useful also for the testing process, not only regarding the need for methodology but also regarding how extensive the testing must be. For instance, for a static web site the demands may be, besides that the information is correct and up-to-date, that the source code is correct and that the load capacity of the server is great enough, i.e. the server can handle a large enough number of visitors at the same time. There is no need to go much deeper in this case. For the other extreme, Web-Based Software Applications, the requirements are much greater; for instance, security is of great importance. These two classifications are two major ways of showing distinctions between web sites. Together they provide information about interactivity and purpose, which gives us an idea of the site's complexity.
1.4 Web applications
The title of this paper, Web Application Testing, creates a need to define what we mean by web applications. Are we talking only about high-complexity e-commerce web sites? Above we introduced two different authors' classifications of web sites. We find it interesting that Powell (1998) in his definitions does not use the word application until a higher degree of interactivity is offered. Instead he uses the word site for the first, simpler, categories. Regardless of whether this is intended or not, we choose to define a web application as any web-based site or application available on the Internet or on an intranet, whether it be a static promotion site or a highly interactive site for banking services.
1.5 User Issues
When we, ordinary web surfers, use the Internet, what is it that we experience as problems? Which sites make us leave and move on to another? What characteristics should a site have in order to make users want to stay? It is hard, if not impossible, to give a general answer to these questions. What makes it so difficult is the diversity of users. Since visitors to a site may come from all corners of the world, they differ greatly in what they experience as a satisfying site. But regardless of which culture they come from or what kind of site is visited, some things are never appreciated. For example, when a page takes too long to load, many users get impatient and move on to another site or page. The same happens if a site is too difficult to navigate. Overall, users tend not to tolerate certain problems when out surfing the web (Jakob Nielsen, 1998). If we have trouble understanding the layout, or if it takes too much effort to find the information we are seeking, the site is experienced as complex and we will start looking elsewhere for what we seek. Many sites today present animations or other graphical effects, which many users experience as positive. But if you are a visitor searching for specific information, you seldom appreciate waiting in order to obtain what you seek. Today, though, there is almost always an option to skip the feature, which is positive.
Another problem that always irritates when on the web is broken links. We don't think there is anyone with some web-browsing experience who hasn't encountered this. It is an ever-recurring error that will continue to haunt the web for as long as pages are moved or taken off the Internet. These relatively small errors shouldn't be too difficult to remove, and there is therefore no excuse for having broken links on a site for more than a short period of time. As for the dramatically increasing use of Internet banking services, one must feel secure; otherwise no one would want to make transactions over the web. Still today, many Internet users are skeptical towards exposing personal information, which should suggest even higher demands on security.
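The broken-link problem lends itself well to simple automation. The following sketch is our own illustration, not one of the tools discussed later in this report; it assumes plain Python with only the standard library, and the URL used is a placeholder. It fetches a single page, resolves the links found on it and reports those that do not answer with a successful HTTP status.

# Minimal broken-link check (illustrative sketch): fetch one page, collect
# the anchor targets on it and report those that do not answer successfully.
from html.parser import HTMLParser
from urllib.error import HTTPError, URLError
from urllib.parse import urljoin
from urllib.request import Request, urlopen

class LinkCollector(HTMLParser):
    # Collects href values from <a> tags while the page is parsed.
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def broken_links(page_url):
    # Returns (link, problem) pairs for links that fail to respond.
    with urlopen(page_url, timeout=10) as response:
        html = response.read().decode("utf-8", "replace")
    collector = LinkCollector()
    collector.feed(html)
    failures = []
    for href in collector.links:
        target = urljoin(page_url, href)              # resolve relative links
        if not target.startswith(("http://", "https://")):
            continue                                  # skip mailto:, anchors, scripts
        try:
            with urlopen(Request(target, method="HEAD"), timeout=10):
                pass  # a 4xx/5xx answer raises HTTPError and is caught below
        except (HTTPError, URLError) as error:
            failures.append((target, str(error)))
    return failures

if __name__ == "__main__":
    for link, problem in broken_links("http://www.example.com/"):  # placeholder URL
        print("BROKEN:", link, "-", problem)

A real check would crawl the whole site, handle servers that refuse HEAD requests and respect robots.txt, but the principle stays the same.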
Chapter Two: Fundamentals for Testing
Testing is the process of verifying that a product meets all requirements. A test is never complete. When testing software, the goal should never be a product completely free from defects, because that is impossible. The average is 16 faults per 1000 lines of code (Peter Nielsen, 2000) when the programmer has tested his code and believes it to be correct. In a larger project, there may be millions of lines of code, which makes it impossible to find all the faults present. Far too often, products are released on the market with poor quality. Errors are often uncovered by users, and at that stage the cost of removing errors is extensive.
2.1 The Process
In order to make the testing process as effective as possible, it needs to be viewed as one with the development process (Fig 2.1). In many organizations, testing is normally an ad-hoc process performed in the last stage of a project, if performed at all.
Figure 2.1. Testing should be conducted throughout the development process.
Studies show that testing often represents 30-50% of the software development cost (RUP¹). In order to reduce testing costs, a structured and well-defined way of testing needs to be implemented. Certain projects may appear too small to justify extensive testing. However, one should consider the impact of errors and not the size of the project. Important to remember is that, unfortunately, testing only shows the presence of errors, not the absence of them. The figure below (Fig 2.2) gives a general explanation of both the development process and the testing process, but it also shows the relation between these two processes. In real life these two processes, as stated earlier, should be viewed and handled as one. The colors in the figure show relations between phases. For example, system testing of an application checks that it meets the requirements specified at the start of the development. A large part of integration testing is to check the logical design done in the design phase of the development. The relationships between the phases are based partly on the V-model as presented by Mark Fewster and Dorothy Graham (Software Test Automation, 1999).
¹ Rational Unified Process is a software engineering process developed by Rational. It provides an approach on how to assign tasks and responsibilities in the development process.
Figure 2.2. Relations between test phases and development phases
As mentioned above, a well-defined and understood way of testing is essential to make the process of testing as effective as possible. In order to produce software products with high quality, one has to view the testing process as a planned and systematic way of performing activities. The activities included are Test Management, Test Planning, Test Case Design & Implementation and Test Execution & Evaluation. Test management will not be further discussed in this report.
2.1.1 Test planning
As for test planning, the purpose is to plan and organize resources and information as well as to describe the objectives, scope and focus of the test effort. The first step is to identify and gather the requirements for the test. In order for the requirements to be of use for the test, they need to be verifiable or measurable. Within test planning, an important part is risk analysis. When assigning a certain risk factor, one must examine the likelihood of errors occurring, the effect of the errors and the cost caused by the errors. To make the analysis as exhaustive as possible, each requirement should be reviewed. The purpose of the risk analysis is to identify what needs to be prioritized when performing the test. Risk analysis will be further discussed later in this report. In order to create a complete test plan, resources need to be identified and allocated. Resources include:
• Human – Who and how many are needed
• Knowledge – What skills are needed
• Time – How much time needs to be set aside
• Economic – What is the estimated cost
• Tools – What kind of tools are needed (hardware, software, platforms, etc.)
A good test plan should also include stop criteria. These can be very intricate to define, since the actual quality of the software is difficult to determine. Some common criteria used are (Rick Hower, 2000):
• Deadlines (release deadlines, testing deadlines, etc.)
• Test cases completed with a certain percentage passed
• Test budget depleted
• Coverage of requirements reaches a specified point
• Bug rate falls below a certain level
• Beta or alpha testing period ends
Since it is very rare that every error in today's complex software is found, one can go on testing forever if stop criteria aren't used. Specific criteria should therefore be defined for each separate test case in the process. The outcome of test planning should of course be the test plan, which will function as the backbone, providing the strategies to be used throughout the test process.
2.1.2 Test case design & implementation
The main objective of test case design is to identify the different test cases or scenarios for every software build. These cases shall describe the test procedures and how to perform the test in order to reach the goal of each case. For each test case, the particular conditions for that test shall be described, as well as the specific data needed and the expected result. If testing has been done on the subject before, reusing old cases becomes important. The design of test cases is based on what is to be tested. Features to be tested often present unique needs, and the testing should be done in small sections to cope with the resulting differences in test case design. When testing a single feature there are a number of things to consider: how does it work, what may cause it to crash, what are the possible variables? Both data input and user actions should be done in ways that exercise the designed logic so that we get answers to the questions: do we get the expected answer, and what happens when wrong input is used? If, for instance, you are prompted to write your age in a field, the logic behind this may expect a number between 0 and 100, but of course you might by mistake put a letter instead. What happens then? If the application is not designed to catch this mistake, it will crash or at least give an unexpected answer. This makes it important to test this feature so that it does not accept the wrong input but still, of course, accepts the expected input. Considering the many ways to make similar mistakes, there are thousands of different inputs that could be tested. It is not possible to test all these potential inputs; instead one should choose inputs that represent all possible groups of input well. This procedure is often referred to as boundary testing; a small example is sketched after the list below. For the results of the test to be useful, and to know when stop criteria are reached, the completeness of the test cases is very important. When creating tests based on the test cases, certain objectives should be addressed:
• Make your tests as reusable as possible
• Make your tests easy to maintain
• Use existing tests when possible
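As a minimal illustration of the age-field case discussed above, the sketch below (our own example, not taken from any of the referenced authors) expresses representative and boundary inputs as test cases. Here validate_age is a hypothetical stand-in for the application's own input check, assumed to accept whole numbers from 0 to 100.

# Boundary-oriented test cases for a hypothetical age field.
import unittest

def validate_age(value):
    # Hypothetical stand-in for the application's input check:
    # True for a valid age (0-100), False for anything else.
    try:
        age = int(value)
    except (TypeError, ValueError):
        return False
    return 0 <= age <= 100

class AgeFieldTest(unittest.TestCase):
    def test_accepts_values_inside_and_on_the_boundaries(self):
        for value in ("0", "1", "50", "99", "100"):
            self.assertTrue(validate_age(value), "%r should be accepted" % value)
    def test_rejects_values_outside_the_boundaries(self):
        for value in ("-1", "101", "1000"):
            self.assertFalse(validate_age(value), "%r should be rejected" % value)
    def test_rejects_non_numeric_input(self):
        # The mistake described in the text: a letter instead of a number.
        for value in ("a", "", "12x", None):
            self.assertFalse(validate_age(value), "%r should be rejected" % value)

if __name__ == "__main__":
    unittest.main()

Each group of inputs stands in for a whole class of possible values, which keeps the number of cases manageable.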
2.1.3 Test execution & evaluation
When running your tests, the results have to be taken care of in a defined manner. Was the test completed or did it halt? Are the results of the test the expected ones, and how can it be verified that the results originate from a correct run of a test? When the actual results from a test do not match the expected ones, certain actions have to be taken. The first is to determine why the actual and the expected results differ. Does the error lie in the tested application or, for example, in the test script? When errors are found, they need to be properly reported. Information on the bug needs to be communicated so that developers and programmers can solve the problem. These reports should include, among other things, application name and version, test date, tester's name and a description of the error that occurred. When a bug has been properly taken care of, the software needs to be re-tested to verify that the problems have been solved and that the corrections made did not create new conflicts. During the process of development and testing of software, many changes will surely be made to the software as well as its environments. When such changes are made, there will be a need to assure that the program still functions as required. This kind of testing is called regression testing. The difference between a re-test and a regression test is that the latter is done when changes have been made to the program regarding, for example, its functionality, whereas a re-test is done to test the software after bug fixing.
2.1.4 Test phases
As the testing process should be viewed as parallel with the development, it will go through certain phases. When the development moves forward, the scope and targets for testing change, starting with each separate piece and ending with the complete system. The goal is that every part is tested and fully functioning before being integrated with other parts. Important in the development process is that none of these phases is completely separated from the others. There is no definite border for when the different phases start or end. They can be seen as an overall, approximate guideline for how to perform a successful test. Several authors, including Bill Hetzel and Hans Schaefer, describe the test process as consisting of the following phases:
• Unit testing
• Integration testing
• System testing
• Acceptance testing
Unit testing: Also called module test. The testing done at this stage is on the isolated unit.
Integration testing: When units interact with others, one must assure that the communication between them works. Conflicts often occur when units are developed separately or if the syntax to be used is not communicated sufficiently.
System testing: When the system is complete, testing of the system as a whole can commence. Test cases with actual user behaviour can be implemented, and non-functional tests, such as usability and performance, may be made.
Acceptance testing: The purpose is to let end users or customers decide whether to accept the system or not. Do the users feel comfortable with the product and does it perform the activities as required?
2.1.5 Test types
The previous text describes the general guidelines for testing, whether of software applications or web applications. But as the title of this report implies, the scope is centred on how to perform successful testing of web applications and how this process differs from the general test process.
In order to proceed to this area, certain issues have to be addressed first. One must be acquainted with the different types of tests that are performed within the different stages throughout the process. The text that follows describes, in short, the most commonly used test types, with focus on the medium of the web. There are no definite borders between the types, and several of them can seem to overlap with adjacent areas. Needless to say, there are several opinions on this matter, and we base the following descriptions on authors such as Hans Schaefer, Bill Hetzel, Tim Van Tongeren and Hung Q. Nguyen.
Functionality testing: The purpose of this type of test is to ensure that every function works according to the specifications. Functions apply to a complete system as well as to a separate unit. Within the context of the web, functionality testing can, for example, include testing links, forms, Java applets or ActiveX applications.
Performance testing: To ensure that the system has the requested capability, its performance has to be tested. The characteristics normally measured are execution time, response time, etc. In order to identify bottlenecks, the system or application has to be tested under various conditions. Varying the number of users and what the users are doing helps identify weak areas that are not exposed during normal use. When testing applications for the web, this kind of testing becomes very important. Since the users are normally not known and the number of users can vary dramatically, web applications have to be tested thoroughly. The general way of performance testing should not vary, but the importance of this kind of test varies (Schaefer, 2000). Testing web applications for extreme conditions is done by load and stress testing (a small sketch of the idea follows at the end of this section). These two are performed to ensure that the application can withstand, for example, a large number of simultaneous users or large amounts of data from each user. Other characteristics of the web that are important can be download time, network speed, etc.
Usability testing: To ensure that the product will be accepted on the market, it has to appeal to users. There are several ways to measure usability and user response. For the web, this is often very important due to the users' low acceptance level for glitches in the interface or navigation. Due to the nearly complete lack of standards for web layout, this area depends on actual usage of the site to obtain information that is as useful as possible. Microsoft has extensive standards down to pixel level on where, for example, buttons are to be placed when designing programs for Windows. The situation on the web is very different, with almost no standards at all on how a site layout should be designed.
Compatibility testing: This refers to different settings or configurations of, for example, the client machine, the server or external databases. When looking at the web, this can be a very intricate area to test due to the total lack of control over the client machine configuration or an external database. Will your site be compatible with different browser versions, operating systems or external interfaces? Testing every combination is normally not possible, so the usual approach is to identify the most likely combinations.
Security testing: In order to persuade customers to use Internet banking services or shop over the web, security must be high enough. One must feel safe when posting personal information on a site in order to use it. Typical areas to test are directory setup, SSL, logins, firewalls and log files. Security is an area of great importance as well as great extent, not least for the web. A lot of literature has been written on this subject and more will come. Due to the complexity and size of this particular subject, we will not cover this area beyond the basic features and where one should put in extra effort.
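The sketch below illustrates the load-testing idea mentioned under performance testing above. It is our own toy example, not how Astra LoadTest or Rational TestManager work; it assumes plain Python with only the standard library, and the URL and user counts are placeholders.

# Toy load test: a number of simulated users request the same URL at the
# same time and the response times are summarized.
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

def one_visit(url):
    # One simulated user: fetch the page and return the response time in seconds.
    start = time.perf_counter()
    with urlopen(url, timeout=30) as response:
        response.read()
    return time.perf_counter() - start

def load_test(url, simulated_users):
    with ThreadPoolExecutor(max_workers=simulated_users) as pool:
        times = sorted(pool.map(one_visit, [url] * simulated_users))
    print("%d simultaneous users against %s" % (simulated_users, url))
    print("  fastest %.2f s, median %.2f s, slowest %.2f s"
          % (times[0], times[len(times) // 2], times[-1]))

if __name__ == "__main__":
    # Increase the user count step by step to see where response times degrade.
    for users in (1, 10, 50):
        load_test("http://www.example.com/", users)

A dedicated tool adds what this sketch lacks: realistic user scenarios, ramp-up schedules and measurements taken on the server side as well.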
2.2 Structuring Test Types
As mentioned before, the types of tests can in several cases be included in others, depending on what kind of web application is to be tested. For example, security can be included in functionality, and load/stress testing can be separated from performance, based on the scope of the test and the complexity of the site under test. In order to present a structured way to test your web applications, one must relate to testing as a process, as described earlier. The actual performing of a test, as in what to test at what stage of the test process, creates a need for dividing the different areas to test into groups. Due to the comparatively early stage of the history of the web, there are few accepted standards describing how to do this separation of areas to test. When searching for information regarding testing of web applications, large amounts of information are found, written by diverse knowledgeable authors describing their approach to how to perform complete tests. We have chosen four of them that we believe represent different opinions on how, when and what to test (see Appendix A). When comparing Tim Van Tongeren's (1998) way of structuring test types with Vincent Soberano's (1998) views on the subject, one notices the similarities between the two authors' ways to classify and exemplify different types of tests and what is included within each type. The most interesting divergence is in the way they present interface issues. Soberano preferred to combine Functionality and Structural under the main heading User Interface, whereas Van Tongeren presents User Interface separately from Functionality. Whether one is more useful than the other is difficult to determine, but they both represent interesting approaches to how different areas relate to one another. Thomas A. Powell presents a somewhat different approach to this matter. His grouping of tests is interesting, though he lists unit and integration testing under functionality tests. One wonders how familiar Powell is with traditional test methods, but he still offers an interesting view of when, during the phases of testing, different types are to be produced and executed. Hung Q. Nguyen (2001) shares several similarities with foremost Van Tongeren and Soberano, mainly in the way he presents the main areas to test. One obvious difference is that Nguyen emphasizes Help and Installation testing as two areas that call for separate attention. Which author has the best approach is impossible to say. They all share the same backbone but present different ways to pinpoint vital areas when testing a web site.
Since the area of web testing is still in its cradle, there are few accepted standards and definitions to lean on. This often makes it necessary for the process of testing web applications to be defined anew for every test process. Kathleen A. Iberle (www.stickyminds.com, Step-by-Step Test Design, 2000) holds that a unique test type list has been created for each major type of product she has worked on, and they have all been slightly different. The conclusion is that which test types are more important than others, and to what degree prioritization between types has to be done, will differ from test to test.
2.3 Test Approaches
Test approaches, or test techniques used, will differ throughout the process. Depending on the stage of the test process, the type of test, etc., the scope and manner of testing will change.
2.3.1 Walkthroughs and inspections
Not all test approaches, or techniques, are supposed to actually test an application. Walkthroughs and inspections are techniques that are used to decrease the risk of future errors occurring. The purpose is to find problems and see what is missing. A walkthrough is a review process in which a designer leads one or more persons through a segment of design or code he or she has written. An inspection is a formal evaluation technique involving detailed examination by a person or group other than the author to detect faults and problems (Hetzel, 1988).
2.3.2 White-box testing
White-box testing (Fig 2.3) focuses on finding faults in the code and architecture of the application. The code and architecture are known to the tester, and the tests should be designed and executed in a manner that guarantees full coverage, even though some areas are believed to be less important or less executed during actual running of the application. There are tools available that monitor the coverage of tests so that one knows to what extent the code has been exercised.
Figure 2.3. White-box testing. The code and architecture are known.
2.3.3 Black-box testing
Black-box testing (Fig 2.4), on the other hand, may be done without knowing how things are done, instead concentrating on what should be done. The approach is often used during functionality testing. Cases are based on the specifications and requirements of the application or function to be tested. Valid and invalid inputs are tested, and the actual outcome is compared to the expected outcome based on the requirements.
Figure 2.4. Black-box testing. Code and architecture unknown; valid input X and expected outcome Y(X) are known.
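As a small illustration of the black-box idea, the sketch below (our own example) drives a function purely from (input, expected outcome) pairs taken from an assumed specification; shipping_class and its rules are hypothetical and stand in for whatever function or page is actually under test.

# Black-box style: cases are (input, expected outcome) pairs from the
# specification; nothing about the implementation is assumed by the test.
import unittest

def shipping_class(total):
    # Hypothetical function under test. Assumed specification:
    #   total < 0 -> error,  0 <= total < 500 -> "standard",  total >= 500 -> "free"
    if total < 0:
        raise ValueError("negative order total")
    return "free" if total >= 500 else "standard"

class ShippingClassBlackBoxTest(unittest.TestCase):
    CASES = [(0, "standard"), (499, "standard"), (500, "free"), (10000, "free")]
    def test_valid_inputs_give_the_specified_outcome(self):
        for total, expected in self.CASES:
            self.assertEqual(shipping_class(total), expected, "total=%d" % total)
    def test_invalid_input_is_rejected(self):
        with self.assertRaises(ValueError):
            shipping_class(-1)

if __name__ == "__main__":
    unittest.main()

The same table of cases can later be reused for regression testing or fed to a data-driven test tool.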
The white-box approach is to a higher extent used early in the development and testing process, before there are any visible functions to test, while the black-box approach is used later, when functions are visible and can be tested.
2.3.4 Gray-box testing
To completely test a web application one needs to combine the two approaches, white-box and black-box testing. The gray-box testing approach takes into account all components making up a complete web application, including the environment on which it resides. Hung Q. Nguyen (2001) states that gray-box testing is integral to web application testing because of the numerous components, both software and hardware, that make up the complete application. The factors considered in gray-box testing are high-level design, environment and interoperability conditions.
2.4 Prioritizing 2.4.1 Why prioritize? Earlier we stated that all software released contains errors to some extent and that a testing process in reality is never finished. This in turn tells us that there is never enough time for complete testing and because of this there is a need to prioritize your efforts. The characteristics of web applications create a need for published information to always be up-to-date, with real-time publishing as an extreme. Short time-to-market further emphasize this need to prioritize. 2.4.2 Risk-based analysis We now understand the need of prioritizing but we have yet to discuss how this should be done. We have found several examples of what to consider when prioritizing that all can be summed up in these guidelines found at Microsoft Accessibility, Technology for Everyone (Microsoft, 2000):
• Prioritize testing features that are necessary parts of the product.
• Prioritize testing features that affect the largest number of users.
• Prioritize testing features that are chosen frequently by users.
What these features are differs from application to application, and they are not always obvious. Considering the application's purpose might help decide the important parts of the site. Earlier we introduced purposes of web sites that we had derived from Ho's (1997) business purposes. These purposes present different needs for prioritizing. A site for business transactions, for instance an Internet banking service, has security requirements that must be fulfilled for users to feel confident in the application, or they will not use it. A promotional site, on the other hand, has no apparent need of high security in that sense. This can be translated into assessing the significance of a specific function, or the importance of a function not failing, which leads us to risk-based analysis, where some ideas come from James Bach (2000). Whenever we make decisions there is something working in the background considering things that might go wrong and the effects they might have. This is also the basis of risk-based analysis. Risk-based analysis is a way of determining the order of priority among all possible errors that might occur. It takes into account the two factors mentioned above:
• The likelihood of an error occurring (L)
• The cost of an error (C)
These two factors are given numeric values and are multiplied with each other, creating a risk value: R = L * C
(Schaefer, 2000)
Fig 2.5. Risk-based analysis. The axes are the likelihood of an error (L) and the cost of an error (C); together they determine the risk.
The higher the value, the higher the risk, and the higher the priority. Based on this, further test actions can be planned.
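As a small, purely hypothetical example of how the risk value is used: suppose a login form is judged to have likelihood L = 2 and cost C = 3, while a news listing has L = 2 and C = 1. The form then gets the risk value R = 2 * 3 = 6 and the listing R = 2 * 1 = 2, so the form is tested first. The numbers are ours, chosen only to illustrate the multiplication.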
2.5 Challenges
Web applications share many similarities with traditional client/server applications, but there are also a number of differences that create new problems when it comes to testing. One example is that web application developers often use a great number of different techniques to create the features on their web sites. The mix of techniques used consists of HTML, ASP, Java, JavaScript, ActiveX and others. Also creating problems is the wide range of user-side configurations. The web application must work on many different combinations of hardware configurations, operating systems and browsers. PC, Mac, OS/2, Windows NT, Windows 95/98, Internet Explorer, Netscape Navigator and more make for a great number of possible combinations. For a traditional client/server application the number of simultaneous users can often be predicted. For web applications this is very difficult. It is hard to know the number of hits per day the site might get, as well as the variation over the day. These are some of the challenges you encounter in web application development and testing. Another characteristic of web application testing is the difficulty of defect tracking. The many interacting layers can all be responsible for an error, or a symptom of an error, occurring. Hung Q. Nguyen (2000) presents what he considers to be five fundamental considerations:
• When we see an error on the client side, we are seeing the symptom of an error, not the error itself.
• Errors may be environment-dependent and may not appear in different environments.
• Errors may be in the code or in the configuration.
• Errors may reside in any of several layers (Client/Server/Network).
• Examining the two classes of operating environments, static versus dynamic, demands different approaches.
Here the static operating environment refers to configuration and compatibility variables, while the dynamic environment refers to resource and time-related errors.
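To give a feel for how quickly the number of client-side combinations mentioned above grows, the sketch below enumerates a small, hypothetical compatibility matrix. The platform, operating-system, browser and connection lists are illustrative assumptions only, not a recommended test set.

```python
from itertools import product

# Hypothetical client-side variables (illustrative only)
operating_systems = ["Windows 95/98", "Windows NT", "Mac OS", "OS/2"]
browsers = ["Internet Explorer 4", "Internet Explorer 5", "Netscape Navigator 4"]
connection_speeds = ["28.8 kbit/s modem", "ISDN", "LAN"]

# Every combination is, in principle, a configuration the application must work on.
configurations = list(product(operating_systems, browsers, connection_speeds))
print(len(configurations), "client configurations to consider")  # 4 * 3 * 3 = 36
```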
_Chapter Three___ Establishing the Need for Testing
The goal was to achieve an easy-to-follow methodology with quantifiable measures, for any tester to follow when testing any type of web site or application. The methodology should be:
• Comprehensive
• Short
• Easy to use
The primary goal of the methodology is to:
Show the most important areas to test
The strategy for reaching this goal was to study and establish the factors that set the testing need for a site. We have also studied what areas there are to test in web applications. We have previously presented different authors' views on this area, and in this chapter we will present our own views on the subject (see 3.2).
3.1 Factors Affecting the Testing Need
Within the process of software development, the element of testing has become an area where more and more resources are being spent. But time is often short, creating a need to prioritize. To be able to do this prioritization, there is a need to identify the factors that constitute the foundation for what to test. The differences between web sites call for differences in testing each site. While a site for e-commerce or banking services has evident demands for security, a site for advertising has much less. Therefore the testing process will take a different approach depending on the site, and of course on the amount of resources (in time, human and monetary terms) allocated to the project. When Powell (1998) describes the development of web sites, he divides them into five groups based on their interactivity, as mentioned in chapter 1.1.3:
• Static Web Sites
• Static with Form-Based Interactivity
• Sites with Dynamic Data Access
• Dynamically Generated Sites
• Web-Based Software Applications
With this list he describes the differences in how web sites are constructed and also the differences in development strategies. With interactivity he points out differences in complexity and in the way the sites are built. Based on the differences in complexity, the test process will most certainly differ from site to site. Depending on the technology used and differences in how features are created, divergences in the test approach will be evident. It is therefore clear that the complexity of the site, in how it is built and to what extent different technologies are used, is a main factor in determining the test approach and in prioritizing urgent areas. Noticeable are the similarities between Powell's five categories and the three purposes we derived from Ho's (1997) classification of web sites, or rather of their purposes:
• Promotion of products and services
• Provision of data and information
• Processing of business transactions
Interesting here is that Ho points to the differences in the purpose of a site and to how one creates value with web sites as the tool. When considering risk-based testing as described in chapter 2.4, one can see the connection between the value created, based on the business purpose, and the seriousness of an error that might occur. If the site is, for example, an online banking site, the value created may be reduced personnel costs as well as making it easier for customers to use the bank's services, thereby attracting more customers. The connection between the effect of an error and the value created based on the purpose now becomes evident. We have previously discussed what to consider when prioritizing. In that chapter (2.4) we did not mention interactivity, or complexity, which now seems to be an important factor when considering what to test. However, the complexity factor tells us more about what specific components there are to test, while purpose helps us decide in what order they should be tested. We have, it would seem, established a base to help decide what to test in a specific web application. However, purpose is not the sole factor when prioritizing. Target group is also of importance. Who is it that you want to view your web site? If your site is meant to be a place where you write down funny stories for your friends to read, there might not be a great need to test certain things or to test thoroughly. If, on the other hand, your web application is meant to help your company conquer the world, there is no room for poor performance, broken links or usability problems. Viewing possible target groups and their characteristics, we find several groups that differ in their needs and demands on the application in a way that gives us incentive to use them for determining the testing need. One example of a possible categorization of target groups is presented below. The categories are:
• Internal: for instance an intranet, within the responsible company.
• Public: possible future customers.
• Customers: already acquired customers.
Together, purpose and target group give a good view of the importance of certain features and of the effect of an error in any of these features. At one point, after reading Powell's categorization of web sites, the idea was to use a similar categorization, combining interactivity with other factors, to create a small number of categories. The idea at this point was to predefine a small number of test approaches, depending on the category, where we would exclude areas unnecessary to test from the complete list we will present in chapter 3.2. We found that this cannot be done since, whatever the degree of interactivity and whatever purpose one might have, the way a site is built, and of what components, differs greatly. Thus, we cannot predefine what one should test based strictly on interactivity, purpose and target group. This did not affect the primary goal. Depending on the factors purpose (P) and target group (T), the occurrence of an error has different impact. A natural step is therefore to view the combination of them as the varying cost of an error (C). What we get is one of the factors of risk-based analysis. The more complex (Comp) an application is, the more different features there are, making it more likely that an error will occur. This reveals the other factor of risk-based analysis, the likelihood of an error occurring (L). In compliance with the theory of risk-based analysis discussed in chapter 2.4.2, we get the following formula for the risk value: R = L(Comp) * C(P,T). The risk value is in fact a value describing the testing need. The higher the risk, the higher the need for testing (see fig. 2.5).
3.2 Areas to test
Presented in chapter 2.2 are some different ways of structuring the test types that are of importance when testing web applications. To be able to use a methodology for any given web test, it has to present a complete and structured list of possible problem areas, whatever the site might be. To accomplish this we have seen a need to produce a compilation of areas to test that covers a broad spectrum. Based on interviews and different authors' opinions, we have examined different approaches to how this compilation of tests should be arranged to be as complete as possible but still practical to use. Six main areas were identified. Based mainly on the authors mentioned in chapter 2.2, where different ways to approach the test types were presented, the list below is a representation of how we believe important areas should be grouped to make the test types as comprehensive as possible. The groups also illustrate that the focus of the test differs from area to area, as does the test approach. The list is supposed to supply the tester with an instrument to identify possible problem areas.
Functionality
1. Links
2. Forms
3. Cookies
4. Web Indexing
5. Dynamic Interface Components
6. Programming Language
7. Databases
Usability
1. Navigation
2. Graphics
3. Content
4. General Appearance
Server side Interface
1. Server Interface
2. External Interface
Client side Compatibility
1. Platform, OS
2. Browsers
3. Settings, Preferences
4. Printers
Performance
1. Connection speed
2. Load
3. Stress
4. Continuous use
Security
1. General Security
All important issues under each test type are addressed in Appendix B. When comparing the two versions by Hung Q. Nguyen and Tim Van Tongeren, we notice that they both separate User Interface from Functionality, whereas Soberano lists Functionality together with Structural Design as headings under User Interface. Noticeable here is that Soberano has separated the functional aspect of the user interface from the structural design, i.e. separating usability from functionality. When comparing these two alternatives, we believe that the latter better fits our needs and that it is more applicable to the web, based on how it is to be used. The reason is the way the interface towards the user is made with the web as a medium. There are very few standards for how a web site should be designed in order to make users experience the site as user friendly and comprehensible. Web sites encountered often show poor consideration for how users respond and act on the web. We therefore believe that usability issues should be addressed separately, to ensure that both the functionality and usability aspects are covered when testing web sites. We have further come to the conclusion to separate user-side from server-side configuration issues by addressing Server side Interface issues separately from Client side Compatibility. When studying the authors' opinions in this matter, we have used a definition closely related to both Van Tongeren's and Soberano's definitions. Nguyen chooses to combine client and server issues but presents database issues separately, whereas we list the functional aspect of database issues as a part of Functionality and configuration issues under Server side Interface. We have chosen Nguyen's approach of including load and stress testing in Performance testing, and it seems a logical way to reason, since these types of tests actually measure the performance of the system. Other performance issues are also discussed under this heading, such as connection speed and continuous use. Security issues will not be addressed at a more detailed level than General Security, since the area is of such great extent that it would need a report of its own to be satisfactorily covered.
_Chapter Four__ The Methodology
The discussion of why we do what we do throughout the process of establishing the sought methodology, along with other questions, such as at what point in a web application development process the methodology can be used, is presented throughout this chapter. In Appendix C there is a manual on how to use the Test Priority Sheet. Along with the manual are the questions that are to be answered in the matrix. First, though, we briefly present the result of our efforts.
4.1 Test Priority Sheet
Test area                        Complexity (0-3)   Aim (1-3)   Testing Need
Functionality
  Links
  Forms
  Cookies
  Web Indexing
  Programming Language
  Dynamic Interface Components
  Databases
Usability
  Navigation
  Graphics
  Content
  General Appearance
Server side Interface
  Server Interface
  External Interface
Client side Compatibility
  Platform
  Browsers
  Settings, Preferences
  Printers
Performance
  Connection speed
  Load
  Stress
  Continuous use
Security
  General Security
Table 4.1. The Test Priority Sheet.
The process of establishing a purposeful and easy-to-use methodology resulted in a matrix, where the rows consist of the areas to test and the columns of the factors to consider. The methodology employs the risk-based approach presented at the end of chapter 3.1. Two of the factors established in that chapter, purpose and target group, are combined into a new factor, Aim. For each area there are two questions to answer, one under each factor. Giving numerical answers to these questions and multiplying them creates a risk value, or a testing need value. The values may then be compared to each other; the highest values indicate the highest testing need and should therefore be prioritized. Besides the testing need values, using the matrix, called the Test Priority Sheet, gives you a good idea of the overall complexity of your application and of the testing effort needed throughout the development.
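As a rough illustration of the arithmetic behind the sheet, the sketch below computes and ranks testing need values from Complexity (0-3) and Aim (1-3) scores. The area names follow the sheet, but the scores and the Python code are our own hypothetical example, not part of the methodology itself.

```python
# Hypothetical Complexity (0-3) and Aim (1-3) scores for a fictitious application.
scores = {
    "Links":            (2, 3),
    "Forms":            (3, 3),
    "Cookies":          (1, 2),
    "Navigation":       (2, 3),
    "Load":             (3, 2),
    "General Security": (1, 3),
}

# Testing need = Complexity * Aim; a Complexity of 0 removes the area entirely.
testing_need = {
    area: complexity * aim
    for area, (complexity, aim) in scores.items()
    if complexity > 0
}

# Highest values first: these areas should be tested earliest.
for area, need in sorted(testing_need.items(), key=lambda item: item[1], reverse=True):
    print(f"{area:<20} {need}")
```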
When evaluating the results from the methodology it is important to bear in mind that the prioritization recommended by the matrix should be a guideline and a way to identify extreme values in either direction. Areas with high values should be tested as early as possible, while low-value test areas may occasionally not be tested at all. Mid-range values may be difficult to distinguish from each other and may therefore not always be prioritized in a strict order.
4.2 Handling the Factors
The factors established in chapter 3.1 represent the main foundation for prioritizing test areas in the test process. Naturally, prioritizing also depends on the resources available, but this mostly concerns to what extent prioritizing needs to be done. The question to answer at this point is: how may these three factors, Complexity, Purpose and Target Group, be used to determine what to test? As certain test areas might be overlooked if one of a small number of predefined methodologies were used, we decided to use the three factors not to categorize web sites, but to determine the need for certain test areas with a specific web application in mind. Consider a web application, any web application, but only one.
• To what extent does it depend on links, forms or cookies?
• Is the content valid according to the purpose of the site?
• Does the site contain features sensitive to slow connection speeds?
• How will the target group react to long download times?
These are examples of questions we believe will help determine what to test, or what areas have the greatest need for testing. Before answering these questions one must establish what the factors are. For Complexity, one must consider how the application is built, of what components, both hardware and software, and the architecture of the application. Purpose is what the application is meant to achieve, and Target Group is, of course, the group, or groups, at which your efforts are aimed. Based on the list of recommended test types and areas, we formed a matrix. The cells of the matrix are supposed to contain the answers to questions like the ones presented above. To have any real use of the matrix, the answers need to be numerical. The numbers in each row then represent the significance of the test area in that row. When the numbers within each row are multiplied with each other, the product becomes a numerical value representing the need for testing, as established in chapter 3.1. The value is not in any way intended to show the effort involved in testing the specific area; it only shows the need for testing, relative to other areas. When evaluating the matrix at this stage, we found that it was sometimes hard to distinguish differences between the answers for Purpose and for Target Group. The two factors are obviously connected, and when answering them it is often hard to decide where the line is drawn between them. When analyzing test persons' answers, it also showed that they often differed in how they answered these two questions depending on how they interpreted and separated them. On several occasions, some put, for instance, a 2 under Purpose and a 3 under Target Group, while others put the opposite. On some questions purpose is a more relevant approach than target group, and on others it is the other way around. Realizing that this will probably always be the case, we considered the relation between these two factors to be so strong that it was worth combining the two. This was done, and the new factor was named Aim. This turned out to be a more comprehensive approach, without any information being lost.
4.3 Differentiation of Focus
The idea was to have three basic questions, one for each factor (Complexity, Purpose, Target Group), that could be asked and answered throughout the matrix. Purpose and Target Group are combined into Aim, as discussed above. Complexity ranges from 0 to 3, where 0 means that the feature does not exist and 3 that the feature is highly complex. Aim ranges from 1 to 3, where 1 means not important or low impact of failure, and 3 highly important or high impact of failure. We soon realized that some differentiation would have to be introduced, based on under which of the main areas the questions were to be answered. As a result, instead of two questions, twelve were needed, two for each main area. This makes it easier to answer the questions, since less work needs to be put in to understand what to consider when giving an answer. Presented below are examples of questions and the differences that are necessary between, for instance, functionality and performance.
Functionality
• Is the feature present? To what extent, and how advanced is it?
• How critical is the feature for the Aim of the site?
Performance
• Does the site contain features that demand specific performance, and to what extent?
• How sensitive are users to the performance of the site?
When test persons evaluated the matrix, it was noted that even within the test type groups, the questions were still too general to be answered without interpretation and adjustment to the individual test area. It was therefore considered whether the use of the methodology would be simplified even further by constructing and using separate questions for each and every test area in the matrix. This would also reduce the amount of subjective interpretation and the misunderstandings that would otherwise arise. An example of where the same question may be difficult to ask, or answer, is Usability: when defining the complexity of graphics and content, interpretation difficulties become obvious. If these areas could instead be addressed separately, in a way that suits each area, the effort needed to determine the value for that specific area would be reduced. Separate questions were therefore produced and evaluated, with a positive result.
4.4 When and How
The methodology is designed to give testers and test leaders both a bird's-eye view of the project, determining where and when to allocate resources along the test process, and a more concrete identification of specific test areas and test types of interest. The use of the Test Priority Sheet may differ, or should differ, depending on where you are in the development process. The methodology is designed to be useful in many parts of the development process.
Along the development process
The proper way to approach testing, as stated in chapter 2.1, is to view it as a part of the development process. Doing this means that you plan the testing efforts early. Having planned the development process, and thereby having a good view of the design and architecture of the application to be developed, creates the possibility of planning the testing thoroughly as well. As soon as the components of which the application is to be built are known, along with the points at which integration is done, testing may be planned thoroughly on different levels. The Test Priority Sheet can be a helpful tool for gaining knowledge of which areas are most important to test. First, overall testing requirements may be established by using the methodology on the application as a whole. For instance, the project leader and the test leader go through the Test Priority Sheet considering the established architecture and requirements of the application. A second step is to view the points of integration and consider which requirements can and should be verified at those points. Integration is usually done on several levels, such as integrating single units with each other as well as putting larger systems together by integrating subsystems consisting of previously integrated units. The functionality that is being put together at the different levels must of course be tested, and having planned the development thoroughly enables the testing of that functionality to be planned at the start of the project. Third, having used the Test Priority Sheet on unit level, the project leader and test leader are able to give valid advice to developers on what to test on unit level as well, even before the coding has started. Together these three steps constitute a ground for thorough test planning. The effort required for testing throughout the development of the application can be established at the start of the project, making it easier to have a valid time and cost estimate for the project.
The final stages of the process
Above is an example of how testing should be performed. However, this is seldom the way it is actually done. Testing is often done as an ad hoc process at the final stages of the development. When time is short there is a great need to prioritize your efforts. The Test Priority Sheet may then be used to establish the most important areas to test, making sure you do not spend valuable time testing areas where an error would perhaps not cause a problem.
General
In whatever way you use the Test Priority Sheet, it is important to be familiar with the area definitions as well as the questions, so that no mistakes are made because of misinterpretations. What role interpretation plays is discussed as part of chapter 4.5. Furthermore, it is important that every user of the methodology is well aware of the aim of the application; otherwise the testing effort may be misguided. If the planning of unit testing is done by the developer of that unit, it is important that the project leader communicates the aim.
4.5 Evaluation
To know whether the methodology is useful, and to assure future acceptance of it, it was presented at different stages of development to different people, who evaluated it by using it on actual web applications as well as in web application development projects. It was the early evaluations that led to the use of the factor Aim instead of the factors Purpose and Target Group. They also, to some extent, affected the questions to be answered on the Test Priority Sheet.
The later evaluation of the methodology was done by people in a web application development project. The results of the Test Priority Sheet differed somewhat between the test persons. This is not at all surprising, but it still needs to be analyzed why the results differed. Their evaluations are presented in Appendix D. We see five possible reasons behind differences such as these. They are:
1. Role. Differences in what part of the application one develops. Answers are obviously shaped by what part of an application one is responsible for.
2. Experience. Past experience will affect what one considers to be important.
3. Phase of development process. Is the sheet applied to a single unit or to a system? Possibly, when time is short, some prioritizing sneaks in when answering, even though this is what the methodology is there for.
4. Interpretations. Imprecisely formulated, or sloppily read, area definitions or questions lead to different interpretations, and thereby the answers differ.
5. Badly put questions.
If questions or area definitions are badly put, they are hard to answer, or answers may be given to a question that was not intended. This is highly connected to Interpretations (4).
There is no value in itself in having all differences wiped out. Within a group or project, the differences are what, in the end, ensure that every aspect of an application has been considered. However, certain reasons behind the differences may be worth taking a look at. What role and what past experience one has must of course shape the answers. These areas are key to the use of the Test Priority Sheet within a group. In what phase of development the Test Priority Sheet is being used is also a valid reason for differences. Unit level and system level will most likely differ in what areas are prioritized. Differences in results due to differences in interpretation should be minimized. There is no real danger of missing anything if the methodology is used by a number of people within a group, where results are discussed before being used. However, if a single person uses the methodology, interpretations other than what was intended may lead to certain areas being missed. Badly put questions or area definitions in the Test Priority Sheet may have the same result. It is important that we reduce these reasons behind differences to a minimum, and changes have been made after the evaluation. To be able to thoroughly evaluate the methodology one must use it as it is intended, which is throughout a web application development process. The evaluation done, however, was more static and captured opinions on only a small part of the possible use of the methodology. Nevertheless, the opinions are regarded as very significant. The persons performing the evaluation of the Test Priority Sheet were in the middle of a web application development process. They were all at the same stage of development but responsible for different features of the application. The different roles and experience of the test persons shape the results of the evaluation. In short, what they found was:
Cons:
• Difficult to use the first time
• Area definitions sometimes unclear
• Certain areas not what they expected
• Areas can sometimes be further divided
• Needs project start-up meeting to assure same definitions
Pros:
• Shows areas of importance that would otherwise have been missed
• Useful to gain acceptance for the present testing need
• Can make developers of different units understand importance of other parts of the application development process
4.5.1 Discussing pros and cons
The Test Priority Sheet was said to be difficult to use the first time. This is usually the case with any new tool or methodology. We are confident that with only a little experience of the methodology it will be easy to use. Many of the difficulties are connected to the other problem areas reported. For instance, if the definitions of certain areas to test are unknown, the questions regarding those areas are of course hard to answer. However, when applying the methodology a second time, many aspects were immediately much clearer. As for some area definitions not being what the test persons expected, changes have been made. There was, for example, an area simply called Indexing, which was defined as the ways of making sure that different search engines on the web find the site. Test persons intuitively wanted to answer questions on database indexing and related areas. Therefore, the prefix Web has been added. Furthermore, database issues were taken out of Server Interface, where they first resided, and placed separately under Functionality as Databases. Indexing is one area to consider under Databases. Test persons also wanted some areas to be split up further. We consider it possible to split several of the test areas further, depending on what your area of expertise is, but we have chosen not to. Two of the goals are to keep the methodology general and short, and we believe these goals may not be reached if test areas are split further. However, we are aware of this and trust that future use will shape the methodology if, and as, needed.
The last aspect considered a problem is also connected to area definitions. It was stated that some sort of project start-up meeting is required to make all users of the methodology within a specific project apply the same definitions of all areas. We realize the problem, but also believe that having slightly different ideas of what to weigh in when using the Test Priority Sheet might actually further emphasize the important areas to test. As for the positive opinions, they are well in line with what we aim at. The first, highlighting important areas to test, has been our primary goal with this methodology. The other positive effects, or reasons to use it, are more side effects realized when analysing the methodology throughout its development process. Having them presented to us when evaluating the methodology led us to really consider these as areas where the methodology may be useful, as discussed in chapter 4.2.3.
4.6 Conclusion
After making certain changes based on the evaluation, we have reached our primary goal: to accomplish a methodology that shows the most important areas to test. Although the evaluation showed that the Test Priority Sheet was difficult to use the first time, we are confident that it will be easy to use once the user has become acquainted with it. The Test Priority Sheet is one page long, consisting of a number of questions to answer, making it a fairly short methodology to use. It covers all important areas of a web application, which makes it as comprehensive as we wanted. All together, we consider our goals to be achieved.
4.6.1 Discussion
Web applications may be categorized based on their complexity. Today, though, the areas in need of testing cannot be determined solely from what category of complexity the application belongs to. Even though two web applications may look the same, the way they are built may differ greatly. The number of techniques available is almost endless; ASP, DHTML and ActiveX are just a few of the techniques used. This means that even though we know the complexity of a web application, we do not know what components there are to test. Still, complexity is an important factor when establishing a web site's need for testing. Even when the categorization is based on both complexity and aim, there is no way to tell beforehand what areas are in need of testing. Therefore, a methodology must be designed to cope with the characteristics of any single web application. Of course, one might argue that a methodology may be designed with the intent only to cope with the characteristics of some types of web applications and therefore be more precise in its recommendations. But as we found throughout our project, the similarities between types of applications are sometimes greater than those within a single category of web applications. Thus, based only on factors such as complexity, purpose and target group, a single-category methodology cannot be established, i.e. it would not differ much from the comprehensive methodology. In chapter 4.2.3 we mentioned that the use of the Test Priority Sheet varies depending on what phase of the development process you are in and also on what application, or part of an application, you use the matrix on within that phase. The methodology is designed in a way that handles these variations. By giving the score 0 to any test area's complexity factor, the area is neglected. One may also simply choose not to answer certain questions that one considers unnecessary. Since the method is designed to point out relative differences in importance between test areas in one specific application at a time, it does not matter if some questions are left unanswered. The most important result is that essential areas to test are not missed. We believe that using relative measures is a strong advantage of this methodology. The general character of the methodology means it may be used throughout the development process to assess the changes in testing need for different features as the design and architecture change. Another important feature of the methodology is that it requires the developer and/or the tester to consider the whole application thoroughly, through which one gains important knowledge about the application. The evaluation of the Test Priority Sheet presented in chapter 4.2.4 stresses the need for further dividing the test areas under each test type.
As more experience is gained in web application testing in general, as well as in how the methodology is, and should be, used, the shape of the Test Priority Sheet may be subject to change. At this point, the methodology is deliberately made general in character, to prevent areas in need of testing from falling between the test areas in the matrix. We choose to maintain our high-level division of the areas and let future experience guide the way to an improved methodology. Through future use of the methodology, small changes in many directions may be made to cope with the needs of specific projects. In the end, what we finally have might very well be a number of slightly different methodologies.
4.6.2 Where might we go from here?
Mainly because of the facts discussed above, we have chosen not to introduce certain changes based on some of the recommendations gained from evaluations of the methodology. These recommendations were based on a single project and probably very true under those circumstances, but they might be wrong for other projects. We have chosen to let the methodology persist in its current form for now and let future experience shape it. Maybe it will be cloned into a number of slightly different methodologies, shaped by the types of projects in which they are used. This is, anyhow, what we hope. One approach discussed is to make the answers to the questions stricter, i.e. to give less room for subjective interpretation. Doing this might allow the methodology to also be used for comparing different projects. At this point the methodology does not support this; it is only meant for comparing the relative testing need within a single project. Stricter answers might increase the number of mid-range values, but summing up the testing need values in the right-hand column would create a value representing the total testing need of the application. Certain other changes would, of course, have to be made. One of them would be to establish a factor for every area representing the effort involved in testing that area. Establishing this would enable the methodology to be used for comparisons between projects.
_Chapter Five_
The Possibility of Automating Testing
When planning a test, one has the opportunity to choose whether to perform the test manually or to automate it. Later in this chapter (see 5.3) we will discuss the benefits and problems of automating tests, but we will start with the possibilities that lie at hand.
5.1 The Possibilities with Tools
Below, Hetzel (1988) explains the possibilities from a complete test process point of view. He presents a number of tool categories that together may constitute a maximal degree of automation. Possible tool usage:
• Requirements testing: formal review; no automated tools are used.
• Design testing: the major testing technique is the formal review, but there are some possibilities to use automated tools.
• Testing in the small (unit testing):
1. Test data generators, to generate files and test data input
2. Program logic analyzers, to check code
3. Test coverage tools, to check what parts have been tested (run)
4. Test drivers, to execute the program to be tested; they simulate and run with accurate data and input
5. Test comparators, to compare test case outcome to expected outcome
• Testing in the large (integration testing, system testing, acceptance testing):
1. Test data generators, to generate files and test data input
2. Instrumenters, to measure and report test coverage during execution
3. Comparators, to compare outputs, files and source programs
4. Capture/playback systems, to capture on-line test data and play it back
5. Simulators and test beds, to simulate complex and real-time environments
6. Debugging aids, to aid in determining what is wrong and getting it repaired
7. Testing information systems, to maintain test cases and report on testing activities
8. File aids, to extract test data from existing files for use in testing
9. System charters and documents, to provide automatic system documentation
The tools presented above might not be single tools but parts of a tool performing multiple tasks. For instance, the capture/playback systems presented above often include a comparator to check the outcome of the tests recorded and executed using capture/playback. Along with these tools there are test management tools to help plan and manage resources and other important areas. There are also test case design tools, or test generation tools, that help generate test cases covering all important aspects of an application. Fewster and Graham (1999) describe three categories of test generation:
• Code-based: generates tests that check that the code does what the code does; it does not check that it does what it should.
• Interface-based: generates tests based on well-defined interfaces, such as a GUI; can create tests that visit every button, checkbox or menu on a site.
• Specification-based: generates both input and expected output from the specification list.
5.2 General Considerations
The reason for automating testing is of course to make the test process more efficient. Not all parts of testing can be performed more efficiently when automated. The tasks most suitable for automation are the more clerical ones, such as test execution and comparison. The more intellectual tasks, such as designing test cases, are often best performed manually, although these parts sometimes qualify for automation as well. Another characteristic that makes a test case suitable for automation is that it is performed repeatedly. The cost and effort of automating the test are then divided among the many re-runs of the same test. Although a test may be re-run many times, it is almost always the first time it is run that most errors or deficiencies are found. Even when automating tests, using one of many possible automation tools, most errors are in fact found manually, since the test case is run manually before it is automated. When one is part of a test team that does not feel it is testing efficiently (not finding enough errors fast enough), one might recommend automating the test process or parts thereof. This is often a mistake. Automating an ad hoc testing process or a poorly maintained process might instead worsen the problems. Fewster and Graham (1999) say that "automating chaos just gives faster chaos", which highlights the possible consequences of automating what is not a structured process. It is also important to understand that automating test execution is not the same as automating testing. Using a capture/playback tool, mentioned above, automates input but not comparison or verification. To automate testing, the verification needs to be automated as well. Many tools support this, but even so, testing is not yet fully automated. Supporting comparison does not mean that the tool actually verifies the outcome. To achieve this, the outcome of the executed case needs to be compared to the expected outcome, which in turn needs to be fed into the comparator. While on the subject of comparing actual outcome to expected outcome, it is important to point out that even though the actual outcome might be as expected, this does not mean that the application passes the test, or that it is free from the errors tested for. The expected outcome fed into the comparator might be wrong, which in fact means that you will always approve a faulty application. To some extent this is also the case for manual testing, meaning that it all depends on how the test case is written. The difference is that a poorly defined manual test may very well find many important deficiencies or errors, which a likewise poorly defined automated test will not, since it can only perform the actions specified and verify the given expected outcome. We will discuss benefits and problems later on, but there is something that needs to be addressed right away, even though it might also qualify for the benefits and problems chapter: the differences between human testers and comparators or test execution tools. While tools only do what they are specified to do and compare only the specified outcome, human testers performing the test manually are able to adjust their test case to unforeseen events, such as an error dialog box, and also to check the outcome in many more ways than tools can. When a human tester performs a test, almost any test, he can simultaneously be said to be performing a usability test, since he has to navigate, wait, view and understand the application being tested. These positive side effects are never achieved when automating testing. Automated testing, on the other hand, ensures that a test is always run in exactly the same way, which is often important for reproducing errors or when performing regression tests. A final note, also from Fewster and Graham (1999), is that test execution tools are not really testing tools but re-testing tools, since the test is really being performed when the test is recorded or tried out. Running the tests with test execution tools then means that you are re-running a test. This is perfect for regression testing.
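To make the distinction between automating execution and automating verification concrete, the sketch below shows a minimal comparator: a stored expected outcome is compared with the actual outcome of an executed step. The page title, the baseline string and the helper function are hypothetical; the point is only that the tool can merely report a difference against whatever baseline it has been given, right or wrong.

```python
def run_recorded_step():
    """Stand-in for replaying a captured action; here it simply returns a page title."""
    return "Welcome to the demo shop"

# Baseline captured earlier; if this expected value is wrong, a faulty
# application will still be approved by the comparator.
expected_title = "Welcome to the demo shop"

actual_title = run_recorded_step()
if actual_title == expected_title:
    print("PASS: actual outcome matches the stored baseline")
else:
    print(f"FAIL: expected '{expected_title}', got '{actual_title}'")
```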
5.3 Benefits and Problems
If automation is used with good judgement, substantial benefits can be achieved. Of course there are also many problems, or very important considerations, to bear in mind when automation of testing lies at hand. Below follows a discussion of the benefits and problems that need to be realized before going through with plans for automation. Fewster and Graham (1999) have listed both a number of benefits and a number of problems that come with automating testing. We base parts of our discussion on their list.
5.3.1 Benefits
Among the most obvious benefits is the possibility to run the same test on new versions of a program. A program may then be tested to check for new bugs in existing features, bugs that may have been introduced when adding new features or correcting deficiencies. This is regression testing. The same applies to re-testing, i.e. testing the functionality of a debugged feature. If a test has been created previously it will most likely be very easy to run it again, which is not only a benefit but, as we stated earlier, may also be a requirement for automation of some tests to be worth considering at all. The second benefit mentioned by Fewster and Graham is the possibility to run tests more often, by which you gain more confidence in the program. They also state that people often believe that automating tests will mean that their tests run faster, while in reality it tends to mean that more tests are run on a more frequent basis. A program may pass a test at one time but fail at another. When executing a test only once, manually or automatically, it might fail to catch a deficiency that later on might cause an error. For web applications this is certainly the reality, because of the varying loads, connection speeds and the possibility of the connection being lost completely for a moment. A benefit that, once realized, is obvious, is the possibility to perform tests that are impossible to do manually. When discussing web applications there are clear examples. Load testing, for instance, is sometimes possible but not very advisable to do manually. Applying a load of hundreds of users might be done manually but would require a massive administrative effort. Applying the same load by simulating the users with a tool decreases the required effort immensely. Sometimes certain user actions may trigger events that do not call for any response to be shown on the screen. A tool might then be helpful for checking that the event actually occurred. Most likely there will always be some testing that should be done manually. Automating the tests that are tedious and require only low skill might free resources, in the shape of skilled testers, that serve a better purpose coming up with better test cases. Tests run manually will also be performed better if there are fewer cases to run. In line with running tests more often, mentioned above, another benefit mentioned earlier is the fact that a test will always be run in exactly the same way when automated. This is important both for regression testing and for gaining confidence in the program's reliability. The reuse of tests we have established here means a decreased cost per test run, meaning that more effort may be spent making the automated tests as good as possible. One of the last benefits mentioned by Fewster and Graham is that time to market can be shortened. A fully automated set of tests may be run much faster than the same tests run manually. This means benefits for the release of later versions of a program, where tests necessary for full confidence in the release can be performed in a much shorter time. This is often of great importance in the development of web applications. The time savings hold true if the automated test set is maintained throughout the development of the program. The cost of maintenance is one of the most important factors when contemplating automation.
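As a minimal sketch of what "simulating the users" amounts to, the example below fires a number of concurrent requests at a URL and reports the response times. The URL and user count are placeholders, and a real load testing tool would of course add scenarios, ramp-up and far richer measurements.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

URL = "http://localhost:8080/"   # placeholder target, not a real site
VIRTUAL_USERS = 50               # number of simulated simultaneous users

def one_user(user_id):
    """Simulate a single user: fetch the page and measure the response time."""
    start = time.time()
    with urlopen(URL) as response:
        response.read()
    return time.time() - start

with ThreadPoolExecutor(max_workers=VIRTUAL_USERS) as pool:
    durations = list(pool.map(one_user, range(VIRTUAL_USERS)))

print(f"average response time: {sum(durations) / len(durations):.3f} s")
print(f"slowest response time: {max(durations):.3f} s")
```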
5.3.2 Problems
What we believe to be most important to remember is that the manual testing process must be well structured, with necessary and consistent documentation and consisting of tests good at finding errors and deficiencies, before automation is considered. Without meeting these requirements, automation will most likely cause more problems than it solves. It is also very important to have realistic expectations and not believe everything manufacturers of automated tools might tell you. Many tools may be very good at what they do, but they must be well understood before their possible benefits come true. It is almost always the case that manufacturers downplay the effort required to make use of their tools, while instead, of course, emphasizing the miraculous successes achieved using them. Fewster and Graham make a good point when noting our human need, or wish, to believe that using new technology will solve all our problems. Of course it is not so. Do not automate because you want to find additional errors that you did not find manually. Most tools are, as we stated earlier, re-test tools, meaning they execute a test that has in fact already been run. This means that most errors that can be found by this test have already been found. Despite this, there are tests that might still benefit from automation in this respect, and we have already mentioned load testing for web applications. Automated testing is not the same as automatically creating the test scripts. In order to receive long-term value from using tools, tests and test scripts need to be maintainable. By automatically creating the scripts, using capture tools, one builds in a certain amount of inflexibility. The actions taken by the user create a strict order in the script, and if the tested application is modified, the captured sequence may no longer be valid. This often generates unacceptable maintenance costs (Cem Kaner, 1997). The last problem we will address at this stage is related to the first consideration we mentioned in this chapter: the need for structured and well-defined tests that are good at finding errors. This time we will point out that the fact that a test is passed does not mean that a program is free of errors. We have previously emphasized the importance of remembering that testing never shows the absence of errors, only their presence. Automating a test that is in itself faulty means that many of the errors the test is meant to capture are missed. Automating a test like this means that the error is preserved into future releases.
5.4 Automated Testing Tools Evaluation
There are several manufacturers of automated testing tools on the market today. Rational, Mercury Interactive and Compuware are some of the larger players. When comparing the tools that are designed for web testing, or that have support for web applications, one notices the similarities in the way they are designed. They share many main features and use similar approaches to testing.
5.4.1 What is offered?
The main features consist of script design and construction, script execution, and result analysis. The process of creating test scripts and the scripting language used by the tools differ from tool to tool. The construction of the scripts can be done by recording through the graphical user interface, by manually writing the scripts in the tool, or by a combination of both. When recording a script, the tool captures actions made by the user and automatically creates script code. The script can then be edited to fit the test case. A form of checkpoints, depending on the tool, may be added to the script in order to create a baseline against which actual results are compared with expected results when a test has been run. Many tools also offer the possibility to reuse scripts captured with one browser in other versions of browsers. When the script is run, the tool goes through the script line by line, performs the actions recorded or written, and compares the result with the checkpoints added. The results of a test execution are then presented in some sort of result analysis window, where the status of the test (Pass, Fail) is shown, along with where possible failures or divergences occurred. In order to do performance tests, many manufacturers offer a load-testing tool to verify how the application reacts to different loads and to identify bottlenecks and other weak spots in the system. These tests, as mentioned earlier in this report, become particularly important when testing web sites, due to the uncertainty of the number of users and of the user-side configuration. The tools often offer the possibility to use scripts made with the tool type described above, but also the option to create separate scripts for performance tests. Depending on the tool, different features are offered when designing the test, though they all share the possibility to define the number of users to simulate. Other options can be to define different scenarios to be run simultaneously with different user groups, or to define different user configurations. The performance statistics can be viewed in real time as well as after the test. The results of such tests are often presented in graphs or charts, displaying information such as user actions, user numbers, page hits or resources used over time.
5.4.2 Evaluation of tools
Before evaluating, there are certain aspects of testing that should be made clear. As mentioned in the previous chapter, the benefits and problems discussed were as follows.
Benefits
• Running the same tests on later versions of the application
• Running tests on a more frequent basis
• Shortened time to market for later releases
• The tests are run in exactly the same manner every time
Problems
• A well-structured test organization on the manual level is needed before automating
• In order to create and perform relevant tests, the tools need to be fully understood
• When an automated test is run, most of the faults have already been found
• It can be hard to distinguish whether the faults lie in the tool or in the application
• Automatically created scripts come hand in hand with low maintainability
Based on the above, certain factors for evaluation were established in order to compare the tools:
• Test functions or possibilities offered by the tools
• Usability of the tools
• Scripting language
• Editing of scripts
• Maintainability of scripts
• Error analysis possibilities
• Other functionalities (such as interfaces towards other software for enhanced testing)
The tools offer possibilities for debugging tested applications, but the evaluation does not take into consideration how well they do this. The two manufacturers of automated testing tools chosen for evaluation were Rational Software and Mercury Interactive. The tools from Rational were Rational Robot 7.1 and TestManager. Previously, Rational offered a program called LoadTest; the main functionality of that program is now included in TestManager. We therefore aim to evaluate only the load testing functionality of TestManager, since the tool offers much more. The two tools Robot and TestManager are included in Rational Suite Enterprise and offer close interaction with other Rational products such as Rose and TestFactory. Rational's tools are designed to test client/server applications and offer extensive web support. The tools evaluated from Mercury Interactive were Astra QuickTest 5.0 Professional and Astra LoadTest 4.5. These two are specifically designed for the testing of web applications. They are lighter versions of Mercury's products WinRunner and LoadRunner, which offer the possibility to test software other than web applications. Common to the two manufacturers above is that they both maintain that their tools create usable tests in a rather uncomplicated way. They both encourage the use of their capture/playback function in order to automatically create scripts by recording user actions. We will compare the tools with each other, evaluate whether they are useful, and compare our conclusions with those of authors on the subject as well as with the information given by the tool manufacturers.
Astra QuickTest vs. Rational Robot
Both Rational Robot and Astra QuickTest are mainly functional testing tools, and they aim at simplifying the test process by automating the creation and running of test scripts as well as providing possibilities for analysing results. They both offer the possibility to record scripts through the graphical user interface and thereby automatically create scripts. They are also object oriented and recognize objects recorded in the application, as well as offering detailed information about object properties. The main way the two tools are used is very similar. The user navigates through a site and the tool records the actions taken. After inserting checkpoints, as they are called in QuickTest, or verification points, as they are called in Robot, the tool compares the actual result with the expected result as defined in the check- or verification points. There are a number of different types of such points offered by the two tools, in order to compare areas such as object properties, links and images, or content in tables. These points may be inserted both during and after recording the script, in both Robot and QuickTest. When inserting a check- or verification point, one defines what properties, or which data, are to be compared in the test.
Fig 5.1 Script created by Rational Robot by using recording
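The tools' own recorded scripts are not reproduced here, but the idea behind a check- or verification point can be illustrated with a small, tool-independent Python sketch: a baseline of expected properties is stored and compared with what is captured when the test is run. The URL and the expected values are made-up examples, not taken from either tool.

```python
# Minimal, tool-independent sketch of a "verification point": capture
# selected properties of a page and compare them with a stored baseline.
import urllib.request

EXPECTED = {                      # baseline captured when the test was designed
    "status": 200,
    "content_type": "text/html",
    "title": "Example Domain",    # expected <title> text (hypothetical)
}

def capture(url):
    """Fetch the page and capture the properties we want to verify."""
    with urllib.request.urlopen(url, timeout=10) as response:
        html = response.read().decode("utf-8", errors="replace")
        title = ""
        if "<title>" in html:
            title = html.split("<title>", 1)[1].split("</title>", 1)[0].strip()
        return {
            "status": response.status,
            "content_type": response.headers.get_content_type(),
            "title": title,
        }

def verify(url, expected=EXPECTED):
    """Compare captured properties with the baseline and report differences."""
    actual = capture(url)
    for key, want in expected.items():
        got = actual.get(key)
        result = "PASS" if got == want else "FAIL"
        print(f"{result}: {key}: expected {want!r}, got {got!r}")

if __name__ == "__main__":
    verify("http://example.com/")   # hypothetical page under test
```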
The two tools differ in how they use checkpoints and what they cover. Robot offers a wider selection of verification points to choose from, but QuickTest offers some that Robot does not. The object properties point is central to both tools and functions in the same way. When pointing at the object to be tested, the tool captures the properties of that particular object, such as name, default value, state etc. Objects can be ActiveX controls or Java objects, or items such as check boxes, edit boxes or radio buttons. The check can be to verify that the objects appear in a specific state when the test is run. Both tools offer the possibility to verify links and images, both their number and their URLs, but Robot offers the possibility to scan the entire site together with the tool SiteCheck, which displays the site in a site map and offers more extensive analysis possibilities. Astra QuickTest offers a handy checkpoint that checks text in a web page without requiring the text to be placed in a table or box. This can be very useful when pages are dynamically created and it needs to be verified that the correct text appears when, for example, a choice in a list has been made and the page is created based on this. Robot does not offer this possibility in such a simple way, but it is still possible by writing the function manually.
The languages used by the two tools differ. Rational Robot uses SQABasic while Astra QuickTest uses VBScript. SQABasic is a language based on Microsoft Basic that has been further developed by Rational Software. If one is familiar with Basic, this language should not pose a problem to work with. The way the scripts are presented in the two tools also differs. Rational Robot presents the script in a more traditional code view than Astra QuickTest does. QuickTest, on the other hand, offers the possibility to view the test in a tree view, while Robot only presents the verification points in such a view. One major drawback of QuickTest is that the script code presented in the so called expert view is not as extensive as in Robot. Actions such as scrolling in a list box are not recorded in QuickTest, only the selection made in it. One therefore has better control over the script in Robot than in QuickTest.
One feature the tools offer is the possibility of using tables in order to perform data-driven tests. These can be used to, for example, parameterize input to, or even output from, the application. This function may be used in order to test acceptance of input to a form or to test data in databases by requesting and verifying the results. By creating tables with data presented in fields, the tools may then read from the tables and perform repeated actions, depending on how the data in the tables is formed and how the tool is configured to perform iterations. The two tools differ in how they utilize data tables. In Astra QuickTest the tables are created within the tool and are presented at the bottom of the screen. There is a choice of a global or a local table. The global table may be used by all actions 2 in the test, while the local may only be used by a specific action. With Robot the datapools, as they are called by Rational, are created outside Robot, in TestManager, a tool for managing the test process. The creation of the tables in Robot offers more possibilities than in QuickTest, among them the options to auto fill the tables with data and to determine the sequence in which the data will be retrieved (in random order, unique or sequential). A minimal, tool-independent sketch of the data-driven idea follows.
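As an illustration of the data-table idea, written in plain Python rather than SQABasic or VBScript, the sketch below lets each row of a small table drive one submission of a form and one check of the response. The URL, field names and table contents are hypothetical.

```python
# Sketch of a data-driven test: each row of a data table supplies the
# input to a form and the expected response text.
import csv
import io
import urllib.parse
import urllib.request

# Stand-in for a data table / datapool (could equally be a CSV file on disk).
DATA_TABLE = """amount,currency,expected_text
100,SEK,Order accepted
-5,SEK,Invalid amount
100,XXX,Unknown currency
"""

FORM_URL = "http://test-server.example/order"   # hypothetical form handler

def submit(row):
    """POST one row of the table to the form and return the response body."""
    data = urllib.parse.urlencode(
        {"amount": row["amount"], "currency": row["currency"]}
    ).encode("ascii")
    with urllib.request.urlopen(FORM_URL, data=data, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")

def run():
    reader = csv.DictReader(io.StringIO(DATA_TABLE))
    for i, row in enumerate(reader, start=1):
        body = submit(row)
        ok = row["expected_text"] in body       # data-driven "checkpoint"
        print(f"iteration {i}: {'PASS' if ok else 'FAIL'} "
              f"(expected {row['expected_text']!r})")

if __name__ == "__main__":
    run()
```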
Auto filling can be very useful in order to test, for example, a credit card verification function, where TestManager can fill the tables with such numbers in the correct format. Among the predefined types are first and last name, date and credit card numbers. One also has the possibility to create user-defined types that can later be used to fill tables. Rational Robot and TestManager therefore offer a much more extensive way to use data tables than Astra QuickTest.
During the evaluation, certain areas where the tools may be more useful than others started to take shape. One obvious area is the possibility to parameterize input to the application. To test all possible inputs to a form, or every row or column in a database, is nearly impossible to do manually. By using tables to parameterize the input, the tool may go through combination after combination, verifying the result with minimal manual effort. The creation of tables and data types for the test can of course be demanding, but we suspect that the resources saved by running the test automatically are often far greater. Both tools offer the possibility of parameterizing in-data as described earlier. But in order to verify results during a data-driven test, the verification point comparing the result also needs to be data driven. Take an example where personnel records are stored in a database and can be selected via a list box and displayed in a web page. If one wants to verify that the correct record is shown for every selection, a verification- or checkpoint needs to be inserted on the page showing the record, and this verification needs to be changed since a different record is shown for every selection. QuickTest offers a useful option to parameterize the checkpoints by easily connecting them to a data table, so that the information captured during a test is checked against a new baseline for every selection. With Robot, the parameterization of verification points is not done in such a way; one has to create this in a separate function, which may then be called from the script. It seems here that QuickTest has an advantage in the way these checks are made, but one has to consider the factors mentioned earlier regarding the inflexibility built into the tests.
Astra LoadTest vs. Rational TestManager
The aim here is to evaluate the tools' performance testing functionality. The tools are used to simplify the test process by, like the tools described earlier, automating the creation and running of scripts as well as offering possibilities for analysing the results. The way the performance tests are done is very similar to the tools described in the previous chapter. The scripts are created in the same way, i.e. by recording user actions or by writing them manually. The tool simulates a number of virtual users (VU), sending and requesting information, during which the tool captures all transactions in order for the tester to evaluate the performance of the tested application. In Astra LoadTest the scripts are created within that tool, but in Rational's case the creation of scripts is done with Robot; TestManager, though, provides the actual load testing functionality. When a test script, or session of scripts, has been created, the next step is to design how the actual load test will be run. In both tools, one defines a number of user groups that may each be configured separately. They may for example run separate scripts, have different configurations and differ in number of virtual users.
By applying different groups as described, different load levels may be created on the application. The virtual users will follow their designated scripts and perform the actions in the predefined order, while the load testing program captures transactions and gathers information for evaluation. Different settings, such as time limits and the number of iterations to be run, may be defined in both tools.
2. In Astra QuickTest, a test can be built from several actions. An action is a defined part, or script, of the test.
Fig 5.2 Scenario created with TestManager
There is also the possibility, in both tools, to insert a form of meeting point that is used to gather the virtual users in order to release them in a defined manner. This may be used to test a certain feature under a specifically high load, by having the VUs perform the action simultaneously. If, for example, a certain input form is suspected to be sensitive to high load, gathering the users and having them submit the form at the same time will test this. These rendezvous points, as they are named in Astra LoadTest, or synchronisation points, as they are called in TestManager, are often used together with a timer function in order to measure the performance of that specific feature during the load test. These timer functions may be inserted wherever there is a need to measure a certain action or actions. In Astra LoadTest one may insert a so called transaction, which is similar to Rational TestManager's blocks. These specify certain parts or actions in the script to be measured separately, and after the test, particular information on these is presented for evaluation. A minimal sketch of these concepts follows after the footnote below.
Functionality such as using data tables is offered by both tools. The creation and usage of these is done in the same manner as for QuickTest and Robot, and is therefore described in the prior chapter. In order to perform tests where verification- or checkpoints are used, the two tools offer somewhat different solutions. When creating tests with Astra LoadTest, one may insert checkpoints in the script in the same manner as with QuickTest; these will then be checked against during the actual run of the test. Rational proposes a different approach by offering the possibility to run GUI 3 scripts simultaneously with VU scripts in the same test session. This is positive since one may reuse previously created scripts in a new session. The scripting languages used by the two tools differ. The creation of VU scripts with Rational Robot is done in the so called VU scripting language. This is a language similar to C, sharing much of its syntax rules and library functions. According to Rational, if one is familiar with the C language, the VU language should not pose a problem. Astra LoadTest uses VBScript, just as QuickTest, and therefore has an advantage over Rational's products, which use different languages for GUI and VU scripts. On the other hand, a specialised language probably offers more possibilities when creating test cases.
When performing tests involving large amounts of virtual users, these may have to be spread across several machines producing the load. This is possible with both tools, and functions in the same way: one computer acts as the master computer and directs the tests, while the others act as hosts for VUs. The most important part of these tools is the result analysis during or at the end of a test. Information on the performance is given in the form of graphs or reports, describing requests per time unit, sent or received bytes per time unit, resources used etc. By viewing graphs, unsatisfactory performance is identified and can then be examined more thoroughly by viewing reports, where detailed information on, for example, response times for individual VUs or actions is displayed.
3. The scripts created by Rational Robot as described in the previous chapter are called GUI scripts.
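The rendezvous/synchronisation point and transaction timer concepts can be illustrated with a small Python sketch, using threads as stand-ins for virtual users; a real load test would of course use the tools' own VU engines. The URL and the number of users are assumptions made for the example.

```python
# Sketch of virtual users, a rendezvous point and a transaction timer,
# using plain Python threads.
import threading
import time
import urllib.request
from statistics import mean

URL = "http://test-server.example/submit-form"   # hypothetical page under load
N_USERS = 10

rendezvous = threading.Barrier(N_USERS)          # gathers the VUs before release
response_times = []
lock = threading.Lock()

def virtual_user(user_id):
    rendezvous.wait()                            # all users released simultaneously
    start = time.perf_counter()                  # start of the timed "transaction"
    try:
        with urllib.request.urlopen(URL, timeout=30) as resp:
            resp.read()
        elapsed = time.perf_counter() - start
        with lock:
            response_times.append(elapsed)
    except Exception as exc:
        print(f"VU {user_id}: request failed: {exc}")

threads = [threading.Thread(target=virtual_user, args=(i,)) for i in range(N_USERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

if response_times:
    print(f"{len(response_times)} responses, "
          f"avg {mean(response_times):.2f} s, max {max(response_times):.2f} s")
```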
Fig 5.3 Results showing average transaction response time in Astra LoadTest
The tools are very similar in the way they function, and the main features of a performance testing tool are present in both. Differences are present, though. As with the tools described earlier, Rational offers a more complex product with advanced features and possibilities not available in the Astra tools. For example, Rational makes it possible to set recording options depending on the server setup (for example if a proxy server is used) or if a specific database is used. Rational also makes it possible to include or exclude certain protocols when recording, for example HTTP, IIOP or pure Oracle requests. However, the time it takes to learn to use the tools from Rational efficiently is considerable. With Astra LoadTest one may rather quickly set up a useful load scenario.
5.4.3 Conclusion
When running mainly the functional testing tools, we found that they were somewhat inconsistent in how they captured different objects or actions, depending on the applications tested. In some tests they were unable to catch the content of a list box, and the next time it appeared as it should. The tools also showed some instability when running some scripts, with differences in results after two seemingly identical runs. Why this is so is hard to say. It can depend on several things: fluctuations in connection speed, performance of client and server, bugs in the tool or the technology used in the tested application. Regardless of where the problems reside, one often spends more time searching for bugs in the tool and editing scripts than concentrating on the application being tested, due to the instability of the tool. These problems occurred mainly when tests were run on applications not undergoing any changes between the tests. Some tests were done with changes to the user interface, and the tools showed obvious problems when too large changes were made. In a software development environment, where new releases may have a totally different user interface and a changed sequence in which actions shall be taken, we suspect that using the tools with capture/playback will result in heavy maintenance costs, as discussed earlier. Still, we consider the feature to be of use by supplying a foundation for a test, or just by giving useful hints on how to write the scripts manually. For the performance testing tools, however, the usefulness is more obvious, since these types of tests are difficult to do without tools.
Whether one tool outperforms the other, or whether tools are of interest at all, is difficult to say. Rational, for example, offers extensive functionality when creating datapools, but it lacks the possibility to parameterize checkpoints as simply as Mercury's Astra QuickTest and Astra LoadTest. We believe that whether or not tools are of use depends heavily on the size and type of project. QuickTest and LoadTest are easy to get started with and probably offer enough functionality for smaller projects without an extensive need for managing different releases, or with applications that do not undergo major changes during their development. Rational, on the other hand, offers not just a single testing tool but a suite of tools intended to enhance the software development process, with the possibility to manage requirements and keep track of releases and test results. We therefore consider the tools from Rational to be more useful in larger projects where there are demands for such functionality.
For a more correct evaluation one should bear in mind that Mercury offers a set of tools called WinRunner, LoadRunner and TestDirector, as mentioned earlier in the evaluation; in a comparison with Rational's products, these tools probably offer more of a match. Although both manufacturers maintain in their manuals that effective tests are easily created using capture/playback, we believe that this is often not the case. When talking to representatives from Rational, even they state that automated tools seldom automate the tests and that capture/playback is generally not the way the tools are used. Whether, and which, tools are of interest therefore depends on the following: •
Size of the test project
•
Size of the application
•
Technology used in the application
•
Stage of the test process
•
Resources available
•
Types of test
The conclusions reached during this evaluation follow below. Many of them are supported by experienced testers and authors on the subject, as well as by manufacturers of automated tools. •
Since web applications obviously rely on a graphical interface, and the tools depend on it, the GUI should have reached a rather high level of stability before these tools are used
•
An obvious use of the tools is when performing data-driven tests. By parameterizing the input to the application, the test effort may be greatly reduced. Because of the resources saved, these tests are of interest even if the GUI is not finished
•
Straight replay of recorded tests results in a low rate of bug detection. This further implies that these tools should be used for regression testing or data-driven tests.
•
Using the tools often focuses testers on weaknesses in the tool rather than on the tested application
•
The tools are not easily learned. In order to take full advantage of them and create useful and maintainable test cases, experience and education on the tools are needed, as well as good programming skills. The architecture of the application tested should also be well understood
•
Testing of links throughout a site can be demanding to do manually but is easily done by a tool. These tools verify every link and display information on broken links and orphan pages
•
Performance testing tools are of obvious use, since performing performance tests manually is extremely difficult.
Appendix A: Test Lists
Test list as presented by Tim Van Tongeren
User interface 1.
Instructions
2.
Sitemap/nav.bar
3.
Content
4.
Colour/background
5.
Images
6.
Tables
7.
Wrap around
Functionality 1.
Links
2.
Forms
3.
Data verification
4.
Cookies
5.
Application specific functional requirements
Interface testing 1.
Server interface
2.
External interface
3.
Error handling
Compatibility 1.
Operating systems
2.
Browsers
3.
Video settings
4.
Modem/connection speed
5.
Printers
6.
Combinations
Load/stress 1.
Many users at the same time
2.
Large amount of data from each user
3.
Long period of continuous use
Security
2.
Directory setup
3.
SSL
4.
Logins
5.
Logfiles
6.
Scripting languages
Test list as presented by Vincent Soberano
User interface 1.
2.
Structural 1.
Navigation
2.
Graphics
3.
Formatting
4.
Content
5.
General Appearance
Functionality 1.
Programming Language
2.
Linkage testing
3.
Forms
4.
Cookies
5.
Application-specific Transactions
Host interface 1.
Server interface
2.
External interface
3.
Error handling
Compatibility issues 1.
Operating systems
2.
Browsers
3.
Connection
4.
Printers
5.
Multimedia devices
Load/Stress 1.
User traffic
2.
Data volume
3.
Continuous usage
Security & Encryption 1.
Logins
2.
SSL
3.
Directory setup
4.
Log Files
5.
Hacker Scripts
Test list as presented by Thomas A. Powell
Functional testing 1.
Unit testing
2.
Integration testing
3.
Browser testing
4.
Configuration testing
5.
Delivery testing
Content testing 1.
Spelling, grammar
2.
Accuracy, copyright - Liability issues
3.
Images
Security testing
User test 1.
Usability testing
2.
Beta testing
Test list as presented by Hung Q. Nguyen
User Interface Test 1.
Design approach
2.
User interface controls
3.
Navigation methods
4.
Feedback and Error messages
5.
Data presentation
Functional Test 1.
Functional Acceptance Simple Tests (links, images etc.)
2.
Task-Oriented Functional Tests (Test that the application performs tasks correctly)
3.
Forced-Error test
4.
Boundary Condition Test
5.
Exploratory Testing (Examine the applications behavior)
6.
Software Attacks
Database Test 1.
Interfacing (Internal and external)
2.
UI Design
3.
Consistency
4.
Usability
5.
Content
Help Test
Installation Test 1.
User side Installation
2.
Server side Installation
Configuration and Compatibility Test 1.
Servers (Application, Web, Database)
2.
OS
3.
Applications
4.
Hardware
Web Security Test 1.
Cryptography
2.
Protocols
3.
Firewalls
Performance 1.
Response time
2.
Load
3.
Stress
Appendix B: Test Areas
Below is a presentation of the main areas to test when developing and publishing a web site. It is a checklist that presents the most important features to test under each area and how to perform them.
Functionality testing 1.
Links
Links are perhaps the main feature of web sites. They constitute the means of transport between pages and guide the user to certain addresses without the user knowing the actual address itself. Linkage testing is divided into three sub areas. First, check that the link takes you to the page it said it would. Second, check that the link isn't broken, i.e. that the page you're linking to exists. Third, ensure that you have no orphan pages at your site. An orphan page is a page that has no links to it, and may therefore only be reached if you know the correct URL. To reduce redundant testing, a link that appears on several pages only needs to be tested once. This kind of test can preferably be automated, and several tools provide solutions for this (a minimal sketch follows the summary below). Link testing should be done during integration testing, when connections between pages exist. Resources: Rational SiteCheck (http://www.rational.com/) http://www.netmechanic.com/ http://home.snafu.de/tilman/xenulink.html http://www.cyberspyder.com/cslnkts1.html Summary:
•
Verify that you end up at the designated page
•
Verify that the link isn’t broken
•
Locate orphan pages if present
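A minimal sketch of automated link testing, assuming a hypothetical start URL, is given below; it collects the links on a page and reports broken ones. Orphan pages cannot be found this way, since they by definition are not linked to and require a comparison with the server's file structure.

```python
# Sketch of automated link testing: collect the links on a page and
# report those that are broken.
import urllib.error
import urllib.parse
import urllib.request
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []
    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def check_links(page_url):
    with urllib.request.urlopen(page_url, timeout=10) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    collector = LinkCollector()
    collector.feed(html)
    for href in collector.links:
        target = urllib.parse.urljoin(page_url, href)
        try:
            with urllib.request.urlopen(target, timeout=10) as resp:
                print(f"OK     {resp.status}  {target}")
        except (urllib.error.HTTPError, urllib.error.URLError) as exc:
            print(f"BROKEN       {target}  ({exc})")

if __name__ == "__main__":
    check_links("http://test-server.example/index.html")   # hypothetical start page
```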
2. Forms
Forms are used to submit information from the user to the host, where it is processed and acted upon in some way. The integrity of the submit operation should be tested in order to verify that the information reaches the server in correct form. If default values are used, verify the correctness of the values. If the forms are designed to accept only certain values, this should also be tested; for example, if only certain characters should be accepted, try to override this when testing. These controls can be performed on the client side as well as on the server side, depending on how the application is designed, for example using scripting languages such as JScript, JavaScript or VBScript. Check that invalid inputs are detected and handled (a minimal sketch follows the summary below). Summary:
•
Information hits the server in correct form
•
Acceptance of invalid input
•
Handling of wrong input (both client and server side)
•
Optional versus mandatory fields
•
Input longer than field allows
•
Radio buttons
•
Default values
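As a sketch of server-side form testing, the following example posts deliberately invalid values and checks that the application rejects them. The URL, the field names and the rejection criterion (an error message in the returned page) are assumptions.

```python
# Sketch of server-side form testing: submit deliberately invalid values
# and check that the server rejects them.
import urllib.parse
import urllib.request

FORM_URL = "http://test-server.example/register"    # hypothetical form handler

INVALID_CASES = [
    {"email": "not-an-address", "age": "30"},        # malformed e-mail
    {"email": "user@example.com", "age": "-1"},      # out-of-range number
    {"email": "user@example.com", "age": "x" * 500}  # input longer than field allows
]

def post(fields):
    data = urllib.parse.urlencode(fields).encode("ascii")
    with urllib.request.urlopen(FORM_URL, data=data, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")

for case in INVALID_CASES:
    body = post(case)
    # The application is expected to flag the input as invalid in some way;
    # here we simply look for an error message in the returned page.
    rejected = "error" in body.lower()
    print(f"{'PASS' if rejected else 'FAIL'}: invalid input {case} "
          f"{'was' if rejected else 'was NOT'} rejected")
```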
3. Cookies
Cookies are often used to store information about the user and his actions on a particular site. When a user accesses a site that uses cookies, the web server sends information about the user and stores it on the client computer in the form of a cookie. Cookies can be used to create more dynamic and custom-made pages, or to store, for example, login information. If you have designed your site to use cookies, they need to be checked. Verify that the information that is to be retrieved is there. If login information is stored in cookies, check that it is correctly encrypted. If your application requires cookies, how does it respond to users who have disabled them? Does it still function, or will the user be notified of the situation? How will temporary cookies be handled? What will happen when cookies expire? Depending on what cookies are used for, one should examine the possibilities for other solutions. A minimal cookie check is sketched after the summary below. Summary:
•
Encryption of e.g. login info
•
Users denying or accepting
•
Temporary and expired cookies
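A minimal cookie check might look like the sketch below: request a page, inspect which cookies the server sets and whether they are temporary or persistent. The URL is hypothetical, and what counts as sensitive information in a cookie depends on the application.

```python
# Sketch of a cookie check: request a page and inspect the cookies the
# server sets, including whether they are session (temporary) cookies.
import http.cookiejar
import urllib.request

URL = "http://test-server.example/login"     # hypothetical page that sets cookies

jar = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
with opener.open(URL, timeout=10) as resp:
    resp.read()

for cookie in jar:
    kind = "session (temporary)" if cookie.expires is None else f"expires {cookie.expires}"
    print(f"{cookie.name} = {cookie.value!r}  [{kind}]")
    # A login token readable in clear text here would indicate missing encryption.
```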
4. Web Indexing
There are a number of different techniques and algorithms used by different search engines to search the Internet. Depending on how the site is designed using Meta tags, frames, HTML syntax, dynamically created pages, passwords or different languages, your site will be searchable in different ways. Summary: •
Meta tags
•
Frames
•
HTML syntax
•
Passwords
•
Dynamically created pages
5. Programming Language
Differences in web programming language versions or specifications can cause serious problems on both the client and the server side. For example, which HTML specification will be used (3.2 or 4.0)? How strictly? When HTML is generated dynamically, it is important to know how it is generated. When development is done in a distributed environment where developers, for instance, are geographically separated, this area becomes increasingly important. Make sure that specifications are well spread throughout the development organization to avoid future problems. Besides HTML, specifications for e.g. Java, JavaScript, ActiveX, VBScript or Perl need to be verified. There are several tools on the market for validating different programming languages; for languages that need compiling, e.g. C++, this kind of check is often done by the compiler. Since this kind of testing is done by static analysis tools and needs no actual running of the code, these tests can be done as early as possible in the development process. Language validation tools can be found in compilers, online as well as for download, free or for payment (a rough sketch of a syntax check follows the summary below). Resources: http://arealvalidator.com/ http://www.delorie.com/web/purify.html Summary:
•
Language specifications
•
Language syntax (HTML, C++, Java, Scripting languages, SQL etc.)
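As a rough illustration of a static syntax check, the sketch below verifies only that HTML start and end tags are balanced; it is nowhere near a full validator, and the file name is hypothetical.

```python
# Rough sketch of a static HTML syntax check: verify that start and end
# tags are balanced (no DTD or attribute checking).
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "meta", "link", "area", "base", "col"}

class TagBalanceChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []
        self.problems = []
    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append((tag, self.getpos()[0]))
    def handle_endtag(self, tag):
        if self.stack and self.stack[-1][0] == tag:
            self.stack.pop()
        else:
            self.problems.append(f"line {self.getpos()[0]}: unexpected </{tag}>")
    def handle_startendtag(self, tag, attrs):
        pass   # self-closing tags are balanced by definition

def check_file(path):
    checker = TagBalanceChecker()
    with open(path, encoding="utf-8", errors="replace") as f:
        checker.feed(f.read())
    for tag, line in checker.stack:
        checker.problems.append(f"line {line}: <{tag}> never closed")
    for problem in checker.problems:
        print(problem)
    print("OK" if not checker.problems else f"{len(checker.problems)} problem(s) found")

if __name__ == "__main__":
    check_file("index.html")   # hypothetical file to check
```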
6. Dynamic Interface Components
Web pages are not just presented in static HTML anymore. Demands for more dynamic features, custom-made sites and high interactivity have made the Internet a more vivid place than before. Dynamic interface components reside and operate on both the server and the client side of the web, depending on the application. The most important include Java applets, Java servlets, ActiveX controls, JavaScript, VBScript, CGI, ASP, CSS and third-party plug-ins (QuickTime, ShockWave or RealPlayer). The issue here is to test and verify the function of the components, not compatibility issues. An example of what to test can be a Java applet constructing and displaying a chart of company statistics, where the information first has to be retrieved and then interpreted and displayed on the screen. Since server-side components have no user interface, event logging (logfiles) can be used to record events on the server side in order to determine functionality. Resources: Java specific tools: JavaSpec and JavaStar. Summary:
•
Do client side components (applets, ActiveX controls, JavaScript, CSS etc.) function as intended (i.e. do the components perform the right tasks in a correct way)
•
User disabling features (Java-applets, ActiveX, scripts etc.)
•
Do server side components (ASP, Java-Servlets, server-side scripting etc.) function as intended (i.e. do the components perform the right tasks in a correct way)
7. Databases
Databases play an important role in web application technology, housing the content that the web application manages, running queries and fulfilling user requests for data storage. The most commonly used type of database in web applications is the relational database, and it is managed with SQL for writing, retrieving and editing information. In general, there are two types of errors that may occur: data integrity errors and output errors. Data integrity errors refer to missing or wrong data in tables, and output errors are errors in writing, editing or reading operations in the tables. The issue is to test the functionality of the database, not the content, and the focus here is therefore on output errors. Verify that queries, writing, retrieving and editing in the database are performed correctly (a minimal round-trip sketch follows the list below). Resources: Rational Robot (http://www.rational.com/) Astra QuickTest (http://www.merc-int.com/) Issues to test are:
•
Creation of tables
•
Indexing of data
•
Writing and editing in tables (for example valid numbers or characters, input longer than field etc.)
•
Reading from tables
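The write-read-edit round trip can be sketched as below. SQLite is used only as a stand-in for whatever database the application actually runs on, and the table layout is hypothetical.

```python
# Sketch of a database output-error test: write a row, read it back and
# edit it, verifying each step.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE personnel (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")

# Writing
conn.execute("INSERT INTO personnel (id, name, dept) VALUES (?, ?, ?)",
             (1, "Jane Doe", "Logistics"))
conn.commit()

# Reading: the stored row should match what was written
row = conn.execute("SELECT name, dept FROM personnel WHERE id = ?", (1,)).fetchone()
assert row == ("Jane Doe", "Logistics"), f"read back wrong data: {row}"

# Editing: the update should be visible on the next read
conn.execute("UPDATE personnel SET dept = ? WHERE id = ?", ("Transportation", 1))
conn.commit()
row = conn.execute("SELECT dept FROM personnel WHERE id = ?", (1,)).fetchone()
assert row == ("Transportation",), f"update not stored: {row}"

print("write, read and edit operations verified")
```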
Usability 1.
Navigation
Navigation describes the way users navigate within a page, between different user interface controls (buttons, boxes, lists, windows etc.), or between pages via e.g. links. To determine whether or not your page is easy to navigate, consider the following. Is the application's navigation intuitive? Are the main features of the site accessible from the main page? Does the site need a site map, search engine or other navigational help? Be careful, though, not to overdo your site: too much information often has the opposite effect of what was intended. Users of the web tend to be very goal driven and scan a site very quickly to see if it meets their expectations. If not, they quickly move on. They rarely take the time to learn the site's structure, and it is therefore important to keep the navigational help as concise as possible. Another important aspect of navigation is whether the site is consistent in its conventions regarding page layout, navigation bars, menus, links etc. Make sure that users intuitively know that they are still within the site by keeping the page design uniform throughout the site. As soon as the hierarchy of the site is determined, testing of how users navigate can commence; have real users try to navigate through ordinary paper mock-ups describing how the layout is done. Summary:
•
Intuitive navigation
•
Main features accessible from main page
•
Site map or other navigational help
•
Consistent conventions (navigation bars, menus, links etc.)
2. Graphics
The graphics of a web site include images, animations, borders, colours, movie clips, fonts, backgrounds, buttons etc. Issues to check are:
•
Make sure that the graphics serve a definite purpose and that images or animations don’t just clutter up the visual design and waste bandwidth
•
Verify that fonts are consistent in style
•
Suitable background colours combined with font and foreground colours. Remember that a computer display presents contrasts exceptionally well compared to printed paper
•
Three-dimensional effects on buttons often give useful cues
•
When displaying large amounts of images, consider using thumbnails. Check that the original picture appears when a thumbnail is clicked
•
Size – quality of pictures, usage of compressed formats (JPG or GIF)
•
Mouse-over effects
3. Content
Content testing is done to verify the correctness, accuracy and relevancy of the information presented on the site, or in a database, in the form of text, images or animations. Correctness is whether the information is truthful or contains misinformation; for example, wrong prices in a price list may cause financial problems or even induce legal issues. The accuracy of the information is whether it is free from grammatical or spelling errors. These kinds of verifications are often done in e.g. Word or other word processors. Remove irrelevant information from your site, since it may otherwise cause misunderstandings or confusion. Content testing should be done as early as possible, i.e. when the information is posted. Summary:
•
Correctness
•
Accuracy
•
Relevancy
4. General Appearance
Does the site feel right when using it? Do you intuitively know where to look for information? Is the design consistent throughout the site? Make sure that the design and the aim go hand in hand. Too much design can easily turn a conservative corporate site into a publicity stunt. Important to all kinds of usability tests is to involve external personnel who have little or no connection to the development of the site. It is easy to get fond of one's own solution, so having actual users evaluate the site may be critical. Summary: •
Intuitive design
•
Consistent design
•
If using frames, make sure that the main area is large enough
•
Consider size of pages. Several screens on the same page or links between them
•
Do features on the site need help systems or will they be intuitive
Server Side Interface 1.
Server Interface
Due to the complex architecture of web systems, interface and compatibility issues may occur in several areas. The core components are web servers, application servers and database servers (and possibly mail servers). Web servers normally host HTML pages and other web services. Application servers typically contain objects such as programs, scripts, DLLs or third-party products that provide and extend functionality and effects for the web application. Test the communication between the different servers by making transactions and viewing logfiles to verify the result. Depending on the configuration of the server side, compatibility issues may occur with, for example, server hardware, server software or network connections. Database compatibility issues may occur with different database types (SQL, Oracle, Sybase etc.). Issues to test:
•
Verify that communication is done correctly, web server-application server, application server-database server and vice versa.
•
Compatibility of server software, hardware, network connections
•
Database compatibility (SQL, Oracle, Sybase etc.)
2. External Interface
Several web pages have external interfaces, such as merchants verifying credit card numbers to allow transactions to be made, or a site like http://www.pris.nu/ that compares prices and delivery times from different merchants on the web. Verify that information is sent and retrieved in correct form. Client Side Compatibility 1.
Platform
There are several different operating systems that are being used on the market today, and depending on the configuration of the user system, compatibility issues may occur. Different applications may work fine under certain operating systems, but fail under another. The following are the most commonly used:
•
Windows (95, 98, 2000, NT)
•
Unix (different sets)
•
Macintosh
•
Linux
2. Browsers
The browser is the most central component on the client side of the web. Browsers come in different brands and versions and have different support for Java, JavaScript, ActiveX, plug-ins or different HTML specifications. ActiveX, for example, is a Microsoft product and therefore designed for Internet Explorer, while JavaScript was created by Netscape and Java by Sun. This explains why compatibility problems commonly occur. Frames and cascading style sheets may display differently on different browsers, or not at all. Different browsers also have different settings for e.g. security or Java support. A good way to test browser compatibility is to create a compatibility matrix where different brands and versions of browsers are tested against a number of components and settings, for example applets, scripting, ActiveX controls or cookies (a small sketch follows the summary below). Summary:
•
Internet Explorer (3.X 4.X, 5.X)
•
Netscape Navigator (3.X, 4.X, 6.X)
•
AOL
•
Browser settings (security settings, graphics, Java etc.)
•
Frames and Cascading Style Sheets
•
Applets, ActiveX controls, DHTML, client side scripting
•
HTML specifications
•
Graphics
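A compatibility matrix can be as simple as the sketch below: every browser/feature pair becomes a cell to be filled in with a pass or fail result. The browser and feature lists are examples only.

```python
# Sketch of a browser-compatibility matrix: one cell per browser/feature pair.
from itertools import product

browsers = ["IE 4.x", "IE 5.x", "Netscape 4.x", "Netscape 6.x", "AOL"]
features = ["Frames", "CSS", "JavaScript", "Java applets", "ActiveX", "Cookies"]

# Results are recorded as "pass", "fail" or "" (not yet tested).
matrix = {(b, f): "" for b, f in product(browsers, features)}
matrix[("Netscape 4.x", "ActiveX")] = "fail"     # example entry

# Print the matrix as a simple checklist.
width = max(len(f) for f in features) + 2
print(" " * width + "  ".join(f"{b:<14}" for b in browsers))
for f in features:
    cells = "  ".join(f"{matrix[(b, f)] or '-':<14}" for b in browsers)
    print(f"{f:<{width}}{cells}")
```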
3. Settings, Preferences
Depending on settings and preferences of the client machine, web applications may behave differently. Try and vary the following:
•
Screen resolution (check that text and graphic alignment still work, fonts are readable etc.)
•
Colour depth (256, 16-bit, 32-bit)
4. Printing
Despite the paperless society the web was to introduce, printing is done more than ever. Verify that pages are printable with considerations on: •
Text and image alignment
•
Colours of text, foreground and background
•
Scalability to fit paper size
•
Tables and borders
Performance 1.
Connection speed
Users may differ greatly in connection speed. They may be on a 28.8 modem or on a T3 connection. Users expect longer download times when retrieving demos or programs, but not when requesting a home page. If the transaction response time is too long, users will leave the site. Other issues to consider are time-outs on pages that require logins: if the load time is too long, users may be thrown out due to a time-out. Database problems may occur if the connection speed is too low, causing data loss. A minimal response-time check is sketched after the summary below. Summary:
•
Connection speed: 14.4, 28.8, 33.6, 56.6, ISDN, cable, DSL, T1, T3
•
Time-out
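A minimal response-time check is sketched below: the page must load within a time budget, otherwise the test fails. The URL and the budget are assumptions; repeating the measurement over slower connections gives the fuller picture.

```python
# Sketch of a simple response-time check against a time budget.
import time
import urllib.request

URL = "http://test-server.example/home"   # hypothetical page
BUDGET_SECONDS = 8.0                      # acceptable load time for the target group

start = time.perf_counter()
try:
    with urllib.request.urlopen(URL, timeout=BUDGET_SECONDS) as resp:
        resp.read()
    elapsed = time.perf_counter() - start
    verdict = "PASS" if elapsed <= BUDGET_SECONDS else "FAIL"
    print(f"{verdict}: page loaded in {elapsed:.2f} s (budget {BUDGET_SECONDS} s)")
except Exception as exc:                  # includes socket time-outs
    print(f"FAIL: page did not load within {BUDGET_SECONDS} s ({exc})")
```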
2. Load
What is the estimated number of users per time period and how will it be distributed over the period? Will there be peak loads and how will the system react? Can your site handle a large amount of users requesting a certain page? Load testing is done to measure the performance at a given load level, to assure that the site works within the performance requirements. The load level may be a certain number of users using your site at the same time, or a large amount of data transactions from users, such as online ordering. Resources: Rational TestManager (http://www.rational.com/) Astra LoadTest (http://www.merc-int.com/) Summary:
•
Many users requesting a certain page at the same time or using the site simultaneously
•
Large amount of data from users
3. Stress
Stress testing is done in order to actually break a site or a certain feature, to determine how the system reacts. Stress tests are designed to push and test system limitations and to determine whether the system recovers gracefully from crashes. Hackers often stress systems by providing loads of wrong in-data until the system crashes, and then gain access to it during start-up. Typical areas to test are forms, logins or other information transaction components. Resources: Rational TestManager (http://www.rational.com/) Astra LoadTest (http://www.merc-int.com/) Summary:
•
Performance of memory, CPU, file handling etc.
•
Error in software, hardware, memory errors (leakage, overwrite or pointers)
4. Continuous use
Is the application, or certain features of it, going to be used only during certain periods of time, or will it be used continuously 24 hours a day, 7 days a week? Test that the application is able to perform under those conditions. Will downtime be allowed, or is that out of the question? Verify that the application is able to meet the requirements and does not run out of memory or disk space.
Security
Security is an area of immense extent and would need extensive writing to be fairly covered. We will do no more than point out the most central elements to test. First, make sure that you have a correct directory setup. You don't want users to be able to browse through directories on your server.
Logins are very common on today's web sites, and they must be error free. Make sure to test both valid and invalid login names and passwords. Are they case sensitive? Is there a limit to how many tries are allowed? Can the login be bypassed by typing the URL of a page inside the site directly in the browser (a small sketch of this check follows the summary below)? Is there a time-out limit within your site? What happens when it is exceeded? Are users still able to navigate through the site? Logfiles are very important in order to maintain security at the site. Verify that relevant information is written to the logfiles and that the information is traceable. When secure socket layers are used, verify that the encryption is done correctly and check the integrity of the information. Scripts on the server often constitute security holes and are often exploited by hackers. Test that it isn't possible to plant or edit scripts on the server without authorisation. Summary: •
Directory setup
•
Logins
•
Time-out
•
Logfiles
•
SSL
•
Scripting Languages
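Two of the checks above, direct access to a protected page and directory browsing, can be sketched as follows. Both URLs are hypothetical, and the heuristics (HTTP status codes and the typical title of an auto-generated listing) are assumptions that would need adapting to the actual server.

```python
# Sketch of two simple security checks: a protected page must not be
# reachable by typing its URL directly, and directory browsing must be disabled.
import urllib.error
import urllib.request

PROTECTED_URL = "http://test-server.example/admin/report.html"   # hypothetical
DIRECTORY_URL = "http://test-server.example/images/"             # hypothetical

def fetch(url):
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as exc:
        return exc.code, ""

# 1. Bypassing the login by requesting the page directly should be refused.
# Note: a redirect to a login page is followed transparently by urlopen and
# would appear as 200, so such a result needs manual inspection.
status, _ = fetch(PROTECTED_URL)
print(("PASS" if status in (401, 403) else "FAIL")
      + f": direct request to protected page returned {status}")

# 2. Requesting a directory should not return a server-generated file listing.
status, body = fetch(DIRECTORY_URL)
listing = "Index of /" in body            # typical title of an auto-generated listing
print(("FAIL" if listing else "PASS")
      + f": directory request returned {status}"
      + (", directory listing visible" if listing else ""))
```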
Appendix C: The Test Priority Sheet
The sheet itself is a matrix. Its rows are the test areas, grouped under Functionality (Links, Forms, Cookies, Web Indexing, Programming Language, Dynamic Interface Components, Databases), Usability (Navigation, Graphics, Content, General Appearance), Server side Interface (Server Interface, External Interface), Client side Compatibility (Platform, Browsers, Settings/Preferences, Printers), Performance (Connection speed, Load, Stress, Continuous use) and Security (General Security). Its columns are Complexity (0-3), Aim (1-3) and Testing Need.
User Manual
What it does
This tool is designed to help prioritize testing efforts. It will help distinguish the most important areas of your application.
How it does it
Area-specific questions are answered with numerical values that are multiplied with each other, creating a Testing Need score. The value is compared to other values established the same way. The higher the value, the higher the need for testing.
Step-by-Step
Before starting to use the Test Priority Sheet you need to establish certain factors. You need to be familiar with the following: 1.
Complexity (architecture, components)
2.
Aim (purpose, target group)
Answer the Questions
Answer the questions below by entering a number in the appropriate field in the Test Priority Matrix.
Complexity: 0 = Not Present, 1 = Low, 2 = Medium, 3 = High
Aim: 1 = Not Important / Low Impact of Failure, 2 = Medium Importance / Medium Impact of Failure, 3 = High Importance / High Impact of Failure
There are two questions presented under each test area below. The first is to be answered under Complexity and the second under Aim. Multiply the values on the same row with each other and put the product under Testing Need. Do this for all rows.
Result
Compare the values in the Testing Need column with each other. The higher the value, the higher the testing need. If several rows have the same value, higher priority should often be given to the area with the higher Aim value. A small sketch of this calculation is given after the questions below.
The questions
Functionality
Links: •
Are links present and to what extent?
•
Are links critical and how sensitive is the target group to bad links?
Forms:
•
Are forms present? To what extent and how advanced?
•
How critical are forms for the site and how sensitive are users to problems with forms?
Cookies:
•
Are cookies used and to what extent?
•
How critical is the use of cookies?
Web Indexing: •
How advanced and extensive is the use of techniques affecting search engine indexing?
•
How severe effects will problems with web indexing have?
Programming Language: •
To what extent are different programming languages used?
•
How critical effects can problems with Programming Language have?
Dynamic Interface Components: •
How extensive and how advanced is the use of different technologies?
•
How critical are the functions of these components and will users notice failure of these and how will they react?
Databases: •
How complex is the database architecture?
•
How critical are databases to the aim of the application and how will users respond to failure?
Usability Navigation: •
How complex is the structure of the site?
•
How severe effects will difficulty of navigation have?
Graphics: •
How extensive is the use of different graphical elements?
•
How critical is the use of graphics and how sensitive will users be to problems with graphics?
Content:
•
To what extent does the site contain complex and sensitive information?
•
How severe effects will errors in the content have?
General Appearance: •
How advanced is the site layout?
•
How sensitive are users to poor appearance?
Server side Interface Server Interface: •
How complex is the architecture?
•
To what extent is critical information dependent on communication between different servers?
External Interface: •
How complex is the architecture?
•
To what extent is critical information dependent on communication with external servers?
Client side Compatibility Platform:
To what extent does the site contain features with possible platform compatibility problems?
To what extent may users differ in platforms and how critical effects will compatibility issues have?
Browsers:
To what extent does the site contain features with possible browser compatibility problems?
To what extent may users differ in browsers and how serious effects will compatibility issues have?
Settings:
To what extent does the site contain features sensitive to different settings?
To what extent may users differ in settings?
Printers:
To what extent does the site contain information that requires printing capability?
To what extent may users differ in configuration and how frequently will they print?
Performance Connection Speed:
To what extent are there features on the site sensitive to connection speed?
To what extent will users differ in connection speed and how sensitive will they be to bad performance?
Load:
Are there features on the site that are sensitive to high loads?
Will the presence of users reach high load levels?
Stress:
Are there features on the site that are sensitive to stress situations?
How likely is it that stress situations will occur and how severe effects may it cause?
Continuous use:
Are there features on the site that are sensitive to continuous use?
Will the site or feature be continuously used and what will be the effects of problems of the feature?
Security General security:
How advanced is the security?
How critical is the security of the site?
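The calculation described in the user manual can be illustrated with a short Python sketch: each area gets a Complexity value (0-3) and an Aim value (1-3), their product is the Testing Need, and the areas are ranked. The example values below are made up.

```python
# Sketch of the Test Priority Sheet calculation. The example answers are
# made up; in practice they come from answering the two questions per area.
answers = {
    # area:               (Complexity 0-3, Aim 1-3)
    "Links":              (3, 1),
    "Forms":              (2, 3),
    "Cookies":            (0, 1),
    "Databases":          (3, 3),
    "Navigation":         (2, 3),
    "Browsers":           (3, 2),
    "Connection speed":   (1, 3),
    "General Security":   (2, 3),
}

# Testing Need is the product of the two factors.
testing_need = {area: c * a for area, (c, a) in answers.items()}

# Rank by Testing Need; ties are broken by the higher Aim value,
# as recommended in the user manual above.
ranked = sorted(answers, key=lambda area: (-testing_need[area], -answers[area][1]))

print(f"{'Area':<18}{'Complexity':>11}{'Aim':>5}{'Testing Need':>14}")
for area in ranked:
    c, a = answers[area]
    print(f"{area:<18}{c:>11}{a:>5}{testing_need[area]:>14}")
```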
Appendix D: Methodology Evaluation Scores
Person 1: ASP and navigational issues.
                                  Complexity (0-3)   Aim (1-3)   Testing Need
Functionality
  Links                                   3              1             3
  Forms                                   2              3             6
  Cookies                                 0              -             0
  Indexing                                3              1             3
  Programming Language                    1              2             2
  Dynamic Interface Components            3              3             9
Usability
  Navigation                              3              3             9
  Graphics                                3              1             3
  Content                                 3              3             9
  General Appearance                      3              1             3
Server side Interface
  Server Interface                        -              -             -
  External Interface                      -              -             -
Client side Compatibility
  Platform                                -              -             -
  Browsers                                3              2             6
  Settings, Preferences                   1              2             2
  Printers                                1              1             1
Performance
  Connection speed                        3              3             9
  Load                                    2              3             6
  Stress                                  2              3             6
  Continuous use                          3              3             9
Security
  General Security                        1              3             3
Functionality
(03)
Testing Need
Complexity
Aim
Person 2: ASP, Database connections, component design.
(13)
Links
1
3
Functions, admin. docs.
3
Forms
2
3
Cookies
0
1
Indexing
3
3
Programming Language
2
3
6
Dynamic Interface Components
2
3
6
Navigation
3
3
9
Graphics
1
3
3
Content
2
3
6
Platform
2
3
6
Browsers
3
3
9
Settings, Preferences
1
2
2
Printers
1
1
1
Connection speed
1
3
3
Load
2
3
6
6 Database searches
9
Usability
General Appearance Server side Interface Server Interface External Interface Client side Compatibility
Performance
Stress Continuous use
3
2
6
1
2
2
Security General Security
Person 3: ASP, SQLscript, database search functionality.
                                  Complexity (0-3)   Aim (1-3)   Testing Need
Functionality
  Links                                   1              3             3
  Forms                                   3              3             9
  Cookies                                 0              -             0
  Indexing                                0              -             0
  Programming Language                    3              1             3
  Dynamic Interface Components            3              3             9
Usability
  Navigation                              2              3             6
  Graphics                                1              2             2
  Content                                 2              3             6
  General Appearance                      1              2             2
Server side Interface
  Server Interface                        2              3             6
  External Interface                      1              1             1
Client side Compatibility
  Platform                                1              1             1
  Browsers                                2              3             6
  Settings, Preferences                   1              1             1
  Printers                                1              3             3
Performance
  Connection speed                        3              3             9
  Load                                    2              2             4
  Stress                                  3              1             3
  Continuous use                          3              3             9
Security
  General Security                        2              3             6
References:
Internet:
Bach, James, James Bach on Risk-based Testing, 2000; http://www.stickyminds.com/
Hower, Rick, Software QA/Test Resource Center, 1996-2000; http://www.softwareqatest.com/
Iberle, Kathleen A., Step By Step Test Design, 2000; http://www.stickyminds.com
Nguyen, Hung Q., Testing Web-Based Applications, Analyzing and reproducing errors in web environment, 2000; http://www.stickyminds.com/
Soberano, Vincent, White Paper on Web Testing, Meeting the Unique Challenges of Testing Web Applications, 1998; http://members.spree.com/oceansurfer/webtesting.htm
Van Tongeren, Tim, Web Testing, 1998; http://www.csgp.com/its/articles/websitetesting.html
Zakon, Robert, Hobbes' Internet Timeline v5.2, 1993-2000; http://www.isoc.org/guest/zakon/Internet/History/HIT.html
Nielsen, Jakob, Web Usability: Why and How, 1998; http://www.zdnet.com/devhead/stories/articles/0,4413,2137433,00.html
Microsoft, Microsoft Accessibility, Technology for Everyone, 2000; http://www.microsoft.com/enable, http://www.microsoft.com/enable/dev/betatest.htm#prioritize
Books:
Fewster, Mark and Graham, Dorothy, Software Test Automation, Effective use of test execution tools, 1999; Addison-Wesley, Harlow; ISBN 0-201-33140-3
Hetzel, Bill, The Complete Guide to Software Testing, 2nd edition, 1988; Wiley-QED, New York; ISBN 0-471-56567-9
Nguyen, Hung Q., Testing Applications on the Web, Test Planning for Internet-Based Systems, 2001; Wiley, New York; ISBN 0-471-39470-X
Powell, Thomas A., Web Site Development, Beyond Web Page Design, 1998; Prentice Hall; ISBN 0-13-650920-7
Compendium:
Schaefer, Hans, Testning (in-company course), Sigma Systems AB, 4-6 September 2000, ITQ
Paper:
Kaner, Cem, Improving the Maintainability of Automated Test Suites, 1997; presented at Quality Week '97
Other:
Nielsen, Peter, from a Sigma nBit AB internal course in Test & Verification, November 2000
Other articles from the Internet read:
Marick, Brian, When Should a Test be Automated?; http://www.testing.com/writings/automate.pdf
Hower, Rick, Beyond Broken Links, Internet Systems, July 1997; http://www.dbmsmag.com/9707i03.html
Powers, Mike, Why Test the Web? How Much Should You Test?, January 2000; http://www.data-dimensions.com/testersnet/wince.htm
Kaufman, Eric, Testing Your Web Site, November 1999; http://www.data-dimensions.com/testersnet/wince.htm
Bach, James, Testing Internet Software, December 1996; http://www.data-dimensions.com/testersnet/wince.htm
Ocampo, Gerry, Testing Considerations for Web-Enabled Applications, September 1999; http://www.data-dimensions.com/testersnet/wince.htm
Other books read:
Mosley, Daniel J., Client-Server Software Testing on the Desktop and the Web, 2000; Prentice Hall PTR, Upper Saddle River; ISBN 0-13-183880-6
Perry, William E., Effective Methods for Software Testing, 2nd edition, 2000; Wiley, New York; ISBN 0-471-35418-X
Other papers read:
Earls, Alan, True Test of the Web, Information Week, issue 718, 01/25/99, p1A, 5p
Useful resources:
http://www.stickyminds.com
http://www.io.com/~wazmo/qa.html
http://www.mtsu.edu/~storm
http://www.pcwebopaedia.com
http://www.softwareqatest.com
http://www.data-dimensions.com
http://www.rational.com
http://www.merc-int.com
http://www.compuware.com
http://www.microsoft.com/enable
http://www.io.com/~wazmo