Regression Testing: "What" to Test and "When"

Regression testing is often seen as an area in which companies hesitate to allocate resources. We often hear statements such as: "The developer said the defect is fixed. Do we need to test it again?" And the answer should be: "Well, the developer probably said the product had no defects to begin with." The truth of the matter is that, in today's world of extremely complex devices and software applications, the quantity and quality of regression testing performed on a product are directly proportional to the commitment vendors have to their customer base. This does not mean that the more regression testing, the better. It simply means that we must make sure that regression testing is done in the right amount and with the right approach.

The two main challenges presented by regression testing are:

1. What do we test?
2. When do we test it?

The purpose of this article is to outline a few techniques that will help us answer these questions.

The first issue we should consider is that it is not necessary to wait until the end of our testing cycle to execute our regression tests. Much of the regression effort can be accomplished in parallel with all other testing activities. The supporting assumption for this approach is: "We do not wait until all testing is done to fix our defects." Therefore, much of the regression effort can be accomplished long before the end of the project, if the project is of reasonable length. If our testing effort will only last one week, the following techniques may have to be modified. However, it is unusual for a product to be tested in such a short period of time. Furthermore, as you study the techniques outlined below, you will see that as the project's length increases, the benefits offered by these techniques also increase.

To answer the questions of what we should test and when, we will begin with a simple suite of ten tests. In the real world, this suite would obviously be much larger, and not necessarily static, meaning that the number of tests can increase or decrease as the need arises. After our first test run with the first beta (which we will call "Code Drop 1") of our hypothetical software product, our matrix looks like this:

Test ID        1     2     3     4     5     6     7     8     9     10
Status CD 1    Pass  Fail  Fail  Pass  Pass  Pass  Fail  Fail  Fail  Pass
Defect Number        1     1                       2     3     4

In the matrix above, we have cross-referenced the defects we found with the tests that exposed them. As you can see, defect number 1 was found by test 2, but it also occurred on test 3. Each of the remaining failures exposed a unique defect.

As we prepare to execute our second test run (Code Drop 2), we must decide which tests will be executed. The rules we will use only apply to our regression effort. There are also rules we can apply to the subset of tests that have passed, in order to find out which ones we should re-execute. However, that will be the topic of another article.

The fundamental question we must now ask is: "Have any of the defects found been fixed?" Let us suppose that defects 1, 2, and 3 have, in fact, been reported as fixed by our developers. Let us also suppose that three more tests (11, 12, and 13) have been added to our test suite. After Code Drop 2, our matrix looks as follows:

Test ID        1     2          3          4     5     6     7     8     9     10    11    12    13
Status CD 1    Pass  Fail       Fail       Pass  Pass  Pass  Fail  Fail  Fail  Pass
Defect Number        1          1                            2     3     4
Status CD 2          Pass       Pass                         Fail  Pass              Pass  Pass  Fail
Defect Number        1 - Fixed  1 - Fixed                    2                                   New

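The choice of tests for Code Drop 2 can be derived mechanically from the Code Drop 1 results. The following is a minimal sketch, not a definitive implementation: the dictionaries mirror the matrix above, and the function name `select_for_next_drop` and all variable names are hypothetical, introduced only for illustration.

```python
# Code Drop 1 results from the matrix: test ID -> status.
cd1_status = {1: "Pass", 2: "Fail", 3: "Fail", 4: "Pass", 5: "Pass",
              6: "Pass", 7: "Fail", 8: "Fail", 9: "Fail", 10: "Pass"}

# Defect recorded against each failing test (tests 2 and 3 share defect 1).
defect_for_test = {2: 1, 3: 1, 7: 2, 8: 3, 9: 4}

# Defects the developers reported as fixed before Code Drop 2.
reported_fixed = {1, 2, 3}

# Tests added to the suite after Code Drop 1.
new_tests = [11, 12, 13]


def select_for_next_drop(status, defect_for_test, reported_fixed, new_tests):
    """Return the sorted test IDs to execute on the next code drop.

    A failed test is re-executed only if its defect was reported as fixed.
    New tests are always executed. Passed tests are left out here; deciding
    when to re-run them is a separate judgment (code turmoil, importance).
    """
    retests = [
        test_id
        for test_id, result in status.items()
        if result == "Fail" and defect_for_test[test_id] in reported_fixed
    ]
    return sorted(retests + list(new_tests))


print(select_for_next_drop(cd1_status, defect_for_test, reported_fixed, new_tests))
# prints [2, 3, 7, 8, 11, 12, 13]
```

Test 9 is absent from the result because defect 4 has not been reported as fixed, which matches the empty Code Drop 2 cell for that test.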
A few key points to notice about the Code Drop 2 matrix are:

- Of the tests that previously failed, only those associated with defects that were reported as fixed were executed. Test number 9, which exposed defect number 4, was not executed on Code Drop 2, because defect number 4 has not been fixed.
- Defect number 1 is confirmed as fixed, because tests 2 and 3 now pass.
- Test number 7 still fails. Therefore, defect number 2 remains open.
- Test number 13 is a new test, and it exposed a new defect.
- We chose not to execute tests that had passed on Code Drop 1. This may often not be the case, since turmoil in our code or the area's importance (such as a new feature, an improvement to an old feature, or a feature that is a key selling point of the product) may prompt us to re-execute these tests.

This simple but efficient approach ensures that our matrix will never look like the matrix below. (To show the problem more clearly, we omit the defect numbers recorded after each code drop.) We will also consider Code Drop 5 to be our final regression pass.

Test ID        1     2     3     4     5     6     7     8     9     10    11    12    13
Status CD 1    Pass  Fail  Fail  Pass  Pass  Pass  Fail  Fail  Fail  Pass
Status CD 2    Pass  Pass  Pass  Pass  Pass  Pass  Fail  Pass  Fail  Pass  Pass  Pass  Fail
Status CD 3    Pass  Pass  Pass  Pass  Pass  Pass  Fail  Pass  Fail  Pass  Pass  Pass  Fail
Status CD 4    Pass  Fail  Pass  Pass  Pass  Pass  Pass  Pass  Fail  Pass  Pass  Pass  Fail
Status CD 5    Pass  Pass  Pass  Pass  Pass  Pass  Pass  Pass  Pass  Pass  Pass  Pass  Fail

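As a rough tally of the effort recorded in this matrix, the sketch below counts how often the tests that never failed were executed. It assumes, in line with the rules listed later in this article, that two passing runs per test would have been enough. The variable names are hypothetical.

```python
# Status per code drop, copied from the matrix (P = Pass, F = Fail).
# Tests 11-13 did not exist at Code Drop 1, so that row has ten entries.
rows = {
    "CD1": "P F F P P P F F F P",
    "CD2": "P P P P P P F P F P P P F",
    "CD3": "P P P P P P F P F P P P F",
    "CD4": "P F P P P P P P F P P P F",
    "CD5": "P P P P P P P P P P P P F",
}

# Build test ID -> list of results in code-drop order.
history = {}
for statuses in rows.values():
    for test_id, result in enumerate(statuses.split(), start=1):
        history.setdefault(test_id, []).append(result)

# Tests that passed on every execution.
always_passed = [t for t, runs in history.items() if "F" not in runs]

# How often those tests were actually executed.
executions = sum(len(history[t]) for t in always_passed)

# Assumption: two passing runs per test would have been sufficient.
sufficient = 2 * len(always_passed)

print(always_passed)           # prints [1, 4, 5, 6, 10, 11, 12]
print(executions, sufficient)  # prints 33 14
```

Under that assumption, 19 of the 33 executions of these seven tests added no information, unless code turmoil or feature importance justified them.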
We will address tests 2, 7, and 9 later, but here are a few key points to notice about this matrix:

- Why were tests 1, 4, 5, 6, 10, 11, and 12 executed up to five times? They passed every single time.
- Why were tests 3 and 8 executed up to five times? They failed once and then passed after the defects were fixed. Did they need to be executed on every code drop after the failure?
- If test 13 failed, was the testing team erroneously told it had been fixed on each code drop? If not, why was it executed four times with the same result? We can also ask the question: "Why isn't it fixed?" But we will not concern ourselves with that issue, since we are only addressing the topic of regression.

In conclusion, we will list some general rules we can apply to our testing effort that will ensure our regression efforts are justified and accurate. These rules are:

1. A test that has passed twice should be considered regressed, unless turmoil in the code (or other reasons previously stated, such as a feature's importance) indicates otherwise. By this we mean that a test should be executed more than twice only if changes to the code in the area the test exercises (or the importance of the particular feature) raise sufficient concern about the test's state or the feature's condition.

2. A test that has failed once should not be re-executed unless the developer informs the test team that the defect has been fixed. This is the case for tests 7 and 9. They should not have been re-executed until Code Drops 4 and 5 respectively.

3. We must implement accurate algorithms to determine which tests that have already passed once should be re-executed, so that we catch situations such as that of test number 2. This test passed twice after its initial failure and then failed again on Code Drop 4. As an additional note of caution: "When in doubt, execute."

4. For tests that have already passed once, the second execution should be reserved for the final regression pass, unless turmoil in the code indicates otherwise, or unless we do not have enough tests to execute. However, we must be careful. Although executing the second pass early allows us to get some of the regression effort out of the way earlier in the project, it may limit our ability to find defects introduced later in the project.

5. The final regression pass should not consist of more than 30% to 40% of the total number of tests in our suite. This subset should be allocated using the following priorities:

   a. All tests that have failed more than once. By this we mean tests that failed, were reported as fixed by the developer, and yet failed again, either immediately after the fix or at some time during the remainder of the testing effort.
   b. All tests that failed once and then passed once they were reported as fixed.
   c. All, or a carefully chosen subset, of the tests that have passed only once.
   d. If there is still room to execute more tests, execute any other tests that do not fit the criteria above but that you feel should nevertheless be executed.

These common-sense rules will ensure that regression testing is done smartly and in the right amount. In an ideal world, we would have the time and the resources to test our product completely. Nevertheless, today's world is a world of tight deadlines and even tighter budgets. Wise resource expenditure today will ensure our ability to continue to develop reliable products tomorrow.
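As a closing illustration, the priorities in rule 5 can be sketched in a few lines. This is one possible reading under stated assumptions, not a definitive implementation: the function name `final_regression_pass`, the `max_percent` parameter, and the ten-test history are hypothetical. The sketch also assumes rule 2 was followed, so any failure after the first one means the test failed again after a reported fix. Priority (d) is left to the tester's judgment and is not automated.

```python
def final_regression_pass(history, max_percent=40):
    """Pick tests for the final pass in priority order (a), (b), (c).

    history maps test ID -> list of "Pass"/"Fail" results in execution order.
    The selection is capped at max_percent of the suite size.
    """
    budget = len(history) * max_percent // 100

    def priority(runs):
        fails = runs.count("Fail")
        if fails >= 2:
            return 0      # (a) failed more than once
        if fails == 1 and runs[-1] == "Pass":
            return 1      # (b) failed once, then passed after the fix
        if runs == ["Pass"]:
            return 2      # (c) passed only once
        return None       # left to judgment, see (d)

    ranked = []
    for test_id, runs in history.items():
        rank = priority(runs)
        if rank is not None:
            ranked.append((rank, test_id))
    ranked.sort()  # by priority first, then by test ID

    return [test_id for _, test_id in ranked[:budget]]


# Hypothetical ten-test suite, used only to exercise the function.
history = {
    1: ["Pass", "Pass"],
    2: ["Fail", "Pass", "Fail", "Pass"],
    3: ["Fail", "Pass"],
    4: ["Pass"],
    5: ["Pass"],
    6: ["Pass", "Pass"],
    7: ["Fail", "Fail", "Pass"],
    8: ["Fail", "Pass"],
    9: ["Fail"],
    10: ["Pass"],
}

print(final_regression_pass(history))  # prints [2, 7, 3, 8]
```

With a 40% cap on ten tests, the budget of four is consumed entirely by priorities (a) and (b), so none of the tests that passed only once make it into the final pass. Test 9 is excluded because its defect is still open.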