mss # ms238.tex; AP art. # 9; 33(4)
Applying Hotelling’s $T^2$ Statistic to Batch Processes
ROBERT L. MASON
Southwest Research Institute, San Antonio, TX 78228-0510
YOUN-MIN CHOU
The University of Texas at San Antonio, San Antonio, TX 78249-0664
JOHN C. YOUNG
McNeese State University, Lake Charles, LA 70609-2340
In this paper we show the usefulness of Hotelling’s $T^2$ statistic for monitoring batch processes in both Phase I and Phase II operations. Discussions of necessary adaptations, such as in the formulas for computing the statistic and its distribution, are included. In a Phase I operation, where the focus is on detecting and removing outliers, consideration is given to batch processes where the batch observations are taken from either a common multivariate normal distribution or a series of multivariate normal distributions with different mean vectors. In a Phase II operation, where the monitoring of future observations is of primary concern, emphasis is placed on the application of the $T^2$ statistic using a known or estimated in-control mean vector. A variety of data sets taken from different types of industrial batch processes are used to illustrate these techniques.
Introduction
Hotelling’s $T^2$ is a very versatile multivariate control chart statistic. It can be used not only in Phase I operations to identify outliers in the historical data set (HDS), but also in Phase II operations to detect process shifts using new incoming observations. In either situation, orthogonally decomposing the $T^2$ statistic using the MYT decomposition procedure (Mason, Tracy, and Young (1995, 1997)) can help identify the variables causing the signal or outlier. There are many other applications of the $T^2$ statistic in multivariate control charts. For example, procedures exist for enhancing the sensitivity of the $T^2$ statistic to the detection of small process shifts (Mason and Young (1999a)), and for adjusting the $T^2$ statistic to accommodate observation vectors that are correlated over time (Mason and Young (1999b)). A useful overview of the $T^2$ as a control statistic for multivariate processes can be found in Mason and Young (1998). In this paper we extend the use of the $T^2$ statistic to multivariate batch processes.
Dr. Mason is a Staff Analyst in the Statistical Analysis Section. He is a Fellow of ASQ.
Dr. Chou is a Professor in the Division of Mathematics and Statistics. She is a Fellow of ASQ.
Dr. Young is a Professor in the Department of Mathematics, Computer Science, and Statistics.
Journal of Quality Technology
Vol. 33, No. 4, October 2001
Multivariate statistical process control (SPC) for batch processes is an extension of the corresponding univariate methods. The most common of these univariate techniques involves the plotting of the batch sample means on an individuals chart with the control limits computed using the average moving range of the sample means. These methods are reviewed in Woodall and Thomas (1995). Multivariate techniques for batch processes have only recently been developed. An excellent summary of the literature on batch processes is given in Nomikos and MacGregor (1995), and it also includes a method for monitoring batch processes that is based on multiway principal-component analysis.
Most industrial processing units consist of three components: input, processing, and output. Control procedures may be established on any or all of the process components. When the processing unit uses batches as input, we label it a batch process (e.g., see Fuchs and Kenett (1998)). For example, in the production of certain grades of silica specially prepared feedstock is required as input for a given production run. However, variation in feedstock preparation produces different batches in production.
Batches also occur in processes where the entire processing unit is changed at specified time intervals. Many types of reactors are run in this manner. At the end of a run cycle the reactor is shut down and refurbished, and production resumes with a newly rebuilt reactor. Although the relationships among the process variables are maintained, variation in the start position (due to refurbishing) produces different runs.
Processes that may purposely be shifted to a different configuration to change production from one product to another product inherently generate batches. This is a common occurrence in the plastics industry and job shops where many different products are made at different times. Feedstock remains constant in this type of process, but the processing component is changed to accommodate the new product grade being produced. The runs for each specified product comprise a batch process since the runs are separated by the production of other products. The preliminary data set in a Phase I evaluation would be composed of the observations on the different runs or batches that characterize the product being produced.
In this paper we demonstrate the usefulness of the $T^2$ as a control chart statistic for batch processes occurring in both Phase I and Phase II operations.
For Phase I settings we consider preliminary data sets where the batch observations are taken from either (1) a series of multivariate normal distributions having the same covariance matrix but possibly different mean vectors; or (2) a common multivariate normal distribution with the same mean vector and covariance matrix. It is assumed that the batches are well defined and that the mean vectors and covariance matrices are known or can be estimated. For Phase II operations we consider process monitoring using new observations where there is a known or estimated target mean vector. In either phase, discussions of necessary adaptations, such as parameter estimation, are included. A secondary focus of this paper is on using the T 2 statistic to identify outliers in multivariate batch processes.
Classification of Batch Processes
Certain types of batch processes generate product similar to that produced by a steady-state continuous process. Limited between-batch variation is acceptable, but when variation becomes large the outlying batches are rejected and must be reworked. In these processes, we assume that observations come from the same p-dimensional normal distribution, $N_p(\mu, \Sigma)$, have a common mean vector, $\mu$, and have a common covariance matrix, $\Sigma$. We will label batch processes of this type as Category 1. An example of such a process arises in the production of certain types of pigments used in specialty grades of paints and coatings. Although different batches of paint are produced, customers seldom tolerate much variation in the color of the paint across the batches.
Since the batches from a Category 1 process are assumed to come from the same multivariate distribution, the resulting data should be centered as close as possible about the common mean vector, $\mu$. If $\mu$ is known and attainable, it is designated as the in-control mean vector. An example of the in-control production regions for a set of batches taken from a Category 1 process containing two process variables, $x_1$ and $x_2$, is illustrated in Figure 1. The elliptical regions represent the in-control data from the separate batches, and the darkened circles denote the corresponding batch means. Observe the closeness of the batch means to one another, and to the overall population mean, which is represented by a shaded circle.
Another type of batch process, which we will label as Category 2, produces runs with significant separation between its batch mean vectors. The batch observations in this type of process come from different multivariate normal distributions, $N_p(\mu_i, \Sigma)$, where $\mu_i$, $i = 1, 2, \ldots, k$, represents the population mean vector of the ith batch. Nevertheless, all batches produced in a Category 2 process must be contained in an acceptable region defined by customer satisfaction.
The differences among the batch means can be due to known or unknown causes. As an example, consider a bivariate Category 2 batch process where the operation is chemical in nature and depends on the addition of a particular catalyst. A new “barrel” of catalyst is used on each batch produced. Although the catalyst concentration for a given barrel is fixed, the concentration level varies within acceptable process limits between barrels. Thus, the output is dependent on the level of the catalyst.
www.asq.org
FIGURE 1. In-Control Production Regions for Batches from a Category 1 Process.
FIGURE 2. In-Control Production Regions for Batches from a Category 2 Process.
An example of the output from such a process is illustrated in Figure 2. The small ellipses represent the in-control production regions of the individual batches while the large ellipse depicts the in-control production region for the overall process. The vertical axis on the right side of the graph represents the level of the catalyst. Changing the level of the catalyst moves the mean of the distribution of the process variables for a given batch to a different position. The relationship between the two process variables, $x_1$ and $x_2$, is represented by the orientation of the small ellipses centered about the batch mean points. Observe the positive correlation of the two process variables within a batch, although there is negative correlation between the batch mean components.
Since the batches from a Category 2 process are assumed to come from different distributions, no specific target mean vector is specified. However, all batches should produce data that are located within a fixed statistical distance from the overall mean of all the batches when the process is stable. For example, the large elliptical region of Figure 2 defines the in-control region of batch production for the given set of batches.
Control procedures for batch processes, whether the type is Category 1 or Category 2, must address two issues. The first concerns the determination of whether the relationships between and among the process variables are being maintained relative to those observed in the historical database. The second concerns the determination of whether the process is in statistical control. For a Category 1 batch process, where a single multivariate normal distribution is applicable, the two determinations are equivalent. However, for a Category 2 batch process, where different multivariate normal distributions can exist, the determinations can be different. For example, several batches may be maintaining in-control relationships between the process variables yet may be out-of-control relative to the overall mean vector.
Phase I Operation
The construction of an in-control data set in a Phase I multivariate control chart operation for batch data includes the investigation of some of the same potential problems as occur when constructing control charts for non-batch data. For example, an analyst must decide which variables to monitor, select the correct transformations for the chosen variables, and determine if the observations are independent or correlated over time. Consideration also must be given to the effects of missing observations, and the data set must be checked for the possibility of severe collinearities existing among the variables.
Some problems require different solutions with batch processes. These include determining whether any outliers are contained in the data set and deciding on an estimation procedure for the common covariance matrix. In addition, the batch means must be examined to determine if they differ significantly from one another. Unfortunately, these problems are interrelated. For example, significant differences between batch means may be due to the presence of outliers. Outliers, in turn, can have a significant influence on parameter estimation. However, outliers cannot be detected without proper estimates of the parameters.
Batch Size = 1
Suppose we collect data on k batches taken from either a Category 1 or a Category 2 batch process. Assume the batch size is one (i.e., $n_i = 1$, $i = 1, \ldots, k$) so that a single observation is made on p variables from each batch. Thus, the total sample size is $N = \sum_{i=1}^{k} n_i = k$. In this setting, individual outliers can be detected by computing the $T^2$ value for each observation, $X$, and comparing the value to an upper control limit (UCL). The procedure for purging outliers for batch processes where a single observation is made on each batch is equivalent to the procedure for continuous production processes. The corresponding $T^2$ statistic and its distribution for the purging procedure are given as
$$T^2 = (X - \bar{X})' S^{-1} (X - \bar{X}) \sim \frac{(N-1)^2}{N}\, B_{[p/2,\,(N-p-1)/2]}, \tag{1}$$
where $\bar{X}$ and $S$ are the common estimators, respectively, of the overall sample mean and the overall covariance matrix, and $B_{[p/2,\,(N-p-1)/2]}$ represents the beta distribution with parameters $p/2$ and $(N-p-1)/2$. The UCL is given as
$$\mathrm{UCL} = \frac{(N-1)^2}{N}\, B_{[\alpha,\,p/2,\,(N-p-1)/2]}, \tag{2}$$
where $B_{[\alpha,\,p/2,\,(N-p-1)/2]}$ is the upper αth quantile of $B_{[p/2,\,(N-p-1)/2]}$.
It is useful to note that the $T^2$ statistics in a Phase I operation are not independent of one another. This occurs because the observation $X$ is not independent of $\bar{X}$ and $S$. However, in a Phase II operation future observations are independent of the estimated parameters obtained in Phase I, but the plotted $T^2$ statistics are dependent. More details on these results can be found in Tracy, Young and Mason (1992).
Batch Size > 1
When the batch sizes in a Category 1 or Category 2 batch process exceed one the outlier purging procedures are more complicated. In this case, we can have outliers among the batch means or among the individual batch values. For example, Category 1 batch processes are checked for individual outliers using the $T^2$ statistic in Equation (1) and the corresponding UCL in Equation (2). However, the sample size $N$ no longer reduces to k but has the general form $N = \sum_{i=1}^{k} n_i$.
Since observations may be taken from different distributions for the batches in a Category 2 batch process, the definition and detection of outliers changes from those given for a Category 1 batch process. With batches, the overall sample mean in Equation (1) is equivalent to the weighted average of the batch means; i.e.,
$$\bar{X} = \frac{1}{N}\sum_{i=1}^{k}\sum_{j=1}^{n_i} X_{ij} = \frac{1}{N}\sum_{i=1}^{k} n_i \bar{X}_i, \tag{3}$$
where $X_{ij}$ is the jth observation in the ith batch, and $\bar{X}_i$ represents the mean vector of the ith batch. Similarly, the covariance matrix estimator used in Equation (1) measures not only process variation within batches but also process variation due to differences between batches. The estimator $S$ is the total variation, $SS_T$, divided by $(N-1)$ and is written as
$$S = \frac{1}{N-1}\sum_{i=1}^{k}\sum_{j=1}^{n_i} (X_{ij} - \bar{X})(X_{ij} - \bar{X})' = \frac{SS_T}{N-1}.$$
The total variation also can be written as
$$SS_T = \sum_{i=1}^{k}\sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)(X_{ij} - \bar{X}_i)' + \sum_{i=1}^{k} n_i (\bar{X}_i - \bar{X})(\bar{X}_i - \bar{X})'$$
$$= \text{Within-Batch Variation} + \text{Between-Batch Variation} = SS_W + SS_B.$$
In a Category 1 batch process, we assume that the between-batch variation is minimal and is strictly due to random fluctuations. Thus, the corresponding elements of $SS_T$ and $SS_W$ are very similar in size. However, in a Category 2 batch process, the between-batch variation can be large and produce a considerable difference between corresponding elements of $SS_T$ and $SS_W$. Further, the ratio of the sample size to the number of variables might be small for a given batch. This could lead to a within-group covariance matrix estimate that is near singular. Individual batches also may contain statistical outliers, which tend to greatly influence the estimate of the covariance matrix for small sample sizes.
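The split of the total variation into within-batch and between-batch parts can be checked numerically. The following is a minimal sketch in plain Python, assuming two small bivariate batches whose values are invented purely for illustration; the helper names `mean_vector` and `scatter` are likewise hypothetical and not taken from the paper.

```python
def mean_vector(rows):
    """Component-wise mean of a list of equal-length observation tuples."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def scatter(rows, center):
    """Sum over rows of the outer product (x - center)(x - center)'."""
    p = len(center)
    out = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [r[j] - center[j] for j in range(p)]
        for a in range(p):
            for b in range(p):
                out[a][b] += d[a] * d[b]
    return out


# Hypothetical data: two batches with clearly different mean vectors.
batches = [
    [(2.1, 1.4), (2.5, 1.9), (2.3, 1.6)],
    [(4.0, 4.5), (4.4, 4.9), (4.3, 4.7), (4.5, 5.0)],
]
all_rows = [r for b in batches for r in b]
grand = mean_vector(all_rows)
p = len(grand)

sst = scatter(all_rows, grand)            # total variation about the overall mean
ssw = [[0.0] * p for _ in range(p)]       # within-batch variation
ssb = [[0.0] * p for _ in range(p)]       # between-batch variation
for b in batches:
    m = mean_vector(b)
    w = scatter(b, m)
    d = [m[j] - grand[j] for j in range(p)]
    for a in range(p):
        for c in range(p):
            ssw[a][c] += w[a][c]
            ssb[a][c] += len(b) * d[a] * d[c]

# Largest element-wise discrepancy between SST and SSW + SSB.
max_gap = max(
    abs(sst[a][c] - ssw[a][c] - ssb[a][c]) for a in range(p) for c in range(p)
)
```

With batch means this far apart, the between-batch elements dominate the within-batch ones, which is the Category 2 situation in which $S$ overstates the within-batch process variation.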
FIGURE 3. Two Batches of Data With Outliers and Similar Correlations.
FIGURE 4. Bivariate Data for Two Separate Batches.
As an illustration of the effects of between-batch variation on a Category 2 process, consider the plot given in Figure 3 of a set of data taken from a bivariate process composed of two separate batches and having an outlier in each batch. The orientation of the data implies that $x_1$ and $x_2$ have the same correlation in each batch, but the batch separation implies the two batches have different mean vectors. If the batch classification is ignored, then the overall sample covariance matrix, $S$, is based on deviations taken from the overall mean. However, given the large separation between the two sets of data, a better estimator of process variation could be obtained by taking a weighted average (weighted on the degrees of freedom) of the two separate within-batch covariance matrix estimators. This estimator, in general, is given by
$$S_W = \frac{\sum_{i=1}^{k} (n_i - 1) S_i}{N - k} = \frac{SS_W}{N - k}, \tag{4}$$
where Si is the covariance matrix estimate for the ith batch. The problems noted above can be circumvented by subtracting the corresponding estimated batch mean vectors from each set of batch data prior to data analysis. With this translation, the mean differences among the batches are removed and all data are centered at the origin. Translating the batch data also will provide the appropriate estimate of the common covariance matrix. Consider the scatter plot of observations presented in Figure 4. Suppose the graph represents data taken from two separate independent batches from a Category 2 batch process involving two variables. There is an indication in the plot that both batches have observations that are potential outliers, since several
of the data points are separated from their respective batch means. Also, notice the similarity of the two batches with respect to their orientation and variation, even though their mean vectors are different. This is substantiated by examination of the summary statistics presented in Table 1 for the two batches of data.
Translation to the origin is achieved by subtracting the estimates of the respective batch means from the corresponding batch data. For the Batch 1 sample in Table 1 we compute $(x_1 - 2.367)$ and $(x_2 - 1.665)$, whereas for the Batch 2 sample we compute $(x_1 - 4.289)$ and $(x_2 - 4.714)$. This re-locates both batches at a common origin. As noted from the last column of Table 1, the variation statistics for the combined translated data are very similar to those for the individual batches. The correlation between the translated $x_1$ and $x_2$ is 0.909, the variance of the translated $x_1$ is 0.105, and the variance of the translated $x_2$ is 0.140. These values compare favorably with the statistics of either batch, and they reinforce the usefulness of the mean translation. The graph of the overall mean-translated data set is presented in Figure 5.

TABLE 1. Summary Statistics for Two Batches of Data

                    Batch 1   Batch 2   Translated
Sample Size           245       272        517
x1 Mean              2.367     4.289      0.000
x2 Mean              1.665     4.714      0.000
x1 Variance          0.107     0.105      0.105
x2 Variance          0.140     0.140      0.140
Corr (x1, x2)        0.914     0.904      0.909

FIGURE 5. Translation of Two Batches to Common Origin.

Although the use of centering appears to be very helpful, a close examination of the data in Figure 5 reveals the continued presence of some mean-vector
differences between the two batches. Observe that the bulk of the translated observations of Batch 1 are located in the first quadrant with a lesser amount located in the third quadrant; however, the opposite occurs with the translated observations of Batch 2. This mean-vector difference is unexpected and occurs primarily because the frequency distributions of the two sets of batch data are not symmetrical (as in a normal distribution), but skewed. This disparity between the batches can be attributed to the effects of operator control on the process. The first variable (x1 ) is an operator-control variable and the second variable (x2 ) moves in response to changes in x1 . In addition, there are numerous other “lurking” variables having great influence on x2 . Lead operators (on different shifts) run the process differently, but obtain the same overall process results (e.g., the same production level or the same product). Thus, although the correlation between the two variables remains relatively constant, the mean vectors across the two batches differ due to operator differences. The above data example poses an interesting problem. If we use a mean translation, all the transformed observations from a Category 2 batch process will have a p-variate normal distribution, Np (0,Σ), with mean vector 0 and covariance matrix Σ. This result is based on the assumption that the untransformed data have a multivariate normal distribution. However, as seen in the data in Figure 5, there may be situations where the underlying distribution is nonnormal. Because this has occurred in this example, it is useful to briefly describe one possible solution to this assumption violation. To do so we will consider the distribution, not of the original data, but of the T 2 values. If this distribution is similar to the beta distribution given in (1), we will consider it appropriate to continue using the T 2 charting procedure.
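The mean translation and the pooled estimator of Equation (4) lead to the same covariance estimate when the scatter of the translated data is divided by $N - k$. A minimal sketch of that equivalence follows, in plain Python. The data values are invented (they are not the 517 observations of Table 1), and the helper names are hypothetical.

```python
def mean_vector(rows):
    """Component-wise mean of a list of equal-length observation tuples."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def scatter_about_mean(rows):
    """Sum of outer products of deviations from the rows' own mean vector."""
    m = mean_vector(rows)
    p = len(m)
    ss = [[0.0] * p for _ in range(p)]
    for r in rows:
        d = [r[j] - m[j] for j in range(p)]
        for a in range(p):
            for b in range(p):
                ss[a][b] += d[a] * d[b]
    return ss


# Hypothetical bivariate batches with similar spread but different means.
batches = [
    [(2.1, 1.4), (2.5, 1.9), (2.3, 1.6), (2.6, 1.8)],
    [(4.0, 4.5), (4.4, 4.9), (4.3, 4.7), (4.5, 5.0), (4.2, 4.6)],
]
k = len(batches)
N = sum(len(b) for b in batches)
p = 2

# Translate each batch to the origin by subtracting its own mean vector.
translated = []
for b in batches:
    m = mean_vector(b)
    translated.extend(tuple(x - mj for x, mj in zip(r, m)) for r in b)

# Route 1: scatter of the pooled translated data, divided by N - k.
ss_translated = scatter_about_mean(translated)
s_w_translated = [[v / (N - k) for v in row] for row in ss_translated]

# Route 2: Equation (4), the degrees-of-freedom weighted average of the S_i.
s_w_pooled = [[0.0] * p for _ in range(p)]
for b in batches:
    ss_i = scatter_about_mean(b)  # equals (n_i - 1) * S_i
    for a in range(p):
        for c in range(p):
            s_w_pooled[a][c] += ss_i[a][c] / (N - k)

max_gap = max(
    abs(s_w_translated[a][c] - s_w_pooled[a][c])
    for a in range(p) for c in range(p)
)
```

Note the divisor: a routine that applies the usual $N - 1$ to the translated data returns $SS_W/(N-1)$, which is slightly smaller than $S_W$ when k is small relative to N.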
The distribution of the T 2 statistic and its applicability to a particular situation can be readily established using a Q-Q plot (e.g., see Seber (1984)). For example, if one considers the transformed data of Batch 1 in Figure 5, there are three obvious outlying points. If these are removed during the data cleaning procedures of Phase I and a corresponding Q-Q plot is constructed using the beta distribution in (1), then the resulting plot is given in Figure 6. The trend in the plot lies along a 45-degree line and indicates that the beta distribution provides a good fit to the T 2 values for Batch 1. The Batch 2 data in Figure 5 appear to have seven outlying points that could be removed in the data cleaning effort. With their removal the resulting QQ plot, given in Figure 7, is similar to that obtained for the Batch 1 data. The trend is again highly linear, indicating the validity of the beta distribution assumption for the T 2 values for Batch 2. The Q-Q plot of the combined batches (with the combined 10 outliers removed) is presented in Figure 8. As expected, the trend is highly linear, and the results indicate that, even with the disparity observed due to operator differences, the departure from the specified beta distribution used in describing the T 2 statistic for separate or combined batches is small. This small discrepancy will not affect overall conclusions drawn in using the T 2 as the control statistic for this situation. Thus, we can identify individual outliers by ignoring the batch differences, treating the translated data as a single group, and determining signaling points using the T 2 statistic in Equation (1) with the translated data. This is accomplished by plotting the T 2 values on a control chart (e.g., see Fuchs and Kenett (1998) or Wierda (1994)). Any T 2 value exceeding the UCL of the chart is declared an outlier. 
FIGURE 6. Q-Q Plot of Batch 1 With Three Outliers Removed.
FIGURE 8. Q-Q Plot of Combined Batches With Ten Outliers Removed.
Using an α = 0.01 and the $T^2$ statistic in Equation (1) with the translated data from Figure 5, we can now identify the observations located a significant distance from the center of the data (i.e., from the origin). In the first pass through the data, three observations from Batch 1 and seven observations from Batch 2 are designated as outliers (i.e., these are the same 10 observations deleted in Figure 8). The covariance matrix is re-estimated from the remaining translated observations. One observation from each batch is removed on the second pass, but a third pass detects no remaining outlying observations. Thus, there are a total of 12 observations designated as outliers, four from Batch 1 and eight from Batch 2. The outlying observations are designated with symbols in Figure 9. With the removal of these outliers, an estimate of the covariance matrix can be obtained from the remaining translated data.
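The multi-pass purging described above can be sketched for the bivariate case in plain Python. This is an illustration under stated assumptions rather than the authors' code: it handles p = 2 only, where the beta quantile in Equation (2) has a closed form because a Beta(1, b) variable has upper α quantile $1 - \alpha^{1/b}$. The data are invented, and the function names `t2_values`, `ucl_bivariate`, and `purge` are hypothetical.

```python
def t2_values(rows):
    """Equation (1) for bivariate rows, with the mean and S estimated from rows."""
    n = len(rows)
    m1 = sum(r[0] for r in rows) / n
    m2 = sum(r[1] for r in rows) / n
    s11 = sum((r[0] - m1) ** 2 for r in rows) / (n - 1)
    s22 = sum((r[1] - m2) ** 2 for r in rows) / (n - 1)
    s12 = sum((r[0] - m1) * (r[1] - m2) for r in rows) / (n - 1)
    det = s11 * s22 - s12 ** 2
    out = []
    for r in rows:
        d1, d2 = r[0] - m1, r[1] - m2
        # d' S^{-1} d written out with the closed-form 2x2 inverse.
        out.append((d1 * d1 * s22 - 2.0 * d1 * d2 * s12 + d2 * d2 * s11) / det)
    return out


def ucl_bivariate(n, alpha):
    """Equation (2) for p = 2 only, using the Beta(1, b) closed-form quantile."""
    b = (n - 3) / 2.0
    return (n - 1) ** 2 / n * (1.0 - alpha ** (1.0 / b))


def purge(rows, alpha=0.01):
    """Flag rows above the UCL, re-estimate, and repeat until a pass flags nothing."""
    kept, removed, passes = list(rows), [], 0
    while True:
        passes += 1
        ucl = ucl_bivariate(len(kept), alpha)
        t2 = t2_values(kept)
        flagged = [r for r, t in zip(kept, t2) if t > ucl]
        if not flagged:
            return kept, removed, passes
        removed.extend(flagged)
        kept = [r for r, t in zip(kept, t2) if t <= ucl]


# Hypothetical translated data: 19 positively correlated points plus one
# observation that runs against the correlation.
rows = [
    ((i - 9) * 0.1, (i - 9) * 0.1 + (0.05 if i % 2 == 0 else -0.05))
    for i in range(19)
]
rows.append((1.0, -1.0))
kept, removed, passes = purge(rows)
```

Here the first pass removes the single discordant point and the second pass finds nothing further, so the loop stops after two passes. For p > 2 the beta quantile would have to come from statistical tables or a statistics library.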
Category 1 Batch Mean Outliers
In Category 1 batch processes it is assumed that the batch mean vectors do not have significant differences. Therefore, the specification, or estimation, of the means of the p process variables plays an important role in developing control charts for future observations. When new observations are monitored, the target mean vector may be specified as $\mu_T$. Otherwise, $\mu_T$ must be estimated using the overall sample mean estimate given in Equation (3). Unfortunately, with batch data, the outlying batch means can distort the estimate of the in-control mean vector. For example, the plot given in Figure 10 represents a set of batch means with three obvious outliers located at points A, B, and C. The inclusion of Point B or Point C will shift the overall batch mean to either the left or right (respectively), while the inclusion of Point A will shift the overall mean upward and to the right.
A basic assumption in using the $T^2$ statistic with Category 1 batch data in a Phase I operation is that the process is centered. This implies that the mean vectors of the k individual batches are all equal. Translating the observations from the different batches to a common origin in order to identify outliers for individual observations will remove any possible mean differences between the batches for the translated data. However, if the assumption of equal batch means is invalid, batch mean differences among the untranslated data may go undetected, particularly if these differences appear after the purging of the individual outlier data. Given the importance of this assumption, we recommend performing a test of hypotheses to determine the batch mean vector differences.
FIGURE 7. Q-Q Plot of Batch 2 With Seven Outliers Removed.
Assume that an observation from the ith batch is distributed as $X \sim N_p(\delta_i, \Sigma)$ and that an observation from the jth batch is distributed as $X \sim N_p(\delta_j, \Sigma)$. In the form of a statistical hypothesis, the assumption of equal batch means is represented as
$$H_0: \delta_1 = \delta_2 = \cdots = \delta_k. \tag{5}$$
The alternative hypothesis is represented as
$$H_A: \delta_i \neq \delta_j \text{ for some } i \neq j.$$
FIGURE 9. Translated Batch Data With Deleted Outliers.
Under $H_0$, each observation vector $X$ is assumed to come from a common multivariate normal distribution; i.e., $X \sim N_p(\delta, \Sigma)$. Under $H_A$, at least one batch observation comes from a different multivariate normal distribution. Before testing $H_0$, the issue of batch sample size must be addressed. Even if the number of observations was originally the same for each batch, the Phase I purging of individual outliers may have reduced the sample size in several of the batches. We suggest using an average value for the batch sample size; i.e., use
$$n = \frac{1}{k}\sum_{i=1}^{k} n_i, \tag{6}$$
where $n_i$, $i = 1, 2, \ldots, k$, is the number of observations in the ith batch, and $n$ is rounded up to the nearest integer. Since most processing units take observations electronically on a regular sampling basis, the discrepancies between the $n_i$ should be small. However, variation in the length of a run cycle can produce a few extra observations per cycle.
Establishing the reasonableness of the hypothesis in Equation (5) is important. If there are no differences between the batch means, the best estimator of the overall in-control mean vector is given in Equation (3). If there are outliers, the outlying batch means will need to be excluded from the calculations in Equation (3). Using the average value in Equation (6) for sample size, we can test the validity of the hypothesis in Equation (5) with a $T^2$ statistic for detecting batch mean outliers. The statistic is given by
$$T^2 = (\bar{X}_i - \bar{X})' S_W^{-1} (\bar{X}_i - \bar{X}), \tag{7}$$
where $S_W$ is the sample covariance matrix computed using Equation (4) and the translated data with the individual outliers removed.

FIGURE 10. Plot of Batch Means With Outliers.

TABLE 2. Batch Means for Six Batches

Batch #    Mean x1    Mean x2    Mean x3    n_i
1           10.633      9.808      2.425     12
2            8.816     16.875      2.058     12
3            3.745      9.800      1.027     11
4            5.900      7.033      3.408     12
5            9.563      2.809      0.463     11
6            7.358      9.325      1.683     12
Average      7.670      9.275      1.844     12
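As a small worked check of Equation (6), the batch sizes in Table 2 can be averaged and rounded up. The sketch below uses only the $n_i$ column of the table; the variable names are illustrative.

```python
import math

# Batch sizes n_i from Table 2 (after individual outliers were purged).
batch_sizes = [12, 12, 11, 12, 11, 12]

mean_size = sum(batch_sizes) / len(batch_sizes)  # 70 / 6, about 11.67
n = math.ceil(mean_size)                         # rounded up to 12
```

The rounded value matches the entry of 12 in the Average row of the table.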
The distribution of the statistic in Equation (7), under the assumption of a true null hypothesis, is that of an F variable. The distribution (e.g., see Wierda (1994)) is given as
$$T^2 \sim \frac{p(k-1)(n-1)}{n[k(n-1) - p + 1]}\, F_{(p,\,nk-k-p+1)},$$
where $F_{(p,\,nk-k-p+1)}$ represents the F distribution with parameters $p$ and $(nk - k - p + 1)$. For a given α level, the UCL for the $T^2$ statistic is computed as
$$\mathrm{UCL} = \frac{(k-1)(n-1)p}{n[nk - k - p + 1]}\, F_{(\alpha,\,p,\,nk-k-p+1)}, \tag{8}$$
where $F_{(\alpha,\,p,\,nk-k-p+1)}$ is the upper αth quantile of $F_{(p,\,nk-k-p+1)}$. The $T^2$ value in Equation (7) is computed for each of the k batch means and compared to the UCL in Equation (8). Batch means with $T^2$ values that exceed the UCL are declared outliers. The corresponding batches are removed and a new in-control mean vector is estimated for use in Phase II operations.
To demonstrate the procedure, consider a Category 1 batch process where 12 observations are collected on three process variables for each of k = 6 batches. The estimated batch mean vectors, after individual outliers are removed, are given in Table 2. The average sample size for the batches, using Equation (6), is computed as 11.67. Thus, a value of n = 12 is used in determining the UCL in Equation (8). The correlation matrix for the three process variables is given in Table 3. To test the hypothesis of equal batch means, the $T^2$ statistic in Equation (7) is computed using the estimated covariance matrix $S_W$ employed in computing the correlation matrix. The $T^2$ values for the six batch means are presented in Table 4. For α = 0.01, n = 12, k = 6, and p = 3 the UCL in Equation (8) is computed as 0.882. Since the $T^2$ value for Batch 2 exceeds this UCL, the corresponding batch mean is declared to be significantly different from the other batch means. An examination of Table 2 reveals the reason. The mean of the variable $x_2$ for Batch 2 is 16.875, which is well above the other batch means on this variable (2.809 to 9.808). Thus, the analyst should delete Batch 2 before computing the overall batch mean vector for use in the Phase II operations on this process.

TABLE 3. Correlation Matrix for Six-Batch HDS

          x1        x2        x3
x1      1.000    −0.528    −0.043
x2     −0.528     1.000     0.519
x3     −0.043     0.519     1.000

TABLE 4. T^2 Values for Six-Batch HDS

Batch #    T^2 Value
1            0.068
2            1.062
3            0.063
4            0.220
5            0.473
6            0.001

Category 2 Batch Mean Outliers
For Category 2 batch processes, it is known that the batch means differ. However, in a Phase II operation, it is necessary to define an in-control region so that one can determine if a current batch gives evidence of process stability. The true p-dimensional center of this region is given as
$$\mu = \frac{1}{k}\sum_{i=1}^{k} \mu_i,$$
where $\mu_i$ represents the mean vector of the ith batch, and the best estimate of $\mu$ is given by $\bar{X}$ in Equation (3). Assuming the batch data can be described by a multivariate normal distribution, an in-control region based on a $T^2$ statistic can be developed. The form and distribution of the statistic is given as
$$T^2 = (\bar{X}_i - \bar{X})' S_B^{-1} (\bar{X}_i - \bar{X}) \sim \frac{(k-1)^2}{k}\, B_{(p/2,\,(k-p-1)/2)}, \tag{9}$$
where $S_B = SS_B/(k-1)$. Similarly, the UCL is given by
$$\mathrm{UCL} = \frac{(k-1)^2}{k}\, B_{(\alpha,\,p/2,\,(k-p-1)/2)}. \tag{10}$$
The $T^2$ statistic in Equation (9) is based on the estimated between-batch variation, $S_B$. Note that, in the special case of a univariate batch process, this estimated between-batch variance is based on using
the sample variance of the pooled batch means (e.g., see Woodall and Thomas (1995)). Thus, $S_B$ describes the relationships between the components of the batch mean vectors rather than of the individual observations. For example, in the Category 2 batch process example presented in Figure 2, the correlation between the two process variables is positive, while the correlation between the batch mean vector components is negative. The downward shifting of the overall process could be due to an unknown “lurking” variable, or to a known uncontrollable variable such as the catalyst concentration of the previous batch.

To demonstrate how to develop an in-control region for later use in a Phase II operation, consider a Category 2 batch process where 12 observations are collected on three process variables for each of k = 16 batches. The estimated means of the batches, after outliers are removed, are given in Table 5. The correlation matrix based on the between-batch variation is given in Table 6. The $T^2$ chart for the 16 batch means is presented in Figure 11. For α = 0.05, k = 16, and p = 3 the UCL in Equation (10) is computed as 10.263. Since no batch mean is out of statistical control, all the data can be used in computing the overall mean estimate given by Equation (3).

TABLE 5. Batch Means for 16 Batches

Batch #    x̄1          x̄2          x̄3          ni
1          161.6091    207.0545    33.42727    11
2          152.6545    205.4182    15.11909    11
3          143.9455    206.6545    15.9        11
4          149.7       200.15      16.3        12
5          158.6333    194.025     12.425      12
6          178.3727    195.2636    13.57273    11
7          133.3636    200.7909    0.000909    11
8          140.2       202.4727    0.581818    11
9          159.2       197.1       24.33636    11
10         145.2182    193.9273    9.8         11
11         150.6333    194.8083    2.925       12
12         148.8167    201.875     2.558333    12
13         143.7455    194.8       1.527273    11
14         145.9       192.0333    3.908333    12
15         149.5636    187.8091    0.963636    11
16         147.3583    194.325     2.183333    12

TABLE 6. Correlation Matrix for 16-Batch HDS

        x1        x2        x3
x1    1.000    −0.776    −0.571
x2   −0.776     1.000     0.484
x3   −0.571     0.484     1.000

FIGURE 11. $T^2$ Control Chart of HDS for a 16-Batch Process.

Phase II Operation: Category 1 Process

Before beginning a Phase II operation for a Category 1 batch process we must have an appropriate estimator of the covariance matrix. We have available the estimator $S_W$, which is based on the translated observations with individual outliers removed and is the appropriate estimator to use in testing for mean differences. However, with the removal of individual outliers and the exclusion of observations from the outlying batches that exhibit a significant mean difference, one could use the common estimator $S$ obtained from the remaining observations. For example, consider the data in Table 2. Because Batch 2 is an outlier, we would exclude it from the HDS and estimate the covariance matrix using $S$ and the data from the remaining five batches. We use $S$ rather than $S_W$ because the between-batch variation, $S_B$, is assumed to be entirely due to inherent process variation (e.g., see Alt (1982) and Wierda (1994)).
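The difference between the two covariance estimators can be illustrated with a small numerical sketch. This is not the authors' implementation: it assumes that $S$ is the ordinary sample covariance of all N observations (divisor N − 1) and that $S_W$ pools the within-batch sums of squares with divisor N − k, which is consistent with the degrees of freedom that appear in Table 7. The function names and the two toy batches are hypothetical.

```python
def mean_vector(rows):
    """Component-wise mean of a list of observation vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def scatter(rows, center):
    """Sum of outer products of deviations from `center` (a p x p matrix)."""
    p = len(center)
    ss = [[0.0] * p for _ in range(p)]
    for row in rows:
        d = [x - c for x, c in zip(row, center)]
        for i in range(p):
            for j in range(p):
                ss[i][j] += d[i] * d[j]
    return ss


def covariance_estimates(batches):
    """Return (S, S_W) for a list of batches of observation vectors.

    Assumed definitions: S uses deviations from the overall mean with
    divisor N - 1; S_W uses deviations from each batch mean, pooled
    over batches, with divisor N - k.
    """
    all_rows = [row for batch in batches for row in batch]
    N, k, p = len(all_rows), len(batches), len(all_rows[0])
    ss_total = scatter(all_rows, mean_vector(all_rows))
    ss_within = [[0.0] * p for _ in range(p)]
    for batch in batches:
        ss_b = scatter(batch, mean_vector(batch))
        for i in range(p):
            for j in range(p):
                ss_within[i][j] += ss_b[i][j]
    S = [[v / (N - 1) for v in row] for row in ss_total]
    S_W = [[v / (N - k) for v in row] for row in ss_within]
    return S, S_W


# Hypothetical data: two batches whose means differ strongly on x1.
batches = [
    [(1.0, 2.0), (2.0, 2.5), (3.0, 4.5)],
    [(11.0, 1.0), (12.0, 2.5), (13.0, 2.5)],
]
S, S_W = covariance_estimates(batches)
print("var(x1) from S:  ", S[0][0])
print("var(x1) from S_W:", S_W[0][0])
```

In this toy case the x1 variance from $S$ is much larger than the one from $S_W$, because $S$ absorbs the shift between the two batch means while $S_W$ does not.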
The estimator $S_W$ excludes between-batch variation, and thus it is not influenced by batch mean differences. However, if mean differences do not exist among the batches, the estimator $S$ is a more efficient estimator of $\Sigma$ than $S_W$ (e.g., see Wierda (1994)). Since both covariance estimators have merit depending on the circumstances, we will consider Phase II operations for monitoring future process observations using both $S$ and $S_W$. We also present control procedures for monitoring a Category 1 batch process for the four cases of interest presented below.

(i) Consider a single future observation vector, $X$, that is taken from a Category 1 batch process with an unknown target mean vector. Using the covariance matrix estimator $S$, the $T^2$ statistic for monitoring this observation is given by

$$T^2 = (X - \bar{X})' S^{-1} (X - \bar{X}),$$

where the covariance matrix, $S$, and the target mean vector estimate, $\bar{X}$, are obtained using the HDS of size N = nk.

(ii) When process control is based on the more conservative covariance estimator $S_W$, the $T^2$ statistic is given by

$$T^2 = (X - \bar{X})' S_W^{-1} (X - \bar{X}).$$

(iii) For the situation where the target mean vector, $\mu_T$, is specified and the estimator $S$ is used, the $T^2$ control statistic is given as

$$T^2 = (X - \mu_T)' S^{-1} (X - \mu_T), \qquad (11)$$

where $S$ is again obtained from the HDS.

(iv) When the in-control mean vector is specified but the estimator $S_W$ is used, the $T^2$ statistic is given as

$$T^2 = (X - \mu_T)' S_W^{-1} (X - \mu_T).$$

The $T^2$ formulas, distributions, and UCLs for the above four cases are presented in Table 7. When process control is based on the monitoring of the subgroup mean, $\bar{X}_S$, of a future sample of m observations taken on a given batch, slight modifications must be made to these formulas. These results are also contained in Table 7.
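All four cases share the same quadratic form and differ only in the center, the covariance estimator, and the factor that multiplies the F quantile in the UCL. The following is a minimal sketch under those assumptions, using only the Python standard library. The helper names (`solve`, `t_squared`, `ucl_scale`) are hypothetical, and the scale factors are transcribed from Table 7 rather than derived here.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def t_squared(x, center, cov):
    """T^2 = (x - center)' cov^{-1} (x - center).

    `center` is either the HDS mean estimate or the target mu_T, and
    `cov` is either S or S_W, depending on which of the four cases applies.
    """
    d = [xi - ci for xi, ci in zip(x, center)]
    w = solve(cov, d)  # w = cov^{-1} d without forming the inverse
    return sum(di * wi for di, wi in zip(d, w))


def ucl_scale(p, N, k, m=1, target_known=True, use_sw=False):
    """Factor multiplying the F quantile, and the F degrees of freedom.

    Transcribed from Table 7; m is the subgroup size (m = 1 for a
    single observation).
    """
    df2 = N - k - p + 1 if use_sw else N - p
    base = p * (N - k) / df2 if use_sw else p * (N - 1) / df2
    if target_known:
        return base / m, (p, df2)
    return base * (m + N) / (m * N), (p, df2)


# Toy check with a diagonal covariance matrix: 2**2 / 4 + 1**2 / 1 = 2.0
print(t_squared([3.0, 2.0], [1.0, 1.0], [[4.0, 0.0], [0.0, 1.0]]))
```

The UCL is then the returned scale multiplied by the upper αth quantile of the F distribution with the returned degrees of freedom. The quantile itself is left to a statistical table or library.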
TABLE 7. Phase II Formulas for Category 1 Batch Process

Subgroup size m    Target mean µT    Cov. est.    T² statistic    T² distribution*
1     Known      S      $(X-\mu_T)'S^{-1}(X-\mu_T)$                  $\frac{p(N-1)}{N-p}F_{(p,N-p)}$
1     Known      S_W    $(X-\mu_T)'S_W^{-1}(X-\mu_T)$                $\frac{p(N-k)}{N-k-p+1}F_{(p,N-k-p+1)}$
1     Unknown    S      $(X-\bar{X})'S^{-1}(X-\bar{X})$              $\frac{p(N+1)(N-1)}{N(N-p)}F_{(p,N-p)}$
1     Unknown    S_W    $(X-\bar{X})'S_W^{-1}(X-\bar{X})$            $\frac{p(N-k)(N+1)}{N(N-k-p+1)}F_{(p,N-k-p+1)}$
>1    Known      S      $(\bar{X}_S-\mu_T)'S^{-1}(\bar{X}_S-\mu_T)$      $\frac{p(N-1)}{m(N-p)}F_{(p,N-p)}$
>1    Known      S_W    $(\bar{X}_S-\mu_T)'S_W^{-1}(\bar{X}_S-\mu_T)$    $\frac{p(N-k)}{m(N-k-p+1)}F_{(p,N-k-p+1)}$
>1    Unknown    S      $(\bar{X}_S-\bar{X})'S^{-1}(\bar{X}_S-\bar{X})$      $\frac{p(m+N)(N-1)}{mN(N-p)}F_{(p,N-p)}$
>1    Unknown    S_W    $(\bar{X}_S-\bar{X})'S_W^{-1}(\bar{X}_S-\bar{X})$    $\frac{p(m+N)(N-k)}{mN(N-k-p+1)}F_{(p,N-k-p+1)}$

* The $T^2$ UCL is the upper αth quantile of the $T^2$ distribution given above.

Phase II Operation: Category 2 Process

For a Phase II operation for Category 2 batch processes, control procedures can be based on observing either a single future observation or the sample mean vector of a subgroup of future observations for the batch being produced. Recall that the objective in this situation is to determine if the present batch is in control (i.e., the $T^2$ value corresponding to the batch data is within the $T^2$ control region). When the target mean is unknown, the historical data set provides the estimates $\bar{X}$ and $S$. Note that, since the HDS is composed of single observations from k batches, $S = S_B$ and N = k. Therefore, the formulas for a Category 2 batch process are the same as those given in Table 7 for a Category 1 batch process except that we replace $S$ with $S_B$, and we let N = k. Also, none of the formulas based on $S_W$ are applicable.

Phase II Data Example

To demonstrate a Phase II operation, consider a batch process for producing a specialty plastic polymer. A detailed chemical analysis is performed on each batch to assure that the composition of seven measured components adheres to a rigid chemical formulation. The rigid formulation is necessary for mold release when the plastic is transformed to a usable product.
A preliminary data set consisting of 52 batches acceptable to the customer is used to construct a historical data set. Each batch has only one observation vector. This example suffices for either a Category 1 or a Category 2 batch process since control is based on a single observation vector. A $T^2$ chart of the data, using the statistic in Equation (1) with N = 52, p = 7, and α = 0.001, is presented in Figure 12. The UCL of 20.418 is obtained from Equation (2). Although observations #3 and #7 produce signals on the chart, both were retained in the HDS.

FIGURE 12. $T^2$ Control Chart for HDS for Polymer Process.

Summary statistics for the seven measured variables obtained from the HDS are presented in Table 8. Examination of the ranges and standard deviations reveals tight control on the operational ranges of the individual variables. However, these statistics are not as critical to the customer as the relationships between and among the variables. The correlation matrix for the seven variables is presented in Table 9. This matrix indicates the structure or relationships that must be maintained among the variables. Any deviation can produce serious problems for the customer.

TABLE 8. Summary Statistics for HDS for Polymer Process

             x1       x2      x3      x4      x5      x6      x7
N            52       52      52      52      52      52      52
Mean         87.10    7.18    3.31    0.35    0.09    0.86    1.16
Minimum      86.60    6.80    3.00    0.30    0.00    0.80    0.98
Maximum      87.50    7.70    3.60    0.42    0.18    0.97    1.40
Std. Dev.    0.248    0.175   0.125   0.020   0.028   0.045   0.103

TABLE 9. Correlation Matrix for HDS for Polymer Process

        x1        x2        x3        x4        x5        x6        x7
x1    1.000    −0.519    −0.659    −0.436    −0.092    −0.418    −0.426
x2   −0.519     1.000     0.040    −0.021    −0.054    −0.186    −0.211
x3   −0.659     0.040     1.000     0.634     0.308     0.136     0.125
x4   −0.436    −0.021     0.634     1.000     0.399     0.020     0.018
x5   −0.092    −0.054     0.308     0.399     1.000    −0.265    −0.220
x6   −0.418    −0.186     0.136     0.020    −0.265     1.000     0.693
x7   −0.426    −0.211     0.125     0.018    −0.220     0.693     1.000

Suppose that a target mean vector for this batch process is specified as $\mu_T = (87.15, 7.16, 3.29, 0.34, 0.08, 0.85, 1.14)'$. The $T^2$ value of a future observation, $X$, on the process will be computed using Equation (11). For α = 0.001, N = 52 (i.e., n = 1 and k = 52), and p = 7 the UCL = 34.238. Twenty-six new batches are produced and a $T^2$ control chart (using the above target mean vector) is presented in Figure 13. Signals are observed for Batches #5, #7, #9, #14, and #21.

FIGURE 13. $T^2$ Control Chart for New Observations on Polymer Process.
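The two control limits quoted for the polymer example can be checked approximately with a short standard-library sketch. It makes several assumptions. The Phase II limit uses the Table 7 entry for a single observation with known target mean and estimator $S$, that is, $\frac{p(N-1)}{N-p}F_{(\alpha,\,p,\,N-p)}$. The Phase I limit is assumed to have the beta form $\frac{(N-1)^2}{N}B_{(\alpha,\,p/2,\,(N-p-1)/2)}$, mirroring Equation (10) with k replaced by N. The quantiles are obtained by a simple Simpson-rule integration and bisection written for this illustration (valid only when both beta parameters exceed 1), not by a library routine. All function names are hypothetical.

```python
import math


def beta_cdf(y, a, b, steps=2000):
    """Beta(a, b) distribution function at y by composite Simpson's rule.

    Assumes a > 1 and b > 1, so the density is zero at both endpoints.
    `steps` must be even.
    """
    if y <= 0.0:
        return 0.0
    if y >= 1.0:
        return 1.0
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)

    def pdf(t):
        if t <= 0.0 or t >= 1.0:
            return 0.0
        return math.exp(log_norm + (a - 1) * math.log(t) + (b - 1) * math.log(1 - t))

    h = y / steps
    total = pdf(0.0) + pdf(y)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return total * h / 3


def beta_upper_quantile(alpha, a, b):
    """Value q with P(Beta(a, b) > q) = alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if beta_cdf(mid, a, b) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def f_upper_quantile(alpha, d1, d2):
    """Upper alpha quantile of F(d1, d2) via the beta transformation.

    If X ~ F(d1, d2), then d1*X / (d1*X + d2) ~ Beta(d1/2, d2/2).
    """
    y = beta_upper_quantile(alpha, d1 / 2, d2 / 2)
    return d2 * y / (d1 * (1 - y))


p, N, alpha = 7, 52, 0.001

# Phase II limit: Table 7, m = 1, target mean known, estimator S.
phase2_ucl = p * (N - 1) / (N - p) * f_upper_quantile(alpha, p, N - p)

# Phase I limit: assumed beta form analogous to Equation (10).
phase1_ucl = (N - 1) ** 2 / N * beta_upper_quantile(alpha, p / 2, (N - p - 1) / 2)

print(round(phase1_ucl, 3), round(phase2_ucl, 3))
```

Under these assumptions the two printed values should be close to the limits of 20.418 and 34.238 reported above. Small differences would reflect the numerical quantile approximation used in this sketch.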
Discussion

With the assumption that $X$ is distributed as a p-variate normal, $N_p(\mu, \Sigma)$, it can be shown that the common covariance matrix estimator $S$ is a function of the maximum likelihood estimator of $\Sigma$, and it is an unbiased, least-squares estimator. Due to these properties, one would expect superior performance of the $T^2$ statistic in detecting individual outliers. This is indeed the case. The work by Wierda (1994) concludes that the $T^2$ statistic, when based on the common estimator, $S$, is more powerful than one based on $S_W$ (as presented in Equation (4)) in detecting individual outliers in a single batch when there are no mean differences among the batches. Studies comparing the power of the $T^2$ in detecting outliers to the power of $T^2$ statistics using alternative covariance matrix estimators have been performed (e.g., see Chou, Mason, and Young (1999)). With the basic assumption of a single multivariate normal distribution, the $T^2$ approach again produces superior results.
We have recommended translation of the batch data to a common origin by subtracting individual batch mean vectors from the corresponding batch observations. This translation produces an overall set of data satisfying the basic assumption of a single multivariate normal distribution. Since the different batches have a common covariance structure, it is possible to use the powerful $T^2$ statistic when seeking to remove individual outliers. The translation also produces a more precise covariance estimate. We believe the translation to a common origin to be an important step in developing a control procedure for batch processes with batch sizes greater than one.

Clearly, the translation cannot be applied to batches of size n = 1. The control procedure for this situation reduces to the $T^2$ methodology used for monitoring individual observations. In this case, we use the common sample covariance estimator $S$ in the formula for the $T^2$ statistic. The exact distribution of this $T^2$ and the corresponding UCL are well known (e.g., see Mason, Tracy, and Young (1997)).

Some researchers have shown that, in the above monitoring procedure, the covariance estimator $S$ can be affected by shifts in the mean vector. For example, Wheeler (1994) discusses the univariate situation, and Sullivan and Woodall (1996) use simulation techniques to examine different multivariate cases, such as an alternative covariance estimator based on the differences of successive observations. Although Sullivan and Woodall (1996) indicate that small random shifts (equivalent to different batch means) have an effect on the covariance matrix estimator, translation removes the possibility of contaminating the estimate with between-batch variation.

In analyzing batch data in a process an extra component of variability has been added for consideration, namely batch-to-batch variation. Other sources of variation also could be considered. An application to multivariate SPC is given by Linna, Woodall, and Busby (2001), who examine the performance of multivariate control charts in the presence of measurement error variation.
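The translation to a common origin discussed above can be sketched in a few lines. The example below is an illustration only, with a hypothetical function name and toy data. It subtracts each batch's own mean vector from that batch's observations and shows that adding a constant offset to one batch leaves the translated data unchanged, which is why between-batch shifts cannot contaminate a covariance estimate computed from the translated observations.

```python
def translate_to_common_origin(batches):
    """Subtract each batch's mean vector from the observations in that batch."""
    translated = []
    for batch in batches:
        n = len(batch)
        mean = [sum(col) / n for col in zip(*batch)]
        translated.extend([x - m for x, m in zip(row, mean)] for row in batch)
    return translated


# Hypothetical batches of size 3 on two variables.
batch_a = [(1.0, 2.0), (2.0, 2.5), (3.0, 4.5)]
batch_b = [(11.0, 1.0), (12.0, 2.5), (13.0, 2.5)]

# The same second batch after an arbitrary shift of its mean vector.
batch_b_shifted = [(x1 + 100.0, x2 - 50.0) for x1, x2 in batch_b]

original = translate_to_common_origin([batch_a, batch_b])
shifted = translate_to_common_origin([batch_a, batch_b_shifted])

# Largest absolute difference between the two translated data sets.
max_diff = max(
    abs(u - v) for row_u, row_v in zip(original, shifted) for u, v in zip(row_u, row_v)
)
print(max_diff)
```

The translation requires at least two observations per batch to be useful; with n = 1 every translated observation would be the zero vector, in line with the remark that it cannot be applied to batches of size one.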
Summary

The creation of a historical data set is an important aspect in implementing a multivariate control procedure for batch processes. We have emphasized three important, interrelated areas in this development: outlier removal, parameter estimation, and the location of significant batch mean differences.

When the batch data are collected from the same multivariate normal distribution, the $T^2$ statistic is an excellent choice for use in purging outliers. When the batch data are collected from multivariate normal distributions with different mean vectors, the translation of the different batches to a common origin again allows one to use the overall $T^2$ statistic to identify outliers. The approach yields a powerful procedure.
Acknowledgments

The authors wish to thank the editor and the referees for their helpful comments and suggestions on earlier versions of this paper.
References

Alt, F. B. (1982). “Multivariate Quality Control: State of the Art”. Technical Conference Transactions, ASQC, pp. 886–893.
Chou, Y. M.; Mason, R. L.; and Young, J. C. (1999). “Power Comparisons for a Hotelling’s $T^2$ Statistic”. Communications in Statistics 28, pp. 1031–1050.
Fuchs, C. and Kenett, R. S. (1998). Multivariate Quality Control. Marcel Dekker, Inc., New York, NY.
Linna, K. W.; Woodall, W. H.; and Busby, K. L. (2001). “The Performance of Multivariate Control Charts in the Presence of Measurement Error”. Journal of Quality Technology 33, pp. 349–355.
Mason, R. L.; Tracy, N. D.; and Young, J. C. (1995). “Decomposition of $T^2$ for Multivariate Control Chart Interpretation”. Journal of Quality Technology 27, pp. 99–108.
Mason, R. L.; Tracy, N. D.; and Young, J. C. (1997). “A Practical Approach for Interpreting Multivariate $T^2$ Control Chart Signals”. Journal of Quality Technology 29, pp. 396–406.
Mason, R. L. and Young, J. C. (1998). “Hotelling’s $T^2$: A Multivariate Statistic for Industrial Process Control”. Proceedings of the 52nd Annual Quality Congress, pp. 78–85.
Mason, R. L. and Young, J. C. (1999a). “Improving the Sensitivity of the T-Square Statistic in Multivariate Process Control”. Journal of Quality Technology 31, pp. 155–165.
Mason, R. L. and Young, J. C. (1999b). “Autocorrelation in Multivariate Processes” in Statistical Monitoring and Optimization for Process Control edited by S. Park and G. Vining. Marcel Dekker, Inc., New York, NY, pp. 223–239.
Nomikos, P. and MacGregor, J. F. (1995). “Multivariate SPC Charts for Monitoring Batch Processes”. Technometrics 37, pp. 41–59.
Seber, G. A. F. (1984). Multivariate Observations. John Wiley & Sons, New York, NY.
Sullivan, J. H. and Woodall, W. H. (1996). “A Comparison of Multivariate Control Charts for Individual Observations”. Journal of Quality Technology 28, pp. 398–408.
Tracy, N. D.; Young, J. C.; and Mason, R. L. (1992). “Multivariate Control Charts for Individual Observations”. Journal of Quality Technology 24, pp. 88–95.
Wheeler, D. J. (1994). “Charts Done Right”. Quality Progress 27, pp. 65–68.
Wierda, S. J. (1994). Multivariate Statistical Process Control. Groningen Theses in Economics, Management and Organization. Wolters-Noordhoff, Groningen, The Netherlands.
Woodall, W. H. and Thomas, E. V. (1995). “Statistical Process Control with Several Components of Common Cause Variability”. IIE Transactions 27, pp. 757–764.

Key Words: Multivariate Quality Control, Outliers, Phase I, Phase II.