Pith and Vinegar: Studying the Capability of Capability Studies

Sample Size

As most of you know, many statistical tests rely on a mean and the confidence interval around that mean. We are all familiar with the fact that as our sample size decreases, our knowledge of the “true” mean becomes less and less certain. This is important for tests that use the mean, such as the t-test and ANOVA.

Below is an example of two data sets, “Apples” and “Oranges”. In the first experiment we only had 15 samples of Apples and 15 samples of Oranges. Plotting the means with their calculated confidence intervals shows that we cannot differentiate between Apples and Oranges (since the confidence intervals overlap, we cannot be certain that the two means differ).

But if we increase the sample size to 100 Apples and 100 Oranges, our confidence intervals decrease and we can now tell that Apples do have lower values than Oranges.
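As a rough sketch of why the intervals shrink, the half-width of a confidence interval on a mean scales with 1/sqrt(n). The snippet below assumes a hypothetical sample standard deviation of 1.0 for both fruit and uses the normal quantile in place of Student's t (an approximation that slightly understates the width at n = 15); the function name is illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci_half_width(s, n, alpha=0.05):
    """Approximate half-width of the confidence interval on a mean.

    Assumption: uses the normal quantile z rather than Student's t,
    so the result is slightly too narrow for small n.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * s / sqrt(n)


# Hypothetical sample standard deviation of 1.0 (not from the original data).
for n in (15, 100):
    print(f"n = {n:3d}: mean +/- {mean_ci_half_width(1.0, n):.3f}")
```

Going from 15 to 100 samples narrows the interval by a factor of sqrt(100/15), or about 2.6, which is what separates the two fruit in the second plot.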

I am sure that most of you are realizing where this diversion from capability is heading.

Capability studies (and Gage R&R studies) use the standard deviation as a primary statistic and, like the mean, the standard deviation has a confidence interval.

The confidence interval of the standard deviation also depends on the sample size. As the sample size increases, our estimate of the standard deviation improves. For those who like this stuff, here is the formula for the confidence interval on the standard deviation.
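For reference, the usual textbook form of this interval for normally distributed data (assuming that is the one meant here) is written below, with s the sample standard deviation, n the sample size, and \chi^{2}_{p,\nu} the p-quantile of the chi-square distribution with \nu degrees of freedom:

```latex
\sqrt{\frac{(n-1)\,s^{2}}{\chi^{2}_{1-\alpha/2,\;n-1}}}
\;\le\; \sigma \;\le\;
\sqrt{\frac{(n-1)\,s^{2}}{\chi^{2}_{\alpha/2,\;n-1}}}
```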

Given a standard deviation of “1” and using the formula above, we can now plot how the confidence intervals contract as sample size increases.
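For anyone who wants to reproduce the shape of that plot as numbers, here is a minimal sketch. It assumes the standard chi-square interval for normally distributed data, and it swaps in the Wilson-Hilferty approximation for the chi-square quantile so that it runs on the Python standard library alone; an exact quantile function from a statistics package would give slightly different digits, especially at small n. The function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def chi2_quantile(p, df):
    """Approximate chi-square quantile (Wilson-Hilferty approximation)."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * sqrt(c)) ** 3


def sigma_ci(s, n, alpha=0.05):
    """Two-sided confidence interval on the standard deviation."""
    df = n - 1
    lower = s * sqrt(df / chi2_quantile(1 - alpha / 2, df))
    upper = s * sqrt(df / chi2_quantile(alpha / 2, df))
    return lower, upper


# Standard deviation of 1, as in the text; sample sizes chosen for illustration.
for n in (10, 15, 30, 60, 90, 120):
    lo, hi = sigma_ci(1.0, n)
    print(f"n = {n:3d}: {lo:.2f} <= sigma <= {hi:.2f}  (width {hi - lo:.2f})")
```

The width column is the thing to watch: it drops quickly at first and then flattens out as n grows.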

In other words, as my capability study’s sample size becomes smaller, the range in which the “true” standard deviation can exist becomes larger.

So, here’s the kicker.

One can calculate the upper and lower limits of Pp for various sample sizes and alpha values and look at the potential range of Pp or Ppk for a capability study.
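As a sketch of that calculation, taking the usual definition of Pp in terms of the upper and lower specification limits (USL and LSL) and assuming normally distributed data, the limits on Pp follow directly from the chi-square interval on the standard deviation. The interval for Ppk is more involved and is not shown.

```latex
\hat{P}_p = \frac{\mathrm{USL} - \mathrm{LSL}}{6\,s},
\qquad
\hat{P}_p \sqrt{\frac{\chi^{2}_{\alpha/2,\;n-1}}{n-1}}
\;\le\; P_p \;\le\;
\hat{P}_p \sqrt{\frac{\chi^{2}_{1-\alpha/2,\;n-1}}{n-1}}
```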

Let's look at two examples of how this works.

Example One: From the chart above, let’s assume an alpha of 0.05 and a sample size of 15 units. Let’s also assume that we ran our capability study and calculated Pp = 1.67.

Now we could stop here and report to our customer that our Pp = 1.67. But if we do the calculations, we see that the actual Pp could lie between Pp = 1.1 and Pp = 2.2. If your customer has a savvy Six Sigma expert, you could be busted. In my Six Sigma youth, I was busted. Lately, I've been doing the busting. I like that better.

Example Two: If we increase the sample size of the capability study to 120, and if our calculations find a Pp = 1.67, then the actual Pp could be between Pp = 1.5 and Pp = 1.8. With the larger sample size we are much more assured that the Pp is actually close to 1.67. In this case, we would have a much better chance of defending a Pp = 1.67.
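The two examples can be checked with a few lines of code. This is a sketch under stated assumptions, not necessarily the calculation behind the chart: it uses the standard chi-square limits on Pp for normally distributed data, with the Wilson-Hilferty approximation standing in for an exact chi-square quantile so that only the standard library is needed. The function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def chi2_quantile(p, df):
    """Approximate chi-square quantile (Wilson-Hilferty approximation)."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * sqrt(c)) ** 3


def pp_limits(pp_hat, n, alpha=0.05):
    """Two-sided confidence limits on Pp given the observed value pp_hat."""
    df = n - 1
    lower = pp_hat * sqrt(chi2_quantile(alpha / 2, df) / df)
    upper = pp_hat * sqrt(chi2_quantile(1 - alpha / 2, df) / df)
    return lower, upper


# Observed Pp = 1.67 at the two sample sizes used in the examples.
for n in (15, 120):
    lower, upper = pp_limits(1.67, n)
    print(f"n = {n:3d}: {lower:.2f} <= Pp <= {upper:.2f}")
```

The computed limits land in the same neighborhood as the ranges quoted in the two examples; small differences from values read off a chart are to be expected.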

In summary, never forget that the likelihood that the Pp value you report is effectively correct is dependent on the size of your sample. This is one reason to avoid capability studies on small prototype builds and experimental runs.

We can also see from the charts above that once we have a sample size greater than about 90 pieces, the incremental improvement (decrease) in the range of our standard deviation (and thus our Pp) becomes small enough to ignore.

I hope that you have enjoyed these blog postings on the vagaries of capability studies (Cp versus Pp, data distribution, and sample size). Practically, we cannot always control these factors, but we can at least go into our study with our eyes open and a clear understanding of our ‘risk of being wrong’.

Hmmm... Maybe my next topic will be "The Risk of Being Wrong"

Capability of Capability - Part 1
Capability of Capability - Part 2


Wednesday, October 19, 2016

Studying the Capability of Capability Studies - Part 3
