The ability to define a scientifically justified and statistically sound sampling procedure is a fundamental skill in modern systematic drug development.
In all product and process development and in all validation activities, obtaining a representative sample and applying a correct, defensible statistical method are crucial. In all sampling activities, a scientifically sound sampling plan should be developed, justified, and implemented. Sampling activities include clinical trials, formulation, stability (long term and accelerated), process characterization, analytical method development, method validation, process validation, and release testing.
International Conference on Harmonization (ICH) Q8, Q9, Q10, and Q11 (1–4) all require statistically valid sampling procedures in testing and in product and process development. Specifically, the Q8, Q9, and Q10 Questions and Answers (R4) document states (5):
“Q6: Do traditional sampling approaches apply to real time release (RTR) testing?
A6: No, traditional sampling plans for in-process and end-product testing involve a discrete sample size that represents the minimal sampling expectations. Generally, the use of RTR testing will include more extensive on-line/in-line measurement. A scientifically sound sampling approach should be developed, justified, and implemented. (Approved April 2009) processing, in-process materials, and drug product quality can provide an opportunity to shift controls upstream and minimize the need for end product testing.”
The purpose of this paper is to organize a logical framework and identify the appropriate tools that aid in the development of a statistically valid sampling protocol. Because the issues and problems to be solved vary widely, each sampling procedure will be somewhat unique; however, there are common issues and questions that each sampling protocol will need to consider and address.
Statistical Methods for Product Development
The following are generally key steps for using statistical methods for problem solving and product development.
Define the business case
The business case explains the development objective, how each activity relates to quality objectives, quality target product profiles (QTPPs), critical quality attributes (CQAs), timelines or product development requirements, and why the activity needs to be completed. The business case gives context to development and validation activities and helps CMC teams understand how each activity fits into overall business and quality imperatives. Failure to understand the business case for development and validation activities causes discontinuity in development activities, and the logic of what, why, and how much becomes hard to justify and/or file with the appropriate health agencies.
Define the problem
Problem definitions describe what we don’t know and/or what is wrong with our current performance. Problem definitions are essential for sampling plan development and rationalization. The limitations and/or scope associated with each problem definition are critical in defining the population of units associated with the problem. The problem definition should also make clear what the sampling unit is (batch, vial, drum, etc.).
Define all objectives, goals, and study questions
From the problem definition, objectives, goals, and study questions must be defined. Goals come in four forms: maximize, minimize, match target, and none. Limits and acceptance criteria should also be defined. The sampling plan will be designed to ensure all study questions are answered, goals are achieved, and answers are sufficiently precise and accurate relative to the limits and tolerances of the acceptance criteria.
Determine all factors and responses and analytical methods
Based on the study questions and goals, what must be measured to meet the business case and solve the problem? All factors that influence the problem and study questions must be defined. All responses and analytical methods associated with the problem need to be clarified. In many cases, the factors and/or responses that are currently measured in batch records or in on-line data systems do not address the problem to be solved or the process to be characterized. Care should be exercised to assure there are no missing measurements; otherwise, the business case will not be met or the problem will not be addressed correctly.
Define the population
Based on the study questions, unit definition, and problem statement, what is the population of units that needs to be understood? Population definition is crucial prior to sampling plan formalization. For R&D, the population is often a function of formulation and/or configuration. Another term that is often associated with the population is “volume.” How will the product perform in volume and at scale versus the limited product testing that is performed in small-scale studies?
Define the sample that will represent the population, study questions, and the defined problem
A sampling plan needs to be formally defined and scientifically justified to assure that it is representative of the population and that it meets all defined study objectives and acceptance criteria.
WHO guidelines for representative sampling state (6):
“Representative sample, Sample obtained according to a sampling procedure designed to ensure that the different parts of a batch or the different properties of a non-uniform material are proportionately represented.”
The FDA process validation guideline states (7):
“The sampling plan, including sampling points, number of samples, and the frequency of sampling for each unit operation and attribute. The number of samples should be adequate to provide sufficient statistical confidence of quality both within a batch and between batches. The confidence level selected can be based on risk analysis as it relates to the particular attribute under examination. Sampling during this stage should be more extensive than is typical during routine production.”
All statistically justifiable sampling plans require two primary considerations: sampling method and sample size.
Define the sampling method
The sampling method is defined to clarify “how” samples are taken, “where” they are taken, and “how often.” How many samples are taken is the sample size. To determine the sampling method, care must be exercised to assure the samples are taken from the primary sources of variation. Partition of variation (POV) or components of variation (COV) analysis is used to determine, for example, the proportion of variation within and between batches. In Figure 1, 54% of the variation occurs within the lot, so at least 54% of the measurements need to be allocated to within-lot sampling to be representative. The batch-to-batch variation is only 3%, so many batches are not needed in this example. If 90% of the variation were between batches, then 90% of the samples should come from multiple batches.
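The within-batch and between-batch split described above can be sketched with a simple method-of-moments calculation. This is a minimal illustration, not the analysis behind Figure 1: it assumes a balanced design (the same number of measurements in every batch) and only two sources of variation, whereas Figure 1 evidently includes more. The function names and the example data are hypothetical.

```python
from statistics import mean


def variance_components(batches):
    """Estimate (between-batch, within-batch) variance for a balanced design.

    `batches` is a list of equally sized lists of measurements, one per batch.
    Uses the one-way random-effects ANOVA (method-of-moments) estimators.
    """
    k = len(batches)      # number of batches
    n = len(batches[0])   # measurements per batch (balanced design assumed)
    if k < 2 or n < 2 or any(len(b) != n for b in batches):
        raise ValueError("need at least 2 batches, each of the same size >= 2")

    batch_means = [mean(b) for b in batches]
    grand_mean = mean(batch_means)

    # Mean square within batches: pooled scatter around each batch mean.
    ss_within = sum((x - m) ** 2 for b, m in zip(batches, batch_means) for x in b)
    ms_within = ss_within / (k * (n - 1))

    # Mean square between batches: scatter of batch means around the grand mean.
    ms_between = n * sum((m - grand_mean) ** 2 for m in batch_means) / (k - 1)

    within_var = ms_within
    # A negative estimate is truncated to zero, as is conventional.
    between_var = max((ms_between - ms_within) / n, 0.0)
    return between_var, within_var


def percent_of_total(between_var, within_var):
    """Express each component as a percentage of the total variance."""
    total = between_var + within_var
    if total == 0:
        raise ValueError("no variation in the data")
    return 100 * between_var / total, 100 * within_var / total


# Hypothetical illustration: two batches, two measurements each.
example = [[1.0, 3.0], [5.0, 7.0]]
between, within = variance_components(example)
pct_between, pct_within = percent_of_total(between, within)
print(f"between-batch: {pct_between:.1f}%  within-batch: {pct_within:.1f}%")
```

The resulting percentages can then guide how measurements are divided between within-batch and between-batch sampling, in the spirit of the allocation rule described above.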
Define the sample size
Sampling method must be defined first, then sample size. There are many International Organization for Standardization (ISO) standards for variables and attributes sampling for lot acceptance (8-10), and there are also the National Institute of Standards and Technology (NIST) standards for determining the sample size (11). SAS/JMP and other statistical packages have their own sample size and power calculators. Every sampling protocol involves risk. The risk should be known rather than unknown.
To use a sample size calculator, the confidence level (1-alpha), the power of the test (1-beta, how reliably you want to detect a change), the practical change to detect (delta), and the standard deviation must be known. For example: alpha = 0.05, power = 0.95, the delta in pH that we want to detect is 0.2, and the standard deviation of pH at the point of evaluation is 0.125 from historical measurements. What is the sample size? Eight samples will do it. Sample size does not address sampling method. Sampling method tells you how to take the sample; sample size tells you how many. Power curves are used when the delta is unknown (see Figure 2).
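The relationship among alpha, power, delta, and standard deviation can be sketched with a normal-approximation formula for a two-sided, one-sample test of a mean. This is an assumption made for illustration; the article does not state which test or calculator produced its answer, and the function names below are hypothetical. The second function gives approximate power for a given sample size, which is the quantity plotted on a power curve.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def sample_size(delta, sigma, alpha=0.05, power=0.95):
    """Approximate n to detect a shift `delta` in a mean (two-sided z-test).

    Uses n = ((z_{1-alpha/2} + z_{power}) * sigma / delta) ** 2, rounded up.
    """
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = _STD_NORMAL.inv_cdf(power)
    return ceil(((z_alpha + z_beta) * sigma / delta) ** 2)


def approx_power(n, delta, sigma, alpha=0.05):
    """Approximate power of the same two-sided z-test with n samples."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = delta * sqrt(n) / sigma  # standardized size of the true shift
    return _STD_NORMAL.cdf(shift - z_alpha) + _STD_NORMAL.cdf(-shift - z_alpha)


# Inputs from the pH example in the text.
n = sample_size(delta=0.2, sigma=0.125, alpha=0.05, power=0.95)
print("normal-approximation sample size:", n)

# Points for a simple power curve: power versus delta at a fixed n.
for d in (0.05, 0.10, 0.15, 0.20, 0.25):
    print(f"delta={d:.2f}  power={approx_power(8, d, 0.125):.3f}")
```

With the inputs from the text, the normal approximation comes out a little lower than the eight samples quoted. Calculators based on the t distribution, which allow for the standard deviation being estimated from the sample, generally return somewhat larger values at small sample sizes, and that may account for the difference.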
Collect data per the sampling plan
Follow the sampling plan and collect the data and all associated data tags (e.g., time, date, analyst) during sampling.
Summarize data into statistics for the sample and confidence intervals for the population
Statistics and graphs are used to summarize and aggregate the data. The statistics describe the sample, and the confidence intervals are used to describe the population. Confidence intervals account for the chosen level of risk (95% confidence, for example), the variation in the data, and the sample size. Confidence intervals should be in every graph and in every table.
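As a minimal sketch of how a confidence interval combines the risk level, the variation in the data, and the sample size, the following computes an interval for a population mean. The function name and the pH readings are hypothetical. By default it uses a normal quantile, which is a large-sample approximation; for small samples a t critical value can be passed in (for example, 2.365 for 95% confidence with 7 degrees of freedom).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(data, confidence=0.95, critical_value=None):
    """Return (lower, sample_mean, upper) for the population mean.

    If `critical_value` is None, a normal quantile is used (large-sample
    approximation). For small samples, pass the appropriate t critical value.
    """
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    if critical_value is None:
        critical_value = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

    center = mean(data)  # statistic describing the sample
    # Half-width shrinks with more samples and grows with more variation.
    half_width = critical_value * stdev(data) / sqrt(n)
    return center - half_width, center, center + half_width


# Hypothetical pH readings (n = 8), with the t critical value for 7 df.
ph = [6.9, 7.1, 6.9, 7.1, 6.9, 7.1, 6.9, 7.1]
lower, center, upper = mean_confidence_interval(ph, critical_value=2.365)
print(f"mean = {center:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

The interval, rather than the sample mean alone, is what would be compared against limits and acceptance criteria.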
Draw conclusions and inference
Based on the data and all associated confidence intervals, models, and graphs, draw conclusions relative to acceptance criteria, limits, CQAs, and the business case.
Verify conclusions, solve the problem, meet quality requirements, and achieve the business case
Verify that the conclusions and predictions made from the sample are subsequently observed in the population. Demonstrate that solutions are generally applicable through ongoing longitudinal monitoring and continuous validation protocols. Verification indicates that earlier estimates taken from the representative sample are correct. Failure to verify conclusions and inferences generally indicates that additional uncontrolled factors are at play and need to be understood and/or controlled before the business case can be achieved.
Summary
The ability to define a scientifically justified and statistically sound sampling procedure is a fundamental skill in modern systematic drug development. It impacts every aspect of development and validation. A structured approach using the key considerations outlined in this paper will help assure that a sampling plan has a defensible technical basis for sampling method and sample size selection, and that it controls and addresses risk relative to clearly defined business cases, CQAs, problem statements, and study questions.
References
About the Author
Thomas A. Little is president of Thomas A. Little Consulting, drlittle@dr-tom.com.