A Guide for Testing Biopharmaceuticals Part 2: Acceptance criteria and analytical method maintenance

BioPharm International, October 1, 2006
Volume 19, Issue 10, Pages: 40–45


ABSTRACT

Several gaps in current regulatory guidelines that govern the analytical method life cycle for the testing of biopharmaceuticals are identified. Strategic guidance on how to monitor and control the life cycle of an analytical test method is provided. Analytical method transfer, analytical method component equivalency, and analytical method comparability protocols are discussed in light of risk-based strategies for validation extensions. The use of an analytical method maintenance program is illustrated in relation to the predictable risk to patients and firm.

The first part of this article, published in the September 2006 issue, discussed general strategies for validation extensions to other test method components, laboratories and even different test methods.1 This second part provides practical tips on how to maintain test method suitability long after the formal completion of analytical method validation (AMV) studies.

Case studies on how to meaningfully derive acceptance criteria for validation extensions and the validation continuum (maintenance) are described as well as an example on how to reduce analytical variability in validated systems.

ANALYTICAL METHOD MAINTENANCE (AMM)

There are several points to consider when running an AMM programme. Usually, assay control results that are within established limits (e.g., ±3 s.d.) will yield valid assay results for the test sample(s). Whenever possible, the assay control should yield an assay response similar to that of the test sample. Monitoring the assay control will then indicate when unexpected drifts or spreads in assay results may have occurred. Whenever the assay control results are close to or beyond their established limits, there is a high probability that test sample results are also moving in the same direction. This should not be ignored because it causes predictable, although not exactly measurable, errors in test results (cases 1B and 2B).1

Because production samples are run simultaneously with assay controls, both results should be reported and overlaid on a single statistical process control (SPC) chart (Figure 1). This will make "non-visible" data elements that lead to cases 1B and 2A-B more "visible" and useful as information about the true production process performance. Had the process development, robustness, and validation studies been completed properly using a variance component analysis matrix, the contribution of sampling effects (timing, number of samples, handling, storage conditions, hold times) to the overall process variance could be estimated. If needed, sampling could be better controlled within the standard operating procedures (SOPs). The likelihood of all cases (1A-B, 2A-B) occurring could then be monitored. This would also allow a much better process understanding, because root causes for process failures could be not only more readily identified, but also more readily measured.

Figure 1. Graphical representation of SPC potencies and "invisible" assay variance.
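As a minimal sketch of such an overlay chart (the data values, and the use of Python with numpy and matplotlib, are illustrative assumptions rather than the article's actual Figure 1):

    # Hypothetical sketch: overlay assay-control and production potency results
    # on one SPC chart with +/-3 s.d. limits; all values are illustrative only.
    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(1)
    runs = np.arange(1, 61)                        # last n = 60 assay runs
    control = rng.normal(102.0, 3.0, size=60)      # assay control results (% of target)
    production = rng.normal(101.0, 4.3, size=60)   # reported lot potencies (units)

    fig, ax = plt.subplots()
    for label, series in (("assay control", control), ("production lots", production)):
        mean, sd = series.mean(), series.std(ddof=1)
        ax.plot(runs, series, marker="o", label=label)
        ax.axhline(mean + 3 * sd, linestyle="--")  # upper SPC limit (+3 s.d.)
        ax.axhline(mean - 3 * sd, linestyle="--")  # lower SPC limit (-3 s.d.)
    ax.set_xlabel("assay run")
    ax.set_ylabel("potency")
    ax.legend()
    plt.show()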

Furthermore, controlling and fixing process problems could save more batches from rejection. These steps would also be very much in line with the recently published principles for quality and process analytical technology leading to faster licence approvals and reduced inspections.2–6

The robustness studies performed during analytical method development (AMD), together with the intermediate precision results from AMV, should clarify which method component may be responsible for changing test results when it is replaced.

This should indeed be the main purpose of AMD and AMV reports: not only to provide results for variance components, but also to clearly identify potential biases and how they can be controlled as part of test system suitability. The validation is ideally only a confirmation of something already expected to be suitable from the AMD studies. It is the author's recommendation that all method components and operational limits be sufficiently studied and properly documented in the AMD report. This information should then be used to set meaningful limits for sample handling and storage during testing, the number of replicates, and the overall system suitability limits in the AMD/AMV report. Without this knowledge from the AMD/AMV studies, it will be more difficult to set limits and control the implementation of new method components.

Qualifying another instrument, reference standard, critical reagent, assay control or operator for testing may arise from the need to get test results faster. Often, use of an alternative instrument or operator is inevitable. In any case, whenever critical validated method components are exchanged, equivalency in test results between the validated method component and the alternative component should be verified.

Table 1. Instrument equivalency execution matrix

As discussed, the AMD/AMV report should indicate what those "critical" method components might be and what associated risk might be involved when particular components are changed. Equivalency can be demonstrated similarly to AMT, provided that the limits set initially still hold, by using an equivalency matrix and appropriate acceptance criteria for accuracy (matching) and (intermediate) precision.

An example of an execution matrix for instrument equivalency is illustrated in Table 1. Unlike for AMTs, where including various production lots is a current regulatory expectation, the testing of various production lots should not be required here because it may not serve the purpose of the equivalency studies.7

UNDERSTANDING THE RELATIONSHIP OF ALL COMPONENTS

There are numerous ways that process and method performance knowledge can be used to monitor and improve overall process and product quality. In reality, time and resources are limited, and we should first identify the most critical components and remain committed to implementing critical improvements before we get lost in too much data. A hypothetical example is provided below to illustrate how acceptance criteria for AMTs and AMMs could be derived with respect to the estimated probabilities for cases 1A-B and 2A-B. Following this example, potential improvements are discussed to illustrate what could be done to reduce the undesirable probabilities for cases 1B and 2A-B.

Table 2. Historical process, sampling, assay performance data, and suggested limits for accuracy and (intermediate) precision.

A potency bioassay is used for downstream in-process and final container testing of a licensed biopharmaceutical drug. This method is monitored in our AMM programme. After months of commercial production, transfer of this analytical method to a different laboratory for product release testing is needed. If the downstream in-process stage is yielding inconsistent potency results (Table 2), the following data should be reviewed:

  • process development

  • process validation

  • AMD

  • AMV

  • historical process data

  • method performance (assay control) data.

We may want to start with the easiest variance component, the assay variance, as monitored with the assay control. Figure 1 illustrates the relationship between the assay performance and observed recent SPC potency results (last n = 60).

The statistical likelihood of failures now needs to be estimated and a variance component analysis performed to estimate the contribution of each component. Focus can then be put on how to set limits on post-validation activities from an understanding of the potential impact on the likelihood for all cases (1A-2B) to occur. The situation can be improved most effectively by keeping primarily in mind patient safety, dosing, and regulatory expectations, but secondarily also the firm, as the need to pass specifications and stay profitable is important. Similar to propagation-of-error calculations, an estimate of the sampling variance (batch uniformity, stability, protein adsorption losses, etc.) allows immediate estimation of the actual (true) process performance for potency by simply solving for it from Equation 1 (V = variance).
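From the additive variance relationship used throughout this example, and consistent with the estimates in Table 2, Equation 1 presumably takes the form:

    V(observed SPC) = V(process) + V(sampling) + V(assay)          (Equation 1)

Solving for the process term gives V(process) = V(observed SPC) − V(sampling) − V(assay); with the Table 2 values quoted as s.d. percentages, 4.3² − 2.3² − 3.0² ≈ 4.2, corresponding to a true process s.d. of about 2.0%.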

The (hypothetical) historical results are presented in Table 2. The estimated probabilities for cases 1A and 2A, and the measurement errors (the "invisible" component of SPC), are given in Table 3, along with probability estimates for the worst-case scenario in which AMT results reveal that the receiving laboratory tests with a bias at the acceptable AMT protocol limit of ±1.5%.

Although simplified, several observations can be made from the data in Figure 1 and Table 2. Most importantly, the overall process performance is out of a desirable ±3 s.d. SPC state. As noted earlier, this is the "visible" SPC trigger that warrants action when limits are exceeded. The test system has recently (last n = 60) yielded higher (2.0%) results for the assay control. The assay control is the same molecule in a highly similar matrix as the test sample, and both are run simultaneously in each assay run.

The "visible" process mean can therefore be expected to be about 2% higher. Looking chronologically at the assay control may show the root cause, that is, which assay component was changed and caused this difference over time. Alternatively, several smaller changes could have added up to this difference of 2.0%. For example, the reference standard may unknowingly be unstable even when frozen and may have lost 2.0% potency over time, providing proportionally higher (2.0%) results to the assay control and test samples.

Although the 2.0% expected difference in process data may in reality be buried within several small sampling or production changes, it nevertheless constitutes a 2.0% bias for testing from the time that specifications were set to match clinical data, and the then-existing process variance (PV) and AMV.

The estimate for the overall sampling variance during production (from process validation studies) is used to estimate the actual variance in process performance. However, if the small- or large-scale process studies to estimate sampling variance were not done well, or are no longer representative, this estimate may not be sufficiently accurate to provide a good estimate of the actual process variance (using Equation 1). In the hypothetical example, the estimated true process variance (2.0%) is smaller than the estimated sampling variance (2.3%), smaller than the assay variance (3.0%), and necessarily smaller than the overall "visible" process variance (4.3%). This may have been why the specifications had been set to 90–110 units in the first place. Often, the assay and sampling variance can indeed be greater than the actual process variance for downstream in-process potency testing of biopharmaceuticals, for various reasons. Some examples are listed below:

  • Protein adsorption losses (sampling, testing).

  • Sampling procedures lacking detail where needed.

  • Inappropriate sample handling before testing.

  • Poorly set test system suitability criteria.

  • Insufficiently monitored and controlled assay reproducibility.

  • Poorly developed or optimized test method.

  • Poorly written AMV protocol.

Acceptance criteria for AMT

Estimates of the true process mean (99 units) and variance (2.0%) were made from the assay control performance and from Equation 1, respectively. For the AMT, both accuracy and precision are areas of concern because several method components (operators, instruments, location, and likely others) will change when the method is executed at the receiving laboratory. Tests should therefore be undertaken for the overall matching of the receiving laboratory's results to those of the reference laboratory and for equivalent (intermediate) precision. Accuracy and precision limits are treated independently here for simplicity, although they both impact all cases (1A-2B).

Precision

The assay performance is at 3.0% (last n = 60; potentially higher if control outliers were included) when run under routine conditions with small changes over time. AMV results yielded an intermediate precision of 2.4%. An imprecision higher than 3.0% should not be allowed because the assay component is already the highest contributor to the overall variance (Table 2). There is already a relatively high likelihood (about 1.73%) of observing an out-of-specification (OOS) result, with two observed over the last 60 (Table 3 and Figure 1). The limit of 3.0% appears to balance the likelihood of passing the transfer (to achieve compliance or project advancement) against the likelihood that all cases 1A-2B continue in the future.

Table 3. Estimates of release probabilities and measurement errors (cases 1A–2B) for AMT.

Accuracy (matching)

A recovery or matching of the expected (reference) potency of 100 ±1.5% appears reasonable. Allowing a maximum difference of ±1.5% constitutes about one-half of the recent assay variability. The receiving laboratory should not be allowed to test with a greater bias because this would further increase the likelihood of OOS results by potentially shifting the observed process mean away from the target (100 units). Overall, recoveries inside this range (98.5–101.5%), as evidenced by the data from AMV, historical assay control, and SPC, should be achievable.

Acceptance criteria for AMM

When exchanging or adding a single method component, such as a second instrument, maintaining accuracy, or the matching of historical performance, should be the main consideration for the reasons given above. However, accuracy/matching and precision could both be readily studied to monitor both method performance characteristics. The acceptance criteria for accuracy and precision for AMT and AMM should be derived from the product specifications with regard to assay performance (control) and process performance (SPC data).

Because good estimates for all variance components may be available, acceptance criteria (at least for precision) can be derived from the results in Table 2. Acceptance criteria for accuracy (matching) may need to be tightened to avoid a potentially large compounding of bias from several one-directional changes (e.g., 99.0–101.0% vs. the current system). Unless there was no alternative, a 2.0% increase in test results should not have occurred when the reference standard was changed.

In Table 3, probability estimates are calculated based on the expected rounding of test results to specifications (90–110 units/mL). This leads to more test results falling within specifications because 89.5 is rounded up to 90 and 110.49 is rounded down to 110. The currently calculated probabilities for observing passing results (1A) are 98.64% against failing the upper limit and 99.63% against failing the lower limit, for a net total of 98.27%. Given a normal data distribution and the historical process mean of 101.0 units, the probabilities for failing low versus failing high do not currently match. After AMT, with protocol acceptance criteria of 100 ±1.5%, the allowed worst-case probabilities for 1A drop to a total of 96.73%, with the risk of failing high growing much greater. Case 2A probabilities are simply the complement of 1A (100% − 1A).

If the AMT results were to yield a still-acceptable +1.5% bias, the predicted failure rate would almost double, from 1.73% to 3.27%. For normal data distributions, cases 1B and 2B will always be similar because results can differ equally in both directions from the observed SPC results. The total variance predicted for measurement errors (3.8%) was calculated, similarly to Equation 1, from the sum of the assay control variance (3.0%) and sampling variance (2.3%). The calculation of exactly predicted probabilities for cases 1B and 2B becomes complex and is beyond the scope of this article.
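These Table 3 estimates are straightforward to reproduce. The short sketch below is a hypothetical check, assuming the Table 2 values, Python, and scipy's normal distribution, not the article's original calculation:

    # Hypothetical check of the Table 3 estimates, assuming an overall "visible"
    # process s.d. of 4.3% and specifications of 90-110 units, with reported
    # results rounded to whole units before comparison to specifications.
    from math import sqrt
    from scipy.stats import norm

    sd = 4.3                # overall "visible" process s.d. (% of 100 units)
    lo, hi = 89.5, 110.5    # effective limits: 89.5 rounds up, 110.49 rounds down

    def pass_prob(mean):
        """Probability that a reported result falls within specifications."""
        return norm.cdf(hi, mean, sd) - norm.cdf(lo, mean, sd)

    print(f"current mean 101.0:   {pass_prob(101.0):.2%} pass")  # ~98.27% (1.73% OOS)
    print(f"worst-case +1.5% bias: {pass_prob(102.5):.2%} pass")  # ~96.73% (3.27% OOS)

    # Measurement-error s.d. (cases 1B/2B), combined as in Equation 1:
    print(f"measurement error: {sqrt(3.0**2 + 2.3**2):.1f}%")     # ~3.8%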

Two variance components (assay precision and sampling variance) have been identified that should be improved in light of the 1.73% predicted failure rate. This situation could easily become significantly worse after, for example, the method is transferred, method components are exchanged, or process changes are implemented. It should now be clear why this should not be neglected, why acceptance criteria should be systematically derived as discussed above, and why regulatory guidance (PAT and risk-based validation) has recently incorporated some of these principles.3–6

Improving the situation

It is usually more cumbersome to implement variance (precision) improvements than matching (accuracy) improvements. Precision improvements usually mean more samples, more testing, and more tightly controlled processes, and are therefore costly changes. Accuracy, or bias in testing, is usually easier to affect by simply switching to another instrument, reference standard, etc.

However, pushing results in only one direction can backfire and make the situation much worse later, because small changes can add up without being visible by themselves (owing to poor precision). Decreasing the overall measurement errors, or uncertainty in test results, by increasing precision has a better long-term effect because any bias in results (from production, sampling, or method changes) will be more visible and will appear sharper. Following the potency bioassay example, the sampling and testing scheme is briefly illustrated below.

Potency in-process sampling and testing scheme

To decrease the relatively high variance in the observed potency results (SPC), a variance analysis is needed. The current sampling and assay set-up for the downstream in-process potency test is as follows (n = number of samples):

  • collect n = 1 in-process sample

  • split sample into two aliquots

  • run two independent assay runs, each generating n = 3 results (total of n = 6 results).8

In the example, reproducibility could be significantly improved by collecting n = 3 independent samples, each run in three independent assays, each with three replicates (total of n = 9 results). Because of the current assay set-up and sampling process, increasing the number of samples from n = 1 to n = 3 should significantly improve precision, reducing the sampling contribution by a factor of 1.73 (the square root of 3; see Equation 2).8
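Given the stated factor of 1.73, Equation 2 is presumably the standard-error relationship:

    s.d.(mean of n results) = s.d.(single result) / √n          (Equation 2)

so that going from n = 1 to n = 3 independent samples divides the sampling s.d. contribution by √3 ≈ 1.73.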

To a lesser extent, but likely still worthwhile, an increase in assay runs from two to three should further improve the situation. Another, more easily implemented, improvement would be to run the assay control at ±2 s.d. limits (instead of ±3 s.d.). This test system suitability change should also significantly improve the overall SPC variance (Figure 1). It will lead to an expected 4% higher rate of invalid assay runs, which is a relatively small price to pay considering the predicted return.
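The roughly 4% figure follows directly from the normal distribution; a quick hypothetical check (again assuming Python and scipy):

    # Expected increase in invalid assay runs when tightening the
    # assay-control limits from +/-3 s.d. to +/-2 s.d.
    from scipy.stats import norm

    outside_2sd = 2 * norm.sf(2)   # ~4.55% of runs fall outside +/-2 s.d.
    outside_3sd = 2 * norm.sf(3)   # ~0.27% fall outside +/-3 s.d.
    print(f"additional invalid runs: {outside_2sd - outside_3sd:.1%}")  # ~4.3%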

Stephan O. Krause, PhD, is the manager of QC Technical Services and Compendial Liaison at Bayer Healthcare Pharmaceuticals, 800 Dwight Way, Box 1986, Berkeley, CA 94701, tel. 510.705.4191, stephan.krause.b@bayer.com

REFERENCES

1. S.O. Krause, Pharm. Technol. Eur. 18(5) 2006.

2. S.O. Krause, BioPharm International 18(10):52–59 (2005).

3. Guidance for Industry (CDER/CVM/ORA, US FDA). PAT — A Framework for Innovative Pharmaceutical Development, Manufacturing, and Quality Assurance, September 2004.

4. ICH. Pharmaceutical Development, Q8, November 2005.

5. ICH. Quality Risk Management, Q9, November 2005.

6. ICH. Quality Systems. Q10. In draft.

7. S.O. Krause, Analytical Method Validation for Biopharmaceuticals, General Session Presentation, IBC International Conference: Quality Systems and Regulatory Compliance (Reston, VA, April 05, 2005).

8. F. Brown and A. Mire-Sluis (eds): The Design and Analysis of Potency Assays for Biotechnology Products. Dev Biol, Karger, 107(1), 117–127 (2002).
