Protecting the integrity of data is a central challenge of 21 CFR Part 11 compliance. Integrity requires records to be complete, intact, and maintained in their original context, associated with the procedures that were used to create the data.
Ensuring data integrity by protecting original data from accidental or intentional modification, falsification, or even deletion is the key to reliable and trustworthy records that will withstand scrutiny during regulatory inspections. Many assessments and action plans stop at the system security level discussed in the "Access Security" article in this series. However, merely controlling and securing access to a data system does not address the real challenge for today's laboratories: ensuring data integrity.
Data integrity means that data records are complete, intact, and maintained within their original context, including their relationship to other data records. To use an analogy from the paper world, a contract is valid only if all pages of the document are complete and legible, if it contains the required authentic signatures, and if it properly states the terms and conditions. In this sense, integrity denotes validity.
In the case of a chromatography data system (CDS), data integrity gives a high degree of certainty that a given record (such as a calculated chromatographic result) has not been modified, manipulated, or otherwise corrupted after its initial generation.
In the context of CDSs, data integrity requires automatic change management of metadata (for example, storage of method setpoints), including "revision control" for reanalyzed data. Data integrity also means that data cannot be entered out of context. Operational checks enforced by a computerized system should check user permissions and enforce a certain sequence of permitted steps according to a defined workflow.
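To make this concrete, the following is a minimal sketch in Python of how such an operational check might work. The step names, roles, and permission table are hypothetical; a real CDS would read them from its access-control configuration:

```python
from enum import Enum, auto

class Step(Enum):
    ACQUIRE = auto()
    PROCESS = auto()
    REVIEW = auto()
    APPROVE = auto()

# The permitted sequence of steps in the defined workflow.
WORKFLOW = [Step.ACQUIRE, Step.PROCESS, Step.REVIEW, Step.APPROVE]

# Hypothetical role assignments; real systems take these from access control.
PERMISSIONS = {
    "analyst":    {Step.ACQUIRE, Step.PROCESS},
    "reviewer":   {Step.REVIEW},
    "supervisor": {Step.APPROVE},
}

def check_step(role: str, requested: Step, completed: list) -> None:
    """Reject the step if the user lacks permission or the sequence is wrong."""
    if requested not in PERMISSIONS.get(role, set()):
        raise PermissionError(f"{role} may not perform {requested.name}")
    expected = WORKFLOW[len(completed)]
    if requested is not expected:
        raise RuntimeError(f"expected {expected.name}, got {requested.name}")

done = []
check_step("analyst", Step.ACQUIRE, done); done.append(Step.ACQUIRE)
check_step("analyst", Step.PROCESS, done); done.append(Step.PROCESS)
# check_step("analyst", Step.APPROVE, done)  # would raise PermissionError
```

The point is that the system, not the analyst, decides whether a given step is permitted at a given moment.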
Another technical challenge for data management systems is to ensure referential integrity, that is, the integrity of the record's relationships (dependencies). A database record is traceable, reliable, and trustworthy only if the complete set of related (dependent) records is available.
Think of the following scenario: Sample XYZ needs to be analyzed using Method A, revision 4 (the current revision). Due to a shortage of solvent during the analysis, chromatogram 1 of sample XYZ is invalid, and the sample must be re-injected. The system stores revision 2 of the binary chromatogram without deleting or overwriting the original. (Check whether your data system can really do this!) Chromatogram 2 is now processed, generating the result XYZ.2-A4-1. One of the points on the calibration curve is subsequently marked invalid because the reviewing analyst found a previously undetected sample preparation error. Chromatogram 2 must be reprocessed a second time, generating result revision XYZ.2-A4-2. The results are reviewed, approved, and archived. Over the course of the following months, method A is updated to revision 5 due to a specification change. In the course of an FDA audit, the results for sample XYZ are revisited.
A system with appropriate measures for maintaining referential integrity will retrieve the requested revisions of those results, including the correct references to the revisions of the raw data (XYZ.2) and processing methods (A4). Many current systems will allow retrieving the final result and the original raw data. But will they show the correct version (A4 instead of A5) of the processing method used at the time? Will they really show the history of revisions? If your system does not do this, you should develop appropriate procedures to capture at least a paper trail of the iterative changes.
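A hypothetical Python sketch illustrates the kind of record structure that makes this retrieval possible: every result permanently references the exact revisions of its raw data and processing method, and reprocessing appends a new revision rather than overwriting an old one (all names and numeric values below are illustrative, not real data):

```python
from dataclasses import dataclass

@dataclass(frozen=True)        # frozen: a stored revision is immutable
class Chromatogram:
    sample: str
    revision: int              # XYZ.2 -> revision 2 of the raw data

@dataclass(frozen=True)
class Method:
    name: str
    revision: int              # A4 -> revision 4 of the processing method

@dataclass(frozen=True)
class Result:
    raw: Chromatogram          # reference pins the exact raw-data revision
    method: Method             # reference pins the exact method revision
    revision: int              # XYZ.2-A4-2 -> second reprocessing
    value: float

store = []                     # append-only: reprocessing adds, never deletes

raw2 = Chromatogram("XYZ", 2)
store.append(Result(raw2, Method("A", 4), 1, 101.3))  # XYZ.2-A4-1
store.append(Result(raw2, Method("A", 4), 2, 99.8))   # XYZ.2-A4-2 after the
                                                      # calibration-point fix
# Months later, method A moves to revision 5, but the archived result
# still resolves to Method("A", 4) -- the revision used at the time.
latest = max(store, key=lambda r: r.revision)
assert latest.method.revision == 4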
After security, traceability is one of the first prerequisites for the trustworthiness of a record. The computerized audit trail of a laboratory's data system holds the evidence of who did what to a record and when. According to McDowall, "audit trail is a software utility that monitors changes to selected data sets within the main application."1
Section 11.10(e) of 21 CFR Part 11 requires an audit trail for "actions that create, modify, or delete electronic records" and that it be "secure, computer-generated, time-stamped."2 It is neither new nor surprising that previous entries in the audit trail must not be obscured, a practice well known to the keepers of paper records in a cGMP environment.
During FDA inspections, auditors typically refer to laboratory logs for the sequence of analysis and manufacturing steps. Similarly, audit trails help to manage, control, and also inspect the history of changes made to raw data and intermediate results that are used to calculate final results. However, the audit trail is only a subset of the change management of electronic records. Change management for electronic records requires both an audit trail (frequently called a logbook) and revision control of records. A logbook merely describes what happened and when, but keeping the record under revision control establishes the exact details (for example, the chromatographic result before and after the change). When implemented properly, change management therefore can be used to answer who changed a record, what exactly was changed, when, and why.
Obviously, the capability of attaching audit comments to an electronic record helps the originator as well as the reviewer in documenting an action and justifying why it was done. Part 11 does not explicitly require entering a reason for a change, but some predicate rules do (for example, Good Laboratory Practice regulations). Some modern data systems therefore offer a function for fixed or user-definable audit comments. The data system can record, for example, that a certain method parameter was changed from value X to value Y, and in the comment section the analyst may state that this was because of a revised SOP.
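Put together, audit trail, revision control, and mandatory audit comments could look like the following Python sketch (the parameter name, user ID, and SOP reference are hypothetical):

```python
import datetime

class AuditedParameter:
    """A method parameter kept under revision control with an append-only
    audit trail: each change records who, what, when, old/new value, why."""

    def __init__(self, name, value, user):
        self.name = name
        self.revisions = [value]            # revision control: full history
        self.trail = [{                     # audit trail: the logbook
            "action": "create", "user": user,
            "time": datetime.datetime.now(datetime.timezone.utc),
            "new": value, "comment": None,
        }]

    @property
    def value(self):
        return self.revisions[-1]

    def change(self, new_value, user, comment):
        if not comment:                     # predicate rules may require a
            raise ValueError("reason for change is required")  # justification
        self.trail.append({
            "action": "modify", "user": user,
            "time": datetime.datetime.now(datetime.timezone.utc),
            "old": self.value, "new": new_value, "comment": comment,
        })
        self.revisions.append(new_value)    # never obscure earlier entries

flow = AuditedParameter("flow_rate_ml_min", 1.0, "analyst1")
flow.change(1.2, "analyst1", "revised per SOP CHR-012 rev 3")
```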
Finally, in addition to operational controls that enforce the permitted sequence of steps at the system level, audit trails also play a role in preventing "pencil whipping," that is, "the entry of data before an action occurs or at the end of the day, as an afterthought."6
Metadata is central to the trustworthiness of records and compliance with Part 11. Without metadata, the traceability of a record is extremely limited. The complete and uncorrupted package of raw data, metadata, and results represents a trustworthy and reliable set of information, providing assurance that results, production processes, and product quality are under control. Without metadata, it is not possible to "replay" the original result using the original input parameters. Even though "instant replay" is subject to enforcement discretion according to the 2003 Part 11 guidance, it is important for data migration when replacing legacy systems. Theoretically, firms can get away with not carrying legacy data forward, especially since this is a tough technical challenge for the regulated industries and their suppliers. Practically, however, it is an unacceptable waste of resources if there is no electronic data transfer between the original system and its replacement. Can a pharmaceutical development or quality-control laboratory afford to manually re-enter hundreds or even thousands of analytical methods? How efficiently can it investigate a complaint if there is only a paper archive that is not keyword searchable? How can it judge a small unspecified impurity if only a paper printout of the original chromatogram is available, with no possibility of zooming, reintegrating, or inspecting the spectral data?
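In code, the "replay" requirement reduces to a simple check: the archived metadata (method setpoints) must be sufficient to regenerate the archived result from the archived raw data. A minimal sketch, with a hypothetical integrate() function and made-up values standing in for a real integration algorithm:

```python
def replay(raw_data, method_setpoints, algorithm):
    """Recompute a result from archived raw data and the archived method
    revision, so it can be compared against the archived result."""
    return algorithm(raw_data, **method_setpoints)

# Hypothetical integration algorithm and archived values, for illustration.
def integrate(raw_data, threshold):
    return sum(x for x in raw_data if x > threshold)

archived_raw = [0.1, 5.0, 7.5, 0.2]
archived_setpoints = {"threshold": 1.0}
archived_result = 12.5

# Replay succeeds only if the stored metadata really is complete.
assert replay(archived_raw, archived_setpoints, integrate) == archived_result
```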
Electronic records generated by an analytical instrument can be regarded as trustworthy and reliable if there is evidence that the communication between the instrument and system controller is trustworthy and reliable.
In many cases, firms must rely on electronic raw data to perform regulated activities such as QA/QC testing of finished drug products for batch release or when investigating a suspected out-of-specification (OOS) result. For example, a regulatory agency may ask for documentation of instrument conditions to support the laboratory's conclusion that a certain result was not OOS due to a technical failure of the apparatus. It may be difficult to show evidence that a given measurement was in fact performed according to the defined procedure, unless there is detailed documentation of the metadata. Examples of such metadata are instrument setpoints used during the analysis and setpoints used for re-integration (including documentation of the previous setpoints and previous results). Without hard evidence like this, the regulatory agency may suspect attempts to test the results into compliance!
Level-4 instrument control employs techniques such as automatic tracking of instrument identification and configuration, early maintenance feedback (EMF), self-diagnostics, real-time data acquisition and synchronization independent of the computer, and bidirectional handshake protocols between devices and controllers for reliable and traceable instrument communication. We discuss level-4 instrument control further in the "Instrument Control" article in this series.
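The following Python sketch illustrates the general idea behind such a checksum-and-acknowledge exchange. It is not any vendor's actual protocol, merely a minimal model of how a controller can obtain evidence that a command reached the instrument unmodified:

```python
import zlib

def frame(payload: bytes) -> bytes:
    """Wrap a message with its CRC-32 so the receiver can detect corruption."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def receive(framed: bytes):
    """Verify the checksum; answer ACK if intact, NAK to request a resend."""
    payload, crc = framed[:-4], framed[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") == crc:
        return b"ACK", payload
    return b"NAK", b""

# Controller sends a setpoint; the instrument confirms intact receipt,
# giving the controller traceable evidence of reliable communication.
framed = frame(b"SET FLOW 1.2 mL/min")
reply, data = receive(framed)
assert reply == b"ACK" and data == b"SET FLOW 1.2 mL/min"

# A frame corrupted in transit is detected and retransmission requested.
reply, _ = receive(b"\xff" + framed[1:])
assert reply == b"NAK"
```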
Object Database Management Systems (ODBMS) are specifically designed to manage and store complex objects and their complex relationships. Applications based on ODBMS are commercially available, and suppliers have implemented data management systems based on ODBMS for several years.7
According to Loomis, one of the main benefits, apart from referential integrity and ease of system administration, is that "the storage of objects as objects, rather than fields of tables, not only maintains the inherent nature of the object, but can also eliminate 30-70% of a project's total code, which is typically used to map objects to tables."8
For analytical data management systems, a significant percentage of the entire data volume consists of instrument raw data. Due to the size of the raw data, it is typically stored in an efficient binary format. Binary objects in a database system are called binary large objects (BLOBs). Modern systems such as Oracle 9i allow BLOBs to be managed efficiently within the database. In contrast, older systems implemented their own hybrid data management structure: database schemas used a combination of relational tables managed in the database and binary objects managed in a flat file system.
Managing tables and objects within the database management system simplifies system maintenance significantly, as standard IT procedures for disaster recovery (backup) and data archiving can be used instead of specialized, two-fold procedures that have to synchronize processes on the database with processes on the file system.
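A small Python sketch using SQLite (chosen purely for illustration; the systems discussed above use databases such as Oracle 9i) shows the single-store approach: the relational metadata and the binary raw data live in the same database and are committed in the same transaction, so one standard backup captures the complete record:

```python
import sqlite3

# One database holds both the relational metadata and the binary raw data.
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE injection (
        sample     TEXT,
        raw_rev    INTEGER,
        method     TEXT,
        method_rev INTEGER,
        raw_data   BLOB      -- binary chromatogram stored in-database
    )
""")

raw = bytes(range(256)) * 4   # stand-in for a binary chromatogram
with con:                     # metadata and BLOB commit in one transaction
    con.execute(
        "INSERT INTO injection VALUES (?, ?, ?, ?, ?)",
        ("XYZ", 2, "A", 4, raw),
    )

(stored,) = con.execute(
    "SELECT raw_data FROM injection WHERE sample = 'XYZ'"
).fetchone()
assert stored == raw          # the raw data round-trips intact
```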
1. McDowall RD. Operational measures to ensure the continued validation of computerised systems in regulated or accredited laboratories. Laboratory Automation and Information Management 1995; 31(1):25-34.
2. FDA. Code of Federal Regulations, Title 21, Part 11; electronic records; electronic signatures; final rule. Federal Register 1997; 62(54):13429-13466. (See sections 11.10(b) and 11.30.)
3. FDA. cGMP warning letter, File No.: 04-NWJ-02. Available at URL: www.fda.gov.
4. FDA. cGMP warning letter, File No. 2004-NOL-03. Available at URL: www.fda.gov.
5. FDA. cGMP warning letter, File No. 04-NWJ-01. Available at URL: www.fda.gov.
6. FDA. Human Drug CGMP Notes 1997; 5(4). Available at URL: www.fda.gov/cder/dmpq/cgmpnotes.htm.
7. Loomis TP. The best of LIMS and object and relational DBMS can be combined. Scientific Computing and Automation 1998; Feb:73-76.
8. Guzenda L. 7 signs that you need an object database. Scientific Data Management 1999; Sep/Oct:30-33.