- Clinical Trial NCT02795806
NLM Scrubber: NLM's Software Application to De-identify Clinical Text Documents
Background: Electronic health records contain a vast amount of data about diseases and treatments. Researchers could use this data to test their ideas, but they would need to use records from more than just their own group of patients. But access to those records is restricted to ensure patient privacy.
The U.S. National Library of Medicine (NLM) has created a computer tool called NLM Scrubber. This program recognizes and deletes personal information from health records. The researchers who developed this program now need access to the original records. This will allow them to see how well the program removes personal information from patient records and how they can make it more accurate.
Objectives:
To find ways to improve clinical text de-identification.
Eligibility:
No new participants. Researchers will review data that have already been collected.
Design:
Researchers will collect a random sample of reports. These will be from different doctors in different fields.
Researchers will manually remove personal information from the records.
Researchers will also automatically remove personal information from original records using NLM-Scrubber.
Researchers will compare the results of the computer program versus the manual changes. They will note when the program has not been removing personal information correctly. They will also note when the program has been deleting nonpersonal health information incorrectly.
Researchers will use the results to revise the program. They will keep testing it until the de-identification process is complete.
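The comparison step in the design above can be illustrated with a minimal sketch. It assumes each report is split into tokens and that both the manual annotation and the program output are given as sets of token positions marked as personal information. The function name `compare_redactions` and the sample sentence are hypothetical and are not taken from NLM Scrubber.

```python
def compare_redactions(gold_pii, predicted_pii):
    """Compare manually labeled PII token positions with those the program redacted.

    Both arguments are iterables of token positions (an assumed representation).
    """
    gold, predicted = set(gold_pii), set(predicted_pii)
    missed = sorted(gold - predicted)          # personal information the program left in
    over_redacted = sorted(predicted - gold)   # nonpersonal information the program removed
    return missed, over_redacted


# Made-up example sentence, tokenized by hand.
tokens = ["Mr.", "Smith", "seen", "on", "03/04/2015", "for", "diabetes"]
gold_pii = {1, 4}        # annotators marked the name and the date
predicted_pii = {1, 6}   # the program redacted the name and, wrongly, the diagnosis

missed, over_redacted = compare_redactions(gold_pii, predicted_pii)
print([tokens[i] for i in missed])         # ['03/04/2015']
print([tokens[i] for i in over_redacted])  # ['diabetes']
```

The two lists correspond to the two kinds of error the researchers note: personal information not removed, and nonpersonal health information deleted incorrectly.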
Study Overview
Status
Conditions
Detailed Description
This study is about the quality assessment, improvement, and monitoring of an automatic clinical text de-identification software application called NLM Scrubber, which has been developed at the National Library of Medicine (NLM). The application has been developed so that clinical reports can be used in secondary scientific studies (i.e., for secondary use) without breaching patient privacy. Research on methods for protecting patient privacy and on the development of NLM Scrubber has been conducted following the guidelines of, and in compliance with, HIPAA and the Privacy Act.
In order to further develop and improve NLM Scrubber and assess its de-identification performance effectively, the investigators require the original, unredacted samples from all potential clinical report types and sources. To this end, NLM investigators have been collaborating with entities within NIH (namely the NIH Clinical Center, BTRIS, and NCI) as well as outside entities: the Kentucky State Registry, administered by the University of Kentucky, and researchers from the University of Pittsburgh, who stated their interest in integrating NLM Scrubber into their application, called Text Information Extraction System. These entities collect samples of various types of clinical reports for assessing and improving NLM Scrubber performance. However, we also need access to the original data in order to assess potential problems and improve the accuracy of NLM Scrubber.
Study Type
Enrollment (Estimated)
Contacts and Locations
Study Locations
- Bethesda, Maryland, United States: National Library of Medicine
Participation Criteria
Eligibility Criteria
Ages Eligible for Study
Accepts Healthy Volunteers
Sampling Method
Study Population
Description
- No new participant enrollment. Researchers will review data that have already been collected.
Study Plan
How is the study designed?
Design Details
- Observational Models: Other
- Time Perspectives: Retrospective
Cohorts and Interventions
| Group / Cohort |
|---|
| 1: Everybody for whom a clinical narrative report is created. |
What is the study measuring?
Primary Outcome Measures
| Outcome Measure | Measure Description | Time Frame |
|---|---|---|
| The rate of de-identification of PII | The HIPAA Privacy Rule defines 18 types of personally identifying information that need to be de-identified, including personal names, addresses, significant dates, and numeric identifiers (such as Social Security numbers). Our annotators label those words and numbers, creating a gold standard, and NLM-Scrubber tries to recognize and eliminate all of them. The rate of de-identification of PII measures how successfully it does so. | 01/01/2017-01/31/2027 |
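As a rough illustration of this measure, the sketch below computes the share of gold-standard PII tokens that were redacted. The record does not give an exact formula, so this definition (recall against the annotators' gold standard), the function name `deidentification_rate`, and the sample positions are all assumptions.

```python
def deidentification_rate(gold_pii, redacted):
    """Fraction of gold-standard PII token positions that were redacted (assumed definition)."""
    gold = set(gold_pii)
    if not gold:
        return 1.0  # nothing to de-identify, so nothing was missed
    return len(gold & set(redacted)) / len(gold)


# Hypothetical report: annotators marked four PII tokens, the program caught three of them.
rate = deidentification_rate(gold_pii={1, 4, 9, 12}, redacted={1, 4, 9, 20})
print(rate)  # 0.75
```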
Secondary Outcome Measures
| Outcome Measure | Measure Description | Time Frame |
|---|---|---|
| The rate of erroneously redacted clinical information | Although NLM-Scrubber tries to eliminate only PII elements and preserve non-identifying study data, it inadvertently deletes some non-identifying study data elements (non-protected health information) as well. The rate of erroneously redacted clinical information measures the failure of NLM-Scrubber to preserve non-identifying health information. | 01/01/2017-01/31/2027 |
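A comparable sketch for this secondary measure is shown below. It assumes the rate is the number of non-PII tokens that were redacted divided by the total number of non-PII tokens in the report. The record does not state the denominator, so that choice, the function name `erroneous_redaction_rate`, and the sample values are assumptions.

```python
def erroneous_redaction_rate(num_tokens, gold_pii, redacted):
    """Fraction of non-PII tokens that were redacted anyway (assumed definition)."""
    gold = set(gold_pii)
    non_pii_total = num_tokens - len(gold)
    if non_pii_total == 0:
        return 0.0  # the report contains no non-identifying tokens to lose
    wrongly_redacted = set(redacted) - gold
    return len(wrongly_redacted) / non_pii_total


# Hypothetical 24-token report with four PII tokens; one non-PII token was redacted.
error_rate = erroneous_redaction_rate(24, gold_pii={1, 4, 9, 12}, redacted={1, 4, 9, 20})
print(error_rate)  # 0.05
```

Tracking this rate alongside the primary measure shows the trade-off between removing every identifier and preserving clinical content for secondary use.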
Collaborators and Investigators
Investigators
- Principal Investigator: Mehmet M Kayaalp, Ph.D., National Library of Medicine (NLM)
Publications and helpful links
General Publications
- Kayaalp M. Patient Privacy in the Era of Big Data. Balkan Med J. 2018 Jan 20;35(1):8-17. doi: 10.4274/balkanmedj.2017.0966. Epub 2017 Sep 13.
- Kayaalp M, Browne AC, Dodd ZA, Sagan P, McDonald CJ. De-identification of Address, Date, and Alphanumeric Identifiers in Narrative Clinical Reports. AMIA Annu Symp Proc. 2014 Nov 14;2014:767-76. eCollection 2014.
- Kayaalp M, Browne AC, Callaghan FM, Dodd ZA, Divita G, Ozturk S, McDonald CJ. The pattern of name tokens in narrative clinical text and a comparison of five systems for redacting them. J Am Med Inform Assoc. 2014 May-Jun;21(3):423-31. doi: 10.1136/amiajnl-2013-001689. Epub 2013 Sep 11.
Study record dates
Study Major Dates
Study Start
Primary Completion (Estimated)
Study Completion (Estimated)
Study Registration Dates
First Submitted
First Submitted That Met QC Criteria
First Posted (Estimated)
Study Record Updates
Last Update Posted (Actual)
Last Update Submitted That Met QC Criteria
Last Verified
More Information
Terms related to this study
Keywords
Other Study ID Numbers
- 999916122
- 16-LM-N122
Plan for Individual participant data (IPD)
Plan to Share Individual Participant Data (IPD)?
IPD Plan Description
Drug and device information, study documents
Studies a U.S. FDA-regulated drug product
Studies a U.S. FDA-regulated device product
This information was retrieved directly from the website clinicaltrials.gov without any changes. If you have any requests to change, remove or update your study details, please contact register@clinicaltrials.gov. As soon as a change is implemented on clinicaltrials.gov, this will be updated automatically on our website as well.