Transcript Data Quality Nightmare Video Capstone
Maybe we should get started. I have to meet with a vendor in less than an hour. If Richard doesn’t show up, I can just catch him up on what we talk about.
>> Well that’s a shame. Because the reason I wanted us all here at the same time is because I have a number of complaints here from various departments regarding data quality issues. And —
>> I’m sorry. I’m sorry. I apologize. Yell at me later; it was an emergency. How much did I miss?
>> That’s okay. Emergencies are our middle name. I was just saying that the reason I wanted us all here together in the same room, is because I have some complaints here from various departments regarding data quality issues that show that we have some problems.
>> Yeah, you probably got complaints from people in my department.
>> Oh, I do. Here’s some samples. I think these samples cover the problem areas that I’ve identified. For example, here’s one with data fields that are too short on some forms. Last names, street names being truncated. And then the truncated information gets forwarded on to other departments that need the full information.
>> Hm. So if the field sizes aren’t consistent from form to form, then just changing one form’s field size can have consequences down the line?
>> Here’s another where consistency is the problem. We’re using different terms for the same information. An encounter number on one form is referred to on another form as the case number.
>> How could that be confusing?
>> You’re kidding?
>> Oh. All right. Well, here’s another where data’s being entered in different ways. For instance, M/F on one form and Male/Female on another form. Now here we have a complaint where patients have multiple record numbers, and they’re not being linked. So we’re not collecting the required data. We’re putting out inaccurate reports. And complete medical records aren’t being generated, which can certainly have a major impact on patient care.
>> Oi is right. So now we have an idea of the breadth of the problem, any ideas on how to address them?
[ Pause ]
[ Silence ]
We ended up looking at a number of options to dig ourselves out of this hole we’d put ourselves in. We decided to create a data dictionary for the development of forms. This dictionary would specify the wording for each basic choice and, wherever appropriate, the number of characters in the field for that choice. For instance, the form would say that you have to choose between male, female, and unknown for gender, not just M, F, and U. Richard said that we can get algorithm software that can identify potential duplicate records. It can also perform data modeling to determine our data needs, and that can even include what we need in terms of compliance with UHDDS and/or other dataset requirements. We also discussed forming a team to manage communications, so that, for instance, data fields would not be cut off or changed without making sure everybody who needs to know would be properly notified. Whatever solutions we end up choosing, it’s clear that there’s going to have to be some training involved. But I think that we all expected that. It’s too complex to be solved just by pushing a button. The solution is going to require a certain amount of hands-on education.
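To make the two ideas in the narration concrete, here is a minimal sketch in Python of what a data dictionary check and a duplicate-record screen might look like. It is only an illustration under stated assumptions: the field names, the length limits, the record layout, and the matching rule (same name and birth date) are all hypothetical and do not come from the video. Real duplicate-detection software uses much more sophisticated matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """One data dictionary entry: label, field size, and permitted wording."""
    label: str
    max_length: int
    allowed_values: tuple = ()  # empty tuple means free text


# Hypothetical entries; the names and sizes are assumptions for illustration.
DATA_DICTIONARY = {
    "gender": FieldSpec("Gender", 7, ("Male", "Female", "Unknown")),
    "last_name": FieldSpec("Last name", 50),
    "street_name": FieldSpec("Street name", 60),
}


def validate(record):
    """Return a list of problems found when checking a form record."""
    problems = []
    for name, value in record.items():
        spec = DATA_DICTIONARY.get(name)
        if spec is None:
            problems.append(f"{name}: not in data dictionary")
            continue
        if len(value) > spec.max_length:
            problems.append(f"{name}: longer than {spec.max_length} characters")
        if spec.allowed_values and value not in spec.allowed_values:
            problems.append(f"{name}: '{value}' is not an allowed choice")
    return problems


def potential_duplicates(patients):
    """Group record numbers that share a normalized name and birth date.

    This exact-match rule is an assumed stand-in for a real matching
    algorithm; it only flags candidates for a person to review.
    """
    groups = {}
    for p in patients:
        key = (
            p["last_name"].strip().lower(),
            p["first_name"].strip().lower(),
            p["birth_date"],
        )
        groups.setdefault(key, []).append(p["record_number"])
    return [numbers for numbers in groups.values() if len(numbers) > 1]
```

With these definitions, a form that submits "M" for gender is flagged because only the full wording is permitted, and two record numbers for the same name and birth date are returned together as a candidate pair for linking. Keeping the field sizes and wording in one shared structure is what lets every form be checked against the same rules.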
1. What steps can they take to deal with consistency issues, duplicate records, and other challenges?
2. How can data quality be built into information systems?
3. What are some problems that may arise because of poor data quality?