
© 2021 Excellerate. All Rights Reserved

Episode 12 | Are We Drowning in Healthcare Data?

August 6, 2021 | 25 min 44 sec

Podcast Host – Madhura Gaikwad, Excellarate

Podcast Guests – Punkaj Jain, SVP Healthtech at Excellarate | Arun Mirchandani, Executive Advisor and Technology Leader | Srinivasan Venkataraman, AVP – Operations at Excellarate

Brief Summary

According to an IDC report, the volume of healthcare data was 153 exabytes in 2013 and is projected to reach 2,314 exabytes in 2020, roughly 15 times more data in just seven years. So the question is: are we drowning in healthcare data today, and will it explode in another few years?

Tune in to this episode to explore the different types of healthcare data, and the challenges in collecting, managing, and using this data.


Madhura Gaikwad (00:09):

Hello, and welcome to the ZipRadio podcast powered by Excellarate. I’m your host Madhura, and the topic for today’s episode is: Are we drowning in healthcare data? In this episode, we try to understand different types of healthcare data. Continuing the discussion from the previous episode on interoperability in healthcare data, Punkaj Jain, Senior Vice President Health Tech at Excellarate, joins me as my co-host in this episode. Health tech expert and technology leader Arun Mirchandani joins us again, along with Srinivasan Venkataraman, who is the AVP Operations at Excellarate, to discuss the challenges in healthcare data. So welcome onboard, Arun, Srinivasan, and Punkaj.

Punkaj Jain (00:46):

Thanks, Madhura, for that nice introduction, and welcome, everybody. So, just to start and set the context: I was just reading one of the IDC reports, and it says the volume of healthcare data in 2013 was about 153 exabytes, and in 2020 it is projected we will have over 2,300 exabytes of healthcare data. That is 15 times more data in just the last seven years. And as many of you know, one exabyte is 1 billion gigabytes, right? So, do you think we are drowning in healthcare data? Wait another few years and it will explode. So, with that, Arun, let me turn to you and ask how, in your opinion, you conceptualize this concept of drowning in health data.
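
The arithmetic behind those figures can be checked in a few lines. This is a minimal sketch using only the two volumes quoted in the episode; the compound annual growth rate is a derived illustration rather than a number from the IDC report, and the function names are hypothetical.

```python
# Figures quoted in the episode (attributed there to an IDC report).
EB_2013 = 153        # exabytes of healthcare data in 2013
EB_2020 = 2_314      # exabytes projected for 2020
GB_PER_EB = 10**9    # decimal (SI) convention: 1 exabyte = 1 billion gigabytes


def growth_factor(start: float, end: float) -> float:
    """How many times larger `end` is than `start`."""
    return end / start


def annual_growth_rate(start: float, end: float, years: int) -> float:
    """Compound annual growth rate implied by growing from start to end."""
    return (end / start) ** (1 / years) - 1


factor = growth_factor(EB_2013, EB_2020)
cagr = annual_growth_rate(EB_2013, EB_2020, years=7)

print(f"growth factor over 7 years: {factor:.1f}x")
print(f"implied compound annual growth: {cagr:.0%}")
print(f"2020 volume in gigabytes: {EB_2020 * GB_PER_EB:.3e}")
```

The growth factor comes out slightly above 15, which matches the "15 times" quoted in the conversation.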

Arun Mirchandani (01:33):

First of all, thanks for giving me this opportunity. This is one in a series of discussions that we’ve been having around healthcare and some of its challenges, and today we’re going to be touching mostly on what I call too much of a good thing. You know, the saying goes, be careful what you ask for. And in the case of healthcare data, we have a really big problem plaguing us right now. And unfortunately, or fortunately, it is the data that’s ultimately going to help with the biggest challenges we have in healthcare, which are improving outcomes, reducing costs, and improving quality. But the way data exists today is what I call data demused disorder, DDD. We will talk more as the conversation goes, but yeah, we have too many things going on.

Punkaj Jain (02:22):

And what do you think? In your opinion, having been in this industry for over 20 years and working with the teams, how do you conceptualize this concept?

Srinivasan Venkataraman (02:33):

First of all, thanks, and thanks for the introduction. It is good to meet the same group in the second of our health tech conversations. The more the merrier, they say, but with data especially that is a dangerous thing, because the more we accumulate, the more our success depends on how well we handle the data. So, to your point, drowning in data is pretty apt. It resonates well with the conversation. From what I’ve seen in the industry and in our experience, the due diligence aspect is the most important: asking the right questions during due diligence and discovery. I’ll quote some of them here, which will help, and we can talk more later in the conversation. I think we have to ask our customers: Hey, are you looking to ingest and interpret data from multiple sources? What is the scope? Who is the target audience, and what is the usability? Is the system really for clinical use, or inventory management, or reporting?

Srinivasan Venkataraman (03:35):

You name it; there are several variants. Asking that is very important to design the schema better, which will eventually help with analytical needs and governance. And the third one is during the design: we need to be very clear in architecting our solution to separate inconsequential data from actionable data. The data is huge. Separating out the actionable data is what keeps us from drowning. We will float very well and won’t drown if it’s separated. And another thing is, will our insights, whatever actionable data we have separated, flow well into the downstream systems? Will it integrate well with our EHR? Will it integrate well with our hospital management system, or claims processing, or our orders management? These are some of the things we have mastered over the years: making sure that we do the due diligence, ask the right questions, design the system well, and keep the customer from drowning in the data, to answer your question precisely, Punkaj.

Punkaj Jain (04:37):

It sounds good. So, Arun, let me turn it to you. Given that description, it’s a big thing. I heard that there are many different types of data and all the different things. So maybe what we should do is briefly discuss what, in your opinion, we see as the different kinds of data, or the hierarchy, or any parameter, whatever you think. Then it conceptualizes it for everyone so that, okay, we are talking the same language, right?

Arun Mirchandani (05:00):

Yeah, good question. Just as a little background, you know, I classify healthcare data into three broad categories. There is obviously the most important and critical one, which is what we call the clinical data, the patient data, which is either stored in EMRs or stored in all sorts of siloed data or data lakes created by the likes of Apple smartwatches and Fitbit. Then there is obviously the operational or administrative data, which is collected by all the health systems, whether it is staffing levels, patient census levels, things like, you know, turnover rates, all those types of things that are all around the operations of healthcare delivery. And then the final one, probably the most important really, is the payer or the claims data, which is where, you know, the economics of healthcare are obviously the most important, and payers ultimately have access to all that data. So, figuring out which treatments are working and which treatments are not working, how much the treatment should cost, what the outliers are that can, you know, make the expenses go up, what pharmaceutical combinations are working: all of that is part of claims data. So again, you know, we have three categories: the patient or clinical data, the operational or administrative data, and the claims data.

Srinivasan Venkataraman (06:36):

Absolutely, I think it’s a good segue. Thanks, Arun. And claims data, because that is a source of income for our target customers like hospitals, covered entities, and pharmacies. So, think about it. Every day the accumulation of data is huge; as per your introduction, Punkaj, it goes in megabytes, exabytes. Handling the volume of data across all these categories comes first: the data keeps flowing, so how do you handle it, how do you channel it? I will take a couple of examples with respect to claims data and the EHR/EMR workflow, and how we use our EDI exchange capabilities. So, typically, with claims data, we work with either hospitals or pharmacies. The data comes from either a switch or a pharmacy, like a Cardinal Health or Relay Health, or from the respective pharmacy chain. The customers have their legal contracts and are bound by the rules, and as a software development partner, what we do is extract the data by creating a communication channel, a data transmission connection layer.

Srinivasan Venkataraman (07:41):

So, we need to make sure that uptime and downtime are managed well and that the service level agreement is met within a certain time. We need to make sure the data flows in, is loaded, and is processed well. So, in this data collection mechanism, there is a specific hierarchy rule. First is what we call the raw data. The raw data is an asset; we don’t want to tamper with it. We store it, then do the massaging, slicing, and dicing, and move it to the claims switch. From the claims switch it goes to claims data by processing a lot of rules. Typically, you know, claims are all about paid and reversals: how much is paid and how much is reversed. And then these claims, whatever claims we have captured in the 340B world or in enterprise pharmacy solutions, get counted into the inventory. And that inventory will be managed for placing orders, like a virtual inventory for our customers.
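
The flow described here (keep the raw records untouched, then net paid claims against reversals into a virtual inventory) can be sketched as follows. This is an illustration under stated assumptions, not the team’s actual pipeline: the `RawClaim` fields, the `PAID`/`REVERSAL` status values, and the drug codes are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RawClaim:
    """One record as received; frozen so the raw layer is never modified."""
    claim_id: str
    drug_code: str   # hypothetical product identifier
    quantity: int
    status: str      # "PAID" or "REVERSAL" (assumed status values)


def net_dispensed(raw_claims):
    """Net quantity per drug: paid claims add, reversals subtract."""
    totals = defaultdict(int)
    for claim in raw_claims:
        if claim.status == "PAID":
            totals[claim.drug_code] += claim.quantity
        elif claim.status == "REVERSAL":
            totals[claim.drug_code] -= claim.quantity
        else:
            raise ValueError(f"unknown status: {claim.status!r}")
    return dict(totals)


# The raw store is a tuple of frozen records, so later stages can only read it.
raw_store = (
    RawClaim("c1", "DRUG-A", 30, "PAID"),
    RawClaim("c2", "DRUG-A", 30, "PAID"),
    RawClaim("c3", "DRUG-A", 30, "REVERSAL"),
    RawClaim("c4", "DRUG-B", 10, "PAID"),
)

virtual_inventory = net_dispensed(raw_store)
print(virtual_inventory)
```

Deriving the inventory from the raw store, rather than updating records in place, means the netting rules can be changed and re-run without losing what was originally received.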

Srinivasan Venkataraman (08:28):

And apart from these layers, there is another caveat, which is the direct price, because finally you need to match a price, a dollar value, to it. And the direct price is almost a daily load and a weekly load, which is a wall of data as well. So, we have a slot, a partition, to manage all of these well. And there is another layer: depending upon where users are located geographically, they need to have access to the data in a secure fashion, following all the rules, like HIPAA and HL7. One other example I wanted to give, on the EHR workflow: there are some customers who ask, you know, can we reduce the time spent dealing with patient records? So, how quickly can you get the information from the EHR system to the hospital, or to any labs using DICOM? We have built a data extractor, and what it does is use any API, like a FHIR API, or a service layer using our EDI engine, to quickly transfer the data and make sure it is readily available for our physicians and doctors to act on for their patients. So, this is just an example on the claims data and EHR data and how well we use our electronic data interchange.
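
A FHIR API exchanges resources such as Patient as JSON documents. The sketch below only shows reading a few fields from such a document; it assumes a Patient resource shaped like the examples in the public FHIR specification, the sample values and the helper name `summarize_patient` are hypothetical, and a real extractor would fetch the resource from an EHR endpoint rather than from a string literal.

```python
import json

# Hypothetical sample; real data would come from an EHR's FHIR endpoint.
SAMPLE_PATIENT = """
{
  "resourceType": "Patient",
  "id": "example-1",
  "name": [{"family": "Doe", "given": ["Jane", "Q"]}],
  "birthDate": "1980-04-12"
}
"""


def summarize_patient(resource: dict) -> dict:
    """Pull out the few fields a downstream screen might need."""
    if resource.get("resourceType") != "Patient":
        raise ValueError("expected a Patient resource")
    names = resource.get("name") or [{}]
    first = names[0]  # use the first listed name for display
    parts = first.get("given", []) + [first.get("family", "")]
    return {
        "id": resource.get("id"),
        "name": " ".join(parts).strip(),
        "birth_date": resource.get("birthDate"),
    }


summary = summarize_patient(json.loads(SAMPLE_PATIENT))
print(summary)
```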

Punkaj Jain (09:37):

So, one of the things, if I may follow up with you: what I hear you say is that obviously scalability, security, and all those steps are critical. But also, I think you touched upon the point that without the context, the data has no meaning, right? Because we’ve been working for over 20 years in the healthcare space, we have the expertise and the context to be able to interpret this data. Is that a fair statement?

Srinivasan Venkataraman (10:01):

Absolutely. I think that’s very well said; thanks for bringing up that point, Punkaj. Yes, because of the domain experience, we clearly understand what kind of structure the file will have. Sometimes we work with pharmacy chains or a covered entity or a hospital, and sometimes, you know, we deal directly with the switch. So, the data that flows in has different segments. Segments are like tab-separated or comma-separated fields, the little technical aspects of the file. We know exactly what each segment means. Each segment has an attribute that maps to a specific parameter in the healthcare world. So that is purely subject matter knowledge, and it helps us define the schema well. From the scalability point of view, and for future purposes, it will be useful for governance, typically from an archival and purging perspective. I would like to add one thing that is very important in the healthcare space: you need to maintain the data for seven years. It should not even be, you know, archived or purged. After seven years, it goes through the archival process. So, our schema definition in the initial stage helps, because of our understanding of this world. I think that answers your question.
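
Mapping delimited segments to named attributes and applying the seven-year retention rule can be sketched like this. The column names in `SEGMENT_SCHEMA`, the sample rows, and the helper names are hypothetical; only the tab or comma delimiters and the seven-year period come from the conversation.

```python
import csv
import io
from datetime import date

# Hypothetical mapping from field position to a named attribute.
SEGMENT_SCHEMA = ("claim_id", "pharmacy_id", "drug_code", "fill_date")
RETENTION_YEARS = 7  # retention period mentioned in the conversation


def parse_segments(text: str, delimiter: str = "\t"):
    """Turn each delimited line into a dict keyed by the schema's names."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    records = []
    for row in reader:
        if len(row) != len(SEGMENT_SCHEMA):
            raise ValueError(
                f"expected {len(SEGMENT_SCHEMA)} fields, got {len(row)}"
            )
        records.append(dict(zip(SEGMENT_SCHEMA, row)))
    return records


def eligible_for_archival(fill_date: date, today: date) -> bool:
    """True once the record is at least RETENTION_YEARS old."""
    try:
        cutoff = fill_date.replace(year=fill_date.year + RETENTION_YEARS)
    except ValueError:
        # 29 February has no counterpart seven years later; use 28 February.
        cutoff = fill_date.replace(year=fill_date.year + RETENTION_YEARS, day=28)
    return today >= cutoff


sample = "c1\tp9\tDRUG-A\t2014-03-01\nc2\tp9\tDRUG-B\t2020-06-15"
records = parse_segments(sample)
today = date(2021, 8, 6)
archivable = [
    r["claim_id"]
    for r in records
    if eligible_for_archival(date.fromisoformat(r["fill_date"]), today)
]
print(archivable)
```

Rejecting rows with the wrong field count up front is one way to catch a layout change from the sender before it reaches the downstream schema.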

Punkaj Jain (11:12):

Thanks. Sounds good. Do you have a follow-up?

Arun Mirchandani (11:15):

Yeah. The only thing I would like to add is another way to look at the different data categories and the different problems that each presents. So, you know, operational, clinical, and claims data are sort of where the data originates. That’s the way to characterize those three. But another way to look at data is that at the lowest level you always have what I call raw data. The challenges associated with raw data are things like, as Srini touched on, security of the data, making sure it’s not leaked, storage, and scale. And then of course the most important one is interoperability. So, think of it like a data layer where the challenges that we as technologists, or you as users of data, are dealing with are security, storage, things like that. The next layer up, if you will: now that you have all that data, looking at it in a more organized and aggregated fashion brings a different set of challenges to be dealt with.

Arun Mirchandani (12:13):

So, you know, you deal with visualizing data, and worrying about data quality, its integrity, and data interoperability. So, when you have organized and aggregated data, these are some of the challenges that we have to deal with. Then the next layer up is interpreted data. This is where technologies for, you know, time series and trending analysis are used: detecting anomalies and creating alerts and alarm events based on those anomalies, so that something can be done about the data and we can get information from it. So you have the raw data and the organized data, and now you’re interpreting the data and getting information from it. And then finally there is data intelligence. You know, it’s all good to be able to see things and to be able to trend them, but really the biggest bang for the buck comes when you’re able to use the data intelligence to gain knowledge, right? Ultimately, it is about predicting where things are going before they get bad, or detecting patterns in data and cross-referencing different data streams to create a context about what’s happening. So you might get a data stream from the clinical side and a data stream from the operational side, and put them together to see what might be happening in the hospital that we could predict and correct ahead of time. And I’ll give you some examples as we go forward. So that sort of knowledge is there, right? So that’s another way to look at the hierarchy of data: what kind of problems we face at each level, and how we use that information to build on the data.
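
The "interpreted data" layer, where anomalies in a time series trigger alerts, can be illustrated with a simple trailing-window z-score. This is a minimal sketch of one common technique, not the method used by any system discussed in the episode; the window size, the threshold, and the admissions numbers are made up for illustration.

```python
from statistics import mean, stdev


def flag_anomalies(values, window=5, threshold=3.0):
    """Return indexes whose value lies more than `threshold` sample standard
    deviations from the mean of the preceding `window` values."""
    flagged = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        # Skip flat histories, where the deviation would be zero.
        if sigma > 0 and abs(values[i] - mu) > threshold * sigma:
            flagged.append(i)
    return flagged


# Hypothetical daily admission counts with one obvious spike.
daily_admissions = [40, 42, 41, 39, 43, 41, 40, 75, 42, 41]
alerts = flag_anomalies(daily_admissions)
print(alerts)
```

In this sample only the spike at index 7 is flagged. The days right after it are not, because the spike itself inflates the trailing standard deviation, a known side effect of this simple approach.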

Punkaj Jain (13:55):

It’s a great classification, and I would love to have that follow-up discussion between you and Srinivasan. But as you remember, in the first episode I put myself in the position of a patient and asked you both, how come I can get my health data on my phone, right? So, when you are talking about this classification, as a patient, all this other stuff, like claims data and technical security, is a must-do. But I, as a patient, am more excited about this knowledge layer as you characterized it, because that’s what I expect from the healthcare system: that you’ll be able to predict what’s going to happen to my health, or educate me that these are the things I should take care of now, right? So, I think that’s a wonderful classification you did, and I would love to hear more on it from you and Srinivasan.

Arun Mirchandani (14:44):

Yeah. I mean, we’re really in very exciting times, because of the availability of compute resources and the ease, or the ubiquitousness, of data generation from different sources. And of course, the advances in, you know, machine learning and artificial intelligence have put us at the point where we can start to think about practical uses of all these data streams, and about combining them in a way that they provide knowledge, not just to the patient, but also to, you know, healthcare operators, care providers, everyone, right? So, I have some examples from my experience of how these are actually being used in different aspects of healthcare delivery.

Punkaj Jain (15:22):

Srinivasan, do you have any input on the way Arun characterized these four categories: raw data, organized data, interpreted data, and data intelligence?

Srinivasan Venkataraman (15:33):

Absolutely. I’d like to add my points. See, at the end of the day, you also ask the same question: Okay, I’m sharing my data. I know what data is being collected by the respective parties. What do I get? What do I see? It’s all about the visualization, how well you can visualize depending upon the actors, the personas. That’s the key to success. That’s where we differentiate whether you are really drowning in data or you are managing the data in an organized, planned fashion. So, there are several views: for financial accounting, how the healthcare data should be shown, or a balanced scorecard, hospital administration, patient eligibility. There are several ways we can create a business intelligence dashboard. We have used software like Power BI, SQL Server Reporting Services, or Tableau.

Srinivasan Venkataraman (16:25):

Look, those are all visual media. Finally, the end goal is for everyone, all the personas, to see how well the data has been collected, what the end result is, and how we need to take it forward. And this is in a manual fashion. To Arun’s point, the question is how well this can be automated and optimized using future technology growth, for example robotic process automation. You see, data entry is a nightmare in the healthcare industry. We cannot have a thousand-seater operation entering data for one hospital. So how well can it be collected using tools like, you know, Selenium or UiPath, using an RPA method? And the future is going to be bots, you know, without even a human interface. It’s more virtual: a voice that will be talking to humans to collect the data. That is more from NLP, Natural Language Processing. So that covers, to your point, everything from data collection and the raw data, to how well it will be massaged using various intelligent analytical algorithms, to how it will be presented in the visualization.

Punkaj Jain (17:25):

Good, good, good, good. And I heard,

Arun Mirchandani (17:27):

Yeah, that reminds me of some use cases that I recently came across, very clever use cases of operational data. You know, one of the biggest cost factors for health systems is personnel costs. You can imagine a hospital employing four to five thousand nurses and several other healthcare workers; scheduling their time based on the amount of patient traffic, or the census as they call it, is a very hard task. It’s not like everybody shows up 8 to 5 and, you know, does their work and goes. It has to correlate with the patient traffic and the severity of the cases that they’re handling. And this used to be a virtual challenge for all the hospitals, until some of these intelligent ML technologies, machine learning technologies, came around, where one company looked at the historic data based on, you know, different days of the week, different times of the year, and different weather conditions.

Arun Mirchandani (18:31):

And they were able to predict how much traffic, meaning patient traffic, that hospital would see. So, they looked at ten years of historic time series data, and they were able to look at trends within that time. They were able to predict that if it is, you know, the first week of August, when school starts, and this was somewhere in Madison, you know, they’ll have increased traffic, which will lead to increased accidents, which may lead to increased emergency room admissions, which may lead to, you know, higher staffing levels. So being able to predict that was a very useful thing. And the hospitals were ultimately able to, you know, save a lot of money because they didn’t have to put people on overtime. And, you know, sometimes you have excess demand and sometimes you have excess supply, and they were able to balance that quite clearly. So, that is one very easy example of what I call smart staffing.
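
The smart staffing idea can be illustrated with a much simpler stand-in for the machine learning described here: average the historical census by day of the week and convert the expected patient count into a staffing number. The history, the five-patients-per-nurse ratio, and the function names are assumptions for illustration; the company in the example used richer features (time of year, weather) and real models.

```python
import math
from collections import defaultdict
from datetime import date


def average_census_by_weekday(history):
    """history: iterable of (date, patient_count). Returns {weekday: mean}."""
    buckets = defaultdict(list)
    for day, count in history:
        buckets[day.weekday()].append(count)  # Monday == 0, Sunday == 6
    return {wd: sum(counts) / len(counts) for wd, counts in buckets.items()}


def nurses_needed(expected_patients, patients_per_nurse=5):
    """Round up so the assumed patient-to-nurse ratio is never exceeded."""
    return math.ceil(expected_patients / patients_per_nurse)


# Hypothetical census history: two Mondays and two Saturdays.
history = [
    (date(2021, 7, 5), 100),   # Monday
    (date(2021, 7, 12), 110),  # Monday
    (date(2021, 7, 10), 60),   # Saturday
    (date(2021, 7, 17), 64),   # Saturday
]

profile = average_census_by_weekday(history)
monday_staff = nurses_needed(profile[0])
saturday_staff = nurses_needed(profile[5])
print(profile, monday_staff, saturday_staff)
```

Even this crude weekday profile shows how a forecast turns into a schedule: the expected census drives the headcount, so overtime and idle shifts can both be reduced.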

Arun Mirchandani (19:21):

Another example that I saw was at Stanford, where another company was able to improve their OR (operating room) turnaround time. There was a particular pediatric neurosurgeon, and, you know, he was one of the team in the world, in the country, and his time was very hard to schedule. They wanted to make sure that when his surgeries were scheduled in the operating room, all other surgeries would be put on hold so that they could optimize the use of his time. So again, you can think of it as a multidimensional optimization problem, where they looked at how long those surgeries typically took, what the outlier cases were, and how to make sure that the operating room was always available with the right equipment and the right trained staff, so that the surgeon could optimally perform surgery. And again, they used machine learning and a lot of historic data analysis to make sure that not only was the room available, but all the supplies that the surgeon needed, and all the people that the surgeon would rely on, staff and so forth, were available at the right time so that his surgeries were not delayed.

Arun Mirchandani (20:37):

All other surgeries had to be rescheduled around that time, right? So, those are two very good examples, staffing and also operating room scheduling, so that you can maximize the revenues from that service line,

Punkaj Jain (20:50):

I think those are great use cases, Arun. So, basically the point I think we hear is, okay, it is one thing to say that this is the problem, but another thing to be able to say, okay, this is how you solve the problem, right? So, these are very nice use cases, and that’s what this whole data thing is about, and the way Arun characterized it. That’s where we are heading, and that’s where the value will be. That’s where the companies or hospitals or different organizations will be able to either save cost or add additional revenue, because now they’re providing more value from the data they have, right? Srinivasan, anything on what Arun said? Any comments, any thoughts?

Srinivasan Venkataraman (21:30):

Yes, yes. And see, we talked about the visualization layer and, again, drowning in data. So, what is the solution for it? I mean, we talked about various workflows and various layers, how the data gets transformed and goes and settles down in one place. So, the education we normally give is that you need to think about the longevity of the data and how well you’re going to use it. So, from the storage and data warehouse point of view, there are three categories. The first we call the EDW. That is your enterprise data warehouse. That’s the top layer. The next one is the operational data store, if you don’t want to invest much in the data warehouse but would like to have a data store. That is called ODS. And the next layer is called DM, data mart; that

Srinivasan Venkataraman (22:18):

I would call a data warehouse light version, a very light version of this thing. So, there will not be much in the way of cubes and dimensions. This is what will put your data in a much more streamlined fashion. It can help with analytical purposes and predictive analytics, which, to the point that we discussed earlier, is going to be the future. And for predictive analytics, this kind of partitioning, slotting, and storage is important. And it is already in use. In one of the TED talks I saw, the industry is trying to use the data for predicting heart attacks based on people’s culture, their lifestyle, their geographical location, their age, and, you name it, various parameters. Each one of these parameters is data, and that comes from the warehouse. And that’s the end point I would like to mention.

Punkaj Jain (23:02):

That’s a great ending, and I think it was a really good discussion. I think we, as they say, peeled the next layer from where we started in the first episode. And today, in this episode, I definitely got to know the hierarchy and the different ways of looking at data. So, it was a great discussion. I think we are almost running out of time. So, Arun and Srinivasan, if you have any parting comments, I’ll hand it over to you.

Arun Mirchandani (23:26):

Just, you know, again, great conversation, and thanks for the opportunity. I’m really excited about some of the practical uses that I’m beginning to see in how all of the different types of data are being used for medical imaging and in developing new drug therapies, and even for what’s happening right now in the pharmaceutical and adverse reaction monitoring of all the vaccines. All this is happening right now, as we speak, and all of it depends on this set of data problems that we talked about. So, thank you.

Srinivasan Venkataraman (24:01):

Absolutely. We welcome the technology growth and we embrace it. Our healthcare data and healthcare solutions are driving towards artificial intelligence, machine learning, and natural language processing. And in the whole mix there is blockchain, and it is hitting its stride. And we are ahead of the market. There are several layers of blockchain that are essentially being used to handle EMR and EHR data, like Hyperledger and Ethereum. There are various flavors. So, what I would like to mention is taking the traditional way of handling data and using it with advanced technologies. That is the roadmap, and I hope it is used well and is useful for all the parties.

Punkaj Jain (24:45):

For me, the key takeaway from today’s conversation is that yes, health data is huge. However, today I heard that we will not be drowning in health data; rather, we will make intelligent and predictive decisions impacting people’s lives, ultimately based on the data. In addition, I love the way Arun defined the health data pyramid. It makes it much easier to conceptualize the different sets of data. Thanks again, Arun and Srinivasan, for an open discussion and for providing your insights. Thanks, Madhura, for organizing the event.

Madhura Gaikwad (25:15):

Thank you, Punkaj, and thanks, Arun and Srinivasan, for taking the time again to join us today. This discussion has definitely given our listeners a lot to think about, and we will keep it going in our upcoming session to explore challenges in the healthcare domain. Thank you, everyone, for tuning in. If you are looking to accelerate your product engineering, digital transformation, and business agility, visit our website for more information. Thank you.
