29 From a HIPAA standpoint, once data is anonymized, it is no longer HIPAA data. If that data is later de-anonymized, it has the same structure as HIPAA data, but it no longer carries the HIPAA compliance requirements, even if it contains all the same elements. For many, this is a concerning loophole. Many organizations, even if they legally anonymize the data, are in effect giving out HIPAA data. They follow the letter of the law but not its spirit. The law was intended to keep people's data private, but modern data mining techniques can strip away that protection. What is worse, the data may be bought, sold, and traded without consent, or even without anyone knowing that it is going on.
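To make the de-anonymization concern concrete, here is a minimal sketch of how records might be linked back to names. It assumes, purely for illustration, that the released records still carry a ZIP code, birth year, and sex, and that a second dataset with names shares those fields. Every record, field name, and function name below is invented.

```python
# Toy illustration only: every record below is invented.
deidentified_records = [
    {"zip": "05401", "birth_year": 1975, "sex": "F", "diagnosis": "asthma"},
    {"zip": "05401", "birth_year": 1982, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "05602", "birth_year": 1975, "sex": "F", "diagnosis": "migraine"},
]

# Hypothetical second dataset that includes names (for example, a marketing list).
public_records = [
    {"name": "Person A", "zip": "05401", "birth_year": 1975, "sex": "F"},
    {"name": "Person B", "zip": "05401", "birth_year": 1982, "sex": "M"},
    {"name": "Person C", "zip": "05401", "birth_year": 1982, "sex": "M"},
]

# Fields assumed to appear in both datasets.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def link_records(deidentified, public, keys=QUASI_IDENTIFIERS):
    """Return (name, diagnosis) pairs where exactly one named record matches."""
    index = {}
    for person in public:
        key = tuple(person[k] for k in keys)
        index.setdefault(key, []).append(person["name"])

    matches = []
    for record in deidentified:
        names = index.get(tuple(record[k] for k in keys), [])
        if len(names) == 1:  # a unique match ties the diagnosis to one person
            matches.append((names[0], record["diagnosis"]))
    return matches


reidentified = link_records(deidentified_records, public_records)
print(reidentified)
```

In this toy data only one record links to a single name. The second record matches two people and stays ambiguous, and the third matches no one. The more fields two datasets share, the more often a match is unique.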
Brokers now sell very detailed information about population segments, including name, address, phone number, and email address, along with lists of people who have cancer, erectile dysfunction, bladder control problems, STDs, and so on. Not all brokers provide information this specific, but what is available does allow for targeted advertising campaigns.30
It will be interesting to see how this plays out as more people become aware of the role of data brokers. Many of these data brokers were unknown until Vermont created a law to govern them in 2018.31 Since then, we have started to discover many of the companies in the data broker market. By March of 2019, 121 companies had been identified, though obviously not all of them are interested in HIPAA data. Knowing where your data is has become far more challenging than ever before, because that data could literally be anywhere on planet Earth.
If only anonymized data were the sole concern. Kevin O'Reilly, a news reporter for the American Medical Association, reported on Project Nightingale, which puts patient data from the 2,600 hospitals that are part of Ascension Health into the hands of Google. Google's intent is to use artificial intelligence on the data. In fact, prior to that it spent $2.1 billion to acquire healthcare data on its users.32 Of course, data provided in this form is HIPAA data, and the HIPAA requirements must be followed. From a privacy standpoint, however, that data is provided without informed consent, so people do not have control of their own data.33 Given some of the flagrant misuse of data by Facebook and other technology giants, there are understandably some concerns about how that data will be used.
In the previous section we talked about applications that have medical content that is not validated by science or that contain false information. Given that volume is important, the more sources of data presented, the better. If one of those applications sends erroneous data, the data stream may be polluted. Remember that volume, velocity, and variety are extremely important from a sales standpoint. Veracity, to an extent, can be asserted simply by stating that the data is top notch. Undoubtedly, when it comes to buying data, some companies will do a better job than others of validating the data prior to purchase. Given the volume of potential data sources, this can be a daunting task for many organizations.
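As a rough sketch of what validating data before it enters the stream could look like, the example below screens incoming readings against plausibility limits and sets aside anything missing or out of range. The field names, source labels, and limits are hypothetical and chosen only for illustration. They are not clinical guidance, and real validation would go well beyond range checks.

```python
# Hypothetical plausibility limits, for illustration only (not clinical guidance).
PLAUSIBLE_RANGES = {
    "heart_rate_bpm": (20, 250),
    "body_temp_c": (30.0, 45.0),
}


def validate_reading(reading):
    """Return a list of problems found in one reading (empty means it passed)."""
    problems = []
    for field, (low, high) in PLAUSIBLE_RANGES.items():
        value = reading.get(field)
        if value is None:
            problems.append(f"{field}: missing")
        elif not (low <= value <= high):
            problems.append(f"{field}: {value} outside {low}-{high}")
    return problems


def split_stream(readings):
    """Separate readings that pass every check from those that fail any check."""
    clean, rejected = [], []
    for reading in readings:
        problems = validate_reading(reading)
        if problems:
            rejected.append((reading, problems))
        else:
            clean.append(reading)
    return clean, rejected


# Invented sample data from three made-up application sources.
sample = [
    {"source": "app_a", "heart_rate_bpm": 72, "body_temp_c": 36.8},
    {"source": "app_b", "heart_rate_bpm": 720, "body_temp_c": 36.9},
    {"source": "app_c", "heart_rate_bpm": 65},
]

clean, rejected = split_stream(sample)
for reading, problems in rejected:
    print(reading["source"], problems)
```

A check like this catches only obvious errors. A value that is wrong but plausible passes straight through, which is part of why veracity is so hard to establish at volume.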
Another challenge is that oftentimes this means sharing data globally, which means data can literally be anywhere. Health data can physically be located in any country. Although frowned upon, there is no law requiring U.S. health data to remain in the United States. Oftentimes, depending on the platform, that is exactly what happens. Some data brokers, not all, send data throughout the planet to ensure that, in case of an emergency, it is backed up. Unless a thorough investigation is performed about the platform and someone thinks to ask that question, the hospital or doctor's office may be blissfully unaware that the data is being spread throughout the world.
In the end, big data is about sharing of data and aggregating the right data sets in the right way. That data may or may not be HIPAA data, but may have all the markers of HIPAA data. The data may be collected from applications and shared in ways that we, as consumers, may not be aware of. It also holds the promise of expanding our scientific understanding and taking us into future directions we have only begun to imagine today. Big data is not about the data itself. There are goals and objectives from many different angles that make it important. There are also tools that data scientists use to sort through the volumes of data.
Data Mining Automation
Analyzing the volume of data that comes out of big data is not a minor undertaking. As the volume of data increases and our analysis becomes more and more subtle, it is often worthwhile to get some extra help analyzing that data. Unsurprisingly, there are numerous ways to help with that analysis. Most people immediately think of artificial intelligence, popularized partly by IBM's Watson, which beat out other contestants on Jeopardy! in 2011, and partly by science fiction. While artificial intelligence is certainly used, the term is often overused by marketing teams. Some of the subcomponents of artificial intelligence are more than sufficient to meet the data analytics needs of many companies. For example, many tools use machine learning or deep learning for analytics purposes. Each method has its own pros and cons, and each may bring value depending on the context. Although there are many other tools for working with data, the range of tools that data scientists use to correlate data and discern patterns offers vast improvements over what people can do on their own. Figure 3-1 demonstrates the relationship between the different technologies relating to data science and artificial intelligence. For simplicity, let's use the term artificial intelligence broadly to refer to the full range of available tools, although that is not technically accurate.
Figure 3-1: Relationship of data science to enablement technologies
One thing to keep in mind is that artificial intelligence and many of the related tools are in their infancy and require a tremendous amount of maturation to meet their full potential. Even in the healthcare market, the full potential has yet to be reached. McKinsey & Company identified three phases of artificial intelligence use that help to highlight the context. In the first phase, we are striving to apply it to repetitive and largely administrative tasks to reduce the existing workload. We are beginning to see this for specializations that work with images as well. The second phase is to use artificial intelligence in home care, often related to remote monitoring. In this phase artificial intelligence will be utilized more often as an aspect of the connected devices themselves. Phase three will focus more on embedding artificial intelligence within clinical processes.34 Being a cautious technological optimist, I am sure there will be further applications of artificial intelligence in the future, especially when it is tied to robotics.
It is important to look more deeply at these phases described by McKinsey. Injecting artificial intelligence into telemedicine has tremendous potential to push medicine into a more proactive mode than ever before. If we took the technology within health applications and tied it into augmented forms of monitoring devices connected to artificial intelligence systems, we could begin to detect potential medical issues (or diseases) early, before the onset of symptoms. The proactive measures of some of the simpler forms of artificial intelligence are already saving us millions of dollars and countless lives every year; just imagine how many more lives and how much more money we could save with more fine-grained medical information.
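To give a feel for what one of the simpler forms of proactive monitoring might look like, here is a minimal sketch that flags a reading when it departs sharply from a person's own recent baseline. The window size, threshold, function name, and heart-rate values are all assumptions made for illustration. A real monitoring system would use far richer models and clinically validated limits.

```python
from statistics import mean, stdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indexes of readings that deviate strongly from the preceding window.

    window and threshold are illustrative defaults, not clinical values.
    """
    flagged = []
    for i in range(window, len(readings)):
        baseline = readings[i - window:i]
        spread = stdev(baseline)
        if spread == 0:
            continue  # a perfectly flat baseline gives no scale to compare against
        z_score = abs(readings[i] - mean(baseline)) / spread
        if z_score > threshold:
            flagged.append(i)
    return flagged


# Invented resting heart-rate readings with one sudden spike.
resting_heart_rate = [62, 64, 63, 61, 63, 62, 64, 95, 63, 62]

alerts = flag_anomalies(resting_heart_rate)
print(alerts)
```

With this data the spike is the only reading flagged. Once the spike enters the baseline window, it widens the spread, so the readings that follow are judged against a looser standard. That is one reason a simple rule like this is only a starting point.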
Another area of interest for artificial intelligence is data mining EHRs, which includes mining the records of IoMT devices to look for predictors of risk. Obviously, this is another proactive measure that companies are focusing on. What is interesting is that this process is valuable from multiple angles. The hospital is doing it to help its patients, and the IoMT device providers are using the information not only to help patients, but also to fuel the next generation of improvements