When I kicked off the topic of getting started in analytics, I should have foreseen this collision of “hey we really want cool data visualization” and “OH MY GOSH THAT DATA IS REALLY SENSITIVE”. It’s also very timely that a new player in the HIPAA/Tech blog arena wrote about encryption in four succinct scenarios. In his post, Hudson shares the following in relation to the Anthem breach that affected about 80 million customers:
“That said, it (Anthem) did have a duty to implement an encryption and decryption plan that would address reasonably anticipated threats and hazards.”
Ouch. It’s not like someone had 80 million customers’ PHI on a spreadsheet on their laptop. Did they? It’s not an unlikely scenario. The point I take from it, though, is that leaving sensitive data in a loosely secured database is just as dangerous, especially when we keep that data around under the pretense of providing BI. In Anthem’s case, as I understand it, the attackers had legit credentials to the database. No encryption would have saved the day. Yikes. I digress.
As BI practitioners, we have a really tough job, and today’s technology doesn’t do an awesome job of solving it. Let’s discuss.
Data Security and Applications
I feel like application developers have it easy. Even if a database doesn’t have the ability to encrypt the data on disk, most modern programming languages have some ability to encrypt/decrypt data within the application and store it persistently in the database. The separation of concerns is there. The application knows how to tidy the data up and make it readable, can apply whatever security it wants, and everybody is happy. So data enters an app over a secure protocol, is encrypted along the way, and is then stored as such. The keys to do so reside in the application. As long as you abide by some due diligence rules, you have it easy.
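To make that concrete, here is a minimal sketch of the app-side pattern: the application holds the key, encrypts a sensitive field, and the database only ever sees ciphertext. Everything in it is my own assumption for illustration, including the `members` table, the `save_member`/`load_member` helpers, and above all the toy cipher built from the standard library’s HMAC-SHA256. A real application should use a vetted library (AES-GCM or similar) and a proper key store, not this stand-in.

```python
import hashlib
import hmac
import os
import sqlite3


def _subkey(key: bytes, label: bytes) -> bytes:
    # Derive separate keys for encryption and authentication (toy derivation).
    return hashlib.sha256(label + key).digest()


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # HMAC-SHA256 in counter mode. A stand-in for a real cipher, NOT for production.
    blocks = []
    for counter in range((length + 31) // 32):
        blocks.append(hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest())
    return b"".join(blocks)[:length]


def encrypt(key: bytes, text: str) -> bytes:
    nonce = os.urandom(16)
    data = text.encode("utf-8")
    stream = _keystream(_subkey(key, b"enc"), nonce, len(data))
    ciphertext = bytes(a ^ b for a, b in zip(data, stream))
    tag = hmac.new(_subkey(key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    return nonce + tag + ciphertext


def decrypt(key: bytes, blob: bytes) -> str:
    nonce, tag, ciphertext = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(_subkey(key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered data")
    stream = _keystream(_subkey(key, b"enc"), nonce, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream)).decode("utf-8")


def save_member(db: sqlite3.Connection, key: bytes, member_id: int, ssn: str) -> None:
    # Hypothetical helper: the field is encrypted before it ever reaches the database.
    db.execute("INSERT INTO members (id, ssn) VALUES (?, ?)", (member_id, encrypt(key, ssn)))


def load_member(db: sqlite3.Connection, key: bytes, member_id: int) -> str:
    # Hypothetical helper: only code holding the key can turn the blob back into text.
    row = db.execute("SELECT ssn FROM members WHERE id = ?", (member_id,)).fetchone()
    return decrypt(key, row[0])


key = os.urandom(32)  # in real life this lives in a key store, not in a variable
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE members (id INTEGER PRIMARY KEY, ssn BLOB)")
save_member(db, key, 1, "123-45-6789")

# What anyone querying the table directly gets back: an opaque blob.
raw = db.execute("SELECT ssn FROM members WHERE id = 1").fetchone()[0]
print(raw.hex())
# What the application, which holds the key, gets back.
print(load_member(db, key, 1))
```

The catch, and the reason for the next section, is that anything querying that table without going through the application sees only the blob.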
Data Security and Business Intelligence
I’m going to pick on SAP a little bit here (sorry friends). An application developer may have encrypted their data on the way into the database. That’s great. But nifty tools like SAP BusinessObjects don’t know how to decrypt it, and simply storing the decryption key in my object’s SQL is about as secure as…well…storing the data in an unencrypted database (or Excel).
What would be perfect? I don’t know.
What would help?
- Encrypted data, all the time, on disk, in memory, wherever. Simply put, it should persist that way.
- A data connection layer with the intelligence to decrypt that data.
Yeah, that’s about it. Could this have saved Anthem’s bacon? Potentially. Data encrypted at rest *might* have stopped someone with a flimsy set of credentials from just running a query to get data. There are certainly concerns that always-encrypted data may take us back 10 years in database performance. However, of all of the customers I’ve worked with in my 16 years in this field, I’ve only seen two (that I can think of) take a serious look at database encryption and exposing that data at the BI layer. In both of those scenarios, the lack of solutions on the market was the real inhibitor. Further, with as many healthcare customers, or customers handling healthcare data, as there are out there, it amazes me that current US regulations around protected health information haven’t spawned real innovation in this stagnant space.
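For what it’s worth, the two wishes above (data that stays encrypted, plus a connection layer smart enough to decrypt it) can be sketched in a few lines. This is purely a thought experiment, not how SAP BusinessObjects or any other product works. The `DecryptingConnection` class, the role names, the `claims` table, and the masking behavior are all hypothetical, and the cipher is a toy built from the standard library’s HMAC-SHA256 that stands in for a real, vetted one.

```python
import hashlib
import hmac
import os
import sqlite3


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy HMAC-SHA256 counter-mode keystream. Illustration only, NOT for production.
    blocks = []
    for counter in range((length + 31) // 32):
        blocks.append(hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest())
    return b"".join(blocks)[:length]


def encrypt(key: bytes, text: str) -> bytes:
    enc_key = hashlib.sha256(b"enc" + key).digest()
    mac_key = hashlib.sha256(b"mac" + key).digest()
    nonce = os.urandom(16)
    data = text.encode("utf-8")
    ciphertext = bytes(a ^ b for a, b in zip(data, _keystream(enc_key, nonce, len(data))))
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + tag + ciphertext


def decrypt(key: bytes, blob: bytes) -> str:
    enc_key = hashlib.sha256(b"enc" + key).digest()
    mac_key = hashlib.sha256(b"mac" + key).digest()
    nonce, tag, ciphertext = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong key or tampered data")
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream)).decode("utf-8")


class DecryptingConnection:
    """Hypothetical BI connection layer: it holds the key so report SQL never has to."""

    def __init__(self, db, key, encrypted_columns, allowed_roles):
        self._db = db
        self._key = key
        self._encrypted_columns = set(encrypted_columns)
        self._allowed_roles = set(allowed_roles)

    def query(self, sql, role, params=()):
        cursor = self._db.execute(sql, params)
        names = [column[0] for column in cursor.description]
        can_decrypt = role in self._allowed_roles
        rows = []
        for row in cursor:
            record = {}
            for name, value in zip(names, row):
                if name in self._encrypted_columns:
                    # Decrypt only for permitted roles; everyone else gets a mask.
                    value = decrypt(self._key, value) if can_decrypt else "****"
                record[name] = value
            rows.append(record)
        return rows


key = os.urandom(32)  # assumption: in practice this comes from a key management service
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE claims (member_id INTEGER, diagnosis BLOB)")
db.execute("INSERT INTO claims VALUES (?, ?)", (42, encrypt(key, "J45.909")))

layer = DecryptingConnection(db, key, {"diagnosis"}, {"clinical_analyst"})
sql = "SELECT member_id, diagnosis FROM claims"
analyst_rows = layer.query(sql, role="clinical_analyst")
intern_rows = layer.query(sql, role="intern")
print(analyst_rows)
print(intern_rows)
```

The design choice worth noticing is where the key lives: in the connection layer, not in the report definition and not in the database. A stray database account running the same query directly would get blobs, which is the whole idea.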
I’m really not picking on Timo and his so-simple-it’s-stupid-but-really-cute-and-accurate-napkin-sketch that describes the big data iceberg. I think it’s a great graphic. It’s relevant to customers of all shapes and sizes. But for those with serious regulatory and compliance challenges in Data and/or Information Security in particular, those challenges should really also be sketched in there as a scribble that says “oh yes, we do value our customers’ data…and don’t want to pay penalties of up to one-bajillion dollars or something”.
I can’t predict when the next Anthem is going to happen. I’m sure major corporations keep a slush fund hanging around just to cover penalties. But that is a bleak outlook for you and me, who, at the end of the day, just get some free credit protection for a while. In an ideal world, SAP would provide an end-to-end solution as a platform company, whether SAP HANA is the underpinning of that ideal world or not. In it, SAP’s ETL technology would manage the job of encrypting data on the way in, and SAP BusinessObjects would understand that data well enough to decrypt it on the way out. Customers would still have the responsibility to set the appropriate roles and rights for getting into that BI layer to access data, but at least we wouldn’t see fluke user accounts with database access expose 80-something-million customers’ data to the world.