It is almost trite to say that data is important in building corporate value today. The most valuable companies in the world by market capitalization, the so-called GAFAs (Google, Apple, Facebook, and Amazon), are those that hold massive amounts of data. But more data does not automatically translate to greater returns. Most banks, for instance, hold massive amounts of data yet cannot use it effectively. The value lies not in having the data but in being able to use it.
An illustrative anecdote: I was recently speaking to a data scientist at a well-known bank out of New York who was lamenting a conversation he had with another data scientist friend at Google. The data scientist at the bank manages a team of a dozen or so people. Within that team, the newest recruit, a PhD earning a commensurately hefty wage, is not actually doing data science at all but is relegated to running around the organization finding out where various data is stored and negotiating access to it one way or another.
His friend at Google, on the other hand, has access to all the Google data he might need because Google, appropriately for an organization self-tasked with organizing the world's information, has had an explicit strategy from day one to organize its data across the enterprise and make it usable.
Now, admittedly this is not only a data management problem but also a question of having a collaborative culture and the luxury of modern systems throughout that support cross-enterprise data usage. But the point stands: data is equivalent to corporate value today, and many companies are not accessing that value.
Making data usable
In a recent survey Ohalo carried out of several dozen executives in data management roles at large financial institutions, 55% said that they could not tell their board with certainty what data is stored where at any given point in time. This was one of the top three concerns expressed in the survey. One CIO said that in the single division he manages, he has 6,000 different databases, each with potentially hundreds of schemas, and each schema with potentially hundreds of tables. Bringing that data into a usable state is no easy task.
Extracting value from the data companies hold is not a particularly short road, but a reasonable first step is to build a basic understanding of what data the organization holds in the first place. Surprisingly, that starting point is congruent with the demands of new regulations like the General Data Protection Regulation (GDPR).
The GDPR opportunity to take control of your data
The subject of this blog is not the technical details of the GDPR, about which much has been written. Suffice it to say for our purposes that it is the largest change in data protection regulation in 20 years. It applies not only to European Union firms but to any firm in the world that gathers data about EU citizens. Importantly, it covers not only the firms that initially gather the data but also the third-party firms (hey, enterprise SaaS businesses!) that process data on their behalf ("Data Controllers" and "Data Processors", respectively, in GDPR parlance). Non-compliance puts these companies at risk of fines running into the billions of euros.
Some of the most-discussed aspects of new regulations like the GDPR are Data Subject Rights: essentially, the rights of consumers to request the data held about them, to erase that data, and to rectify it, among others. To meet these requirements, a basic understanding of what data is held about whom is necessary.
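To make those requirements concrete, here is a minimal sketch of a data inventory that supports the three rights just mentioned: access, rectification, and erasure. The `DataInventory` class and its method names are purely illustrative assumptions for this post, not any real product's API; a production system would of course have to track data across thousands of silos rather than a single in-memory map.

```python
class DataInventory:
    """Illustrative sketch: maps each data subject to the records held about them."""

    def __init__(self):
        # subject_id -> {field_name: value}
        self._records = {}

    def store(self, subject_id, field, value):
        """Record a piece of data held about a subject."""
        self._records.setdefault(subject_id, {})[field] = value

    def access(self, subject_id):
        """Right of access: return a copy of everything held about the subject."""
        return dict(self._records.get(subject_id, {}))

    def rectify(self, subject_id, field, value):
        """Right to rectification: correct a record that already exists."""
        subject = self._records.get(subject_id)
        if subject is not None and field in subject:
            subject[field] = value

    def erase(self, subject_id):
        """Right to erasure: delete all records held about the subject."""
        self._records.pop(subject_id, None)


inv = DataInventory()
inv.store("subject-42", "email", "old@example.com")
inv.rectify("subject-42", "email", "new@example.com")
print(inv.access("subject-42"))  # {'email': 'new@example.com'}
inv.erase("subject-42")
print(inv.access("subject-42"))  # {}
```

The point of the sketch is that none of these operations is possible without the underlying map of what is held about whom, which is exactly the inventory most enterprises lack.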
The more important point about the GDPR for risk officers and for staff trying to build data-centric businesses within large organizations is that it could provide the trigger to finally get control of data in the enterprise. There is an opportunity to comply with the regulation while also using the chance to understand where data sits and how to access it. As the aforementioned survey demonstrates, there is a real need across many enterprises to understand what data is there. What if it was, ironically, regulation designed to limit the use of data to end-consumer-approved cases that finally provided the impetus to get control of data across the thousands of silos that exist in large organizations?
Staff in data management roles can seize the regulatory forcing function that is GDPR to carry out a review of what data sits where. By understanding their data better, enterprises will be able to not only satisfy their regulators and customers but also build great products that raise the value of the enterprise as a whole.
Ohalo builds easily integrated enterprise data tools for data compliance. The Data X-Ray automatically classifies sensitive data with a machine learning algorithm. The Data Protection Router is a blockchain-based tool that tracks the lineage of data as it moves through the enterprise and across to third-party organizations. To schedule a demo of either, click here.