One of the best analogies I have heard for organizations without a data strategy is a supermarket that sells everything in identical white bottles with no labels. If you go shopping for ketchup in such a supermarket, you first have to tell the ketchup apart from the rat poison and everything else, and then work out whether the ketchup is still fresh. While this analogy may sound extreme, the state of data in organizations without a data strategy closely resembles this hypothetical situation. So what are the basic considerations for figuring out the raison d’être for a data strategy, and why do it now? Is this a problem only for large enterprises, or is it a problem for every data-driven startup as well?
One of the primary drivers for more data collection in today’s world is a renewed focus on continuous user engagement and the associated need to measure and predict everything. The lean startup movement has brought innovation best practices to the wider business world, in addition to serving as a model for building successful new businesses. Let us consider some questions that clarify the need for a data strategy.
What type of data makes business sense?
For a traditional business focused on processes, data was a by-product used to enable the process and was mostly discarded after the transaction. Today, customer engagement and experience are the focus of lean startups, and data is critical in making that happen. Data is no longer discarded once the process has completed. It is also no longer sufficient to record only the data associated with an internal event. It is now necessary to identify the external triggers for an internal event and store that information as well. Identifying and capturing these associations enables businesses to spot patterns that help with future planning.
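As a rough illustration of storing an external trigger alongside an internal event, here is a minimal Python sketch. The class names, field names, and sample events are all hypothetical and chosen only for this example; a real system would define its own schema.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class ExternalTrigger:
    """Hypothetical description of what prompted an internal event."""
    source: str   # e.g. "email_campaign" or "weather_alert"
    detail: str


@dataclass(frozen=True)
class InternalEvent:
    """An internal event stored together with its external trigger, if known."""
    name: str
    occurred_at: datetime
    trigger: Optional[ExternalTrigger] = None


def trigger_counts(events):
    """Count how often each (trigger source, event name) pair occurs."""
    return Counter(
        (e.trigger.source if e.trigger else "unknown", e.name) for e in events
    )


# Made-up sample data for illustration only.
events = [
    InternalEvent("purchase", datetime(2018, 5, 1, 9, 0),
                  ExternalTrigger("email_campaign", "spring sale")),
    InternalEvent("purchase", datetime(2018, 5, 1, 9, 30),
                  ExternalTrigger("email_campaign", "spring sale")),
    InternalEvent("purchase", datetime(2018, 5, 2, 14, 0)),
    InternalEvent("support_call", datetime(2018, 5, 3, 11, 0),
                  ExternalTrigger("service_outage", "payment gateway")),
]

counts = trigger_counts(events)
for (source, name), n in counts.most_common():
    print(f"{source} -> {name}: {n}")
```

Keeping the trigger on the event record itself, rather than in a separate log, is one possible design choice; it makes the association available to any later analysis without a join.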
Whose data is it anyway?
Data that is stored by a business is governed by multiple laws, both local and international. These range from the Health Insurance Portability and Accountability Act (HIPAA) in the USA to the General Data Protection Regulation (GDPR) of the EU. For example, GDPR applies to any business that offers services over the internet to users in the EU, irrespective of the physical location of that business. Also, sharing data with external entities is the norm rather than the exception, especially if the data relates to that external entity. For example, Facebook allows users to download, at any time, the data that Facebook has collected about them. Rules governing data storage vary by jurisdiction and evolve with time.
[Figure: Digital Asset Lifecycle, from the FileCamp blog]
Which is the right version of the truth?
The primary function of data ingestion is data purification and enrichment. Enrichment of data is not always a real-time operation in big data systems. Enriched and sanitized data tends to move across different application ecosystems, and it is quite common to end up with multiple copies of the same data in different data stores, or even in the same one. When we want to refer to the data in the future, it is critical to know how to identify the right version of the data and how to locate it. This also includes dealing with data decay, where stored data becomes less accurate or less relevant as time passes.
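One way to think about "the right version" is as a rule applied over all known copies of a record. The Python sketch below assumes, purely for illustration, that the most recently enriched copy is authoritative and that a copy counts as decayed once it is older than a fixed age. The names `DataCopy`, `authoritative_copies`, and `is_decayed`, the one-year threshold, and the sample data are all hypothetical; a real policy would come from the data strategy itself.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class DataCopy:
    """One copy of a record held in some data store (hypothetical schema)."""
    record_id: str
    store: str
    enriched_at: datetime


def authoritative_copies(copies):
    """Return, per record_id, the most recently enriched copy."""
    latest = {}
    for copy in copies:
        current = latest.get(copy.record_id)
        if current is None or copy.enriched_at > current.enriched_at:
            latest[copy.record_id] = copy
    return latest


def is_decayed(copy, now, max_age=timedelta(days=365)):
    """Treat a copy as decayed once it is older than max_age (an assumed policy)."""
    return now - copy.enriched_at > max_age


# Made-up sample data for illustration only.
copies = [
    DataCopy("cust-1", "warehouse", datetime(2018, 1, 10)),
    DataCopy("cust-1", "crm", datetime(2018, 3, 5)),
    DataCopy("cust-2", "warehouse", datetime(2016, 6, 1)),
]

now = datetime(2018, 6, 1)
best = authoritative_copies(copies)
for record_id, copy in best.items():
    status = "decayed" if is_decayed(copy, now) else "fresh"
    print(f"{record_id}: use copy in {copy.store} ({status})")
```

"Latest wins" is only one possible rule; some organizations instead designate a single system of record per data domain, in which case the lookup would key on the store rather than the timestamp.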
It is 3 AM, do you know where your data is?
It is a cliché, but still true: data is the new oil. And data theft is the current trend in cybercrime. Businesses need to be aware of data security and encryption. Encrypted data offers some protection in case of a breach. Access control for big data is not easily achieved. A solution like Centrify can help with setting up access control for Hadoop, but managing data access still requires a dedicated team. Even if the business’s own data center is secure, third-party solution providers who access the data are still vulnerable to breaches. The business owning the data remains liable for the loss and abuse of data due to third-party negligence. The most recent example would be Facebook facing flak over Cambridge Analytica. Data at rest, data in motion, and data in use all need to be secured.
In conclusion, every organization needs a Data Strategy that is:
- Actionable and not a mission statement. A data strategy is a tool that allows for unification of Business and IT expectations for all data-related capabilities.
- Relevant to the needs of the organization.
- Evolutionary, adapting to changes over time such as new regulations.
- Integrated with all upstream and downstream data providers and consumers.
To further understand the subject, a good starting point is a book like Data Strategy: How to Profit from a World of Big Data, Analytics and the Internet of Things by Bernard Marr. If you are planning your data strategy, now is the right time to contact us and talk to our experts on how to define the right strategy for your organization.