March 17, 2022
Big data or smart data?
Data Warehouses, Big Data, Data Lakes, Data Fabrics and now Data Mesh... These terms are everywhere nowadays, illustrating the evolution of the Data & Analytics landscape over time.
But simply upgrading technologies is not enough to make data-based decisions and seize opportunities.
Organizations should also reevaluate the methodology and strategies they use to capture Data & Analytics value. In some cases, Big Data is the answer to extracting insights and delivering value. In others, a small set of precise and reliable data is all it takes.
What is Big Data?
Big Data has been a buzzword for years. Nowadays, it is a must-have capability for many organizations.
Because it collects all the data first and only then develops possible use cases, Big Data is definitely a technology-driven approach.
It also requires a high level of maturity for companies to really benefit from that massive amount of information. Indeed, deriving value from data is never a simple task, and the 3 Vs of Big Data make it even harder:
- Volume: the quantity of data being collected, stored and analyzed
- Variety: data is often unstructured and comes from diverse sources, with different formats and characteristics
- Velocity: the speed at which data is collected, distributed, ingested and processed
Skilled employees spend a significant amount of their time sourcing, cleaning and preparing data instead of generating value from it. Not knowing why to collect or what to do with data leads to storing incomplete information, insufficient metadata management, and struggles to organize or find data. In addition, the cost of processing, cleaning and standardizing data grows with the volume of data.
On average, data scientists spend 45% of their time loading and cleaning data before they can use it for value-added tasks. Anaconda, 2020
Even with these challenges, it is safe to say that Big Data is still growing and is needed in many situations. What cannot be overlooked, however, is that it requires extensive time and resources, while the expected return on investment often fails to materialize.
What is Smart Data and how is it different?
Organizations want to handle Big Data more efficiently so that they can use high-quality information in their various strategies. This has led to a growing discussion around Smart Data.
Smart Data is a business-driven approach to prioritizing, organizing, synthesizing and optimizing Data & Analytics for usability and action:
- Smart Data starts from serving a certain business need and is enriched and verified in that context
- Smart Data relies on automation and data observability to ensure reliability and robustness
- Smart Data reduces the importance of volume in Big Data and focuses on veracity and value
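As an illustration of the observability point above, here is a minimal sketch of an automated check on a batch of records, assuming each record is a dictionary with a timezone-aware `updated_at` timestamp. The function name `check_batch`, the field names and the one-day freshness limit are hypothetical choices made for this example, not part of any specific tool.

```python
from datetime import datetime, timedelta, timezone


def check_batch(rows, required_fields, max_age, now=None):
    """Return a list of human-readable issues found in a batch of records."""
    now = now or datetime.now(timezone.utc)
    issues = []
    for i, row in enumerate(rows):
        # Completeness: every required field must be present and non-empty.
        missing = [f for f in required_fields if row.get(f) in (None, "")]
        if missing:
            issues.append(f"row {i}: missing {', '.join(missing)}")
        # Freshness: flag records that have not been updated recently enough.
        updated_at = row.get("updated_at")
        if updated_at is not None and now - updated_at > max_age:
            issues.append(f"row {i}: stale")
    return issues


# Hypothetical sample data: one healthy record, one incomplete and outdated.
now = datetime(2022, 3, 17, tzinfo=timezone.utc)
rows = [
    {"customer_id": "c1", "updated_at": now - timedelta(hours=1)},
    {"customer_id": "", "updated_at": now - timedelta(days=3)},
]
issues = check_batch(rows, ["customer_id", "updated_at"], timedelta(days=1), now=now)
for issue in issues:
    print(issue)
```

Running such a check automatically on each new batch is one simple way to catch unreliable data before it reaches the people who depend on it.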
Instead of finding a purpose for data, find data for a purpose. Bart de Langhe and Stefano Puntoni, MIT Sloan Management Review, 2020
Smart Data can also have an impact on technology and architecture decisions, to ensure Data & Analytics are stored and processed at the right location, and accessed by the right individuals or devices only… in the right situation.
Edge Computing / Edge AI is one such use case: here, Smart Data means identifying which information from connected devices should be processed right away (on the device) and which should be sent back to the central system. This enables more efficient processing, better protection of personal or sensitive data, a lower environmental impact, etc.
Smart Data neither replaces Big Data nor simply eliminates excess information. It helps the organization be efficient when collecting, storing and processing data.
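The edge scenario can be sketched as follows, under stated assumptions: a device receives temperature readings, reacts locally to urgent ones, and sends only a compact summary of the routine ones to the central system. The `Reading` type, the `triage` function and the 80 °C threshold are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Reading:
    sensor_id: str
    temperature_c: float


# Hypothetical threshold: readings above it need an immediate local reaction.
ALERT_THRESHOLD_C = 80.0


def triage(readings):
    """Split readings into urgent ones (handled on the device) and a
    compact summary of the routine ones (sent to the central system)."""
    urgent = [r for r in readings if r.temperature_c > ALERT_THRESHOLD_C]
    routine = [r for r in readings if r.temperature_c <= ALERT_THRESHOLD_C]
    summary = None
    if routine:
        summary = {
            "count": len(routine),
            "mean_temperature_c": mean(r.temperature_c for r in routine),
        }
    return urgent, summary


readings = [Reading("s1", 21.5), Reading("s1", 22.5), Reading("s1", 95.0)]
handle_on_device, send_to_central = triage(readings)
print(handle_on_device)
print(send_to_central)
```

In this sketch, three raw readings become one local alert and one small summary record, which is the kind of reduction in transferred data that the approach aims for.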
The benefits of Smart Data principles
To build a competitive advantage with Data & Analytics, organizations need to combine the best technologies with the right approach. This is where Smart Data comes into play, shifting the focus to value and impact.
Better facing the challenges with fragmented D&A infrastructure
Dealing with the distributed nature of organizations and their underlying Data & Analytics systems is a typical struggle: centralization attempts bring benefits in terms of accessibility, but also introduce complexity in terms of management, governance and security. And, as always, there is a risk that the one-size-fits-all platform becomes an inhibitor for some use cases.
Data Mesh has emerged as a way to tackle those challenges, by defining the organizational structure for Data & Analytics and supporting different underlying technology platforms and architectures.
As such, a Smart Data methodology is a good foundation for approaching the Big Data / Data Lake organization as a Data Mesh: develop a distributed structure and ownership, while maintaining shared visibility and collaboration, thanks to clearly defined boundaries and objectives.
Cost-effective data management
Data management requires a huge effort, to ensure well-documented metadata, to address data quality, to track and enforce compliance and security. The challenge is to prioritize all these efforts and to align them to actual business priorities.
By focusing on concrete and well-identified requirements, Smart Data enables organizations to do more with less: focusing on accuracy and not on quantity, streamlining costs and efforts to build and maintain Data & Analytics pipelines, and delivering tangible outcomes.
Reduce error and help human judgment
Scaling Data & Analytics within organizations, in order to enable data-driven (a.k.a. value-driven) decision making, requires trusted data.
Relying on Smart Data is a way to iteratively build trustworthy and reliable data assets, ensure their proper management and encourage their reuse.
Flexible to change
And finally, the continued evolution of organizations and technologies makes it difficult to maintain Data & Analytics over time: how to deal with M&A, local regulations, the emergence of new technologies …
By taking a value-first approach, Smart Data makes it easier to adjust plans and D&A management efforts:
- To support evolving business priorities,
- To address strengthened regulations and compliance requirements,
- To enable innovation by relying on emerging technologies,
- To offer a solid ground when reorganizations are required, involving changes in data governance
When organizations focus on technology and data as the starting point of their Data & Analytics journey, they risk being overwhelmed by the effort and complexity involved. This greatly limits the impact and value of Data & Analytics over time.
A value-first approach helps align Data & Analytics efforts with business strategies, deliver tangible and measurable benefits and develop the right technology foundations along the way.
In the end, Big Data and Smart Data are not an either-or choice, and both are required within organizations: Smart Data is the key to succeeding with your Big Data architecture and delivering the expected value and impact in the long run.