With
so much hype about big data, it's hard for IT leaders to know how to
exploit its potential. Gartner, Inc. dispels five myths to help IT
leaders evolve their information infrastructure strategies.
"Big
data offers big opportunities, but poses even bigger challenges. Its
sheer volume doesn't solve the problems inherent in all data,"
said Alexander Linden, research director at Gartner. "IT leaders
need to cut through the hype and confusion, and base their actions on
known facts and business-driven outcomes."
Myth No. 1: Everyone Is Ahead of Us in Adopting Big Data
Interest
in big data technologies and services is at a record high, with 73
percent of the organizations Gartner surveyed in 2014 investing or
planning to invest in them. But most organizations are still in the
very early stages of adoption: only 13 percent of those surveyed
had actually deployed these solutions.
The
biggest challenges that organizations face are to determine how to
obtain value from big data, and how to decide where to start. Many
organizations get stuck at the pilot stage because they don't tie the
technology to business processes or concrete use cases.
Myth No. 2: We Have So Much Data, We Don't Need to Worry About Every Little Data Flaw
IT
leaders believe that the huge volume of data that organizations now
manage makes individual data quality flaws insignificant due to the
"law of large numbers." Their view is that individual data
quality flaws don't influence the overall outcome when the data is
analyzed because each flaw is only a tiny part of the mass of data in
their organization.
"In
reality, although each individual flaw has a much smaller impact on
the whole dataset than it did when there was less data, there are
more flaws than before because there is more data," said Ted
Friedman, vice president and distinguished analyst at Gartner.
"Therefore, the overall impact of poor-quality data on the whole
dataset remains the same. In addition, much of the data that
organizations use in a big data context comes from outside, or is of
unknown structure and origin. This means that the likelihood of data
quality issues is even higher than before. So data quality is
actually more important in the world of big data."
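The arithmetic behind Friedman's point can be sketched in a few lines. This is only an illustration: the function name `flaw_impact` and the 2 percent flaw rate are assumptions made for the example, not figures from Gartner. It shows that if the rate of flawed records stays constant, each flaw's share shrinks as the dataset grows, while the share of the dataset affected by flaws does not.

```python
def flaw_impact(total_records: int, flaw_rate: float) -> dict:
    """Return how much one flaw, and all flaws together, weigh in a dataset.

    flaw_rate is an assumed, constant fraction of flawed records.
    """
    flawed_records = round(total_records * flaw_rate)
    return {
        "flawed_records": flawed_records,
        # Weight of a single flawed record in the whole dataset.
        "share_per_flaw": 1 / total_records,
        # Weight of all flawed records taken together.
        "overall_share": flawed_records / total_records,
    }


# Same assumed flaw rate, datasets of very different size.
small = flaw_impact(1_000, 0.02)
large = flaw_impact(1_000_000, 0.02)

print(small)
print(large)
```

Comparing the two results, `share_per_flaw` drops by a factor of a thousand while `overall_share` is unchanged, which is why volume alone does not make quality problems go away.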
Myth No. 3: Big Data Technology Will Eliminate the Need for Data Integration
The
general view is that big data technology — specifically the
potential to process information via a "schema on read"
approach — will enable organizations to read the same sources using
multiple data models. Many people believe this flexibility will
enable end users to determine how to interpret any data asset on
demand. It will also, they believe, provide data access tailored to
individual users.
In
reality, most information users rely significantly on "schema on
write" scenarios in which data is described, content is
prescribed, and there is agreement about the integrity of data and
how it relates to the scenarios.
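The difference between the two approaches can be illustrated with a minimal sketch. Everything named here (`AGREED_SCHEMA`, `write_validated`, `read_raw`, the sample record) is hypothetical and chosen only for illustration. With schema on read, each consumer interprets the raw data as it sees fit; with schema on write, a record must match an agreed description before it is stored.

```python
import json

# Hypothetical agreed schema: field name -> required Python type.
AGREED_SCHEMA = {"customer_id": int, "amount": float}


def write_validated(store: list, record: dict) -> None:
    """Schema on write: reject records that do not match the agreed schema."""
    for field, expected_type in AGREED_SCHEMA.items():
        if not isinstance(record.get(field), expected_type):
            raise ValueError(f"{field!r} must be {expected_type.__name__}")
    store.append(record)


def read_raw(raw_lines: list, interpret) -> list:
    """Schema on read: keep raw text, let each reader interpret it."""
    return [interpret(json.loads(line)) for line in raw_lines]


# Raw data stored as-is; note that the amount arrived as text.
raw_lines = ['{"customer_id": 7, "amount": "12.50"}']

# Two readers apply two different interpretations to the same source.
as_number = read_raw(raw_lines, lambda r: float(r["amount"]))
as_text = read_raw(raw_lines, lambda r: r["amount"])

# The validated store accepts only records matching the agreed schema.
warehouse = []
write_validated(warehouse, {"customer_id": 7, "amount": 12.5})
try:
    write_validated(warehouse, {"customer_id": 7, "amount": "12.50"})
except ValueError as err:
    rejected = str(err)
```

The two readers end up with different values for the same field, which is the flexibility the myth celebrates but also the reason agreement on meaning and integrity still has to come from somewhere.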
Myth No. 4: It's Pointless Using a Data Warehouse for Advanced Analytics
Many
information management (IM) leaders consider building a data
warehouse to be a time-consuming and pointless exercise when advanced
analytics use new types of data beyond the data warehouse.
The
reality is that many advanced analytics projects use a data warehouse
during the analysis. In other cases, IM leaders must refine new data
types that are part of big data to make them suitable for analysis.
They have to decide which data is relevant, how to aggregate it, and
the level of data quality necessary — and this data refinement can
happen in places other than the data warehouse.
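The three refinement decisions named above (relevance, aggregation, and required quality) might look like the following in a small script. This is a sketch under assumed conditions: the `refine` function, the event fields, and the rule that a record needs a non-missing amount are all invented for the example.

```python
from collections import defaultdict


def refine(events: list, relevant_types: set) -> dict:
    """Filter, quality-check, and aggregate raw events per customer."""
    totals = defaultdict(float)
    for event in events:
        # Relevance: keep only the event types the analysis needs.
        if event["type"] not in relevant_types:
            continue
        # Quality: assumed rule that a usable record has an amount.
        if event["amount"] is None:
            continue
        # Aggregation: sum amounts per customer.
        totals[event["customer"]] += event["amount"]
    return dict(totals)


# Hypothetical raw events, including irrelevant and incomplete ones.
events = [
    {"type": "purchase", "customer": "a", "amount": 10.0},
    {"type": "purchase", "customer": "a", "amount": 5.0},
    {"type": "page_view", "customer": "a", "amount": None},
    {"type": "purchase", "customer": "b", "amount": None},
    {"type": "purchase", "customer": "b", "amount": 7.5},
]

refined = refine(events, relevant_types={"purchase"})
print(refined)
```

Nothing in this step requires a data warehouse, which matches the point that refinement can happen elsewhere before the data is analyzed.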
Myth No. 5: Data Lakes Will Replace the Data Warehouse
Vendors
market data lakes as enterprisewide data management platforms for
analyzing disparate sources of data in their native formats.
In
reality, it's misleading for vendors to position data lakes as
replacements for data warehouses or as critical elements of
customers' analytical infrastructure. A data lake's foundational
technologies lack the maturity and breadth of the features found in
established data warehouse technologies. "Data warehouses
already have the capabilities to support a broad variety of users
throughout an organization. IM leaders don't have to wait for data
lakes to catch up," said Nick Heudecker, research director at
Gartner.