Dark data is digital information that is not being used. Consulting and market research
company Gartner Inc. describes dark data as "information assets that an
organization collects, processes and stores in the course of its regular
business activity, but generally fails to use for other purposes."
Often, an organization leaves data dark for practical reasons. The data may be
dirty, and by the time it can be scrubbed, the information may be too old to be
useful. In such cases, records may contain incomplete or outdated data, be parsed
incorrectly or be stored in file formats or on devices that have become obsolete.
Increasingly, the term dark data is being associated with big
data and operational data. Examples include server log files that could provide
clues to website visitor behavior, customer call detail records that
incorporate unstructured consumer sentiment data and mobile geolocation data
that could reveal traffic patterns that would help with business planning.
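
To make the server log example concrete, here is a minimal sketch of turning stored log lines into a simple measure of visitor behavior. It assumes the logs follow the Common Log Format; the sample lines, the `page_views` function and the choice to count only successful GET requests are illustrative assumptions, not part of any particular product or method.

```python
import re
from collections import Counter

# Assumed layout: Common Log Format, for example
# 203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /pricing HTTP/1.1" 200 2326
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)


def page_views(lines):
    """Count successful (2xx) GET requests per path.

    Lines that do not match the assumed format are skipped, which mirrors
    the reality that dark data is often dirty or only partly parseable.
    """
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # unparseable record: leave it dark
        if match.group("method") == "GET" and match.group("status").startswith("2"):
            counts[match.group("path")] += 1
    return counts


# Hypothetical sample data standing in for an archived log file.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /pricing HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2023:13:56:01 +0000] "GET /pricing HTTP/1.1" 200 2326',
    '198.51.100.7 - - [10/Oct/2023:13:57:12 +0000] "GET /signup HTTP/1.1" 200 512',
    '198.51.100.7 - - [10/Oct/2023:13:57:40 +0000] "POST /signup HTTP/1.1" 302 0',
    '203.0.113.5 - - [10/Oct/2023:13:58:03 +0000] "GET /missing HTTP/1.1" 404 153',
    'corrupted line',
]

if __name__ == "__main__":
    # Print paths from most to least viewed.
    for path, views in page_views(SAMPLE_LOG).most_common():
        print(path, views)
```

Even a tally this simple shows which pages draw traffic. At the scale of real archives, the same counting step would typically run on a distributed platform rather than in a single script.
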
Potentially, this type of dark data can be used to drive new
revenue sources, eliminate waste and reduce costs. As a result, many
organizations that store dark data for regulatory compliance purposes are using
Hadoop to identify useful dark bits and map them to possible business uses.