The promise of Big Data is to access all relevant data from all external and internal sources, and put it to work for the organization. Stratecast has laid out the principal benefits of harnessing the power of Big Data:

- Data-driven organization. Makes an organization data-driven through policies and practices designed to transform actionable information into organizational action.
- Data decisions. Forces the business to make decisions about which key performance indicators (KPIs) and areas of data it wants to focus on; in effect, “where to look.” While that is obviously essential from a data standpoint, it also has a “pre-ripple effect” throughout the business because, in so doing, the organization decides what is truly important.
- No (data) fear. Addresses the reality that most of what people are talking about when they say “Big Data” is Unstructured and Semi-structured data, which do not fit easily into tables, or surrender themselves to standard SQL table queries/lookups. By deploying systems and best practices that can render Unstructured and Semi-structured data as readily manageable as Structured data, an organization no longer has to “fear the data.” Instead, it captures, analyzes, and puts the data to good use to build and support the business.
- Agile organization. Means that the organization has truly learned to collect, assimilate, normalize, analyze, and act on data from all sources, both external and internal, both online and offline (physical world). Again, while that is crucial simply from a data management standpoint, it is also essential in creating an agile organization.
- Data for all. Represents the culmination of the dream that a company’s most important data need not be locked away and available exclusively to IT, nor remain the sole province of the Data Scientist and team, but instead be accessible to all employees. This empowers people with the data they need to do their jobs better.
However, while organizations are conceptually ready to reap the benefits of Big Data, moving from their current state of data-not-quite-readiness to a Big Data-driven future requires tackling three challenges: one structural, one temporal, and one about reliable data. This Stratecast report briefly analyzes these challenges and the viability of a solution designed to overcome all three.
Big Data: Structure Matters
Stratecast has previously analyzed the challenges posed by adding “new data,” unstructured and semi-structured, to the existing pool of structured data that data management systems have been handling with relative ease since the 1970s. As such, this report is not the place for an exhaustive discourse on that topic, but a quick review is in order to frame our discussion. Stratecast recognizes three data structures:

1. Structured data: information that is easily managed in a relational database (RDB), in columns and rows (such as addresses and phone numbers), and that is readily accessed via Structured Query Language (SQL) requests. Structured data depends on first creating a data model that defines data fields and characteristics of those fields.
2. Unstructured data: the fastest-growing type, which, at least on the surface, has no readily identifiable or consistent structure, follows no preset data model, and is not easily captured or accessed in a traditional RDB. Unstructured data emanates mainly, but not entirely, from mobile and Web communications, and includes but is certainly not limited to:
   a. Web content including both site and social
   b. Mobile communications, both phone and Web
   c. Business productivity/word processing documents
   d. Images and other electronic objects
   e. Books
   f. Multimedia: audio and video
3. Semi-structured data: also viewed as a sub-type of either structured or unstructured data, semi-structured data features tags or other markers that identify parts of the data; but, like unstructured data, follows no strict data model. Semi-structured data includes things such as:
   a. Email
   b. Extensible Markup Language (XML) messages
   c. Labeled graphs
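To make the distinction among the three data structures concrete, the following is a minimal illustrative sketch in Python, not drawn from the Stratecast analysis itself. The table name `contacts`, the sample XML message, the sample note, and the phone-number pattern are all hypothetical. The sketch handles the same piece of information (a phone number) in each form: queried with SQL against a predefined data model, read from tagged fields with no enforced schema, and inferred from free text.

```python
import re
import sqlite3
import xml.etree.ElementTree as ET

# Structured: a data model (table schema) is defined first, then queried with SQL.
# The "contacts" table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT, phone TEXT)")
conn.execute("INSERT INTO contacts VALUES (?, ?)", ("Ada", "555-0100"))
structured_phone = conn.execute(
    "SELECT phone FROM contacts WHERE name = ?", ("Ada",)
).fetchone()[0]
conn.close()

# Semi-structured: tags identify parts of the data, but no strict model is enforced;
# a message may carry extra or missing fields (here, an optional <priority>).
xml_message = (
    "<message><to>Ada</to><body>Call me at 555-0100</body>"
    "<priority>high</priority></message>"
)
root = ET.fromstring(xml_message)
semi_structured_fields = {child.tag: child.text for child in root}

# Unstructured: free text with no markers; any structure has to be inferred,
# here with a deliberately simple (assumed) phone-number pattern.
unstructured_note = "Spoke with Ada today, she asked me to call 555-0100 next week."
phones_found = re.findall(r"\b\d{3}-\d{4}\b", unstructured_note)

print(structured_phone)
print(semi_structured_fields)
print(phones_found)
```

In the structured case the answer comes from a direct lookup; in the semi-structured case the tags locate the field but its content is still free text; in the unstructured case the program must guess at the structure, which is why such data does not surrender itself to standard SQL queries.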
Table Of Contents
The Real-time is Now for Big Data to Pass the ACID Test - Could This Be a Turning Point?