The 14 Most Shared Big Data Articles in February 2014

1. The World’s Top 10 Most Innovative Companies in Big Data.

1. GE – For harnessing data from its planes and trains to power a new Industrial Internet, potentially saving billions.
2. KAGGLE – For feeding its DIY data scientists cash-prize challenges (then molding them into a consulting biz).
3. AYASDI – For using a visual approach to take the guesswork out of big data.
4. IBM – For playing global data evangelist by sharing its problem-solving power with cities, businesses, and universities.
5. MOUNT SINAI ICAHN SCHOOL OF MEDICINE – For embracing data scientists and supercomputers to build the hospital of the future.
6. THE WEATHER COMPANY – For analyzing millions of local climates to predict how shoppers’ habits sway with the weather.
7. KNEWTON – For forging alliances to make millions of students smarter, from adaptive-learning ebooks to personalized English language training courses.
8. SPLUNK – For providing businesses with hundreds of homegrown apps to sniff out error files and keep things humming.
9. GNIP – For expanding its service to let customers dive through every social media stream available.
10. EVOLV – For mining employee performance to help stanch turnover and upend HR.

2.  Big (Bad) Data

The buzziest idea in business may be its greatest downfall.

Social media and Big Data, the term du jour for the collection of vast troves of information that can instantaneously be synthesized, are supposed to help us make smarter, faster decisions.

But with increasing frequency, they may be leading to flawed, panic-induced conclusions, often by ascribing too much value to a certain data point or by rushing to make a decision because the feedback is available so quickly. This digital river of information is turning normally level-headed decision-makers into hypersensitive, reactive neurotics.

Or at least we’re finding out that some wisdom is needed to know which crowd to follow.

The greatest challenge of Big Data — especially social media — is separating the signal from all the noise. A study by the Pew Research Center, for example, found that Twitter users are more often than not negative…  …“The reaction on Twitter to major political events and policy decisions often differs a great deal from public opinion as measured by surveys,” Pew reported. That is due, in part, to the fact that “Twitter users are not representative of the public”: They are younger and more likely to lean toward the Democratic Party. It turns out that what’s “trending” on Twitter may not really be “trending” at all.

Big Data and massive efforts to analyze it aren’t going away. But the need for judgment — and patience — is more important than ever. A crowd may be wise, but ultimately, the crowd is no wiser than the individuals in it.

3. Big Data Is All Data

Podcast: Why deriving value from data may not be as complex as it seems

Mark Myers, market segment manager for IBM Watson Explorer, says the familiar “three Vs” definition of big data (volume, variety, and velocity) is too narrow. Listen to the podcast, “Big Data Is All Data,” to hear Myers explain the thinking behind this approach and his advice on how to begin getting more value out of data, which may be easier than you think.

4. 10 Mistakes Enterprises Make in Big Data Projects

Avoid common pitfalls when planning, creating, and implementing big data initiatives

While accomplishing this goal seems realistic given the progression of technology and the commoditization of infrastructure, there are 10 common pitfalls that enterprises, in particular, need to avoid when planning and implementing a big data program. Avoiding these pitfalls can enhance an organization’s analytical insights and decision support processes.

1. Lacking a business case
2. Minimizing data relevance
3. Underestimating data quality
4. Overlooking data granularity
5. Improperly contextualizing data
6. Not grasping data complexity
7. Ignoring data preparation
8. Delaying organizational maturity
9. Forgoing data governance
10. Deploying technology as a silver bullet

5. The real promise of big data: It’s changing the whole way humans will solve problems

An analytic truth was one that could be derived from a logical argument, given an underlying model or axiomatization of the objects the statement referred to. Given the rules of arithmetic we can say “2+2=4” without putting two of something next to two of something else and counting a total of four.

A synthetic truth, on the other hand, was a statement whose correctness could not be determined without access to empirical evidence or external data. Without empirical data, I can’t reason that adding five inbound links to my webpage will increase the number of unique visitors 32%.

Fundamentally, we’ve gone from creating novel analytic models and deducing new findings, to creating the infrastructure and capabilities to solve the same problems through synthetic means.  Until recently, we used analytical reasoning to drive scientific and technological advancements. Our emphasis was either 1) to create new axioms and models, or 2) to use pre-existing models to derive new statements and outcomes. The relatively recent development of computer systems and networks has induced a shift from analytic to synthetic innovation.

Google and Amazon serve as early examples of the shift from analytic to synthetic problem solving because their products are built on data that exists in a digital medium. Everything from the creation of data, to the storage of data, and finally to the interfaces scientists use to interact with data is digitized and automated.

Before we can apply synthetic methodologies to new fields, two infrastructural steps must occur:

1. the underlying data must exist in digital form and
2. the stack from the data to the scientist and back to the data must be automated.

That is, we must automate both the input and output processes.

Finally, in economics, we’re no longer relying on flawed traditional microeconomic axioms to deduce macroeconomic theories and predictions. Instead we’re seeing econometrics play an ever-increasing role in the practice and study of economics.

Big data isn’t meaningful alone; rather it’s a byproduct and a means to an end as we change how we solve problems.  Marc Andreessen famously argued, “Software is eating the world” in his 2011 essay. However, as we dig deeper and understand better the nature of software, APIs, and big data, it’s not software alone, but software combined with digital data sets and automated input and output mechanisms that will eat the world as data science, automation, and software join forces in transforming our problem solving capabilities – from analytic to synthetic.

6. Scientists set new speed record for big data

IBM today announced that it has achieved a new technological advancement that will help improve Internet speeds to 200 – 400 Gigabits per second (Gb/s) at extremely low power.

The speed boost is based on a device that can transfer Big Data between clouds and data centers four times faster than current technology. At this speed, 160 Gigabytes, the equivalent of a two-hour, 4K ultra-high definition movie or 40,000 songs, could be downloaded in only a few seconds. The device was presented at the International Solid-State Circuits Conference (ISSCC) in San Francisco.

As Big Data and Internet traffic continue to grow exponentially, future networking standards have to support higher data rates. For example, in 1992, 100 Gigabytes of data were transferred per day, whereas today, traffic has grown to two Exabytes per day, a 20-million-fold increase.
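
The figures quoted above can be sanity-checked with a few lines of arithmetic. The sketch below is not from the IBM announcement; it assumes decimal (SI) units (1 Gigabyte = 10^9 bytes, 1 Exabyte = 10^18 bytes) and an ideal link with no protocol overhead, and the function names are made up for illustration.

```python
# Assumption: decimal (SI) units, and an ideal link with no overhead or latency.
GIGA = 10**9
EXA = 10**18


def transfer_seconds(size_gigabytes, rate_gigabits_per_s):
    """Ideal time in seconds to move size_gigabytes over a link of the given rate."""
    size_bits = size_gigabytes * GIGA * 8          # bytes -> bits
    return size_bits / (rate_gigabits_per_s * GIGA)


def growth_factor(old_bytes_per_day, new_bytes_per_day):
    """How many times larger the new daily traffic is than the old."""
    return new_bytes_per_day / old_bytes_per_day


slow = transfer_seconds(160, 200)   # lower end of the quoted 200 - 400 Gb/s range
fast = transfer_seconds(160, 400)   # upper end of the range
growth = growth_factor(100 * GIGA, 2 * EXA)

print(f"160 GB at 200 Gb/s: {slow:.1f} s")
print(f"160 GB at 400 Gb/s: {fast:.1f} s")
print(f"Daily traffic growth since 1992: {growth:,.0f}x")
```

Under these assumptions the download takes roughly 3 to 6 seconds, which matches "only a few seconds," and the traffic ratio comes out at 20 million.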

7. BlueKai Acquisition Validates that Customer Data is King

According to Oracle, BlueKai’s 3rd party data will be used to augment a customer’s proprietary 1st party data with actionable information. As I read that last statement again, I thought about the implications and why customers need to pay a company like BlueKai to augment their 1st party data with 3rd party data. I like to think of it this way: why rent when you can own?

Delivering relevant 1:1 marketing demands a 1st party 1st strategy that can track customers anywhere and anytime and the ability to take action on this data at the right time.

This is Ensighten’s approach:

  1. Collect your 1st party customer data anywhere and anytime!
  2. Own that data forever.  It’s yours.
  3. Take action at a 1:1 level.

Leverage your 1st party customer data to optimize your individual customer’s experience or integrate with any application to drive your desired outcome.

The BlueKai acquisition by Oracle validates that customer data is king. Delivering a personalized experience based on behavior is the goal. The key, however, is to leverage your 1st party data. Collect, own and act on your own customer data and push this insight to the DMP, not the other way around. Over the long run you will see the true value in owning versus renting.

8. Why “Big Data” Is a Big Deal

Information science promises to change the world.

DATA NOW STREAM from daily life: from phones and credit cards and televisions and computers; from the infrastructure of cities; from sensor-equipped buildings, trains, buses, planes, bridges, and factories. The data flow so fast that the total accumulation of the past two years—a zettabyte—dwarfs the prior record of human civilization.

“There is a big data revolution,” says Weatherhead University Professor Gary King. But it is not the quantity of data that is revolutionary. “The big data revolution is that now we can do something with the data.”

Nathan Eagle agrees that “you don’t get good scientific output from throwing everything against the wall and seeing what sticks.” No matter how much data exists, researchers still need to ask the right questions to create a hypothesis, design a test, and use the data to determine whether that hypothesis is true.

Safeguarding data is his other major concern, because “the privacy implications are profound.” Typically, the owners of huge datasets are very nervous about sharing even anonymized, population-level information like the call records Eagle uses.

DATA, IN THE FINAL ANALYSIS, are evidence. The forward edge of science, whether it drives a business or marketing decision, provides an insight into Renaissance painting, or leads to a medical breakthrough, is increasingly being driven by quantities of information that humans can understand only with the help of math and machines. Those who possess the skills to parse this ever-growing trove of information sense that they are making history in many realms of inquiry. “The data themselves, unless they are actionable, aren’t relevant or interesting,” is Nathan Eagle’s view. “What is interesting,” he says, “is what we can now do with them to make people’s lives better.”

John Quackenbush says simply: “From Copernicus using Tycho Brahe’s data to build a heliocentric model of the solar system, to the birth of statistical quantum mechanics, to Darwin’s theory of evolution, to the modern theory of the gene, every major scientific revolution has been driven by one thing, and that is data.”

9. Big Data: Are you ready for blast-off?

As Laurie Miles, head of analytics for big data specialist SAS, says: “The term big data has been around for decades, and we’ve been doing analytics all this time. It’s not big, it’s just bigger.”

But it’s the velocity, variety and volume of data that has merited the new term. There has been a proliferation of so-called unstructured data generated by all our digital interactions, from email to online shopping, text messages to tweets, Facebook updates to YouTube videos. According to computer giant IBM, 2.5 exabytes – that’s 2.5 billion gigabytes (GB) – of data was generated every day in 2012. That’s big by anyone’s standards. “About 75% of data is unstructured, coming from sources such as text, voice and video,” says Mr Miles.

Data is only as good as the intelligence we can glean from it, and that entails effective data analytics and a whole lot of computing power to cope with the exponential increase in volume.

But a recent Bain & Co report found that, of 400 large companies, those that had already adopted big data analytics “have gained a significant lead over the rest of the corporate world.”

“Big data is not just historic business intelligence,” says Mr Carr, “it’s the addition of real-time data and the ability to mash together several data sets that makes it so valuable.” Practically, anyone who makes, grows and sells anything can use big data analytics to make their manufacturing and production processes more efficient and their marketing more targeted and cost-effective.

Big data needs new skills, but the business and academic worlds are playing catch up. “The job of data scientist didn’t exist five or 10 years ago,” says Duncan Ross, director of data science at Teradata. “But where are they? There’s a shortage.”

Social media platforms will often say that their users own their own content, but then lay claim to how that content is used, reserving the right to share it with third parties. So when you tweet you effectively give up any control over how that tweet is used in future, even though Twitter terms and conditions say: “What’s yours is yours.”

Privacy and intellectual property laws have not kept up with the pace of technological change.

10. What Big-Data VCs are Sick of — and What They Really Want

“The most grotesquely over-invested category I have seen is anything that has to do with sales and marketing,” said Matt Ocko, the managing partner at Data Collective.

  • Do something original – Don’t be “the eleventh [person starting] a Wi-Fi chip company, the fourteenth [person doing] scale-out storage.” It’s probably not going to work.

Companies interested in building commercial distribution of the Hadoop ecosystem of technologies for storing, processing, and querying lots of different kinds of data. “Anyone besides Cloudera is doomed,” Ocko said, referring to a company that’s in a position to go public this year.

Tools that can help data scientists save lots of time, such as Trifacta, which simplifies the process of getting a big heap of data ready for analysis, can also be appealing.

As for Ocko himself, he likes “idiot savants” trying to solve targeted problems in “narrow categories” that could result in the “complete overturn of everything in the existing quarter.”

“The applications that we don’t see — the technology behind it — that’s the stuff that’s going to be interesting to me,” he said.

11. 4 Ways to Actually Use Big Data

A number of new tools have made Big Data accessible to growing businesses. Here are four ways you can use those tools to communicate with customers and manage your company’s reputation.

Big Data–the name given to the deluge of digital interactions companies have with clients–has been hyped for years as a kind of “crude oil” of the new millennium, hugely valuable but useless if unrefined. The challenge is that advanced software suites and analytics experts have generally been needed to make any sense of the terabytes of raw information that can be collected daily.

For businesses looking to leverage social data to their own advantage, applying these basic analytics hacks can be a good place to start:

    1. Use filters to separate important messages from background noise.
    2. Track overall changes in message volume.
    3. Incorporate tools that automatically track sentiment.
    4. Choose software that spits out snazzy reports.
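
To make the first two hacks concrete, here is a minimal sketch of keyword filtering and daily volume tracking. It is only an illustration, not the tooling the article has in mind: the sample messages, the "acme" brand term, and the function names are all hypothetical, and a real setup would pull messages from a social media API or a monitoring product. Sentiment tracking and reporting are left out because they depend on the tool you choose.

```python
from collections import Counter
from datetime import date

# Hypothetical sample data: (day received, message text).
MESSAGES = [
    (date(2014, 2, 1), "Love the new Acme blender!"),
    (date(2014, 2, 1), "good morning everyone"),
    (date(2014, 2, 2), "Acme support never answered my email"),
    (date(2014, 2, 2), "Is the Acme blender dishwasher safe?"),
    (date(2014, 2, 2), "lunch time"),
]

# Hypothetical brand terms to watch for (lowercase).
BRAND_TERMS = {"acme"}


def is_relevant(text, terms=BRAND_TERMS):
    """Hack 1: keep a message only if it mentions a brand term as a whole word."""
    words = {word.strip(".,!?").lower() for word in text.split()}
    return not words.isdisjoint(terms)


def daily_volume(messages):
    """Hack 2: count relevant messages per day so changes in volume stand out."""
    return Counter(day for day, text in messages if is_relevant(text))


volume = daily_volume(MESSAGES)
for day in sorted(volume):
    print(day.isoformat(), volume[day])
```

With this sample data, the filter drops the two off-topic messages and the count of brand mentions goes from one on the first day to two on the second.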

12. Big Data, Big Business, Big Brother?

“Big Data is like teenage sex: everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it.”

Let’s define ‘big data’ with my 5V’s (expanded from Gartner): the exponential growth of data-velocity, -variety, -volume, -virality and -value. In other words, a lot like before but vastly larger, faster, more varied, more viral and massively valuable – and in the aggregate of these 5 trends lies its mind-boggling potency. IMHO, Big Data’s economic and social importance will rival that of the oil economy by 2020 – and mobile devices are already the key driver of big data, globally.

“The reality is that our personal data footprint is now becoming unfathomably wide…”

The bottom line is this: Big Data has enormous potential for everyone, and along with the other 5 memes it could be hugely beneficial for everyone on this planet. But if Big Data equals big brother sharing the spoils with big business, then it will amount to Big Rip-Off for the rest of us.

The time to tackle these issues is now.

13. Big Data’s Fading Bloom

Nobody would deny that Big Data was one of the most talked about areas in tech last year. And while Big Data was once viewed as the golden child of tech, its bloom is fading in terms of the value that it is able to deliver all on its own. There was a time not that long ago when the focus was on finding, capturing and storing data. But today, the shift in everyone’s focus is how to unlock the value from each and every piece of data we can uncover.

Enter the Chief Data Officer. In our last few columns, all of which received considerable attention, I looked at the rise of the CDO as the new C-level executive chartered to efficiently manage the distribution of the huge new streams of market, customer and other competitive information now flowing into the enterprise from the Big Data/Cloud revolution.

The primary task of the Chief Data Officer is to prepare incoming data in such a manner that it will be both understandable and usable for maximum productivity by every employee, at every level, in the enterprise.

The duty of the CDO is to establish the infrastructure that presents data to the right people in the right way.

This is a very different perspective from what is already becoming the standard view of the CDO. In that view, the CDO is the functional equivalent of the CIO, but for a company’s data as opposed to its systems. That view, I believe, is a mistake, one that looks through the wrong end of the telescope and makes the CDO just another part of the problem.

Most data is presented in an organization in three different ways, … based upon how much time each of these players has to conduct their own analysis:

1. Reporting – Reported data is typically delivered to the rank-and-file of the organization and is designed for efficiency – i.e., How am I doing? How do I improve performance, consistency and customer satisfaction? As such it is inclusive, tries to identify best practices and disseminate them through the organization.

2. Dashboarding – Dashboards take two forms. One is designed to capture, dynamically, the operating efficiency of the organization. This type of dashboard is typically very sophisticated and looks at multiple parameters. By comparison the second type of dashboard, designed for senior executives, is comparatively simple as the result of considerable pre-digestion of the data. They are designed in this way in recognition of the limited amount of time C-level users can devote to them. By the same token they are exclusive – that is, they are designed to look for the outlier data that portend a change in the health of the enterprise.

3. Visualization – This is where analytics used to belong. Visualization is designed for the company’s information analysts – and it is all about discovery. That is, how can the company look at its caches of Big Data in a different way with a different approach and different tools in order to better understand markets, customers, competitors and employees?

Data Audit. This is an analysis of how the company’s different constituencies actually use the data supplied to them.

Most companies today just assemble the data and get it out to the company. In the future the competitive edge will go to those firms that understand who in the company uses what data, in what presentational format.

14. Data Isn’t Just for the Big Guys Anymore

Small businesses can follow the lead of a company such as Nest, which is transforming the normally staid thermostat and smoke alarm business. The core of Nest is data, because the product pulls user preferences, getting “smarter” over time for maximum efficiency.

Data means finding the repeat loyal customers and providing them with outlets for brand building such as social media contacts or even an old-school free t-shirt. At its core, data for the small business means streamlining processes and finding what is most relevant to customers.

For small businesses looking for ways to rely on data to make better decisions, here are some guidelines:

    • Decisions should be informed by data. While the small business manager needs intuition and experience, data should be a consistent driver for actions. Not everyone is a visionary that can spot trends on their own, but data turns people into experts.
    • Thinking of expanding into a new area? Let the data tell you if that’s a wise move or if you should focus elsewhere. Considering a pricing increase? Play with the data to see the cost/benefit of potentially fewer customers but possibly higher margins. Data enables broad A/B testing and experiments.
    • Departments within the company need to measure everything. If something can’t be measured then it can be surveyed. Spend time on the important metrics, especially the ones that relate to revenue and customer satisfaction.
    • Data should see the light of day. Small business owners shouldn’t ask for data from the entire team and then keep the results to themselves. Transparency is king. It not only drives accountability but it also spurs innovation and creative thinking.
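
The pricing question in the guidelines above ("potentially fewer customers but possibly higher margins") comes down to simple arithmetic once you have your own numbers. The sketch below is only an illustration: the prices, unit cost, and customer counts are made up, and the function names are hypothetical.

```python
def monthly_profit(price, unit_cost, customers):
    """Profit from one purchase per customer per month."""
    return (price - unit_cost) * customers


def break_even_customers(target_profit, price, unit_cost):
    """Customers needed at the given price to match target_profit."""
    return target_profit / (price - unit_cost)


# Hypothetical current state: 1,000 customers paying 20.00 for an item that costs 12.00.
current = monthly_profit(price=20.0, unit_cost=12.0, customers=1000)

# Hypothetical scenario: raise the price 10% and lose 15% of customers.
raised = monthly_profit(price=22.0, unit_cost=12.0, customers=850)

# How many customers can leave before the increase stops paying off?
needed = break_even_customers(current, price=22.0, unit_cost=12.0)

print(f"Current profit:  {current:,.0f}")
print(f"After increase:  {raised:,.0f}")
print(f"Break-even at {needed:,.0f} customers")
```

In this made-up case the increase still pays off as long as at least 800 of the 1,000 customers stay. Plugging in real margins and churn estimates from your own data is the point of the exercise.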

