Avoiding the Pitfalls of Bigger Data at the Human-Machine Interface
One of the widely held misconceptions in the field of big data analytics is that you can scale your way into insights just by adding more data. That may be true in some situations, but as FiveThirtyEight Editor in Chief Nate Silver and CrowdFlower CEO Lukas Biewald said at last week’s inaugural Rich Data Summit, it’s just not that simple.
Silver led off his keynote address at the Rich Data Summit with a reference to the 2008 story “The End of Theory: The Data Deluge Makes the Scientific Method Obsolete,” written by Wired magazine editor Chris Anderson. Silver, the statistician and former baseball analyst who wrote the 2012 book “The Signal and the Noise,” had good things to say about Anderson in general, but not this particular story.
“This is the idea that it’s all about brute force and volume, that it’s a computational problem, that if you have millions or billions or trillions of observations and you have powerful enough computers then eventually correlations will rain down from the sky and you discover truth through brute force alone,” Silver said.
“This idea, I think, is kind of that big data is magic, in other words, and that it’s push-button, where you get your big data, you push a button, and all of a sudden you have extremely valuable output,” he said. “This idea is very wrong, and a little bit dangerous.”
It was ironic that, just after Anderson’s story came out, we witnessed the near-collapse of the nation’s financial system in the fall of 2008, when the effects of “a series of complex algorithms that the banks used” came home to roost, Silver said. “These are all examples of predictions that were all substantial failures of some kind,” he said. “We’re getting the little stuff right more and more, but we still have all types of issues.”
Read the rest of the article at Datanami.