80legs is teaming up with InfoChimps, host of last year’s inaugural Data Cluster at SXSWi. Our panel, Data Nerds, Is Big Data Crushing the Web? , questions the future of big data and its impact on the future of tech. Here’s an excerpt from our proposal:
Web data is growing at a record pace – and data junkies will soon rule the tech world. 50 million tweets per day. 1.2 million photos served per second. 50 million websites added annually. The question is, how are we expected to build the next generation of technological innovations on top of this ever-growing Everest of data? To be honest, it can be daunting. In this panel, we’ll discuss how big data on the web changes the game for everyone. Is Hadoop good enough to manage this data explosion? Is massive web crawling dead? Is it even feasible to make such vast amounts of data open to everyone, and how do people even tap into it? Should the average Joe even care?
We’re excited to see several other panel proposals that also address the issue of making sense out of ever-more-massive amounts of data. While you’re voting us up, give these folks some thumbs as well!
The collateral that is presently available is largely from the social media giants that tout solutions built using 10,000 node clusters that process petabytes of data a day. The reality? The average person just cannot relate or intuitively draw parallels to their own business problems. While Big Data solutions are worthwhile far before you reach petabyte scale data, just getting started can be a challenge in itself.
With probabilistic computing, you can interpret and act on all kinds of data using statistical inference – starting with some background assumptions, you can propose possible configurations of the world that explain how that data came about. You can use probabilistic computing to trace effects back to their probable causes. For instance, what do web surfing and purchases tell us about the consumers? How can site usage patterns inform user interface design? And what are the best ways to targets ads and offers at specific users?
Visualizing social data teaches us about people’s behavior, cultural norms, relationships and much more. The panelists are interactive visualization gurus from groups who are all trying to make sense of data – Stamen Design, IBM Research, Microsoft, New York Times and Google.
This session presents notes from the road gathered over the last 4+ years while building Scout Labs (by Lithium Technologies). It includes discovery and acquisition of data, and the amount available. We also cover the general messiness and lack of structure of the data, and challenges in building systems to analyze it.