SXSWi Panel Picker is now open

80legs is teaming up with InfoChimps, host of last year’s inaugural Data Cluster at SXSWi. Our panel, “Data Nerds, Is Big Data Crushing the Web?”, looks at where big data is headed and what it means for the future of tech. Here’s an excerpt from our proposal:

Web data is growing at a record pace – and data junkies will soon rule the tech world. 50 million tweets per day. 1.2 million photos served per second. 50 million websites added annually. The question is, how are we expected to build the next generation of technological innovations on top of this ever-growing Everest of data? To be honest, it can be daunting. In this panel, we’ll discuss how big data on the web changes the game for everyone. Is Hadoop good enough to manage this data explosion? Is massive web crawling dead? Is it even feasible to make such vast amounts of data open to everyone, and how do people even tap into it? Should the average Joe even care?

We’re excited to see several other panel proposals that also address the issue of making sense out of ever-more-massive amounts of data. While you’re voting us up, give these folks a thumbs-up as well!

Big Data for Everyone (No Data Scientists Required)

The collateral that is presently available is largely from the social media giants that tout solutions built using 10,000-node clusters that process petabytes of data a day. The reality? The average person just cannot relate or intuitively draw parallels to their own business problems. While Big Data solutions are worthwhile far before you reach petabyte-scale data, just getting started can be a challenge in itself.

Data Overload: Probabilistic Computing For Breakthrough Data Analytics

With probabilistic computing, you can interpret and act on all kinds of data using statistical inference – starting with some background assumptions, you can propose possible configurations of the world that explain how that data came about. You can use probabilistic computing to trace effects back to their probable causes. For instance, what do web surfing and purchases tell us about the consumers? How can site usage patterns inform user interface design? And what are the best ways to target ads and offers at specific users?

Beautiful Data: Interactive Visualization of Social Media

Visualizing social data teaches us about people’s behavior, cultural norms, relationships and much more. The panelists are interactive visualization gurus from groups who are all trying to make sense of data – Stamen Design, IBM Research, Microsoft, New York Times and Google.

Making Sense of Social Media Data

This session presents notes from the road gathered over the last 4+ years while building Scout Labs (by Lithium Technologies). It includes discovery and acquisition of data, and the amount available. We also cover the general messiness and lack of structure of the data, and challenges in building systems to analyze it.

Selected as a panelist for SxSWi

I just found out we’ve been selected to speak on a panel at SxSWi!  The panel is called “Big Data, Big Dreams” and is headed by Juan Sequeda of Semantic Web Austin.  The purpose of the panel is to explore how to scale applications that must handle very large amounts of data.  From the panel description:

How can we have applications that can scale with large amounts of data? Are relational databases sufficient? What other technologies are out there that can scale? This panel will talk about existing technologies that manage large amounts of data.

The 80legs crawler uses its own, custom-built data storage system, which has the very original name “Crawler Data Store”, or CDS.  We actually explored various general-purpose data storage systems before deciding to build our own.  At the same time, we do use relational databases to handle other parts of our system.  We really do have a mash-up of data storage technologies, all working together.

By the way – vote up the panel if you haven’t already!