Archive for the 'Events' Category

SXSWi Panel Picker is now open

80legs is teaming up with InfoChimps, host of last year’s inaugural Data Cluster at SXSWi. Our panel, “Data Nerds, Is Big Data Crushing the Web?”, questions the future of big data and its impact on the future of tech. Here’s an excerpt from our proposal:

Web data is growing at a record pace – and data junkies will soon rule the tech world. 50 million tweets per day. 1.2 million photos served per second. 50 million websites added annually. The question is, how are we expected to build the next generation of technological innovations on top of this ever-growing Everest of data? To be honest, it can be daunting. In this panel, we’ll discuss how big data on the web changes the game for everyone. Is Hadoop good enough to manage this data explosion? Is massive web crawling dead? Is it even feasible to make such vast amounts of data open to everyone, and how do people even tap into it? Should the average Joe even care?

We’re excited to see several other panel proposals that also address the issue of making sense out of ever-more-massive amounts of data. While you’re voting us up, give these folks some thumbs as well!

Big Data for Everyone (No Data Scientists Required)

The collateral that is presently available is largely from the social media giants that tout solutions built using 10,000 node clusters that process petabytes of data a day. The reality? The average person just cannot relate or intuitively draw parallels to their own business problems. While Big Data solutions are worthwhile far before you reach petabyte scale data, just getting started can be a challenge in itself.

Data Overload: Probabilistic Computing For Breakthrough Data Analytics

With probabilistic computing, you can interpret and act on all kinds of data using statistical inference – starting with some background assumptions, you can propose possible configurations of the world that explain how that data came about. You can use probabilistic computing to trace effects back to their probable causes. For instance, what do web surfing and purchases tell us about the consumers? How can site usage patterns inform user interface design? And what are the best ways to target ads and offers at specific users?

Beautiful Data: Interactive Visualization of Social Media

Visualizing social data teaches us about people’s behavior, cultural norms, relationships and much more. The panelists are interactive visualization gurus from groups who are all trying to make sense of data – Stamen Design, IBM Research, Microsoft, New York Times and Google.

Making Sense of Social Media Data

This session presents notes from the road gathered over the last 4+ years while building Scout Labs (by Lithium Technologies). It includes discovery and acquisition of data, and the amount available. We also cover the general messiness and lack of structure of the data, and challenges in building systems to analyze it.


Defrag Experience

This past week I was in Denver attending Defrag 2009, which is something of the uber-tech geek con and bills itself as:

…focused on the tools and technologies that accelerate the “aha” moment, and is a gathering place for the growing community of implementers, users, and thinkers that are building the next wave of software innovation.

It was a unique experience, to say the least.  We actually were unsure of our interest in attending when we first heard about Defrag.  Eric Norlin had contacted me several months ago about us being a sponsor.  Since “big data” is one of the themes at Defrag, he justifiably figured that we would fit right in.  Unfortunately, with DEMO looming, we were unsure whether Defrag was worth taking a chunk out of our budget.  I actually initially turned Eric down, but he was persistent and contacted me again after DEMO.  Of course, I was even more cautious about committing now that all the money for DEMO had actually been spent!  But, after Eric offered the opportunity to speak, I decided we’d go for it.

Let me first say that deciding to attend Defrag was definitely the right move.  The quality of the audience is definitely the highest of any conference I’ve seen.  Each person that came by the booth was plugged-in, technical and business-savvy.  We actually managed to generate a good number of promising leads, which was impressive considering there were only about 350 people in attendance.  From a pure business perspective, just closing 2 or 3 of these leads would make the conference worth it for us.

We had some great one-on-one conversations with folks there, including talks with the guys at Infochimps, Robert Scoble, and Bill from Factual (previously of Y! BOSS).  We also gave some folks a sneak peek at what we’re working on with Language Computer.  Without providing too much detail, we’re building a service called Extractiv, which will let people turn any part of the web into highly structured, semantic data.

The one down vote I would give for Defrag is that the talks didn’t always live up to the “tech” billing I thought they would.  In many cases, on-stage discussion converged onto social media, Twitter, etc.  While those are important new developments, many of the speakers focused on how to create the right UI or visualize social content.  My personal opinion is that UI and visualization are not the hard problems to be solved in these spaces.  Rather, converting that content into meaningful and actionable data is.

Oh, and here’s my little presentation!

I think I ruffled some feathers by actually suggesting something could be better than the cloud in some cases (god forbid!). :)

Overall it was a great time, and I look forward to attending next year!

Web-Scale Apps Challenge

We just launched the 80legs Web-Scale Apps Challenge over at ChallengePost!  We’re challenging anyone and everyone to make the coolest apps for crawling and processing web content.  The top 3 entries will win some pretty sweet prizes, like a Kindle, original mint-condition Atari, and more.

We issued this challenge in anticipation of our App Store launch, which will happen the week of November 16th.  The 80legs App Store will allow our users to buy and run 80Apps created by third-party developers.  Our users will get to run custom code without having to do their own development work, and developers get a way to monetize cool web content processing technologies.

More details on the App Store to come!  For now, check out the challenge over at ChallengePost!

Our launch experience, part 3: Now the real work begins

I ended my last post mentioning that I figured we could take a small break the day after DEMO.  Boy, was I wrong.  When I woke up the next day, I saw several hundred emails, about 300 tweets referring to 80legs and dozens of articles discussing us.  So instead of checking out the beach, we spent the morning responding to emails and catching up on all the 80legs discussion.

I think we did a really good job of getting the word out for 80legs.  Here are some quick stats showing how well we did on this front:

  • # of articles on 80legs: 16
  • # of times 80legs was mentioned as “Best of DEMOfall09”: 2
  • # of re-tweets of articles: 700+

Here are just some of our favorite articles:

I should also note that we got posted to Hacker News, Digg and Slashdot.  Here’s what happened to our web traffic in the week following DEMO:


Having a 1 Gbps connection helps

Interesting note: most of our web traffic came from Hacker News.  We check HN regularly and participate in the discussions from time to time, so it was awesome to get so much interest from our own community.  Of course, our main focus is not our web traffic (which I think is pretty good for a non-consumer-facing service), but customer adoption.  Here are a few stats on that:

  • # of users that logged in since DEMO: 1554
  • # of jobs run since DEMO: 1557

Just as an aside, there were about 50 active beta users, and not every user that logged in has run a job.

Another interesting outcome from DEMO is that we’ve realized there’s demand for customized services on top of 80legs.  In other words, people want to use our team to either build customized products for them that are powered by 80legs, or they want us to build the 80Apps that run within 80legs.  We originally expected third-party companies to build these services and products themselves over time as 80legs became more popular.  In the long term, that is most likely the key to 80legs’ sustainable success.  In the short term, however, we think it’s prudent to pursue these engagements ourselves.  In fact, it makes sense to modify our business model somewhat and form 2 additional product/service lines: one for developing value-added services on top of 80legs and another for custom implementation of 80Apps.  Of course, we need to consider how to manage these two additional lines while still managing and improving the core service.

I feel our team’s experience so far has been pretty awesome.  We spent about 2 years developing what we feel is a pretty cool technology and now we’re starting to see the fruits of our labor.  That said, I’m a big believer that developing good technology is just the first step of many when it comes to finding commercial success.  Now we get to focus on execution, customer satisfaction, and delivering on what we’ve been promising.  Now the real work begins.

Our launch experience, part 2: DEMO

Around July, we started thinking about how to launch the live service.  We were fortunate that our plans lined up with DEMO.  Of course, they also lined up with TechCrunch50.  I imagine some companies have to think about which one is best for them, but for us it was pretty easy.  TC50 required a company to have no public exposure before their event, which of course made us ineligible.

We did have to think a bit about the cost of DEMO.  I talked to my friends that had demoed there and was ultimately convinced that it was a great place to launch a product, provided you took full advantage of it with the press, PR and other media outlets.

Again though, I wasn’t sure we would even get into DEMO.  80legs was usable by this point, but again – here was a completely non-shiny service, void of any semblance of a bell or whistle.  Sure, any “big data”-nut is going to think what we do is the coolest thing since SSDs, but will anyone else?  We weren’t sure.

Carla from Guidewire was the one that talked to me about our application.  I gave the 5-second spiel, and was excited to hear that she understood it and really liked the idea.  She did wonder about how we could make the demo interesting.  I assured her we could (while making a note to myself: “Figure out how to make demo interesting!”).

A few weeks later…

Guys, I’ve got news.  We’re going live in September.

We got into DEMO?

Yep.  So we’ll be on stage.  Hundreds of people.  Thousands of Internet viewers.

So we have about 8 weeks to get everything stable, fully-tested, and scaled out.. oh and we need to make the web portal look a lot better.


Now, it’s not like we had been slacking off, but July to September was especially scrambly, particularly for our back-end guys.  On the business and marketing end, we wanted to make sure we took full advantage of not only DEMO itself, but the momentum it could generate after the event.

For that, I sought out a PR firm to help with the media.  I asked a bunch of tech/startup friends in Texas about who to go with, and almost all of them recommended Jones-Dilworth, run by the veteran Josh, who had just left Porter Novelli.  If every trusted source you have recommends the same firm, you should probably go with them!

Josh and his team met with us and mapped out a strategy to garner media attention for DEMO and keep momentum going afterward.  They also helped with training our team for handling interviews, which was a big help.  In the week leading up to DEMO, I did at least 1 interview almost every day.  It was pretty awesome talking to and being interviewed by the same folks I had been reading every day for the past few years.

We got into San Diego on Sunday.  The event and crew at DEMO were very nice and professional.  They definitely run a tight ship, but are also super-approachable.  Everyone on staff seemed to know all the details, where to be, etc.

On Monday, all the demoers went through a few introductory items and then we headed off to a happy hour by the bay.  Mingling with other startups, VCs, and press folks is pretty fun.  It’s pretty awesome to be at a party where everyone is doing something interesting or has something engaging to say.  Can’t say the same about most bars I go to :).

Not your usual bar scene

After that was the “CEO & Dealmakers” dinner, which was only attended by 1 member of each company as well as VCs and other such folks.  While the pre-dinner topic, “The Good, Bad and the Ugly of VC” is something I’ve read about ad nauseam, hearing it straight from guys like the president of the NVCA was pretty cool.  I got a chance to thank Matt Marshall and Chris Shipley for giving us the chance to DEMO.  Matt and Chris kind of seem like opposites.  Chris was cracking jokes about Pittsburgh (I went to CMU and she’s from there), but Matt was like “But seriously, what are you demoing?”.

CEO & Dealmakers Dinner

After dinner, I had a cool talk with Flip from Infochimps and Mike Olson from Cloudera about Hadoop and how we might use it for providing post-processing services of crawled data.  Yeah, that’s the kind of after-dinner conversation you have at DEMO :)

The real show started on Tuesday, with the first group of presenters in the morning.  There did seem to be a few network issues, which was unfortunate.  Digsby actually ran an “offline” version of their chat client to demo their new Twitter capabilities.  All the data was cached locally.  Now that’s what I call a backup plan!

After the presentations, the pavilion was open for a few hours.  Our booth traffic was a bit slow.  Although we had a fair number of people come by, it was nothing like Web 2.0, where a constant stream came by.  I think two factors contributed to this: 1) we hadn’t yet presented and 2) we had already talked to almost all the press folks.

Wednesday came along, which meant it was time to demo!  Although people say I always seem uber-calm, I must admit I was just a touch nervous :).  The staff guy pulled me up.  Chris called me out.  I walk out – cameras, lights, hundreds of people before me, time to launch.  “Hi, my name’s Shion Deysarkar and I’m here to show you a revolutionary new service called <dramatic pause> 80legs.”  I wonder if I’ll ever forget the lines?

I actually used a pretty cool semantic 80App written by a technology partner of ours and compared what positive and negative things people are saying about DEMO and TC50.  I thought this would be a fun demo for the audience, given the interesting history between the two shows.  I didn’t actually show who came out on top though – people had to come by the booth to find out!  It turns out that DEMO just eked out a win, with a 95% to 91% positive rating over TC50.  If you want to learn more about the future of this app, check out these posts.

Side note:  Even though I poked a little fun at the TC crew, I thought they’d like the joke, given their sense of humor and attitude on DEMO.  Most of the audience cracked up at my joke, but a TC writer told me the joke was “lame”.  Oh well, can’t win them all.

The demo went pretty smoothly, which I was pretty happy about.  It was great to have it out of the way though.  About 2 hours after, I could feel my body crashing, as I could finally relax.  I don’t drink a lot of soda, but I went through about 3 Pepsis (why San Diego doesn’t have Coke is beyond me) before dinner to keep the energy levels up.

At the end of the show were the awards.  7 companies received DEMOGod awards, and 2 received media prizes – 1 company in the consumer category and 1 in the enterprise category.  I’ll admit that I was a bit miffed we didn’t win the enterprise category, but c’est la vie.  Oh, we also got treated to a little dance by the DEMO staff.

Maybe I was wrong about Matt...

DEMO was finally over.  It was a great experience, but I was looking forward to a little relaxation the next day.  I figured we’d sleep in, check out San Diego for a bit, and enjoy the moment.  I couldn’t have been more wrong…

Next up, part 3: Post-DEMO, or “Now the Real Work Begins”

Our launch experience, part 1: beta

Wow, what a crazy few weeks it has been!  For those of you just tuning in, we just launched 80legs.  Since launching, we’ve been swamped with emails, press, tweets, and much more, but I thought I’d recap our experience, from beta to launch, including our experience at DEMO.

We announced our private beta at the Launch Pad event during the Web 2.0 Expo in San Francisco, back in April.  We had been working on 80legs since early 2008.  Around February, I decided we’d exhibit at the Web 2.0 Expo to get some early exposure.  When I signed the booth contract, we weren’t thinking of making 80legs available in April.  But then I came across the Launch Pad event that they had.  Applying was pretty straightforward – all I had to do was fill out a form.  But the form asked for what kind of demo I could show right now, so that the judges could get a sense for what we did.  At that point, you could run a crawl through 80legs, but there was no pretty interface to it.  It was just command-line Java.  So in the form I said something like “Nothing to show now, but trust me – it will be really cool in April!!” and submitted it.

I was pretty sure nothing would come of it.  Surely they had several applications for products that looked shiny and sexy and would never accept anything as obtuse as a “distributed computing service designed for crawling and processing web content”… that wasn’t even ready to show yet.  Then a few weeks later, I get an email welcoming us into Launch Pad.  Ohhh-k :)  I stood up from my desk (this is mid-February, I think) and said:

Guys, I’ve got news.  We’re launching our beta in April.

We are?

Yes.  At the Web 2.0 Expo.   In front of hundreds of people.  On stage.

We don’t have an interface.  Or any way for people to set up an account.  And we’re still making the crawling reliable.

Yeah.  I guess we have a month to do that!

So during March, we scrambled putting together the first version of the web portal, getting the crawling to an acceptable state, and a bunch of other stuff.  It was nose-to-the-ground, grind-away work, but at Launch Pad, we had something to show and it looked good (well, for a beta).  The Launch Pad garnered us some press as well.

We got about 300 sign-ups for the private beta – not bad for a technical product.  We decided on letting them into 80legs in periodic batches.  In retrospect, we could have handled this better.  The first couple of batches let in responded well and offered substantive feedback.  But later batches, which may have had to wait a few months, had forgotten about us.  The excitement had worn off.  It would have been better to let them all in at once, or to at least have sent them reminders.

During our beta period, we spent a ton of time on collecting feedback from users, quickly implementing suggestions we felt were important, and scaling up our crawling ability.  Every 2-3 weeks we worked on a major new feature, such as crawler improvements, 80Apps, the API and several others.  At the same time, we were implementing a ton of minor features to make the system more robust and usable.

Our beta was going well and was getting to the point where we were starting to think about going live.  But we wanted to make a splash with our live launch.  We needed something that would get the momentum going again.  Something big…

Stay tuned for part 2: DEMO ..!

80legs has launched!!

The day is finally here!  We are now live, beta has exited, 1.0 is a go!

Before I go any further, I want to thank the many beta users that helped us over the last several months by providing feedback, suggestions for improvements, and identifying bugs.  Without your help, we wouldn’t have been able to get 80legs to where it is today.

During the private beta, we were working on several features, all of which are now ready for public use.  These features are:

  • True web-scale crawling: crawl up to 2 billion pages per day
  • Usability: easily design your own crawls using an intuitive job form
  • 80Apps: write and run your own applications on over 50,000 computers
  • API: programmatically control 80legs to work for you
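
For a rough sense of what those numbers mean, here is a back-of-the-envelope sketch in Python. It only uses the two figures quoted in the list above (2 billion pages per day, 50,000 computers). The function name and the idea that the load is spread evenly across every computer are assumptions made for illustration, not a description of how 80legs actually schedules crawls.

```python
# Back-of-the-envelope scale check for the launch figures quoted above.
PAGES_PER_DAY = 2_000_000_000   # "up to 2 billion pages per day" (from the post)
COMPUTERS = 50_000              # "over 50,000 computers" (from the post)
SECONDS_PER_DAY = 24 * 60 * 60


def crawl_rates(pages_per_day=PAGES_PER_DAY, computers=COMPUTERS):
    """Return (pages/sec overall, pages/day per computer, pages/sec per computer).

    Assumption (not stated in the post): the crawl load is spread
    evenly across all computers.
    """
    per_second = pages_per_day / SECONDS_PER_DAY
    per_computer_per_day = pages_per_day / computers
    per_computer_per_second = per_second / computers
    return per_second, per_computer_per_day, per_computer_per_second


if __name__ == "__main__":
    total, per_day, per_sec = crawl_rates()
    print(f"{total:,.0f} pages/sec overall")
    print(f"{per_day:,.0f} pages/day per computer")
    print(f"{per_sec:.2f} pages/sec per computer")
```

Under that even-spread assumption, the peak rate works out to roughly 23,000 pages per second overall, or about 40,000 pages per day for each computer.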

There’s also one big change that comes with leaving beta – 80legs is no longer completely free to use.  Our pricing is now in effect.  You can still dip your toes in the water and run jobs that crawl fewer than 100 pages.

We are doing our official launch announcement at DEMO.  If you happen to be at the show, please come and visit us at pavilion station #2!
