Crawl Packages: Aggregate Website Data in a Few Clicks

We’re excited to announce a new service at 80legs: Crawl Packages.

What crawl packages are:

Crawl packages are pre-configured crawls that you can access and run in just a few clicks.

For a specific website or group of websites, we’ve designed and set up an 80legs crawl, along with custom data extractors, to crawl that site and extract all the interesting information from it. These are crawls you could have set up yourself, but we’ve gone ahead and done all the work for you.

Types of crawl packages available:

We’re currently offering crawl packages for social networks, retail/shopping sites and business directories. We’ll be expanding our offerings to include other websites as well. Initial plans include crawling blogs (and their comments) and the semantic annotation feeds of various websites.

Results & Pricing:

Most crawl packages will cost $350 per month and produce 10 – 20 million records per month. The type of record produced depends on the crawl package. Social network packages produce publicly available profiles, retail packages produce product listings, and so on.

Open Data:

We realize that the availability of crawl packages will raise some concerns over what data should and shouldn’t be crawled. We only crawl publicly available Web data. We don’t crawl private data and have no interest in doing so.

What we are interested in is what our users can do with Web data that is more accessible. Since our launch, we’ve seen many startups come to us asking for large amounts of Web data so that they can create additional value on top of that data. They want to do interesting things like provide new insight into how people connect with one another, create CPIs of online product inventory, and more. We want to make that possible, and crawl packages are a step in that direction.
