New feature: 80app packs!

We’ve just deployed a new version of 80legs that adds an exciting new feature: 80app Packs!

Plus and Premium subscribers will now have access to a growing set of useful, pre-built 80apps. The following 80apps are currently available or will be available soon:

Plus:

  • Return Page Content
  • Regex Text Matcher
  • Regex Source Matcher
  • Image Resizer

Premium:

  • All Plus 80apps
  • Social Network Scrapers
  • E-commerce Site Scrapers
80legs users will be able to select these apps and get the information they want from crawls with zero programming.  Everything will be pre-built and ready to go.  We want to make things as easy as possible for our users.

We plan to keep adding more 80apps to the Plus and Premium plans. If you have an idea for an 80app you’d like to see, just let us know!

7 Responses to “New feature: 80app packs!”

  1. Hoze March 1, 2010 at 10:38 am

    Are you aware that your crawler is being used to crash web servers? Your crawler makes lots of connections from lots of different IPs. I still can’t get my server back up, and I have banned more than 500 IP addresses!

  2. Shion Deysarkar March 1, 2010 at 10:40 am

    Hoze – you can follow the instructions here to prevent 80legs from crawling your site:

  3. Janus March 12, 2010 at 2:53 pm

    The crawler also does NOT obey robots.txt OR HTML header directives, despite what the FAQ referenced above says.

  4. Shion March 12, 2010 at 2:56 pm

    We obey all standard robots.txt directives. Which directive in your robots.txt file are we not following?

  5. Janus March 16, 2010 at 8:03 pm

    Our robots.txt contains the following:

    User-agent: *
    Disallow: /exampledirectory/

    Your bot cruises merrily into /exampledirectory/, oblivious to the robots.txt directives. It also ignores robots directives in page headers, such as nofollow.
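For reference, a compliant crawler would check rules like the ones quoted above before fetching any URL. A minimal sketch using Python's stdlib robot-exclusion parser (the domain is hypothetical; "008" is the user-agent string 80legs' crawler identifies itself with):

```python
from urllib.robotparser import RobotFileParser

# Parse the robots.txt rules quoted in the comment above (no network fetch).
rules = RobotFileParser()
rules.parse([
    "User-agent: *",
    "Disallow: /exampledirectory/",
])

# A compliant crawler calls can_fetch() before requesting each URL.
print(rules.can_fetch("008", "http://example.com/exampledirectory/page.html"))  # False
print(rules.can_fetch("008", "http://example.com/other/page.html"))             # True
```

In practice the parser would be loaded from the live `http://example.com/robots.txt` via `set_url()` and `read()`; parsing literal lines here just keeps the sketch self-contained.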

  6. Shion March 17, 2010 at 8:39 am

    Please let me know which domain this robots.txt is on (either here or by contacting us:

    The nofollow tag is not meant to tell a bot not to crawl a page. It’s there to tell search engines such as Google not to index that content.

    – Shion
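For context on the distinction being drawn here, the two kinds of in-page robots directives look like this (illustrative markup; the page-level meta tag governs indexing and link-following, while a link-level rel="nofollow" only hints that one link shouldn't be followed or credited):

```html
<!-- Page-level robots meta tag: ask crawlers not to index this page
     and not to follow any links on it. -->
<meta name="robots" content="noindex, nofollow">

<!-- Link-level attribute: applies only to this one link, not to
     crawling of the page itself. -->
<a href="http://example.com/" rel="nofollow">example</a>
```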
