Extract Summit 2021 Coding Contest

powered by Zyte

Crawl and scrape all expected items from a given website and get a chance to win bragging rights and cool prizes.

Virtual

September 30, 2021

Join the contest Register for Extract Summit

What is the Extract Summit 2021 coding contest?

What better way to kick off Extract Summit 2021 than to start the day with a cool web scraping coding contest?

So get ready to awaken your web scraping skills and join us for a challenge to crawl and scrape all expected items from a given website using Scrapy Cloud.

Be the first one to successfully extract all expected data to win bragging rights and the grand prize.

But don't lose hope if you're not the first one: we have really cool prizes for a few lucky participants!

What you will need to participate

Scrapy Cloud account

Join Scrapy Cloud

Scrapy Discord access

Join the Scrapy Discord.

The URL of the target website will be revealed through Discord.

Join Scrapy Discord

Register for the contest

Fill in the form below

We will need the information to ensure that you are correctly enrolled for the contest.

Register here for the coding contest

Contest outline

Get your Scrapy Cloud account, your Scrapy Discord access, and your registration in place beforehand.

Once you have these in place, you're ready to start as soon as the contest begins at 2 pm GMT on September 30, 2021.

Here's how to proceed on the day:

We will reveal through Discord the URL of the target website and a specification of item fields that need to be extracted.
You must write a spider that extracts all items with the specified fields, and run it in Scrapy Cloud.
Once the Scrapy Cloud job finishes, you must submit the job ID to a bot in the Scrapy Discord server.
The bot will let you know whether or not you managed to extract all items with complete data.
If you failed, update your code and try again with a new Scrapy Cloud job. The bot will accept unlimited job submissions for the duration of the competition.
To win, be the first to submit a job that successfully extracts all expected data.
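
Since the bot only accepts jobs whose items all carry complete data, it can help to check your output locally before submitting a job ID. The sketch below is one possible pre-submission check, not the bot's actual logic: the field names in `REQUIRED_FIELDS` are hypothetical placeholders for the specification revealed on Discord, and treating `None`, empty strings, and empty lists as "missing" is an assumption.

```python
# Hypothetical field names; replace them with the specification revealed on Discord.
REQUIRED_FIELDS = ("name", "price")


def missing_fields(item, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or empty in one item."""
    # Assumption: None, "" and [] count as missing; 0 and False do not.
    return [field for field in required if item.get(field) in (None, "", [])]


def incomplete_items(items, required=REQUIRED_FIELDS):
    """Map the index of each incomplete item to its list of missing fields."""
    report = {}
    for index, item in enumerate(items):
        missing = missing_fields(item, required)
        if missing:
            report[index] = missing
    return report
```

An empty report means every item has a value for every required field. It says nothing about whether you found all the items on the website, which the bot also checks.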

The website will not ban any client for any reason, so you will not need a proxy. But crawling the website and extracting item data will not be straightforward, so do not expect to get a working spider on your first run.

Prepare for the coding contest

For a few days before the competition, we will enable a testing website so that you can practice and prepare a code base. Once the actual competition starts, you will only need to point your code at the competition website and update your crawl and extraction logic accordingly.

Good luck, have fun!
