Web Data Extraction Summit 2020

Virtual Experience presented by Scrapinghub

After the huge success of the Web Data Extraction Summit last year, we’re delighted to announce it’s back in 2020!

Wherever you are
November 10th, 2020

Things are a little different...

Yep, we're living in strange times. Given the current situation arising from the COVID-19 pandemic, we want to ensure that every precaution is taken to look after our attendees, speakers, and partners. We have therefore decided to make this year’s Extract Summit a completely online experience!
You can still expect the greatest minds in data extraction and web scraping to come together to educate, inspire, and innovate.

And best of all, it’s completely free!

Learn Practical Skills

Get practical and fundamental tips on web scraping, data extraction, and proxy management from experts who do this every day.

Expert Speakers

Extract Summit brings together the smartest minds in data extraction to discuss the future of data extraction and the latest industry best practices.

Get Real Insights

Get customer case studies and behind-the-scenes insights into how companies are leveraging web data to gain a competitive edge.

What is Extract Summit?

The Web Data Extraction Summit is a one-day event, jam-packed with talks and workshops where we will discuss everything from the latest trends in data extraction to web scraping best practices, and how web data can turbocharge your business.

We gather the preeminent thought leaders in data extraction and web scraping to share their insights on how we can leverage the internet, the world’s largest dataset, to gain a competitive edge in today’s marketplace.
Register now


Product Vision & Customer-Driven Innovation at Scrapinghub
In the constantly changing world of data extraction, innovation is key to the future. Ryan Farley, Head of Product at Scrapinghub, will talk about connecting with customers and users to seek their input in the product development lifecycle, and how this customer-driven focus is driving innovation in the market and developing an exciting product vision for the future.
Presented by Ryan Farley - Head of Product at Scrapinghub


Ryan Farley is the Head of Product at Scrapinghub, with 20+ years of experience in building products and businesses around the world across multiple industries including fintech, proptech, government, e-learning, consumer internet, and sport. He works to the formula C+E=S, meaning Customers + Employees = Shareholders, with a relentless focus on creating products and services that wow and delight customers.
Tues 10th Nov 2020
Time: TBC
Panel: Legal Compliance in the World of Web Scraping
The lack of clear legal guidance in the web scraping industry means you have to be extra cautious about how you scrape and the type of data you scrape. In this panel, Sanaea Daruwalla, Head of Legal at Scrapinghub, brings together legal experts in the field of data extraction to discuss the various aspects of web scraping compliance and updates in the legal landscape.
Host: Sanaea Daruwalla - Head of Legal, Scrapinghub
Panelists: Sarah McKenna - CEO, Sequentum
                  Marc Zwillinger - Founder, ZwillGen PLLC
                  Paul Griffin - CEO, First Compliance
Tues 10th Nov 2020
Time: TBC
Running a Business On Web Scraped Data
Every day we hear sentences like "Data is the new oil" or "web data is a gold mine", and that's definitely true. In this talk, see how establishing a business based on web-scraped data has much in common with traditional mining companies. Pierluigi will cover the processes, tasks, operators, and tools needed to run a reliable, modern company and avoid the many hassles that web scraping can help overcome.
Presented by Pierluigi Vinciguerra - CTO and Co-Founder of Re Analytics


Pierluigi Vinciguerra is the CTO and Co-Founder of Re Analytics, a data boutique for consumer and luxury goods. With 10+ years of experience in business intelligence, web data integration, and scraping, Pierluigi is an expert in data management. The team at Re Analytics crawls 1+ billion price points every month to extract valuable insights for investors and C-level executives in consumer and luxury goods.
Tues 10th Nov 2020
Time: TBC
How Venture Capital Firms use Web Scraping to find the next Billion-Dollar Company
The goal of any Venture Capital firm is to invest in successful companies to get a return on its investments, so these firms are constantly searching for promising startups. Simply subscribing to a data feed or a data platform that everyone else has access to doesn’t give firms an edge. That’s where web scraping comes in, making the research process much more productive. In this talk, Carles explains how Venture Capital firms use web scraping to make data-driven investment decisions, and how they rely on advanced data analytics in their processes.
Presented by Carles Illa, Software Engineer and Data Scientist at Nauta Capital


Carles Illa is a Software Engineer and Data Scientist at Nauta Capital, a Pan-European Venture Capital Firm investing in early-stage software companies.
He works on the development of the Dealflow Engine, a proprietary tool to automatically extract, structure, and enrich data of potential investment opportunities.

Tues 10th Nov 2020
Time: TBC
Everyday Low Pricing Strategy with Scrapinghub, Google App Scripts & Heroku
Is your business suffering because your competitors are undercutting your prices? Are you struggling to keep up? Say hello to your very own Price-spy! Using tools like Scrapinghub, Google App Scripts, and Heroku, Rameez shows how you can track your competitors' changing prices and trends.
Presented by Rameez Kakodker, Digital Transformation Expert


With over 30 e-Commerce launches in the last 10 years, Rameez has been at the helm of digital transformation within the GCC & Asia region. He regularly features on Medium and supports startups with their roadmap planning and product.
Tues 10th Nov 2020
Time: TBC
Web Scraping in 2020
As the web evolved from static sites to complex JavaScript applications, the techniques and tools needed to scrape it have changed too. From plain HTTP requests to robotized browsers, this talk will show you all the tricks you need to extract data from the modern web reliably and scalably.
Presented by Ondra Urban, Technical Web Scraping Expert


Ondra is a hacker of the browser age. He extracts terabytes of publicly available data and translates them to a language that machines can understand. At Apify, Ondra leads a team of fellow hackers who grow their open source projects, break anti-scraping walls, and dabble in AI.
Tues 10th Nov 2020
Time: TBC
TellFinder Alliance: Tackling Online Exploitation with Data
Last year at Extract Summit, Amanda Towler and David Schroh talked about their five-year project to build a bleeding-edge data collection and extraction pipeline to fight human trafficking. This year, Amanda would like to expand on this topic to discuss how the team pivots their data pipeline to tackle a broader array of online exploitation, how having a solid foundation makes this a tractable, efficient process, and the impacts they can have on the world.
Talk by Amanda Towler, Co-Founder & Principal Investigator at Hyperion Gray, LLC


Amanda is the Co-Founder and Principal Investigator at Hyperion Gray, LLC, a technology R&D small business working primarily with the Defense Advanced Research Projects Agency (DARPA). She has a decade of experience spanning OSINT, offensive security, data science, and software development. She has consulted with law enforcement on several high-profile dark web child exploitation cases.
Tues 10th Nov 2020
Time: TBC
Introducing AutoCrawl - The AI-Powered Crawler
AI is disrupting the ecosystem, altering every single process with new machine-learning powered approaches. In this talk, Iván will show how this impacts the world of data crawling by introducing AutoCrawl, an AI-powered crawler capable of gathering data from websites automatically.
Presented by Iván de Prado Alonso, Data Scientist, Scrapinghub


Iván is a Data Scientist at Scrapinghub who loves Deep Learning and Computer Vision. He has 10+ years of experience working for and with startups, dealing with the greatest technical challenges at each.
Tues 10th Nov 2020
Time: TBC
Panel: Cutting-edge ways to tackle antibot challenges
Extracting web data at scale can provide huge value. But as you scale up, an obstacle often stands between you and the data: antibots. Antibots pose a major challenge for anyone who wants to scrape the web at scale. If you don’t have a reliable way to solve the challenges they create, you will not be able to access any data. In this session, we are going to dissect this problem and look at the possible solutions with the help of our antibot experts at Scrapinghub.
Host: Attila Toth
Panelists: Akshay Philar & Tomas Rinke
Tues 10th Nov 2020
Time: TBC
Separating Extraction From Crawling Logic With Web-Poet
What are the Web-poet and Scrapy-poet projects? How do they work and how could they be helpful? Victor will take you through their state of development, the foreseeable future, and their relation to the AutoExtract and AutoCrawl projects.
Presented by Victor Torres, Web Scraping, Python, and Scrapy Guru


Victor Torres is a full-stack developer with 5+ years of experience leading agile teams and building web applications. He currently works with Python and web scraping at Scrapinghub.
Tues 10th Nov 2020
Time: TBC

Want to be Involved?

We are currently taking applications for speakers for Extract Summit 2020. If you are interested in speaking at the event, simply fill out our application form with your details and the topic you would like to talk about.
Apply to speak

Web Data Extraction Summit is organised by Scrapinghub.
Scrapinghub delivers world class web data extraction products and services.
© Web Data Extraction Summit 2020
