Thank you for attending this year!

Below are all the recordings from the day. Sign up to get full access!

Product Vision & Customer-Driven Innovation At Scrapinghub
In the constantly changing world of data extraction, innovation is key to the future. Ryan Farley, the Head of Product at Scrapinghub, will talk about connecting with customers and users to seek their input in the product development lifecycle, and how this customer-driven focus is driving innovation in the market and developing an exciting product vision for the future.

Presented by Ryan Farley - Head of Product at Scrapinghub


Ryan Farley is the Head of Product at Scrapinghub, with 20+ years of experience building products and businesses around the world across multiple industries, including fintech, proptech, government, e-learning, consumer internet, and sport.
He works to the formula C+E=S, meaning Customers + Employees = Shareholders, with a relentless focus on creating products and services that wow and delight customers.
Panel: Legal Compliance In The World Of Web Scraping
The lack of clear legal guidance in the web scraping industry means you have to be extra cautious about how you scrape and the type of data you scrape.
In this panel, Sanaea Daruwalla, Head of Legal at Scrapinghub, brings together legal experts in the field of data extraction to discuss the various aspects of web scraping compliance and updates in the legal landscape.

Host: Sanaea Daruwalla - Head of Legal at Scrapinghub
Panelists: Sarah McKenna - CEO of Sequentum
                  Marc Zwillinger - Founder of ZwillGen PLLC
                  Paul Griffin - CEO of First Compliance
Running a Business on Web Scraped Data
Every day we hear sentences like "Data is the new oil" or "Web data is a gold mine" and that's definitely true. 
In this talk, we will see how establishing a business based on web scraped data has so much in common with the old traditional mining companies. Pierluigi will cover the processes, tasks, operators and tools needed to run a reliable and modern company. Find out how web scraping can help avoid the many obstacles faced while running a successful business.

Presented by Pierluigi Vinciguerra - CTO and Co-Founder of Re Analytics


Pierluigi Vinciguerra is the CTO and Co-Founder of Re Analytics, a data boutique for consumer and luxury goods. With 10+ years of experience in business intelligence, web data integration and scraping, Pierluigi is an expert in data management. The team at Re Analytics crawls 1+ billion price points every month to extract valuable insights for investors and C-level executives in consumer and luxury goods.
Utilizing The Scrapy Cloud API For A Seamless Data Pipeline
In this talk, Johnel will explain how the team at Prosple utilized the Scrapy Cloud API to create an automated process from initial scraping to clean data. Basically, automating the E and T in the ETL process!

Presented by Johnel Bacani, Data Specialist at Prosple


Johnel Bacani is a Data Specialist at Prosple. He designs and manages various data pipelines. He loves Python and he's been using it since 2013 for both work and play.
Web Scraping Tech Stack For 2020
As the web evolved from static sites to complex JavaScript applications, the techniques and tools needed to scrape it have changed as well. From plain HTTP requests to robotized browsers, this talk will show you all the tricks you need to extract data from the modern web reliably and at scale.

Presented by Ondra Urban, Technical Web Scraping Expert at Apify


Ondra is a hacker of the browser age. He extracts terabytes of publicly available data and translates them to a language that machines can understand. At Apify, Ondra leads a team of fellow hackers who grow their open source projects, break anti-scraping walls, and dabble in AI.
TellFinder Alliance: Tackling Online Exploitation with Data
Last year at Extract Summit, Amanda Towler and David Schroh talked about their five-year project to build a bleeding-edge data collection and extraction pipeline to fight human trafficking.
This year, Amanda would like to expand on this topic to discuss how the team pivots their data pipeline to tackle a broader array of online exploitation, how having a solid foundation makes this a tractable, efficient process, and the impacts they can have on the world.

Presented by Amanda Towler, Co-Founder & Principal Investigator at Hyperion Gray, LLC


Amanda is the Co-Founder and Principal Investigator at Hyperion Gray, LLC, a technology R&D small business working primarily with the Defense Advanced Research Projects Agency (DARPA). 
She has a decade of experience spanning OSINT, offensive security, data science, and software development. She has consulted with law enforcement on several high-profile dark web child exploitation cases.
Introducing AutoCrawl - The AI-Powered Crawler
AI is disrupting the ecosystem, altering every single process with new machine-learning powered approaches.
In this talk, Iván will show how this impacts the world of data crawling by introducing AutoCrawl, an AI-powered crawler capable of gathering data from websites automatically.

Presented by Iván de Prado Alonso, Data Scientist at Scrapinghub


Iván is a Data Scientist at Scrapinghub who loves Deep Learning and Computer Vision. He has 10+ years of experience working for and with startups, dealing with the greatest technical challenges at each.
Separating Extraction From Crawling Logic With Web-Poet
What are the Web-poet and Scrapy-poet projects? How do they work and how could they be helpful? In this talk, Victor will take you through their state of development, their foreseeable future, and their relation to the AutoExtract and AutoCrawl projects.

Presented by Victor Torres, Web Scraping, Python and Scrapy Guru


Victor Torres is a full-stack developer with 5+ years of experience leading agile teams and building web applications. He currently works with Python and web scraping at Scrapinghub.
Panel: Cutting Edge Ways To Tackle Antibot Challenges
Extracting web data at scale can provide huge value. But with scaling up, there is often an obstacle standing between you and the data, preventing easy access: antibots.
Antibots introduce a big and important challenge to solve for anyone who wants to scrape the web at scale. If you don’t have a reliable way to solve these challenges created by antibots, you will not be able to access any data.
In this session, the panel of antibot experts at Scrapinghub will aim to dissect this problem and look at the possible solutions.

Host: Attila Toth, Technology Evangelist at Scrapinghub
Panelists: Akshay Philar, Head of Development at Scrapinghub
                  Tomas Rinke, Team Lead at Scrapinghub
                  Peng-Yu Chen, Developer at Scrapinghub
Overcoming Price Variations On The Day: In Search of Real-time Pricing
Offering a platform that can deal with B2C and B2B simultaneously is a challenge. The ever-changing and volatile market in Latin America presents even more challenges, with products going through price variations multiple times a day. This makes real-time information processing extremely difficult.
To overcome these challenges and be able to reach both B2B and B2C markets, Alfonso will show how Prixtips has implemented technological improvements and developed various solutions that integrate databases, scraping, machine learning and networking.

Presented by Alfonso de la Guarda, CTO of Aputek and Technology Architect at and


Alfonso de la Guarda, the CTO of Aputek and a Technology Architect at and, is an old-school hacker. He collaborates on and oversees projects in strategic areas such as mining, defense and health.
DataOps and The Culture You Need If You Want To Stay Sane
Data-driven companies are in their nascent stage and most of them lack proper culture and methodology. A modern business consists of constant data updates with multidisciplinary teams of engineers, scientists and business people all working together. Customers demand complex answers within hours or minutes. The growing demands of an expanding business require a streamlined and scalable culture in order to stay sane.
José will explain how DataOps, which borrows the automation spirit of DevOps and applies it to data delivery, is the perfect solution for a data company like this. In this talk he will share the lessons learned during 3 years of constant evolution and improvements in our software, data processes and culture.

Presented by José Manuel Navarro, CTO of urbanData Analytics


José Manuel is the CTO of urbanData Analytics. He leads all technical teams to make the most of each individual, developing the data-driven culture and coding all kinds of software. Prior to joining uDA, he was the Global Lead for Mobile & API Products at Liferay.

Web Data Extraction Summit is organised by Scrapinghub.
Scrapinghub delivers world class web data extraction products and services.
© Web Data Extraction Summit 2020