What we covered in 2019

8:00 - 9:00 Registration
Grab your badge, your swag and that very important caffeine injection.
9:00 - 9:30 Welcome and Keynote - Keynote Speaker, Shane Evans
A higher-level view on the “data” industry and the role web scraping is playing in it.
9:30 - 10:00 Lessons Learnt Scraping 9 Billion Pages Per Month - Cathal Garvey
A data exposé: which data is hot, plus a behind-the-scenes look at use cases, applications and trends, based on Scrapinghub scraping 9 billion pages per month for thousands of companies.
10:00 - 10:30 Legal Compliance, GDPR in the World of Web Scraping - Kate O’Brien
Learn about how changing regulations will affect web scraping, and the best practice tips to make sure you remain compliant.
10:30 - 11:00 Building a Scalable Web Scraping Infrastructure - Attila Toth & Pawel Miech
How to design and build a scalable web scraping stack, incorporate best practices and avoid common mistakes.
11:00 - 11:30 Coffee/Networking
Perk yourself up with a hot coffee and something sweet.
11:30 - 12:15 The Future of Web Scraping & The Next Generation of Web Scraping - Bryan O’Brien
What is the future of data extraction and web scraping with the emergence of AI-powered machine learning? Hear how AI is poised to completely change the way we scrape the web and discover Scrapinghub’s exciting new AI-powered Data Extraction solution, straight from the team that developed it.
12:15 - 13:00 Web Scraping at Scale - Proxy and Anti-Ban Best Practice - Akshay Philar & Tomas Rinke
Get insights from the experts on the “bot arms race” and the latest best practices to ensure your crawlers don’t get banned.
13:00 - 14:00 Lunch/Networking
Grab a bite, relax and mingle with your fellow geeks.
14:00 - 14:30 The TellFinder Alliance: A Counter Human Trafficking Partner Network Empowered by Data - Amanda Towler (Hyperion Gray) & David Schroh (Uncharted Software)
Scrapinghub's technical expertise is the foundation of a network of law enforcement, technology and research partners who adapt and advance emerging capabilities in AI, data science, visual analytics and deep web data to locate potential victims of human trafficking, and identify those who exploit them.
14:30 - 15:00 Real-World Stories of Web Data Integration for Business Decision Insight - Andrew Fogg, Chief Data Officer, Import.io
A global investment bank, an online travel agency, an e-commerce retailer, a payments processor: they all have one thing in common, a burning need for web data! Import.io is a leading Web Data Integration services and solution provider to data-driven businesses, and its founder Andrew Fogg will share real-world examples of how businesses in a number of different industries are using Web Data Integration to identify trends and inform business decisions.
15:00 - 15:45 Insights into Web Data Use Cases - Customer Panel hosted by Suzanne Hasset
Hear how some of the world’s leading companies are using the power of data extraction and web scraping to gain a competitive edge.
15:45 - 16:00 Coffee Break
Perk yourself up with a hot coffee and something sweet.
16:00 - 16:30 How Machine Learning can be used in Web Scraping - Mikhail Korobov
We all know that you can write a web scraper in Python to download web pages. The problem is, this approach is not enough if you need to get the data from millions of different websites. In this talk, get real examples of how machine learning can be used for web scraping.
16:30 - 17:00 Quality makes for the best business plan - Or Lenchner, CEO, Luminati
In this talk, Or Lenchner will take us through the hidden elements of web scraping that directly affect your data quality and your business decisions, without you even knowing!
17:00 - 17:30 Data Democratization - Juan Riaza, Idealista
Learn about the process of creating datasets ready for use company-wide, using Apache Airflow to orchestrate the Scrapy Cloud API and an Apache Spark cluster to power our data pipeline.
17:30 Closing Remarks and Networking Drinks
Grab a beer/wine/soft drink and let's have a laugh.

Web Data Extraction Summit is organised by Scrapinghub.
Scrapinghub delivers world-class web data extraction products and services.
© Web Data Extraction Summit 2020

info@www.extractsummit.io