Web Data Extraction Summit 2020

Virtual Experience presented by Scrapinghub

After the huge success of the Web Data Extraction Summit last year, we’re delighted to announce it’s back in 2020!

Wherever you are
November 10th, 2020

Things are a little different...

Yep, we're living in strange times. Given the current situation arising from the COVID-19 pandemic, we want to ensure that every precaution is taken to look after our attendees, speakers, and partners. Therefore, we have decided to make this year’s Extract Summit a completely online experience!
You can still expect the greatest minds in data extraction and web scraping to come together to educate, inspire, and innovate.

And best of all, it’s completely free!

Learn Practical Skills

Get practical and fundamental tips on web scraping, data extraction, and proxy management from experts who do this every day.

Expert Speakers

Extract Summit brings together the smartest minds in data extraction to discuss the future of data extraction and the latest industry best practices.

Get Real Insights

Get customer case studies and behind-the-scenes insights into how companies are leveraging web data to gain a competitive edge.

What is Extract Summit?

The Web Data Extraction Summit is a one-day event, jam-packed with talks and workshops where we will discuss everything from the latest trends in data extraction to web scraping best practices, and how web data can turbocharge your business.

We gather the preeminent thought leaders in data extraction and web scraping to share their insights on how we can leverage the internet, the world’s largest dataset, to gain a competitive edge in today’s marketplace.

Agenda

Product Vision & Customer-Driven Innovation At Scrapinghub
In the constantly changing world of data extraction, innovation is key to the future. Ryan Farley, the Head of Product at Scrapinghub, will talk about connecting with customers and users to seek their input in the product development lifecycle, and how this customer-driven focus is driving innovation in the market and developing an exciting product vision for the future.

Presented by Ryan Farley - Head of Product at Scrapinghub

Bio

Ryan Farley is the Head of Product at Scrapinghub, with over 20 years of experience in building products and businesses around the world across multiple industries including fintech, proptech, government, e-learning, consumer internet, and sport.
He works to the formula C+E=S, meaning Customers + Employees = Shareholders, with a relentless focus on creating products and services that wow and delight customers.
Tuesday
10th Nov 2020
Time: TBC
Panel: Legal Compliance In The World Of Web Scraping
The lack of clear legal guidance in the web scraping industry means you have to be extra cautious about how you scrape and what type of data you scrape.
In this panel, Sanaea Daruwalla, Head of Legal at Scrapinghub, brings together legal experts in the field of data extraction to discuss the various aspects of web scraping compliance and updates in the legal landscape.

Host: Sanaea Daruwalla - Head of Legal at Scrapinghub
Panelists: Sarah McKenna - CEO of Sequentum
                  Marc Zwillinger - Founder of ZwillGen PLLC
                  Paul Griffin - CEO of First Compliance
Tuesday
10th Nov 2020
Time: TBC
Running a Business on Web Scraped Data
Every day we hear sentences like "Data is the new oil" or "Web data is a gold mine" and that's definitely true. 
In this talk, we will see how establishing a business based on web scraped data has so much in common with the old traditional mining companies. Pierluigi will cover the processes, tasks, operators and tools needed to run a reliable and modern company. Find out how web scraping can help avoid the many obstacles faced while running a successful business.

Presented by Pierluigi Vinciguerra - CTO and Co-Founder of Re Analytics

Bio

Pierluigi Vinciguerra is the CTO and Co-Founder of Re Analytics, a data boutique for consumer and luxury goods. With 10+ years of experience in business intelligence, web data integration and scraping, Pierluigi is an expert in data management. The team at Re Analytics crawls 1+ billion price points every month to extract valuable insights for investors and C-level executives in consumer and luxury goods.
Tuesday
10th Nov 2020
Time: TBC
Utilizing The Scrapy Cloud API For A Seamless Data Pipeline
In this talk, Johnel will explain how the team at Prosple used the Scrapy Cloud API to create an automated process from initial scraping to clean data. Basically, automating the E and T in the ETL process!

Presented by Johnel Bacani, Data Specialist at Prosple

Bio

Johnel Bacani is a Data Specialist at Prosple. He designs and manages various data pipelines. He loves Python and he's been using it since 2013 for both work and play.
Tuesday
10th Nov 2020
Time: TBC
Everyday Low Pricing Strategy With Scrapinghub, Google App Scripts & Heroku
Is your business suffering because your competitors are undercutting your prices? Are you struggling to keep up? Say hello to your very own Price-spy! 
Rameez shows how a price tracker built with tools like Scrapinghub, Google App Scripts, and Heroku lets you see your competitors' changing prices and trends.

Presented by Rameez Kakodker, Digital Transformation Expert

Bio

With over 30 e-Commerce launches in the last 10 years, Rameez has been at the helm of digital transformation within the GCC & Asia region. He regularly features on Medium and supports startups with their roadmap planning and product.
Tuesday
10th Nov 2020
Time: TBC
Web Scraping Tech Stack For 2020
As the web evolved from static sites to complex JavaScript applications, even the techniques and tools needed to scrape it have changed. From plain HTTP requests to robotized browsers - this talk will show you all the tricks you need to extract data from the modern web reliably and scalably.

Presented by Ondra Urban, Technical Web Scraping Expert at Apify

Bio

Ondra is a hacker of the browser age. He extracts terabytes of publicly available data and translates them to a language that machines can understand. At Apify, Ondra leads a team of fellow hackers who grow their open source projects, break anti-scraping walls, and dabble in AI.
Tuesday
10th Nov 2020
Time: TBC
TellFinder Alliance: Tackling Online Exploitation with Data
Last year at Extract Summit, Amanda Towler and David Schroh talked about their five-year project to build a bleeding-edge data collection and extraction pipeline to fight human trafficking.
This year, Amanda would like to expand on this topic to discuss how the team pivots their data pipeline to tackle a broader array of online exploitation, how having a solid foundation makes this a tractable, efficient process, and the impacts they can have on the world.

Presented by Amanda Towler, Co-Founder & Principal Investigator at Hyperion Gray, LLC

Bio

Amanda is the Co-Founder and Principal Investigator at Hyperion Gray, LLC, a technology R&D small business working primarily with the Defense Advanced Research Projects Agency (DARPA). 
She has a decade of experience spanning OSINT, offensive security, data science, and software development. She has consulted with law enforcement on several high profile dark web child exploitation cases.
Tuesday
10th Nov 2020
Time: TBC
Introducing AutoCrawl - The AI-Powered Crawler
AI is disrupting the ecosystem, altering every single process with new machine-learning powered approaches.
In this talk, Iván will show how this impacts the world of data crawling by introducing AutoCrawl, an AI-powered crawler capable of gathering data from websites automatically.

Presented by Iván de Prado Alonso, Data Scientist at Scrapinghub

Bio

Iván is a Data Scientist at Scrapinghub who loves Deep Learning and Computer Vision. He has 10+ years of experience working for and with startups, dealing with the greatest technical challenges at each.
Tuesday
10th Nov 2020
Time: TBC
Separating Extraction From Crawling Logic With Web-Poet
What are the Web-poet and Scrapy-poet projects? How do they work and how could they be helpful? In this talk, Victor will take you through their state of development, their foreseeable future, and their relation to the AutoExtract and AutoCrawl projects.

Presented by Victor Torres, Web Scraping, Python and Scrapy Guru

Bio

Victor Torres is a full-stack developer with 5+ years of experience leading agile teams and building web applications. He currently works with Python and web scraping at Scrapinghub.
Tuesday
10th Nov 2020
Time: TBC
Panel: Cutting Edge Ways To Tackle Antibot Challenges
Extracting web data at scale can provide huge value. But with scaling up, there is often an obstacle standing between you and the data, preventing easy access: antibots.
Antibots introduce a big and important challenge to solve for anyone who wants to scrape the web at scale. If you don’t have a reliable way to solve these challenges created by antibots, you will not be able to access any data.
In this session, the panel of antibot experts at Scrapinghub will aim to dissect this problem and look at the possible solutions.

Host: Attila Toth, Technology Evangelist at Scrapinghub
Panelists: Akshay Philar, Head of Development at Scrapinghub
                  Tomas Rinke, Team lead at Scrapinghub
                  Rodolfo Silva, Solutions Engineer at Scrapinghub
Tuesday
10th Nov 2020
Time: TBC
Overcoming Price Variations On The Day: In Search of Real-time Pricing
Offering a platform that can deal with B2C and B2B simultaneously is a challenge. The ever-changing and volatile market in Latin America presents even more challenges with the products going through price variations multiple times a day. This makes real-time information processing extremely difficult. 
To overcome these challenges and be able to reach both B2B and B2C markets, Alfonso will show how Prixtips has implemented technological improvements and developed various solutions that integrate databases, scraping, machine learning and networking.

Presented by Alfonso de la Guarda, CTO of Aputek and Technology Architect at Veo365.com and Prix.tips

Bio

Alfonso de la Guarda, the CTO of Aputek and a Technology Architect at Veo365.com and Prix.tips, is an old-school hacker. He collaborates on and oversees projects in strategic areas such as mining, defense, and health.
Tuesday
10th Nov 2020
Time: TBC
DataOps and The Culture You Need If You Want To Stay Sane
Data-driven companies are in their nascent stage and most of them lack proper culture and methodology. A modern business consists of constant data updates with multidisciplinary teams of engineers, scientists and business people all working together. The customers demand complex answers within hours or minutes. The growing demands of an expanding business require a streamlined and scalable culture in order to stay sane.
José will explain how DataOps, which applies the automation spirit of DevOps to data delivery, is the perfect solution for a data company like this. In this talk he will share the lessons learned during 3 years of constant evolution and improvements in our software, data processes and culture.

Presented by José Manuel Navarro, CTO of urbanData Analytics

Bio

José Manuel is the CTO of urbanData Analytics. He’s leading all technical teams to make the most of each individual, developing the data-driven culture and coding all kinds of software. Prior to joining uDA, he was the global Lead for Mobile & API Products at Liferay.
Tuesday
10th Nov 2020
Time: TBC

Web Data Extraction Summit is organised by Scrapinghub.
Scrapinghub delivers world class web data extraction products and services.
© Web Data Extraction Summit 2020

info@www.extractsummit.io