Powered by Zyte

Dublin, Ireland | November 5-6

WEB DATA EXTRACT SUMMIT 2025

After an incredible event in Austin, the 7th Annual Web Data Extract Summit is now gearing up for its Dublin edition! Don’t miss your chance to be part of this in-person experience, where industry leaders, innovators, and data enthusiasts come together to shape the future of web data.

November 6 | Dublin, Ireland

Main Event

6th November 2025

How to Make AI Coding Work for Enterprise Web Scraping

Iain Lennon | CPO @ Zyte & John Rooney | Developer Engagement Manager @ Zyte
Enterprise-scale scraping demands control, reliability, and engineering discipline that no-code AI tools can’t match. AI coding offers hope, but often fails to deliver effective spiders. This talk shows how to bridge the gap and meet the needs of professional web data teams.
6th November 2025

The Anatomy of a Request: Bypassing Protections and Scaling Data Extraction

Kieron Spearing | Data Collection Engineer @ Centric Software
Learn how advanced scraping teams decode anti-bot systems at scale, reverse-engineering requests, exploiting weaknesses, and deploying resilient bypasses across thousands of feeds without compromising speed, stability, or compliance in real-world high-volume environments.
6th November 2025

The New Era of AI Data Collection: A Deep Dive into Modern Web Scraping

Fabien Vauchelles | Web Scraping Expert @ Scrapoxy
Unpack the AI-era web scraping arms race, where LLMs write scrapers, proxies crash in cost, and bots evade detection—revealing the real-world stack behind scalable data extraction, price intelligence, and modern web-scale signal mining.
6th November 2025

Scaling Horizontally: Smart Strategies for Scraping 600K Unknown Websites

Daniele D'Accurso & Marco Crisafulli | Co-Founders of DataHive
Learn how data extraction teams scale horizontal scraping to uncover contact intelligence from hundreds of thousands of unknown company websites: discovering hidden domains via POI datasets and map APIs, decoding diverse layouts with adaptive rules and NER, and extracting emails and phone numbers using LLMs and spaCy. The result is structured intelligence across massive, unpredictable web ecosystems without sacrificing speed, cost, or precision.
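To give a feel for the extraction step mentioned in the abstract above, here is a minimal sketch of a plain regex baseline for pulling email-like strings out of page text. It is not the speakers' method (the talk names LLMs and spaCy); the pattern, the `extract_emails` helper, and the sample text are all assumptions made for illustration.

```python
import re

# Assumption: a deliberately simple email pattern, not a full RFC 5322 validator.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def extract_emails(text):
    """Return unique email-like strings, lowercased, in order of first appearance."""
    seen = {}
    for match in EMAIL_RE.findall(text):
        # dicts preserve insertion order, so this de-duplicates without reordering
        seen.setdefault(match.lower(), None)
    return list(seen)


# Hypothetical page text used only for this example.
sample = "Contact sales@example.com or Support@Example.org. Sales@example.com also works."
emails = extract_emails(sample)
print(emails)
```

A baseline like this is cheap to run across many sites, which is one reason a team might keep it alongside heavier model-based extraction.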
6th November 2025

Why AI Agents Struggle with Web Scraping (and How to Help Them)

Iván Sánchez | Senior Data Scientist @ Zyte
Discover why web scraping is uniquely hard for AI agents. This talk explains the key challenges that cause them to fail and offers practical strategies and tools to make agents more resilient in real-world scraping.
6th November 2025

IPv6-Powered Web Scraping: Design Patterns, Pitfalls & Practical Checklists

Yuli Azarch | CEO @ RapidSeedbox
Learn how advanced scraping teams harness IPv6 to unlock vast, low-cost address space for cleaner, faster, and more resilient operations—understanding how sites assign reputation to IPv6 blocks, mastering rDNS for credibility, and isolating jobs across /48 segments to maximize success at scale.
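As a rough illustration of the /48 isolation idea in the abstract above, the sketch below carves one /48 subnet per job out of a larger allocation using Python's standard `ipaddress` module. The `2001:db8::/32` block is the reserved IPv6 documentation prefix standing in for a real allocation, and the job names and the `assign_segments` helper are hypothetical.

```python
import ipaddress
from itertools import islice

# Assumption: 2001:db8::/32 is the documentation prefix, used here as a placeholder allocation.
ALLOCATION = ipaddress.ip_network("2001:db8::/32")


def assign_segments(jobs, allocation=ALLOCATION, prefix=48):
    """Map each job name to its own subnet of the given prefix length."""
    # subnets() is a generator, so islice avoids materialising all 65,536 /48s.
    subnets = islice(allocation.subnets(new_prefix=prefix), len(jobs))
    return dict(zip(jobs, subnets))


# Hypothetical job names used only for this example.
segments = assign_segments(["prices", "reviews", "listings"])
for job, net in segments.items():
    print(job, net)
```

Keeping each job inside its own subnet means that reputation problems picked up by one job's addresses are less likely to spill over onto another's.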
6th November 2025

Legal Panel: The Future of Data Laws: AI, Web Data, and Intellectual Property

Sanaea Daruwalla, Chief Legal & People Officer @ Zyte | Barry Scannell, Partner @ William Fry | Nikos Minas, Global IP Counsel @ Wesco | Callum Henry, Legal Counsel @ Zyte
Learn how legal experts navigate the intersection of AI, web scraping, and intellectual property, covering topics like the EU AI Act, risk assessment for AI systems, prohibited uses, copyright case law including fair use debates, guidance from data protection authorities, and how to ensure compliance while fostering innovation.
6th November 2025

Scraping a Synthetic Web: Dead Internet Theory Meets Web Data Extraction

Domagoj Marić | AI Customer Delivery Manager @ Pontis Technology
Investigate how bot-generated content, synthetic signals, and looming regulations like the EU AI Act reshape scraping—forcing practitioners to rethink trust, traceability, and tactics in a web where authenticity is no longer guaranteed.
6th November 2025

The Technical Reality of Processing 10% of Google’s Global Search Volume

Julien Khaleghy | Founder & CEO @ SerpApi
Learn how SerpApi scales scraping Google using geolocated proxies, headless browsers, and adaptive parsing pipelines to reliably extract search results across languages, devices, and experiments while converting unstable HTML into clean, usable data.
November 5 | Dublin, Ireland

Extract Labs

5th November 2025
9:00 AM

Web Data Extraction with Agents & LLMs

Iván Sánchez | Senior Data Scientist @ Zyte
Learn how agent-style tools such as Copilot, Cursor, and Codex, together with LLM prompt tactics, can supercharge (and sometimes break) your scraping workflow. We’ll demo patterns, anti-patterns, security gotchas, and quick fixes for reliable selector generation and code execution.
5th November 2025
10:45 AM

Build and Scale Your Scraping System

John Rooney | Developer Engagement Manager @ Zyte
Create a scalable and robust web scraping system with Scrapy and Zyte API. We'll create and manage a queue for URL input and data output, and handle our spiders' logs, so everything is ready to scrape data quickly and efficiently.
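The queue-in, queue-out shape described in the abstract above can be sketched with the Python standard library alone. This is not the workshop's code and uses neither Scrapy nor Zyte API; `run_worker`, `fake_fetch`, and the example URLs are hypothetical stand-ins.

```python
import queue


def run_worker(url_queue, result_queue, fetch):
    """Drain url_queue, skip repeated URLs, and put one record per URL on result_queue."""
    seen = set()
    while True:
        try:
            url = url_queue.get_nowait()
        except queue.Empty:
            break  # input queue is exhausted
        if url not in seen:
            seen.add(url)
            result_queue.put({"url": url, "data": fetch(url)})
        url_queue.task_done()


def fake_fetch(url):
    # Hypothetical stand-in for a real download step.
    return f"<html for {url}>"


urls = queue.Queue()
for u in ["https://example.com/a", "https://example.com/b", "https://example.com/a"]:
    urls.put(u)

results = queue.Queue()
run_worker(urls, results, fake_fetch)
print(results.qsize())
```

Separating input and output queues from the fetch step makes it easier to swap the in-memory queues for a persistent or distributed one later.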
5th November 2025
2:00 PM

Deep dive into Zyte API and Scrapy advanced features through a live use case

Fernando Tadao Ito | Technical Team Lead @ Zyte
In this workshop, we'll go through the code of a complete Scrapy spider that uses everything Zyte API and Scrapy offer to crawl a complex website, and show off a few features you wouldn't usually see live! We'll also discuss codebase design strategies and structures, and showcase our most recent updates to Scrapy.
5th November 2025
3:30 PM

Modern Scrapy Development: Latest Features for Maintainable Web Scrapers

John Rooney | Developer Engagement Manager @ Zyte
Discover the latest Scrapy features, including split/reusable crawling and parameter validation. Learn hands-on techniques using modern tooling like scrapy-poet and scrapy-spider-metadata to write cleaner, more maintainable web scraping code for your growing projects.
5th November 2025
4:30 PM

Web Scraping Roundtable: Beyond Scraping — Pivoting into Data Eng, Data Science & Full-Stack in the Age of AI

Iván Sánchez, Fernando Tadao Ito & John Rooney
Join our no-slide, open-mic roundtable where scraper pros turned ML and AI builders share how LLMs are reshaping workflows—auto-coding ETL, generating selectors, and powering RAG chatbots. Get real talk on the skills you need next (dbt, Airflow, vector DBs, LangChain) and what actually drives career pivots, salary jumps, and meaningful learning.

The Gibson Hotel, Dublin

See what attendees have to say about Extract Summit

It's a great event to learn many new things across data extraction technologies. Loved the insightful topics discussed.
Enjoyed the informative sessions, easy and understandable even for a non-technical professional.
Extract Summit is the event for web scraping professionals. When you bring together high-quality content, engaging workshops and top experts all in one location, it always leads to amazing results!

Why attend Extract Summit 2025?

Stay at the forefront of innovation with the latest insights on cutting-edge web scraping and data technologies — delivered directly by industry leaders at both our US and European events.
Join 16,000+ data pros in the Extract Discord Community.

ONE COMMUNITY,
All things web scraping

Weekly Virtual Workshops

  • Learn from industry experts as they share insights weekly

  • Expand your knowledge of solving bans, AI, extracting data and innovative technologies

  • Exclusive content for Community members only

Bi-Weekly Developer Newsletter

  • Hear from Zyte Developer Advocates bi-weekly, straight to your inbox

  • Web scraping tips and tricks, industry news and important Zyte updates

  • Special promotions and exclusive content