Insights & Engineering

Deep dives into web data infrastructure, extraction techniques, and the future of structured data at scale.

Latest Articles

Puppeteer Submit Form: Node.js Guide for 2026

TL;DR: Use page.locator(selector).fill(value) for fast, deterministic form-submission scripts in Puppeteer, and page.type() when the page watches for real keystrokes (autocomplete, anti-bot checks, live validation). Submit by clicking the button, pressing Enter, or calling form.requestSubmit(), and always wait for a concrete success signal instead of a fixed timeout.

Mihnea-Octavian Manolache · 12 min read
May 8, 2026

How to Build a Web Scraper with Pyppeteer (2026 Guide)

TL;DR: Pyppeteer is the unofficial Python port of Puppeteer and still works for driving a real Chromium from asyncio. In this guide you will install it, write a modern web scraper with Pyppeteer using asyncio.run and try/finally, handle waits, forms, screenshots, infinite scroll, cookies, and proxies, and learn when to migrate to Playwright, Selenium, or a hosted scraping API.

Mihnea-Octavian Manolache · 10 min read
May 12, 2026

How to Scrape Walmart.com: 2026 End-to-End Guide

TL;DR: This guide walks through scraping Walmart product data end-to-end in Python, from parsing the hidden __NEXT_DATA__ JSON to scaling with proxies, retries, and async fetches. It also draws an honest line for when a managed scraper API beats DIY.

Raluca Penciuc · 11 min read
May 12, 2026
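
As a taste of the Walmart guide's first step, here is a minimal sketch of pulling the hidden __NEXT_DATA__ JSON out of a page using only the Python standard library. It is not the article's own code: the names NextDataParser and extract_next_data are hypothetical, the sample HTML is made up, and the props/pageProps/product path is an assumed shape for illustration, since the real payload layout is not described here and may differ.

```python
import json
from html.parser import HTMLParser


class NextDataParser(HTMLParser):
    """Collects the text inside <script id="__NEXT_DATA__"> (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        # Start capturing only when we hit the Next.js data script tag.
        if tag == "script" and dict(attrs).get("id") == "__NEXT_DATA__":
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.chunks.append(data)


def extract_next_data(html):
    """Return the parsed __NEXT_DATA__ payload, or None if the tag is absent."""
    parser = NextDataParser()
    parser.feed(html)
    parser.close()
    if not parser.chunks:
        return None
    return json.loads("".join(parser.chunks))


# Made-up page; the JSON structure below is an assumption, not Walmart's schema.
SAMPLE_HTML = """
<html><body>
<script id="__NEXT_DATA__" type="application/json">
{"props": {"pageProps": {"product": {"name": "Example Widget", "price": 19.99}}}}
</script>
</body></html>
"""

data = extract_next_data(SAMPLE_HTML)
product = data["props"]["pageProps"]["product"]
print(product["name"], product["price"])
```

In a real scraper the HTML would come from an HTTP response rather than a string literal, and the key path should be looked up defensively because the payload can change without notice.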