Blog
The latest on technical enforcement of crawling preferences.
Scrapling and Crawlee: How Open-Source Scraping Tools Get Detected
A technical analysis of Scrapling and Crawlee, two popular open-source scraping frameworks: their anti-detection features and the behavioral signals that content-layer defenses can exploit.
How AI Scraping Infrastructure Works: Proxies, Evasion, and Scale
Inside the technical infrastructure AI companies use to scrape the web: residential proxy networks, fingerprint emulation, CAPTCHA solving, and why traditional defenses fail.
The AI Crawler Compliance Crisis: Who Plays by the Rules?
AI crawler robots.txt compliance dropped from 96.7% to 70% in one year. Analysis of which crawlers comply, what it costs publishers, and what comes next.
Understanding AIPREF: The IETF Standard for AI Content Preferences
AIPREF extends robots.txt with a standardized vocabulary for AI training preferences. How the IETF standard works, its syntax, and what it means for publishers.
Data Poisoning FAQ: Technical, Legal, and Policy Answers
Answers to common questions about data poisoning, web crawling, robots.txt, AIPREF, legal status, and enforcement mechanisms for AI training defense.
Publisher Defenses Against AI Scraping: Cost Imposition vs Poisoning
Comparing defense strategies against AI scraping: proof-of-work systems impose compute costs, while data poisoning degrades data value. Who pays, and what works for publishers.
AI Poisoning Threat Models: Backdoors, RAG, and Supply Chain
Backdoor attacks, model degradation, and RAG poisoning explained. Technical analysis of who can attack, defense costs, and power dynamics in AI training data.
Defensive Data Poisoning: Ethics, Risks, and Alternatives
Analyzing ethical tradeoffs of defensive data poisoning: proportionality, collateral damage, and safer alternatives like proof-of-work and AIPREF standards.
What Is Data Poisoning in Machine Learning?
Data poisoning manipulates AI training data to alter model behavior. Learn how defensive tools like Nightshade protect content from unauthorized AI training.
Why VENOM Exists: From robots.txt to AI Data Enforcement
When robots.txt fails, enforcement mechanisms emerge. VENOM analyzes data poisoning, proof-of-work, and technical countermeasures for AI training governance.