Robots.txt Guide


Robots.txt is a plain-text file at the root of a site that tells crawlers which paths they may fetch. It is a small file with an outsized ability to help or hurt visibility.

[Infographic: Robots.txt Guide, visual overview by Plain Intelligence.]

Common mistakes

  • Accidentally blocking the entire site with a leftover staging rule
  • Blocking CSS/JS files a crawler needs to render the page properly
  • Leaving AI crawlers like GPTBot and PerplexityBot blocked by a broad rule instead of explicitly allowing them (see AI crawlers explained)
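
The mistakes above can be caught with a quick check before deploying. The following is a minimal sketch, assuming Python's standard-library urllib.robotparser is an acceptable stand-in for a real crawler. The helper name blocked_urls, the sample files, and the example.com URLs are all hypothetical. The standard-library parser may not interpret every rule exactly as a given crawler does, so treat the result as a smoke test rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, user_agent, urls):
    """Return the URLs from `urls` that `user_agent` may not fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical file left over from a staging environment: blocks everything.
LEFTOVER_STAGING = """\
User-agent: *
Disallow: /
"""

# Hypothetical file that blocks the directory holding CSS and JS.
BLOCKS_ASSETS = """\
User-agent: *
Disallow: /assets/
"""

# Hypothetical file that shuts out one AI crawler by name.
BLOCKS_AI = """\
User-agent: GPTBot
Disallow: /
"""

# Placeholder URLs that should normally stay crawlable.
CHECK_URLS = [
    "https://example.com/",
    "https://example.com/assets/site.css",
    "https://example.com/assets/app.js",
]
```

Calling blocked_urls(LEFTOVER_STAGING, "Googlebot", CHECK_URLS) returns every URL in the list, which is the signal that a staging rule has leaked into production. An empty list means none of the checked URLs are blocked for that user agent.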

A sane baseline

Allow everything except admin and staging paths, reference the XML sitemap with a Sitemap line, and explicitly allow the AI crawlers relevant to your visibility goals. See the full technical SEO guide.
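
As an illustration of that baseline, here is a sketch of what such a file might look like, embedded in a short Python script so it can be checked with the standard-library urllib.robotparser. The /admin/ and /staging/ paths, the example.com domain, and the sitemap URL are placeholders, not values from any real site. The helper name baseline_parser is also hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Illustrative baseline; paths, domain, and sitemap URL are placeholders.
BASELINE_ROBOTS_TXT = """\
# Default group: everything is crawlable except admin and staging paths.
User-agent: *
Disallow: /admin/
Disallow: /staging/

# AI crawlers named explicitly. A crawler that matches a named group
# uses that group instead of the * group, so the Disallow lines are
# repeated here to keep admin and staging paths off limits.
User-agent: GPTBot
User-agent: PerplexityBot
Disallow: /admin/
Disallow: /staging/
Allow: /

Sitemap: https://example.com/sitemap.xml
"""


def baseline_parser():
    """Parse the baseline text and return a ready-to-query parser."""
    parser = RobotFileParser()
    parser.parse(BASELINE_ROBOTS_TXT.splitlines())
    return parser
```

With this in place, baseline_parser().can_fetch("GPTBot", "https://example.com/blog/post") reports whether the named crawler may fetch a normal page, and baseline_parser().site_maps() lists the sitemap URLs found in the file (available in Python 3.8 and later). The Disallow lines are repeated in the named group because a crawler that matches a specific group follows only that group, so an Allow-only group would open the admin and staging paths to those crawlers.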
