
Publisher playbook

For editorial, documentation-heavy, media-like, or content-led sites where the team wants to stay discoverable in search while drawing clearer lines around AI usage and archives.

The fastest route from "we publish content" to the right Better Robots.txt reading stack.

Your main decision

For publishers, the hard part is not choosing between openness and total lockdown.

The hard part is separating three different questions:

  • do you want to remain discoverable in search?
  • what kind of AI answer or retrieval use is acceptable?
  • what is your posture on training and archive capture?
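
As a rough illustration of keeping these three questions separate, the sketch below writes one possible policy as robots.txt groups and checks it with Python's standard `urllib.robotparser`. The crawler tokens, their grouping, and the `example.com` URL are assumptions chosen for illustration and are not the output of any Better Robots.txt preset. Check each operator's current documentation before relying on a token.

```python
from urllib.robotparser import RobotFileParser

# Illustrative policy only: tokens and groupings are assumptions,
# not what any Better Robots.txt preset generates.
ROBOTS_TXT = """\
# Question 1: search discovery stays open
User-agent: Googlebot
User-agent: Bingbot
Allow: /

# Question 2: AI answer / retrieval use (allowed in this sketch)
User-agent: OAI-SearchBot
Allow: /

# Question 3: training and archive capture (declined in this sketch)
User-agent: GPTBot
User-agent: CCBot
User-agent: ia_archiver
Disallow: /

User-agent: *
Allow: /
"""


def build_parser(text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def posture(parser, agents, url="https://example.com/articles/sample-story"):
    """Return {agent: True/False} for whether each agent may fetch the URL."""
    return {agent: parser.can_fetch(agent, url) for agent in agents}


parser = build_parser(ROBOTS_TXT)
result = posture(parser, ["Googlebot", "OAI-SearchBot", "GPTBot", "CCBot", "ia_archiver"])
for agent, allowed in result.items():
    print(f"{agent:15} {'allowed' if allowed else 'blocked'}")
```

The point of the grouping is that each question can change independently: tightening the training and archive group does not touch the search group. These rules remain signals that a crawler may or may not honor.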

Best starting posture

Start with AI-First when the publisher wants a clearer public policy around AI usage.

Stay on Essential when the site is still small and the main job is simple search visibility.

Move to Fortress only when the archive or sensitivity profile truly justifies a protection-first stance.

Read in this order

Review modules in this order

  1. Search Engine Visibility
  2. AI and LLM Governance
  3. LLMS.txt File when a machine-readable public policy helps
  4. Archive & Wayback Control
  5. Spam, Feeds & Crawl Traps for low-value search and feed paths
  6. Review & Save

Leave these alone until there is evidence

  • broad SEO-tool blocking if the real issue is policy clarity, not crawl cost
  • Fortress if the site still depends heavily on open discovery
  • blanket disallow rules on article, category, or archive paths without impact review
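
One way to approach the impact review mentioned in the last item is to test a proposed rule set against a sample of real URLs before publishing it. The sketch below does that with Python's standard `urllib.robotparser`. The helper name `blocked_urls`, the proposed rules, and the sample URLs are all hypothetical. Note that this parser uses simple prefix matching, so results for wildcard rules may differ from how major search crawlers interpret them.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, agent="Googlebot"):
    """Return the subset of urls the given agent could not fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(agent, u)]


# Hypothetical draft rules under review (not a recommended configuration).
PROPOSED = """\
User-agent: *
Disallow: /archive/
Disallow: /category/
"""

# Hypothetical sample; in practice, pull these from the sitemap or analytics.
SAMPLE_URLS = [
    "https://example.com/2024/05/sample-story/",
    "https://example.com/category/politics/",
    "https://example.com/archive/2019/",
    "https://example.com/feed/",
]

would_block = blocked_urls(PROPOSED, SAMPLE_URLS)
for url in would_block:
    print("would block:", url)
```

If the blocked list includes paths that currently bring in search traffic, that is the evidence to weigh before the rule goes live.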

Common mistakes

  • collapsing search indexing, answer-generation use, training, and archive capture into one vague "AI" category
  • treating llms.txt as an enforcement layer when it is only a published signal
  • tightening archive controls before protecting basic search discoverability
  • publishing strongly worded public policy without a clear internal explanation of what it means

What good publisher discipline looks like

  • keep search visibility separate from training posture
  • describe policy signals as signals, not guarantees
  • document the difference between answer use, archive use, and training use
  • preview the generated file before publication

Escalate only when

Move to a stricter setup when:

  • the archive has clear commercial or licensing value
  • newsroom or legal stakeholders need more explicit public boundaries
  • crawl logs show extraction pressure, such as sustained high-volume fetching of article or archive paths by non-search crawlers
  • the site has become large enough that archive and crawl hygiene now affect cost
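
To ground the crawl-log criterion in something measurable, a minimal sketch like the one below can count requests per crawler token. It assumes access logs in the common "combined" format, where the user agent is the last quoted field. The watch list, the helper name `crawler_hits`, and the sample log lines are made up for illustration. What volume counts as pressure is a judgment the team still has to make.

```python
import re
from collections import Counter

# Hypothetical watch list; adjust to the crawlers that matter for your policy.
WATCHED_TOKENS = ["GPTBot", "CCBot", "ClaudeBot", "Bytespider"]

# Combined log format ends with: "referer" "user-agent"
UA_PATTERN = re.compile(r'"[^"]*" "([^"]*)"\s*$')


def crawler_hits(log_lines, tokens=WATCHED_TOKENS):
    """Count log lines whose user agent contains one of the watched tokens."""
    counts = Counter()
    for line in log_lines:
        match = UA_PATTERN.search(line)
        if not match:
            continue  # skip lines that are not in the expected format
        user_agent = match.group(1).lower()
        for token in tokens:
            if token.lower() in user_agent:
                counts[token] += 1
                break
    return counts


# Made-up sample lines, for illustration only.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/May/2024:10:00:00 +0000] "GET /articles/a HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '203.0.113.5 - - [10/May/2024:10:00:01 +0000] "GET /articles/b HTTP/1.1" 200 4096 "-" "Mozilla/5.0 (compatible; GPTBot/1.0)"',
    '198.51.100.7 - - [10/May/2024:10:00:02 +0000] "GET /archive/2019/ HTTP/1.1" 200 9000 "-" "CCBot/2.0"',
    '192.0.2.9 - - [10/May/2024:10:00:03 +0000] "GET / HTTP/1.1" 200 1200 "https://example.com/" "Mozilla/5.0 (X11; Linux x86_64)"',
]

hits = crawler_hits(SAMPLE_LOG)
for token, count in hits.most_common():
    print(token, count)
```

User-agent strings can be spoofed, so counts like these are a starting point for the conversation rather than proof of who is crawling.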