Playbook for content publishers
Your main question is usually: "How do I allow useful discovery without exposing the site naively?"
Your risk model
You care most about:
- AI answer-generation usage;
- model-training posture;
- archive replay;
- scraper pressure;
- editorial reuse without clear boundaries.
Do
- separate answer-generation from training in your reasoning;
- decide what kind of machine access is acceptable;
- choose a preset that reflects editorial goals, not just technical habit;
- document the difference between openness and naivety.
Don’t
- collapse all bots into one category;
- assume robots.txt alone solves hostile scraping;
- use vague language like "block AI" when the real policy is more granular.
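To make the difference between "block AI" and a granular policy concrete, here is a minimal sketch that keeps one decision per crawler category and renders it as robots.txt. The category names, the `POLICY` values, and the helper functions are hypothetical, and the user-agent tokens are examples only; check each vendor's current documentation before relying on them. The output only states your intent to crawlers that choose to honour it, so it does nothing against hostile scraping.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical grouping of crawler tokens by purpose. The tokens are
# illustrative; confirm them against each vendor's documentation.
CRAWLER_CATEGORIES = {
    "training": ["GPTBot", "CCBot"],
    "answer_generation": ["OAI-SearchBot", "PerplexityBot"],
}

# Hypothetical editorial policy: one decision per category,
# rather than a single switch for "AI".
POLICY = {
    "training": False,          # do not allow model-training crawlers
    "answer_generation": True,  # allow crawlers that power cited answers
}


def render_robots_txt(categories, policy):
    """Render one robots.txt group per category, plus a default group."""
    lines = []
    for category, agents in categories.items():
        lines.append(f"# {category}")
        for agent in agents:
            lines.append(f"User-agent: {agent}")
        lines.append("Allow: /" if policy[category] else "Disallow: /")
        lines.append("")  # a blank line ends the group
    # Every other crawler falls through to the default group.
    lines += ["User-agent: *", "Allow: /", ""]
    return "\n".join(lines)


def is_allowed(robots_txt, agent, url="https://example.com/article"):
    """Check how a well-behaved crawler would read the rendered rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


robots_txt = render_robots_txt(CRAWLER_CATEGORIES, POLICY)
print(robots_txt)
```

Calling `is_allowed(robots_txt, "GPTBot")` and `is_allowed(robots_txt, "OAI-SearchBot")` shows the two categories getting different answers from the same file, which is the distinction the Do list asks you to document.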