How to migrate from a manual robots.txt to Better Robots.txt

Migrating from a manual robots.txt to Better Robots should not be treated as a copy-paste exercise. The risk is not only losing a few lines. The bigger risk is changing the meaning of the crawl policy without realizing it.

Many sites have a robots.txt that was written years ago, patched by several people, copied between environments, or inherited from an old SEO plugin. The file often still works, but nobody is fully confident about:

  • why each block exists;
  • which paths are still relevant;
  • which rules are outdated;
  • which rules are dangerous to keep;
  • which rules are dangerous to lose.

That is why a safe migration should be treated as a policy translation task, not a raw file import task.

Step 1 — audit the current file before touching anything

Before you migrate, read the current file exactly as the site publishes it.

Check:

  • the user-agents targeted;
  • the blocked path groups;
  • sitemap declarations;
  • any broad wildcard rules;
  • any legacy SEO tool or archive blocks;
  • any custom AI-related lines added recently.

Use How to audit your robots.txt in 5 minutes as your quick first pass.

Do not start working inside the plugin interface until you know what the current live file is already doing.
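
If you want a durable record of that starting point, a small script can save a dated snapshot of the live file. This is only a sketch: the domain below is a placeholder, and any HTTP client or a plain browser download works just as well.

  import urllib.request
  from datetime import date

  # Placeholder domain: replace with the site you are migrating.
  url = "https://example.com/robots.txt"

  with urllib.request.urlopen(url) as response:
      body = response.read().decode("utf-8", errors="replace")

  snapshot = f"robots-before-{date.today().isoformat()}.txt"
  with open(snapshot, "w", encoding="utf-8") as f:
      f.write(body)

  print(f"Saved {len(body.splitlines())} lines to {snapshot}")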

Step 2 — separate useful rules from inherited debris

Manual robots.txt files often contain three kinds of rules mixed together:

  1. still-useful path controls;
  2. old blocks that no longer correspond to the site architecture;
  3. folklore rules that were copied from somewhere else but never really justified.

Examples of inherited debris include:

  • blocks for paths that no longer exist;
  • duplicated blocks under several user agents;
  • very broad wildcard rules added during panic moments;
  • old SEO tool exclusions that conflict with today’s product or content model;
  • comments that explain nothing about why a rule still matters.

A good migration keeps the intent while cleaning the debris.
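
For illustration, here is what a legacy file can look like once you annotate it. The paths are invented for this example; the point is the mix of still-valid rules, dead paths, duplicated groups, and panic-era wildcards.

  User-agent: *
  Disallow: /wp-admin/              # still useful: core admin area
  Disallow: /old-microsite-2017/    # path removed in a redesign: candidate to drop
  Disallow: /*?replytocom=          # broad wildcard added during a crawl spike: re-justify or drop
  Disallow: /staging/               # staging now lives on a separate host: candidate to drop

  User-agent: Googlebot
  Disallow: /wp-admin/              # duplicate of the wildcard group above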

Step 3 — decide the correct target model

Do not migrate into Better Robots by trying to replicate every line mechanically. Instead, decide which target model the site actually belongs to.

Typical migration targets include:

  • a simple brochure or local business site;
  • a publisher or content-heavy editorial site;
  • a WooCommerce store;
  • a more defensive "Fortress" stance;
  • a highly customized rollout managed by an agency.

A manual file may contain many lines, but the site usually fits one of a small number of real policy models. Decide which model applies first, then translate the legacy rules into it instead of porting them line by line.

Step 4 — map each legacy rule to a module, preset, or explicit decision

Each important legacy rule should be mapped to one of three outcomes:

Keep it through a preset or existing module

If Better Robots already covers the rule through a preset or a documented feature, do not recreate it manually without reason.

Keep it as a custom decision

If the rule is still justified but not part of the preset, reintroduce it intentionally as a custom decision.

Drop it on purpose

If the rule is outdated, weak, or harmful, remove it deliberately and document why.

This is how you turn migration into a structured review instead of a blind port.
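
In practice that mapping can be as simple as an annotated list kept with your migration notes. The rules below are hypothetical; what matters is that every legacy line ends up in exactly one of the three outcomes.

  # Keep via preset/module:   Disallow: /wp-admin/           (covered by the plugin's defaults, assuming the preset handles admin paths)
  # Keep as custom decision:  Disallow: /internal-docs/      (still justified, not part of any preset)
  # Drop on purpose:          Disallow: /old-landing-2019/   (path removed in the last redesign; reason recorded in the migration notes)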

Step 5 — review Googlebot, Google-Extended, and other major agents separately

One of the most dangerous migration habits is collapsing everything into generic wildcard logic.

You need to review separately:

  • Googlebot;
  • Google-Extended;
  • GPTBot;
  • ClaudeBot;
  • archive bots;
  • SEO tool crawlers;
  • any other category that matters for your content model.

Why? Because a manual file may hide agent-specific intentions that become dangerous if translated too broadly.

For example:

  • a site may want Google Search open but Google-Extended blocked;
  • a publisher may want AhrefsBot limited but not fully blocked;
  • a commerce site may want archive bots restricted without touching product discovery.
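
Translated into robots.txt terms, those three intentions could look roughly like this. The exact agent names and paths depend on your stack, and Crawl-delay is only honored by some crawlers, so treat this as a sketch rather than a recommended policy.

  # Search stays open
  User-agent: Googlebot
  Disallow:

  # AI training for Google's models is blocked without touching Search
  User-agent: Google-Extended
  Disallow: /

  # An SEO crawler is slowed down and kept out of tag archives, not banned
  User-agent: AhrefsBot
  Crawl-delay: 10
  Disallow: /tag/

  # An archive-style bot is kept away from checkout, product discovery stays open
  User-agent: ia_archiver
  Disallow: /checkout/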

This is where AI & LLM Governance settings matter.

Step 6 — preserve the product and content surface

During migration, the highest-risk error is accidentally widening a block over pages that must remain discoverable.

Always re-check:

  • home;
  • key category or product pages;
  • pricing and conversion pages;
  • documentation hubs;
  • multilingual sections if they exist;
  • WooCommerce product or category surfaces if relevant.

The migration should not only preserve your intent. It should preserve the site’s real acquisition surface.
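
One quick way to verify this is to run the generated file through a standard robots.txt parser and test the pages you cannot afford to block. The snippet below uses Python's built-in parser; the URLs are placeholders for your own critical pages.

  from urllib.robotparser import RobotFileParser

  parser = RobotFileParser()
  parser.set_url("https://example.com/robots.txt")
  parser.read()

  # Replace with the pages that actually drive discovery and conversion.
  critical_urls = [
      "https://example.com/",
      "https://example.com/pricing/",
      "https://example.com/shop/",
      "https://example.com/docs/",
  ]

  for url in critical_urls:
      status = "ok" if parser.can_fetch("Googlebot", url) else "BLOCKED"
      print(f"{status:8s} {url}")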

Step 7 — use the review step as a control point

A migration is only safe if you can see the final generated file before it goes live.

That is why the Review & Save step matters so much. It lets you answer the critical question:

"Does the generated output still express the policy we intended?"

Without that checkpoint, migration becomes guesswork.
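
If you saved a snapshot in step 1, you can also diff it against the generated output outside the plugin. A minimal sketch, assuming both versions have been saved locally under the file names shown:

  import difflib

  with open("robots-before.txt", encoding="utf-8") as f:
      before = f.readlines()
  with open("robots-after.txt", encoding="utf-8") as f:
      after = f.readlines()

  diff = difflib.unified_diff(before, after,
                              fromfile="robots-before.txt",
                              tofile="robots-after.txt")
  print("".join(diff))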

Step 8 — document the new policy in governance surfaces

Once the site is migrated, the robots.txt file should not remain the only explanation of what changed.

A stronger migration also updates the surrounding governance surfaces, especially if the site uses:

  • AI governance settings;
  • llms.txt;
  • machine-first policy files;
  • patterns or playbooks for recurring site types.

That is why Better Robots treats migration as part of a larger governance stack, not just a one-file replacement.
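
If the site also publishes an llms.txt file, keep it in sync with the migrated policy. A minimal sketch, following the markdown structure used by the llms.txt proposal, with placeholder pages:

  # Example Store

  > Short summary of the site and of how its content is intended to be used by AI systems.

  ## Key resources
  - [Pricing](https://example.com/pricing/): current plans and terms
  - [Documentation](https://example.com/docs/): product and integration docs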

Step 9 — monitor after launch

Migration is not finished when the new file is published.

You should monitor:

  • server logs;
  • crawl coverage patterns;
  • unexpected blocked URLs;
  • search discovery on key pages;
  • whether new low-value crawl spaces have reopened.

This is especially important if the old file had years of accumulated patches. Some of those patches may have been masking deeper structural problems.
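
A rough way to do that from server logs is to count hits from the crawlers you care about on paths that should now be blocked. This assumes a combined-format access log and uses placeholder path prefixes; adjust both to your setup.

  from collections import Counter

  WATCHED_AGENTS = ("Googlebot", "GPTBot", "ClaudeBot", "AhrefsBot")
  BLOCKED_PREFIXES = ("/wp-admin/", "/staging/", "/old-microsite-2017/")

  hits = Counter()
  with open("access.log", encoding="utf-8", errors="replace") as log:
      for line in log:
          try:
              path = line.split('"')[1].split()[1]   # request line: "METHOD /path HTTP/x"
              agent = line.rsplit('"', 2)[1]         # last quoted field: user agent
          except IndexError:
              continue
          if path.startswith(BLOCKED_PREFIXES):
              for bot in WATCHED_AGENTS:
                  if bot in agent:
                      hits[(bot, path)] += 1

  for (bot, path), count in hits.most_common(15):
      print(f"{count:6d}  {bot:10s}  {path}")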

Migration checklist

Use this practical sequence:

  1. snapshot the current live robots.txt;
  2. audit the current rule set;
  3. classify each rule as keep, translate, or remove;
  4. choose the correct preset or pattern target;
  5. review major user agents separately;
  6. verify that public acquisition surfaces remain open;
  7. preview the generated output;
  8. publish intentionally;
  9. monitor logs and crawl behavior afterward.

FAQ

Should I recreate every legacy line exactly?

No. Preserve the intent, not the historical clutter.

Is migration mainly about presets?

Presets are a starting point. Good migrations still require explicit review of agents, paths, and product-critical surfaces.

What if the current file is already "working"?

That only means it has not visibly failed yet. It does not mean it is well-structured, up to date, or easy to govern.

Continue with How to audit your robots.txt in 5 minutes, The 5 most common robots.txt mistakes, Presets, Patterns, and Review & Save.