ChatGPT-User vs GPTBot vs OAI-SearchBot: which OpenAI control does what
One of the easiest mistakes to make in modern machine-access governance is treating all OpenAI traffic as one thing.
It is not.
A practical OpenAI model now needs at least four distinct surfaces in view:
- OAI-SearchBot
- GPTBot
- ChatGPT-User
- ChatGPT agent
If those are collapsed into one mental bucket, teams start asking the wrong question.
They say "Should we block OpenAI?" when the real questions are much more specific:
- do we want visibility in ChatGPT search;
- do we want to refuse training;
- do we want to govern user-triggered visits;
- do we need verified or allowlisted agent access at the edge?
That is exactly why "Why robots.txt is not enough for user-triggered AI agents" matters as a pillar, not as an edge case.
The short version
Here is the cleanest way to separate the main OpenAI surfaces.
| Surface | What it is for | Main question |
|---|---|---|
| OAI-SearchBot | Search visibility in ChatGPT search features | Do I want this site surfaced in ChatGPT search answers? |
| GPTBot | Training collection | Do I allow this content to be used in training generative AI foundation models? |
| ChatGPT-User | User-triggered visits from ChatGPT or Custom GPTs | Do I want this kind of on-demand retrieval, knowing it is not normal automatic crawl? |
| ChatGPT agent | Signed and allowlistable agent traffic | Is this now an edge, identity, or runtime-permissions problem rather than a plain crawl-policy problem? |
That distinction is the whole article.
What OAI-SearchBot actually controls
OAI-SearchBot is the OpenAI surface that matters for search visibility.
Its job is not training. Its job is to surface websites in ChatGPT search features.
That means it is the OpenAI control you care about when the main business goal is:
- appearing in ChatGPT search answers;
- being surfaced, cited, and linked in ChatGPT search features;
- keeping search-oriented discoverability open.
This is also where a lot of teams accidentally damage themselves.
If you block OAI-SearchBot, your content is not supposed to be shown in ChatGPT search answers. OpenAI also says the site can still appear as a navigational link, which is a much narrower outcome than being properly surfaced in search answers.
So the first rule is simple:
If your concern is discoverability in ChatGPT search, think OAI-SearchBot, not GPTBot.
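As a minimal sketch, keeping that discoverability open looks like the stanza below. The user agent token matches OpenAI's published documentation; the blanket Allow is illustrative rather than a recommendation for every site.

```
# Keep ChatGPT search surfacing open (illustrative policy)
User-agent: OAI-SearchBot
Allow: /
```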
What GPTBot actually controls
GPTBot is the OpenAI surface for training collection.
Its role is to crawl content that may be used in training OpenAI’s generative AI foundation models.
That means GPTBot is the right control when your real question is:
- can this content be used for training;
- do we want to refuse future model-improvement use;
- do we want to separate search visibility from training reuse?
This is the key nuance:
blocking GPTBot is a training posture.
It is not the same thing as opting out of ChatGPT search visibility.
A site can allow OAI-SearchBot while disallowing GPTBot. In fact, OpenAI explicitly documents this as a normal choice.
That is a good example of why flat "allow OpenAI / block OpenAI" thinking is too crude to be useful.
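A hedged sketch of that documented split is below. The agent tokens are OpenAI's published ones; the all-or-nothing paths are examples, and each site should scope its own rules.

```
# Opt out of training collection
User-agent: GPTBot
Disallow: /

# Keep ChatGPT search surfacing open
User-agent: OAI-SearchBot
Allow: /
```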
What ChatGPT-User changes
ChatGPT-User introduces a different class of request.
It is used for certain user actions in ChatGPT and Custom GPTs. When a user asks ChatGPT a question, it may visit a web page using the ChatGPT-User user agent. OpenAI also notes that users may interact with external applications through GPT Actions.
This matters because ChatGPT-User is not described as automatic web crawling.
It is user-triggered access.
That is why OpenAI says robots.txt rules may not apply to these requests.
This is also why ChatGPT-User should not be treated as a Search control. OpenAI explicitly says it is not used to determine whether content may appear in Search. That remains an OAI-SearchBot question.
So if your team is asking:
- why did ChatGPT visit this page on demand;
- why do we still see some retrieval behavior even though training was blocked;
- why is Search visibility still separate from user-triggered visits;
the answer is usually that you are looking at ChatGPT-User, not a classic crawler.
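If you still want to publish a stated preference for ChatGPT-User, a robots.txt stanza can carry it, but, as noted above, OpenAI says these rules may not apply to user-triggered requests, so treat it as a signal rather than an enforcement mechanism. The token below is OpenAI's documented one; the path is purely illustrative.

```
# Stated preference only: may not apply to user-triggered requests,
# and has no bearing on ChatGPT search inclusion (that is OAI-SearchBot)
User-agent: ChatGPT-User
Disallow: /private/   # illustrative path
```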
Where ChatGPT agent fits
There is another OpenAI surface that is easy to miss if you stay trapped inside the robots.txt frame.
OpenAI also documents ChatGPT agent as a signed and allowlistable agent in major edge ecosystems such as Akamai, Cloudflare, and HUMAN.
That means some OpenAI-related traffic should not be modeled primarily as a plain public crawler rule.
It should be modeled as a verified identity and infrastructure policy problem.
That kind of control lives in places like:
- CDN allowlisting;
- WAF rules;
- trusted agent directories;
- runtime permissions and session-aware policy.
This is exactly why "Robots.txt vs signed agent allowlisting" belongs in the same cluster as the OpenAI crawler breakdown.
Which OpenAI control should you use?
Use the following decision path.
Goal: appear in ChatGPT search features
Keep OAI-SearchBot allowed.
If your site is blocked there, you should not expect normal inclusion in ChatGPT search answers.
Goal: refuse training while keeping search visibility
Disallow GPTBot, but do not block OAI-SearchBot unless you also intend to lose normal ChatGPT search surfacing.
Goal: reason clearly about user-triggered visits
Treat ChatGPT-User as a separate class from search and training.
Do not assume that blocking a training crawler answers the user-triggered retrieval question.
Goal: allow a trusted agent through controlled infrastructure
That is not mainly a robots.txt problem.
Treat it as an edge verification and allowlisting problem.
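Putting the robots.txt-addressable parts of that decision path together, a composite policy might look like the sketch below. The agent tokens are OpenAI's documented ones, the specific choices and paths are examples only, and the ChatGPT agent case is deliberately absent because it lives at the edge, not in robots.txt.

```
# Goal: appear in ChatGPT search features
User-agent: OAI-SearchBot
Allow: /

# Goal: refuse training while keeping search visibility
User-agent: GPTBot
Disallow: /

# Goal: publish a preference for user-triggered visits
# (may not apply to these requests; not a Search control)
User-agent: ChatGPT-User
Disallow: /drafts/   # illustrative path

# ChatGPT agent: handled via edge verification and allowlisting, not here
```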
Three common mistakes
1. Blocking GPTBot and thinking Search visibility is gone
It is not the Search control.
If the real issue is ChatGPT search visibility, OAI-SearchBot is the relevant surface.
2. Blocking OAI-SearchBot and then wondering why citation visibility drops
That is the expected outcome.
Search surfacing and training posture are not the same thing.
3. Treating ChatGPT-User and ChatGPT agent as the same operational class
They are not.
ChatGPT-User is about user-triggered visits. ChatGPT agent belongs to a signed or allowlistable agent model at the edge.
Where Better Robots.txt fits
Better Robots.txt helps on the publishing and governance side of this problem.
It helps site owners:
- separate OAI-SearchBot from GPTBot;
- avoid mixing discoverability decisions with training decisions;
- coordinate robots.txt with AI usage signals and llms.txt;
- keep the published policy clearer across several machine-readable surfaces.
What it does not do is turn signed-agent access into a simple checkbox problem.
When the real question becomes verified identity, allowlisting, runtime permissions, or edge enforcement, you are outside the core role of the plugin and into infrastructure territory.
The correct mental model
The safest OpenAI mental model now is this:
- OAI-SearchBot = search visibility
- GPTBot = training collection
- ChatGPT-User = user-triggered retrieval
- ChatGPT agent = signed or allowlistable agent traffic
The clearer that separation is in your policy, the less likely you are to block the wrong thing for the wrong reason.