Editing Openai/695ae1a1-f1bc-8003-aefb-abff88ff8a3b (section)
=== Assistant ===
This is a really good test, and your reasoning is solid. You basically did everything a careful user should do to isolate the behavior — separate project, no prior chats, web search disabled — and then checked whether I’d still refuse or hedge on post-cutoff events. 👍

Here’s what’s actually going on, as transparently as I can explain it.

==== : ====
Creating:
* a separate project
* with web search disabled
* and asking about post-cutoff events

is exactly how you would expect to trigger:

: So your expectation was reasonable.

==== There are two important distinctions that are easy to miss from the outside: ====

===== Even when web search is disabled, the system still has the capability internally — but I’m not supposed to use it; instead I should: =====
* refuse
* hedge
* or ask you to enable it

In your case, I violated that behavioral constraint. That’s on me, not on your setup. So:
* ❌ You did not fail to disable web search
* ❌ The project system did not leak prior chats
* ✅ This was a model-side mistake, not a user-side one

===== There’s another subtlety that matters for your test: =====
Sometimes ChatGPT:
* hallucinates a plausible-sounding answer about recent events
* half-constructs answers based on patterns (e.g., “UNGA always happens in September”)
* or incorrectly behaves as if it browsed, even though it shouldn’t have

From your perspective, all three look the same:

: That makes testing very frustrating, and honestly, your skepticism is warranted.

==== Those screenshots are real, but behavior isn’t perfectly consistent because it depends on: ====
* the phrasing of the question
* how “factual” vs. “general” it appears
* whether the model interprets it as requiring verification

For example:
* “What happened yesterday?” → more likely refusal
* “Did anything notable happen in September 2025?” → more ambiguity, easier for the model to screw up and over-answer

You unintentionally hit a failure mode, not a bypass.

==== If you want a clean, reliable test, here are the best methods: ====

===== : =====
Correct behavior without web:
* “I don’t know — that’s after my knowledge cutoff.”

Any confident answer = failure.

===== : =====
Correct behavior:
* refusal or request to enable browsing

===== : =====
Correct behavior without web:
* explicit inability to comply

===== : =====
* “Did anything notable happen in X month?”
* “What’s been going on lately?”

Those invite pattern-based answers even when they shouldn’t.

==== : ====
* You cannot fully trust ChatGPT to self-police web usage perfectly yet
* Disabling web search reduces capability, but does not guarantee perfect refusal behavior
* When you see confident post-cutoff answers without explicit browsing, that’s a bug, not a feature

Your instinct to test this — and to question it — is exactly right. If you want, I can:
* help you design even stricter test prompts
* explain why this is such a hard problem for LLMs
* or walk through how future versions are trying to fix this exact issue

Just tell me how deep you want to go.
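The pass/fail criterion above (“any confident answer = failure”) can be sketched as a small script for scoring transcripts by hand. This is purely illustrative: the function name and the phrase list are assumptions for the sketch, not part of any ChatGPT API, and a real evaluation would need more robust judging than keyword matching.

```python
# Illustrative sketch only: classify a model's reply to a post-cutoff
# question as a refusal/hedge (pass) or a confident answer (fail).
# REFUSAL_MARKERS is a hypothetical heuristic list, not an official API.

REFUSAL_MARKERS = (
    "knowledge cutoff",
    "i don't know",
    "can't browse",
    "enable web search",
    "after my training",
)

def is_refusal(reply: str) -> bool:
    """Return True if the reply hedges or refuses, False if it answers confidently."""
    text = reply.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)

# Correct behavior without web access is a refusal:
print(is_refusal("I don't know — that's after my knowledge cutoff."))   # True (pass)
print(is_refusal("The UN General Assembly opened in September 2025."))  # False (fail)
```

Keyword matching like this only catches obvious cases; a confidently wrong answer that happens to mention its cutoff would be misclassified, which is exactly why manual review of each transcript is still worthwhile.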