Openai/6942038b-8f10-8003-bf4f-7895677c2166

From freem


This conversation was exported from ChatGPT[1]

gpt-5-2

User: What do you think of C's comment?

What do you think of C's comment? Does it demonstrate good reading comprehension of the two-comment chain it's in reply to? How might you respond?

---

A$ It's funny, because if there wasn't so much ai hype about LLM, they'd be great realisation of old concept of assisted knowledge/library at hand reach. Something written about 30 or even 40 years ago, about internet (which didn't deliver back then).

B$ I wonder how much of a blocker copyright is. Like, could we have a massive RAG library, but no one is willing to try because they know they'll be sued like Google Books and Internet Archive were

C$ All of the big companies have essentially everything ever written. They may be judicious about what they release but if just more training data unlocked AGI, it would have shipped by now.

Assistant: Here’s how I read it.

Here’s how I read it.

Does C show good reading comprehension of A → B?

Only partially, and arguably not very well.

  • A is making a conceptual point: LLMs as an assisted knowledge/library idea that’s been imagined for decades.
  • B responds directly and appropriately by narrowing in on a concrete obstacle to that vision: copyright, especially for something like a massive RAG-backed library.

C, however, shifts the frame. Instead of engaging with copyright as a blocker to retrieval or release, they pivot to:

  • training data scale,
  • AGI timelines,
  • and what big companies already possess internally.

That’s a related topic, but it’s not the same one. B is talking about legal constraints on deployment and access; C is talking about whether more data would materially improve intelligence. Those are separate questions.

So C’s comment:

  • does not directly answer B’s concern, and
  • implicitly reframes the discussion without acknowledging the shift.

That’s usually a sign of weak reading comprehension or a conversational habit of steering toward a preferred talking point.

What’s specifically off?

The key mismatch is this sentence:

“if just more training data unlocked AGI, it would have shipped by now.”

No one claimed that:

  • B was talking about RAG libraries and access, not AGI,
  • copyright issues apply much more strongly to retrieval, quotation, and redistribution than to opaque training.

C is responding to an argument that wasn’t made.

How might you respond?

A good response would gently but clearly realign the discussion:

Option 1: Clarifying and redirecting

I think that’s a slightly different issue. B isn’t saying more training data would unlock AGI — they’re asking whether copyright blocks retrieval-based systems or public-facing knowledge libraries, even if the models themselves already exist.

Option 2: Pointing out the category error

That seems orthogonal. Companies having access to copyrighted text for training doesn’t mean they can legally ship a RAG-style library that retrieves or quotes it. That’s exactly where Google Books / IA ran into trouble.

Option 3: Charitable but corrective

True that raw data scale probably isn’t the bottleneck for intelligence anymore. But the copyright question is more about what kinds of interfaces and products are legally viable, not about whether models could be smarter internally.

All three approaches:

  • acknowledge C’s point so it doesn’t feel dismissed,
  • but make it clear that C answered a different question than the one asked.

If you wanted to be sharper, you could say:

That doesn’t really address the copyright issue being raised.

But whether you go soft or sharp depends on the norms of the space you’re in.

References

  1. "C's comment analysis". ChatGPT. Retrieved 2025-12-17.