They do memorize some books. You can test this trivially by asking ChatGPT to produce the first chapter of something in the public domain -- for example A Tale of Two Cities. It may not be word-for-word exact, but it'll be very close.
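If you want to check this yourself, here's a rough sketch of the experiment (assuming the official openai Python client and an API key in the environment; the model name and the prompt are just illustrative choices, not a rigorous methodology):

    # Rough memorization check: ask a model for the opening of a
    # public-domain work and compare it against the known text.
    from difflib import SequenceMatcher
    from openai import OpenAI

    KNOWN_OPENING = ("It was the best of times, it was the worst of times, "
                     "it was the age of wisdom, it was the age of foolishness,")

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    resp = client.chat.completions.create(
        model="gpt-4o",  # assumption: any chat-capable model works here
        messages=[{"role": "user",
                   "content": "Quote the first sentence of A Tale of Two Cities."}],
    )
    generated = resp.choices[0].message.content or ""

    # SequenceMatcher gives a 0..1 similarity; near-verbatim output scores high.
    score = SequenceMatcher(None, KNOWN_OPENING.lower(),
                            generated[:len(KNOWN_OPENING)].lower()).ratio()
    print(f"similarity to the real opening: {score:.2f}")

Anything close to 1.0 means the model is reproducing the text rather than paraphrasing it.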
These academics were able to get multiple LLMs to produce large amounts of text from Harry Potter:

https://arxiv.org/abs/2601.02671
Unfortunately, a settlement doesn't really show you anything definitive about the legality or illegality of something.
It only shows you that the defendant thought it would be better to pay up than to keep being dragged through court, and that the plaintiff preferred some amount of certain money now over some other amount of uncertain money later, or never.
We cannot say with any confidence how the court would have ruled on the legality, had things been allowed to play out without a settlement.
>Also, generating output is what these models are primarily trained for.
Yes, but not for generating illegal output. These models were trained with the intent to generate legal output; the fact that they can generate illegal output is a side effect. That's my point.
If you use AI to generate illegal output, that act is illegal. If you use AI to generate legal output, that act is not illegal. Thus the output is where the legal question lies. From inception up through training, there is clear legal precedent for the existence of AI models.