Discussion about this post

Charles Pick

LLMs have many uses, but in my view the most important is probably retrieving and using information from a broader array of copyrighted sources than just web pages. Many web pages infringe on copyright, but in a somewhat occluded way that is hard for rights holders to verify on its face. With LLMs, the training set can include a lot of copyrighted material (original works of authorship fixed in a tangible medium of expression, within the statutory period of protection), and the model can then produce a nonoriginal work of non-authorship as a pastiche of its training data.

Using LLMs to clone authors doesn't really make sense because of how they work. It asks the system to do something that it's not really set up to do. What it is good at is pulling information from many sources (both restricted and non-restricted) and then producing non-infringing derivative works very quickly and at reasonable cost.

Betsy

"I tell authors to imagine that years from now the only readers of your book will be your grandchildren. Write the book that you would want them to read. Don’t make concessions to agents, editors, or marketing departments that would detract from the book that you want to give to your grandchildren." THIS.
