• panda_abyss@lemmy.ca
    2 days ago

    What is the limit of transformative works here?

    If I build an autoencoder neural network and train it to memorize and return content exactly, that’s probably not allowed. But if I train it to rephrase the news, is that allowed?

    • mindbleach@sh.itjust.works
      2 days ago

      Knowing facts seems hard to copyright.

      But at the same time, you could not run a newspaper through a thesaurus, word by word, and claim that’s a wholly original publication.

      The general questions for protection seem to be: is the use minimal, and does it serve the same function? One news site copying another is not okay. But a comedy site riffing on the news, especially on that news site in particular, is clearly distinct. And one news site mentioning the conclusion of another news site’s article seems fine, even if it does not explicitly reference the source. It would tend to cite the source for the sake of building audience trust… but it could just treat the conclusion as unsurprising.

      If you reproduce long-form content verbatim, it doesn’t really matter how many hoops you jumped through to get there.

      Building a chatbot using the entire corpus of available English text doesn’t sound minimal - but every individual work makes a minuscule contribution. If a one-gigabyte model was trained on one million books, the takeaway from each book averages about a kilobyte, roughly the size of this comment.
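
      For anyone who wants to check the arithmetic, here is a rough sketch. It assumes a decimal gigabyte and that the model's capacity is spread evenly across every book, which real training does not guarantee; the function name is just made up for illustration.

```python
# Back-of-the-envelope: average model capacity available per training book.
# Assumptions (illustrative only): decimal gigabyte, capacity split evenly.

MODEL_SIZE_BYTES = 1_000_000_000  # one gigabyte
NUM_BOOKS = 1_000_000             # one million books


def bytes_per_work(model_size_bytes: int, num_works: int) -> float:
    """Return the average number of model bytes per training work."""
    if num_works <= 0:
        raise ValueError("num_works must be positive")
    return model_size_bytes / num_works


share = bytes_per_work(MODEL_SIZE_BYTES, NUM_BOOKS)
print(f"{share:.0f} bytes per book")  # 1000 bytes per book
```

      That is an average, not a cap on any single book. A model can still memorize particular passages far beyond its "share", which is why verbatim reproduction remains the separate problem mentioned above.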