This month, the Internet Archive’s Wayback Machine archived its trillionth webpage, and the nonprofit invited its more than 1,200 library partners and 800,000 daily users to join a celebration of the moment. To honor “three decades of safeguarding the world’s online heritage,” the city of San Francisco declared October 22 to be “Internet Archive Day.” The Archive was also recently designated a federal depository library by Sen. Alex Padilla (D-Calif.), who proclaimed the organization a “perfect fit” to expand “access to federal government publications amid an increasingly digital landscape.”
The Internet Archive might sound like a thriving organization, but it only recently emerged from years of bruising copyright battles that threatened to bankrupt the beloved library project. In the end, the fight led to more than 500,000 books being removed from the Archive’s “Open Library.”
“We survived,” Internet Archive founder Brewster Kahle told Ars. “But it wiped out the Library.”
An Internet Archive spokesperson confirmed to Ars that the archive currently faces no major lawsuits and no active threats to its collections. Kahle thinks “the world became stupider” when the Open Library was gutted—but he’s moving forward with new ideas.
Distributed archives seem to be the way forward. It’s much harder to take something down if it’s spread across the globe and not controlled by a single entity.
There are some around. I know of https://annas-archive.org/ at least
It’s also much harder to guarantee preservation with a distributed archive. Example: torrents with 0 seeders.
That’s why you need more people and to spread the word. The more people and devices are dedicated to the archival process, the safer it is.
So 5 times more overhead to guarantee the safety of the data. That is 5x the cost, because it’s not like regular people have servers with lots of memory just sitting at their homes.
That’s the price you pay to ensure archival in the face of adversity
I have mixed feelings. I’m glad they survived the lawsuits, and now they can spend their funding on their actual goals rather than it going towards lawyers.
On the other hand, it’s really sad that they had to delete so much of their archive - over half a million books, and a bunch of recordings from their Great 78 Project (which was archiving 300k+ music albums released between ~1900 and 1950). A lot of the things that can’t be archived are eventually going to become lost media.
I really hope that they didn’t actually delete anything, and only just removed public access.
And open themselves up to massive penalties? That would be beyond stupid.
the judgement did not require they delete the books from their archives, only that they stop lending out digital copies of books fitting specific criteria. that should be obvious, because possession is not copyright infringement; reproduction/distribution is.
in fact, the judgement specifically allows Internet Archive to continue to use those books “for the purpose of accessibility for ‘eligible persons’”.
I wouldn’t think a library/archive retaining data in an offline form would incur penalties, and I feel like preserving books for the future is the opposite of stupid.
I’m 95% sure the settlement with the publishers would have included a clause requiring the Internet Archive to delete all “infringing” material in their possession.
what’s your methodology for that 95% figure? because Internet Archive themselves mention no such clause:
The lawsuit only concerns our book lending program. The injunction clarifies that the Publisher Plaintiffs will notify us of their commercially available books, and the Internet Archive will expeditiously remove them from lending. Additionally, Judge Koeltl also signed an order in favor of the Internet Archive, agreeing with our request that the injunction should only cover books available in electronic format, and not the publishers’ full catalog of books in print.
Because this case was limited to our book lending program, the injunction does not significantly impact our other library services. The Internet Archive may still digitize books for preservation purposes, and may still provide access to our digital collections in a number of ways, including through interlibrary loan and by making accessible formats available to people with qualified print disabilities. We may continue to display “short portions” of books as is consistent with fair use—for example, Wikipedia references (as shown in the image above). The injunction does not affect lending of out-of-print books. And of course, the Internet Archive will still make millions of public domain texts available to the public without restriction.
Preserving is important, sure. But if the settlement required them to delete it and they keep an offline backup and this ever gets out, the settlement is voided and it opens up a world of hurt for them.
This is not a debate about the merits of preservation but about legal repercussions for the Internet Archive.
I didn’t know if it did or didn’t. But since you say that’s the case, that sucks and I hate the publishers even more.




