This month, USA Today published a great report that revealed how U.S. Immigration and Customs Enforcement delayed releasing key information about the effects of its detention policies. The authors used the Internet Archive’s Wayback Machine to compile and analyze detention statistics from ICE and track changes at the agency under Trump. The story is one of countless examples of how the Wayback Machine, which crawls and stores websites, has helped preserve information for the public good. According to Wayback Machine director Mark Graham, it was also “kind of ironic.”
USA Today Co., the publishing conglomerate formerly known as Gannett, which operates both its eponymous newspaper and more than 200 additional media outlets, prohibits the Wayback Machine from archiving its work. “They can get their history research together because there’s a Wayback Machine. At the same time, they’re blocking access,” Graham says.
Many other major journalism organizations have also recently moved to restrict the Wayback Machine from archiving their stories, including The New York Times. According to an analysis by artificial intelligence detection startup Originality AI, 23 major news outlets are currently blocking ia_archiverbot, a web crawler widely used by the Internet Archive in its Wayback project. So does the social media platform Reddit. Other outlets restrict the project in different ways: The Guardian does not block the crawler, but it excludes its content from the Internet Archive API and filters its articles out of the Wayback Machine, making it harder for ordinary people to access archived versions of them.
USA Today Co. spokesperson Lark-Marie Anton stressed that “these efforts are not intended to specifically block the Internet Archive,” but are part of the company’s broader effort to block all scraping bots. Robert Hahn, The Guardian’s director of business and licensing, says he spoke with the Internet Archive about “concerns about the potential misuse by AI companies of content sets crawled for conservation purposes.”
Now individual reporters are bucking the trend. This week, advocacy organizations including the Electronic Frontier Foundation and Fight for the Future rallied journalists to the Wayback Machine’s cause. The coalition collected more than 100 signatures from working journalists who appreciate the value of the tool and presented a letter of support for the Internet Archive. Signatories range from TV star Rachel Maddow to freelance reporters such as Spitfire News’ Kat Tenbarge and User Mag’s Taylor Lorenz. “In generations past, journalists turned to the physical archives of their local newspaper or local public library to access historical accounts and follow the threads of the present back to history,” the letter reads. “With many newspapers closed and no clear path for local public libraries to preserve digital-only reporting, the work of preserving journalistic output is increasingly falling to the Internet Archive.”
Laura Flynn, a signatory and supervising podcast producer at The Intercept, says the Internet Archive has been an “indispensable tool” throughout her career, playing a key role in curating and sharing audio clips. Another signatory, Chicago Reader writer Micco Caporale, says the Wayback Machine helps with writing about older bands and cultural figures by providing access to venerable fan sites that would otherwise be lost.
Caporale, who is also a union organizer, says the tool has proven useful in that work as well. “I often use the Wayback Machine in my union organizing work to find old job postings so we know what the company said it was hiring people for compared to the duties it actually assigned, or to see how different positions were retooled at different times,” Caporale says. “These posts also help us track salary fluctuations within the organization over time.”
