Reddit restricts access to the Wayback Machine Internet archive

The Internet Archive’s Wayback Machine has become the latest target of Reddit’s tough measures on data access. The company has begun introducing new restrictions on the archive site’s access, which will significantly limit the Wayback Machine’s ability to preserve information from Reddit.

With this change, the Wayback Machine, a project run by the non-profit Internet Archive, will only be able to crawl Reddit’s home page. It will no longer have access to comments, subreddit pages, post details, profiles, or other data.

This move is Reddit’s latest step in its effort to stop artificial intelligence companies from using its data to train large language models without paying licensing fees. It is also a significant departure from the position the company took last year, when it explicitly stated that it would not restrict “good faith participants,” including the Internet Archive. It is unclear exactly what has changed since then, but Reddit appears to believe that AI companies are circumventing its rules by collecting data through the Wayback Machine. We have reached out to the Internet Archive for comment.

Data licensing has become an important line of business for Reddit. The company has signed multimillion-dollar deals with OpenAI and Google, allowing them to use Reddit posts to train their artificial intelligence models. At the same time, Reddit is taking an increasingly tough stance against companies that try to use its data without such agreements. Earlier this year, the company filed a lawsuit against Anthropic, accusing it of collecting data from Reddit without permission for years.
