Recent reporting by Nieman Lab describes how some major news organizations—including The Guardian, The New York Times, and Reddit—are limiting or blocking access to their content in the Internet Archive’s Wayback Machine. As stated in the article, these organizations are blocking access largely out of concern that generative AI companies are using the Wayback Machine as a backdoor for large-scale scraping.

These concerns are understandable, but unfounded. The Wayback Machine is not intended to be a backdoor for large-scale commercial scraping and, like others on the web today, we expend significant time and effort working to prevent such abuse. Whatever legitimate concerns people may have about generative AI, libraries are not the problem, and blocking access to web archives is not the solution; doing so risks serious harm to the public record.

  • 🌞 Alexander Daychilde 🌞@lemmy.world
    English · 4 upvotes · 6 days ago

    Knowledge rot is already a problem and has been for years: you try to follow some links only to find they’re dead, or that people deleted their content. There are the anecdotes of finding some old problem where someone just said “I figured it out”. Sure, archival won’t fix that specific example, but the principle is there: we lose so much information.

    It would be nice if we had a government that worked for We the People and made information archival mandatory, like the Library of Congress already does with printed materials.

  • Cherry@piefed.social
    English · 1 upvote · 6 days ago

    The core product at the bottom of this is information. People feel information should be free. Corps wanna charge for it. Govs and influentials wanna use it to push an agenda.

    I kinda see it like the library. There’s info and there’s the library. Traditionally these were made open to walk in and the library system allowed you access to all the info, peacefully, privately.

    Now it’s like the library has rooms. Some are locked. Some are owned by gob$bites, some are owned by the greedy. They all regularly steal the open info, change it, hide it, copy it; they hijack the librarians and paint the walls. As they have access to the librarian, they can see what I look at.

    The library is no fun no more.

  • FaceDeer@fedia.io
    1 upvote · 6 days ago

    It’s been darkly amusing watching the various social media hive-minds that used to be all for the concept of “information wanting to be free” suddenly discovering that they hate AI more than they love freedom of information.

    • CheeseNoodle@lemmy.world
      English · 3 upvotes · 6 days ago

      I mean the end goal of AI is to monetize access to information while obfuscating the pre-existing free information so there’s no real conflict there?

    • tomalley8342@lemmy.world
      English · 1 upvote · 6 days ago

      There is no conflict here, the strain of “serving” clankers denies resources to real people that actually need to access that information.