Denial
Digital information that is created and verified by humans is the most valuable asset in the AI tech platform wars.
Jeremy Keith is taking a stand on using AI tools like ChatGPT. I admit, I'm using ChatGPT every now and then, and I find it useful. But bots are scraping content from the Wikimedia servers, and others,[1] to collect training data and feed it into large language models (LLMs), and those servers are facing unprecedented loads as a result. Reading about this, one has to realize that big money is externalizing its infrastructure costs onto the public domain to make more money.
Since January 2024, we have seen the bandwidth used for downloading multimedia content grow by 50%. This increase is not coming from human readers, but largely from automated programs that scrape the Wikimedia Commons image catalog of openly licensed images to feed images to AI models. […] This expansion happened largely without sufficient attribution, which is key to drive new users to participate in the movement, and is causing a significant load on the underlying infrastructure that keeps our sites available for everyone.
The content of Wikimedia is free to everyone, but there are infrastructure costs attached to it. How can Wikimedia continue to enable the community while putting boundaries around automatic content consumption, which it needs to do in order to dedicate engineering resources sustainably? To address the question systemically, the Wikimedia Foundation started to draft a plan for Responsible Use of Infrastructure (WE5).
Jeremy, who I find very good at framing a new situation in a few sentences that immediately give you an important angle for looking at it, puts it this way:
The worst of the internet is continuously attacking the best of the internet. This is a distributed denial of service attack on the good parts of the World Wide Web.
If you’re using the products powered by these attacks, you’re part of the problem. Don’t pretend it’s cute to ask ChatGPT for something. Don’t pretend it’s somehow being technologically open-minded to continuously search for nails to hit with the latest “AI” hammers.
If you’re going to use generative tools powered by large language models, don’t pretend you don’t know how your sausage is made.
[1] Please stop externalizing your costs directly into my face, Drew DeVault