If you use ChatGPT to get answers or generate content, you will notice that it does not cite the sources of its information or provide any references.
UPDATE: On February 7, 2023, during the introduction of AI-powered Bing, Microsoft indicated that it will be citing sources of AI-generated content. That is some good news.
Microsoft CEO Satya Nadella also clarified in an interview with The Verge that Bing will link to the original sources. Otherwise, he said, its bots will not be allowed to crawl sites, since they would not be driving traffic to them.
ChatGPT on citing sources of its data
We asked ChatGPT why it does not cite the sources of its data, and here is its response:
“OpenAI’s ChatGPT is a machine learning model that was trained on a massive amount of text data from the internet. The specific sources of this text data are not retained in the model because the primary focus during the training process was to create a model that can generate coherent and informative text, rather than tracking the sources of the data used to train it. Additionally, keeping track of the sources for all the data used in the training process would require a significant amount of computational and storage resources.”
According to ChatGPT, it does not cite its sources because it was trained on massive amounts of text, the sources of that text were not retained in the model, and tracking them during training would have required significant computational and storage resources.
Content creators’ concerns about ChatGPT
The question on every content creator’s mind is: what happens if content creators stop producing content? Where will AI tools get their training data?
If content creators let AI tools use their content without citation, how will they benefit when users no longer visit their sites?
Are we reaching a point where content creators will be forced to block web crawlers and enable subscriptions to finance their work?
We are likely to see laws passed in various jurisdictions that restrict AI tools from using content without paying for it.