Google proposes new 'opt-out' solution to stop AI companies from scraping data

In the age of generative AI, companies that mine data to train their machine-learning models, and that reproduce copyrighted content from artists and publishers, have raised widespread concern. Now, in an effort to address the issue, Google has proposed a new “opt-out” option that would allow publishers to prevent their works from being used to train AI.

This solution comes after the Australian government took a tough stance on “high-risk” AI applications, which involved imposing restrictions on generating deepfakes, spreading disinformation, and perpetuating discrimination. Although Google has maintained its position that using data to train AI models is fair use, the new solution could ease the ongoing dispute between publishers and AI companies over the use of copyrighted material.

“The general rule is that you need millions of data points to be able to produce useful outcomes … which means that there’s going to be copying, which is prima facie a breach of a whole lot of people’s copyright,” said Dr Kayleen Manwaring, a senior lecturer at UNSW Law and Justice.

Google’s solution

While specific details of the implementation remain unclear, Google’s proposal for the opt-out system closely mirrors the robots.txt protocol, which websites commonly use to tell search engine crawlers to stay away from specific sections of their content.
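
For readers unfamiliar with robots.txt, the sketch below shows how the existing protocol lets a site turn away one crawler while admitting others. It illustrates robots.txt only, not Google’s proposal, whose details have not been published. The crawler names “ExampleAIBot” and “ExampleSearchBot”, the example.com URLs, and the rules themselves are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the crawler names and paths are invented
# for illustration and do not come from Google's proposal.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The hypothetical AI crawler is barred from the whole site.
ai_allowed = parser.can_fetch(
    "ExampleAIBot", "https://example.com/articles/story.html"
)

# Any other crawler may read articles, but not the /private/ section.
search_allowed = parser.can_fetch(
    "ExampleSearchBot", "https://example.com/articles/story.html"
)
private_allowed = parser.can_fetch(
    "ExampleSearchBot", "https://example.com/private/draft.html"
)

print(ai_allowed, search_allowed, private_allowed)
```

Note that robots.txt is a voluntary convention: it works only because well-behaved crawlers choose to check it before fetching a page.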

“We believe everyone benefits from a vibrant content ecosystem. Key to that is web publishers having choice and control over their content and opportunities to derive value from participating in the web ecosystem,” reads Google’s blog post.

However, Google’s own approach to data scraping has also raised some eyebrows in the industry, because the company recently updated its privacy policy to allow it to use user-generated content for AI development. Nevertheless, the introduction of the opt-out mechanism represents a paradigm shift in copyright dynamics, as AI companies, including Google, would not be able to extract data from publishers or musicians without obtaining explicit consent from the rightful owner.