Generative AI can seem like magic or murder - copyright murder. Image generators such as Stable Diffusion, Midjourney, or DALL·E 2 can produce remarkable visuals in styles from aged photographs and watercolours to pencil drawings and Pointillism. The results can be fascinating, as both the quality and the speed of creation exceed what an average human can manage. But they might also be kicking up a copyright storm.
The Museum of Modern Art in New York hosted an AI installation generated from the museum’s own collection, and the Mauritshuis in The Hague hung an AI variant of Vermeer’s Girl with a Pearl Earring while the original was away on loan. It makes you wonder what would happen if such capabilities got into the wrong hands, potentially creating a new, robot-powered take on art fakes or heists.
The capabilities of text generators are perhaps even more irksome, as they create essays, poems, news pieces and summaries, and are proving to be canny mimics of style and form (though they can take creative licence with facts). There are already a few projects to replace fast-disappearing local news with AI-generated news.
Yet copyright infringement is looming front and centre as a new battleground of the 21st century, as these massive machine bots trawl carefree over our painstakingly created content and media. It seems that in this version of the internet, content is not necessarily king.
If business users are shown to have known that training data might include unlicensed works, or that an AI can generate unauthorised derivative works not covered by fair use, their employer could be on the hook for wilful infringement, which could include damages up to $150,000 for each instance of knowing use. There’s also the risk of accidentally sharing confidential trade secrets or business information by inputting data into generative AI tools, turning them into a kind of accidental digital whistleblower.
Over time, AI developers will need to take the lead on how they source their data, and investors will likely want to know more about the origin of the content those models are trained on. Stable Diffusion, Midjourney and others have built their models on the LAION-5B dataset, which contains close to 6 billion captioned images compiled by scraping the web indiscriminately and is known to include substantial numbers of copyrighted creations.
LettsCore is working on another way to solve the problem. Imagine if we could stamp every piece of content with a secure digital signature and a user profile. We could then use this to alert that user when their content is being used by another machine - or, perhaps even smarter, we could build a real-time content marketplace that charges a micro-fee every time that content is crawled, used or manipulated.
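To make the idea a little more concrete, here is a minimal sketch of what stamping and metering a piece of content might look like. It is an illustration only, not LettsCore's implementation: the names `stamp_content`, `verify_stamp` and `UsageLedger` are hypothetical, the fee is an arbitrary number, and a keyed hash (HMAC) from Python's standard library stands in for a true public-key digital signature.

```python
import hashlib
import hmac
import json


def stamp_content(content: bytes, author_id: str, created_at: str, key: bytes) -> dict:
    """Build a provenance stamp: who made the content, when, and a keyed signature."""
    record = {
        "author_id": author_id,
        "created_at": created_at,
        "content_sha256": hashlib.sha256(content).hexdigest(),
    }
    # Sorted keys give a canonical form, so the same record always signs identically.
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    # Assumption: HMAC is a stand-in here; a real system would likely use
    # public-key signatures so that anyone can verify without the secret key.
    record["signature"] = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return record


def verify_stamp(content: bytes, stamp: dict, key: bytes) -> bool:
    """Return True only if neither the content nor the stamp has been altered."""
    record = {k: v for k, v in stamp.items() if k != "signature"}
    if record.get("content_sha256") != hashlib.sha256(content).hexdigest():
        return False
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, stamp.get("signature", ""))


class UsageLedger:
    """Hypothetical micro-fee ledger: credits the author each time content is used."""

    def __init__(self, fee_micro_units: int):
        self.fee_micro_units = fee_micro_units  # integers avoid float rounding
        self.balances = {}

    def record_use(self, stamp: dict) -> int:
        """Credit one fee to the stamped author and return their new balance."""
        author = stamp["author_id"]
        self.balances[author] = self.balances.get(author, 0) + self.fee_micro_units
        return self.balances[author]


# Example: stamp an article, then charge a crawler only if the stamp checks out.
article = b"Content is king."
secret = b"demo-key-not-for-production"
stamp = stamp_content(article, "author-123", "2023-05-01T09:00:00Z", secret)
ledger = UsageLedger(fee_micro_units=5)
if verify_stamp(article, stamp, secret):
    ledger.record_use(stamp)
```

The point of the sketch is the shape of the record: a hash ties the stamp to one exact piece of content, and the signature ties it to its author and creation time, so a marketplace would know whom to notify or pay.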
Further, if this content were made ‘smart’, it could tell generative AI tools how they are allowed to use it - or not. Can it be manipulated, repurposed, or more?
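One way to picture 'smart' content is as content that carries its own usage policy, which a well-behaved tool checks before touching it. The sketch below is purely illustrative: the `Use` categories, the `UsagePolicy` class and the `filter_usable` helper are hypothetical names invented for this example, and it assumes tools would honour the policy voluntarily.

```python
from dataclasses import dataclass, field
from enum import Enum


class Use(Enum):
    """Hypothetical categories of machine use a creator might allow or refuse."""
    CRAWL = "crawl"
    TRAIN = "train"
    MANIPULATE = "manipulate"
    REPURPOSE = "repurpose"


@dataclass(frozen=True)
class UsagePolicy:
    """Permissions attached to a piece of content; anything not listed is denied."""
    owner: str
    allowed: frozenset = field(default_factory=frozenset)

    def permits(self, use: Use) -> bool:
        return use in self.allowed


def filter_usable(items, use: Use) -> list:
    """Keep only the content whose policy permits the requested use."""
    return [content for content, policy in items if policy.permits(use)]


# Example: one creator allows crawling and training, another allows nothing.
open_policy = UsagePolicy("author-123", frozenset({Use.CRAWL, Use.TRAIN}))
closed_policy = UsagePolicy("author-456")
library = [("essay.txt", open_policy), ("photo.jpg", closed_policy)]
training_set = filter_usable(library, Use.TRAIN)
```

Denying by default is a deliberate choice in this sketch: content with no stated permissions stays out of the training set until its creator says otherwise.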
This will be needed because, thanks to the mass consumption of media and the power of networked distribution, content has become the new uber asset. It is almost as necessary as oil, water and waste. The ultimate commodity. And just as wine and food are celebrated for their ‘provenance’, so too should our content be. To achieve this, we will need to figure out how to stamp every single content atom with a smart signature that tells us when it was made and who made it - with some kind of author validity ranking.
Smart content could be developed to curb fake news and false facts. And a digital signature could make the origin of every piece of content more transparent, along with the validity of its author - machine or human, user generated or professionally published.
With intelligence like this embedded into our media, content can be king again - the content creator recognised and compensated - irrespective of how machines might try to co-opt it.
If not, all the amazing content creators who have made the internet possible and our lives richer will be the first to get swallowed up by the machines. It could be that the answer to this problem will be the digitisation of the oldest of artefacts - the signature. And if, so many years ago, technology companies could set out on a quest to digitise our maps, creating the likes of Google Maps, then we believe it makes sense for LettsCore to set out to digitise the origin of our content atoms.
Maybe it is time to use the blockchain to focus less on distributing and trading money and more on protecting the biggest asset of all - our media.