The Golden Age of Data Owners

8 August 2024

The Promise of AI

Over the past months, generative AI has taken the minds of decision makers and practitioners by storm, and it is finding its way into corporate strategy documents. The AI cornucopia is churning out hundreds upon hundreds of start-ups offering solutions that promise to revolutionise businesses with AI technology. This evolution creates a unique business opportunity to monetise proprietary data, promising a golden age of data owners.

There is no shortage of impressive yet carefully curated demos of solutions powered by AI. However, examples of successful corporate-wide deployments of AI initiatives are few and far between. Few doubt that AI brings the promise of radical improvements in productivity. Even fewer realise that materialising this promise requires ever larger volumes of high-quality data to train AI models. Such data is in short supply, which gives businesses a unique opportunity to monetise proprietary data.

So far, organisations developing foundation models have made extensive use of data publicly available on the Internet. Beyond that, they have used social media, Internet forums and curated content to train models. However, the marginal cost of acquiring additional proprietary data sets rises over time. To mitigate this, researchers theorised about using synthetic data to train and improve models. Indeed, generative AI models create ever more of the content published on the Internet today, and AI start-ups use it to train new model versions. However, recent work published in Nature indicates that this approach leads to a dead end.

AI Model Collapse

In their paper “AI models collapse when trained on recursively generated data”, the authors find that indiscriminate use of model-generated content in training causes irreversible defects in the resulting models. They call this effect AI model collapse. One of the core conclusions the authors highlight is that “the value of data collected about genuine human interactions with systems will be increasingly valuable in the presence of LLM-generated content in data crawled from the Internet.”
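To build some intuition for the effect, here is a minimal toy sketch in Python. It is not the paper's experiment: it assumes a "model" that is nothing more than a Gaussian fitted to a handful of samples, with each new generation fitted only to samples drawn from the previous one. The function name `simulate_collapse` and all parameter values are hypothetical, chosen purely for illustration.

```python
import random
import statistics


def simulate_collapse(generations=1000, sample_size=10, seed=0):
    """Refit a Gaussian on its own samples and track the fitted spread.

    Toy illustration only: a hypothetical stand-in for training each
    model generation exclusively on the previous generation's output.
    """
    rng = random.Random(seed)
    mean, std = 0.0, 1.0  # generation 0 stands in for genuine human data
    spreads = [std]
    for _ in range(generations):
        # "Train" on data drawn only from the previous generation's model.
        samples = [rng.gauss(mean, std) for _ in range(sample_size)]
        mean = statistics.fmean(samples)
        std = statistics.stdev(samples)
        spreads.append(std)
    return spreads


if __name__ == "__main__":
    spreads = simulate_collapse()
    for generation in (0, 10, 100, 1000):
        print(f"generation {generation:4d}: std = {spreads[generation]:.3g}")
```

In this toy setting the fitted spread tends to shrink over many generations, because every refit on a small sample loses a little of the tails of the distribution. That is only a rough analogue of the loss of rare, original information that the paper describes for real models.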

Golden Age of Data Owners

This has several important implications for organisations with large datasets recording human interactions, or any other type of original data. First, don’t give data away for free in exchange for hazy promises of leadership in the AI race. Chances are that access to your data benefits the AI service providers (much) more than it benefits you. Second, make sure to require data protection throughout the data lifecycle: at rest, in transit and, most importantly, in use. CanaryBit’s Confidential Cloud ensures data protection throughout this lifecycle. Likewise, Apple recently announced a similar approach marketed as Private Cloud Compute. Finally, make sure to technologically restrict the scope of data processing to guarantee you maintain control over the data at all times. Confidential Computing is a foundation technology in CanaryBit’s Confidential Cloud that allows fine-grained control of data processing.

Conclusion

In the race towards an AI-powered society, data owners should be mindful of the value of the data they hold. Research shows that data collected about genuine human interactions with systems will be increasingly valuable as LLM-generated content spreads through data crawled from the Internet. Businesses and public administrations should avoid giving away their data in exchange for the dubious benefit of early access to subpar AI services. Instead, they must develop a strong data monetisation strategy and protect their data throughout its lifecycle with solutions such as CanaryBit’s Confidential Cloud.

Get Started!

Explore how Confidential Cloud helps to secure your cloud infrastructure, protect your data from any AI workload and, in turn, enable new business.

 
