How Clean ML Is Shattering Data Science’s Glass Ceiling

Matthew Karasick

July 5, 2021

In the days before the Internet, brands had access to drastically less data about... well, just about everything. Intelligence about customer intent and behavior typically came from self-reported surveys and focus groups conducted and analyzed by marketing analysts.

The web exponentially increased the amount of data available to organizations and marketers. In 2008, D.J. Patil and Jeff Hammerbacher were leading data and analytics efforts at LinkedIn and Facebook, respectively, when they coined a new term to describe what they were doing on a daily basis: data science.

Within a few years, everyone, across every sector and company function, was talking about Big Data, and its volume, velocity and potential. Industry pundits rightly convinced marketers that they were sitting on a vast pool of consumer data that, if tapped, could readily power better outcomes and even new business models through better analytics and data-driven execution.

Corporations set to work building teams of data scientists who worked across the organization to add data-driven intelligence to Sales, Marketing, HR, Operations, and beyond.

Brands, and the technology and data providers who make up their “stacks”, have steadily increased their investment in analytics and data science disciplines, and are markedly more sophisticated than they were just a handful of years ago.

With access to the wide-open pipes of the ad tech ecosystem, the results are impressive: data is captured, parsed and activated within milliseconds of its creation. It is widely understood that a click on a product page anywhere will likely shape what you see next, starting within milliseconds of the click. If you’re like me, you have friends who swear that they see ads for things based on words they have spoken (“my devices are listening…”).

Big Data Hits a Data Ceiling

With changes to browsers, regulations, and more, access to second- and third-party data has recently become far more restricted. Data collaboration now requires far more intentionality, and clean room software has emerged as a viable way to enable it at scale by giving data owners fine-grained control over how their data is accessed and used.

Using clean rooms, endemic publishers can let strategic advertising partners use their data for measurement without fear of their audience leaking and being activated without their involvement. Retailers can allow their CPG partners to use transaction data to inform optimization opportunities. Auto OEMs and dealer groups are pushing past friction that has existed in their three-tier system for decades.

With the momentum in data collaboration that clean rooms have paved the way for, savvy teams quickly landed on a question: if we can find ways to join distributed data, can we also find a way to run my machine learning model against your dataset for inference or predictions, without either of us ever having to share or ship our respective assets to the other?

CleanML is the natural evolution of clean rooms: two (or more) parties each bring distributed raw materials for machine learning/AI, such as a model, model-training code, or dataset(s), with each partner’s assets remaining safe and protected in its own clean room. CleanML then creates a temporary, neutral compute environment in which the assets are joined to produce output, which is written to one or both of the partners’ mutually agreed-upon clean room(s). The compute environment then quickly evaporates, and the only remaining artifact is the desired output.
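To make that flow concrete, here is a minimal, purely illustrative Python sketch. It is not any vendor's API: the names CleanRoom, ephemeral_compute, and run_clean_inference are hypothetical, the "model" is a toy function, and real clean rooms enforce access with infrastructure and policy rather than in-process dictionaries. It only shows the shape of the idea: assets are brought together in a short-lived workspace, the output lands in the agreed clean room, and the workspace is wiped.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class CleanRoom:
    """Hypothetical stand-in for one partner's clean room."""
    owner: str
    assets: Dict[str, object] = field(default_factory=dict)
    outputs: Dict[str, object] = field(default_factory=dict)


@contextmanager
def ephemeral_compute():
    """Hypothetical neutral workspace that is emptied when the block exits."""
    workspace: Dict[str, object] = {}
    try:
        yield workspace
    finally:
        workspace.clear()  # the environment "evaporates"


def run_clean_inference(model_room, model_key, data_room, data_key,
                        output_room, output_key):
    """Join one partner's model with another's data, keeping only the output."""
    with ephemeral_compute() as ws:
        ws["model"] = model_room.assets[model_key]
        ws["rows"] = data_room.assets[data_key]
        scores = [ws["model"](row) for row in ws["rows"]]
        # Only the agreed-upon output is written to the agreed-upon clean room.
        output_room.outputs[output_key] = scores


# Toy example: a brand's model scores a retailer's shopper records.
brand = CleanRoom("brand", assets={
    "propensity_model": lambda row: 1 if row["visits"] >= 5 else 0,
})
retailer = CleanRoom("retailer", assets={
    "shoppers": [{"visits": 3}, {"visits": 7}],
})

run_clean_inference(brand, "propensity_model",
                    retailer, "shoppers",
                    retailer, "scores")
print(retailer.outputs["scores"])  # prints [0, 1]
```

In this sketch the brand never receives the shopper records and the retailer never receives the model; each side ends up with only what was mutually agreed, here a list of scores in the retailer's clean room.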

With CleanML, data science teams are now the proverbial kids in the candy store. Ask 100 data scientists which they’d rather have, better data or better algorithms, and somewhere between 98 and 100 of them will give the same answer: “better data”. These teams are realizing that they can quickly look beyond their own four walls and start thinking about who has potentially valuable data (or models) and would be a candidate for collaboration.

Clean ML Use Cases

CleanML is now being used across a number of verticals and use cases. CPG companies and their retail partners are building new propensity models, in which CPG models run on retail partners’ data, to drive both advertising and distribution. Brands can use data enrichment vendors without their data ever leaving its home. R&D departments are using confidential product data from distribution partners to inform new product development. Partners in highly regulated industries such as Financial Services and Healthcare are shattering ceilings that once seemed immovable due to regulation and trust.

CleanML opens the door not just to an incremental new tactic, but to a whole path of innovation for data science teams and their partners alike. Roadmaps can and should now imagine working with datasets or models that could come from anywhere. And while it is still early on this arc of innovation, where models (or other types of executables) and data can be joined, something tells me that we can hardly imagine how smart enterprises will use this technology to do amazing things.


Matthew Karasick

Matt is passionate about finding ways to use data to achieve the wins that data-powered technology can create between companies and consumers. Matt has spent his career helping companies do more with data. He has held product leadership positions at DoubleClick, Trilogy, Acerno, Akamai, and most recently at Indeed. After working closely together with Matt Kilmartin at Akamai, Matt (Karasick) worked with the Krux team as a consultant, where he helped create Krux for Marketers. Matt believes that, when done correctly and with sustainable mutual value as the measuring stick, interests between consumers and companies are always aligned.
