Pretty soon, curated data will be traded as a commodity along with goods and services
Thanks to the steady progress described by Moore’s Law, we’re moving into an era in which data itself becomes a traded commodity. Before we look at this megatrend and how it will almost certainly affect nearly everything and everyone on Earth in some form, let’s take a brief look at where we’ve been:
Online marketplaces for goods and services
We know them and, in some cases, love them: Amazon, eBay, E*TRADE, Match.com (why not?). Just about everything conceivable has found its way online and into our Internet-enabled handheld devices. Add to that the emergence of hosted services à la Airbnb, Uber and so many more. In all cases, the ‘datafication’ of what, who, where and when has brought an unprecedented level of ease and speed and, most interestingly, the expectation of both, to nearly everyone with a smartphone, tablet or laptop. To qualify as a marketplace, a company must facilitate the buying and selling of goods and/or services through some form of curation, and it generally earns its revenue by taking a percentage of each transaction, charging subscription fees, or a combination of the two.
Early data marketplaces
There are some. Without a doubt, the most prominent and successful “original” data marketplace is Bloomberg: in 2014 the company reportedly generated $9 billion in revenue, primarily from financial data sales. In most respects, Bloomberg meets the “marketplace test”: it brings in data from a variety of sources and suppliers (some of it generated in-house), curates it, and makes it available to customers who pay per-transaction and/or subscription fees.
Now, big data marketplaces
As Moore’s Law and its counterparts in storage, low-cost bandwidth and other core elements bring us headlong into generating more data each year than was generated in the entire prior history of computing, we’re now talking about big data.
It’s logical that, with so much data moving around and no sign of it slowing, opportunities are emerging for whole new marketplaces that can bring in, organize and make the data available for third-party consumption. The sheer volume of data calls out for curation, and just as the now-household names have done for goods, services and intangibles such as financial information, data marketplaces will doubtless rise and gain global acceptance in both vertical and horizontal categories.
Ideas abound, but let’s look at a few of the more interesting areas. We know that eventually most healthcare data will be online, and that simply placing it all into databases with Web front ends won’t cut it, for many obvious reasons. But that same data, when curated by a trusted provider that filters what can be given to whom, will make a lot of the digging around much less onerous. For example, a state or federal agency might have mandated access to full records, whereas someone doing research for a pharmaceutical company or device manufacturer would be allowed only selected information. Companies and governmental entities will pay for that curation and convenience.
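To make the “what can be given to whom” idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the role names, the record fields and the `filter_record` helper are hypothetical, and a real healthcare system would need far more than field-level filtering (consent, de-identification, auditing and so on).

```python
# Hypothetical policy: which record fields each kind of buyer may receive.
ACCESS_POLICY = {
    "regulator": {"patient_id", "age", "diagnosis", "treatment"},
    "researcher": {"age", "diagnosis"},
}


def filter_record(record: dict, role: str) -> dict:
    """Return only the fields the given role is allowed to receive."""
    allowed = ACCESS_POLICY.get(role, set())  # unknown roles get nothing
    return {key: value for key, value in record.items() if key in allowed}


# Illustrative record; the values are made up.
record = {
    "patient_id": "P-001",
    "age": 54,
    "diagnosis": "asthma",
    "treatment": "inhaler",
}

print(filter_record(record, "regulator"))   # full record
print(filter_record(record, "researcher"))  # {'age': 54, 'diagnosis': 'asthma'}
```

The point of the sketch is only that the curator, not the buyer, decides which view of the data each party gets.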
This is where the term ‘marketplace’ is apt — imagine the time before supermarkets (or Amazon for that matter), when you would have to visit many sources to get your household goods and meals together.
In the digital world, convenience is The Master of All. A data marketplace might be formed around online video. We really haven’t seen that yet in a grand way, with many suppliers vending to many buyers transactionally through a system not owned or initiated by the content suppliers, but it’s plausible that such a thing could emerge. Whether the incumbents or newcomers own that space remains to be seen.
Again, the hallmarks of a data marketplace are:
- data is gathered from a multiplicity of suppliers
- data is curated (meaning, at minimum, ensuring quality and tagging with provenance and other factors appropriate to a given data type)
- data is indexed to make it straightforward for buyers to find what they require, and
- data is vended with a pricing/billing methodology.
These criteria filter out the big search providers, social media sites and others who don’t vend datasets from third-party suppliers in the form of a merchant exchange.
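The four hallmarks above can be sketched as a small data structure. This is an illustrative toy, not a description of any real marketplace: the `Dataset` and `Marketplace` classes, their field names, and the flat per-purchase pricing are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    name: str
    supplier: str              # provenance: who supplied the data
    tags: set[str]             # index terms buyers can search on
    price_usd: float           # flat price per purchase (assumed pricing model)
    quality_checked: bool = False


@dataclass
class Marketplace:
    catalog: dict[str, Dataset] = field(default_factory=dict)
    ledger: list[tuple[str, str, float]] = field(default_factory=list)

    def ingest(self, dataset: Dataset) -> None:
        """Gather: accept a dataset from any supplier."""
        self.catalog[dataset.name] = dataset

    def curate(self, name: str) -> None:
        """Curate: a stand-in for real quality checks; requires provenance."""
        dataset = self.catalog[name]
        if not dataset.supplier:
            raise ValueError("dataset has no recorded supplier")
        dataset.quality_checked = True

    def search(self, tag: str) -> list[str]:
        """Index: return the names of curated datasets carrying the tag."""
        return sorted(
            d.name for d in self.catalog.values()
            if d.quality_checked and tag in d.tags
        )

    def purchase(self, buyer: str, name: str) -> float:
        """Vend: record the sale and return the amount billed."""
        dataset = self.catalog[name]
        if not dataset.quality_checked:
            raise ValueError("dataset has not been curated")
        self.ledger.append((buyer, name, dataset.price_usd))
        return dataset.price_usd


# Usage with made-up suppliers and datasets.
market = Marketplace()
market.ingest(Dataset("city-traffic", "Supplier A", {"transport", "iot"}, 250.0))
market.ingest(Dataset("retail-footfall", "Supplier B", {"retail", "iot"}, 400.0))
market.curate("city-traffic")

found = market.search("iot")                      # only curated datasets appear
billed = market.purchase("Buyer X", "city-traffic")
```

Note that a search provider or social media site would fail this sketch at the first and last steps: it neither takes in datasets from many third-party suppliers nor sells them on through a ledger.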
What a data marketplace isn’t
Often data gets confused with analytics. When one is considering a marketplace, the analytics aren’t necessarily done within the marketplace itself, but could be.
For example, Bloomberg, as it has grown, introduced various “products” which are based upon analyzing various input data sources and “packaging” them as turnkey suites of statistics and other information useful to its customers in the financial services industry.
Note that some amount of analysis is required to ensure data quality and other important elements of curation. But in broad terms, the data marketplace is where the raw data, albeit with many operations performed on it to allow indexing and therefore searching, is input, stored and vended.
A fun example is a traditional brick-and-mortar supermarket. You can go to the baking aisle and purchase a bag of flour and the other ingredients needed to make bread. Or you can go to the bakery section and buy a loaf of bread. Or you can go to the deli section and get yourself a sandwich. At each step, value is added to the wheat, and the price per gram of that wheat rises accordingly, along with the cost of the other elements introduced in ‘processing’ it into a finished product.
This is a pretty apt analogy to the digital version, whereby you can buy raw but curated data, or a pre-filtered/packaged version, or even a conglomerated “product” built from various data components.
When will we see true data marketplaces?
With the belief that data marketplaces will before long be integrated into normal business operations, pioneering efforts are underway to create the first systems for “data sellers” to vend to “data buyers.” The advent of IoT is likely to bring this forward rapidly, since data will be coming from millions of sources and will have many potential buyers. Watch this space.