With mountains of market data, historical prices, and transaction data stored in disparate systems, securities and investment firms are shifting from a focus on collecting data to extracting value from it.
A December 2019 paper by capital markets consultancy GreySpark Partners examined the potential for buy-side and sell-side firms to transform large quantities of big data into actionable intelligence – producing what is known as ‘smart data’ – through specialized analytics.
The move comes as electronic trading has generated massive data sets across equities, fixed income and currencies. Firms are hiring data scientists and coding analytics to mine this data for trading opportunities or to identify patterns that help lower transaction costs.
In the report, titled “Smart Data Analytics Set to Play Key Role in Reducing Buy Side and Sell Side Trading Costs,” GreySpark predicts that smart data inputs and data analytics will become more significant in the next three-to-five years in terms of client performance analytics, competitive differentiation, and value creation.
“Meaningful analytics can enable an asset manager or trader to understand what happened in the market five seconds ago, what is going to happen next, and crucially why things happened,” said Russell Dinnage, head of the Capital Markets Practice at GreySpark in London, who authored the report with Mosaic Smart Data, an analytics provider.
“Effectively transforming big data through specialized analytics can generate millions of dollars in cost savings as a result of spotting new opportunities,” states the report.
Market Data Spend
Of concern is that asset managers and investment banks spent an estimated $50 billion in 2019 on external market data provided by exchange groups, inter-dealer brokers and vendors such as Bloomberg, FactSet and Refinitiv, to support a broad spectrum of trading activity across all major asset classes, according to the GreySpark report.
Raw data includes market data, historical prices, and end-of-day closing prices, but it also pulls in news and sentiment-related information from Twitter, along with proprietary transaction data, including orders and trades.
Despite the huge spend on market and transaction data, the report notes that data asset management and value creation through analytics remain underdeveloped across a firm’s business.
Defining Smart Data
But what exactly is smart data? And how does it differ from big data? It turns out that smart data extracts value from big data, and entails reformatting, normalization, and other data management activities.
“Smart data [outputs] happen when a firm transforms a very large or big data set into another data set that is continuously updated in real time with new, live data points, all of which are coming from multiple sources and are being systematically standardized as they flow into the large data set,” said Dinnage in an interview. “All the data sets are being automatically standardized so they are normalized, or uniform, with the data in the rest of the data set.”
In the case of Tier I to Tier III investment banks, smart data is “curated” by combining firm-specific, client-proprietary, historical order and pricing data with externally sourced, raw markets and transactions data.
Once the data is standardized, to make it smart, firms integrate these data sets and process them through analytics software, viewable through a dashboard, graphical user interface (GUI), or desktop application by a portfolio manager or trader within a bank.
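The standardize-then-integrate step described above can be sketched in a few lines. This is a minimal, illustrative example only: the field names, the per-silo formats (equities prices in cents, FIC sizes in millions), and the common schema are all assumptions for the sake of the sketch, not any vendor's actual data model.

```python
from dataclasses import dataclass

# Hypothetical common schema for a normalized trade record.
@dataclass(frozen=True)
class Trade:
    symbol: str
    price: float   # normalized to a decimal price
    quantity: int
    venue: str

def normalize_equities_feed(record: dict) -> Trade:
    """Equities silo (assumed format): prices arrive in cents, symbols lowercase."""
    return Trade(
        symbol=record["ticker"].upper(),
        price=record["px_cents"] / 100.0,
        quantity=record["qty"],
        venue="EQUITIES",
    )

def normalize_fic_feed(record: dict) -> Trade:
    """Fixed-income/currency silo (assumed format): decimal prices, size in millions."""
    return Trade(
        symbol=record["instrument"],
        price=record["price"],
        quantity=int(record["size_mm"] * 1_000_000),
        venue="FIC",
    )

def build_smart_dataset(equities: list, fic: list) -> list:
    """Merge per-silo feeds into one uniformly formatted data set."""
    merged = [normalize_equities_feed(r) for r in equities]
    merged += [normalize_fic_feed(r) for r in fic]
    return sorted(merged, key=lambda t: t.symbol)
```

The point of the sketch is the shape of the work, not the specifics: each silo gets its own normalizer, and only after every record conforms to one schema can analytics run across the combined set.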
But the problem is that most of these large data sets are siloed in legacy system architectures.
In order to realize the potential of smart data, the report contends that firms must “undo decades of legacy, siloed data capture, management, and storage architecture, which is no longer fit-for-purpose from a cost management perspective in the age of algorithmic and quantitative business and trading models of the future.”
For example, GreySpark found that Tier I to Tier III investment bank spending on cash equities and fixed-income and currency (FIC) trade execution-linked data products and services alone grew by 5% year-on-year in 2019 and will likely continue to increase at the same rate through to the end of 2020.
At Tabb Group’s FinTech Festival conference in November, panelists working in enterprise data analytics, machine learning, and AI at banks and fintech firms cautioned that 70-80% of data projects fail. “It is a known fact in the financial industry,” said the chief data officer for a global bank in North America.
One of the biggest challenges is gaining access to the data. “Ninety percent of the data is unstructured and certainly within the enterprise. Most of it is scattered and distributed and living in data silos. The fact that we can’t get access to the data for starters is the biggest problem. And second, when we do, we’re faced with all these different formats, and all these different types of data. We need to solve for that first,” said the founder and CEO of a cloud-based analytics platform.
Writing APIs to Access OMSs and EMSs
To get around these issues, investment banks, asset managers, and hedge funds globally manage and consume these raw data inputs through in-house-built application programming interface (API) technology stacks, as well as through vendor-provided order and execution management systems (OMSs and EMSs) on a desk-by-desk basis, to hold costs down over the short-to-medium term, notes the report.
However, investment banks typically run and maintain multiple OMSs and EMSs across many different desks throughout the bank.
There is no one big data source; there are big data sources inside different silos within the banks in equities, fixed income and FX. “Each of those asset classes has an associated middle- and back-office facility, so each asset class is siloed and doesn’t necessarily share the same data management facility,” said Dinnage.
Data can be shared via open or closed APIs. “APIs allow firms to create a layer underneath all of those front-, middle-, and back-office systems so the data that is commonly shared between those systems can be cleansed, standardized and linked together in a rational way,” said Dinnage.
While APIs came to market around 2012, they didn’t become a commoditized technology in the marketplace until around 2015-2016. “It has now become uniform best practice to write these APIs to be commonly shared,” said Dinnage. Almost every bank today is using API stacks to commonly share data, but it’s a huge undertaking involving years of labor and potentially millions of dollars in costs, he said.
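The “layer underneath the silos” idea can be illustrated with a toy sketch. The class and method names here (`SiloAPI`, `DataLayer`, `client_activity`) are hypothetical; real OMS/EMS integrations are vendor- and firm-specific, but the pattern is the same: each silo keeps its own storage while exposing one common query call, and a shared layer fans requests out across every registered silo.

```python
class SiloAPI:
    """Wraps one desk's data store behind a uniform query interface."""
    def __init__(self, asset_class: str, records: list):
        self.asset_class = asset_class
        self._records = records  # each silo keeps its own storage

    def query(self, client_id: str) -> list:
        # One common call shape, regardless of the silo's internals.
        return [r for r in self._records if r.get("client") == client_id]

class DataLayer:
    """Common layer beneath the per-desk systems: fans a query out to all silos."""
    def __init__(self):
        self._silos = []

    def register(self, silo: SiloAPI):
        self._silos.append(silo)

    def client_activity(self, client_id: str) -> dict:
        """One call returns the client's activity across every asset class."""
        return {s.asset_class: s.query(client_id) for s in self._silos}
```

Under this pattern, a trader dashboard or analytics engine calls the layer once rather than integrating with each desk's OMS separately, which is what makes the years of API-writing effort pay off.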
One of the reasons data analytics is expected to become more popular over the next five years is that banks are experimenting with how different data architectures can be linked together. Many firms realize that maintaining data sources in silos is not optimal for running a competitive data infrastructure.
In order to move beyond writing APIs and building new architecture in the cloud, firms need desktop analytics to show traders and C-level executives that it’s worth the money to start rethinking how they’ve sourced data and who has access to it, said Dinnage. Rather than take this on, firms are partnering with a new breed of vendors that not only understand big data and data science, but can create dashboards, GUIs, and screens to provide analytical services.
Investing in a FICC Startup
As a sign of the trend, banks have invested in analytical startups to get a first-mover advantage. In March of 2018, JP Morgan took a minority stake in fintech startup Mosaic Smart Data, which collects and analyzes fixed-income, currencies, and commodities (FICC) data from the trading divisions of banks so that they can make more informed decisions, reported The Trade. This followed JP Morgan signing a multi-year contract in October of 2017 with Mosaic spanning its entire fixed-income trading business.
The MSX platform helps sales and trading professionals visualize market and client activity, and it can be used by a trader to determine which buy-side clients are more likely to accept a deal, noted Reuters. The platform uses machine learning and natural language processing to provide narratives in plain language, which can explain why activity with a certain client or product has dropped off.
“Having a more holistic view of trading data will improve our service delivery for clients. The Mosaic platform integrates securely with our existing technology infrastructure and enables our teams to quickly make better informed decisions,” said Troy Rohrbaugh, head of macro at JP Morgan in a news release announcing the partnership.
On a fixed-income trading desk, smart data and smart data analytics can help to mask the complexity of data silos, client- and markets-connectivity infrastructure, and pricing and execution engines that exist on a desk. Vital information can be passed onto a single screen working in concert with an OMS that can be used to prioritize client liquidity provision, block-size order price formation, and/or markets aggregation, the report illustrates.
On the buy side, smart data analytics can help small-to-medium asset managers break their dependence on investment banks or prime brokers to provide them with market and trade-data feeds. Instead, with smart data analytics, firms could develop in-house data management capabilities to draw the feeds directly from relevant brokerage venues and exchanges.
But overcoming data management and data access hurdles will be key to extracting value from heterogeneous data sets. A study by Greenwich Associates found that 85% of capital markets firms intend to increase their spending on data management technology over the next three-to-five years. In addition, data analysis is forecast to become the most valued skill on trading desks in the next three-to-five years, based on 74% of capital markets professionals surveyed by Greenwich.
In short, firms are realizing that external and internal data sets contain value and they must treat data as if it were an investable asset class such as a house or a car. Financial firms have only begun to scratch the surface of extracting value from their data, and clearly there is a race to monetize those assets.