The financial services field is not just about fast math for real-time machine trading; text data mining has come to the fore as a leading indicator of market pricing. Sentiment analysis of breaking news, drawn every second from millions of news feeds, social media posts, government releases, and other sources, requires massively parallel, near-real-time processing. The first company to sift the dizzying amount of news wire and social media data and find the relevant information wins the lion’s share when the market moves.
“Buy on rumor, sell on news” was a truism on Wall Street long before high-speed algorithmic trading or real-time computational finance was invented. But humans can’t possibly keep up with the flow of rumors and news. Computational news analytics (NA) and natural language processing (NLP) have vastly improved the ability of computers to contextually analyze the tone and relevance of news stories, making it possible to incorporate news flow systematically into trading and investment strategies.
Financial services has a wide range of analytic applications, such as options and derivative pricing, fraud detection, portfolio management, risk analysis, and Monte Carlo modeling, among many others. Not all of them require the high-velocity analysis achieved by hardware acceleration. For some analyses, it’s OK to get answers in minutes, hours, or even days. But many applications can benefit from orders-of-magnitude faster analysis of high-volume data from a larger variety of sources. For those applications, time is literally money.
Today, financial applications are augmented by new technologies for ingesting massive quantities of textual data from a huge variety of sources and analyzing its relevance and meaning. Web 2.0 news sources, such as blogs, Twitter, Facebook, Tumblr, Seeking Alpha, and others, are getting a lot of attention. These newer media sources are sometimes first to surface new information. For example, news of the raid on Osama Bin Laden broke on Twitter a full 20 minutes before mainstream media picked it up. Social media is a real-time phenomenon that can be harnessed for information to trade on.
Volume can sometimes surge and overwhelm compute resources. As of early 2014, Twitter’s volume was somewhat under 10,000 Tweets per second, but it could surge to tens of thousands of Tweets per second around major events. More traditional sources add to this huge volume of data, including mainstream news, SEC filings, legal documents, government news and statistics, scheduled data releases such as economic statistics, leading economic indicators (LEIs), industry statistics, corporate earnings reports, press releases, and many others.
These types of heavily text-based data mining are particularly amenable to acceleration with the immensely parallel symbol processing power of Micron’s Automata Processors (AP). AP technology can be readily deployed in HPC server farms on PCIe accelerator cards, with a straightforward programming model and tool suite for developers.
Commonly used text-based news analysis includes:
- Data clean-up prior to running news analytics
- Graph analytics for fraud detection
- Rule checking for portfolio management and regulatory compliance
- Part-of-speech tagging (nouns, verbs, phrases, idioms, hashtags, …)
- Sentiment scoring (positive, negative, neutral)
- Media intensity scoring (lightly covered news vs. being in the media spotlight results in different market trading patterns)
- Entity tagging (people, stock symbols, company names, government organizations, statistics, …)
- Location tagging (cities, countries, regions)
- Novelty rating (the first instance of news might have more value)
- Newness rating (repeated stories have less impact)
- News category tagging (10Q, press release, …)
- Event detection and tagging (earnings warnings, oil spills, coups, earthquakes, crop forecasts, acquisitions, …)
- Category tagging (press release, blog, opinion, news, analyst rating, …)
- Metadata generation (sentiment scores, relevance scores, locations, …)
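A few of the analyses listed above (entity tagging, sentiment scoring, and novelty rating) can be sketched in a handful of lines of ordinary software. The Python below is only a minimal illustration under stated assumptions: the word lists, the `$TICKER` cashtag convention, and the `NewsTagger` class are hypothetical, and the sketch does not represent any vendor's method or the AP programming model.

```python
import re
from dataclasses import dataclass, field

# Hypothetical toy lexicons; a real system would use far larger, curated ones.
POSITIVE = {"beats", "surges", "upgrade", "record"}
NEGATIVE = {"misses", "plunges", "downgrade", "warning"}

TICKER = re.compile(r"\$([A-Z]{1,5})\b")  # assumed cashtag style, e.g. $MU
TOKEN = re.compile(r"[a-z']+")            # crude lowercase word tokenizer


@dataclass
class NewsTagger:
    # Normalized texts already observed, used for the novelty rating.
    seen: set = field(default_factory=set)

    def tag(self, text: str) -> dict:
        tokens = TOKEN.findall(text.lower())

        # Sentiment scoring: compare counts of positive and negative words.
        pos = sum(t in POSITIVE for t in tokens)
        neg = sum(t in NEGATIVE for t in tokens)
        if pos > neg:
            sentiment = "positive"
        elif neg > pos:
            sentiment = "negative"
        else:
            sentiment = "neutral"

        # Novelty rating: only the first occurrence of a story counts as novel.
        key = " ".join(tokens)
        novel = key not in self.seen
        self.seen.add(key)

        # Entity tagging: extract stock symbols written as cashtags.
        return {
            "entities": TICKER.findall(text),
            "sentiment": sentiment,
            "novel": novel,
        }
```

Calling `NewsTagger().tag(...)` on each incoming headline yields a small metadata record of the kind the list describes. Exact-match novelty detection is the simplest possible choice; near-duplicate detection would need fuzzier comparison.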
Micron’s AP acceleration technology can be deployed at news aggregators and analytics suppliers, such as Thomson Reuters, RavenPack, Bloomberg, Selerity, Dow Jones, Wall Street Horizon, Gnip, and others, to deliver value-added, fast news feeds to trading firms.
At high-frequency trading (HFT) firms, AP technology can speed the next layer of proprietary analytics that feed into trading-specific algorithms. Software may take into account the effects of news on a stock’s directionality, momentum, volatility, bid-ask spread, volume, and liquidity, among other factors that drive HFT tactics, such as “short-term reversals” for stocks or country sentiment for Forex trading. The ease of programming Automata Processors is a significant advantage for implementing rapidly evolving HFT tactics and strategies in software. The more complex logic of natural language and linguistic analysis executes on the server CPUs, while the AP accelerator technology executes immensely parallel symbol processing.
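The division of labor described above, a pattern-matching stage feeding higher-level logic on the CPU, can be illustrated in plain software. The Python sketch below is an assumption-laden stand-in: the rule names and expressions are hypothetical, and Python's `re` module is a conventional sequential regex engine, not the AP and not its API. It only shows the shape of the pipeline, in which many rules are evaluated in a single scan of the text and the hits are handed to a separate decision stage.

```python
import re

# Hypothetical event rules; names and expressions are illustrative only.
PATTERNS = {
    "earnings_warning": r"profit warning|lowers? guidance",
    "acquisition": r"to acquire|agrees? to buy",
    "disaster": r"oil spill|earthquake",
}

# Combine all rules into one alternation of named groups so that a single
# scan of the text reports which rule fired at each match position.
COMBINED = re.compile(
    "|".join(f"(?P<{name}>{expr})" for name, expr in PATTERNS.items()),
    re.IGNORECASE,
)


def match_stage(text):
    """Pattern-matching stage: return (rule_name, start_offset) for each hit."""
    return [(m.lastgroup, m.start()) for m in COMBINED.finditer(text)]


def decide_stage(hits):
    """Downstream logic stage: collapse hits into a sorted list of event tags."""
    return sorted({name for name, _ in hits})
```

Keeping the two stages behind separate functions means the matching stage could in principle be swapped for an accelerator-backed implementation without touching the logic that consumes its hits.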