Introduction
Wikibon has consistently asserted that the most important business value of persistent memory technologies such as NAND flash is that access to data is no longer a constraint on business application design. One Wikibon discussion document on this theme is entitled Flash and Hyperscale Changing Database and System Design Forever.
By removing the rusting anchor of disk-based data access, applications can be designed with orders of magnitude more data and more accesses to data. However, exploiting such capabilities means thinking outside the box, and using software and hardware technologies designed for a post-Fe2O3 era. Figure 1 below shows how enterprise-wide integrated applications could be connected in this era.
eXelate understands the business potential of processing billions of transactions and making real-time decisions in milliseconds. eXelate uses a range of technologies to achieve the volumes of processing required. The case study below shows the technologies used.
The conclusion of this study is that CIOs and CTOs need to find staff with the imagination to support new business methods with new integrated real-time analytic and transactional systems.
eXelate, the Company
eXelate is in the business of supporting its advertising customers by reducing Big Data to actionable smart data. It ingests huge amounts of click and other data from websites and other sources, builds models describing what that data means, and uses those models to populate databases that optimize bidding for advertising space in real time. The consumer is shown more relevant advertisements and the advertiser increases revenue. eXelate is planning to expand to be a global force in Big Data real-time analytics.
The volumes are very high. eXelate processes 60 billion transactions per month for over 200 publishers and marketers across multiple geographic regions. The key challenge is processing that huge amount of data cost-effectively and in real time to separate the wheat from the chaff, the signal from the noise. Traditional online processing methods and databases are simply far too expensive, by at least an order of magnitude. eXelate had to design a completely different way of achieving high performance and high availability.
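To put the monthly volume in perspective, the short calculation below converts it into an average per-second rate. It is a back-of-envelope sketch only: the 30-day month and the perfectly even traffic are assumptions made here, not figures from eXelate, and real peak rates would be higher than the average.

```python
# Back-of-envelope: average load implied by 60 billion transactions per month.
# Assumptions (not from the case study): a 30-day month and evenly spread traffic.
TRANSACTIONS_PER_MONTH = 60_000_000_000
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 2,592,000 seconds

avg_tps = TRANSACTIONS_PER_MONTH / SECONDS_PER_MONTH
print(f"average rate: {avg_tps:,.0f} transactions/second")
# prints: average rate: 23,148 transactions/second
```

Under these assumptions the systems must sustain roughly 23,000 transactions every second, around the clock, before any allowance for peaks.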
eXelate’s IT Architecture
eXelate’s architecture and software technology are designed to develop sophisticated models of likely behavior and to provide a real-time software capability for making business buying decisions against those models:
- Data Capture
  - Data is captured from publishing sites and logged by 200 front-end servers at locations distributed throughout the world to minimize latency.
- Transaction Data Processing – supporting 60 billion transactions per month & 2 terabytes of data per day.
  - The data is processed in the nearest Aerospike real-time NoSQL database cluster, using key-value pairs;
  - The Aerospike cluster is based on 12 industry-standard x86 processor nodes, each node with 5 (growing to 7) SSDs and 128GB DRAM;
  - The Aerospike database has a flash-first architecture, to support 50% of the IOs being writes - the SSDs are the persistent storage layer;
  - The decision tables are held in memory in compressed columnar format.
- High Availability
  - Cross Data Center Replication (XDR), a component of the Aerospike technology, is used to provide geographic redundancy across all the eXelate sites;
  - Within Aerospike, indexes are co-located with data to support synchronous replication, very fast ACID consistency, automatic fail-over and re-balancing to ensure that data is available when a node is lost.
- Big Data Analysis & Predictive Model Creation
  - eXelate’s proprietary prediction models are mainly developed on Revolution R Enterprise software and IBM PureData System for Analytics;
  - The software is run on an IBM Netezza “shared nothing” parallel database appliance;
  - The data comes from the front-end data capture servers, from the data transaction systems and from other data sources;
  - The models are then held in a single-node MySQL database (with a Fusion-io PCIe card) and loaded into Aerospike systems as required.
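The high-availability ideas in the list above (key-value pairs spread across nodes, synchronous replication, fail-over and re-balancing when a node is lost) can be illustrated with a toy model. The sketch below is not Aerospike's API or algorithm: the `ToyCluster` class, its method names, the modulo-based placement and the replication factor of 2 are all hypothetical choices made here for illustration, and plain Python dictionaries stand in for each node's index and SSD storage.

```python
import hashlib


class ToyCluster:
    """Toy key-value cluster (hypothetical, for illustration only)."""

    def __init__(self, node_names, replication_factor=2):
        self.replication_factor = replication_factor
        # Each node is a plain dict standing in for its index plus storage.
        self.nodes = {name: {} for name in node_names}

    def _owners(self, key):
        """Return the live nodes responsible for a key, master first."""
        live = sorted(self.nodes)
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        start = int.from_bytes(digest[:4], "big") % len(live)
        count = min(self.replication_factor, len(live))
        return [live[(start + i) % len(live)] for i in range(count)]

    def put(self, key, value):
        # Synchronous replication: every copy is written before returning.
        for name in self._owners(key):
            self.nodes[name][key] = value

    def get(self, key):
        for name in self._owners(key):
            if key in self.nodes[name]:
                return self.nodes[name][key]
        return None

    def fail_node(self, name):
        """Drop a node, then re-balance so each key has full copies again."""
        del self.nodes[name]
        surviving = {}
        for store in self.nodes.values():
            surviving.update(store)
        for store in self.nodes.values():
            store.clear()
        for key, value in surviving.items():
            self.put(key, value)


# Usage: a 12-node cluster keeps serving a key after its master node is lost.
cluster = ToyCluster([f"node{i}" for i in range(12)])
cluster.put("user:42", {"segment": "sports"})
cluster.fail_node(cluster._owners("user:42")[0])
print(cluster.get("user:42"))  # prints: {'segment': 'sports'}
```

The modulo placement used here moves most keys whenever the node count changes, which a production system would avoid; it is kept only because it makes the fail-over and re-balancing steps easy to follow.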
Conclusions & Recommendations
Traditional SQL database systems would cost an order of magnitude more in equipment and software to achieve the volumes processed by eXelate, and the additional costs could negate the eXelate business model. The success of eXelate is driving additional innovation in eXelate’s technology partners.
eXelate has been pragmatic about the technologies it has used and has focused them on achieving business results. It has not been tempted by architectures that have not been proven to scale (e.g., most Data-in-Memory technologies such as Memcache), but has focused on technologies that reduce persistent IO latency (e.g., Aerospike’s Flash-first architecture).
Figure 1 shows the integration of different systems that are needed to exploit Big Streams, Big Systems and Big Data.
These systems need to read and write large amounts of data. Although large amounts of traditional memory should be part of architectures, there are limits to data-in-memory scalability and reliability. The most important technology to ensure that these systems can scale effectively is NAND flash memory, both to enable very rapid IO and to extend main memory. While it is ironic that technology to improve Internet ad placement is an important driving force behind this rapid innovation, it is clear that these technologies are going to be the basis of many real-time decision systems that support business processes.
Wikibon believes that retailers wanting real-time price adjustments and variable marketing/pricing to a customer segment of one will be the next major wave to use similar technology architectures to achieve their business objectives. Over time, these system types will permeate all verticals.
Action Item: CIOs and CTOs should have a clear strategic path to building real-time decision support systems that can be integrated into transactional systems. There is enormous potential to improve business efficiency and increase revenue with such architectures. The biggest challenge is finding senior architects who have the imagination to shake off the shackles of traditional system designs and re-architect completely new business and technology solutions.