In-Memory Data Processing for Sales Planning

Abstract:

MapReduce [1], introduced in 2004, was a general-purpose engine for batch processing. It handled large volumes of data well, but users soon needed interactive ad-hoc queries and real-time stream processing. Today’s Big Data systems cover a broad range of such use cases, and each is specialized in solving a specific class of problems. One category of such systems is in-memory databases, which rely on a distributed cache to speed up data processing by limiting the disk I/O bottleneck. Such systems claim to bridge the gap between Online Transactional Processing (OLTP) and Online Analytical Processing (OLAP) workloads by offering real-time distributed processing and analytics in a single system. This paper questions the real-time claim by testing the performance of four in-memory databases (MemSQL, Oracle, SQL Server, and Apache Ignite) in the context of a sales planning model. This case study requires a mixture of data changes (either small or large updates) and analytics (including aggregations) under real-time performance constraints. Preliminary results show that real-time performance can be achieved when using in-memory databases for sales planning.