Many applications built on the Hadoop MapReduce framework run jobs with short execution times. For such jobs, Apache Hadoop-based implementations do not respond quickly enough to make the framework practical. In addition, how well data transfer and consumption are optimized affects the performance of MapReduce jobs.
IBM Platform Symphony accelerates your MapReduce application performance.
- In Platform Symphony, Map and Reduce requests are served by stateful services, which can be shared and reused across multiple task invocations and job runs. Because data and intermediate results are optimally cached, reused and shared, this architecture reduces data loading time, avoids unnecessary recalculation and minimizes memory consumption.
- Many jobs have large reference data (10+ GB) per mapper, so the benefit of such reuse and sharing is very significant. It essentially allows clients to use commodity servers efficiently for tasks that would otherwise require expensive large-memory machines. The shared memory solution contributes significantly to both hardware savings and performance gains.
- A fine-tuned shuffle step helps improve overall application performance, which shortens the time to solution from hours to minutes.
- An optimized MapReduce implementation reduces the I/O volume on network and disk for accelerated application performance.
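The idea of a stateful service that loads reference data once and reuses it across task invocations can be illustrated with a minimal Python sketch. This is not the Platform Symphony API: the names `ReferenceDataCache`, `StatefulMapService`, `load_reference` and the tiny lookup table are hypothetical stand-ins chosen only to show the caching pattern.

```python
class ReferenceDataCache:
    """Hypothetical in-process cache: each data set is loaded once, then shared."""

    def __init__(self, loader):
        self._loader = loader
        self._data = {}
        self.load_count = 0  # how many times the expensive load actually ran

    def get(self, key):
        if key not in self._data:
            self._data[key] = self._loader(key)
            self.load_count += 1
        return self._data[key]


def load_reference(key):
    # Stand-in for an expensive load of large reference data (assumption).
    return {"a": 1, "b": 2, "c": 3}


class StatefulMapService:
    """Hypothetical long-lived service that handles many map task invocations."""

    def __init__(self, cache):
        self._cache = cache

    def map_task(self, records):
        # Reuses the cached reference data instead of reloading it per task.
        ref = self._cache.get("lookup")
        return [(r, ref[r]) for r in records if r in ref]


cache = ReferenceDataCache(load_reference)
service = StatefulMapService(cache)

# Three task invocations share one service instance and one copy of the data.
splits = (["a", "b"], ["c", "x"], ["a"])
results = [service.map_task(split) for split in splits]

print(results)
print("reference data loads:", cache.load_count)
```

In this sketch the reference data is loaded a single time even though three tasks run. A stateless design that loaded the data inside every task would pay that cost once per invocation.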
Download Platform Symphony Developer Edition and see its benefits for your applications.
How to Reduce Cost in Government and Make Faster Decisions with near Realtime Hadoop Analytics
Merv Adrian, Research Vice President, Gartner, Inc.
Rohit Valia, Program Director, High Performance Cloud and Analytics, IBM Platform Computing