An ITSO Redbooks team from around the world (Gereon Vey, Martin Bachmaier, and myself) is currently working on the second edition of the "In-memory computing with SAP HANA on IBM eX5 Systems" book, and I wanted to share five key things about the solution.
SAP HANA is a flexible, data source agnostic appliance that allows you to analyze large volumes of data in real time without the need to materialize aggregations. It is a combination of hardware and software, and it is delivered as an optimized appliance in cooperation with SAP’s hardware partners for SAP HANA.
1. IBM Systems solution for SAP HANA is an integrated appliance that comes with the complete software stack.
The IBM Systems solution for SAP HANA is delivered as a pre-integrated software stack, including the operating system, IBM General Parallel File System (GPFS), and the SAP HANA software, loaded onto workload-optimized IBM eX5 servers.
2. IBM Systems solution for SAP HANA runs on the proven, high-performance, scalable IBM eX5 family of servers.
The IBM eX5 product portfolio, built on the Intel Xeon processor E7-8800/4800/2800 product families, represents the fifth generation of servers built upon IBM Enterprise X-Architecture. Enterprise X-Architecture is the end product of generations of IBM technology and innovation derived from our experience in high-end enterprise servers. These servers can be expanded on demand and configured by using a building block approach that optimizes the system design for your workload requirements.
The IBM eX5 portfolio contains the HX5 blade server, the x3690 X5, the x3850 X5, and the x3950 X5.
3. IBM Systems solution for SAP HANA includes the IBM General Parallel File System (GPFS), a high-performance shared-disk file management solution.
GPFS leverages its cluster architecture to provide quicker access to your file data. File data is automatically spread across multiple storage devices, providing optimal use of your available storage to deliver high performance:
GPFS provides a stable, industry-proven, cluster-capable file system for SAP HANA.
GPFS transparently works with multiple replicas (that is, copies) of a single file in order to protect from disk failures.
GPFS adds extra performance to the storage devices by striping data across devices.
With the new File Placement Optimizer (FPO) extensions, GPFS enables the IBM Systems solution for SAP HANA to grow beyond the capabilities of a single system, into a scale-out solution, without introducing the need for external storage.
GPFS adds high-availability and disaster recovery features to the solution.
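The striping and replication ideas in the list above can be illustrated with a toy model. This is only a sketch of the general technique, not GPFS's actual placement algorithm: the function names `place_blocks` and `readable_after_failure`, the round-robin policy, and the default of two replicas are all assumptions made for illustration.

```python
def place_blocks(num_blocks: int, num_devices: int, replicas: int = 2) -> list[list[int]]:
    """Return, for each block of a file, the devices that hold a copy of it.

    Hypothetical round-robin policy: block i starts on device i % num_devices
    (striping), and each further copy goes to the next device (replication).
    """
    if replicas > num_devices:
        raise ValueError("need at least as many devices as replicas")
    placement = []
    for block in range(num_blocks):
        primary = block % num_devices
        # Copies of one block always land on distinct devices.
        placement.append([(primary + r) % num_devices for r in range(replicas)])
    return placement


def readable_after_failure(placement: list[list[int]], failed_device: int) -> bool:
    """True if every block still has at least one copy on a surviving device."""
    return all(any(dev != failed_device for dev in copies) for copies in placement)


layout = place_blocks(num_blocks=4, num_devices=3, replicas=2)
print(layout)
print(readable_after_failure(layout, failed_device=1))
```

In this toy layout, consecutive blocks sit on different devices, so reads can proceed in parallel, and losing any single device still leaves one copy of every block.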
4. IBM Systems solution for SAP HANA is built on workload-optimized server building blocks.
IBM created several custom server models for SAP HANA. These workload-optimized models are designed to meet or exceed the performance and functional requirements specified by SAP. With a small set of IBM System x workload-optimized models for SAP HANA, SAP HANA solutions of all sizes can be built, from the smallest to the largest installations, by using the building block approach.
The building blocks are configured to match the SAP HANA sizing requirements. The main memory sizes match the number of CPUs, to give the correct balance between processing power and data volume. Also, the storage devices in the systems provide the storage performance and capacity required to match the amount of main memory.
Server internal storage consists of a RAID-protected array of 10K rpm SAS hard drives, optimized for data throughput, and flash memory storage devices. Building blocks based on the IBM System x3690 X5 use RAID-protected, hot-swap eXFlash SSD drives, and building blocks based on the IBM System x3950 X5 use flash-based High IOPS PCIe adapters.
5. IBM Systems solution for SAP HANA can be easily and seamlessly scaled to meet growing computing demands.
The IBM Systems solution for SAP HANA supports a scale-out approach, that is, combining a number of systems into a clustered solution that represents a single SAP HANA instance. An SAP HANA system can span multiple servers, partitioning the data so that it can hold and process larger amounts of data than a single server can accommodate.
This scale-out solution consists of a homogeneous cluster of building blocks, interconnected with two separate 10 Gb Ethernet networks, one for the SAP HANA application and one for the GPFS file system communication. The SAP HANA database is split into partitions, forming a single instance of the SAP HANA database. Each node of the cluster holds its own savepoints and database logs on the local storage devices of the server. The GPFS file system spans all nodes of the cluster, making the data of each node available to all other nodes of the cluster.
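To make the idea of splitting one database across the nodes of a cluster concrete, here is a minimal sketch of hash partitioning. It is an illustration only and does not reproduce how SAP HANA actually distributes data: the names `node_for_key` and `partition_rows`, the CRC32-based hash, and the sample rows are all hypothetical.

```python
import zlib


def node_for_key(key: str, num_nodes: int) -> int:
    """Map a row key to a node index with a stable hash (CRC32 here)."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def partition_rows(rows: list[tuple[str, int]], num_nodes: int) -> dict[int, list[tuple[str, int]]]:
    """Group rows by the node that would hold them in this toy model."""
    partitions: dict[int, list[tuple[str, int]]] = {n: [] for n in range(num_nodes)}
    for key, value in rows:
        partitions[node_for_key(key, num_nodes)].append((key, value))
    return partitions


# Hypothetical sample data: (customer id, order amount).
rows = [("cust-001", 120), ("cust-002", 75), ("cust-003", 310), ("cust-001", 40)]
for node, part in partition_rows(rows, num_nodes=4).items():
    print(node, part)
```

Each row lands on exactly one node, and rows with the same key always land on the same node, so together the partitions still form a single logical data set.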
Currently, the IBM Systems solution for SAP HANA can be scaled out to up to 56 nodes, which means that the cluster has a total main memory of up to 56 TB, of which up to 28 TB can be used to store the compressed data. Assuming a compression factor of 7:1, this potentially accommodates up to 196 TB of source data.
For more information about SAP HANA and the IBM Systems solution, refer to the "In-memory Computing with SAP HANA on IBM eX5 Systems" publication.
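The sizing arithmetic above can be written out as a short calculation. This is a sketch under stated assumptions: the function name `scale_out_capacity` is hypothetical, the 1 TB of memory per node is inferred from 56 TB across 56 nodes, and the 50% share of memory available for compressed data is inferred from the 28 TB figure.

```python
def scale_out_capacity(
    nodes: int,
    memory_per_node_tb: float = 1.0,   # assumption: 56 TB / 56 nodes
    data_fraction: float = 0.5,        # assumption: 28 TB of 56 TB holds data
    compression_factor: float = 7.0,   # the 7:1 ratio quoted in the text
) -> tuple[float, float, float]:
    """Return (total memory, compressed data capacity, source data capacity) in TB."""
    total_memory_tb = nodes * memory_per_node_tb
    compressed_data_tb = total_memory_tb * data_fraction
    source_data_tb = compressed_data_tb * compression_factor
    return total_memory_tb, compressed_data_tb, source_data_tb


total, compressed, source = scale_out_capacity(56)
print(f"{total:.0f} TB memory, {compressed:.0f} TB compressed, {source:.0f} TB source")
# prints "56 TB memory, 28 TB compressed, 196 TB source"
```

The actual compression factor depends on the data, so the source data figure is an estimate rather than a guarantee.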
Ilya Krutov is an IBM Redbooks Project Leader. He writes books and papers on many topics related to IBM System x, IBM BladeCenter, and IBM Flex System. Follow Ilya on Twitter at @IlyaAtRedbooks.