By Patty Driever, STSM System z I/O & Networking Technology, IBM Systems and Technology Group
Faster is better.
There’s currently a commercial on television from a telecommunications company in which a group of young children are asked, “What’s better, faster or slower?” and they reply with a resounding “Faster!” Then, to solve the problem of a grandma who’s too slow, one child suggests taping a cheetah to her back.
Well, in the world of datacenter network latency, I think most of us would agree with the children’s assertion that faster is better. In my many years of working on Storage I/O and networking channels for System z, my colleagues and I have continued to focus substantial effort on reducing latency and response times for operations and on improving throughput.
When we developed the new Shared Memory Communications-RDMA (SMC-R) protocol, the primary goal was to improve network performance. And the micro- and macro-benchmarks we’ve done tell a very compelling story in that area. We see substantial reductions in network latency and transaction response times, and for streaming (bulk data transfer) workloads we also see significant CPU savings. Faster is better, and SMC-R delivers on that.
But when searching for solutions to bottlenecks affecting performance, we also know that we can’t lose sight of a broad range of needs of our enterprise class customers, and the practical implications of implementing any new solution we provide. Not to take the analogy too far, but we know we can’t just tape a cheetah to grandma’s back!
I think what makes the story of SMC-R so powerful is that it delivers these significant performance improvements while also preserving a number of other critical aspects of network security and management. The value of the new protocol can also be realized quickly, because it integrates seamlessly into existing networking infrastructures and requires no changes to the applications that will realize the benefits.
RDMA over Converged Ethernet (RoCE, pronounced “Rocky”) is a relatively new standard that allows the RDMA programming model, originally embodied in the InfiniBand transport, to be implemented over Ethernet. So while new RDMA-capable network adapters are required (RoCE Express on System z), they can be connected to standard Ethernet switches; the only special requirement is that the switches support the ‘not so new’ 802.3x standard for flow control.
The SMC-R protocol, support for which is available in z/OS V2R1, is a ‘hybrid’ protocol in that it combines TCP and RDMA, leveraging the best of both worlds. By preserving the sockets interface, it enables the plethora of existing TCP/IP sockets applications to realize the performance advantages of RDMA without any changes. The use of RDMA to transfer the data (bypassing the IP stack) is transparent to the applications. By preserving the 3-way TCP ‘handshake’, SMC-R is fully compatible with TCP connection load-balancing solutions. It is also compatible with many existing TCP security features, such as IP filters and SAF-based network access control checks, as well as TCP-based network encryption and authentication technologies such as TLS and SSL.
And because it’s basically another type of connection into an Ethernet network, z/OS provides visibility to SMC-R traffic through traditional network management functions such as NetStat, SMF, RMF, and NMI. Minimal new configuration and operational tasks exist because both the discovery of the host peer’s RDMA capabilities and the RDMA connection setup are performed dynamically within the SMC-R protocol.
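To make the “no changes to the applications” point concrete, here is a minimal sketch of an ordinary TCP sockets program in Python. It is only an illustration, not z/OS code: the function names `serve_once` and `echo_roundtrip` are made up for this example, and because it runs over the loopback interface no RDMA is involved at all. What it shows is that nothing in the application refers to SMC-R or RDMA; the program only uses the standard sockets calls, so how the bytes are carried is decided below the sockets interface.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Accept one connection and echo back whatever the client sends."""
    conn, _addr = listener.accept()
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:          # client finished sending
                break
            conn.sendall(data)


def echo_roundtrip(payload: bytes) -> bytes:
    """Send a small payload to a local echo server and return the reply.

    Hypothetical helper for illustration; intended for small payloads only.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    server = threading.Thread(target=serve_once, args=(listener,))
    server.start()

    # Plain connect/send/recv: the application never names a transport.
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(payload)
        client.shutdown(socket.SHUT_WR)   # signal end of request
        chunks = []
        while True:
            chunk = client.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)

    server.join()
    listener.close()
    return b"".join(chunks)


if __name__ == "__main__":
    print(echo_roundtrip(b"faster is better"))
```

A program written this way is the kind of existing sockets application the protocol is aimed at: the connection is still opened with the normal TCP calls, and the source code stays the same whether or not the hosts underneath are able to move the data with RDMA.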
What could make this story even better? Well, since it’s a protocol to enhance communications between hosts, we realize that our clients benefit most when more and more hosts are capable of exploiting the new protocol. So we have not kept, and are not keeping, this protocol proprietary. We have shared it with the open community by documenting it in a draft Internet Engineering Task Force (IETF) RFC, and we have presented SMC-R to the OpenFabrics Alliance. Widespread adoption of the protocol is our hope and goal.
For now, our multi-LPAR/multi-CPC z/OS clients can quickly benefit from the significant performance advantages that SMC-R with the RoCE Express feature delivers, with minimal changes to their Ethernet datacenter ecosystem.
With SMC-R, faster IS better!
Patty Driever is a Senior Technical Staff Member in the System z brand organization in Poughkeepsie, and is currently the Storage I/O and Networking Technology Leader for System z. Patty has held a wide variety of technical and management positions in System z, more recently in technical assignments focused on System z networking and storage I/O.