In my last blog, I cited the data volume challenge faced by retailers like Walmart. At 2.5 petabytes per hour, that is a lot of data for Walmart’s infrastructure, or any firm’s systems for that matter, to ingest. Think of the systems and overhead needed to process, transport and store all, or even a small percentage, of this data.
Now, the data volumes handled daily by retailers like Walmart or social media firms such as Google are extreme cases of big data. At financial markets firms, the amount of data generated is much smaller but can still reach terabytes in short periods of time. For example, analytics for Credit Value Adjustment (CVA), a measure of counterparty credit risk, can output up to 100 terabytes within 2 hours.
Furthermore, while many financial firms have upgraded their servers and storage devices, they likely still run 10 gigabit Ethernet (GbE) networks. Ingesting 100 terabytes over a 10GbE network will max out the bandwidth for well over 2 hours (roughly 22 hours on a single link, even at full line rate), creating bottlenecks for all other network processes and increasing latency for time-critical processes such as pre-trade pricing and portfolio risk analytics. This is best avoided.
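To put a number on that bottleneck, here is a rough back-of-the-envelope sketch. It assumes decimal units (1 terabyte = 10^12 bytes), a single 10GbE link, and no protocol overhead, so real transfers would be slower. The function name `transfer_hours` and the `efficiency` parameter are my own, purely for illustration.

```python
def transfer_hours(data_terabytes: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Hours needed to move `data_terabytes` over a link of `link_gbps`.

    `efficiency` is an assumed fraction of line rate actually achieved
    (1.0 means a perfect, overhead-free link, which is optimistic).
    """
    bits = data_terabytes * 1e12 * 8            # terabytes -> bits (decimal units)
    usable_bits_per_second = link_gbps * 1e9 * efficiency
    return bits / usable_bits_per_second / 3600  # seconds -> hours


if __name__ == "__main__":
    # 100 TB of CVA output over one 10GbE link at full line rate
    print(f"{transfer_hours(100, 10):.1f} hours")
```

Put differently, under the same assumptions a single 10GbE link can carry only about 9 terabytes in 2 hours, so data produced at the CVA rate cited above would pile up far faster than the network could drain it.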
Credit risk calculations such as CVA often rely on Monte Carlo simulations running on high-performance grid systems such as IBM Platform Symphony. One key feature supported by Platform Symphony is data awareness. Rather than shuffling data to and fro across the network, Platform Symphony intelligently schedules each calculation on the computational resources closest to the data it needs, which avoids creating bottlenecks.
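The general idea behind data-aware scheduling can be illustrated with a toy example. To be clear, this is not Platform Symphony's API or its actual algorithm; it is a minimal greedy sketch of the concept, and every name in it (`Host`, `Task`, `schedule`, the host and dataset names) is hypothetical. Each task goes to a host that already holds its dataset if one has a free slot, and only falls back to a remote host otherwise.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    slots: int                                   # free compute slots
    datasets: set = field(default_factory=set)   # datasets stored locally


@dataclass
class Task:
    task_id: str
    dataset: str                                 # dataset the task reads


def schedule(tasks, hosts):
    """Greedy placement that prefers hosts already holding the task's data.

    Returns {task_id: (host_name, is_local)}.
    """
    placement = {}
    for task in tasks:
        free = [h for h in hosts if h.slots > 0]
        if not free:
            raise RuntimeError(f"no free slot for task {task.task_id}")
        local = [h for h in free if task.dataset in h.datasets]
        # Prefer a data-local host; otherwise take the least loaded host.
        chosen = max(local or free, key=lambda h: h.slots)
        chosen.slots -= 1
        placement[task.task_id] = (chosen.name, task.dataset in chosen.datasets)
    return placement


if __name__ == "__main__":
    hosts = [Host("node-a", 2, {"trades"}), Host("node-b", 2, {"curves"})]
    tasks = [Task("t1", "curves"), Task("t2", "trades"),
             Task("t3", "curves"), Task("t4", "curves")]
    for task_id, (host, is_local) in schedule(tasks, hosts).items():
        print(task_id, host, "local" if is_local else "remote read")
```

In this run, the first three tasks land next to their data, and only the fourth has to read across the network because the host holding its dataset is full. A production scheduler weighs far more than this (priorities, data size, network topology), but the principle of moving the computation rather than the data is the same.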
In addition to data volumes, infrastructures now need to support a wide range of workloads: short- and long-running, compute- and data-intensive, scheduled and ad hoc, and so on. Upgrading storage, servers and networks to meet these needs is not enough. Plus it’s expensive. Firms need smart scheduling software such as Platform Symphony that can help optimize the use of IT resources given such diverse, unpredictable workloads.