Computerworld

Coders must reprogram how they write for Wall Street

Parallel programming knowledge is becoming a must-have skill
High Performance Computing facility

As high-performance computing (HPC) becomes more important in helping financial services companies deal with a rising tsunami of data, there's growing angst on Wall Street about a dearth of skilled programmers who can write for multicore chip architectures and parallel computing systems.

"In high-performance computing, there is a major sea change that's been happening... and it's getting more dramatic," said Jeffrey Birnbaum, chief technology architect at Bank of America/Merrill Lynch. "With the sea change that's coming -- parallel computing, multicore processors -- the skill of the programmer matters more."

Given that the financial services industry is often an early adopter of technology that eventually trickles down into other markets, the skills Wall Street coders need now are likely to be the same ones that other coders will need in the future.

Birnbaum talked about why programmers should hone their skills during a presentation at the High Performance Computing in Financial Markets Conference in New York this week.

Birnbaum stressed that programmers need to do more than simply tackle languages, such as Assembly, that can take advantage of parallel computing. They need to be more skillful with even more traditional programming languages.

"All things being equal, sure, there is a difference in performance [between languages]," Birnbaum said. "So your best guy who programs in Assembly will be marginally better than your greatest C or C++ guy. And [he] will easily beat the best Java guy. But that's not the point. Bad programmers create bad code. It doesn't matter what language they use."

The rise of distributed computing

About five years ago, Moore's Law ran into a dead end in terms of CPUs that could keep up with application performance requirements, according to Charles King, principal analyst at research firm Pund-IT. That led to the emergence of multicore processors and parallel, or distributed, computing -- the ability to spread a complex programming task among many CPUs.

Most programmers have yet to embrace parallel programming, with as many as 98 per cent still relying on serial coding methods, King said. The main issue is that programming for parallel architectures is complicated.

In the financial services industry, a parallel computing architecture often relies on hundreds or even thousands of x86 servers all working on a single data set that has been divided up to spread the workload. As the work is completed, the data set must be put back together in an automated fashion.
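The divide, process and recombine pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not any bank's actual system: the function names (`split`, `partial_sum`, `total_value`) and the trade data are hypothetical, and a thread pool stands in for the separate servers. In CPython, real speedups on CPU-bound work would need a process pool or separate machines rather than threads.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Divide data into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks each take one additional item.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def partial_sum(chunk):
    """Work done independently on one chunk.

    Hypothetical stand-in for a pricing or risk calculation:
    here it just totals price * quantity.
    """
    return sum(price * qty for price, qty in chunk)


def total_value(trades, workers=4):
    """Scatter the trades across workers, then gather and merge the results."""
    chunks = split(trades, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves chunk order, so the merge step is deterministic.
        partials = list(pool.map(partial_sum, chunks))
    return sum(partials)


if __name__ == "__main__":
    trades = [(10.0, 3), (20.0, 1), (5.0, 10), (2.5, 4)]
    print(total_value(trades, workers=2))  # prints 110.0
```

The merge step here is a simple sum. In practice, putting the data set "back together" is often the harder part, since partial results may need ordering, deduplication or reconciliation.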

Whether for business uses such as predictive analysis, financial modeling, or for business intelligence through transactional databases, the only way to get full performance in HPC architectures that use multicore chips is through parallel programming, King said.

Recently, makers of graphics processing units (GPUs), such as Nvidia, and of field-programmable gate arrays (FPGAs) have pitched their products as simpler platforms to code for, making a parallel programmer's job less arduous.

Birnbaum disagreed that GPUs and FPGAs are better platforms for parallel computing. He said the comparisons are often unfair because parallel code running on GPUs and FPGAs is typically measured against serial code running on multicore CPUs.

"What if I wrote code to run on many [CPU] cores? That would be a fair comparison," he said. "Too many people are rewriting stuff with parallel algorithms for GPUs and FPGAs claiming performance advantages. CPUs are still much faster than most programmers know."

Moving away from relational databases

Here's something else programmers should keep in mind: They don't necessarily need to code for relational databases, which the financial services industry has used for years. Birnbaum said he sees movement in the industry away from SQL.

Bank of America, for instance, is building a NoSQL database for the same reason Facebook chose to use Cassandra: It can scale across many nodes, has high availability and is decentralized.

"So they're basically fancy blob stores," he said. "It comes back to what's my problem. If it's that I'm doing high-performance computing, and really all I need to do is instantiate some code on various nodes and then I need to get access to the data, why does it have to be relational?"
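To make the "blob store" idea concrete, here is a minimal sketch of a key-value store that spreads opaque blobs across several nodes. Everything in it is an assumption for illustration: the `BlobStore` class is hypothetical, the "nodes" are in-memory dictionaries, and keys are placed by simple hash-modulo partitioning. Cassandra itself uses consistent hashing, which copes far better with nodes joining and leaving, and nothing here reflects what Bank of America is building.

```python
import hashlib


class BlobStore:
    """Toy key-value store that partitions blobs across nodes by key hash."""

    def __init__(self, node_count):
        # Each dictionary stands in for one storage node.
        self.nodes = [{} for _ in range(node_count)]

    def _node_for(self, key):
        # A stable hash, so the same key always maps to the same node.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.nodes)

    def put(self, key, blob):
        """Store an opaque blob (bytes) under a string key."""
        self.nodes[self._node_for(key)][key] = blob

    def get(self, key):
        """Return the blob for `key`, or None if it was never stored."""
        return self.nodes[self._node_for(key)].get(key)


if __name__ == "__main__":
    store = BlobStore(node_count=4)
    store.put("portfolio:42", b"serialized positions")
    print(store.get("portfolio:42"))
```

There are no tables, joins or schemas: a lookup goes straight from the key to one node, which is what lets this kind of store scale out by adding nodes.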

Another hindrance facing veteran programmers is their own experience. Because they're steeped in serial programming, the prospect of moving to parallel programming looks daunting. In fact, Birnbaum said, high school and college-age coders have a greater advantage over their older counterparts today than at any time in computing history.

"Their minds aren't constricted," he said.

Birnbaum also said programmers should be multilingual so they can more easily create hybrid programs that take advantage of separate programming components. "There's nothing that says, especially in a distributed system, that all components have to be in the same language. That said, I think there are some languages that got this right more than others."

Birnbaum said languages such as Python are best suited for integration with other code like C or C++ without any performance penalty.
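One common route for mixing Python with C is the `ctypes` module in Python's standard library, which loads a compiled C library and calls its functions directly. The sketch below calls the C standard library's `strlen`. It assumes a POSIX system such as Linux or macOS, the wrapper name `c_strlen` is hypothetical, and it is only one of several integration options, not necessarily the approach Birnbaum had in mind.

```python
import ctypes
import ctypes.util

# Locate and load the C standard library. On POSIX systems, CDLL(None)
# falls back to the symbols already loaded into the current process,
# which include libc.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text):
    """Return the byte length of `text` as computed by C's strlen."""
    return libc.strlen(text)


if __name__ == "__main__":
    print(c_strlen(b"Wall Street"))  # prints 11
```

Declaring `argtypes` and `restype` matters: without them, `ctypes` assumes an `int` return type and cannot check the arguments being passed across the language boundary.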

The changing needs of the financial services industry -- and the variety of new skills needed by programmers -- mean that coders have to evolve as fast as the systems on which they work, he said. The bottom line is that programmers who don't adapt could one day find themselves out of a job.

"The underlying machinery is changing, and if you don't relearn how to take advantage of it, you're going to be the person they write about in the New York Times about the people in their 60s who can't find a job because their skills are outdated," Birnbaum said.