John David McCalpin
Research Scientist, High Performance Computing, TACC, University of Texas at Austin


Over the past 40 years, the high performance computing market has been characterized by a sequence of disruptive technologies – mainframes, vector supercomputers, minicomputers, massively parallel systems, RISC microprocessors, x86 microprocessors – each of which has significantly changed the price points and/or price/performance of available systems. The most recent disruptive technology (GPUs) has stalled after capturing only a fraction of the market, failing to displace x86 processors as the dominant technology. This talk reviews the underlying technology trends and economics of the HPC market and argues that the current stagnation is due to a common set of architectural assumptions that do not allow designers to address the fundamental technology issues. For example, as core counts increase, communication and synchronization are increasingly critical for performance, yet neither is actually a first-class concept in any of the current microprocessor architectures. Instead, they must be implemented as side effects of sequences of ordered memory transactions. Similarly, data motion through the memory hierarchy is increasingly critical for performance and power efficiency, yet it remains deliberately hidden from the user behind a functionally transparent (but performance-opaque) system of automatic caches. These architectural features, intended to provide ease of use, instead create a significant barrier to continued improvements in price/performance. The talk will conclude with a review of proposed architectural extensions that would allow designers to more directly address the challenges of communication, synchronization, data motion, and energy consumption.


John joined the Texas Advanced Computing Center (TACC) in 2009 as a Research Scientist in the High Performance Computing Group after a twelve-year career in performance analysis and system architecture in the computer industry. His industrial experience includes 3 years at SGI (performance analysis and optimization on the Origin2000 and performance lead on the architecture team for the Altix3000), 6 years at IBM (performance analysis for HPC, processor and system design for Power4/4+ and Power5/5+), and 3 years at AMD (accelerated computing technologies and performance analysis). Prior to his industrial career, John was an oceanographer (Ph.D., Florida State), spending six years as an assistant professor at the University of Delaware engaged in research and teaching on numerical simulation of the large-scale circulation of the oceans.