CPC 2006 CPC 2006

Call for Participation
Current List of
Program (List of Talks)
Registration Form
Travel Information

Funded by the Vicerreitoría de Investigación of the University of A Coruña and the Ministry of Education and Science of Spain

Universidade de A Coruña Ministerio de
		      Educación y Ciencia

Optimizing Sorting with Genetic Algorithms

Xiaoming Li, María Jesús Garzarán and David Padua
Department of Computer Science, University of Illinois at Urbana-Champaign, USA

The growing complexity of modern processors has made the generation of highly efficient code increasingly difficult. Manual code generation is very time consuming, but it is often the only choice since the code generated by today's compiler technology often has much lower performance than the best hand-tuned codes. A promising code generation strategy, implemented by systems like ATLAS, FFTW, and SPIRAL, uses empirical search to find the parameter values of the implementation, such as the tile size and instruction schedules, that deliver near-optimal performance for a particular machine. However, this approach has only proven successful on scientific codes whose performance does not depend on the input data. In this paper we study machine learning techniques to extend empirical search to the generation of sorting routines, whose performance depends on the input characteristics and the architecture of the target machine.

We build on a previous study that selects a "pure" sorting algorithm at the outset of the computation as a function of the standard deviation. The approach discussed in this paper uses genetic algorithms and a classifier system to build hierarchically-organized hybrid sorting algorithms capable of adapting to the input data. Our results show that such algorithms generated using the approach presented in this paper are quite effective at taking into account the complex interactions between architectural and input data characteristics and that the resulting code performs significantly better than conventional sorting implementations and the code generated by our earlier study. In particular, the routines generated using our approach perform better than all the commercial libraries that we tried including IBM ESSL, INTEL MKL and the C++ STL. The best algorithm we have been able to generate is on the average 26% and 62% faster than the IBM ESSL in an IBM Power 3 and IBM Power 4, respectively.

Back to the Workshop Program 

Please contact our webadmin with any comments or changes.