CPC 2006


Funded by the Vicerreitoría de Investigación of the University of A Coruña and the Ministry of Education and Science of Spain




Register Allocation for VLIW DSP Processors with Irregular Register Files

Yung-Chia Lin, Yi-Ping You and Jenq Kuen Lee
National Tsing Hua University, Hsinchu, Taiwan


A variety of new register file architectures have been developed for embedded processors in recent years, allowing hardware designs to achieve lower power dissipation and smaller die size than traditional unified register file structures. However, because of their more specialized access features and irregular constraints, processors that incorporate such novel register file organizations need code generation, register allocation, and instruction scheduling schemes better suited to them than conventional compilation techniques in order to attain optimal performance. This paper presents a novel register allocation scheme for a clustered VLIW DSP processor, the Parallel Architecture Core (PAC) DSP, which is designed with distinctively banked register files whose port access is highly restricted. The PAC DSP uses a heterogeneous design with a single scalar unit (for lightweight arithmetic, address calculation, and program flow control) plus two data stream processing clusters, each of which contains a load/store unit paired with an ALU/MAC unit with powerful SIMD capabilities. Every unit in the clusters is associated with three different types of register files, each with its own access modes and constraints, and the scalar unit has its own register file. The main peculiarity of the PAC DSP register file architecture is a so-called ping-pong register file structure, which is divided into two banks that can only be accessed in a mutually exclusive way, and which serves as a semi-centralized register file among clusters and among the functional units within a cluster.
This design is considered to decrease power consumption because it needs fewer port connections. However, the clustered design makes register access across clusters an additional issue, and the switched-access nature of the ping-pong register file motivates us to investigate register assignment further in order to increase instruction-level parallelism. We propose a heuristic algorithm, named ping-pong aware local favorable register allocation, to obtain a register allocation that makes good use of the irregular register file architecture of the PAC DSP. The algorithm takes into account the characteristics of accessing the different register files and attempts to minimize the penalty caused by interference between register allocation and instruction scheduling, while retaining desirable parallelism under the ping-pong register constraints and inter-cluster overheads. Experiments were conducted with a compiler under development for the PAC DSP, based on the Open Research Compiler (ORC). The results indicate that compilation with the proposed approach delivers a significant performance improvement, comparable to that of a simulated annealing approach, which is considered a near-optimal but exhaustive solution.
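
To illustrate why bank assignment interacts with instruction scheduling, the following is a minimal sketch of the mutually exclusive access rule, not the authors' algorithm. It assumes a simplified model that the abstract does not spell out: within one cycle, each functional unit of a cluster may touch only one of the two ping-pong banks, and no two units may touch the same bank. The unit names (`ls`, `alu`), the register names, and both helper functions are hypothetical.

```python
def bundle_is_legal(bundle, bank_of):
    """Check one VLIW bundle against the assumed ping-pong rule.

    bundle:  dict mapping a unit name to the set of registers it
             accesses in this cycle (hypothetical representation).
    bank_of: dict mapping each register to its bank, 0 or 1.
    """
    claimed = {}  # bank -> unit that owns it in this cycle
    for unit, regs in bundle.items():
        banks = {bank_of[r] for r in regs}
        if len(banks) > 1:
            # Assumption: a unit is switched to a single bank per cycle.
            return False
        for bank in banks:
            # Assumption: two units may not share a bank in one cycle.
            if claimed.setdefault(bank, unit) != unit:
                return False
    return True


def count_illegal_bundles(schedule, bank_of):
    """Count the bundles that a given bank assignment makes illegal."""
    return sum(not bundle_is_legal(b, bank_of) for b in schedule)


# Two cycles in one cluster: a value is loaded into `a` while the ALU
# works on `b`, then the ALU consumes `a` while `c` is being loaded.
schedule = [
    {"ls": {"a"}, "alu": {"b"}},
    {"ls": {"c"}, "alu": {"a"}},
]

naive = {"a": 0, "b": 0, "c": 0}       # everything in bank 0
pingpong = {"a": 0, "b": 1, "c": 1}    # alternate banks between units

print(count_illegal_bundles(schedule, naive))     # 2
print(count_illegal_bundles(schedule, pingpong))  # 0
```

Under this toy model, the naive assignment would force both bundles to be split across extra cycles, whereas the alternating assignment keeps the load/store unit and the ALU/MAC unit busy in parallel. A real allocator also has to account for the other register file types and for inter-cluster transfers, which this sketch ignores.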

