Compiling collection-oriented languages onto massively parallel computers
This paper introduces techniques for compiling the nested parallelism of collection-oriented languages onto existing parallel hardware. Programmers of parallel machines encounter nested parallelism whenever they write a routine that performs parallel operations, and then want to call that routine itself in parallel. This occurs naturally in many applications. Most parallel systems, however, do not permit the expression of nested parallelism. This forces the programmer to exploit only one level of parallelism or to implement nested parallelism themselves. Both of these alternatives tend to produce code that is harder to maintain and less modular than code described at a higher level with nested parallel constructs. Not permitting the expression of nested parallelism is analogous to not permitting nested loops in serial languages. This paper describes issues and techniques for taking high-level descriptions of parallelism in the form of operations on nested collections and automatically translating them into flat, single-level parallelism. A compiler that translates a subset of a collection-oriented language,Paralation Lisp, into the instruction set of a flat virtual machine is presented. The instructions of the virtual machine are simple instructions on vectors of atomic values, and can be implemented on a broad class of target architectures, including vector machines, single-instruction parallel machines, and multiple-instruction parallel machines. We have implemented the instructions on the Connection Machine computer (CM-2), a massively parallel, single-instruction computer. As an illustration of the compiler techniques, the paper presents a quicksort example. The example has been tested on the CM-2. The speed of the compiled sort is only a factor of 3 slower than that of the fastest CM-2 sort.