Tutorial: Cell BE Processor™: Programming Tools and Techniques

Organized by: Michael Perrone
IBM Research


The Cell BE Processor™ (Cell) incorporates several architectural features that enable it to support highly parallel, compute-intensive codes. Though targeted at multimedia and gaming applications, these features also enable Cell-based blades to handle a broad class of applications efficiently. This first-generation Cell processor implements a collection of heterogeneous processors on a single chip: a POWER processor with two levels of cache, and eight attached streaming processors, each with its own local memory and a globally consistent DMA engine. In addition to processor-level parallelism, each processing element has Single Instruction Multiple Data (SIMD) units that can each process from 2 double-precision floats up to 16 chars per cycle.

Programming techniques that unleash the power of this processor are key to attaining the high compute performance of which it is capable, but such techniques may require programming concepts that differ from those used on conventional processors. In this tutorial we will discuss programming tools, techniques and tips for programming Cell.

The tutorial material will be presented in two parts. In the first part we will focus on techniques for programming applications on Cell. We will cover the Cell architecture and its functional components, and discuss some of the more popular programming "models" such as function offload, streaming and SMP. We will then discuss manual approaches to getting the best SIMD code, overlapping computation and communication, and scheduling the instruction stream by hand. In the second part we will focus on mechanics and describe the tools available for programming the Cell Broadband Engine Architecture (CBEA), including the compiler, simulation tools, profiling and debugging. We will include approaches that exploit the available compilers to get more automatic support for simdization, parallelization and overlays. Finally, we will do code walk-throughs.
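As a small illustration of the SIMD widths quoted in the abstract above, the following sketch works out how many elements of each type fit in one vector register. It assumes a 128-bit (16-byte) register, which is consistent with the "2 doubles up to 16 chars" range but is not stated in the abstract. The names `VEC_BYTES`, `lanes_for`, `vec4f` and `vec4f_add` are hypothetical and are not part of any Cell SDK; the scalar loop only models what a single SIMD add does across lanes.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 128-bit (16-byte) SIMD registers, matching the
   "2 doubles up to 16 chars" range quoted in the abstract. */
#define VEC_BYTES 16

/* Number of elements of the given size that fit in one vector register. */
static size_t lanes_for(size_t elem_size)
{
    assert(elem_size > 0 && elem_size <= VEC_BYTES);
    return VEC_BYTES / elem_size;
}

/* Hypothetical scalar model of a 4-lane single-precision vector. */
typedef struct {
    float lane[4];
} vec4f;

/* Models one SIMD add: every lane is added independently.
   On real SIMD hardware this loop would be a single instruction. */
static vec4f vec4f_add(vec4f a, vec4f b)
{
    vec4f r;
    for (size_t i = 0; i < 4; ++i) {
        r.lane[i] = a.lane[i] + b.lane[i];
    }
    return r;
}
```

Called from a program's own `main`, `lanes_for(sizeof(double))` gives the low end of the quoted range and `lanes_for(sizeof(char))` the high end, on platforms where a double occupies 8 bytes.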

Target Audience:

This tutorial is intended for those interested in emerging heterogeneous architectures, such as the CBEA, and especially in programming them for high performance.

Relevant links:

Cell BE at IBM AlphaWorks

Cell BE Processor SDK