11 October 2010
26 into 1 does go
Multicore chip design targets basestation market 'sweet spot'.
The 26 risc and dsp cores that make up the Transcede family may be eye catching, but it is the overall design – including custom hardware, on chip memory and a variety of I/O – that makes this picocell on a chip noteworthy.
Mindspeed had been eyeing the cellular basestation chip market for years, but was unclear how best to enter a mature market led by Texas Instruments and Freescale. The emergence of the Long Term Evolution (LTE) and WiMAX wireless standards, based on orthogonal frequency division multiple access, provided Mindspeed with the technology shift needed for market entry. The result is Transcede, an SoC aimed at picocell and basestation equipment designs for the W-CDMA, LTE and WiMAX wireless standards.
Two Transcede devices have been announced: the 4000 and 4020. The devices share the same basic architecture – two risc processors, 10 dsp cores and 10 Mindspeed application processors. Where they differ is clock speed: the 4000 is clocked at 600MHz while the 4020 is at 750MHz.
Mindspeed had two goals when it started the design three years ago. One was to develop a scalable design that would continue to support the software as it evolved. The second was to address picocell applications, which Mindspeed sees as a market 'sweet spot', using an SoC.
According to Mindspeed, a picocell supports between 100 and 200 cellular users. In terms of processing performance, this equates to one sector of a wireless cell with a 20MHz band and 2x2 multiple input, multiple output (MIMO), using two receive and two transmit antennas, or to three sectors, each at 10MHz. In terms of data throughput, the design can handle 250Mbit/s full duplex.
"The key design challenge was to get so much processing onto a SoC – layer one, layer two processing and all the hardware acceleration – in a cost effective package at the right power [consumption] point," said Alan Taylor, marketing director for Mindspeed's multiservice access business unit.
Because the SoC must support a variety of wireless standards, it is important that the design is programmable. To achieve this, Mindspeed licensed two cores: the ARM Cortex-A9 and the CEVA-X1641 dsp.
The ARM core was chosen for its high processing performance and low power characteristics. "We wanted to make sure that it was a low power solution," said Taylor. "If you look at pico basestations, these can be colocated with antennas such that power is a key aspect of the design." The same thinking applies to the choice of the CEVA dsp, a core that is also used within cellular handsets.
The design comprises four main blocks: the system cluster; signal processing units (SPUs); I/O unit; and the expansion unit.
The system cluster (see fig 1) houses two risc processors: one a dual ARM core, the other a quad core. The dual core runs a real time dispatcher that assigns tasks to the chip's 10 SPUs. In a basestation design with multiple Transcedes, one device's dual core can be assigned as the master that oversees all others. "The master dispatcher can send tasks not only to the local dsps but also to adjacent Transcedes," said Taylor.
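The dispatcher's role can be illustrated with a minimal sketch. Mindspeed has not published its dispatch policy, so the least-loaded placement used here, along with the `Dispatcher` class and its `cost` parameter, are assumptions made purely for illustration; only the figure of 10 SPUs per device comes from the design described above.

```python
SPUS_PER_DEVICE = 10  # from the article: each Transcede has 10 SPUs


class Dispatcher:
    """Toy master dispatcher. Least-loaded placement is an assumption,
    not Mindspeed's published policy."""

    def __init__(self, num_devices=1):
        # One load counter per (device, spu) pair; device 0 is the local chip,
        # higher numbers stand in for adjacent Transcedes.
        self.load = {
            (device, spu): 0
            for device in range(num_devices)
            for spu in range(SPUS_PER_DEVICE)
        }

    def dispatch(self, cost):
        """Place a task of the given cost on the least-loaded SPU and return
        the (device, spu) pair it was sent to."""
        target = min(self.load, key=self.load.get)  # ties go to the earliest pair
        self.load[target] += cost
        return target


if __name__ == "__main__":
    master = Dispatcher(num_devices=2)
    for cost in (5, 3, 3, 1):
        print(master.dispatch(cost))
```

With two devices, the sketch fills the local SPUs first and only then spills work to the neighbouring chip, which is one plausible reading of a master that "oversees all others".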
The quad core risc processor is used for radio scheduling. For LTE, this involves coordinating various handsets within a cell every millisecond. "This is a computationally complex task scheduling 100 users, a task that is recalculated every millisecond," said Taylor. "You are looking at the various class of service for each user, how much data you are transferring to them and you have to maintain the latency with the handset."
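To give a feel for what is recalculated every millisecond, the sketch below grants radio resources for a single 1ms interval. It is not Mindspeed's scheduler: the strict priority ordering, the `User` fields and the capacity figure passed to `schedule_tti` are all hypothetical, chosen only to show class of service, queued data and latency interacting.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    priority: int        # hypothetical class of service: lower is more urgent
    queued_blocks: int   # resource blocks of data waiting for this user
    waited_ms: int = 0   # intervals since the user was last served


def schedule_tti(users, capacity):
    """Grant resource blocks for one 1ms interval.

    Users are served by priority, then by how long they have waited.
    Returns a dict mapping user name to blocks granted.
    """
    grants = {}
    for user in sorted(users, key=lambda u: (u.priority, -u.waited_ms)):
        if capacity == 0:
            break
        granted = min(user.queued_blocks, capacity)
        if granted > 0:
            grants[user.name] = granted
            user.queued_blocks -= granted
            capacity -= granted
            user.waited_ms = 0
    # Anyone with data left who received nothing has waited one more interval.
    for user in users:
        if user.name not in grants and user.queued_blocks > 0:
            user.waited_ms += 1
    return grants


if __name__ == "__main__":
    cell = [User("voice", 0, 10), User("video", 1, 60), User("bulk", 2, 80)]
    for interval in range(2):
        print(interval, schedule_tti(cell, capacity=100))
```

A production scheduler would also weigh channel quality and fairness across 100 or more users, which is why the task warrants a dedicated quad core processor.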
Another key functional block developed by Mindspeed is the forward error correction (FEC) hardware accelerator, located within the system cluster. This implements convolutional turbo codes and Viterbi coding, as well as encryption algorithms such as DES, triple-DES and the SNOW-3G used for LTE. "The FEC block itself is a complex engine," said Taylor. "It does the entire FEC functions, the mapping and interleaving; there are a lot of functions besides the coders."
The SPU cluster (see fig 2) comprises 10 SPUs, each of which includes the X1641 dsp core and Mindspeed's application processor, known as the filter processor. The dsp core modulates radio signals and performs such basestation tasks as beam forming and interference mitigation. The filter processor implements those tasks best suited to hardware, such as the FFTs and inverse FFTs used for OFDM. "The filter processor is programmable to a certain extent – it is microcoded so it can do other functions besides the FFT, such as for W-CDMA, the Rake and G-Rake filtering," said Taylor.
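The FFT and inverse FFT pair at the heart of OFDM can be shown in a few lines. The sketch below is a textbook radix-2 transform in plain Python, not the filter processor's microcode; the eight subcarriers and the QPSK symbol values are assumptions chosen for illustration.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT; the input length must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def ifft(spectrum):
    """Inverse FFT built from the forward transform via conjugation."""
    n = len(spectrum)
    return [v.conjugate() / n for v in fft([complex(s).conjugate() for s in spectrum])]


if __name__ == "__main__":
    # Hypothetical QPSK symbols, one per subcarrier.
    symbols = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j, 1 + 1j, -1 - 1j, 1 - 1j, -1 + 1j]
    time_domain = ifft(symbols)   # transmitter: subcarriers to time samples
    recovered = fft(time_domain)  # receiver: time samples back to subcarriers
    print(max(abs(a - b) for a, b in zip(symbols, recovered)))
```

The transmitter applies the inverse FFT to turn per-subcarrier symbols into time samples, and the receiver applies the FFT to recover them. It is a fixed, regular computation of the kind the article describes as best suited to hardware.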
The third main block, the I/O unit, interfaces the SoC to the basestation's antenna or up to seven other Transcede ics linked in a ring. The I/O unit has a PCI Express Gen II controller (four lanes at 5Gbit/s), two Rapid I/O controllers (each one four lanes at 5Gbit/s) and six CPRI controllers (each 6Gbit/s). There are 10 serdes, such that the three interfaces can be mixed to use up to 10 lanes. "These serdes draw quite a bit of power and you can shut down those that aren't used," said Taylor.
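The lane budget is easy to check by hand, but a short sketch makes the trade-off explicit. Two assumptions are made here that the article does not confirm: that each CPRI controller occupies a single serdes lane, and that an enabled PCI Express or Rapid I/O controller always uses all four of its lanes. The function and table names are illustrative only.

```python
TOTAL_SERDES = 10  # from the article

# Lanes per enabled controller. The CPRI figure of one lane is an assumption.
LANES_PER_CONTROLLER = {"pcie": 4, "srio": 4, "cpri": 1}

# Controllers available on the device, from the article.
MAX_CONTROLLERS = {"pcie": 1, "srio": 2, "cpri": 6}


def lanes_needed(config):
    """Return the serdes lanes used by a mix such as {"pcie": 1, "cpri": 3}."""
    total = 0
    for kind, count in config.items():
        if kind not in MAX_CONTROLLERS:
            raise ValueError(f"unknown interface: {kind}")
        if not 0 <= count <= MAX_CONTROLLERS[kind]:
            raise ValueError(f"{kind}: count must be 0..{MAX_CONTROLLERS[kind]}")
        total += count * LANES_PER_CONTROLLER[kind]
    return total


def idle_serdes(config):
    """Return how many serdes could be shut down, or raise if the mix does not fit."""
    used = lanes_needed(config)
    if used > TOTAL_SERDES:
        raise ValueError(f"mix needs {used} lanes but only {TOTAL_SERDES} serdes exist")
    return TOTAL_SERDES - used


if __name__ == "__main__":
    print(idle_serdes({"pcie": 1, "srio": 1, "cpri": 2}))
    print(lanes_needed(MAX_CONTROLLERS))
```

Under these assumptions, enabling every controller at once would need 18 lanes, so a design has to choose which interfaces share the 10 serdes.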
Finally, the expansion unit is a collection of I/O, including SPI and i2c buses, two Gigabit Ethernet interfaces that include IEEE1588 v2 clock recovery used for basestation synchronisation, and a classifying engine that supports the Metro Ethernet's eight classes of service. LTE and WiMAX are the first wireless standards that adopt an all IP core network.
According to Taylor, designing the internal bus and on chip memory to avoid system bottlenecks has been a huge challenge. "Getting the MIPS is easy; what is a challenge is understanding the data flow through the SoC and ensuring you have the right bus and internal memory hierarchy," he said. Transcede devices use 7Mbyte of on chip memory, accounting for 40% of the die area. "If you over engineer and end up including too much on chip memory, you've got a less competitive device," he added.
As well as the dynamic scheduler, Mindspeed has developed a static scheduler that bounds the worst case processing load. Mindspeed also has a profiling tool that shows how the scheduler will dispatch tasks, which can be used for task optimisation before any code is written. Once the design runs on chip, the device looks for a static schedule and executes it if one exists. If not, it uses the dynamic schedule. "The assumption is that the dynamic schedule will always work within the worst case boundary set by the static schedule," said Taylor.
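The fallback between the two schedulers can be sketched as a simple lookup. Everything in this example is hypothetical: the `static_table` keyed by workload name, the round robin stand-in for the dynamic scheduler, and the task names are illustrations, since the article does not describe how either schedule is represented.

```python
NUM_SPUS = 10  # from the article


def dynamic_schedule(tasks, num_spus=NUM_SPUS):
    """Stand-in dynamic scheduler: spread tasks round robin across the SPUs."""
    return {task: index % num_spus for index, task in enumerate(tasks)}


def select_schedule(workload, tasks, static_table):
    """Use the precomputed static schedule if one exists, else fall back.

    Returns a (kind, schedule) pair, where the schedule maps task to SPU.
    """
    if workload in static_table:
        return "static", static_table[workload]
    return "dynamic", dynamic_schedule(tasks)


if __name__ == "__main__":
    # Hypothetical table produced offline, for example with a profiling tool.
    static_table = {"lte_20mhz": {"fft": 0, "demap": 1, "decode": 2}}
    tasks = ["fft", "demap", "decode"]
    print(select_schedule("lte_20mhz", tasks, static_table))
    print(select_schedule("wimax_10mhz", tasks, static_table))
```

The sketch does not enforce Taylor's assumption that the dynamic schedule stays within the static worst case; checking that bound would need timing data the article does not give.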
Mindspeed is working with two leading basestation equipment makers, with the most interest in LTE pico basestations. Although early LTE basestation trials used existing technology, system vendors recognise they need to use SoCs to replace network processors, fpgas and dsps if they are to reduce system cost.
As for the future, Taylor hints that the next Transcede developments, planned for 2011, will use more advanced CEVA and ARM cores and a more advanced cmos process than the current 40nm.