Fancy footwork

A software programmable processor is said to outperform the competition. By Graham Pitcher

The communications market has been one of the sectors worst affected by the downturn in the semiconductor industry, but there is general consensus that demand for bandwidth will grow strongly in the future. That bandwidth will play host to a range of services.

One of the problems with rapidly developing services and protocols is that hardware can equally rapidly go out of date. Manufacturers of digital signal processors and fpgas would claim their products are suited to handle this particular problem, but there is one company that believes its technology – a full software programmable parallel processor architecture – is far better suited to the application. Not only that, it claims the first product based on the architecture will offer ten times the sustained application performance of a dsp at a fraction of the price.

Aspex Technology (www.aspex.co.uk), with its roots in Brunel University, has been working on parallel processing for the last 10 years. Recently appointed ceo Paul Greenfield claimed that now is an exciting time for the company. "We've been working for the last two years on Linedancer, a software programmable vlsi processor which implements our associative string processor (ASP) architecture in silicon." According to Greenfield, Aspex has had 'three or four goes' at silicon, including a silicon on sapphire implementation. "But the current device has been produced on Chartered Semiconductor's 0.18µm cmos process."

Data hungry
In Greenfield's view, Linedancer is suited to any application where large amounts of data need to be processed in parallel. Major target applications include cellular infrastructure and secure networks, but Aspex is also looking with interest at the professional imaging market. The ASP architecture is modular and is implemented in silicon using ASProCores. These are programmable, homogeneous and fault tolerant single instruction, multiple data (simd) parallel processors, featuring a string of processing units, a software programmable intercommunication network and a vector data buffer. "The architecture has been proven with 450 man years of engineering," Greenfield claimed, "and we have around 20 patents covering such aspects as filtering and I/O. We know how to get simd working in the real world."

Linedancer is created using multiple iterations of a standard IP block. "It's a 2bit alu, step and repeated on a high speed internal bus," said Greenfield. "Each block is a proper cpu and has addressing, much like a content addressable memory."

John Lancaster, chief technology officer, expanded: "We have a string of processing elements controlled from a common instruction interface. Processors are connected via a flexible high performance communications network."
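The idea of a string of elements driven from one instruction interface can be sketched in a few lines of Python. This is only an illustration of the general simd and content addressing principle described above, not Aspex's instruction set: the class, the function names and the tag bit are all hypothetical, and the loops stand in for what the hardware would do across all elements at once.

```python
from dataclasses import dataclass


@dataclass
class ProcessingElement:
    data: int          # local data word (hypothetical layout)
    tag: bool = False  # set when this element matches a broadcast key


def broadcast_match(pes, key):
    """Broadcast a key; every element compares it with its own data."""
    for pe in pes:  # sequential here, parallel in a simd device
        pe.tag = (pe.data == key)


def broadcast_add(pes, operand):
    """Broadcast an add; only the elements tagged by the last match act on it."""
    for pe in pes:
        if pe.tag:
            pe.data += operand


string = [ProcessingElement(d) for d in [3, 7, 3, 9]]
broadcast_match(string, 3)   # tags the two elements holding 3
broadcast_add(string, 10)    # updates only the tagged elements
result = [pe.data for pe in string]
print(result)
```

The point of the sketch is that the controller never addresses an element by position: it selects elements by what they contain, then issues one instruction that all selected elements obey.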

In operation, an embedded 32bit Sparc core runs a high level program to generate low level instructions. These instructions in turn are fed through a low level ASP controller (LAC), to produce the ASProCore's instruction stream.

Linedancer's ASProCore features 4096 associative processing elements (APEs) in four blocks of 1024. Each APE has an alu, 200bit of associative data register and 64bit of primary data store. The APEs are connected via a high speed communications network, supporting both synchronous and asynchronous communication.
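Taking the figures quoted above at face value, the total storage held in the APEs can be worked out as follows. The variable names are illustrative, and the calculation assumes nothing beyond the per element figures given in the text.

```python
# Figures as quoted in the article
APES = 4 * 1024    # four blocks of 1024 associative processing elements
ASSOC_BITS = 200   # associative data register per APE
STORE_BITS = 64    # primary data store per APE

total_bits = APES * (ASSOC_BITS + STORE_BITS)
total_kib = total_bits / 8 / 1024

print(APES, total_bits, total_kib)
```

On these numbers the array holds a little over one million bits, or 132KiB, spread evenly across the 4096 elements.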

A data transfer router allows data to be passed between cascaded Linedancer chips using the LN and RN ports. These ports include all protocol signals for glueless connection of multiple Linedancers. A 64bit PCI port allows external devices to access the chip's internal resources and memory interfaces.

Lancaster noted that the architecture works in two ways: "Firstly, synchronous for kernel filtering applications. Secondly, asynchronously. If you found an irregular shape and wanted to colour it in – communicating with the elements inside that shape – it can do that in one or two clock cycles."
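The two modes Lancaster describes can be illustrated with a simplified, one dimensional sketch in Python. Both functions and their names are assumptions made for illustration. The first mimics a synchronous kernel step, in which every element combines its value with its neighbours'. The second mimics the 'colour in a shape' case, where only the connected elements inside a region respond. The software loops take many steps; the article's claim is that the hardware network does the equivalent in one or two clock cycles.

```python
def smooth(values):
    """Synchronous step: each element averages itself with both neighbours.

    Edge elements reuse their own value for the missing neighbour.
    """
    n = len(values)
    return [
        (values[max(i - 1, 0)] + values[i] + values[min(i + 1, n - 1)]) / 3
        for i in range(n)
    ]


def fill_region(inside, seed, colour, pixels):
    """Colour the contiguous run of 'inside' elements that contains seed."""
    if not inside[seed]:
        return list(pixels)
    left = seed
    while left > 0 and inside[left - 1]:
        left -= 1
    right = seed
    while right < len(inside) - 1 and inside[right + 1]:
        right += 1
    out = list(pixels)
    for i in range(left, right + 1):
        out[i] = colour
    return out


smoothed = smooth([0, 3, 0])
filled = fill_region([False, True, True, False, True], 2, 9, [0] * 5)
print(smoothed, filled)
```

Note that the fill stops at the first element outside the shape, so the isolated element at the far end is left untouched even though it is also flagged as 'inside'.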

Lancaster is, understandably, proud of the architecture. "It's conceptually elegant; not full of blocks that interact in a complex way. At the system architecture level, it's simple to put together and scalable."

He noted there is natural parallelism latent in many applications and said that's what the architecture exploits. "Modern dsps are trying to exploit full parallelism that may or may not exist. Someone makes it sequential, then dsps look for parallelism to speed things up. But how easy is it to map the problem to the device?"

Greenfield added: "Although it has a Sparc core, you can use an ARM or whatever you like on an external board; you just turn the Sparc off."

Alternative approach
Greenfield gave an example of where he believes Linedancer will find application in image processing. "Some people are currently using a high end fpga like a Stratix or Virtex, but also need a 'C64 or a TigerSharc to work with it. Many can't afford to go to an asic solution because they don't have the volume. We help because Linedancer sells for $500 in volume. Even though you might need eight Linedancers for an application, it'll still be cheaper than an fpga solution."

Multiple Linedancers can be cascaded and, said Greenfield, there is no need for software to be recompiled. "Compare this with an fpga solution," he continued, "where you have to do system partitioning."

Not only does Linedancer reduce component count and cost – by up to 90% in some cases, said Greenfield – but it also makes development quicker and cheaper. "With a Virtex, you have to simulate, place and route and so on. Linedancer is 100% software programmable, with no need for simulation, synthesis or place and route. Potentially, you can replace a vhdl engineer with someone writing in C or C++."

Greenfield claims that, in raw performance terms, Linedancer is faster than a 'C64 dsp. "So you can afford to allow engineers to use a high level language and, as the chip gets smaller, take advantage of increased performance."

Although Aspex' main target market at the moment is professional imaging, Greenfield claims Linedancer is already being used by the communications industry and one particular application is xml acceleration.

Lancaster said that xml processing can be looked at as two phases: parsing and translating. "Parsing is associative, whilst translating is changing meta tags; essentially a look up operation."
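A minimal sketch of the two phases, written in Python, might look like this. It is not Aspex's or Xaplica's method: the tag map is hypothetical, and the regular expression only recognises simple tags without attributes, so it is nowhere near a conforming xml parser. It does show the split Lancaster describes, with a pattern match to find the tags and a table look up to translate them.

```python
import re

# Hypothetical mapping from source tags to target tags
TAG_MAP = {"title": "h1", "para": "p"}

# Matches simple opening or closing tags such as <title> or </title>
TAG_PATTERN = re.compile(r"<(/?)([A-Za-z][\w-]*)>")


def translate(xml_text, tag_map):
    """Find each tag (the 'parse' phase) and swap its name via a look up."""
    def swap(match):
        slash, name = match.group(1), match.group(2)
        return f"<{slash}{tag_map.get(name, name)}>"
    return TAG_PATTERN.sub(swap, xml_text)


converted = translate("<title>Hi</title><para>x</para>", TAG_MAP)
print(converted)
```

Tags missing from the map pass through unchanged, which keeps the look up phase independent of how the tags were found.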

Greenfield noted that xml takes a lot of cpu time to parse. "A number of companies are building xml chips for parsing data before it gets to an application. But Canadian based Xaplica has built a board with a number of Linedancer chips which parses the xml then passes it to the right application." Xaplica claims the board can save a company the cost of a webserver and middleware. Robert Arn, Xaplica's ceo, noted: "The programmability of Aspex' processors is a very attractive alternative to traditional approaches using asics or fpgas." Xaplica also claimed that it could build the application in a fifth of the time because the processor is software programmable.

Lancaster concluded: "Web services, content switches and routers, layer 7 switches trying to route packets based on content, a firewall inspecting addresses and ports – all these are naturally associative tasks and, whether you're looking for groups of pixels or checking text for certain phrases, it's pattern recognition."

This material is protected by MA Business copyright. See Terms and Conditions. One-off usage is permitted but bulk copying is not. For multiple copies contact the sales team.
