22 February 2011

Serving a purpose: Why the ARM architecture is attracting attention in the server market


Power consumption in data centres is becoming an increasingly important issue – no surprise when these centres can house tens of thousands of servers. So there is a push towards the development of processors which offer higher performance with lower power consumption.

This emerging market is attracting the attention of a number of processor developers, including Marvell. "Marvell is targeting data centres that support cloud computing and which provide web services," said Linley Gwennap, principal analyst at The Linley Group. "In those cases, there would be some significant interest in what Marvell is doing."

The latest processor family from Marvell, the Armada XP (Extreme Performance), uses the low power ARM architecture to deliver a 1.6GHz quad core variant that has the processing performance needed for the enterprise market. According to Viren Shah, Marvell's senior director of marketing for embedded SoCs, the device family is aimed at networking, network attached storage, laser printers and the server market. "The server market is dominated by the x86 [architecture], but ARM is making forays into that segment – and the reason is mainly its low power," said Shah.

Power is the all important metric. "With the quad core design, our goal is to be sub 10W," said Shah. This is a noteworthy figure; according to Gwennap, Intel's Xeon processor consumes around 40W. "Even at that power level, Xeon is not designed as an SoC." A Xeon based design would also need a South Bridge chip and Ethernet controllers.

The Armada XP is built around Sheeva, an ARM based core that Marvell developed after gaining an architectural licence when it acquired Intel's XScale business in 2006.

Sheeva, which is compatible with the ARM v6 and v7 instruction sets, is a two issue design: each clock cycle, it can issue either two integer instructions or one integer and one floating point instruction. The core also has a limited instruction look ahead capability, which boosts code throughput by reordering the sequence in which instructions are processed.

There are five devices in the Armada XP family: two single core; two dual core; and a quad core, the MV78460 (see fig 1). All are pin compatible, but vary in the on chip peripherals, cache size and the width of the memory interface. Each ARM cpu on the MV78460 has a 32kbyte instruction cache and 32kbyte data cache, while the four cores share a 2Mbyte L2 cache. The other Armada XP SoCs have a 1Mbyte L2 cache. The L2 cache is doubled in size in the MV78460 to maintain processing performance.

Sheeva cores access external memory through a controller, with the processor supporting DDR3 memory clocked at 800MHz. The device has 40bit physical addressing that supports up to 1Tbyte of dram. Three Armada XP SoCs, including the MV78460, support a 32 or 64bit memory data interface, while the rest have a 32bit bus. The MV78460 includes two serial ATA (SATA), four PCI Express and four Gigabit Ethernet (GbE) media access controllers (MACs).
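The addressing figure above can be checked with simple arithmetic, and the same approach gives a rough ceiling on memory bandwidth. The sketch below is illustrative only: the 1Tbyte result follows directly from 40 address bits, but the bandwidth estimate assumes that the 800MHz figure is the memory clock and that data moves on both clock edges, which the article does not state. The function name is hypothetical.

```python
# Back-of-envelope figures for the memory subsystem described above.

ADDRESS_BITS = 40
addressable_bytes = 2 ** ADDRESS_BITS          # bytes reachable with 40 address bits
addressable_tbytes = addressable_bytes // 2 ** 40

# ASSUMPTION (not from the article): 800MHz is the DDR3 clock and the
# interface transfers data twice per clock cycle (double data rate).
DDR_CLOCK_HZ = 800_000_000
TRANSFERS_PER_CLOCK = 2


def peak_bandwidth_bytes_per_s(bus_width_bits: int) -> int:
    """Theoretical peak bandwidth for a given memory data bus width."""
    bytes_per_transfer = bus_width_bits // 8
    return DDR_CLOCK_HZ * TRANSFERS_PER_CLOCK * bytes_per_transfer


if __name__ == "__main__":
    print(f"40bit addressing reaches {addressable_tbytes} Tbyte")
    for width in (32, 64):
        gbytes = peak_bandwidth_bytes_per_s(width) / 1e9
        print(f"{width}bit interface: about {gbytes:.1f} Gbyte/s peak")
```

Under these assumptions, the 64bit interface has twice the theoretical peak of the 32bit one; sustained bandwidth in a real system would be lower.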

These 10 controllers share 16 6GHz serdes, so the PCI Express controllers could be configured as three x4 ports and one x1 port, while the chip could also support two SATA interfaces and a GbE interface. The Ethernet MAC supports the QSGMII interface, so all four GbE ports can be put onto a single serial link.

One design challenge with the MV78460, according to Marvell, was cramming four cores and the I/O peripherals onto an SoC. "We have multiple fast I/O that must coexist in the system," said Erez Alfiya, an application manager at Marvell. "Contention on one affects the whole system performance." To this end, an on chip crossbar switch connects the cores and the L2 cache, as well as the peripherals as they access DDR3 memory.
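The serdes sharing described above amounts to a lane budget: the example configuration of three x4 and one x1 PCI Express ports, two SATA interfaces and one serial GbE link uses exactly 16 lanes. The sketch below checks that arithmetic. It assumes that a PCI Express xN port occupies N lanes and that each SATA interface and each serial GbE link occupies one lane; the article implies this but does not spell it out, and the function names are hypothetical rather than anything from Marvell.

```python
# Lane budget check for a shared pool of serdes lanes.

TOTAL_SERDES_LANES = 16


def lanes_used(pcie_port_widths, sata_ports, gbe_serial_links):
    """Count lanes, ASSUMING one lane per PCIe lane, SATA port and GbE serial link."""
    return sum(pcie_port_widths) + sata_ports + gbe_serial_links


def fits_budget(pcie_port_widths, sata_ports, gbe_serial_links):
    """True if the requested configuration fits within the available lanes."""
    used = lanes_used(pcie_port_widths, sata_ports, gbe_serial_links)
    return used <= TOTAL_SERDES_LANES


if __name__ == "__main__":
    # Configuration quoted in the article: three x4 ports and one x1 port,
    # two SATA interfaces, and one serial link carrying GbE traffic.
    pcie = [4, 4, 4, 1]
    print(lanes_used(pcie, sata_ports=2, gbe_serial_links=1))
    print(fits_budget(pcie, sata_ports=2, gbe_serial_links=1))
```

With QSGMII carrying all four GbE ports on a single link, the Ethernet side needs only one of the 16 lanes in this accounting, which is what leaves room for 13 lanes of PCI Express.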

The interface between each core and the L2 cache is 128bit wide and includes a coherency unit, which ensures cache coherency by updating the cache whenever data is written to external memory. The crossbar switch also supports the various on chip blocks. "That is a lot of bandwidth we need to supply to the different I/Os," said Alfiya.

As an example, he cites the case of a GbE interface being used alongside two PCI Express ports. "You have traffic coming from the Ethernet port and from the two PCI Express ports. You need to balance the traffic and allow DDR access to the three interfaces," said Alfiya. "We have arbitration between the units because only one unit can access the DDR at any time." Other on chip peripherals include a security engine and support for VoIP via a time division multiplexing (TDM) interface.

The security engine can encrypt 2Gbit/s data streams using such algorithms as AES and 3DES. With the TDM interface, the SoC supports up to 32 channels of VoIP.

Marvell uses several power saving techniques to limit the MV78460's power consumption to 10W. The device can power down unused cpus and vary the clock frequency dynamically to adapt power consumption to processing load. In sleep mode, the cpus can be turned off while the L2 cache remains powered. In deep sleep mode, the contents of the L2 cache are saved in dram before the cache is powered down. The I/O ports then wake the cpus when data arrives.

The GbE MACs are Energy Efficient Ethernet compliant (see NE, 25 January 2011) and the device supports DDR3L memory. Because DDR3L operates at 1.35V instead of 1.5V, this can reduce power consumption by up to 20%.

The device can run one operating system in symmetric multiprocessing mode or asymmetrically.

The latter is less common for servers, but features more widely in embedded applications, where the cores can run separate operating systems.

"By integrating everything onto one chip, Marvell has designed a single chip quad core server," said Gwennap. This is different to Intel's approach, where two Xeon multicore chips can be put side by side – a so called two socket server configuration. "You cannot do that with the Marvell chip," said Gwennap. "Marvell has boiled the whole server down to a chip; if you want to scale it, you have to add a whole new, separate, server."

The Armada XP is currently implemented on TSMC's 40nm G cmos process, although the roadmap includes an eight core design at the 28nm node. In that design, the Sheeva cores will be clocked at 3GHz or more, while the SoC will support 10Gbit Ethernet and the PCI Express 3.0 specification.
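The DDR3L saving quoted earlier (1.35V instead of 1.5V, up to 20%) is consistent with a common rule of thumb. The sketch below assumes that dynamic power scales with the square of the supply voltage when frequency and switched capacitance are unchanged; that model is an assumption made here for illustration, not something the article or Marvell states, and the function name is hypothetical.

```python
# Rough estimate of the power saving from a lower memory supply voltage.


def dynamic_power_saving(v_old: float, v_new: float) -> float:
    """Fractional saving, ASSUMING dynamic power is proportional to V squared."""
    return 1.0 - (v_new / v_old) ** 2


if __name__ == "__main__":
    saving = dynamic_power_saving(v_old=1.5, v_new=1.35)
    print(f"Estimated saving: {saving:.0%}")
```

Under this simplified model, the saving comes out at just under a fifth, in line with the "up to 20%" figure. Static power and I/O termination are ignored, so the saving seen in a real system will differ.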

Marvell isn't the only company looking to bring ARM cores to the server market. ARM is seeding the process with a quad core reference design for the Cortex-A9 architecture, while the basic design for the Cortex-A15 is also quad core (see fig 2). One contender is Calxeda, in which ARM is an investor. It is using a quad core implementation of the Cortex-A9 but, because it is limited to four cores per die, it will probably need to use multiple chips to match the performance of Xeon processors.

But the startup is not providing details on the interconnect or the blocks it plans to integrate. "We are going to see a lot of quad core Cortex-A15 designs coming out in a year or so," Gwennap concluded.

Roy Rubenstein


This material is protected by Findlay Media copyright
See Terms and Conditions.
One-off usage is permitted but bulk copying is not.
For multiple copies contact the sales team.
