10 July 2012
Pushing packet performance: How IP networks are handling ever more data
Internet Protocol (IP) core routers are tasked with moving large amounts of IP traffic across the network backbone, unlike edge routers, which typically aggregate a variety of operator services, such as residential broadband, cellular traffic backhaul and business connectivity. IP traffic in general is growing rapidly: analysts estimate the amount of data is growing by 30% a year.
So it's no surprise that equipment developers are focusing on performance. Looking to handle the vast amounts of data flowing around networks, Alcatel-Lucent's 7950 core router family uses its latest 400Gbit/s FP3 packet processing chipset.
The 7950 core router family complements Alcatel-Lucent's existing edge router portfolio. But according to the company, the platform is designed not just to tackle growing IP traffic, but also to process traffic types associated with edge routing.
"The 7950 is designed with the understanding that the core is expanding in terms of what it does," said to Houman Modarres, director of product marketing with Alcatel-Lucent's IP division. "There is going to be core like scale with some service level functionality." One example trend is the distribution of digital content, such as video, across the network to be closer to end users. Another is the advent of cloud computing and the associated growth in symmetric traffic streams.
The FP3 chipset, announced in 2011, is already in use in Alcatel-Lucent's edge routers, but the 7950 is its first router platform family to fully exploit the chipset's capability. The high-end 7950 XRS-40, available early in 2013, will have a 32Tbit/s system capacity and support up to 160 100Gbit Ethernet interfaces.
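The two XRS-40 figures line up if the system capacity is counted in both directions. That reading is an assumption here, not something the article states. A quick check in Python, with a hypothetical helper name:

```python
# Assumption: the quoted system capacity counts ingress plus egress traffic.
PORTS = 160            # 100Gbit Ethernet interfaces on the XRS-40 (from the article)
PORT_RATE_GBIT = 100   # line rate per port, in Gbit/s


def system_capacity_tbit(ports: int, rate_gbit: int, full_duplex: bool = True) -> float:
    """Aggregate capacity in Tbit/s; doubled when both directions are counted."""
    one_way = ports * rate_gbit / 1000
    return one_way * 2 if full_duplex else one_way


print(system_capacity_tbit(PORTS, PORT_RATE_GBIT))         # 32.0
print(system_capacity_tbit(PORTS, PORT_RATE_GBIT, False))  # 16.0
```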
The FP3 chipset, like the previous generation 100Gbit/s FP2, comprises three devices: the p-chip network processor; a q-chip traffic manager; and the t-chip that interfaces to the router fabric (See NE, Jan 26, 2010).
The p-chip inspects packets and performs the look ups that determine where the packets should be forwarded. It determines a packet's class and the quality of service it requires, then tells the q-chip traffic manager in which queue the packet is to be placed. The q-chip handles the packet flows and makes decisions as to how packets should be dealt with, especially when congestion occurs.
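The division of labour between the two chips can be pictured with a toy model. The sketch below is purely illustrative: the table contents, the queue names, the tail-drop rule and the strict-priority scheduler are all assumptions chosen for brevity, not a description of how the FP3 works internally.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    dst: str    # destination address
    dscp: int   # class marker carried by the packet (hypothetical field)


# Hypothetical look up tables standing in for the p-chip's memories.
FORWARDING = {"10.0.0.1": "port-1", "10.0.0.2": "port-2"}
DSCP_TO_QUEUE = {46: "voice", 0: "best-effort"}


def p_chip(pkt: Packet) -> tuple:
    """Look up the egress port and choose the queue the packet belongs in."""
    port = FORWARDING.get(pkt.dst, "drop")
    queue = DSCP_TO_QUEUE.get(pkt.dscp, "best-effort")
    return port, queue


class QChip:
    """Toy traffic manager: bounded queues, tail drop, strict priority."""

    PRIORITY = ("voice", "best-effort")

    def __init__(self, depth: int):
        self.depth = depth
        self.queues = {name: deque() for name in self.PRIORITY}

    def enqueue(self, pkt: Packet, queue: str) -> bool:
        q = self.queues[queue]
        if len(q) >= self.depth:   # congestion: the queue is full, so drop
            return False
        q.append(pkt)
        return True

    def dequeue(self):
        for name in self.PRIORITY:  # always serve the higher class first
            if self.queues[name]:
                return self.queues[name].popleft()
        return None


tm = QChip(depth=1)
bulk = Packet("10.0.0.1", 0)
call = Packet("10.0.0.2", 46)
for pkt in (bulk, call):
    _, queue = p_chip(pkt)
    tm.enqueue(pkt, queue)
print(tm.dequeue().dst)  # the voice packet leaves first: 10.0.0.2
```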
Alcatel-Lucent has embraced several techniques to quadruple the capability of the FP3's p-chip packet processor. Using a 40nm CMOS process and a higher clock speed (1GHz, rather than 840MHz) provides some of the needed improvement. Alcatel-Lucent has also worked with various memory vendors to ensure the look up requirements of the FP3's design would be met.
The latest p-chip also sees the number of on chip microcoded programmable cores increased from 112 to 288. Each core has been rearchitected such that it can process two instructions per clock cycle. "We realised that if you just scale up the processors, you weren't going to achieve the goal [of 400Gbit/s]," said Ken Kutzler, vp of engineering with Alcatel-Lucent's IP division. "Realistically, you are looking at a 30 to 35% increase in processing performance just doing that."
The p-chip's 288 cores are arranged as 32 rows by 9 columns. One way to add packet processing features is to add memory look ups, says Kutzler, and that is related to the number of columns: two more than the FP2's 16x7 core matrix.
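The core counts and clock figures quoted for the two generations can be checked with a few lines of Python. This is a rough sketch only: it assumes the FP2 cores issue one instruction per clock (the article says only that the FP3 cores issue two), it takes the FP2 clock as 840MHz, and the peak figure ignores memory look up stalls, so it is an upper bound, not a measured rate. The class and property names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PChip:
    rows: int
    cols: int
    clock_mhz: int
    instr_per_clock: int

    @property
    def cores(self) -> int:
        return self.rows * self.cols

    @property
    def peak_ginstr_per_s(self) -> float:
        """Peak issue rate in billions of instructions per second."""
        return self.cores * self.clock_mhz * self.instr_per_clock / 1000


FP2 = PChip(rows=16, cols=7, clock_mhz=840, instr_per_clock=1)   # 1 per clock is assumed
FP3 = PChip(rows=32, cols=9, clock_mhz=1000, instr_per_clock=2)

print(FP2.cores, FP3.cores)      # 112 288
print(FP3.cols - FP2.cols)       # 2 extra columns
print(FP3.peak_ginstr_per_s)     # 576.0
```

Under these assumptions the peak issue rate grows by more than the fourfold rise in line rate, which would leave headroom for the extra look ups that added features require.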
The matrix's rows can be viewed as a pipeline that can be partitioned, while each column can be assigned several tasks. For example, to implement a 200Gbit/s line card (200Gbit/s in each direction), the rows are partitioned to either process the incoming or the outgoing packet streams. "We can change what the instruction set of each row can do," said Kutzler.
A certain row will be assigned incoming or outgoing streams in the single chipset example, but its columns may initiate and end tasks such as multi protocol label switching before passing data to another task. "That will be done in parallel with a similar task that has a similar look up; you want to maximise your look ups," said Kutzler. "We have a compiler that optimises the columns based on the feature set being programmed."
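The row partitioning described above can be sketched as a simple split of row indices. The even split and the function name are assumptions for illustration; the article does not say how many rows go to each direction.

```python
TOTAL_ROWS = 32  # rows in the FP3 p-chip core matrix (from the article)


def partition_rows(total_rows: int, ingress_rows: int) -> dict:
    """Split row indices into an ingress group and an egress group."""
    if not 0 <= ingress_rows <= total_rows:
        raise ValueError("ingress_rows must be between 0 and total_rows")
    return {
        "ingress": list(range(ingress_rows)),
        "egress": list(range(ingress_rows, total_rows)),
    }


# Assumption: an even split for a line card carrying 200Gbit/s in each direction.
layout = partition_rows(TOTAL_ROWS, TOTAL_ROWS // 2)
print(len(layout["ingress"]), len(layout["egress"]))  # 16 16
```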
The FP3's traffic manager q-chip retains the FP2's four RISC cores, but the instruction set has been enhanced and the cores are now clocked at 900MHz.
Alcatel-Lucent settled on 10Gbit/s serdes to carry traffic between the chips and for the interfaces on the t-chip, believing the technology to be the most viable and sufficiently mature when the design was undertaken. "While 25Gbit/s signals looked good in the lab, their real abilities over backplanes we believed were building too much risk," said Kutzler. "And when 25Gbit/s becomes available, we have nice wide buses to put it on and scale accordingly."
Alcatel-Lucent has kept the line card configuration of using two p-chips with each q-chip. The second p-chip is viewed as an inexpensive way to add spare processing in case operators need to support more complex service mixes in future. "It is rare that we've used any of the capability of the second p-chip," said Modarres.
The FP3 based router halves power consumption to 2W/Gbit, with power savings at the chip level achieved using the 40nm process and by expanding clock gating over a wider part of the devices. Alcatel-Lucent also carefully selects particular chips after each production run to avoid using parts with higher leakage currents. There are also system savings by using one 400Gbit/s chipset, rather than four with the FP2, each with its own memory.
Alcatel-Lucent uses RLDRAM and QDR memories for fast table and counter look ups. The line card supports up to 6Gbyte of buffer memory, which can be met using cheaper DDR3 memory.
Given the quadrupling in line rate the FP3 delivers, the next development, the FP4, will likely be a 1Tbit/s device. The interface to the carrier module (XCM, see box) is already rated at up to 2Tbit/s and can support two line cards. Alcatel-Lucent could also use a 28nm or more advanced process by the time the FP4 enters production. But the company may have to use 25Gbit/s serdes interfaces and continue to work with memory players to ensure the FP4 can achieve the 2.5x increase in look up rates.
Kutzler will not discuss any specification details of the company's future chipset, but he does say that none of the identified design challenges appear insurmountable.
Alcatel-Lucent has designed the router hardware such that card level control functions are separate from the Ethernet interfaces and the FP3 chipset that sit together on a line card. The redesign preserves the service provider's investment.
The 7950 XRS-20 platform, now in trials, has up to 20 slots to house media adaptors (XMAs), which carry the various Ethernet interface options and the FP3 chipset. There are up to 10 carrier modules (XCMs) in the system. Each includes control processing and interfaces to the router's system fabric, and holds up to two XMAs.
Two XCM types are used with the 7950 family. The 800Gbit/s XCM supports a pair of 400Gbit/s XMAs or 200Gbit/s XMAs, while the 400Gbit/s XCM supports one 400Gbit/s XMA or a pair of 200Gbit/s XMAs. Carrier modules can thus be upgraded independently of the media line cards.
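The carrier and adaptor combinations listed above can be modelled as a capacity check. The sketch below assumes the rule is simply that the XMA rates on a carrier must not exceed its rating; the article lists only the four combinations, so anything else the rule admits (a mixed 400Gbit/s and 200Gbit/s pair, say) is an extrapolation. The module names are hypothetical labels, not product codes.

```python
# Hypothetical labels for the two carrier module types, rated in Gbit/s.
XCM_CAPACITY_GBIT = {"XCM-800": 800, "XCM-400": 400}
XMA_RATES_GBIT = (200, 400)   # media adaptor rates mentioned in the article
MAX_XMAS_PER_XCM = 2
XCMS_PER_XRS20 = 10


def fits(xcm: str, xmas: list) -> bool:
    """True if the XMAs fit the carrier, under the assumed sum-of-rates rule."""
    if not 0 < len(xmas) <= MAX_XMAS_PER_XCM:
        return False
    if any(rate not in XMA_RATES_GBIT for rate in xmas):
        return False
    return sum(xmas) <= XCM_CAPACITY_GBIT[xcm]


print(fits("XCM-800", [400, 400]))   # True: a pair of 400Gbit/s XMAs
print(fits("XCM-400", [400]))        # True: one 400Gbit/s XMA
print(fits("XCM-400", [200, 200]))   # True: a pair of 200Gbit/s XMAs
print(fits("XCM-400", [400, 400]))   # False: exceeds the carrier's rating
print(XCMS_PER_XRS20 * MAX_XMAS_PER_XCM)  # 20 XMA slots in the XRS-20
```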