Meeting the performance, power and intelligence needs of advanced applications

Embedded systems have been revolutionising our way of life for years, but whether they are being employed in cars, industrial systems, consumer products or in a range of other applications, design engineers are coming under growing pressure to deliver more capability and flexibility, while at the same time coping with evolving standards.

Performance and power scalability, system integration and intelligence, as well as security, safety and reliability, are the key metrics against which FPGA developers, such as Xilinx, are judged. Can its recently launched UltraScale+ family of FPGAs deliver?

This newly extended portfolio, to be manufactured on TSMC's 16FF+ FinFET process, includes the Kintex UltraScale+ and Virtex UltraScale+ FPGAs and 3D IC families, while the Zynq UltraScale+ family includes the first all programmable multiprocessing SoCs (MPSoCs).

The UltraScale+ architecture is intended to deliver 'two to five times the system level performance/Watt compared to 28nm devices', a crucial metric, according to Giles Peckham, European marketing director, who added that the new range of products will offer design engineers 'greater integration, intelligence and higher levels of security and safety'.

Xilinx said that it would be using the UltraScale+ product range to target a number of fast growing markets and that there was a lengthening list of advanced applications in which the devices could be deployed, including LTE Advanced and early 5G wireless, in-car driver assistance systems and the industrial Internet of Things.

5G is, for example, in its early stages and providers are looking to deliver improved throughput and modulation schemes. Design engineers are searching for solutions that will be able to provide improved levels of parallelism in order for them to realise the compute bandwidth and flexibility to develop pre-standard systems.

"This market has a particular requirement and we believe that UltraScale+ FPGAs will be able to provide engineers with the serial bandwidth they need for current and future standards, as well as the fabric to meet the processing performance and processor integration that is critical if you are looking to offload functions," explained Peckham.

Another fast growing sector is the automotive market, where there is now a move to go beyond simple parking assistance systems to complete autonomous driving. As a result, ADAS providers are moving from MCU components to higher function embedded processors that will enable them to acquire, process and fuse data from more sensors – whether that is from a camera, radar, laser or GPS system.

"The ability to add new algorithms to offer more features is typically limited by a fixed computing architecture," suggested Peckham. "The Zynq UltraScale+'s quad core A53, coupled with the safety capabilities of its various cores, has been designed to provide the processor 'horsepower' to deliver more intelligent and scalable analytics whilst safeguarding against any 'out of range' conditions."

As for the Internet of Things, especially in the industrial sector, there is significant demand for data gathering, diagnostics, local decision making and greater energy awareness to drive mechanical systems and revamp industrial production.

As the number of data streams continues to grow, solutions have to be able to scale processing bandwidth to translate massive amounts of data into useful information, while meeting new requirements for security and access control.

"With the UltraScale portfolio, we have made a significant effort to integrate diverse cores and programmable logic and are now in a position to offer high performance processing and acceleration to power key control systems, robotics and industrial drive products," said Peckham.

The UltraScale+ architecture has been designed with integrated SRAM, called UltraRAM, which is intended to provide faster memory access. It also introduces a new on-chip interconnect fabric, called SmartConnect, which looks to optimise for throughput, latency and area.

According to Peckham: "Through intelligent system wide interconnect optimisation, we have found that we can provide a 20 to 30% improvement in current performance, area and power."

SmartConnect looks to analyse a design's IP interfaces and connectivity, assessing the entire system – not just individual IP blocks – and then makes 'intelligent' optimisations to eliminate excess logic and avoid any congestion or 'bloating', while at the same time matching system level throughput and latency requirements.

"Because different IP blocks need different interconnect technologies, SmartConnect can identify the right interconnect IP depending not just on the entire system, but also the individual requirements of each piece of the design," explained Peckham.

"SmartConnect can also bridge between different interface types intelligently and automatically, such as connecting streaming video to memory using a simple drag and drop GUI."

While SmartConnect technology is intended to scale across other Xilinx product families, UltraRAM is specific to UltraScale+ devices.

This new memory technology is said by Xilinx to address one of the largest bottlenecks affecting the performance and power of FPGA and SoC based systems, and it does this by integrating SRAM on-chip.

In the past, data was stored off-chip, explained Peckham. While that was useful for limited data and coefficient storage, as applications have demanded more memory, particularly those requiring video and deep packet buffering, customers have had to start looking to cascade block RAM, a technique that is, according to Peckham, "neither efficient nor cost effective."

"UltraRAM technology is intended to create high capacity on-chip memory for a variety of use cases, providing predictable latency and performance," Peckham explained. "By integrating large amounts of embedded memory very close to the associated processing engines, designers will be able to achieve greater system performance/Watt, while benefitting from a reduction in the bill of materials cost."

According to Xilinx, UltraRAM can be designed to scale up to 432Mbit in a variety of configurations for longer buffering.
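To put that figure in context, the back-of-the-envelope sketch below estimates how many uncompressed video frames would fit in 432Mbit. The frame formats, the reading of 'Mbit' as 2^20 bits and the helper name `frames_that_fit` are assumptions made purely for illustration; they are not figures or tools from Xilinx.

```python
# Rough buffering estimate. The frame formats and the helper name are
# illustrative assumptions, not values taken from Xilinx documentation.
MBIT = 1024 * 1024  # assumes 'Mbit' means 2**20 bits


def frames_that_fit(capacity_mbit, width, height, bits_per_pixel):
    """Return how many whole uncompressed frames fit in the given capacity."""
    frame_bits = width * height * bits_per_pixel
    return (capacity_mbit * MBIT) // frame_bits


if __name__ == "__main__":
    # 432Mbit is the maximum quoted in the article; the formats are assumed.
    for name, (w, h) in {"1080p": (1920, 1080), "720p": (1280, 720)}.items():
        print(name, frames_that_fit(432, w, h, 24))
```

Under these assumptions, the whole 432Mbit would hold only a handful of full HD frames, which suggests why on-chip memory of this size is aimed at buffering rather than bulk storage.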

Intended for computationally intensive application processing, the Zynq UltraScale+ MPSoC has a 64bit quad core ARM Cortex-A53 processor, as well as a 32bit dual core ARM Cortex-R5 real-time processor whose cores can be run either independently or in lock-step mode. The Cortex-R5 can share its memory with the Cortex-A53 cores or they can be configured to use separate memories.
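Lock-step operation can be pictured as two cores carrying out the same work while a comparator checks that their results agree. The sketch below models that idea in software only. All of the names, including the simulated fault, are hypothetical, and it should not be read as a description of how the Cortex-R5 hardware is implemented.

```python
# Conceptual software model of lock-step checking.
# This is NOT a description of the Cortex-R5 hardware; all names are hypothetical.


class LockStepMismatch(Exception):
    """Raised when the two redundant computations disagree."""


def run_in_lock_step(primary, checker, value):
    """Run the same step on two 'cores' and compare results before using them."""
    result_a = primary(value)
    result_b = checker(value)
    if result_a != result_b:
        raise LockStepMismatch(f"core outputs differ: {result_a!r} != {result_b!r}")
    return result_a


def scale_sensor_reading(raw):
    """Example workload: convert a raw reading to a scaled integer."""
    return raw * 3 + 1


def faulty_scale_sensor_reading(raw):
    """Same workload with a simulated single-bit fault in the result."""
    return scale_sensor_reading(raw) ^ 0x4
```

Calling `run_in_lock_step(scale_sensor_reading, scale_sensor_reading, 5)` returns the agreed result, whereas substituting `faulty_scale_sensor_reading` for one of the two raises `LockStepMismatch`, so the disagreement is caught before the value is used.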

These devices also come with a dedicated platform and power management unit that looks to support system monitoring, system management and dynamic power gating of each of the processing engines.

The Zynq UltraScale+ also employs ARM's TrustZone technology, a system wide approach to security for an array of client and server computing platforms. The technology is tightly integrated into Cortex-A processors, but the secure state has also been extended throughout the system via the AMBA AXI bus and specific TrustZone System IP blocks.

This system approach means that it is possible to secure peripherals such as memory, crypto blocks, keyboard and screen to ensure they can be protected from any software attack.
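The idea of carrying the secure state across the bus can be sketched as a simple access check: each transaction carries a flag saying whether it comes from the non-secure world, and a region marked as secure refuses anything flagged non-secure. The model below is a minimal illustration under that assumption. The class names, addresses and region layout are invented for the example and are not taken from the Zynq UltraScale+ memory map or any ARM or Xilinx API.

```python
from dataclasses import dataclass

# Minimal model of secure/non-secure access filtering.
# Names, addresses and regions are hypothetical, chosen only for illustration.


@dataclass(frozen=True)
class Transaction:
    address: int
    non_secure: bool  # models a non-secure flag travelling with the bus access


@dataclass(frozen=True)
class Region:
    base: int
    size: int
    secure_only: bool


def is_allowed(txn, regions):
    """Return True if the transaction may access the region containing its address."""
    for region in regions:
        if region.base <= txn.address < region.base + region.size:
            # A secure-only region rejects any access flagged as non-secure.
            return not (region.secure_only and txn.non_secure)
    return False  # in this sketch, unmapped addresses are simply rejected


# Hypothetical layout: a secure crypto block and an ordinary shared buffer.
EXAMPLE_REGIONS = [
    Region(base=0x1000, size=0x100, secure_only=True),
    Region(base=0x2000, size=0x1000, secure_only=False),
]
```

With this layout, a non-secure access to `0x1010` is refused while a secure access to the same address is allowed, and either kind of access can reach the shared buffer at `0x2000`.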

In addition, a separate security unit enables military class security solutions, such as secure boot, key and vault management, and anti-tamper capabilities. "Standard requirements for machine-to-machine communication and industrial IoT applications," Peckham suggested.

First tape out and the early access release of the design tools have been scheduled for the second quarter of this year, while first shipments have been scheduled for the fourth quarter of 2015.