Tachyum takes further step towards a production-ready Prodigy processor chip


Tachyum has taken another step towards achieving production-ready status of its universal processor – Prodigy.

The company has run LINPACK benchmarks using Prodigy’s Floating-Point Unit (FPU) on a Field Programmable Gate Array (FPGA). This was achieved by running applications under Linux on the integer part of the processor and using the IEEE-compliant FPU to analyse and solve linear equations and linear least-squares problems.

The vector unit includes 16 copies of the FPU, along with additional shuffle and reduction operations. While there are many instructions to test in a vector unit, the floating-point vector operations are the hardest part, and this has now been successfully achieved by Tachyum’s product development team.
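As a rough illustration of what lane-wise, shuffle and reduction operations mean on a 16-lane vector, the following Python sketch models them on plain lists. It is not Tachyum’s instruction set: the function names, the lane count constant and the tree-style reduction are all assumptions made for the example.

```python
LANES = 16  # assumption: one lane per FPU copy in the vector unit


def vec_add(a, b):
    """Lane-wise add: each lane is computed independently of the others."""
    return [x + y for x, y in zip(a, b)]


def shuffle(v, idx):
    """Shuffle: output lane i takes its value from input lane idx[i]."""
    return [v[i] for i in idx]


def reduce_add(v):
    """Tree reduction: fold the upper half onto the lower half until one value is left."""
    v = list(v)
    width = len(v)
    while width > 1:
        width //= 2
        for i in range(width):
            v[i] = v[i] + v[i + width]
    return v[0]


a = [float(i) for i in range(LANES)]   # 0.0 .. 15.0
b = [1.0] * LANES
summed = vec_add(a, b)                 # 1.0 .. 16.0
reversed_lanes = shuffle(summed, list(range(LANES - 1, -1, -1)))
total = reduce_add(summed)             # 136.0
```

In hardware the lane-wise step would run on all 16 FPUs at once, while shuffles and reductions move data across lanes, which is why they are listed as separate operations.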

LINPACK measures a system’s floating-point computing power by solving a dense system of linear equations to determine performance and it is a widely used benchmark for supercomputers.
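To make the idea concrete, here is a minimal Python sketch of a LINPACK-style measurement: it solves a random dense system by Gaussian elimination with partial pivoting and converts the elapsed time into a floating-point rate using the conventional operation count of 2/3·n³ + 2·n². It is an illustration only, not the benchmark code Tachyum ran; the function names, the matrix size and the random test data are assumptions.

```python
import random
import time


def solve_dense(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    a = [row[:] for row in a]  # work on copies so the inputs stay intact
    b = b[:]
    for k in range(n):
        # Pick the row with the largest entry in column k as the pivot.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            raise ValueError("matrix is singular")
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        # Eliminate column k from the rows below the pivot.
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= m * a[k][j]
            b[i] -= m * b[k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x


def linpack_style_run(n=100, seed=0):
    """Return (MFLOP/s, max residual) for one random n-by-n solve."""
    rng = random.Random(seed)
    a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    start = time.perf_counter()
    x = solve_dense(a, b)
    elapsed = max(time.perf_counter() - start, 1e-9)
    flops = (2.0 / 3.0) * n**3 + 2.0 * n**2  # conventional LINPACK operation count
    residual = max(
        abs(sum(a[i][j] * x[j] for j in range(n)) - b[i]) for i in range(n)
    )
    return flops / elapsed / 1e6, residual


if __name__ == "__main__":
    mflops, residual = linpack_style_run()
    print(f"{mflops:.1f} MFLOP/s, max residual {residual:.2e}")
```

The residual check matters as much as the rate: a run only counts if the computed solution actually satisfies the equations to within floating-point accuracy.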

After successfully reaching this FPU milestone, Tachyum has only four more steps to go before the final netlist of the Prodigy processor chip. The next milestones are running UEFI and boot loaders to load Linux on the FPGA and completing vector-based LINPACK testing with I/O, followed by I/O with virtualization and RAS (Reliability, Availability and Serviceability). Afterwards, Prodigy will be ready for final netlist, followed by tape-out.

Tachyum describes its FPU as one of the most advanced in the world. It includes a fused multiply-add (FMA) unit, a divider, a format converter, a reciprocal approximator, a reciprocal square root approximator and a square root approximator. The FPU is fully IEEE compliant and corner cases have been successfully debugged. In addition to IEEE single and double precision, the Prodigy processor will also support the 16-bit Bfloat16 (Brain Floating Point) format.
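For readers unfamiliar with Bfloat16, the format keeps the sign bit and the full 8-bit exponent of IEEE single precision but only 7 mantissa bits, so a Bfloat16 value is effectively the upper 16 bits of a 32-bit float. The Python sketch below shows a conversion with round-to-nearest-even. It is a software illustration only and says nothing about how Prodigy’s format converter is implemented; the function names are made up for the example.

```python
import struct


def float32_bits(x):
    """Return the IEEE single-precision bit pattern of x as an integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def to_bfloat16_bits(x):
    """Convert x to a 16-bit Bfloat16 pattern using round-to-nearest-even."""
    bits = float32_bits(x)
    is_nan = (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF) != 0
    if is_nan:
        # Keep NaNs as NaNs: truncate and force a mantissa bit to stay set.
        return ((bits >> 16) | 0x0040) & 0xFFFF
    # Add just under half a unit in the last kept place, plus one if the
    # last kept bit is odd, so that ties round to the even neighbour.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def from_bfloat16_bits(b):
    """Expand a 16-bit Bfloat16 pattern back to a Python float."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]


one = to_bfloat16_bits(1.0)                                 # 0x3F80
approx_pi = from_bfloat16_bits(to_bfloat16_bits(3.14159))   # 3.140625
```

The round trip on 3.14159 shows the trade-off: the range matches single precision, but only about two to three decimal digits of precision survive.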

The next milestone to be achieved is running vector operations, including mask operations and operations on unaligned vectors. Vectorization in the compiler is reaching the production stage, and vectorizing compilers and vectorized libraries will be fully available before chip shipments next year.

“Despite having to overcome obstacles of replacing IP and EDA tools, our engineering team has risen to the challenge of advancing the Prodigy stack so that we can get to tape-out and production next year,” said Dr. Radoslav Danilak, founder and CEO of Tachyum. “We have taken every opportunity to develop Prodigy as a processor that does not simply meet expectations but exceeds them. Successfully running LINPACK means that we are one step closer to completing our vision of transforming data centres into Universal Computing Centers with Prodigy.”

Prodigy will target data centres. Because of its utility for both high-performance and line-of-business applications, Prodigy-powered data centre servers will, according to Tachyum, be able to switch seamlessly and dynamically between workloads, eliminating the need for expensive dedicated AI hardware and dramatically increasing server utilisation.

Prodigy integrates 128 high-performance, custom-designed 64-bit compute cores and, according to the company, will deliver up to 4x the performance of the highest-performing x86 processors for cloud workloads, up to 3x that of the highest-performing GPU for HPC, and 6x for AI applications.