In part 1 of this series, we discussed the role that the SERDES has played in the past 20 years in enabling high speed signaling, and its technical advantages. In part 2, we’ll explore the power advantages of SERDES, how the technology has evolved, and what challenges lie ahead for future development.
The SERDES Power Advantage
It is only recently that SERDES have had a power advantage versus parallel data busses.
The power consumed by an ideal parallel bus is the power used to charge and discharge the TX and RX capacitances and the trace capacitance. The trace capacitance can be significant on FR4 when distances of 10, 20, or 100cm are considered (see: https://www.edn.com/design/pc-board/4427201/Rule-of-Thumb--5--Capacitance-per-length-of-50-Ohm-transmission-lines-in-FR4).
From first principles, we know that the power for an LVCMOS link is ~C*V^2*f. In the case of data, the frequency is ½ the total bit rate multiplied by the transition density. To first order, the total number of transitions, and hence the power, is independent of the number of lanes: the more lanes, the fewer transitions per lane. For a 1Gbps link, it’s likely that 8-16 lanes would be needed for 10cm-1m. For a 10Gbps link, a highly impractical 120 lanes may be needed for 1m!
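To make the C*V^2*f estimate concrete, here is a minimal Python sketch of the first-order model described above. Every numeric default is an assumption chosen for illustration: an effective dielectric constant of 4 for FR4, 50 Ohm traces, 2pF each for the TX and RX capacitances, and a transition density of 0.5 (random data). The function names are hypothetical, and this is not the model behind the plots in this article.

```python
# First-order model of parallel LVCMOS link power: P ~ C * V^2 * f.
# All numeric defaults are illustrative assumptions, not measured data.

C_LIGHT_CM_PER_S = 3.0e10  # speed of light in cm/s


def trace_capacitance_per_cm(z0_ohm=50.0, dk_eff=4.0):
    """Capacitance per cm of a transmission line: C = sqrt(Dk) / (c * Z0)."""
    return dk_eff ** 0.5 / (C_LIGHT_CM_PER_S * z0_ohm)


def lvcmos_link_power(bit_rate_bps, length_cm, vdd,
                      transition_density=0.5, c_tx=2e-12, c_rx=2e-12):
    """Total dynamic power in watts for the whole bus.

    To first order the lane count cancels out: more lanes means
    proportionally fewer transitions per lane.
    """
    c_total = c_tx + c_rx + trace_capacitance_per_cm() * length_cm
    # f = 1/2 * total bit rate * transition density
    f_eff = 0.5 * bit_rate_bps * transition_density
    return c_total * vdd ** 2 * f_eff


def energy_per_bit_pj(power_w, bit_rate_bps):
    """Convert link power to energy efficiency in pJ/bit."""
    return power_w / bit_rate_bps * 1e12


if __name__ == "__main__":
    bit_rate = 10e9  # 10Gbps aggregate
    for vdd in (1.2, 1.8, 3.3):
        for length_cm in (10, 20, 100):
            p = lvcmos_link_power(bit_rate, length_cm, vdd)
            print(f"{vdd:.1f} V, {length_cm:3d} cm: {p * 1e3:7.1f} mW, "
                  f"{energy_per_bit_pj(p, bit_rate):6.1f} pJ/bit")
```

Under these assumptions the trace contributes about 1.3pF/cm, and a 10cm, 1.8V link lands near 14pJ/bit. Because the energy per bit depends only on C and V in this model, it grows with distance but does not change with data rate.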
The plot below shows the power for parallel LVCMOS links of different voltages versus the power consumed by SERDES from the 1990s and today. Modern SERDES have a power advantage at longer distances, but the advantage is not clear-cut across the whole range.
Where SERDES really shine for power is for higher data rates.
The plot below shows the power for parallel LVCMOS links of different voltages versus the power consumed by various production 28nm SERDES from the mid-to-late 2010s. It can be seen that modern SERDES have a power advantage at almost all distances. For a power optimized SERDES, the power advantage is large and clear over all distances.
As process technologies advance, the SERDES power advantage continues to grow, of course.
My First-person View of SERDES Evolution
My career began working on the development team of Hewlett-Packard’s discrete SERDES ASICs. The HDMP-1638 was one of the first products I worked on. This ASIC is shown below with an “Agilent” label rather than the “HP” label thanks to the Agilent spin-off from HP.
The specifications for this chip were competitive at the time and the sales were good. So I believe this is a reasonable benchmark for an industrial SERDES from 20-25 years ago.
This part was designed in a bipolar process. It had a line rate of 1.25Gbps to support Gigabit Ethernet (802.3z), that is, 1000BASE-X Gigabit Ethernet over fiber.
The power dissipation of the HDMP-1638 was about 1W, which included an external parallel interface – it was a SERDES chip after all! The power dissipation of the chip excluding the parallel interface is estimated at 650mW, or roughly 500pJ/bit. We will come back to the power efficiency (in pJ/bit) later in comparing with more recent SERDES.
Since 2006, I’ve been at Silicon Creations helping to develop low power SERDES in advanced nodes. In recent years, Silicon Creations has been developing SERDES for up to 32Gbps transmission with power efficiency down to 2.5pJ/bit.
Comparing the speed and power efficiency of these SERDES to the SERDES of 20 years ago:
- The speed is 25x higher
- The power efficiency is 200x better
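The figures above can be checked with a few lines of arithmetic. This is a small sketch using only numbers quoted in this article (650mW at 1.25Gbps for the HDMP-1638 core, and 32Gbps at 2.5pJ/bit for a recent design); the function and variable names are made up for illustration.

```python
def energy_per_bit_pj(power_w, line_rate_bps):
    """Energy efficiency in pJ/bit: power divided by line rate."""
    return power_w / line_rate_bps * 1e12


# Figures quoted in the article
old_rate_bps = 1.25e9   # HDMP-1638 line rate
old_power_w = 0.650     # estimated power excluding the parallel interface
new_rate_bps = 32e9     # recent SERDES line rate
new_pj_per_bit = 2.5    # recent SERDES power efficiency

old_pj_per_bit = energy_per_bit_pj(old_power_w, old_rate_bps)
speed_ratio = new_rate_bps / old_rate_bps
efficiency_ratio = old_pj_per_bit / new_pj_per_bit

print(f"Old efficiency: {old_pj_per_bit:.0f} pJ/bit")
print(f"Speed ratio: {speed_ratio:.1f}x")
print(f"Efficiency ratio: {efficiency_ratio:.0f}x")
```

The unrounded values are 520pJ/bit, 25.6x, and 208x, which round to the 500pJ/bit, 25x, and 200x quoted in the text.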
Again, many factors have contributed to this improvement, including huge advances in technology, voltage scaling, and of course good design!
As described in the previous section, SERDES have a compelling advantage for power, pin count, and reach. Their disadvantages have been complexity and cost.
At a minimum, for low data rates, a good TX PLL, RX CDR, TX driver, and RX front end are needed. Each of these is a complex analog sub-system. Designing these blocks and the total SERDES system requires a skilled team of analog/mixed-signal designers.
At a minimum, a SERDES requires these analog/mixed-signal blocks along with complex digital control:
- A good TX PLL
  - This block is needed to produce a typically multi-GHz clock from a typically 25-100MHz reference clock with very low (~1ps or better) long-term jitter.
- A good RX CDR
  - This block is a complex control loop to track the average phase of the incoming data despite any noise, distortion, or cross-talk on the link. This is typically done with either complex phase rotators or a CDR-driven PLL.
- TX line driver
  - This block translates the serialized data into a typically 50 Ohm differential signal, often with pre-cursor and post-cursor emphasis.
- RX equalizer
  - This block attempts to equalize the high-speed channel effects either with a continuous time equalizer or with a DFE or both. Often an automatic gain control (AGC) circuit is needed to facilitate the equalization.
  - The RX equalizer typically includes automatic calibration routines either as state machine logic or as software.
- High-speed serializer and deserializer logic
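To put the ~1ps jitter figure for the TX PLL in perspective, the sketch below expresses it as a fraction of the unit interval (UI, the duration of one bit) at the two line rates discussed earlier in this article. Comparing a jitter number directly against the UI is a simplification used here for illustration only; real jitter budgets distinguish random and deterministic components and are defined by the relevant standard. The function names are hypothetical.

```python
def unit_interval_ps(line_rate_bps):
    """Duration of one bit (unit interval) in picoseconds."""
    return 1e12 / line_rate_bps


def jitter_fraction_of_ui(jitter_ps, line_rate_bps):
    """Jitter expressed as a fraction of one unit interval."""
    return jitter_ps / unit_interval_ps(line_rate_bps)


if __name__ == "__main__":
    jitter_ps = 1.0  # the ~1ps figure quoted for the TX PLL
    for rate_bps in (1.25e9, 32e9):
        ui = unit_interval_ps(rate_bps)
        frac = jitter_fraction_of_ui(jitter_ps, rate_bps)
        print(f"{rate_bps / 1e9:5.2f} Gbps: UI = {ui:6.2f} ps, "
              f"1 ps jitter = {frac * 100:.3f}% of UI")
```

The same 1ps is a much larger share of the bit period at 32Gbps (UI of 31.25ps) than at 1.25Gbps (UI of 800ps), which is one way to see why clocking gets harder as line rates rise.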
All of the blocks listed above take considerable design time (up to many person-years) by an experienced design team. As data rates increase [Gbps] and demands on efficiency increase [pJ/bit], the complexity and cost of the SERDES increase. As reliability demands increase, an increasing number of ageing and electromigration simulations must be run and analyzed, further increasing the cost.
The main focus of this article is on PAM2/NRZ SERDES. PAM4 systems offer alternatives for higher bandwidth per pin, but typically come at the cost of further increases in chip area, power, and complexity over PAM2/NRZ systems.
Fortunately, SERDES have become widely available as IP blocks. So, system companies can license proven designs from leading IP design providers. In this way, the complexity is handled by specialized design teams, and the R&D costs can be shared across multiple chips, projects, and even industries which helps to mitigate costs.
The major cost of SERDES comes as design costs (many designers for many total years) and verification costs, but secondary considerations such as die area and PCB area are important.
SERDES verification at the PMA level is typically handled by the design team, or a subset of the design team. At the system level, verification can be quite complex, especially for standards such as PCIe.
For complex serial standards, test benches (typically in SystemVerilog) are needed to verify the system at the physical layer (including the PMA and PCS), the Data Link Layer, the Transaction Layer, and the device level. Verification covering these levels typically checks protocols, modes, negotiation, error injection and recovery, etc. The verification typically takes many person-months as well and often involves third-party verification IP (VIP).
On die, a SERDES could potentially be cheaper or more expensive than a parallel interface. Depending on the process node, a SERDES could consume roughly 0.15-0.5mm^2 per lane. A parallel interface can be much smaller than this, but would require more I/Os. So depending on whether the chip is core (area) limited or I/O (pad) limited, a SERDES could result in more or less die cost than a parallel interface.
At the package and PCB level, SERDES allow for a reduced pin and trace count, so they should result in smaller and lower cost packages and PCB designs. However, the design of packages and PCBs using SERDES can be more difficult due to the complexities of high-speed controlled impedance (for example, 50 Ohm) traces, and hence more expensive than PCBs using slower parallel interfaces.
The last 20 years have seen SERDES move from an optical and networking circuit to a circuit that is all around us – from our phones to our laptops and TVs to datacenters and more.
PCIe was introduced roughly in 2002 at 2.5Gbps line rate. Since then, design improvements and CMOS process improvements have allowed line rates to improve by ~20X (from ~2.5Gbps to ~50Gbps) and power efficiency (pJ/bit) to improve by ~200X.
Despite their design and verification complexity, SERDES have become an indispensable part of an SoC block diagram. Of course, the availability of SERDES IP blocks has helped to keep the cost, risk, and time-to-market reasonable.