Designing SoC Power Networks

With no tools available to ensure an optimal power delivery network, the industry turns to heuristics and expert advice.


Designing a power network for a complex SoC is becoming critical for the success of the product, but most chips are still using old techniques that are ill-suited to the latest fabrication technologies, resulting in an expensive, overdesigned product. Not only is the power network as designed too large, but this has several knock-on effects that impact area, timing and power.

In the first part of this two-part series, the basics of power networks were discussed along with the recently introduced problems associated with finFETs, increasing resistivity of the wires in the latest geometries and some of the problems associated with the placement of decoupling capacitance.

Drew Wingard, chief technology officer at Sonics, summarized the design challenge: “You don’t want to hit the power delivery network (PDN) with a large voltage difference and a low resistance at the same time.” That leads to excessive current, which in turn causes IR drop and electromigration (EM) problems.
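
As a rough illustration of Wingard's point, Ohm's law (I = V / R) shows how quickly the surge current grows when a large voltage difference meets a low-resistance path. The function name and all numbers below are hypothetical, chosen only to show the trend, and are not taken from any real design.

```python
def surge_current(delta_v_volts, path_resistance_ohms):
    """Current that flows when delta_v appears across a resistive path (I = V / R)."""
    if path_resistance_ohms <= 0:
        raise ValueError("path resistance must be positive")
    return delta_v_volts / path_resistance_ohms


# Hypothetical operating points: (voltage difference in V, path resistance in ohms).
cases = {
    "small dV, high R": (0.1, 1.0),
    "large dV, high R": (1.0, 1.0),
    "large dV, low R": (1.0, 0.05),
}

for name, (delta_v, resistance) in cases.items():
    print(f"{name}: {surge_current(delta_v, resistance):.2f} A")
```

Only the last case combines both conditions, and it draws far more current than the other two. That current is what shows up as IR drop across the grid and as EM stress in the wires and vias that carry it.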

While most of the focus is on delivering power to the digital components of the design, Jerry Zhao, director of product marketing for power signoff at Cadence, reminds us that the entire chip has to be considered. “You must never forget the analog power domain. When people have multiple power domains and they are shared between analog and digital, then they have noise injection to each other.”

Analog current can disturb the power grid, which in turn can disturb the digital blocks and cause failures. Similarly, digital switching can disturb the supply to the more delicate analog circuitry. “This means that you have to solve the entire power delivery network together including all of the domains,” says Zhao. “You may have very accurate block timing, but power is all over the place and the current that goes into any particular block is determined by the whole circuit. You cannot sign off power without considering the whole chip.”

Package impacts
Increasingly, the chip is not the boundary that has to be taken into account. Packaging is taking on a more important role, especially as 2.5D and 3D integrations start to become mainstream. The impact on design is still under industry discussion. “In interposer-based (2.5D) ICs, dies are arranged side by side, with power supply lines and blocking caps on the interposer,” explains Herb Reiter, president of EDA2ASIC Consulting and the chair of the ESD Alliance’s System Scaling Committee. “Therefore, power integrity can be achieved easily and the right cooling measures can be implemented readily.”

But problems remain. “The current drawn by a 3D stacked device is expected to be much higher than that of a 2D SoC chip,” says Hem Hingarh, vice president of engineering for Synapse Design. Hingarh is concerned about the number of bumps allocated just for the PDN. “This has a large impact on package costs and in some cases may be equal to, or larger than, SoC die cost.”

In fact, some see a whole new world of issues that have to be considered. “Power integrity (PI), thermal integrity (TI), and signal integrity (SI) requirements on both chip- and package-levels need to be considered, and a co-design and co-analysis methodology become necessary,” says Sudhakar Jilla, group director of marketing in the IC Implementation Division of Mentor Graphics. “Chip power/thermal/signal modeling and package/interposer modeling are mandatory. Through-silicon vias (TSV) design, the number of TSVs and their placement will impact PI/TI/SI. Bump numbers and pitch may impact both performance and cost. Thermal interaction on multiple dies and thermal gradient may cause chip(s) to malfunction. ESD clamp placement should also take into account multi-die.”

Arvind Shanmugvel, director of applications engineering for Ansys, adds that even with non-interposer-based packaging, design is becoming more complex. “The most recent wafer level fan-out packaging techniques have big implications on power delivery and reliability. Thermal-aware EM, thermal-aware stress, and multi-die ESD are a few requirements for these technologies.”

Industry advice
How does one get started with this kind of analysis? “To perform analysis, engineer should be proficient with 3D assembly as well as with PDN analysis,” says Alex Samoylov, application engineer for power integrity products at Silvaco. “3D PDN analysis spans across several whole die designs and requires reading huge amount of data, unless some simplification is applied. 3D PDN networks contain many different types of objects — regular dies, TSVs, interposer layers, BGA objects. All of these are very different, and all have to be properly extracted. There are some parameters/values that cannot be extracted for separate dies. The PDN may originate on one die and spread into multiple dies. The creation of an accurate load (current consumption) model is a challenge. It is a difficult and costly procedure to verify the accuracy of PDN analysis in 3D packaging.”

The interposer may also add some new optimization options. “The interposer is an important piece of the PDN puzzle,” says Alin Florea, senior manager of package engineering at eSilicon. “We have found the use of metal-insulator-metal (MIM) capacitors on the interposer to be very useful. If the interposer supports these devices, you can offload a lot of chip area and routing to the interposer this way.”

A balance is required. “In the ideal case, you have a very well balanced power mesh,” says Cadence’s Zhao. “By putting more meshes and more vias to the power switches, you may not have IR drop problems and avoid EM rules violations. But, in reality, as you create more power mesh structures you are eating up your routing resources. The earlier you can consider the power delivery in your implementation cycle, the better the eventual solution will be. If you don’t do that, and I have seen this in very large chips, there is a lot of congestion in the power grid. They have ensured that the power grid will not break, but that has added challenges on the die side and impacted the routability for signals.”

Starting analysis early is a key recommendation of the industry. “PDN planning, prototyping, implementation, and optimization have to be done early and along the course of chip implementation,” says Ming Ting, product marketing manager for the IC Implementation Division of Mentor. “Early analysis and repair allow for less violation at final signoff. Otherwise, options of fixing design issues or optimization may become limited if the problems are found too late in the design cycle. Moreover, while we are doing early PDN prototyping, the methodology has to be power-noise and reliability aware; that is, timing/extraction/simulation for power, EM, thermal analyses should be seamlessly linked into place and route.”

Hingarh says that “we need reliable verification in the face of uncertainty,” and suggests the following steps: “Vectorless, early high-level power grid verification; early high-level grid verification and planning; incremental verification during redesign cycles, and finally, detailed grid verification at sign-off time.”

The old ways of designing the PDN need to change. Aveek Sarkar, vice president of product engineering and support at Ansys, recounts a story he hears a lot: “All of my chips will have this particular power grid. They do extensive simulation and come up with a robust, well-defined grid. But it takes a lot of routing resources, and with the newer technology nodes you start to see the delay effect and you are stuck with these over-designed power grids that are not needed for most of the blocks. So the overall chip size increases. And to meet the timing needs, the area will start to go up.”

The chip also has to be considered for all of its intended purposes, and that includes test. “There have been some customers who have increased their power rails specifically for test,” says Robert Ruiz, senior director of marketing at Synopsys. “They do the economic calculation and see if it makes sense to minimize test time by increasing the activity and thus have to add capacity to the power rails. Not a lot of users do this, but there are some.”

When an interposer is going to be used, eSilicon’s Florea suggests that “you have a lot of redundant VDD/VSS uBumps. To reduce the impact of the interposer, use a checkerboard pattern mostly. If metal-insulator-metal (MIM) technology is available at your interposer provider, I recommend you use that.”

Shanmugvel suggests a systematic approach for developing a power architecture driven by simulation, and offers three recommendations. The first: “Do not take margin based approach for granted. You may never be able to close PDN noise for advanced technology nodes if you do not consider the chip, package and system in a holistic manner. Margining can lead to over-design and high cost. Complex interactions between the impedances on the chip and package need to be simulated to ensure proper decoupling schemes. Interactions between voltage drop, timing and routing congestion could have large implications on die-size and cost. Getting actionable information from data analytics from multiple domains is critical.”

His second recommendation is to “expand your vector coverage. Start from the architectural stage (RTL) and make sure you capture necessary operating vectors for your PDN design. It is easier to profile large vector sets during RTL design as opposed to physical design stage. Peak power cycles and large di/dt cycles are the most important. Do not underestimate test mode analysis for power integrity validation. Test modes can typically have large di/dt and a strong coupling with the package/board impedances.”
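
A minimal sketch of that kind of vector profiling is shown here, assuming each vector has already been reduced to a per-cycle supply current estimate (for example, from RTL power estimation). The function name, the trace format and the sample values are hypothetical; the code simply locates the peak-current cycle and the largest cycle-to-cycle current step (di/dt) in one trace.

```python
def profile_vector(current_amps, cycle_time_s):
    """Find the peak-current cycle and the largest di/dt step in one vector.

    current_amps: estimated supply current for each clock cycle (assumed input).
    cycle_time_s: clock period in seconds.
    """
    if len(current_amps) < 2:
        raise ValueError("need at least two cycles to compute di/dt")

    # Cycle with the highest estimated current.
    peak_cycle = max(range(len(current_amps)), key=lambda i: current_amps[i])

    # Rate of change between consecutive cycles, in A/s.
    didt = [
        (current_amps[i + 1] - current_amps[i]) / cycle_time_s
        for i in range(len(current_amps) - 1)
    ]
    # Rising and falling steps both excite the package/board impedance,
    # so rank by magnitude.
    worst_step = max(range(len(didt)), key=lambda i: abs(didt[i]))

    return {
        "peak_current_a": current_amps[peak_cycle],
        "peak_cycle": peak_cycle,
        "max_didt_a_per_s": didt[worst_step],
        "max_didt_cycle": worst_step,
    }


# Hypothetical 5-cycle trace at a 1 ns clock period.
trace = [0.1, 0.2, 1.0, 0.9, 0.3]
print(profile_vector(trace, 1e-9))
```

Running the same profile over every functional and test-mode vector, then keeping the worst few for detailed PDN simulation, is one way to widen coverage without simulating everything at the physical level.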

And his final recommendation: “Do not make assumptions from previous designs. Every chip is unique in functionality, technology node or target market requirement. Careful tradeoffs in power, performance, reliability and cost need to be made early in the design cycle. Reliability failures due to EM, ESD and thermal are not to be taken lightly. Adopt a simulation-driven power grid design and start early in the design stage to avoid surprises.”

Zhao recommends one approach to make analysis easier. “A powerful technique is to perform resistance analysis for the whole die. Before you do expensive simulation analysis and dynamic IR drop, you just do a static analysis of all of the grid resistances. So, if I have an instance in the middle of the die, then I may want to know the effective resistance of that instance to Vdd. It could have multiple ports and multiple paths through the voltage supply. I want to find any threshold of effective resistance rule violated. Then I can further ask for that instance, what is my minimum resistance path. If I don’t have the via placed properly the current may violate the EM rule. A small via cannot sustain that much current. Those kinds of problems can be identified fairly early.”
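
The static check Zhao describes can be sketched on a toy resistor network. This is only an illustration under stated assumptions, not how any signoff tool works: the grid is given as a list of hypothetical (node, node, ohms) segments, and the effective resistance from an instance to the supply is found by injecting 1 A at the instance, grounding the supply node, and solving the nodal equations with plain Gaussian elimination.

```python
def solve_linear(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    a = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        partial = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (a[r][n] - partial) / a[r][r]
    return x


def effective_resistance(segments, instance, supply):
    """Effective resistance (ohms) between an instance node and the supply node."""
    nodes = sorted({n for a, b, _ in segments for n in (a, b)})
    unknowns = [n for n in nodes if n != supply]  # the supply node is held at 0 V
    index = {n: i for i, n in enumerate(unknowns)}
    g = [[0.0] * len(unknowns) for _ in unknowns]
    for a, b, ohms in segments:
        conductance = 1.0 / ohms
        for p, q in ((a, b), (b, a)):
            if p == supply:
                continue
            g[index[p]][index[p]] += conductance
            if q != supply:
                g[index[p]][index[q]] -= conductance
    injected = [0.0] * len(unknowns)
    injected[index[instance]] = 1.0  # 1 A test current, so node voltage equals R_eff
    voltages = solve_linear(g, injected)
    return voltages[index[instance]]


def find_violations(segments, instances, supply, limit_ohms):
    """Return (instance, r_eff) pairs whose effective resistance exceeds the limit."""
    results = [(i, effective_resistance(segments, i, supply)) for i in instances]
    return [(i, r) for i, r in results if r > limit_ohms]


# Hypothetical grid: two parallel paths from VDD to the instance (3 ohms and 6 ohms).
grid = [("VDD", "A", 1.0), ("A", "inst", 2.0), ("VDD", "B", 3.0), ("B", "inst", 3.0)]
print(effective_resistance(grid, "inst", "VDD"))
print(find_violations(grid, ["inst"], "VDD", limit_ohms=1.5))
```

Because no switching activity or simulation vectors are needed, a check like this can run as soon as a draft grid exists. Instances it flags are candidates for the closer look at minimum-resistance paths and via sizing that Zhao mentions.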

Sometimes, the right approach may come from turning the problem upside down. “We have a customer who intentionally under-designed the grid and then ran simulations,” explains Ansys’ Sarkar. “When they collated the data they saw which were the hot spots across a hundred different scenarios and identified the ones that they needed to fix. They only fixed those. By doing that, the routing area opened up immediately. They were able to converge on timing and saw a 5% reduction in die size. And more importantly, they reduced the power. Now they don’t have to route the longer wires and go into LVT structures because timing is no longer that big an issue. They saw an overall 10% reduction in total power as a side benefit. Over-design techniques that have been used have to be scaled back in order to compensate for the R increase in the routing.”
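
The collation step in that story can be sketched as follows. The data layout is an assumption made for illustration: each scenario maps a grid region to its worst simulated voltage drop, and a region counts as a hot spot in a scenario when the drop exceeds a chosen limit. The scenario names, region names and numbers are hypothetical.

```python
from collections import Counter


def collate_hot_spots(scenario_drops, limit_volts):
    """Rank regions by how many scenarios push them past the drop limit.

    scenario_drops: {scenario_name: {region_name: worst_drop_in_volts}}
    Returns a list of (region, violating_scenario_count, worst_drop) tuples,
    with the most frequently violating regions first.
    """
    counts = Counter()
    worst = {}
    for drops in scenario_drops.values():
        for region, drop in drops.items():
            if drop > limit_volts:
                counts[region] += 1
                worst[region] = max(worst.get(region, 0.0), drop)
    ranked = sorted(counts, key=lambda r: (-counts[r], -worst[r], r))
    return [(region, counts[region], worst[region]) for region in ranked]


# Hypothetical results from three simulated scenarios on an under-designed grid.
scenarios = {
    "boot": {"cpu": 0.08, "gpu": 0.03, "ddr": 0.02},
    "video": {"cpu": 0.04, "gpu": 0.09, "ddr": 0.02},
    "scan_test": {"cpu": 0.10, "gpu": 0.07, "ddr": 0.03},
}

for region, count, drop in collate_hot_spots(scenarios, limit_volts=0.05):
    print(f"{region}: fails in {count} scenario(s), worst drop {drop:.2f} V")
```

Regions that never appear in the result, such as `ddr` in this example, are the ones where extra grid metal would be wasted. Only the ranked regions need reinforcement.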

The development of the power grid is no longer the design of a fortress capable of withstanding any onslaught that the chip may provide. As the overall functionality of the chip is defined, power consumption can be estimated, and this information used to tactically design the most effective power delivery network. While it may always be seen as a cost on the chip, its optimization ensures the minimum necessary overhead and that gain may improve other aspects of the chip.

Related Stories
SoC Power Grid Challenges
How efficient is the power delivery network of an SoC, and how much are they overdesigning to avoid a multitude of problems?
Electromigration: Not Just Copper Anymore
Advanced packaging is creating new stresses and contributing to reliability issues.
Thermal Damage To Chips Widens
Heat issues resurface at advanced nodes, raising questions about how well semiconductors will perform over time for a variety of applications.


