Keeping space chips cool and reliable

Article by: Rajan Bedi

Next-gen space-grade semiconductors will consume almost 100 W, but with maximum junction temperatures remaining the same, removing this heat is paramount for reliability and performance.

With COTS and space-grade semiconductors increasingly switching more circuitry at faster speeds, a key challenge for designers is ensuring their safe and reliable operation by preventing devices from over-heating. The maximum allowable junction temperature has remained the same, and parts will be damaged if this absolute limit is exceeded for any length of time, e.g., through metal migration, gate-oxide failure and shifts in parametric performance. An increase in temperature can also result in thermal-expansion mismatches between different materials within a package or with the PCB, causing interfacial stress and warpage.

I recently spoke to a well-known FPGA provider whose latest, high-end devices consume 100 W. If you compare the surface area of an equivalently-rated light bulb to that of the die, the resulting heat-flux density of the bulb is around 13 times lower than that of the IC! This is staggering and I’m sure you would not want to touch a hot 100 W light bulb! Semiconductor reliability and lifetime are inversely related to junction temperature, and effective dissipation of heat from the microchip through the package to the ambient environment is essential for optimal device and system performance. Motorola previously estimated that the operating life of semiconductors decreases by half for every 10°C rise above 100°C: that’s an exponential decline as a function of temperature!
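To get a feel for the halving rule of thumb quoted above, here is a minimal Python sketch. The function name `relative_lifetime` and its parameters are my own illustrative choices, and clamping the result to 1.0 at or below 100°C is an assumption, since the rule as quoted only speaks about temperatures above that point.

```python
def relative_lifetime(tj_c, t_ref_c=100.0, halving_step_c=10.0):
    """Operating life relative to life at t_ref_c, using the halving rule of thumb."""
    # Assumption: the rule is only applied above t_ref_c; at or below it, return 1.0.
    rise_c = max(0.0, tj_c - t_ref_c)
    # Life halves for every halving_step_c of rise above the reference temperature.
    return 0.5 ** (rise_c / halving_step_c)


for tj_c in (100, 110, 125, 150):
    print(f"TJ = {tj_c} °C: relative lifetime = {relative_lifetime(tj_c):.3f}")
```

On this estimate, a junction running at 150°C would have roughly 3% of the operating life of one held at 100°C, which is why every degree of cooling matters.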

With small satellites baselining lower-cost, plastic-packaged ICs with higher thermal resistances, and as PCBs become more integrated, with low-voltage, high-current loads located next to their regulators using low-impedance PDNs, there is less physical space and often no financial budget available to attach an external heat sink to a hot component. Given the time-to-orbit pressures for avionics manufacturers, an understanding of how to keep your semiconductors cool will ensure your designs are right-first-time and operate reliably throughout the intended lifetime of a mission.

Typically, around 80% of the heat generated within a plastic-packaged, QFP semiconductor is removed by conduction through its pins, with the metal lead frame transferring heat generated within the device (Figure 1). Ordinarily, the remaining 20% is convected away; however, in the vacuum of space, this is not an option. As junctions get hotter, they become less reliable and more vulnerable to long-term, thermally-accelerated failures. The successful design-in of parts relies on keeping them cool so they function within their safe operating limits.

Figure 1 Cooling of standard plastic packages [Texas Instruments].

Conduction and radiation are the techniques used in-orbit to remove heat from semiconductors, and in this post, I want to focus on re-using the PCB as a heat sink for leaded or QFN chips with exposed paddles. The paddle is typically grounded and soldered to a corresponding pad on the PCB, which is electrically connected to one or more ground or power planes within the stack using thermal vias to increase the effective surface area for cooling.

This method of heat sinking exploits high thermal conductivity: devices with paddles have a much lower thermal resistance between the junction and case, i.e., θJC, which allows heat to conduct away from the bottom of the package and then spread laterally using copper planes within the PCB stack (Figure 2). If the vias are through-hole, then radiation can remove further heat from the device and the board, but for this to occur, the opposite side of the PCB must ‘see’ a lower ambient temperature, e.g., a cold face. An external heat sink can also be attached to this side of the board.

Figure 2 Cooling of exposed-paddle plastic packages [Texas Instruments].

Heat flows when there is a temperature difference from a hot junction to a cold one, and the path with the lowest thermal resistance will carry the most heat. A simplified resistor model analogous to Ohm’s law is commonly used to relate heat dissipation, temperature rise and thermal resistance (Figure 3). This can be used to estimate the impact of PCB area, thermal vias, copper thickness and external heat sinking on junction temperature and reliability.

The resistor circuit models heat, power, temperature and thermal impedance as the electrical quantities of charge, current, potential and ohmic resistance respectively, as shown below. Heat flows from the semiconductor junction through its case/package and then into a heat sink, which may be the PCB planes, an external heat sink or both. You simply multiply power dissipation by the sum of all the individual thermal resistances to calculate the temperature rise above ambient, then add the ambient temperature to obtain the junction temperature, i.e., TJ = PDISS × (θJC + θCS + θSA) + TA.

Figure 3 Equivalent thermal circuit.
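The series model TJ = PDISS × (θJC + θCS + θSA) + TA can be sketched in a few lines of Python. The function name `thermal_node_temperatures` is hypothetical, and the numbers in the example call are purely illustrative values rather than figures from any datasheet.

```python
def thermal_node_temperatures(p_diss_w, theta_jc, theta_cs, theta_sa, t_ambient_c):
    """Return (TJ, TC, TS) in °C for the series junction-case-sink-ambient model."""
    # Work back from ambient: each resistance adds a rise of P * theta (°C).
    t_sink_c = t_ambient_c + p_diss_w * theta_sa
    t_case_c = t_sink_c + p_diss_w * theta_cs
    t_junction_c = t_case_c + p_diss_w * theta_jc
    return t_junction_c, t_case_c, t_sink_c


# Illustrative values only: 10 W dissipated, resistances in °C/W, 25 °C ambient.
tj_c, tc_c, ts_c = thermal_node_temperatures(10.0, 2.5, 0.5, 7.0, 25.0)
print(f"TJ = {tj_c} °C, TC = {tc_c} °C, TS = {ts_c} °C")
```

Returning the intermediate case and sink temperatures as well as TJ is a design choice: when the PCB is the heat sink, TS is the board temperature, which has its own material limits.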

θ defines the resistance heat encounters when transferring from one structure to another, e.g., from junction to case, θJC. This is expressed in terms of temperature difference per unit heat flow, °C/W, and is dependent on die thickness, surface area and the thermal conductivity of the package material. For example, a device which has a junction-to-air thermal resistance, θJA, of 100°C/W, will experience an increase of 100°C between the die and ambient for every watt of power dissipation. θJA is mostly used to rate different packages used in the same environment and should not be used to predict the thermal performance of a space-electronics component.

When re-using the PCB’s ground planes for cooling semiconductors, since most of the heat transfer is through the exposed pad to the PCB, the most critical value is the thermal resistance of the PCB, θCS. We need to size our board, i.e., planar surface area, and design our stack, i.e., number of planes to be used for heat sinking, their copper thickness and whether to use thermal vias, to target the required θCS to achieve a reliable junction temperature (Figure 4).

Figure 4 Expanded thermal-resistance model for a typical PCB [Texas Instruments].

To a first approximation, temperature rise is proportional to power dissipation and inversely proportional to surface area. The total area needed to cool and maintain a die at a target temperature can be approximated to be:

A key question at this point is how much board (plane) area is required to conduct heat from the device under test to allow it to operate reliably at a safe junction temperature. The minimum PCB size required to meet a target ӨCS can be approximated using the following equations:

Figure 5 shows that the total thermal impedance of a PCB can be calculated by adding the thermal resistances of all the individual layers, with each estimated using its thickness, cross-sectional area and material thermal conductivity, K, e.g., K = 355 W/mK for copper, 0.25 W/mK for FR4, 58 W/mK for SnAgCu lead-free solder and 0.21 W/mK for solder mask:

Figure 5 Thermal resistances of a four-layer, one square inch PCB [Renesas].
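As a rough illustration of the layer-by-layer sum, the sketch below uses the standard one-dimensional conduction relation, θ = thickness / (K × area), for each layer and adds the results in series. The conductivities are those quoted above, but the stack itself (35 µm copper layers separated by 0.5 mm of FR4) is an assumption of mine and is not taken from the Renesas figure. It also assumes heat flows uniformly through the full square inch, which is optimistic for a small package.

```python
SQUARE_INCH_M2 = 0.0254 ** 2  # one square inch expressed in m^2

K_COPPER = 355.0  # W/mK, value quoted in the article
K_FR4 = 0.25      # W/mK, value quoted in the article


def layer_resistance(thickness_m, k_w_per_mk, area_m2):
    """Through-thickness thermal resistance of one layer, in °C/W."""
    return thickness_m / (k_w_per_mk * area_m2)


def stack_resistance(layers, area_m2):
    """Series sum over (thickness_m, conductivity) pairs, in °C/W."""
    return sum(layer_resistance(t, k, area_m2) for t, k in layers)


# Hypothetical four-layer stack: copper / FR4 / copper / FR4 / copper / FR4 / copper.
example_stack = [
    (35e-6, K_COPPER), (0.5e-3, K_FR4),
    (35e-6, K_COPPER), (0.5e-3, K_FR4),
    (35e-6, K_COPPER), (0.5e-3, K_FR4),
    (35e-6, K_COPPER),
]

total = stack_resistance(example_stack, SQUARE_INCH_M2)
print(f"Vertical thermal resistance of the stack: {total:.2f} °C/W")
```

With these assumed dimensions the FR4 dominates completely: each dielectric layer contributes about 3.1°C/W, while each copper layer contributes well under a thousandth of that.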

Thermal vias and thicker copper can reduce the total θCS by lowering the vertical thermal resistance through the substrate. Open vias have higher thermal resistance than filled ones because the cross-sectional area normal to the heat flow is reduced. Multiple vias conduct in parallel, which increases the total cross-sectional area and reduces their combined thermal resistance; this is then paralleled with that of the dielectric layers to calculate a lower equivalent value.
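The via arithmetic can be sketched as follows. This is a simplified model under stated assumptions: an open via is treated as a hollow copper cylinder whose wall is the plating, a filled via as a solid copper cylinder of the drill diameter (real fills may be epoxy or other materials), and all the dimensions in the example are hypothetical. The function names are mine.

```python
import math

K_COPPER = 355.0  # W/mK, value quoted in the article


def via_resistance(length_m, drill_diameter_m, plating_m=None, k=K_COPPER):
    """Thermal resistance of one via in °C/W; plating_m=None assumes a solid copper fill."""
    outer_area = math.pi / 4 * drill_diameter_m ** 2
    if plating_m is None:
        area = outer_area  # filled: the whole drill cross-section conducts
    else:
        inner_diameter_m = drill_diameter_m - 2 * plating_m
        area = outer_area - math.pi / 4 * inner_diameter_m ** 2  # annular barrel only
    return length_m / (k * area)


def parallel(*resistances):
    """Equivalent resistance of thermal paths in parallel."""
    return 1.0 / sum(1.0 / r for r in resistances)


# Hypothetical geometry: 1.6 mm board, 0.3 mm drill, 25 µm plating, 16 vias.
open_via = via_resistance(1.6e-3, 0.3e-3, plating_m=25e-6)
filled_via = via_resistance(1.6e-3, 0.3e-3)
via_array = open_via / 16                 # identical vias in parallel
dielectric = 9.3                          # °C/W, assumed value for the board beneath
print(f"open: {open_via:.1f}, filled: {filled_via:.1f} °C/W per via")
print(f"16 open vias in parallel with dielectric: {parallel(via_array, dielectric):.2f} °C/W")
```

Sweeping the via count, drill size or plating thickness in a model like this is a quick way to see where adding more vias stops paying off.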

In general, increasing plating thickness during fabrication improves the thermal conductivity of vias. You also need to ensure that your planes are sized to carry the required load current and the associated temperature rises are budgeted and compliant with the PCB materials. From Figure 3, TS becomes the temperature of the board!

How many plane layers should you use for heat sinking? What thickness and coverage? How do you know if thermal vias will be required? How many, what diameter and spacing? Copper filled or empty? Will an external heat sink also be required on the opposite side of the PCB? Do the above analyses to lower the overall thermal resistance, θCS, and once the rise in junction temperature meets your reliability needs, you are done! At some point, you will reach a point of diminishing returns when adding more heat sinking increases complexity and cost without actually contributing to cooling.

Some devices, such as linear regulators, are offered in different package types and sizes, each with its own thermal resistance and current rating (Figures 6 and 7). The smallest will have a higher thermal resistance as shown below, resulting in increased junction temperature. For one customer, I had to replace the PFM part with the larger TO case, because the former was over-heating, causing it to only operate intermittently due to its thermal cut-out.

Figure 6 Comparison of LM117 relative package sizes and load-current ratings.

Figure 7 Thermal and area comparison of packages [Texas Instruments].

CGA/BGA devices often contain dedicated thermal columns/balls to provide a heat sinking path to ground layers within the PCB stack using vias. Electrically, this low-impedance return also assists noise immunity and design-for-EMC. For CGA/BGA packages, θJC is defined as the thermal impedance from the junction to the top of the case. Typical values for plastic and ceramic-packaged space-grade semiconductors range from 0.15 to 22 °C/W. These values support the attachment of an external heat sink to ensure device junction temperatures remain within their safe operating area.

A heat sink conducts heat away from a hot junction and the choice of material, e.g., aluminum or copper, fin design and surface treatment all affect its cooling performance. The greater the surface area, the lower the value of thermal resistance between the package and the heat sink, θCS, and the better the heat transfer to ambient or a cold face.

The total thermal resistance between a hot semiconductor and ambient is the sum of all the individual resistances, e.g., between the junction and the case, θJC, the package and the heat sink, θCS, and the latter and the surrounding air, θSA, i.e., θ = θJC + θCS + θSA. Values for each of these are readily available from manufacturers’ datasheets. To specify the size of a heat sink, the simplified, steady-state, heat-transfer model above can be re-written as: θ = ΔT / PD, which gives the maximum value of thermal resistance between the die and ambient that our design can tolerate without overheating. As an example, if device power dissipation is 10 W, the maximum junction temperature is 125°C and ambient 25°C, then the maximum value of thermal resistance is (125 - 25)/10 = 10°C/W. In practice, a de-rated value for TJ is commonly used, e.g., 100°C. Continuing with the 10°C/W budget, if θJC is 2.5°C/W and θCS is 0.5°C/W, then the required thermal impedance between the heat sink and the surrounding air, θSA, must be less than 10 - 2.5 - 0.5 = 7°C/W. This is how to specify the heat sink and the next step is to choose one that physically fits with your sub-system and of course budget!
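The sizing arithmetic above is easy to capture as a small helper, sketched here in Python. The function name `max_theta_sa` is my own; the first call reproduces the worked example, and the second repeats it with the de-rated 100°C junction limit to show how much tighter the heat-sink requirement becomes.

```python
def max_theta_sa(tj_max_c, t_ambient_c, p_diss_w, theta_jc, theta_cs):
    """Largest sink-to-ambient resistance (°C/W) that keeps TJ at or below tj_max_c."""
    # Total junction-to-ambient budget: theta = delta T / PD.
    theta_total_max = (tj_max_c - t_ambient_c) / p_diss_w
    # Whatever the package and interface do not use is left for the heat sink.
    # A zero or negative result means no heat sink can meet the target.
    return theta_total_max - theta_jc - theta_cs


print(max_theta_sa(125.0, 25.0, 10.0, 2.5, 0.5))  # 7.0 °C/W, as in the worked example
print(max_theta_sa(100.0, 25.0, 10.0, 2.5, 0.5))  # 4.5 °C/W with TJ de-rated to 100 °C
```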

Some space-grade CGAs/BGAs contain an internal copper slug (or lid) to spread the heat from the die to the perimeter and the PCB as shown below (Figure 8):

Figure 8 Space-grade plastic and ceramic CGA/BGA packages.

Previously, I was contacted by a client who had designed-in an expensive, space-grade FPGA in a CGA package and discovered during hardware testing that the device was over-heating. While we were able to make many suggestions on how to power and use the part to reduce its overall dissipation, e.g., the use of a lower core voltage and lower-power I/O respectively, this unwelcome discovery should not have been made during the commissioning of the avionics. Thermal analyses, power-prediction spreadsheets and HDL simulation before manufacturing would have warned our customer of an impending reliability issue. These would have indicated that an external heat sink, a planar PCB heat sink or both were required to remove the excess heat from the junction to ensure safe operation of the IC.

The idea of thermal resistance for a semiconductor heat sink is an approximation: it does not account for non-uniform distribution of heat over a device and only models a system in thermal equilibrium, i.e., does not consider the change in temperatures with time nor does it reflect the non-linearity of radiation and convection with respect to temperature rise. However, manufacturers specify typical values of thermal resistance for heat sinks and semiconductors, which simplifies their selection.

Until next month, the person who shares the best technique to reduce junction temperature during device operation will win a Courses for Rocket Scientists World Tour tee-shirt. Congratulations to Dimitar from Bulgaria, the first to answer the riddle from my previous post.

This article was originally published on EDN.

Dr. Rajan Bedi is the CEO and founder of Spacechips, which designs and builds a range of advanced, L to K-band, ultra high-throughput on-board processors, transponders and Edge-based OBCs for telecommunication, Earth-Observation, navigation, internet and M2M/IoT satellites. The company also offers Space-Electronics Design-Consultancy, Avionics Testing, Technical-Marketing, Business-Intelligence and Training Services (www.spacechips.co.uk). Rajan can also be contacted on Twitter to discuss your space-electronics needs: https://twitter.com/DrRajanBedi

Spacechips’ Design-Consultancy Services develop bespoke satellite and spacecraft sub-systems, as well as advising customers how to use and select the right components, how to design, test, assemble and manufacture space electronics.

 
