4405ch04 Continuous availability and manageability.fmDraft Document for Review September 2, 2008 5:05 pm
94 IBM Power 570 Technical Overview and Introduction
single-CPU-per-processor design. Not only does this reduce the total number of system
components, it reduces the total amount of heat generated in the design, resulting in an
additional reduction in required power and cooling components.
Parts selection also plays a critical role in overall system reliability. IBM uses three grades of
components, with grade 3 defined as industry standard (off-the-shelf). As shown in
Figure 4-1, using stringent design criteria and an extensive testing program, the IBM
manufacturing team can produce grade 1 components that are expected to be 10 times more
reliable than industry standard. Engineers select grade 1 parts for the most critical system
components. Newly introduced organic packaging technologies, rated grade 5, achieve the
same reliability as grade 1 parts.
Figure 4-1 Component failure rates
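The grade comparison above can be expressed as simple arithmetic. The following is a minimal sketch, assuming that "10 times more reliable" means one-tenth the failure rate and that grade 3 is normalized to 1.0; the names and the normalization are illustrative assumptions, not measured IBM data.

```python
# Relative failure rates, normalized so grade 3 (industry standard) equals 1.0.
# Assumption: "10 times more reliable" is read as one-tenth the failure rate.
GRADE3_RELATIVE_RATE = 1.0
GRADE1_RELIABILITY_FACTOR = 10  # stated in the text for grade 1 parts

relative_failure_rate = {
    "grade 3": GRADE3_RELATIVE_RATE,
    "grade 1": GRADE3_RELATIVE_RATE / GRADE1_RELIABILITY_FACTOR,
}
# The text states that grade 5 organic packaging matches grade 1 reliability.
relative_failure_rate["grade 5"] = relative_failure_rate["grade 1"]


def expected_failures(grade: str, baseline_failures: float) -> float:
    """Scale a hypothetical grade 3 failure count to another grade."""
    return baseline_failures * relative_failure_rate[grade]


if __name__ == "__main__":
    for grade, rate in relative_failure_rate.items():
        print(f"{grade}: relative failure rate {rate}")
```

Under these assumptions, a population of parts that would see 50 failures at grade 3 would be expected to see about 5 at grade 1 or grade 5.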
4.1.2 Placement of components
Packaging is designed to deliver both high performance and high reliability. For example, the
reliability of electronic components is directly related to their thermal environment: relatively
small increases in temperature are correlated with large decreases in component reliability.
For this reason, POWER6 processor-based systems are carefully packaged to ensure
adequate cooling. Critical system components such as the POWER6 processor chips are
positioned on printed circuit cards so they receive fresh air during operation. In addition,
POWER6 processor-based systems are built with redundant, variable-speed fans that can
automatically increase output to compensate for increased heat in the central electronic
complex.
4.1.3 Redundant components and concurrent repair
High-opportunity components, or those that most affect system availability, are protected with
redundancy and the ability to be repaired concurrently.
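The benefit of redundancy can be illustrated with a short calculation. This is a sketch under a simplifying assumption that redundant components fail independently; the function name and the probability used are hypothetical and do not describe any specific Power 570 component.

```python
def all_fail_probability(p_single: float, copies: int) -> float:
    """Probability that all `copies` redundant components have failed,
    assuming each fails independently with probability `p_single`."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return p_single ** copies


if __name__ == "__main__":
    p = 0.01  # hypothetical failure probability of one component
    print("single component:", all_fail_probability(p, 1))
    print("redundant pair:  ", all_fail_probability(p, 2))
```

With independent failures, adding a second component reduces the chance of losing the function from p to p squared. Concurrent repair further limits the time during which only one working component remains.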
[Figure 4-1: bar chart of component failure rates, vertical axis from 0 to 1, for Grade 3, Grade 1, and Grade 5 components]