IBM BladeCenter PS700 User Manual

Chapter 4. Continuous availability and manageability
4.3.2 General detection and deallocation of failing components
Runtime correctable or recoverable errors are monitored to determine whether there is a pattern
of errors. If a component reaches a predefined error limit, the service processor initiates an
action to deconfigure the faulty hardware, which avoids a potential system outage and enhances
system availability.
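The threshold behavior described above can be illustrated with a minimal sketch. The limit
value, the class name, and the component names are assumptions made for illustration only; the
actual error limits and the service processor firmware interfaces are not described here.

```python
from collections import Counter

# Hypothetical limit: the real predefined error limits are not given in the text.
ERROR_LIMIT = 3


class ErrorMonitor:
    """Counts recoverable errors per component and deconfigures at the limit."""

    def __init__(self, limit=ERROR_LIMIT):
        self.limit = limit
        self.counts = Counter()
        self.deconfigured = set()

    def report_recoverable_error(self, component):
        """Record one recoverable error.

        Returns True only when this report causes the component to be
        deconfigured because it has reached the error limit.
        """
        if component in self.deconfigured:
            return False
        self.counts[component] += 1
        if self.counts[component] >= self.limit:
            self.deconfigured.add(component)
            return True
        return False


monitor = ErrorMonitor()
results = [monitor.report_recoverable_error("core0") for _ in range(3)]
print(results)                       # prints [False, False, True]
print(sorted(monitor.deconfigured))  # prints ['core0']
```

In this sketch the first two errors are only counted, and the third one reaches the limit and
triggers deconfiguration.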
Persistent deallocation
To enhance system availability, a component that is identified for deallocation or
deconfiguration on a POWER processor-based system is flagged for persistent deallocation.
Component removal can occur either dynamically (as the system is running) or at boot-time
(IPL), depending on both the type of fault and when the fault is detected.
In addition, hardware that suffers a runtime unrecoverable fault can be deconfigured from the
system after the first occurrence. The system can be rebooted immediately after the failure and
resume operation on the remaining stable hardware. This approach prevents the same faulty
hardware from affecting system operation again, and the repair action is deferred to a more
convenient, less critical time.
Persistent deallocation includes the following elements:
- Processor
- L2/L3 cache lines (cache lines are dynamically deleted)
- Memory
- I/O adapters (failing adapters are deconfigured or bypassed)
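The idea of a deallocation flag that survives a reboot can be sketched as follows. This is an
illustrative model only: the JSON file, the function names, and the component names are
assumptions, and the real system keeps this state in firmware rather than in a file.

```python
import json
import os
import tempfile


def load_flagged(store_path):
    """Return the set of components flagged in earlier sessions."""
    if not os.path.exists(store_path):
        return set()
    with open(store_path) as f:
        return set(json.load(f))


def flag_for_deallocation(store_path, component):
    """Add a component to the persistent deallocation record."""
    flagged = load_flagged(store_path)
    flagged.add(component)
    with open(store_path, "w") as f:
        json.dump(sorted(flagged), f)


def configure_at_boot(store_path, installed):
    """Return the components to bring online at boot (IPL)."""
    flagged = load_flagged(store_path)
    return [c for c in installed if c not in flagged]


with tempfile.TemporaryDirectory() as tmp:
    store = os.path.join(tmp, "deconfig.json")  # hypothetical record location
    flag_for_deallocation(store, "core2")
    online = configure_at_boot(store, ["core0", "core1", "core2", "core3"])
    print(online)  # prints ['core0', 'core1', 'core3']
```

Because the flag is read back at every boot, the flagged component stays out of the
configuration until a repair action clears the record.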
Processor instruction retry
As in POWER6, the POWER7 processor supports processor instruction retry and alternate
processor recovery for a number of core-related faults. This approach significantly
reduces exposure to both permanent and intermittent errors in the processor core.
Intermittent errors, often the result of cosmic rays or other sources of radiation, are
generally not repeatable.
With this function, when an error is encountered in the core, in the caches, or in certain logic
functions, the POWER7 processor automatically retries the instruction. If the source of the
error was truly transient, the instruction succeeds and the system continues as before.
On IBM systems prior to POWER6, this type of error would have caused a checkstop.
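The retry behavior can be modeled in software with a short sketch. The processor performs the
retry in hardware, so the exception type, the function names, and the retry count below are
purely illustrative assumptions.

```python
class InstructionFault(Exception):
    """Hypothetical signal that an instruction hit a hardware error."""


def execute_with_retry(instruction, max_retries=1):
    """Run the instruction; on a fault, retry up to max_retries times.

    If the instruction still fails after the retries, the fault is
    re-raised, which corresponds to a hard (non-transient) error.
    """
    attempts = 0
    while True:
        try:
            return instruction()
        except InstructionFault:
            if attempts >= max_retries:
                raise
            attempts += 1


def make_flaky(fail_times, value):
    """Build a callable that fails fail_times times, then returns value."""
    state = {"remaining": fail_times}

    def instruction():
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise InstructionFault()
        return value

    return instruction


# A transient fault: the first attempt fails, the retry succeeds.
retried = execute_with_retry(make_flaky(1, 42))
print(retried)  # prints 42
```

A fault that persists through the retry is not masked by this scheme and needs a different
recovery path.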
Alternate processor retry
Hard failures are more difficult to handle because they are permanent errors that recur each
time the instruction is repeated. Retrying the instruction on the same core does not help in
this situation because the instruction continues to fail.
As in POWER6, POWER7 processors can extract the failing instruction from the faulty core and
retry it elsewhere in the system for a number of faults, after which the failing core is
dynamically deconfigured and scheduled for replacement.
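A simplified model of this recovery sequence is sketched below. The Core class, the single
retry per core, and the core names are assumptions for illustration; the actual mechanism is
implemented in hardware and firmware.

```python
class CoreFault(Exception):
    """Hypothetical signal that a core failed to complete an instruction."""


class Core:
    def __init__(self, name, hard_fault=False):
        self.name = name
        self.hard_fault = hard_fault  # True models a permanent failure
        self.configured = True

    def run(self, instruction):
        if self.hard_fault:
            raise CoreFault(self.name)
        return instruction()


def run_with_alternate_recovery(cores, instruction):
    """Try the instruction on each configured core in turn.

    Each core gets an initial attempt plus one retry. A core that fails
    both attempts is deconfigured and the instruction moves to the next core.
    Returns (name of the core that completed the instruction, result).
    """
    for core in cores:
        if not core.configured:
            continue
        for _attempt in range(2):
            try:
                return core.name, core.run(instruction)
            except CoreFault:
                continue
        core.configured = False  # still failing after the retry
    raise RuntimeError("no healthy core available")


cores = [Core("core0", hard_fault=True), Core("core1")]
ran_on, value = run_with_alternate_recovery(cores, lambda: 7)
print(ran_on, value)        # prints core1 7
print(cores[0].configured)  # prints False
```

The instruction completes on the healthy core, and the failing core is left deconfigured so
that later work is not scheduled on it.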
Dynamic processor deallocation
Dynamic processor deallocation enables automatic deconfiguration of processor cores when
patterns of recoverable core-related faults are detected. Dynamic processor deallocation
prevents a recoverable error from escalating to an unrecoverable system error, which might
otherwise result in an unscheduled server outage. Dynamic processor deallocation relies on
the service processor’s ability to use FFDC-generated recoverable error information to notify


IBM BladeCenter PS700 Specifications

General
Form Factor: Blade
Processor: IBM POWER7
Number of Processors: 1 or 2
Memory: Up to 64 GB
Networking: Ethernet, Fibre Channel
Power Supply: Redundant power supplies in BladeCenter chassis
Blade Height: Full-height
Operating System Support: AIX, Linux
Dimensions (HxWxD): Varies by BladeCenter chassis

Summary

Chapter 1. Introduction and General Description

1.5 System Features

Details the features of the PS700, PS701, and PS702 POWER7 processor-based blade servers.

1.6 Supported BladeCenter I/O Modules

Describes the various I/O modules supported by the BladeCenter chassis for these blades.

Chapter 2. Architecture and Technical Overview

2.2 The IBM POWER7 Processor

Details the features, architecture, and capabilities of the POWER7 processor.

2.3 POWER7 Processor-Based Blades

Summarizes POWER7 processor options for PS700, PS701, and PS702 blades.

2.4 Memory Subsystem

Explains the memory subsystem, including DIMM slots, types, and placement rules.

2.6 Internal I/O Subsystem

Details the internal I/O interfaces, PCIe, and expansion card slots on the blades.

Chapter 3. Virtualization

3.1 POWER Hypervisor

Explains the POWER Hypervisor's role in system virtualization and its functions.

3.3 PowerVM

Introduces the PowerVM platform and its family of virtualization technologies.

Chapter 4. Continuous Availability and Manageability

4.3 Availability

Explains IBM's approach to system availability, including fault detection and deallocation.
