IBM P260 Installation And Service Manual

Summary of P260

  • Page 1

    Power Systems: IBM Flex System p260 and p460 Compute Nodes Installation and Service Guide. IBM.

  • Page 4

    Note: Before using this information and the product it supports, read the information in “Safety notices” on page v, “Notices” on page 501, the IBM Systems Safety Notices manual, G229-9054, and the IBM Environmental Notices and User Guide, Z125-5823. This edition applies to IBM Power Systems servers...

  • Page 5: Contents

    Contents: Safety notices, v; Chapter 1. Introduction, 1; Product registration, 1; Related documentation, 3; IBM documentation CD, 4; Hardware and software requirements, 4; Using the documentation browser...

  • Page 6

    A2xxyyyy Logical partition SRCs, 134; A6xxyyyy Licensed Internal Code or hardware event SRCs, 135; A7xxyyyy Licensed Internal Code SRCs, 138; AAxxyyyy Partition firmware attention codes, 140; B1xxyyyy Service processor SRCs, 143; B2xxyyyy Logical partitio...

  • Page 7: Safety Notices

    Safety notices: Safety notices may be printed throughout this guide. Danger notices call attention to a situation that is potentially lethal or extremely hazardous to people. Caution notices call attention to a situation that is potentially hazardous to people because of some existing condition. ...

  • Page 8

    Danger: When working on or around the system, observe the following precautions: Electrical voltage and current from power, telephone, and communication cables are hazardous. To avoid a shock hazard: Connect power to this unit only with the IBM provided power cord. Do not use the IBM provided power...

  • Page 9

    Observe the following precautions when working on or around your IT rack system: Heavy equipment: personal injury or equipment damage might result if mishandled. Always lower the leveling pads on the rack cabinet. Always install stabilizer brackets on the rack cabinet. To avoid hazardous cond...

  • Page 10

    Caution: Removing components from the upper positions in the rack cabinet improves rack stability during relocation. Follow these general guidelines whenever you relocate a populated rack cabinet within a room or building: Reduce the weight of the rack cabinet by removing equipment starting at the...

  • Page 11

    (L003) or All lasers are certified in the U.S. to conform to the requirements of DHHS 21 CFR Subchapter J for class 1 laser products. Outside the U.S., they are certified to be in compliance with IEC 60825 as a class 1 laser product. Consult the label on each part for laser certification numbers and...

  • Page 12

    Caution: Data processing environments can contain equipment transmitting on system links with laser modules that operate at greater than Class 1 power levels. For this reason, never look into the end of an optical fiber cable or open receptacle. (C027) Caution: This product contains a Class 1M laser...

  • Page 13: Chapter 1. Introduction

    Chapter 1. Introduction: The IBM® Flex System p260 Compute Node or IBM Flex System p460 Compute Node is based on IBM Power® technologies. These compute nodes run in IBM Flex System Enterprise Chassis units to provide a high-density, high-performance compute-node environment with advanced processing...

  • Page 14

    Vital product data: Print Table 1 and use it to record information about your compute node. You will need this information when you register the compute node with IBM. You can register the compute node at http://www.ibm.com/support/mynotifications. To determine the values for your compute node, use t...

  • Page 15

    Table 1. Vital product data (continued). Columns: Vital product data field; Vital product data; How to find this data. Serial number ________________________ (7 characters). For FSM: Chassis Manager in the management software web interface of the IBM Flex System Manager. For HMC: 1. In the navigation area, c...

  • Page 16

    The compute node might have features that are not described in the documentation that comes with the compute node. Occasional updates to the documentation might include information about those features, or technical updates might be available to provide additional information that is not included in...

  • Page 17

    Select the compute node from the product menu. The available topics list displays all the documents for the compute node. Some documents might be in folders. A plus sign (+) indicates each folder or document that has additional documents under it. Click the plus sign to display the additional docume...

  • Page 18

    The IBM Flex System p260 Compute Node is a one-bay compute node and is used in an IBM Flex System Enterprise Chassis. Notes: Power, cooling, removable-media drives, external ports, and Advanced System Management (ASM) are provided by the IBM Flex System Enterprise Chassis. The operating system i...

  • Page 19

    Reliability and service features: dual alternating current power supply; IBM Flex System Enterprise Chassis: chassis redundant and hot-plug power and cooling modules; boot-time processor deallocation; compute node hot plug; customer setup and expansion; automatic reboot on power loss; intern...

  • Page 20

    Core electronics: 64-bit 2 x POWER7 processors. IBM Flex System p460 Compute Node 2-bay: Model 7895-42X 32-way SMP 2-bay: 4 socket, 4-core or 8-core at 3.2, 3.3, or 3.5 GHz; Model 7895-43X 32-way SMP 2-bay: 4 socket, 4-core at 4.0 GHz or 8-core at 3.6 or 4.1 GHz; 32 DIMM DDR3 slots. Maximum capac...

  • Page 21

    Reliability and service features: dual alternating current power supply; IBM Flex System Enterprise Chassis: chassis redundant and hot-plug power and cooling modules; boot-time processor deallocation; compute node hot plug; customer setup and expansion; automatic reboot on power loss; intern...

  • Page 22

    The compute node supports either Serial Advanced Technology Attachment (SATA) solid-state drives (SSDs) or serial-attached SCSI (SAS) hard disk drives (HDDs) in one of the following configurations: up to two 1.8 in. SATA SSDs, or up to two 2.5 in. SAS HDDs. Impressive performance using the latest m...

  • Page 23

    Chapter 2. Power, controls, indicators, and connectors: You can use the control panel to turn the compute nodes on or off and to view some controls and indicators. Other indicators are on the system board. The system board also has connectors for various components. Compute node control panel button ...

  • Page 24

    4. Enclosure fault LED: When this amber LED is lit, it indicates that a system error has occurred in the compute node. The compute-node error LED will turn off after one of the following events: correcting the error; reseating the compute node in the IBM Flex System Enterprise Chassis; cycling ...

  • Page 25

    Turning off the compute node: When you turn off the compute node, it is still connected to power through the IBM Flex System Enterprise Chassis. The compute node can respond to requests from the service processor, such as a remote request to turn on the compute node. To remove all power from the comp...

  • Page 26

    System-board connectors: Compute node components attach to the connectors on the system board. The following figure shows the connectors on the base-unit system board in the IBM Flex System p260 Compute Node. The following table identifies and describes the connectors for the IBM Flex System p260 com...

  • Page 27

    The following table identifies and describes the connectors for the IBM Flex System p460 Compute Node. Table 3. Connectors for the IBM Flex System p460 Compute Node. Columns: Callout; IBM Flex System p460 Compute Node connectors. ▌1▐ 3 V lithium battery connector (P1-E1); ▌2▐ DIMM connectors (see Figure 5 on pag...

  • Page 28

    The following figure shows individual DIMM connectors for the IBM Flex System p460 Compute Node system board. Figure 4. DIMM connectors for the IBM Flex System p260 Compute Node. Figure 5. DIMM connectors for the IBM Flex System p460 Compute Node.

  • Page 29

    System-board LEDs: Use the illustration of the LEDs on the system board to identify a light emitting diode (LED). Press and hold the front power-control button to see any light path diagnostic LEDs that were turned on during error processing. Use the following figure to identify the failing component...

  • Page 30

    The following table identifies the light path diagnostic LEDs. Table 4. IBM Flex System p260 Compute Node and IBM Flex System p460 Compute Node LEDs. Columns: Callout; Unit LEDs. ▌1▐ 3 V lithium battery LED; ▌2▐ DRV2 LED (HDD or SSD); ▌3▐ DRV1 LED (HDD or SSD); ▌4▐ Drive board LED (solid-state drive interposer, wh...

  • Page 31

    Input/output connectors and devices: The input/output connectors that are available to the compute node are supplied by the IBM Flex System Enterprise Chassis. See the documentation that comes with the IBM Flex System Enterprise Chassis for information about the input/output connectors. The Ethernet ...

  • Page 33

    Chapter 3. Configuring the compute node: While the firmware is running power-on self-test (POST) and before the operating system starts, a POST menu with POST indicators is displayed. The POST indicators are the words Memory, Keyboard, Network, SCSI, and Speaker that are displayed as each component...

  • Page 34

    Updating the firmware: IBM periodically makes firmware updates available for you to install on the compute node, on the management module, or on expansion cards in the compute node. Before you begin: Attention: Installing the wrong firmware update might cause the compute node to malfunction. Before yo...

  • Page 35

    About this task: To install compute node firmware using an in-band method, complete the following steps. Procedure: 1. Download the IBM Flex System p260 Compute Node one-bay firmware or the IBM Flex System p460 Compute Node two-bay firmware. a. Go to http://www.ibm.com/software/brandcatalog/puresystem...

  • Page 36

    6. Restart the compute node to apply the firmware update. 7. Run the following command in AIX or Linux to verify that the firmware update was successful: lsmcode -A. Run the following command in VIOS to verify that the firmware update was successful: lsfware -all. Starting the TEMP image: The system firmwa...
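
The verification step above pairs each operating system with one command. A minimal Python sketch of that pairing follows. The function name `firmware_verify_command` and the lowercase operating-system keys are hypothetical, and the AIX and Linux flag is written in uppercase (`lsmcode -A`) on the assumption that this matches the AIX command reference. The sketch only returns the command string; it does not run anything.

```python
# Hypothetical helper: maps an operating system name to the command that
# the procedure above uses to verify a firmware update.
VERIFY_COMMANDS = {
    "aix": "lsmcode -A",      # assumption: uppercase -A flag
    "linux": "lsmcode -A",    # same command as AIX, per the step above
    "vios": "lsfware -all",
}


def firmware_verify_command(os_name: str) -> str:
    """Return the firmware verification command for the given OS name."""
    key = os_name.strip().lower()
    if key not in VERIFY_COMMANDS:
        raise ValueError(f"no verification command known for {os_name!r}")
    return VERIFY_COMMANDS[key]


if __name__ == "__main__":
    for name in ("AIX", "Linux", "VIOS"):
        print(f"{name}: {firmware_verify_command(name)}")
```

An unknown operating-system name raises `ValueError` rather than guessing a command, because running the wrong tool would say nothing about the firmware level.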

  • Page 37

    Starting the SMS utility: Start the SMS utility to configure the compute node. Procedure: 1. Turn on or restart the compute node, and establish an SOL session with it. See the IBM Chassis Management Module Command-Line Interface Reference Guide for more information. 2. When the POST menu and indicator...

  • Page 38

    About this task: The CE login must have a role of Run Diagnostics and must be in a primary group of System. This setting enables the CE login to perform the following tasks: run the diagnostics, including the service aids, certification, and formatting; run all the operating-system commands that ...

  • Page 39

    MAC addresses for integrated Ethernet controllers: Two integrated Ethernet ports are used by the service processor on the IBM Flex System p260 Compute Node or IBM Flex System p460 Compute Node. Additional Ethernet ports are provided by the feature cards plugged into the two expansion card slots. The...

  • Page 40

    Configuring a RAID array: Use this information to configure a RAID array. About this task: Configuring a RAID array applies to a compute node in which disk drives or solid-state drives are installed. Note: When configuring a RAID array, the hard disk drives must use the same type of interface and must...

  • Page 41

    Chapter 4. Installing the operating system: Before you install the operating system on the compute node, verify that the compute node is installed in the IBM Flex System Enterprise Chassis, that the management-module firmware is at the latest available level, and that the compute node is turned on. A...

  • Page 42

    You can install the AIX operating system by following the installation instructions in the IBM Systems Information Center. See the online AIX Installation and migration topic for more information. You can find more information about AIX in the IBM System p® Information Roadmap on the IBM website. N...

  • Page 43

    Results: After you install the operating system, install operating system updates, and then install any utilities that apply to your operating system. Installing service and productivity tools for Linux: Linux service and productivity tools include hardware diagnostic aids and productivity tools, and ...

  • Page 45

    Chapter 5. Accessing the service processor: You can access the service processor remotely. The management console can connect directly to the Advanced System Management Interface (ASMI) for a selected system. ASM is an interface to the service processor that you can use to manage the operation of the...

  • Page 47

    Chapter 6. Installing and removing components: Install or remove hardware components, such as memory modules or input/output expansion cards. Some installation procedures require you to remove an installed component. Returning a device or component: If you are instructed to return a device or componen...

  • Page 48

    System reliability guidelines: Follow these guidelines to help ensure proper cooling and system reliability. Verify that the ventilation holes on the compute node are not blocked. Verify that you are maintaining proper system cooling in the unit. Do not operate the IBM Flex System Enterprise Chas...

  • Page 49

    Removing the compute node from an IBM Flex System Enterprise Chassis: Remove the compute node from the IBM Flex System Enterprise Chassis to access options, connectors, and system-board indicators. About this task: Attention: To maintain proper system cooling, do not operate the IBM Flex System Ente...

  • Page 50

    3. Press the power-control button to turn off the compute node. See “Turning off the compute node” on page 13. 4. Wait at least 30 seconds for the hard disk drive to stop spinning. 5. Open the release handles, as shown in Figure 8 on page 37. The IBM Flex System p260 Compute Node has one release han...

  • Page 51

    Removing and replacing tier 1 CRUs: Replacement of tier 1 customer-replaceable units (CRUs) is your responsibility. About this task: If IBM installs a tier 1 CRU at your request, you will be charged for the installation. The illustrations in this documentation might differ slightly from your hardware....

  • Page 52

    To remove the compute node cover, complete the following steps. Procedure: 1. Read the Safety topic and the “Installation guidelines” on page 35. 2. Shut down the operating system on all partitions of the compute node, turn off the compute node, and remove the compute node from the IBM Flex System En...

  • Page 53

    Caution: Hazardous energy is present when the compute node is connected to the power source. Always replace the compute node cover before installing the compute node. Installing and closing the compute node cover: Install and close the cover of the compute node before you insert the compute node into...

  • Page 54

    Statement 21. Caution: Hazardous energy is present when the compute node is connected to the power source. Always replace the compute node cover before installing the compute node. To replace and close the compute node cover, complete the following steps. Procedure: 1. Read the Safety topic and the “I...

  • Page 55

    3. Slide the cover forward to the closed position until the releases click into place in the cover. 4. Install the compute node into the IBM Flex System Enterprise Chassis. See “Installing the compute node in an IBM Flex System Enterprise Chassis” on page 93. Removing the bezel assembly: If a bezel i...

  • Page 56

    To remove the compute node bezel, complete the following steps. Procedure: 1. Read the Safety topic and the “Installation guidelines” on page 35. 2. Shut down the operating system on all partitions of the compute node, turn off the compute node, and remove the compute node from the IBM Flex System En...

  • Page 57

    Installing the bezel assembly: You can replace a damaged bezel on an IBM Flex System p260 Compute Node or IBM Flex System p460 Compute Node. About this task: If the bezel becomes damaged, you can install a new bezel by using the following procedure. Figure 15. Installing the bezel assembly in an IBM F...

  • Page 58

    To replace a bezel on the compute node, complete the following steps. Procedure: 1. Align the bezel with the compute node according to the circled locations indicated in Figure 15 on page 45 and Figure 16. 2. Align the bezel assembly with the front of the compute node. Firmly press the bezel at the s...

  • Page 59

    About this task: To remove the hard disk drive, complete the following steps. Procedure: 1. Back up the data from the drive to another storage device. 2. Read the Safety topic and the “Installation guidelines” on page 35. 3. Shut down the operating system on all partitions of the compute node, turn of...

  • Page 60

    7. If you are instructed to return the hard disk drive, follow all packaging instructions, and use any packaging materials for shipping that are supplied to you. Installing a SAS hard disk drive: If your serial-attached SCSI (SAS) hard disk drive needs replacing, install another SAS hard drive in the...

  • Page 61

    2. Shut down the operating system on all partitions of the compute node, turn off the compute node, and remove the compute node from the IBM Flex System Enterprise Chassis. See “Removing the compute node from an IBM Flex System Enterprise Chassis” on page 37. 3. Carefully lay the compute node on a f...

  • Page 62

    To remove the drive, complete the following steps. Procedure: 1. Back up the data from the drive to another storage device. 2. Read the Safety topic and the “Installation guidelines” on page 35. 3. Shut down the operating system on all partitions of the compute node, turn off the compute node, and re...

  • Page 63

    7. Remove any SSDs from the carrier case. See “Removing a SATA solid-state drive” on page 53. 8. If you are instructed to return the SSD carrier, follow all packaging instructions, and use any packaging materials for shipping that are supplied to you. Note: Even if you do not plan to install another...

  • Page 64

    1. Read the Safety topic and the “Installation guidelines” on page 35. 2. Shut down the operating system on all partitions of the compute node, turn off the compute node, and remove the compute node from the IBM Flex System Enterprise Chassis. See “Removing the compute node from an IBM Flex System E...

  • Page 65

    Removing a SATA solid-state drive: If your Serial Advanced Technology Attachment (SATA) solid-state drive (SSD) needs to be replaced, you can remove it from the compute node. About this task: To remove the SATA SSD, complete the following steps. Procedure: 1. Back up the data from the drive to another ...

  • Page 66

    7. Gently spread open the carrier with your fingers while sliding the SSD out of the carrier case with your thumbs. 8. If you are instructed to return the drive, follow all packaging instructions, and use any packaging materials for shipping that are supplied to you. Installing a SATA solid-state dr...

  • Page 67

    8. Replace the carrier case in the drive tray. See “Installing a solid-state drive carrier” on page 51. Attention: Do not press the top of the drive. Pressing the top might damage the drive. 9. Install and close the compute node cover. See “Installing and closing the compute node cover” on page 41. ...

  • Page 68

    Attention: To avoid breaking the DIMM retaining clips or damaging the DIMM connectors, open and close the clips gently. Figure 23. DIMM connectors for the IBM Flex System p260 Compute Node. Figure 24. DIMM connectors for the IBM Flex System p460 Compute Node.

  • Page 69

    6. Carefully open the retaining clips (A) on each end of the DIMM connector by pressing them in the direction of the arrows. Remove the DIMM (B). 7. Install a DIMM filler (C) in any location where a DIMM is not present to avoid machine damage. Note: Before you replace the compute node cover, ensure ...

  • Page 70

    Installing a DIMM: The very low profile (VLP) dual-inline memory module (DIMM) is a tier 1 CRU. You can install it yourself. If IBM installs a tier 1 CRU at your request, you will be charged for the installation. The low profile (LP) DIMM is a tier 2 CRU. You can install it yourself or request IBM to...

  • Page 71

    Important: If there is a gap between the DIMM and the retaining clips, the DIMM is not correctly installed. Open the retaining clips to remove and reinsert the DIMM. Install a DIMM filler (C) in any location where a DIMM is not present to avoid machine damage. 11. Install and close the compute node ...

  • Page 72

    Table 6. Memory module combinations for the IBM Flex System p260 Compute Node. Columns: DIMM count; DIMM slots for the IBM Flex System p260 Compute Node (1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16). Rows, with slot positions lost in extraction: 2 x x 4 x x x x 6 x x x x x x 8 x x x x x x x x 10 x x x x x x x x x x 12 x x x x x x x x x x x x 14 x x x x x x x x x ...

  • Page 73

    Related reference: Chapter 7, “Parts listing for IBM Flex System p260 and p460 compute nodes,” on page 97. The parts listing identifies each replaceable part and its part number. Related information: http://www.ibm.com/systems/info/x86servers/serverproven/compat/us/ Removing a network adapter: If yo...

  • Page 74

    To remove a network adapter, complete the following steps. Procedure: 1. Read the Safety topic and the “Installation guidelines” on page 35. 2. Shut down the operating system on all partitions of the compute node, turn off the compute node, and remove the compute node from the IBM Flex System Enterpr...

  • Page 75

    9. Install the compute node into the IBM Flex System Enterprise Chassis. See “Installing the compute node in an IBM Flex System Enterprise Chassis” on page 93. Installing a network adapter: You can install a network adapter on its connector. About this task: To install a network adapter, complete the ...

  • Page 76

    5. Make sure that the slot’s two blue release tabs are in the open position, as indicated in Figure 30 on page 63. 6. Orient the network adapter over the system board. 7. Lower the card to the system board, aligning the connectors on the card with its connector on the system board. Press down gently...

  • Page 77

    2. Shut down the operating system on all partitions of the compute node, turn off the compute node, and remove the compute node from the IBM Flex System Enterprise Chassis. See “Removing the compute node from an IBM Flex System Enterprise Chassis” on page 37. 3. Carefully lay the compute node on a f...

  • Page 78

    Caution: When replacing the lithium battery, use only IBM part number 33F8354 or an equivalent type battery recommended by the manufacturer. If your system has a module containing a lithium battery, replace it only with the same module type made by the same manufacturer. The battery contains lithium...

  • Page 79

    Replacing the thermal sensor on an IBM Flex System p460 Compute Node: You can install this tier 2 CRU, or you can request IBM to install it, at no additional charge, under the type of warranty service that is designated for the compute node. Use this procedure to remove the thermal sensor on an IBM F...

  • Page 80

    3. Carefully lay the compute node on a flat, static-protective surface, with the cover side up. 4. Open and remove the compute node cover. See “Removing the compute node cover” on page 39. 5. Locate the thermal sensor as shown in Figure 33 on page 67. 6. For best visibility, orient yourself to the l...

  • Page 81

    About this task: To remove a DIMM, complete the following steps. Procedure: 1. Read the Safety topic and the “Installation guidelines” on page 35. 2. Shut down the operating system on all partitions of the compute node, turn off the compute node, and remove the compute node from the IBM Flex System En...

  • Page 82

    Attention: To avoid breaking the DIMM retaining clips or damaging the DIMM connectors, open and close the clips gently. 6. Carefully open the retaining clips (A) on each end of the DIMM connector by pressing them in the direction of the arrows. Remove the DIMM (B). Figure 35. DIMM connectors for the...

  • Page 83

    7. Install a DIMM filler (C) in any location where a DIMM is not present to avoid machine damage. Note: Before you replace the compute node cover, ensure that you have at least the minimum DIMM configuration installed so that your compute node operates properly. 8. If you are instructed to return th...

  • Page 84

    8. Verify that both of the connector retaining clips are in the fully open position. 9. Turn the DIMM so that the DIMM keys align correctly with the connector on the system board. Attention: To avoid breaking the DIMM retaining clips or damaging the DIMM connectors, handle the clips gently. 10. Inse...

  • Page 85

    12. Install the compute node into the IBM Flex System Enterprise Chassis. See “Installing the compute node in an IBM Flex System Enterprise Chassis” on page 93. 13. If you replaced the part because of a service action, verify the repair by checking that the amber enclosure fault LED is off. For more...

  • Page 86

    5. Save the system identifiers. Note: To complete this operation, your authority level must be administrator or authorized service provider. For information about ASMI authority levels, see http://pic.dhe.ibm.com/infocenter/powersys/v3r1m5/topic/p7hby/asmiauthority.htm. a. In the ASM Welcome pane, ...

  • Page 87

    11. If you are instructed to return the management card, follow all packaging instructions, and use any packaging materials for shipping that are supplied to you. 12. Replace the management card. See “Installing the management card.” Installing the management card: You can install this tier 2 CRU, or...

  • Page 88

    4. Insert the management card (as shown by ▌1▐ in Figure 39 on page 75) and verify that the card is securely on the connector and pushed down all the way to the main board. 5. Were you sent to this procedure from the “Replacing the system-board and chassis assembly” on page 83 procedure? Yes: Return...

  • Page 89

    11. Verify that the management card VPD is correct. Note: To complete this operation, your authority level must be administrator or authorized service provider. For information about ASMI authority levels, see http://pic.dhe.ibm.com/infocenter/powersys/v3r1m5/topic/p7hby/asmiauthority.htm. a. In th...

  • Page 90

    b) Click Tasks > Operations > Launch Advanced System Management (ASM). 3) If you do not have a management console, access ASMI by using a web interface. For more information, see Chapter 5, “Accessing the service processor,” on page 33. b. In the ASM Welcome pane, if you have not already logged in, ...

  • Page 91

    CoD VET information: System type: 7895; System serial number: 12-34567; Anchor card CCIN: 52EF; Anchor card serial number: 01-231S000; Anchor card unique identifier: 30250812077C3228; Resource ID: CA1F; Activated resources: 0000; Sequence number: 0040; Entry check: EC. Go to step 3. By using Integrated Virt...
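
The CoD VET information above is a set of "field: value" pairs, so it can be read into a lookup table before the system type and serial number are copied elsewhere. The Python sketch below assumes the panel text has one field per line. That layout, the function name `parse_vet_info`, and the lowercased keys are all hypothetical, and the sample values are the ones shown in the example above.

```python
def parse_vet_info(text: str) -> dict[str, str]:
    """Parse 'Field name: value' lines into a dict keyed by lowercased field name."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        # Split on the first colon only; values such as serial numbers
        # contain hyphens but no colons in the sample.
        name, sep, value = line.partition(":")
        if sep and value.strip():
            fields[name.strip().lower()] = value.strip()
    return fields


# Sample text assumed to be laid out one field per line.
SAMPLE = """\
CoD VET information
System type: 7895
System serial number: 12-34567
Anchor card CCIN: 52EF
Anchor card serial number: 01-231S000
Anchor card unique identifier: 30250812077C3228
Resource ID: CA1F
Activated resources: 0000
Sequence number: 0040
Entry check: EC
"""

if __name__ == "__main__":
    info = parse_vet_info(SAMPLE)
    print(info["system type"], info["system serial number"])
```

Values stay as strings so that leading zeros in fields such as the sequence number are preserved.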

  • Page 92

    a. Enter the system type, which is the value of the sys_type field from the IVM or the System type field from the ASMI, HMC, or FSM. b. Enter the serial number, which is the value of the sys_serial_num from the IVM or the System serial number field from the ASMI, HMC, or FSM. If the code has not yet...

  • Page 93

    Removing the light path diagnostics panel: You can remove this tier 2 CRU, or you can request IBM to remove it, at no additional charge, under the type of warranty service that is designated for the compute node. Remove the light path diagnostics panel to replace the panel or to reuse the panel in a ...

  • Page 94

    4. Open and remove the compute node cover. See “Removing the compute node cover” on page 39. 5. Grasp the light path diagnostics panel by its pull tab (B) and pull it horizontally out of its bracket. 6. Disconnect the cable (A) from the system board. For best maneuverability, orient yourself in fron...

  • Page 95

    1. Read the Safety topic and the “Installation guidelines” on page 35. 2. Shut down the operating system on all partitions of the compute node, turn off the compute node, and remove the compute node from the IBM Flex System Enterprise Chassis. See “Removing the compute node from an IBM Flex System E...

  • Page 96

    Before you begin: Attention: Replacing the management card and the system-board at the same time might result in the loss of vital product data (VPD) and information about the number of active processor cores. If the management card and system-board must both be replaced, replace them one at a time. ...

  • Page 97

    4. Does the compute node have Fibre Channel adapters? Yes: If the VIOS is operational, have the customer complete “Save vfchost map data” on page 492. Then, continue with the next step. Note: If the VIOS is not operational and vfchost map data is not available from a previous save operation, the vfch...

  • Page 98

    8. Have the customer record the VIOS boot device. If multiple devices are mapped to the VIOS, record the current device. Note: Boot device information is stored in the service processor of the system-board and it must be manually restored after the system-board is replaced with a new system-board. E...

  • Page 99

    19. Write the machine type, model number, and serial number of the compute node on the repair identification tag that is provided with the replacement system-board and chassis assembly. This information is on the identification label that is on the lower right corner of the bezel on the front of the...

  • Page 100

    24. Have the customer set the compute node system name. Complete the following steps: a. Access the ASMI. For information about how to access the ASMI, see Chapter 5, “Accessing the service processor,” on page 33. b. The system name was recorded in step 5 on page 85 of this procedure. If the correct...

  • Page 101

    Note: The compute node is turned on as part of the recovery process. If this system is managed by an HMC, complete the following steps: a. Log in to the HMC. b. Click Systems Management > Servers. c. If the compute node displays Failed Authentication in the Status column and Incorrect password in ...

  • Page 102

    32. Have the customer reset the system date and time through the operating system that is installed. For more information, see the documentation for your operating system. 33. If the part was replaced because of a service action, verify the repair by checking that the amber enclosure fault LED is of...

  • Page 103

    Installing and closing the compute node cover: Install and close the cover of the compute node before you insert the compute node into the IBM Flex System Enterprise Chassis. Do not attempt to override this important protection. About this task: Figure 42. Installing the cover for an IBM Flex System p...

  • Page 104

    Statement 21. Caution: Hazardous energy is present when the compute node is connected to the power source. Always replace the compute node cover before installing the compute node. To replace and close the compute node cover, complete the following steps. Procedure: 1. Read the Safety topic and the “I...

  • Page 105

    3. Slide the cover forward to the closed position until the releases click into place in the cover. 4. Install the compute node into the IBM Flex System Enterprise Chassis. See “Installing the compute node in an IBM Flex System Enterprise Chassis.” Installing the compute node in an IBM Flex System E...

  • Page 106

    1. Go to http://www.ibm.com/support/entry/portal/overview to download the latest firmware for the compute node. Download the firmware so that you can use it later to update the compute node after you start it. 2. Read the safety topic and the “Installation guidelines” on page 35. 3. If you have not ...

  • Page 107

    What to do next: If you have changed the configuration of the compute node or if this is a different compute node than the one you removed, you must configure the compute node. You might also have to install the compute node operating system. See Chapter 4, “Installing the operating system,” on page ...

  • Page 109: Compute Nodes

    Chapter 7. Parts listing for IBM Flex System p260 and p460 compute nodes: The parts listing identifies each replaceable part and its part number. Figure 45 on page 98 shows replaceable components that are available for the IBM Flex System p260 compute node. Figure 46 on page 101 shows replaceable com...

  • Page 110

    Table 8. IBM Flex System p260 compute node parts. Columns: index, description, CRU part number (Tier 1), FRU part number (Tier 2), failing function code (FFC). 7895-22X base system-board and chassis assembly: - 3.3 GHz, 8-core system board (CCIN 547D) - 3.2 GHz, 16-core system board (CCIN 547F) - 3.5 GHz, 16-core ...

  • Page 111

    Table 8. IBM Flex System p260 compute node parts (continued). Columns: index, description, CRU part number (Tier 1), FRU part number (Tier 2), failing function code (FFC). 1457-7F2 base system-board and chassis assembly: - 3.6 GHz, 16-core system board (CCIN A898) - 4.1 GHz, 16-core system board (CCIN A897) - 00E1...

  • Page 112

    Table 8. IBM Flex System p260 compute node parts (continued). Columns: index, description, CRU part number (Tier 1), FRU part number (Tier 2), failing function code (FFC). 6 Memory, 8 GB DDR3, 1066 MHz very low profile (VLP) DIMM (Note: 7895-22X only.) 78P0502; 6 Memory, 8 GB DDR3, 1066 MHz very low profile (VLP) DIM...

  • Page 113

    Table 9. IBM Flex System p460 compute node parts. Columns: index, description, CRU part number (Tier 1), FRU part number (Tier 2), failing function code (FFC). 7895-42X base system-board and chassis assembly: - 3.3 GHz, 16-core system board (CCIN 547C) - 3.2 GHz, 32-core system board (CCIN 547E) - 3.5 GHz, 32-core...

  • Page 114

    Table 9. IBM Flex System p460 compute node parts (continued). Columns: index, description, CRU part number (Tier 1), FRU part number (Tier 2), failing function code (FFC). 1 IBM Flex System 8-port 10 Gb converged adapter 00E1677 2E28, 2E52; 1 IBM Flex System 2-port 10 Gb RoCE adapter 90Y3481 2770; 1 IBM Flex System ...

  • Page 115

    Table 9. IBM Flex System p460 compute node parts (continued). Columns: index, description, CRU part number (Tier 1), FRU part number (Tier 2), failing function code (FFC). 9 177 GB SATA solid-state drive 74Y9115 26B4; 10 Solid-state drive carrier 74Y9407; 11 300 GB SAS small form factor disk drive and screws (4) (op...

  • Page 117: Chapter 8. Troubleshooting

    Chapter 8. Troubleshooting: Use this information to diagnose and fix any problems that might occur in your compute node. Introduction to problem solving: This problem determination and service information helps you solve problems that might occur in your IBM Flex System Enterprise Chassis compute node...

  • Page 118

    If you cannot see the AIX console after you install AIX locally by using the keyboard/video select button and local media, run the change console command and restart the compute node to switch the AIX console to a Serial over LAN (SOL) connection: chcons /dev/vty0 and then shutdown -Fr. The commands do not af...

  • Page 119

    Diagnostics: Use the available diagnostic tools to help solve any problems that might occur in the compute node. The first and most crucial component of a solid serviceability strategy is the ability to accurately and effectively detect errors when they occur. While not all errors are a threat to sys...

  • Page 120

    LED locations: See “System-board LEDs” on page 17. Front panel: See “Compute node control panel button and LEDs” on page 11. - Troubleshooting tables: Use the troubleshooting tables to find solutions to problems that have identifiable symptoms. These tables are in the online information and the problem...

  • Page 121

    Collecting dump data: A dump might be critical for fault isolation when the built-in first failure data capture (FFDC) mechanisms are not capturing sufficient fault data. Even when a fault is identified, dump data can provide additional information that is useful in problem determination. About this ...

  • Page 122

    See “System-board connectors” on page 14 for component locations. Notes: 1. Location codes do not indicate the location of the compute node within the IBM Flex System Enterprise Chassis. The codes identify components of the compute node only. 2. For checkpoints with no associated location code, see ...

  • Page 123

    Table 10. Location codes for IBM Flex System p260 compute node (continued). Columns: component, physical location code, CRU LED. USB port 1 (front panel): Un-P1-T1, no. Integrated SAS controller: Un-P1-T2, no. Machine location code: Utttt.mmm.sssssss, no. Um codes are for firmware. The format is the same as for a Un loc...
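
The location code formats listed in Table 10 can be checked mechanically. The following is a minimal Python sketch, not an IBM tool. It assumes that the three fields of Utttt.mmm.sssssss are the machine type, model, and serial number, and that component codes such as Un-P1-T1 are hyphen-separated labels. The function names and the sample serial number are hypothetical.

```python
import re

# Machine location code layout as printed in the table: Utttt.mmm.sssssss.
# Reading tttt/mmm/sssssss as type/model/serial is an assumption of this sketch.
MACHINE_LOCATION = re.compile(
    r"U(?P<machine_type>[0-9A-Z]{4})\.(?P<model>[0-9A-Z]{3})\.(?P<serial>[0-9A-Z]{7})"
)


def parse_machine_location(code):
    """Return the three fields of a Utttt.mmm.sssssss code as a dict."""
    match = MACHINE_LOCATION.fullmatch(code.strip().upper())
    if match is None:
        raise ValueError(f"not a machine location code: {code!r}")
    return match.groupdict()


def split_component_location(code):
    """Split a component code such as 'Un-P1-T1' into (unit prefix, labels)."""
    unit, *labels = code.strip().split("-")
    return unit, labels


if __name__ == "__main__":
    # The serial number below is made up for illustration.
    print(parse_machine_location("U7895.22X.1234567"))
    print(split_component_location("Un-P1-T1"))
```

Keeping the unit prefix separate from the labels reflects the note that location codes identify components inside the compute node only, not the position of the node in the chassis.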

  • Page 124

    Table 11. Mapping network adapter ports to IBM Flex System Enterprise Chassis I/O bays for the IBM Flex System p260 compute node (continued). Port location code; IBM Flex System 2-port QDR InfiniBand adapter; IBM Flex System 2-port 10 Gb RoCE adapter; IBM Flex System 2-port 8 Gb and 2-port 16 Gb Fibre C...

  • Page 125

    Table 12. Location codes for IBM Flex System p460 compute node (continued). Columns: component, physical location code, CRU LED. DIMM 24: Un-P1-C24, yes. DIMM 25: Un-P1-C25, yes. DIMM 26: Un-P1-C26, yes. DIMM 27: Un-P1-C27, yes. DIMM 28: Un-P1-C28, yes. DIMM 29: Un-P1-C29, yes. DIMM 30: Un-P1-C30, yes. DIMM 31: Un-P1-C31, yes. DIMM 32...

  • Page 126

    Table 13. Mapping network adapter ports to IBM Flex System Enterprise Chassis I/O bays for the IBM Flex System p460 compute node (continued). Port location code; IBM Flex System 2-port QDR InfiniBand adapter; IBM Flex System 2-port 10 Gb RoCE adapter; IBM Flex System 2-port 8 Gb and 2-port 16 Gb Fibre C...

  • Page 127

    Table 13. Mapping network adapter ports to IBM Flex System Enterprise Chassis I/O bays for the IBM Flex System p460 compute node (continued). Port location code; IBM Flex System 2-port QDR InfiniBand adapter; IBM Flex System 2-port 10 Gb RoCE adapter; IBM Flex System 2-port 8 Gb and 2-port 16 Gb Fibre C...

  • Page 128

    Isolation procedures: If the fault analysis does not determine a definitive cause, the service processor might indicate a fault isolation procedure that you can use to isolate the failing component. Viewing the codes: The IBM Flex System p260 compute node or IBM Flex System p460 compute node does not...

  • Page 129

    Use ASMI to display the error and event logs. See Displaying error and event logs in the IBM Power Systems Hardware Information Center. SRC formats: SRCs are strings of either six or eight alphanumeric characters. The first two characters designate the reference code type. The first characters indica...
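
As an illustration of the SRC format rule (six or eight alphanumeric characters, with the first two characters giving the reference code type), here is a small Python sketch. It is not part of any IBM tool. The function name and the fallback string are hypothetical, and the lookup table covers only the SRC families whose names appear in full in this guide, so other prefixes (for example 11, 63, B2, and BA) are reported as not listed.

```python
import re

# Families named in this guide; keys are the first two SRC characters.
# The table is deliberately incomplete (see the lead-in above).
SRC_FAMILIES = {
    "A2": "logical partition SRC",
    "A6": "licensed internal code or hardware event SRC",
    "A7": "licensed internal code SRC",
    "AA": "partition firmware attention code",
    "B1": "service processor SRC",
    "B6": "licensed internal code or hardware event SRC",
    "B7": "licensed internal code SRC",
}

NOT_LISTED = "family not listed in this sketch"


def classify_src(src):
    """Validate the SRC length and characters, then look up its two-character type."""
    code = src.strip().upper()
    if not re.fullmatch(r"[0-9A-Z]{6}|[0-9A-Z]{8}", code):
        raise ValueError(f"SRC must be six or eight alphanumeric characters: {src!r}")
    return SRC_FAMILIES.get(code[:2], NOT_LISTED)


if __name__ == "__main__":
    for sample in ("B7000106", "aa060007", "110000AC"):
        print(sample, "->", classify_src(sample))
```

Normalizing to uppercase before the lookup means that codes copied from logs in either case resolve to the same family.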

  • Page 130

    110000AC Explanation: The 110000AC SRC indicates your compute node lost AC power. This procedure will help you determine the source of the power loss condition that brought you here. If the system or expansion unit that exhibited the power loss starts normally, or stays powered on after an AC power ...

  • Page 131

    11002620 Explanation: 12V DC pgood input fault. Response: 1. Search RETAIN tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see Updating the firmware. 2. Replace the system board. If there is more than one system board, ...

  • Page 132

    1100262D Explanation: 3.3V reg_pgood fault. Response: 1. Search RETAIN tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see Updating the firmware. 2. Replace the system board. If there is more than one system board, use ...

  • Page 133

    11002640 Explanation: VRM CP1 core pgood fault. Response: 1. Search RETAIN tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see Updating the firmware. 2. Replace the system board. If there is more than one system board, u...

  • Page 134

    1100265A Explanation: 1.2VA pgood fault. Response: 1. Search RETAIN tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see Updating the firmware. 2. Replace the system board. If there is more than one system board, use the ...

  • Page 135

    11002674 Explanation: Expansion server pgood fault at standby. Response: 1. Search RETAIN tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see Updating the firmware. 2. Replace the expansion system-board (Un-P2) assembly...

  • Page 136

    1100267A Explanation: CFFh expansion card a0_pgood fault. Response: Perform the DTRCARD symbolic CRU isolation procedure by completing the following steps: 1. Reseat the PCIe expansion card. 2. If the problem persists, replace the expansion card. 3. If the problem persists: Search RETAIN tips and th...

  • Page 137

    1100267F Explanation: ETE daughter card pgood fault. Response: 1. Search RETAIN tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see Updating the firmware. 2. Replace the ETE daughter card. 1100269C Explanation: 1.2V A p...

  • Page 138

    11002901 Explanation: SSD interposer card pgood fault on the base system board. Response: 1. Replace the interposer card on the base system board. 11002902 Explanation: SSD drive power-on self-test failure on the base system board. Response: Perform one of the following: - Replace the first SSD drive...

  • Page 139

    11003134 Explanation: Fault on the hardware monitoring chip. Response: Perform the DTRCARD symbolic CRU isolation procedure by completing the following steps: 1. Reseat the PCIe expansion card. 2. If the problem persists, replace the expansion card. 3. If the problem persists: Search RETAIN tips and...

  • Page 140

    11008414 Explanation: Invalid processor 2 VPD. Response: 1. Search RETAIN tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see Updating the firmware. 2. Replace the system board. If there is more than one system board, u...

  • Page 141

    632BCFC1 Explanation: A virtual optical device cannot access the file containing the list of volumes. Service action / serviceable event: Yes. Called home automatically: No. Problem determination: On this partition and on the network file system server, verify that the proper file is specified and tha...

  • Page 142

    632BCFC6 Explanation: Virtual optical device error. The file specified does not contain data that can be processed as a virtual optical volume. Service action / serviceable event: Yes. Called home automatically: No. Problem determination: On the network file system server, verify that all the files sp...

  • Page 143

    632CC010 Explanation: Virtual optical device error. Undefined sense key returned by device. Service action / serviceable event: No. Called home automatically: No. Problem determination: Refer to the hosting partition for problem analysis. 632CC020 Explanation: Virtual optical device error. Configurati...

  • Page 144

    632CC301 Explanation: Virtual optical device error. Media or device error occurred. Service action / serviceable event: No. Called home automatically: No. Problem determination: Refer to the hosting partition for problem analysis. 632CC302 Explanation: Virtual optical device error. Media or device err...

  • Page 145

    632CC402 Explanation: Virtual optical device driver error. An internal program error occurred. Service action / serviceable event: Yes. Called home automatically: Yes. Problem determination: Install any available operating system updates. 632CCFF2 Explanation: Informational system log entry only. Serv...

  • Page 146

    632CFF3D Explanation: Informational system log entry only. Service action / serviceable event: No. Called home automatically: No. Response: No corrective action is required. 632CFF6D Explanation: Informational system log entry only. Service action / serviceable event: No. Called home automatically: No. ...

  • Page 147

    A2D03001 Explanation: User-initiated RSCDUMP of RPA partition's PFW content. Response: No corrective action is required. A2D03002 Explanation: User-initiated RSCDUMP of IBM i partition's SLIC bootloader and PFW content. Response: No corrective action is required. A2xxyyyy Explanation: See the descri...

  • Page 148

    A6xx0255 Explanation: Contact was lost with the device indicated in the SRC. The operating system is halted, waiting for the I/O function of a critical resource to return. When the I/O returns, this SRC will disappear from the panel and the O/S will continue running. This problem occurs when the conne...

  • Page 149

    B6000266 Explanation: Contact was lost with the device indicated in the SRC. The operating system is halted, waiting for the I/O function of a critical resource to return. When the I/O returns, this SRC will disappear from the panel and the O/S will continue running. This problem occurs when the conne...

  • Page 150

    A7xxyyyy Licensed Internal Code SRCs: An A7xxyyyy system reference code (SRC) is an error/event code that is related to Licensed Internal Code. A700173C Explanation: Informational system log entry only. Response: No corrective action is required. A7003000 Explanation: A user-initiated platform dump ...

  • Page 151

    A7004730 Explanation: Informational system log entry only. Response: No corrective action is required. A7004740 Explanation: Informational system log entry only. Response: No corrective action is required. A7004741 Explanation: Informational system log entry only. Response: No corrective action is r...

  • Page 152

    AAxxyyyy partition firmware attention codes: AAxxyyyy attention codes provide information about the next target state for the platform firmware. These codes might indicate that you need to perform an action. The AAxxyyyy SRCs below are the partition firmware codes that might be displayed if the POST...

  • Page 153

    AA060007 Explanation: A keyboard was not found. Response: Verify that a keyboard is attached to the USB port that is assigned to the partition. AA06000B Explanation: The system or partition was not able to find an operating system on any of the devices in the boot list. Response: 1. Use the SMS menu...

  • Page 154

    AA060011 Explanation: The firmware did not find an operating system image and at least one hard disk in the boot list was not detected by the firmware. The firmware is retrying the entries in the boot list. Response: This might occur if a disk enclosure that contains the boot disk is not fully initi...

  • Page 155

    AA260001 Explanation: Enter the type model number (must be 8 characters). Response: Enter the machine type and model of the server at the prompt. AA260002 Explanation: Enter the serial number (must be 7 characters). Response: Enter the serial number of the server at the prompt. AA260003 Explanation: E...
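
The length rules stated for AA260001 (type model number, 8 characters) and AA260002 (serial number, 7 characters) can be checked before the values are typed at the prompt. The Python sketch below is illustrative only. The function name is hypothetical, and it assumes the type and model are entered in the hyphenated form, such as 7895-22X, which is 8 characters long. The sample serial number is made up.

```python
def check_prompt_values(type_model, serial):
    """Return a list of problems with the values; an empty list means both lengths are valid."""
    problems = []
    if len(type_model.strip()) != 8:
        problems.append("type model number must be 8 characters")
    if len(serial.strip()) != 7:
        problems.append("serial number must be 7 characters")
    return problems


if __name__ == "__main__":
    # Hypothetical values for illustration.
    print(check_prompt_values("7895-22X", "1234567"))  # no problems expected
    print(check_prompt_values("789522X", "123456"))    # both values too short
```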

  • Page 156

    B15611B1 Explanation: The service processor could not communicate with the chassis management module. Response: Go to Reseating the compute node in a chassis. B15611B9 Explanation: The service processor could not communicate with the chassis management module. Response: Go to Reseating the compute n...

  • Page 157

    B1817204 Explanation: Error reading boot parameters. Response: Go to Checkout procedure. B1817205 Explanation: Boot code error. Response: Go to Checkout procedure. B1817206 Explanation: Unit check timer was reset. Response: Go to Checkout procedure. B1817207 Explanation: Error reading from NVRAM. Respon...

  • Page 158

    B1818BC2 Explanation: The service processor was reset due to a communication failure with the chassis management module. Response: If this is a tracking event, no service action is required. Otherwise, look for chassis management module problems and resolve them. B181D548 Explanation: An informational ...

  • Page 159

    B2001132 Explanation: A problem occurred during the startup of a partition. A platform firmware error occurred while it was trying to allocate memory. The startup will not continue. Response: Collect a platform dump and then go to Isolating firmware problems. B2001133 Explanation: A problem occurred...

  • Page 160

    B2001144 Explanation: A problem occurred during the migration of a partition. The migration of a partition did not complete. Response: Check for server firmware updates; then, install the updates if available. B2001148 Explanation: A problem occurred during the migration of a partition. The migratio...

  • Page 161

    B2001260 Explanation: A problem occurred during the startup of a partition. The partition could not start at the timed power-on setting because the partition was not set to normal. Response: Set the partition to normal. B2001265 Explanation: The partition could not start up. An operating system main...

  • Page 162

    B2001321 Explanation: A problem occurred during the startup of a partition. Response: Verify that the correct slot is specified for the load source. Then restart the partition. B2001322 Explanation: In the partition startup, code failed during a check of the load source path. Response: Verify that t...

  • Page 163

    B2002260 Explanation: During the startup of a partition, the partition firmware attempted an operation that failed. Response: Go to Isolating firmware problems. B2002300 Explanation: During the startup of a partition, an attempt to toggle the power state of a slot has failed. Response: Check for ser...

  • Page 164

    B2003000 Explanation: Informational system log entry only. Response: No corrective action is required. B2003081 Explanation: During the startup of a partition, the startup did not complete due to a copy error. Response: Check for server firmware updates; then, install the updates if available. B2003...

  • Page 165

    B2003114 Explanation: A problem occurred during the startup of a partition. Response: Look for other errors and resolve them. B2003120 Explanation: Informational system log entry only. Response: No corrective action is required. B2003123 Explanation: Informational system log entry only. Response: No...

  • Page 166

    B2003141 Explanation: Informational system log entry only. Response: No corrective action is required. B2003142 Explanation: Informational system log entry only. Response: No corrective action is required. B2003143 Explanation: Informational system log entry only. Response: No corrective action is r...

  • Page 167

    B2005106 Explanation: A problem occurred during the startup of a partition. There is not enough space to contain the partition main storage dump. The startup will not continue. Response: Verify that there is sufficient memory available to start the partition as it is configured. If there is already ...

  • Page 168

    B2005123 Explanation: Informational system log entry only. Response: No corrective action is required. B2005127 Explanation: Timeout occurred during a main store dump IPL. Response: There was not enough memory available for the dump to complete before the timeout occurred. Retry the main store dump ...

  • Page 169

    B200514A Explanation: A firmware-assisted dump did not complete due to a copy error. Response: Check for server firmware updates; then, install the updates if available. B200542A Explanation: A firmware-assisted dump did not complete due to a read error. Response: Check for server firmware updates; ...

  • Page 170

    B2006006 Explanation: During the startup of a partition, a system firmware error occurred when the partition memory was being initialized; the startup will not continue. Response: Go to Isolating firmware problems. B2006008 Explanation: During the IPL of the partition, an error was detected while tr...

  • Page 171

    B2006025 Explanation: A problem occurred during the startup of a partition. This is a problem with the load source media being corrupt or not valid. Response: Replace the load source media. B2006027 Explanation: During the startup of a partition, a failure occurred when allocating memory for an inte...

  • Page 172

    B2008105 Explanation: During the startup of a partition, there was a failure loading the VPD areas of the partition; the load source media has been corrupted or is unsupported on this server. Response: Check for server firmware updates; then, install the updates if available. B2008106 Explanation: A...

  • Page 173

    B2008115 Explanation: During the startup of a partition, there was a low-level partition-to-partition communication failure. Response: Check for server firmware updates; then, install the updates if available. B2008117 Explanation: During the startup of a partition, the partition did not start up du...

  • Page 174

    B2008140 Explanation: Informational system log entry only. Response: No corrective action is required. B2008141 Explanation: Informational system log entry only. Response: No corrective action is required. B2008142 Explanation: Informational system log entry only. Response: No corrective action is r...

  • Page 175

    B2008152 Explanation: No active system processors. Response: Verify that processor resources are assigned to the partition. B2008160 Explanation: A problem occurred during the migration of a partition. Response: Contact IBM support. B2008161 Explanation: A problem occurred during the migration of a ...

  • Page 176

    B200B215 Explanation: A problem occurred after a partition ended abnormally. There was a communications problem between this partition's service processor and the platform's service processor. Response: Restart the platform. B200C1F0 Explanation: An internal system firmware error occurred during a p...

  • Page 177

    B200F006 Explanation: During the startup of a partition, the code load operation for the partition startup timed out. Response: 1. Check the error logs and take the actions for the error codes that are found. 2. Go to Isolating firmware problems. B200F007 Explanation: During a shutdown of the partit...

  • Page 178

    B2D03001 Explanation: Informational system log entry only. Response: No corrective action is required. B2D03002 Explanation: Informational system log entry only. Response: No corrective action is required. B6xxyyyy Licensed Internal Code or hardware event SRCs: A B6xxyyyy system reference code (SRC)...

  • Page 179

    A6xx0255 Explanation: Contact was lost with the device indicated in the SRC. The operating system is halted, waiting for the I/O function of a critical resource to return. When the I/O returns, this SRC will disappear from the panel and the O/S will continue running. This problem occurs when the conne...

  • Page 180

    B6000266 Explanation: Contact was lost with the device indicated in the SRC. The operating system is halted, waiting for the I/O function of a critical resource to return. When the I/O returns, this SRC will disappear from the panel and the O/S will continue running. This problem occurs when the conne...

  • Page 181

    B7xxyyyy Licensed Internal Code SRCs: A B7xxyyyy system reference code (SRC) is an error code or event code that is related to Licensed Internal Code. The B7xxyyyy SRCs below are the system reference codes that might be displayed if system firmware detects a problem. Follow the suggested actions in ...

  • Page 182

    B7000106 Explanation: System firmware failure. Response: 1. Collect the event log information. 2. Collect the platform dump information. 3. Go to Isolating firmware problems. B7000107 Explanation: System firmware failure. The system detected an unrecoverable machine check condition. Response: 1. Col...

  • Page 183

    B7000441 Explanation: Service processor failure. The platform encountered an error early in the startup or termination process. Response: Replace the system-board and chassis assembly, as described in Replacing the system-board and chassis assembly. B7000443 Explanation: Service processor failure. R...

  • Page 184

    B7000650 Explanation: System firmware detected an error. Resource management was unable to allocate main storage. A platform dump was initiated. Response: 1. Collect the event log. 2. Collect the platform dump data. 3. Collect the partition configuration information. 4. Go to Isolating firmware prob...

  • Page 185

    B7001150 Explanation: Informational system log entry only. Response: No corrective action is required. B7001151 Explanation: Informational system log entry only. Response: No corrective action is required. B7001152 Explanation: Informational system log entry only. Response: No corrective action is r...

  • Page 186

    B7001733 Explanation: System firmware failure. The startup will not continue. Response: Look for and correct B1xxxxxx errors. If there are no serviceable B1xxxxxx errors, or if correcting the errors does not correct the problem, contact IBM support to reset the server firmware settings. Attention: r...

  • Page 187

    B7004402 Explanation: A system firmware error occurred while attempting to allocate the memory necessary to create a platform dump. Response: Go to Isolating firmware problems. B7004705 Explanation: System firmware failure. A problem occurred when initializing, reading, or using the system VPD. The ...

  • Page 188

    B7005120 Explanation: System firmware detected an error. Response: If the system is not exhibiting problematic behavior, you can ignore this error. Otherwise, go to Isolating firmware problems. B7005121 Explanation: System firmware detected a programming problem for which a platform dump may have bee...

  • Page 189

    B7005209 Explanation: Informational system log entry only. Response: No corrective action is required. B7005219 Explanation: Informational system log entry only. Response: No corrective action is required. B7005300 Explanation: System firmware detected a failure while partitioning resources. The pla...

  • Page 190

    B7005305 Explanation: There are no remaining worldwide port names (WWPN). Response: Refer to https://www-912.ibm.com/supporthome.nsf/document/51455410. B7005306 Explanation: There is insufficient licensed memory for firmware to support the manufacturing default configuration. Response: Contact IBM s...

  • Page 191

    B7005603 Explanation: Enclosure feature code and/or serial number not valid. Response: Verify that the machine type, model, and serial number are correct for this server. B7006900 Explanation: PCI host bridge failure. Response: 1. Replace the system-board, as described in Replacing the system-board a...

  • Page 192

    B7006911 Explanation: Platform LIC unable to find or retrieve VPD LID file. Response: Check for server firmware updates; then, install the updates if available. B7006912 Explanation: Platform LIC unable to find or retrieve VPD LID file. Response: Check for server firmware updates; then, install the ...

  • Page 193

    B7006955 Explanation: Informational system log entry only. Response: No corrective action is required. B7006956 Explanation: An NVRAM failure was detected. Response: Go to Isolating firmware problems. B7006965 Explanation: Informational system log entry only. Response: No corrective action is requir...

  • Page 194

    B7006974 Explanation: Informational system log entry only. Response: No corrective action is required. B7006978 Explanation: Informational system log entry only. Response: No corrective action is required. B7006979 Explanation: Informational system log entry only. Response: No corrective action is r...

  • Page 195

    B7006987 Explanation: Remote I/O (RIO), high-speed link (HSL), or 12X connection failure. Response: Replace the system-board and chassis assembly, as described in Replacing the system-board and chassis assembly. B7006990 Explanation: Service processor failure. Response: Replace the system-board and ...

  • Page 196

    B70069C3 Explanation: Informational system log entry only. Response: No corrective action is required. B70069D9 Explanation: Host Ethernet Adapter (HEA) failure. Response: Replace the system-board and chassis assembly, as described in Replacing the system-board and chassis assembly. B70069DA Explana...

  • Page 197

    B70069F3 Explanation: I/O controller failed. Response: If there are any operating system errors logged against I/O adapters, do not replace any I/O adapter hardware. This I/O controller failure is the root cause. 1. Collect error logs and informational logs. 2. Replace the system-board and chassis a...

  • Page 198

    B7006A22 Explanation: PCI-E switch had a recoverable failure; however, all downstream I/O had to be failed during recovery. Recovery for the PCI-E switch failure is being attempted. If the recovery is successful, no other SRC is logged. Response: Once the PCI-E switch is recovered, the downstream I/...

  • Page 199

    B7006A28 Explanation: PCI-E switch had a correctable error. Downstream I/O is not affected. Response: No corrective action is required. B7006A2A Explanation: PCI-E switch downstream I/O bus had a recoverable failure. The downstream IOA might be failed. Response: Look for and resolve other errors. If...

  • Page 200

    B700BAD2 Explanation: System firmware detected an error. Response: 1. Collect the event log information. 2. Go to Isolating firmware problems. B700F103 Explanation: System firmware failure. Response: 1. Collect the event log information. 2. Collect the platform dump information. 3. Go to Isolating fi...

  • Page 201

    B700F108 Explanation: A firmware error caused the system to terminate. Response: 1. Collect the event log information. 2. Go to Isolating firmware problems. B700F10A Explanation: System firmware detected an error. Response: Look for and correct B1xxxxxx errors. B700F10B Explanation: A processor resou...

  • Page 202

    The BAxxyyyy SRCs below are the error codes that might be displayed if POST detects a problem. Follow the suggested actions in the order in which they are listed until the problem is solved. If an action solves the problem, then you can stop performing the remaining actions. See Chapter 7, “Parts li...
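
The rule above (perform the suggested actions in their listed order and stop as soon as one solves the problem) can be sketched as a short loop. This Python fragment is an illustration only. The function name, the callables, and the sample actions are hypothetical stand-ins for manual service steps, not an IBM interface.

```python
def run_actions_in_order(actions, problem_solved):
    """Perform actions one at a time; stop as soon as the problem is solved.

    Returns the 1-based number of the action that solved the problem,
    or None if every action was tried without success.
    """
    for number, action in enumerate(actions, start=1):
        action()
        if problem_solved():
            return number
    return None


if __name__ == "__main__":
    # Hypothetical example: the second listed action clears the fault.
    state = {"fixed": False}
    performed = []
    steps = [
        lambda: performed.append("reboot the server"),
        lambda: (performed.append("update the firmware"), state.update(fixed=True)),
        lambda: performed.append("replace the system board"),
    ]
    solved_by = run_actions_in_order(steps, lambda: state["fixed"])
    print(solved_by, performed)
```

Checking the result after each action, rather than after the whole list, is what keeps later and more invasive steps from being performed unnecessarily.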

  • Page 203

    BA000032 Explanation: The firmware failed to register the lpevent queues. Response: 1. Reboot the server. 2. If the problem persists, search RETAIN tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see Updating the firmwa...

  • Page 204

    BA000060 Explanation: The firmware was unable to obtain the Open Firmware code LID details. Response: 1. Reboot the server. 2. If the problem persists: Search RETAIN tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see U...

  • Page 205

    BA000091 Explanation: Unable to load a firmware code update module. Response: 1. Reboot the server. 2. If the problem persists: Search RETAIN tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see Updating the firmware. BA0...

  • Page 206

    Ba00e850 explanation: failure when initializing dynamic reconfiguration response: 1. Reboot the server. 2. If the problem persists: search retain tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the firmwar...

  • Page 207

    BA010003 Explanation: Cannot get server hardware address. Response: Perform the following actions that checkpoint CA00E174 describes: 1. Verify that: - The BOOTP server is correctly configured; then, retry the operation. - The network connections are correct; then, retry the operation. 2. If the prob...

  • Page 208

    BA010008 Explanation: The device_type property for this device is not supported by the iSCSI initiator configuration specification. Response: 1. Reboot the server. 2. If the problem persists, search RETAIN tips and the firmware change history for the reference code to determine the recommended actio...

  • Page 209

    BA01000E Explanation: The LUN specified is not valid. Response: The embedded host Ethernet adapters (HEAs) help provide iSCSI, which is supported by iSCSI software device drivers on either AIX or Linux. Verify that all of the iSCSI configuration arguments on the operating system comply with the conf...

  • Page 210

    Ba010020 explanation: a trace entry addition failed because of a bad trace type. Response: 1. Reboot the server. 2. If the problem persists: search retain tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating th...

  • Page 211

    BA017020 Explanation: Failed to open the TFTP package. Response: Verify that the Trivial File Transfer Protocol (TFTP) parameters are correct. BA017021 Explanation: Failed to load the TFTP file. Response: Verify that the TFTP server and network connections are correct. BA01B010 Explanation: Opening th...

  • Page 212

    Ba01b013 explanation: the discover mode is invalid response: 1. Reboot the server. 2. If the problem persists: search retain tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the firmware ba01b014 explanatio...

  • Page 213

    BA01D030 Explanation: DHCP failed to write to the network. Response: 1. Verify that the network cable is connected, and that the network is active. 2. If the problem persists, search RETAIN tips and the firmware change history for the reference code to determine the recommended actions. If a firmware...

  • Page 214

    BA01D053 Explanation: DHCP::discover received a reply, but without a message type. Response: Verify that the DHCP server is properly configured. BA01D054 Explanation: DHCP::discover: DHCP NAK received. Response: DHCP discovery did receive a DHCP offer from a server that meets the client requirements, ...

  • Page 215

    BA030011 Explanation: RTAS attempt to allocate memory failed. Response: 1. Reboot the server. 2. If the problem persists, search RETAIN tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the firmware. BA04000F ...

  • Page 216

    BA040035 Explanation: The firmware was unable to find the plant of manufacture in the VPD. This may cause problems with the licensing of the AIX operating system. Response: Verify that the machine type, model, and serial number are correct for this server. If this is a new server, check for server f...

  • Page 217

    Ba050004 explanation: failed to locate service processor device tree node. Response: 1. Reboot the server. 2. Search retain tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the firmware ba05000a explanation...

  • Page 218

    Ba060021 explanation: the environment variable boot-device contained more than five entries. Response: 1. Reboot the server. 2. Search retain tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the firmware ba...

  • Page 219

    BA060061 Explanation: The operating system expects a non-IOSP partition, but it failed to make the transition to MGC mode. Response: 1. Verify that: - The alpha-mode operating system image is intended for this partition. - The configuration of the partition supports an alpha-mode operating system. 2...

  • Page 220

    BA060201 Explanation: Failed to read the VPD "boot path" field value. Response: 1. Reboot the server. 2. Search RETAIN tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the firmware. BA060202 Explanation: Fail...

  • Page 221

    BA090002 Explanation: SCSD DASD: test unit ready failed; sense data available. Response: 1. Reboot the server. 2. Search RETAIN tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the firmware. BA090003 Explanat...

  • Page 222

    Ba09000c explanation: the media is write-protected response: 1. Change the setting of the media to allow writing, then retry the operation. 2. Insert new media of the correct type. 3. Reboot the server. 4. Search retain tips and the firmware change history for the reference code to determine the rec...

  • Page 223

    Ba090011 explanation: the retry limit has been exceeded. Response: 1. Troubleshoot the scsd devices. 2. Verify that the scsd cables and devices are properly plugged. Correct any problems that are found. 3. Replace the scsd cables and devices. 4. Reboot the server. 5. Search retain tips and the firmw...

  • Page 224

    Ba120003 explanation: on an undetermined scsd device, send diagnostic failed; sense data available response: 1. Troubleshoot the scsd devices. 2. Verify that the scsd cables and devices are properly plugged. Correct any problems that are found. 3. Replace the scsd cables and devices. 4. Reboot the s...

  • Page 225

    BA130011 Explanation: USB CD-ROM in the media tray: execution of ATA/ATAPI command was not completed within the allowed time. Response: 1. Retry the operation. 2. Reboot the server. 3. Troubleshoot the media tray and CD-ROM drive. 4. Replace the USB CD or DVD drive. 5. Search RETAIN tips and the firmw...

  • Page 226

    BA130015 Explanation: USB CD-ROM in the media tray: ATA/ATAPI packet command execution failed. Response: 1. Remove the CD or DVD in the drive and replace it with a known-good disc. 2. Retry the operation. 3. Reboot the server. 4. Troubleshoot the media tray and CD-ROM drive. 5. Replace the USB CD or...

  • Page 227

    Ba140003 explanation: the scsd read/write optical send diagnostic failed; sense data available. Response: 1. Troubleshoot the scsd devices. 2. Verify that the scsd cables and devices are properly plugged. Correct any problems that are found. 3. Replace the scsd cables and devices. 4. If the problem ...

  • Page 228

    BA153002 Explanation: Gigabit Ethernet adapter failure. Response: Verify that the MAC address programmed in the flash/EEPROM is correct. BA153003 Explanation: Gigabit Ethernet adapter failure. Response: 1. Reboot the server. 2. Check for server firmware updates; install the updates if available. 3. Re...

  • Page 229

    BA154040 Explanation: The TFTP package open failed. Response: 1. Reboot the server. 2. Check for server firmware updates; install the updates if available. 3. If the problem persists, search RETAIN tips and the firmware change history for the reference code to determine the recommended actions. If a ...

  • Page 230

    BA170000 Explanation: NVRAMRC initialization failed; device test failed. Response: 1. Reboot the server. 2. Check for server firmware updates; install the updates if available. 3. If the problem persists, search RETAIN tips and the firmware change history for the reference code to determine the recom...

  • Page 231

    Ba170210 explanation: setenv/$setenv parameter error - name contains a null character response: 1. Reboot the server. 2. Check for server firmware updates; install the updates if available. 3. If the problem persists: search retain tips and the firmware change history for the reference code to deter...

  • Page 232

    BA180008 Explanation: PCI device FCode evaluation error. Response: 1. Reboot the server. 2. Search RETAIN tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the firmware. BA180009 Explanation: The FCode on a pc...

  • Page 233

    BA180014 Explanation: MSI software error. Response: 1. Reboot the server. 2. Search RETAIN tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the firmware. BA180020 Explanation: No response was received from a ...

  • Page 234

    Ba190001 explanation: firmware function to get/set time-of-day reported an error response: 1. Reboot the server. 2. Check for server firmware updates; install the updates if available. 3. Search retain tips and the firmware change history for the reference code to determine the recommended actions. ...

  • Page 235

    Ba210001 explanation: partition firmware reports a stack underflow was caught response: 1. Reboot the server. 2. Check for server firmware updates; install the updates if available. 3. Search retain tips and the firmware change history for the reference code to determine the recommended actions. If ...

  • Page 236

    Ba210011 explanation: the transfer of control to the io reporter failed response: 1. Reboot the server. 2. Search retain tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the firmware ba210012 explanation: t...

  • Page 237

    Ba210101 explanation: the partition firmware event log queue is full response: 1. Reboot the server. 2. Search retain tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the firmware ba210102 explanation: ther...

  • Page 238

    BA220020 Explanation: CRQ registration error; partner vslot may not be valid. Response: Verify that this client virtual slot device has a valid server virtual slot device in a hosting partition. BA278001 Explanation: Failed to flash firmware: invalid image file. Response: Download a new firmware updat...

  • Page 239

    Ba278007 explanation: failed to reboot the system after a firmware flash update response: 1. Reboot the server. 2. Search retain tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the firmware ba278009 explan...

  • Page 240

    Ba290001 explanation: rtas low memory corruption was detected response: 1. Reboot the server. 2. Check for server firmware updates; install the updates if available. 3. Search retain tips and the firmware change history for the reference code to determine the recommended actions. If a firmware updat...

  • Page 241

    Ba330000 explanation: memory allocation error. Response: 1. Reboot the server. 2. Check for server firmware updates; install the updates if available. 3. Search retain tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, se...

  • Page 242

    BA340001 Explanation: There was a logical partition event communication failure reading the BladeCenter Open Fabric Manager parameter data structure from the service processor. Response: 1. Reboot the server. 2. Check for server firmware updates; install the updates if available. 3. Search RETAIN ti...

  • Page 243

    Ba340005 explanation: an internal firmware error occurred; the location code mapping table was corrupted. Response: 1. Reboot the server. 2. Check for server firmware updates; install the updates if available. 3. Search retain tips and the firmware change history for the reference code to determine ...

  • Page 244

    BA340009 Explanation: An internal firmware error occurred; the Open Fabric Manager system initiator capability processing encountered an unexpected error. Response: 1. Reboot the server. 2. Check for server firmware updates; install the updates if available. 3. Search RETAIN tips and the firmware ch...

  • Page 245

    BA340021 Explanation: A logical partition event communication failure occurred when writing the BladeCenter Open Fabric Manager system initiator capabilities data to the service processor. Response: 1. Reboot the server. 2. Check for server firmware updates; install the updates if available. 3. Sear...

  • Page 246

    Use ASMI to display the error and event logs. See displaying error and event logs in the IBM Power Systems Hardware Information Center. C1xxyyyy Service processor checkpoints: The C1xxyyyy progress codes, or checkpoints, offer information about the initialization of both the service processor and th...

  • Page 247

    C1001f0f explanation: pre-standby: waiting for standby synchronization from initial transition file response: 1. Search retain tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the firmware. 2. Replace the s...

  • Page 248

    C1009x04 explanation: hardware object manager (hom): build cards ipl step in progress response: 1. Search retain tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the firmware. 2. Replace the system-board, a...

  • Page 249

    C1009x18 explanation: hardware object manager (hom): gard in progress response: 1. Search retain tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the firmware. 2. Replace the system-board, as described in r...

  • Page 250

    C1009x2c explanation: processor initialization in progress response: 1. Search retain tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the firmware. 2. Replace the system-board, as described in replacing th...

  • Page 251

    C1009x3c explanation: processor self-test step in progress response: 1. Search retain tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the firmware. 2. Replace the system-board, as described in replacing th...

  • Page 252

    C1009x4c explanation: processor wire test ipl step in progress response: 1. Search retain tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the firmware. 2. Replace the system-board, as described in replacin...

  • Page 253

    C1009x5e explanation: processor cache test case in progress response: 1. Search retain tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the firmware. 2. Replace the system-board, as described in replacing t...

  • Page 254

    C1009x6c explanation: processor link initialization step in progress response: 1. Search retain tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the firmware. 2. Replace the system-board, as described in re...

  • Page 255

    C1009x80 explanation: asic test being run response: 1. Search retain tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the firmware. 2. Replace the system-board, as described in replacing the system-board an...

  • Page 256

    C1009x90 explanation: wire test in progress response: 1. Search retain tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the firmware. 2. Replace the system-board, as described in replacing the system-board ...

  • Page 257

    C1009x9e explanation: asic hss set up in progress response: 1. Search retain tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the firmware. 2. Replace the system-board, as described in replacing the system-...

  • Page 258

    C1009xb0 explanation: asic i/o initialization step in progress response: 1. Search retain tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the firmware. 2. Replace the system-board, as described in replacin...

  • Page 259

    C1009xbd explanation: avp memory test case in progress response: 1. Search retain tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the firmware. 2. Replace the system-board, as described in replacing the sy...

  • Page 260

    C1009xd0 explanation: message passing waiting period has begun response: 1. Search retain tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the firmware. 2. Replace the system-board, as described in replacin...

  • Page 261

    C103a400 explanation: special purpose registers are being loaded, and instructions are being started, on the server processors response: 1. Search retain tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the...

  • Page 262

    C162E4xx Explanation: VPD is being collected; xx indicates the type of device from which VPD is being collected. Response: 1. Search RETAIN tips and the firmware change history for the reference code to determine the recommended actions. If a firmware update is needed, see updating the firmware. 2. R...
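    Many entries in these tables, such as C1009x04 and C162E4xx, use x and y as placeholders for digits that vary. A minimal sketch of matching a displayed code against such a pattern follows. It assumes that each placeholder stands for exactly one hexadecimal digit; the function names `src_matches` and `vpd_device_type_field` are hypothetical and are not part of any IBM tool.

```python
import re


def src_matches(code: str, pattern: str) -> bool:
    """Return True if an 8-character code fits a pattern such as 'C1009x04'.

    Assumption: every 'x' or 'y' in the pattern stands for exactly one
    hexadecimal digit; all other characters must match literally
    (case-insensitively).
    """
    parts = []
    for ch in pattern.lower():
        if ch in "xy":
            parts.append("[0-9a-f]")
        else:
            parts.append(re.escape(ch))
    return re.fullmatch("".join(parts), code.strip().lower()) is not None


def vpd_device_type_field(code: str):
    """Return the two 'xx' digits of a C162E4xx checkpoint, or None.

    The guide says only that xx indicates the device type; the meaning of
    individual values is not given here, so the raw digits are returned.
    """
    cleaned = code.strip().lower()
    if not src_matches(cleaned, "C162E4xx"):
        return None
    return cleaned[-2:]


if __name__ == "__main__":
    print(src_matches("C1009A04", "C1009x04"))
    print(vpd_device_type_field("C162E41A"))
```

    Because the letters x and y are not hexadecimal digits, a literal character in a pattern cannot be mistaken for a placeholder.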

  • Page 263

    C2001010 explanation: startup source response: 1. Go to recovering the system firmware. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. C2001100 explanation: adding partition resources to the secondary configuration response: 1. Go to recovering the syst...

  • Page 264

    C20013ff explanation: isl roadmap initialized successfully response: 1. Go to recovering the system firmware. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. C2001400 explanation: initializing sp communication area #1 response: 1. Go to recovering the sy...

  • Page 265

    C200211f explanation: power on command successful response: 1. Go to recovering the system firmware. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. C20021ff explanation: power on phase complete response: 1. Go to recovering the system firmware. 2. Repla...

  • Page 266

    C2002400 explanation: begin powering on slots response: 1. Go to recovering the system firmware. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. C2002450 explanation: waiting for power on of slots to complete response: 1. Go to recovering the system firm...

  • Page 267

    C2003111 explanation: waiting for bus object to become operational response: 1. Go to recovering the system firmware. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. C2003112 explanation: waiting for bus unit to become disabled response: 1. Go to recover...

  • Page 268

    C2003300 explanation: start softpor of a failed isl slot response: 1. Go to recovering the system firmware. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. C2003350 explanation: waiting for softpor of a failed isl slot response: 1. Go to recovering the s...

  • Page 269

    C20043ff explanation: load source device is connected response: 1. Go to recovering the system firmware. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. C2006000 explanation: locating first lid information on the load source response: 1. Go to recovering...

  • Page 270

    C2006040 explanation: preparing to initiate lid load from load source response: 1. Go to recovering the system firmware. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. C2006050 explanation: lp configuration lid primed successfully response: 1. Go to rec...

  • Page 271

    C2008040 explanation: begin transfer slot locks to partition response: 1. Go to recovering the system firmware. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. C2008060 explanation: end transfer slot locks to partition response: 1. Go to recovering the s...

  • Page 272

    C2008104 explanation: loading data structures into main store response: 1. Go to recovering the system firmware. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. C2008110 explanation: initializing event paths response: 1. Go to recovering the system firmw...

  • Page 273

    C20081ff explanation: processors started successfully, now waiting to receive the continue acknowledgement from system firmware response: 1. Go to recovering the system firmware. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. C2008200 explanation: conti...

  • Page 274

    C700xxxx Explanation: A problem has occurred with the system firmware during startup. Response: 1. Shut down and restart the server from the permanent-side image. See starting the perm image. 2. Search RETAIN tips and the firmware change history for the reference code to determine the recommended act...

  • Page 275

    Ca000030 explanation: attempting to establish a communication link by using lpevents response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. Ca000032 explanation: attempting to register lpevent queues response: 1. Go to che...

  • Page 276

    Ca000060 explanation: attempting to obtain open firmware details response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. Ca000070 explanation: attempting to load open firmware response: 1. Go to checkout procedure. 2. Repla...

  • Page 277

    Ca0000a0 explanation: open firmware package corrupted (phase 2) response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. Ca00d001 explanation: pci probe process completed, create pci bridge interrupt routing properties respo...

  • Page 278

    Ca00d00c explanation: the partition firmware is about to search for an nvram script response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. Ca00d00d explanation: evaluating nvram script response: 1. Go to checkout procedure...

  • Page 279

    CA00D020 Explanation: About to download and run the SLIC loader (IOP-less boot). Response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. CA00D021 Explanation: About to download and run the IO reporter (for VPD collection). Re...

  • Page 280

    CA00E10B Explanation: Set RTAS device properties. Response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. CA00E110 Explanation: Create kdump properties. Response: 1. Reboot the server. 2. If the problem persists: a. Go to che...

  • Page 281

    Ca00e135 explanation: create hca node response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. Ca00e136 explanation: create bsr node response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replaci...

  • Page 282

    CA00E13A Explanation: Create packages node. Response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. CA00E13B Explanation: Create HEA node. Response: 1. Reboot the server. 2. If the problem persists: a. Go to checkout procedur...

  • Page 283

    CA00E142 Explanation: The management module bootlist is being set from the operating system bootlist. Response: 1. Reboot the server. 2. If the problem persists: a. Go to checkout procedure. b. Replace the system-board, as described in replacing the system-board and chassis assembly. CA00E143 Explana...

  • Page 284

    Ca00e151 explanation: probing pci bus response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. Ca00e152 explanation: probing for adapter fcode; evaluate if present response: 1. Go to checkout procedure. 2. Replace the system...

  • Page 285

    Ca00e15b explanation: transfer control to operating system (service mode boot) response: go to boot problem resolution. Ca00e15f explanation: adapter vpd evaluation response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. Ca...

  • Page 286

    CA00E175 Explanation: BOOTP request. Response: 1. Verify that: - The BOOTP server is correctly configured; then, retry the operation. - The network connections are correct; then, retry the operation. 2. If the problem persists: a. Go to checkout procedure. b. Replace the system-board, as described in...

  • Page 287

    CA00E179 Explanation: Closing BOOTP. Response: 1. Verify that: - The BOOTP server is correctly configured; then, retry the operation. - The network connections are correct; then, retry the operation. 2. If the problem persists: a. Go to checkout procedure. b. Replace the system-board, as described in...

  • Page 288

    CA00E19B Explanation: NVRAM menu? variable not found - assume FALSE. Response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. CA00E19D Explanation: Create NVRAM node. Response: 1. Go to checkout procedure. 2. Replace the syste...

  • Page 289

    CA00E1A4 Explanation: User requested boot to SMS menus. Response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. CA00E1A5 Explanation: User requested boot to Open Firmware prompt. Response: 1. Go to checkout procedure. 2. Repl...

  • Page 290

    Ca00e1ac explanation: system booting using customized service mode boot list response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. Ca00e1ad explanation: system booting to the operating system response: 1. Go to checkout p...

  • Page 291

    CA00E1B3 Explanation: XON received. Response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. CA00E1B4 Explanation: System-directed boot-string didn't load an operating system. Response: 1. Go to checkout procedure. 2. Replace ...

  • Page 292

    CA00E1DC Explanation: Dynamic console selection. Response: 1. Verify the video session and the SOL session. The console might be redirected to the video controller. 2. Start a remote control session or access the local KVM to see the status. 3. Go to checkout procedure. 4. Replace the system-board, a...

  • Page 293

    Ca00e1f3 explanation: privileged-access password prompt response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. Ca00e1f4 explanation: end self-test sequence on one or more boot devices; begin system management services resp...

  • Page 294

    CA00E1F9 Explanation: Build boot device list for fibre-channel adapters. (The location code of the SAN adapter being scanned is also displayed.) Response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. CA00E1FA Explanation: ...

  • Page 295

    CA00E1FF Explanation: Build device list for fibre-channel (SAN) adapters. (The LUN of the SAN adapter being scanned is also displayed.) Response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. CA00E440 Explanation: Validate ...

  • Page 296

    CA00E701 Explanation: Create memory VPD. Response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. CA00E800 Explanation: Initialize RTAS. Response: 1. Go to checkout procedure. 2. Replace the system-board, as described in repla...

  • Page 297

    CA00E843 Explanation: Initializing interface/AIX access. Response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. CA00E850 Explanation: Initializing dynamic reconfiguration. Response: 1. Go to checkout procedure. 2. Replace th...

  • Page 298

    Ca00e876 explanation: initializing rtas_error_inject response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. Ca00e877 explanation: initializing dump interface response: 1. Go to checkout procedure. 2. Replace the system-boa...

  • Page 299

    Ca00e890 explanation: starting to initialize open firmware response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. Ca00e891 explanation: finished initializing open firmware response: 1. Go to checkout procedure. 2. Replace ...

  • Page 300

    CA279001 Explanation: The firmware update image contains an update module that is not already on the system. Response: 1. Look at the event log for a BA27xxxx error code to determine if a firmware installation error occurred. 2. If a firmware installation error did occur, resolve the problem. 3. Ret...
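    The first response step for CA279001 is to look through the event log for a BA27xxxx code. Assuming the log has been exported as plain text lines (the line format shown here is hypothetical, and `find_ba27_codes` is a name introduced only for this sketch), the search could look like this:

```python
import re
from typing import Iterable, List

# BA27 followed by four hexadecimal digits, as a standalone token.
_BA27 = re.compile(r"\bBA27[0-9A-F]{4}\b", re.IGNORECASE)


def find_ba27_codes(log_lines: Iterable[str]) -> List[str]:
    """Return every BA27xxxx code found in the lines, uppercased, in order."""
    found: List[str] = []
    for line in log_lines:
        for match in _BA27.finditer(line):
            found.append(match.group(0).upper())
    return found


if __name__ == "__main__":
    # Hypothetical exported log lines, for illustration only.
    sample = [
        "entry 1: CA279001 firmware update checkpoint",
        "entry 2: BA278001 failed to flash firmware",
        "entry 3: ba278007 reboot after flash failed",
    ]
    print(find_ba27_codes(sample))
```

    An empty result suggests that no firmware installation error was logged; a non-empty result lists the codes to resolve before retrying.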

  • Page 301

    D1xx1yyy Service processor dump status codes: D1xx1yyy service processor dump status codes indicate the cage or node ID that the dump component is processing, the node from which the hardware data is collected, and a counter that increments each time that the dump processor stores 4 KB of dump data. S...
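    As a rough illustration of the D1xx1yyy layout, the sketch below splits a displayed code into its two variable fields and estimates how much dump data has been stored. Two assumptions are made that the excerpt above does not confirm: that the trailing three digits (yyy) are the counter, and that the counter is hexadecimal. The `decode_dump_status` function and the `DumpStatus` type are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class DumpStatus(NamedTuple):
    xx: str            # the two variable digits after "D1", returned as-is
    counter: int       # assumed: yyy read as a hexadecimal counter
    approx_bytes: int  # counter * 4096, since each increment is 4 KB


# Literal "D1", two hex digits, literal "1", then three hex digits.
_DUMP_CODE = re.compile(r"d1([0-9a-f]{2})1([0-9a-f]{3})")


def decode_dump_status(code: str) -> Optional[DumpStatus]:
    """Decode a D1xx1yyy code, or return None if the code has another shape."""
    match = _DUMP_CODE.fullmatch(code.strip().lower())
    if match is None:
        return None
    counter = int(match.group(2), 16)
    return DumpStatus(xx=match.group(1), counter=counter,
                      approx_bytes=counter * 4096)


if __name__ == "__main__":
    print(decode_dump_status("D1071010"))
    print(decode_dump_status("D101C00F"))  # fifth character is not 1
```

    Codes whose fifth character is not 1, such as D101C00F or the D1xx3yyy platform dump codes, are deliberately rejected rather than decoded.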

  • Page 302

    D101c00f explanation: no power off to allow debugging for cpu controls response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. D1021xxx explanation: dump dump header directory response: 1. Go to checkout procedure. 2. Repla...

  • Page 303

    D1071xxx explanation: dump component trace for failing component response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. D1081xxx explanation: dump component data from /opt/p0 response: 1. Go to checkout procedure. 2. Repla...

  • Page 304

    D1141xxx explanation: dump code version response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. D1151xxx explanation: dump all /opt/p3 except rtbl response: 1. Go to checkout procedure. 2. Replace the system-board, as descr...

  • Page 305

    D11a1xxx explanation: dump any state information before dumping starts response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. D11b1xxx explanation: dump /proc filesystem response: 1. Go to checkout procedure. 2. Replace th...

  • Page 306

    D1251xxx explanation: dump crc1 calculation on response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. D1261xxx explanation: dump crc2 calculation off response: 1. Go to checkout procedure. 2. Replace the system-board, as d...

  • Page 307

    D12b1xxx explanation: initialize the headers dump time and serial numbers response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. D12c1xxx explanation: display final src to panel response: 1. Go to checkout procedure. 2. Re...

  • Page 308

    D1311xxx explanation: turn on error log capture into dump response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. D1321xxx explanation: store information about existing core files response: 1. Go to checkout procedure. 2. R...

  • Page 309

    D1ff1xxx explanation: dump complete response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. D1xx3yyy explanation: platform dump status codes are described in d1xx3y01 to d1xx3yf2 service processor dump codes response: d1xx3...
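The code headings in this section mix literal characters with placeholders, for example d1xx3y01 or d1071xxx. As a minimal sketch, a reported code can be checked against such a template as follows. It assumes that each x or y stands for exactly one hexadecimal digit and that all other characters must match literally; the function names are hypothetical and are not part of any IBM tool.

```python
import re


def template_to_regex(template: str) -> "re.Pattern[str]":
    """Compile a code template such as 'D1xx3y01' into a regular expression.

    Assumption: 'x' and 'y' are placeholders for one hex digit each;
    every other character must match literally (case-insensitive).
    """
    parts = []
    for ch in template.lower():
        if ch in "xy":
            parts.append("[0-9a-f]")      # placeholder position
        else:
            parts.append(re.escape(ch))   # literal position
    return re.compile("".join(parts), re.IGNORECASE)


def matches_template(code: str, template: str) -> bool:
    """Return True if the whole code fits the template."""
    return template_to_regex(template).fullmatch(code.strip()) is not None
```

For example, `matches_template("D1AB3C01", "D1xx3y01")` checks whether a reported code falls under the d1xx3y01 entry.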

  • Page 310

    D1xx3y03 explanation: get array values response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. D1xx3y04 explanation: stop the clocks response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replac...

  • Page 311

    D1xx3y09 explanation: get optimized cache response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. D1xx3y0a explanation: get general purpose (gp) register response: 1. Go to checkout procedure. 2. Replace the system-board, a...

  • Page 312

    D1xx3yf1 explanation: memory collection dma step response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. D1xx3yf2 explanation: memory collection cleanup response: 1. Go to checkout procedure. 2. Replace the system-board, as...

  • Page 313

    D1xxc000 explanation: indicates a message is ready to send to the hypervisor to power off response: 1. Go to checkout procedure. 2. Replace the system-board, as described in replacing the system-board and chassis assembly. D1xxc001 explanation: waiting for the hypervisor to acknowledge the delayed p...

  • Page 314

    2. See “failing function codes” on page 446 for a description of each ffc value. 3. If the srn does not appear in the table, see “solving undetermined problems” on page 498. 4. After replacing a component, verify the replacement part and perform a log-repair action using the aix diagnostics. 101-711...

  • Page 315

    101-714 explanation: the system hung while trying to configure an unknown resource. 1. Run the stand-alone diagnostics problem determination procedure. 2. If the problem remains, refer to failing function codes to find the ffc that matches the last three digits of the srn. 3. Suspect the device adap...

  • Page 316

    101-718 explanation: the system hung while trying to configure an unknown resource. 1. Run the stand-alone diagnostics problem determination procedure. 2. If the problem remains, refer to failing function codes to find the ffc that matches the last three digits of the srn. 3. Suspect the device adap...

  • Page 317

    101-722 explanation: the system hung while trying to configure an unknown resource. 1. Run the stand-alone diagnostics problem determination procedure. 2. If the problem remains, refer to failing function codes to find the ffc that matches the last three digits of the srn. 3. Suspect the device adap...

  • Page 318

    101-726 explanation: the system hung while trying to configure an unknown resource. 1. Run the stand-alone diagnostics problem determination procedure. 2. If the problem remains, refer to failing function codes to find the ffc that matches the last three digits of the srn. 3. Suspect the device adap...

  • Page 319

    110-101 explanation: the diagnostics did not detect an installed resource. If this srn appeared when running concurrent diagnostics, then run concurrent diagnostics using the diag -a command. Response: 110-921 explanation: the system halted while diagnostics were executing. Go to performing the chec...

  • Page 320

    110-924 explanation: the system halted while diagnostics were executing. Go to performing the checkout procedure or problem resolution. Note: xxx corresponds to the last three digits of the srn. Response: failing item: v ffc_812 v ffc_xxx 110-925 explanation: the system halted while diagnostics were...

  • Page 321

    111-107 explanation: a machine check occurred. Go to performing the checkout procedure. Response: 111-108 explanation: an encoded srn was displayed. Go to performing the checkout procedure. Response: 111-121 explanation: there is a display problem. Go to performing the checkout procedure. Response: ...

  • Page 322

    651-199 explanation: this is a test of the flow of a serviceable event from this system to management console. This is not a real problem. Response: problem determination: use the management console to list serviceable events and verify the serviceable event was sent to the hmc. 652-600 explanation:...

  • Page 323

    652-613 explanation: a non-critical error has been detected: external cache ecc single-bit error. Schedule deferred maintenance. Go to performing the checkout procedure. Response: failing item: v ffc_d01 652-623 explanation: a non-critical error has been detected: correctable error threshold exceede...

  • Page 324

    652-633 explanation: a non-critical error has been detected: i/o expansion unit not in an operating state. Schedule deferred maintenance. Go to performing the checkout procedure. Response: failing item: v ffc_307 652-634 explanation: a non-critical error has been detected: internal device error. Sch...

  • Page 325

    652-669 explanation: a non-critical error has been detected: correctable error threshold exceeded. Schedule deferred maintenance. Go to performing the checkout procedure. Response: failing item: v ffc_2cd 652-66a explanation: a non-critical error has been detected: correctable error threshold exce...

  • Page 326

    652-733 explanation: a non-critical error has been detected: intermediate or system bus address parity error. Schedule deferred maintenance. Go to performing the checkout procedure. Response: failing item: v ffc_2c8 v ffc_292 652-734 explanation: a non-critical error has been detected: intermediate ...

  • Page 327

    652-770 explanation: a non-critical error has been detected: intermediate system bus address parity error. Schedule deferred maintenance. Go to performing the checkout procedure. Response: failing item: v ffc_2c8 v ffc_292 652-771 explanation: a non-critical error has been detected: intermediate or ...

  • Page 328

    652-89x explanation: the cec or spcn reported a non-critical error. 1. Schedule deferred maintenance. 2. Refer to the entry map in this system unit system service guide, with the 8-digit error and location codes, for the necessary repair action. 3. If the 8-digit error and location codes were not re...

  • Page 329

    815-101 explanation: floating point processor failed. Go to performing the checkout procedure. Response: failing item: v ffc_815 815-102 explanation: floating point processor failed. Go to performing the checkout procedure. Response: failing item: v ffc_815 815-200 explanation: power-on self-test in...

  • Page 330

    817-124 explanation: time of day ram test failed. Go to performing the checkout procedure. Response: failing item: v ffc_817 817-210 explanation: the time-of-day clock is at por. Go to performing the checkout procedure. Response: failing item: v ffc_817 817-211 explanation: time of day por test fail...

  • Page 331

    817-217 explanation: time of day clock not running. Go to performing the checkout procedure. Response: failing item: v ffc_817 887-101 explanation: pos register test failed. Go to performing the checkout procedure. Response: failing item: v ffc_887 887-102 explanation: i/o register test failed. G...

  • Page 332

    887-106 explanation: internal loopback test failed. Go to performing the checkout procedure. Response: failing item: v ffc_887 887-107 explanation: external loopback test failed. Go to performing the checkout procedure. Response: failing item: v ffc_887 887-108 explanation: external loopback test fa...

  • Page 333

    887-112 explanation: external loopback (twisted pair) test failed. Go to performing the checkout procedure. Response: failing item: v ffc_887 887-113 explanation: external loopback (twisted pair) parity test failed. Go to performing the checkout procedure. Response: failing item: v ffc_887 887-114 e...

  • Page 334

    887-117 explanation: software device configuration fails. Go to performing the checkout procedure. Response: failing item: v ffc_887 887-118 explanation: device driver indicates a hardware problem. Go to performing the checkout procedure. Response: failing item: v ffc_887 887-120 explanation: device...

  • Page 335

    887-124 explanation: software error log indicates a hardware problem. Go to performing the checkout procedure. Response: failing item: v ffc_887 887-125 explanation: fuse test failed. Go to performing the checkout procedure. Response: failing item: v ffc_887 887-202 explanation: vital product data t...

  • Page 336

    887-305 explanation: internal loopback test failed. Go to performing the checkout procedure. Response: failing item: v ffc_887 887-306 explanation: internal loopback test failed. Go to performing the checkout procedure. Response: failing item: v ffc_887 887-307 explanation: external loopback test fa...

  • Page 337

    887-402 explanation: ethernet 10 base-2 transceiver test failed. Go to performing the checkout procedure. Response: failing item: v ffc_887 887-403 explanation: ethernet 10 base-t transceiver test failed. Go to performing the checkout procedure. Response: failing item: v ffc_887 887-405 explanation:...

  • Page 338

    110-xxxx explanation: the system halted while diagnostics were executing. Note: xxxx corresponds to the last three or four digits of the srn following the dash (-). 1. If your 110 srn is not listed, substitute the last three or four digits of the srn for xxxx and go to failing function codes to iden...
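The substitution described above, taking the digits that follow the dash in an SRN and using them as the failing function code, can be sketched as below. This is an illustration only: the function names are hypothetical, and the `ffc_` label format is simply copied from the "failing item" entries in this section.

```python
from typing import Tuple


def split_srn(srn: str) -> Tuple[str, str]:
    """Split an SRN such as '110-924' into (source, suffix) around the dash."""
    prefix, sep, suffix = srn.strip().partition("-")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not an SRN of the form nnn-xxx: {srn!r}")
    return prefix.lower(), suffix.lower()


def ffc_from_srn(srn: str) -> str:
    """Build the failing-function-code label from the digits after the dash."""
    _, suffix = split_srn(srn)
    return f"ffc_{suffix}"
```

The resulting label is what would be looked up in the failing function codes list; the sketch does not verify that such an entry exists.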

  • Page 339

    252b-712 explanation: adapter failure. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. Replace the system-board response: failing item: v ffc_252b 252b-713...

  • Page 340

    252b-716 explanation: pci bus error detected by eeh. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. Replace the system-board response: failing item: v ffc...

  • Page 341

    252b-719 explanation: device bus termination power lost or not detected. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. Replace the system-board response:...

  • Page 342

    252b-723 explanation: device bus interface problem. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. Replace the system-board response: failing item: v ffc_...

  • Page 343

    254e-604 explanation: error log analysis indicates a permanent adapter failure. Go to performing the checkout procedure. Response: failing item: v ffc_254 254e-605 explanation: error log analysis indicates permanent adapter failure is reported on the other port of this adapter. Go to performing the ...

  • Page 344

    2567-xxx explanation: usb integrated system-board and chassis assembly. Go to performing the checkout procedure. Response: failing item: v ffc_2567 256d-201 explanation: adapter configuration error. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see pos...

  • Page 345

    256d-603 explanation: error log analysis indicates that the microcode could not be loaded on the adapter. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. R...

  • Page 346

    256d-701 explanation: error log analysis indicates permanent adapter failure. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. Replace the system-board resp...

  • Page 347

    25c4-602 explanation: eeprom read error. Go to performing the checkout procedure. Response: failing item: v ffc_25c4 25c4-701 explanation: permanent adapter failure. Go to performing the checkout procedure. Response: failing item: v ffc_25c4 25c4-xxx explanation: generic reference for broadcom adapt...

  • Page 348

    2604-105 explanation: internal wrap test failure for the fibre channel adapter card. Replace the 4gb fibre channel adapter card. Response: failing item: v ffc_2604 2604-106 explanation: gigabit link module (glm) wrap test failure for the fibre channel adapter card. Replace the 4gb fibre channel adap...

  • Page 349

    2604-203 explanation: pci wrap test failure for the fibre channel adapter card. Replace the 4gb fibre channel adapter card. Response: failing item: v ffc_2604 2604-204 explanation: dma test failure for the fibre channel adapter card. Go to performing the checkout procedure. Response: failing item: v...

  • Page 350

    2604-704 explanation: error log analysis indicates that an adapter error has occurred for the fibre channel adapter card. Go to performing the checkout procedure. Response: failing item: v ffc_2604 2604-705 explanation: error log analysis indicates that a parity error has been detected for the fibre...

  • Page 351

    2607-103 explanation: register test failure for the fibre channel adapter card. Replace the 8gb pcie fibre channel expansion card. Response: failing item: v ffc_2607 2607-104 explanation: sram test failure for the fibre channel adapter card. Replace the 8gb pcie fibre channel expansion card. Respons...

  • Page 352

    2607-110 explanation: enhanced error handling failure on the fibre channel adapter card. Replace the 8gb pcie fibre channel expansion card. Response: failing item: v ffc_2607 2607-201 explanation: configuration register test failure for the fibre channel adapter card. Go to performing the checkout p...

  • Page 353

    2607-701 explanation: error log analysis indicates that the adapter self-test failed for the fibre channel adapter card. Go to performing the checkout procedure. Response: failing item: v ffc_2607 2607-703 explanation: error log analysis indicates that an unknown adapter error has occurred for the f...

  • Page 354

    2607-xxx explanation: generic reference for 8gb pcie fibre channel expansion card. Go to performing the checkout procedure. Response: failing item: v ffc_2607 2624-101 explanation: configuration failure. Replace the 4x pci-e ddr infiniband host channel adapter. Response: failing item: v ffc_2624 262...

  • Page 355

    2624-301 explanation: loop back test failure. Do the following steps one at a time, in order, and rerun the test after each step: 1. Reseat the cable. 2. Replace the cable. 3. Verify that the network is functional. 4. Verify that the network switch is functional. 5. Go to performing the checkout pro...

  • Page 356

    2624-704 explanation: error log analysis indicates that this adapter has failed due to an unrecoverable internal parity error. Replace the 4x pci-e ddr infiniband host channel adapter. Response: failing item: v ffc_2624 2624-705 explanation: error log analysis indicates that this adapter has failed du...

  • Page 357

    2625-102 explanation: queue pair create failure. Replace the 2-port qdr 40 gb/s infiniband expansion card (cffh). Response: failing item: v ffc_2625 2625-103 explanation: loop back test failure. Replace the 2-port qdr 40 gb/s infiniband expansion card (cffh). Response: failing item: v ffc_2625 2625-...

  • Page 358

    2625-701 explanation: error log analysis indicates that this adapter has failed due to an internal error. Replace the 2-port qdr 40 gb/s infiniband expansion card (cffh). Response: failing item: v ffc_2625 2625-702 explanation: error log analysis indicates that this adapter has failed due to a failu...

  • Page 359

    2625-706 explanation: error log analysis indicates that this adapter has failed due to a memory error. Replace the 2-port qdr 40 gb/s infiniband expansion card (cffh). Response: failing item: v ffc_2625 2625-xxx explanation: generic reference for 2-port qdr 40 gb/s infiniband expansion card (cffh). ...

  • Page 360

    2627-201 explanation: loop back test failure. Do the following steps one at a time, in order, and rerun the test after each step: 1. Reseat the cable. 2. Replace the cable. 3. Verify that the network is functional. 4. Verify that the network switch is functional. Response: failing item: v cable v ne...

  • Page 361

    2627-703 explanation: error log analysis indicates that this adapter has failed due to a memory error. Replace the ibm flex system ib6132 qdr infiniband adapter. Response: failing item: v ffc_2627 2627-704 explanation: error log analysis indicates that this adapter has failed due to an unrecoverable ...

  • Page 362

    2640-131 explanation: smart status threshold exceeded. Go to performing the checkout procedure. Response: failing item: v ffc_2640 2640-132 explanation: command timeouts threshold exceeded. Go to performing the checkout procedure. Response: failing item: v ffc_2640 2640-133 explanation: command time...

  • Page 363

    268b-102 explanation: an unrecoverable media error occurred. Replace the 300 gb sff sas hard disk drive. Response: failing item: v ffc_268b 268b-104 explanation: the motor failed to restart. Replace the 300 gb sff sas hard disk drive. Response: failing item: v ffc_268b 268b-105 explanation: the driv...

  • Page 364

    268b-112 explanation: the diagnostic test failed. Replace the 300 gb sff sas hard disk drive. Response: failing item: v ffc_268b 268b-114 explanation: an unrecoverable hardware error. Replace the 300 gb sff sas hard disk drive. Response: failing item: v ffc_268b 268b-116 explanation: a protocol erro...

  • Page 365

    268b-120 explanation: a scsi busy or command error. Replace the 300 gb sff sas hard disk drive. Response: failing item: v ffc_268b 268b-122 explanation: a scsi reservation conflict error. Replace the 300 gb sff sas hard disk drive. Response: failing item: v ffc_268b 268b-124 explanation: a scsi chec...

  • Page 366

    268b-129 explanation: error log analysis indicates a scsi bus problem. Replace each part reported by the diagnostic program one at a time. Run the diagnostic program in problem determination mode on each part reported in the original srn. If the problem persists, replace the next part in the list. R...

  • Page 367

    268b-136 explanation: the certify operation failed. Replace the 300 gb sff sas hard disk drive. Response: failing item: v ffc_268b 268b-137 explanation: unit attention condition has occurred on the send diagnostic command. Replace the 300 gb sff sas hard disk drive. Run diagnostics again on the driv...

  • Page 368

    268b-640 explanation: error log analysis indicates a path error. Replace the 300 gb sff sas hard disk drive. Response: failing item: v ffc_268b 26b4-102 explanation: an unrecoverable media error occurred. Replace the 200 gb sata solid state drive. Response: failing item: v ffc_26b4 26b4-104 explanat...

  • Page 369

    26b4-110 explanation: the media format is corrupted. Replace the 200 gb sata solid state drive. Response: failing item: v ffc_26b4 26b4-112 explanation: the diagnostic test failed. Replace the 200 gb sata solid state drive. Response: failing item: v ffc_26b4 26b4-114 explanation: an unrecoverable ha...

  • Page 370

    26b4-118 explanation: a scsi command time-out occurred. Replace each part reported by the diagnostic program one at a time. Retry the diagnostic test after each part is replaced. If the problem persists, replace the next part in the list. Response: failing item: v ffc_26b4 v ffc_26bd v ffc_b88 26b4-...
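The "replace one part at a time, retest, then move to the next part" procedure used by several of these entries can be sketched as a simple loop. The callbacks `replace` and `diagnostic_passes` are hypothetical stand-ins for the manual replacement step and the diagnostic rerun; nothing here calls a real diagnostic program.

```python
from typing import Callable, Iterable, Optional


def isolate_failing_part(
    parts: Iterable[str],
    replace: Callable[[str], None],
    diagnostic_passes: Callable[[], bool],
) -> Optional[str]:
    """Replace parts in listed order, retesting after each replacement.

    Returns the part whose replacement made the diagnostic pass,
    or None if the problem persists after every listed part.
    """
    for part in parts:
        replace(part)              # swap exactly one part
        if diagnostic_passes():    # rerun the diagnostic test
            return part            # problem resolved; stop here
    return None
```

A `None` result corresponds to the case in which the listed failing items are exhausted and the problem remains.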

  • Page 371

    26b4-126 explanation: a software error was caused by a hardware failure. Replace each part reported by the diagnostic program one at a time. Retry the diagnostic test after each part is replaced. If the problem persists, replace the next part in the list. Response: failing item: v ffc_26b4 v ffc_26b...

  • Page 372

    26b4-132 explanation: a disk drive hardware error occurred. Replace each part reported by the diagnostic program one at a time. Retry the diagnostic test after each part is replaced. If the problem persists, replace the next part in the list. Response: failing item: v ffc_26b4 v ffc_26bd 26b4-134 ex...

  • Page 373

    26b4-137 explanation: unit attention condition has occurred on the send diagnostic command. Replace each part reported by the diagnostic program one at a time. Retry the diagnostic test after each part is replaced. If the problem persists, replace the next part in the list. Response: failing item: v...

  • Page 374

    26d2-102 explanation: an unrecoverable media error occurred. Replace the 600 gb sff sas hard disk drive. Response: failing item: v ffc_26d2 26d2-104 explanation: the motor failed to restart. Replace the 600 gb sff sas hard disk drive. Response: failing item: v ffc_26d2 26d2-105 explanation: the driv...

  • Page 375

    26d2-112 explanation: the diagnostic test failed. Replace the 600 gb sff sas hard disk drive. Response: failing item: v ffc_26d2 26d2-114 explanation: an unrecoverable hardware error. Replace the 600 gb sff sas hard disk drive. Response: failing item: v ffc_26d2 26d2-116 explanation: a protocol erro...

  • Page 376

    26d2-120 explanation: a scsi busy or command error. Replace the 600 gb sff sas hard disk drive. Response: failing item: v ffc_26d2 26d2-122 explanation: a scsi reservation conflict error. Replace the 600 gb sff sas hard disk drive. Response: failing item: v ffc_26d2 26d2-124 explanation: a scsi chec...

  • Page 377

    26d2-129 explanation: error log analysis indicates a scsi bus problem. Replace each part reported by the diagnostic program one at a time. Run the diagnostic program in problem determination mode on each part reported in the original srn. If the problem persists, replace the next part in the list. R...

  • Page 378

    26d2-136 explanation: the certify operation failed. Replace the 600 gb sff sas hard disk drive. Response: failing item: v ffc_26d2 26d2-137 explanation: unit attention condition has occurred on the send diagnostic command. Replace the 600 gb sff sas hard disk drive. Run diagnostics again on the driv...

  • Page 379

    26d2-640 explanation: error log analysis indicates a path error. Replace the 600 gb sff sas hard disk drive. Response: failing item: v ffc_26d2 26d7-102 explanation: an unrecoverable media error occurred. Replace the 900 gb sff sas hard disk drive. Response: failing item: v ffc_26d7 26d7-104 explana...

  • Page 380

    26d7-110 explanation: the media format is corrupted. Replace the 900 gb sff sas hard disk drive. Response: failing item: v ffc_26d7 26d7-112 explanation: the diagnostic test failed. Replace the 900 gb sff sas hard disk drive. Response: failing item: v ffc_26d7 26d7-114 explanation: an unrecoverable ...

  • Page 381

    26d7-118 explanation: a scsi command time-out occurred. Replace the 900 gb sff sas hard disk drive. Run diagnostics again on the drive. If the error repeats, replace the 2nd fru identified by diagnostics. Response: failing item: v ffc_26d7 v ffc_b88 26d7-120 explanation: a scsi busy or command error...

  • Page 382

    26d7-128 explanation: the error log analysis indicates a hardware failure. Replace each part reported by the diagnostic program one at a time. Run the diagnostic program in problem determination mode on each part reported in the original srn. If the problem persists, replace the next part in the lis...

  • Page 383

    26d7-135 explanation: the device failed to configure. Replace each part reported by the diagnostic program one at a time. Retry the diagnostic test after each part is replaced. If the problem persists, replace the next part in the list. Response: failing item: v ffc_26d7 v ffc_b88 v software 26d7-13...

  • Page 384

    26d7-140 explanation: error log analysis indicates poor signal quality. Replace each part reported by the diagnostic program one at a time. Run the diagnostic program in problem determination mode on each part reported in the original srn. If the problem persists, replace the next part in the list. ...

  • Page 385

    2750-601 explanation: adapter taken off-line. Replace the pcie2 16gb 2-port fc mezzanine adapter. Response: failing item: v ffc_2750 2750-602 explanation: adapter parity error. Replace the pcie2 16gb 2-port fc mezzanine adapter. Response: failing item: v ffc_2750 2750-603 explanation: permanent adapter...

  • Page 386

    2755-601 explanation: adapter taken off-line. Replace the pcie2 16gb 4-port fc mezzanine adapter. Response: failing item: v ffc_2755 2755-602 explanation: adapter parity error. Replace the pcie2 16gb 4-port fc mezzanine adapter. Response: failing item: v ffc_2755 2755-603 explanation: permanent adapter...

  • Page 387

    2770-103 explanation: loop back test failure. Replace the ibm flex system en4132 2-port 10gb roce adapter. Response: failing item: v ffc_2770 2770-201 explanation: loop back test failure. Do the following steps one at a time, in order, and rerun the test after each step: 1. Reseat the cable. 2. Repla...

  • Page 388

    2770-602 explanation: error log analysis indicates adapter configuration error. Replace the ibm flex system en4132 2-port 10gb roce adapter. Response: failing item: v ffc_2770 2770-603 explanation: error log analysis indicates adapter eeh service error. Replace the ibm flex system en4132 2-port 10gb ro...

  • Page 389

    2770-704 explanation: error log analysis indicates that this adapter has failed due to an unrecoverable internal parity error. Replace the ibm flex system en4132 2-port 10gb roce adapter. Response: failing item: v ffc_2770 2770-705 explanation: error log analysis indicates that this adapter has failed ...

  • Page 390

    2d14-710 explanation: permanent controller error. Replace the system board (sas controller). Response: failing item: v ffc_2d14 2d14-713 explanation: controller error. Replace the system board (sas controller). Response: failing item: v ffc_2d14 2d14-720 explanation: controller device bus configurati...

  • Page 391

    2d29-720 explanation: controller device bus configuration error. Use map3150 to determine resolution. Response: 2e00-201 explanation: configuration error. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace an...

  • Page 392

    2e10-201 explanation: adapter configuration error. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. Replace the system-board response: failing item: v ffc_2...

  • Page 393

    2e10-604 explanation: error log analysis indicates a permanent adapter failure. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. Replace the system-board re...

  • Page 394

    2e10-702 explanation: error log analysis indicates permanent adapter failure. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. Replace the system-board resp...

  • Page 395

    2e13-603 explanation: error log analysis indicates that the microcode could not be loaded on the adapter. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. R...

  • Page 396

    2e13-701 explanation: error log analysis indicates permanent adapter failure. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. Replace the system-board resp...

  • Page 397

    2e14-601 explanation: error log analysis indicates adapter failure. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. Replace the system-board response: fail...

  • Page 398

    2e14-605 explanation: error log analysis indicates permanent adapter failure is reported on the other port of this adapter. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagno...

  • Page 399

    2e15-201 explanation: adapter configuration error. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. Replace the system-board response: failing item: v ffc_2...

  • Page 400

    2e15-604 explanation: error log analysis indicates a permanent adapter failure. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. Replace the system-board re...

  • Page 401

    2e15-702 explanation: error log analysis indicates permanent adapter failure is reported. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. Replace the syste...

  • Page 402

    2e16-603 explanation: error log analysis indicates that the microcode could not be loaded on the adapter. Replace the ibm flex system fc3172 2-port 8gb fc adapter. Response: failing item: v ffc_2e16 2e16-604 explanation: error log analysis indicates a permanent adapter failure. Replace the ibm flex ...

  • Page 403

    2e16-702 explanation: error log analysis indicates permanent adapter failure is reported on the other port of this adapter. Replace the ibm flex system fc3172 2-port 8gb fc adapter. If the problem persists, replace the system board. Response: failing item: v ffc_2e16 v ffc_221 2e21-201 explanation: ...

  • Page 404

    2e21-603 explanation: error log analysis indicates that the microcode could not be loaded on the adapter. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. R...

  • Page 405

    2e21-701 explanation: error log analysis indicates permanent adapter failure. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. Replace the system-board resp...

  • Page 406

    2e23-103 explanation: flash test failure 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. Replace the system-board response: failing item: v ffc_2e23 2e23-1...

  • Page 407

    2e23-107 explanation: external wrap with tcp checksum test failure 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. Replace the system-board response: faili...

  • Page 408

    2e23-202 explanation: network link test failure. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. Replace the system-board response: failing item: v ffc_241...

  • Page 409

    2e23-604 explanation: error log analysis indicates transmission errors. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. Replace the system-board response: ...

  • Page 410

    2e28-604 explanation: adapter down error replace the ibm flex system cn4058 8-port 10gb converged adapter response: failing item: v ffc_2e28 2e29-201 explanation: adapter configuration error replace the ibm flex system cn4058 8-port 10gb converged adapter if the problem persists, replace the system ...

  • Page 411

    2e29-604 explanation: adapter down error replace the ibm flex system cn4058 8-port 10gb converged adapter response: failing item: v ffc_2e29 2e33-201 explanation: adapter configuration error. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progr...

  • Page 412

    2e33-701 explanation: permanent adapter failure. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. Replace the system-board response: failing item: v ffc_2e3...

  • Page 413

    2e34-701 explanation: permanent adapter failure. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. Replace the system-board response: failing item: v ffc_2e3...

  • Page 414

    2e35-701 explanation: permanent adapter failure. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. Replace the system-board response: failing item: v ffc_2e3...

  • Page 415

    2e36-701 explanation: permanent adapter failure. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. Replace the system-board response: failing item: v ffc_2e3...

  • Page 416

    2e37-141 explanation: dma write test failed. Replace the ibm flex system en4054 4-port 10gb ethernet adapter. Response: failing item: v ffc_2e37 2e37-142 explanation: dma read-write test failed. Replace the ibm flex system en4054 4-port 10gb ethernet adapter. Response: failing item: v ffc_2e37 2e37-...

  • Page 417

    2e37-154 explanation: internal loopback udp checksum offload test failed. Replace the ibm flex system en4054 4-port 10gb ethernet adapter. Response: failing item: v ffc_2e37 2e37-155 explanation: internal loopback lso test failed. Replace the ibm flex system en4054 4-port 10gb ethernet adapter. Resp...

  • Page 418

    2e37-703 explanation: failure to initialize due to a problem while reading the eeprom on the adapter. Replace the ibm flex system en4054 4-port 10gb ethernet adapter. Response: failing item: v ffc_2e37 2e3d-101 explanation: pci configuration registers test failure. Replace the ibm flex system en2024...

  • Page 419

    2e3d-111 explanation: internal loopback test failure. Replace the ibm flex system en2024 4-port 1gb ethernet adapter response: failing item: v ffc_2e3d 2e3d-201 explanation: adapter configuration error. Replace the ibm flex system en2024 4-port 1gb ethernet adapter if the problem persists, replace t...

  • Page 420

    2e52-201 explanation: adapter configuration error replace the ibm flex system cn4058 8-port 10gb converged adapter if the problem persists, replace the system board. Response: failing item: v ffc_2e52 v ffc_221 2e52-605 explanation: ethernet hardware error replace the ibm flex system cn4058 8-port 1...

  • Page 421

    2e53-605 explanation: ethernet hardware error. Replace the pcie2 10gb 8-port fcoe mezzanine adapter response: failing item: v ffc_2e53 2e53-606 explanation: ethernet configuration error replace the pcie2 10gb 8-port fcoe mezzanine adapter response: failing item: v ffc_2e53 2e53-607 explanation: ethe...

  • Page 422

    950-2d14 explanation: a resource was not detected that was previously installed. Check the management module event log for a system event. If there is no system error, replace the system board (sas controller). Response: failing item: v ffc_2d14 950-2d29 explanation: a resource was not detected ...

  • Page 423

    2506-3020 explanation: storage subsystem configuration error. Go to performing the checkout procedure. Response: 2506-3100 explanation: controller detected device bus interface error. Go to performing the checkout procedure. Response: 2506-3109 explanation: controller timed out a device command. Go ...

  • Page 424

    2506-4041 explanation: configuration error, incomplete multipath connection between enclosure and device detected. Go to performing the checkout procedure. Response: 2506-4050 explanation: attached enclosure does not support required multipath function. Go to performing the checkout procedure. Respo...

  • Page 425

    2506-4160 explanation: pci bus error detected by controller. Go to performing the checkout procedure. Response: failing item: v ffc_2506 2506-7001 explanation: temporary disk data error. Go to performing the checkout procedure. Response: failing item: v ffc_722 2506-8008 explanation: a permanent cac...

  • Page 426

    2506-9000 explanation: controller detected device error during configuration discovery. Go to performing the checkout procedure. Response: 2506-9001 explanation: controller detected device error during configuration discovery. Go to performing the checkout procedure. Response: 2506-9002 explanation:...

  • Page 427

    2506-9021 explanation: two or more disks are missing from a raid-5 or raid 6 disk array. Go to performing the checkout procedure. Response: 2506-9022 explanation: two or more disks are missing from a raid-5 or raid 6 disk array. Go to performing the checkout procedure. Response: 2506-9023 explanatio...

  • Page 428

    2506-9028 explanation: maximum number of functional disk arrays has been exceeded. Go to performing the checkout procedure. Response: 2506-9029 explanation: maximum number of functional disk array disks has been exceeded. Go to performing the checkout procedure. Response: 2506-9030 explanation: dis...

  • Page 429

    2506-9051 explanation: cache data exists for one or more missing/failed disks. Go to performing the checkout procedure. Response: 2506-9052 explanation: cache data exists for one or more modified disks. Go to performing the checkout procedure. Response: 2506-9054 explanation: raid controller resourc...

  • Page 430

    2506-9074 explanation: multiple controllers not capable of similar functions or controlling same set of devices. Go to performing the checkout procedure. Response: 2506-9075 explanation: incomplete multipath connection between controller and remote controller. Go to performing the checkout procedure...

  • Page 431

    2506-ff3d explanation: temporary controller failure. Retry the operation. Response: failing item: v ffc_2506 2506-fff3 explanation: disk media format bad. Reformat the disk and retry the operation. Response: 2506-fff4 explanation: device problem. Perform diagnostics on the device and retry the opera...

  • Page 432

    2d14-102e explanation: reallocation failed due to disk out of alternate sectors. Replace the fru identified by diagnostics. Response: failing item: v ffc_722 2d14-3002 explanation: addressed device failed to respond to selection. Replace the fru identified by diagnostics. Response: failing item: v f...

  • Page 433

    2d14-4010 explanation: configuration error, incorrect connection between cascaded enclosures. Use map3142 to determine resolution. Response: 2d14-4020 explanation: configuration error, connections exceed ioa design limits. Use map3143 to determine resolution. Response: 2d14-4030 explanation: configu...

  • Page 434

    2d14-4101 explanation: temporary device bus fabric error. Use map3152 to determine resolution. Response: 2d14-4110 explanation: unsupported enclosure function detected. Use map3145 to determine resolution. Response: 2d14-4150 explanation: pci bus error detected by controller. Replace the system boar...

  • Page 435

    2d14-8157 explanation: temporary controller failure. Replace the system board (sas controller). Response: failing item: v ffc_2d14 2d14-9000 explanation: controller detected device error during configuration discovery. Use map3190 to determine resolution. Response: 2d14-9001 explanation: controller ...

  • Page 436

    2d14-9021 explanation: two or more disks are missing from a raid-5 or raid 6 disk array. Use map3111 to determine resolution. Response: 2d14-9022 explanation: two or more disks are missing from a raid-5 or raid 6 disk array. Use map3111 to determine resolution. Response: 2d14-9023 explanation: one o...

  • Page 437

    2d14-9028 explanation: maximum number of functional disk arrays has been exceeded. Use map3190 to determine resolution. Response: 2d14-9029 explanation: maximum number of functional disk array disks has been exceeded. Use map3190 to determine resolution. Response: 2d14-9030 explanation: disk array ...

  • Page 438

    2d14-9051 explanation: cache data exists for one or more missing/failed disks. Use map3132 to determine resolution. Response: 2d14-9052 explanation: cache data exists for one or more modified disks. Use map3190 to determine resolution. Response: 2d14-9054 explanation: raid controller resources not a...

  • Page 439

    2d14-9074 explanation: multiple controllers not capable of similar functions or controlling same set of devices. Use map3141 to determine resolution. Response: 2d14-9075 explanation: incomplete multipath connection between controller and remote controller. Use map3149 to determine resolution. Respon...

  • Page 440

    2d14-ff3d explanation: temporary controller failure. Retry the operation. Replace the system board (sas controller). Response: failing item: v ffc_2d14 2d14-fff3 explanation: disk media format bad. Reformat the disk and retry the operation. Use map3135 to determine resolution. Response: 2d14-fff4 ex...

  • Page 441

    2d29-102e explanation: reallocation failed due to disk out of alternate sectors. Replace the fru identified by diagnostics. Response: failing item: v ffc_722 2d29-3002 explanation: addressed device failed to respond to selection. Replace the fru identified by diagnostics. Response: failing item: v f...

  • Page 442

    2d29-4010 explanation: configuration error, incorrect connection between cascaded enclosures. Use map3142 to determine resolution. Response: 2d29-4020 explanation: configuration error, connections exceed ioa design limits. Use map3143 to determine resolution. Response: 2d29-4030 explanation: configu...

  • Page 443

    2d29-4101 explanation: temporary device bus fabric error. Use map3152 to determine resolution. Response: 2d29-4110 explanation: unsupported enclosure function detected. Use map3145 to determine resolution. Response: 2d29-4150 explanation: pci bus error detected by controller. Replace the pcie x8 int...

  • Page 444

    2d29-8157 explanation: temporary controller failure. Replace the pcie x8 internal 3gb sas adapter. Response: failing item: v ffc_2d29 2d29-9000 explanation: controller detected device error during configuration discovery. Use map3190 to determine resolution. Response: 2d29-9001 explanation: controll...

  • Page 445

    2d29-9021 explanation: two or more disks are missing from a raid-5 or raid 6 disk array. Use map3111 to determine resolution. Response: 2d29-9022 explanation: two or more disks are missing from a raid-5 or raid 6 disk array. Use map3111 to determine resolution. Response: 2d29-9023 explanation: one o...

  • Page 446

    2d29-9028 explanation: maximum number of functional disk arrays has been exceeded. Use map3190 to determine resolution. Response: 2d29-9029 explanation: maximum number of functional disk array disks has been exceeded. Use map3190 to determine resolution. Response: 2d29-9030 explanation: disk array ...

  • Page 447

    2d29-9051 explanation: cache data exists for one or more missing/failed disks. Use map3132 to determine resolution. Response: 2d29-9052 explanation: cache data exists for one or more modified disks. Use map3190 to determine resolution. Response: 2d29-9054 explanation: raid controller resources not a...

  • Page 448

    2d29-9074 explanation: multiple controllers not capable of similar functions or controlling same set of devices. Use map3141 to determine resolution. Response: 2d29-9075 explanation: incomplete multipath connection between controller and remote controller. Use map3149 to determine resolution. Respon...

  • Page 449

    2d29-ff3d explanation: temporary controller failure. Retry the operation. Replace the pcie x8 internal 3gb sas adapter. Response: failing item: v ffc_2d29 2d29-fff3 explanation: disk media format bad. Reformat the disk and retry the operation. Use map3135 to determine resolution. Response: 2d29-fff4...

  • Page 450

    A00-ff0 through a24-xxx srns: the aix operating system might generate service request numbers (srns) from a00-ff0 to a24-xxx. Note: some srns in this sequence might have 4 rather than 3 digits after the dash (–). Table 15 shows the meaning of an x in any of the following srns, such as a01-00x. Table...
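
The SRN shape described here (a prefix, a dash, and three or four characters after the dash) can be checked mechanically before a code is looked up. The following is a minimal sketch, not part of the guide: the pattern and the `parse_srn` helper are assumptions inferred from the codes listed in this section, and the letter x is accepted only as the placeholder digit that Table 15 describes.

```python
import re

# Assumed shape, inferred from the codes listed in this guide: a 3- or
# 4-character hexadecimal prefix, a dash, and a 3- or 4-character suffix.
# The letter "x" is accepted in the suffix as a placeholder digit (a01-00x).
SRN_PATTERN = re.compile(r"^([0-9a-f]{3,4})-([0-9a-fx]{3,4})$")


def parse_srn(srn):
    """Split an SRN such as '2e15-201' into (prefix, suffix), lowercased."""
    match = SRN_PATTERN.match(srn.strip().lower())
    if match is None:
        raise ValueError("not a recognized SRN shape: %r" % srn)
    return match.group(1), match.group(2)


if __name__ == "__main__":
    # Example codes taken from this section of the guide.
    for code in ("A01-00x", "2e15-201", "2506-3020", "950-2d14"):
        print(code, "->", parse_srn(code))
```

Template entries such as ssss-102 are deliberately rejected by this sketch, because ssss stands for a device-specific prefix rather than a literal code.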

  • Page 451

    Ssss-102 explanation: an unrecoverable media error occurred. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. Replace the system board and chassis assembl...

  • Page 452

    Ssss-108 explanation: the bus test failed. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. Replace the system board and chassis assembly, as described in r...

  • Page 453

    Ssss-116 explanation: a protocol error. 1. Make sure that the device, adapter and diagnostic firmware, and the application software levels are compatible. 2. If you do not find a problem, call your operating-system support person. Response: failing item: v ffc_ssss ssss-117 explanation: a write-prot...

  • Page 454

    Ssss-122 explanation: a scsd reservation conflict error. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. Replace the system board and chassis assembly, as ...

  • Page 455

    Ssss-129 explanation: error log analysis indicates a scsd bus problem. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. Replace the system board and chassis...

  • Page 456

    Ssss-135 explanation: the device failed to configure. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see post progress codes (checkpoints). 2. Replace any parts reported by the diagnostic program. 3. Replace the system board and chassis assembly, as des...

  • Page 457

    Ssss-138 explanation: error log analysis indicates that the disk drive is operating at a higher than recommended temperature. 1. Make sure that: v the ventilation holes in the blade server bezel are not blocked. V the management-module event log is not reporting any system environmental warnings. 2....

  • Page 458

    Failing function codes: failing function codes (ffcs) identify a function within the system unit that is failing. Table 16 describes the component that each function code identifies. Note: when replacing a component, perform system verification for the component. See “using the diagnostics program” ...

  • Page 459

    Table 16. Failing function codes (continued) ffc description and notes 2d6 system-board and chassis assembly (i2c secondary) 2d7 system-board and chassis assembly (vpd module) 2e8 system-board and chassis assembly (cache controller) 308 system-board and chassis assembly (i/o bridge problem) 650 unkn...
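
A failing item such as ffc_2d7 in an SRN response is resolved by looking up the code in Table 16. The sketch below illustrates that lookup using only the four entries fully visible in this excerpt; the dictionary and the `describe_failing_item` helper are hypothetical names, and the table in the guide remains the authoritative source.

```python
# Only the Table 16 entries that appear in full in this excerpt.
FFC_DESCRIPTIONS = {
    "2d6": "system-board and chassis assembly (I2C secondary)",
    "2d7": "system-board and chassis assembly (VPD module)",
    "2e8": "system-board and chassis assembly (cache controller)",
    "308": "system-board and chassis assembly (I/O bridge problem)",
}


def describe_failing_item(failing_item):
    """Return the Table 16 description for an item such as 'ffc_2d7'."""
    code = failing_item.strip().lower()
    if code.startswith("ffc_"):
        code = code[len("ffc_"):]
    # Codes outside this excerpt are reported as unknown rather than guessed.
    return FFC_DESCRIPTIONS.get(code, "not in this excerpt; see Table 16")


if __name__ == "__main__":
    print(describe_failing_item("ffc_2d7"))
    print(describe_failing_item("ffc_722"))
```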

  • Page 460

    Controller maintenance analysis procedures: these procedures are intended to resolve adapter, cache, or disk array problems associated with a controller. See “service request numbers (srns)” on page 301 to identify which map to use. Map 3110: perform map 3110 in the systems hardware information cent...

  • Page 461

    Perform map 3132 in the systems hardware information center. Map 3133: perform map 3133 in the systems hardware information center. Map 3134: perform map 3134 in the systems hardware information center. Map 3135: perform map 3135 in the systems hardware information center. Map 3140: perform map 3140...

  • Page 462

    Perform map 3146 in the systems hardware information center. Map 3147: perform map 3147 in the systems hardware information center. Map 3148: perform map 3148 in the systems hardware information center. Map 3149: perform map 3149 in the systems hardware information center. Map 3150: perform map 3150...
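
The SRN entries for the 2d14 and 2d29 controllers name the MAP to use for each code. As an illustration only, the pairs that appear in full in this excerpt can be collected into a lookup; the `map_for_srn` helper is a hypothetical name, the table is not exhaustive, and 2506 codes are excluded because their entries point to the checkout procedure instead.

```python
# SRN suffix -> MAP, copied from the 2d14-xxxx and 2d29-xxxx entries
# that are fully visible in this excerpt. Not exhaustive.
MAP_BY_SUFFIX = {
    "4010": "MAP 3142",
    "4020": "MAP 3143",
    "4101": "MAP 3152",
    "4110": "MAP 3145",
    "9000": "MAP 3190",
    "9021": "MAP 3111",
    "9022": "MAP 3111",
    "9028": "MAP 3190",
    "9029": "MAP 3190",
    "9051": "MAP 3132",
    "9052": "MAP 3190",
    "9074": "MAP 3141",
    "9075": "MAP 3149",
    "fff3": "MAP 3135",
}

# Prefixes whose entries in this guide refer to a MAP.
MAP_PREFIXES = {"2d14", "2d29"}


def map_for_srn(srn):
    """Return the MAP named for an SRN, or None if this excerpt lists none."""
    prefix, _, suffix = srn.strip().lower().partition("-")
    if prefix not in MAP_PREFIXES:
        return None
    return MAP_BY_SUFFIX.get(suffix)


if __name__ == "__main__":
    for code in ("2d14-4010", "2d29-fff3", "2506-9021"):
        print(code, "->", map_for_srn(code))
```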

  • Page 463

    Checkout procedure: the checkout procedure is the sequence of tasks that you follow to diagnose a problem in the compute node. About the checkout procedure: review this information before performing the checkout procedure. Read the safety topic and the “installation guidelines” on page 35. The fir...

  • Page 464

    5. Perform the following steps: note: when possible, run aix online diagnostics in concurrent mode. Aix online diagnostics perform more functions than the stand-alone diagnostics. A. Perform the aix online diagnostics, see “starting aix concurrent diagnostics” on page 453. Record any diagnostic resu...

  • Page 465

    Verifying the partition configuration: perform this procedure if there is a configuration problem with the system or a logical partition. Procedure: 1. Check the processor and memory allocations of the system or the partition. Processor or memory resources that fail during system startup could cause t...

  • Page 466

    2. Using the management module web interface, verify the following items: v the compute node firmware is at the latest version. V sol is enabled for the compute node. V the cd or dvd drive or usb flash drive is selected as the first boot device for the compute node. 3. Insert diagnostic media. V if ...

  • Page 467

    2. If the system is running in a full-machine partition, turn on the compute node and establish a serial over lan (sol) session. For information about starting an sol session, see the ibm chassis management module command-line interface reference guide. 3. Perform the following steps to check the ni...

  • Page 468

    A. Select diagnostic routines and press enter. B. From the diagnostic mode selection menu, select system verification. C. Select the resource to be tested, and press f7=commit. D. Record any results provided and go to “service request numbers (srns)” on page 301 to identify the failure and perform t...

  • Page 469

    2. If you are attempting to boot from the network: a. Make sure that the network cabling to the ibm flex system enterprise chassis network switch is correct. B. Check with the network administrator to make sure that the network is up. C. Verify that the compute node for your system is running and co...

  • Page 470

    Follow the suggested actions in the order in which they are listed in the action column until the problem is solved. See chapter 7, “parts listing for ibm flex system p260 and p460 compute nodes,” on page 97 to determine which components are crus and which components are frus. If an action ste...

  • Page 471

    Connectivity problems: identify connectivity problem symptoms and the corrective actions to take. Follow the suggested actions in the order in which they are listed in the action column until the problem is solved. See chapter 7, “parts listing for ibm flex system p260 and p460 compute nodes,” on...

  • Page 474

    Pci expansion card (piocard) problem isolation procedure: the hardware that controls pci adapters and pci card slots detected an error. The direct select address (dsa) portion of the system reference code (src) identifies the location code of the failing component. The following table shows the synta...

  • Page 475

    Table 19. Pci expansion card problem isolation procedure. Follow the suggested actions in the order in which they are listed in the action column until the problem is solved. See chapter 7, “parts listing for ibm flex system p260 and p460 compute nodes,” on page 97 to determine which components a...

  • Page 476

    Table 20. Hypervisor isolation procedures and symbolic failing items (continued) v follow the suggested actions in the order in which they are listed in the action column until the problem is solved. V see chapter 7, “parts listing for ibm flex system p260 and p460 compute nodes,” on page 97 to dete...

  • Page 479

    Service processor problems: the service processor provides error diagnostics with associated error codes, isolation procedures, and symbolic failing items for troubleshooting. Note: resetting the service processor causes a power7 reset and reload, which generates a dump. The dump is recorded in the m...

  • Page 497

    Software problems: use this information to recognize software problem symptoms and to take corrective actions. Follow the suggested actions in the order in which they are listed in the action column until the problem is solved. See chapter 7, “parts listing for ibm flex system p260 and p460 compu...

  • Page 498

    Leds are available for the following components: battery, sas disk drive, ssd, management card, memory modules (dimms), network adapter, and system-board and chassis assembly. Viewing the light path diagnostic leds: after reading the required safety information, look at the control panel to determ...

  • Page 499

    The following figure shows leds on the system board of the ibm flex system p460 compute node. The following table identifies the light path diagnostic leds. Table 21. Ibm flex system p260 compute node and ibm flex system p460 compute node leds callout unit leds ▌1▐ 3 v lithium battery led ▌2▐ drv2 l...

  • Page 500

    Table 21. Ibm flex system p260 compute node and ibm flex system p460 compute node leds (continued) callout unit leds ▌3▐ drv1 led (hdd or ssd) ▌4▐ drive board led (solid-state drive interposer, which is integrated in the cover) ▌5▐ management card led ▌6▐ system board led ▌7▐ light path power led ▌8...

  • Page 501

    Table 22. Light path diagnostic led information for the ibm flex system p260 compute node (continued) v follow the suggested actions in the order in which they are listed in the action column until the problem is solved. V see chapter 7, “parts listing for ibm flex system p260 and p460 compute nodes...

  • Page 502

    Table 23 describes the leds on the system board and suggested actions for correcting any detected problems. Table 23. Light path diagnostic led information for the ibm flex system p460 compute node v follow the suggested actions in the order in which they are listed in the action column until the pr...

  • Page 503

    Table 23. Light path diagnostic led information for the ibm flex system p460 compute node (continued) v follow the suggested actions in the order in which they are listed in the action column until the problem is solved. V see chapter 7, “parts listing for ibm flex system p260 and p460 compute nodes...

  • Page 504

    Isolating firmware problems: you can use this procedure to isolate firmware problems. About this task: to isolate a firmware problem, follow the procedure until the problem is solved. Procedure: 1. If the compute node is operating, shut down the operating system. Use the cmm to perform a virtual reseat...

  • Page 506

    3. To restore the new fcsx device to the correct vfchost, enter the vfcmap command as follows: vfcmap -vadapter vfchost0 -fcp fcs0 4. To check the vfchost mapping, enter the lsmap command as follows: lsmap -vadapter vfchost2 -npiv the output might look like the following example: name physloc clntid...
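
When steps 3 and 4 are repeated for several adapters, the two command lines can be assembled programmatically so the adapter names are substituted consistently. This is a sketch under stated assumptions: the helper names are hypothetical, the functions only build the command strings shown in the steps above (they do not run anything), and the adapter and port names are site-specific examples.

```python
import shlex


def vfcmap_command(vadapter, fcp):
    """Build the vfcmap command line from step 3 for the given names."""
    return shlex.join(["vfcmap", "-vadapter", vadapter, "-fcp", fcp])


def lsmap_npiv_command(vadapter):
    """Build the lsmap command line from step 4 for the given adapter."""
    return shlex.join(["lsmap", "-vadapter", vadapter, "-npiv"])


if __name__ == "__main__":
    # Names copied from the steps above; substitute the ones on your system.
    print(vfcmap_command("vfchost0", "fcs0"))
    print(lsmap_npiv_command("vfchost2"))
```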

  • Page 507

    About this task: To start the TEMP image, see http://publib.boulder.ibm.com/infocenter/flexsys/information/topic/com.ibm.acc.psm.hosts.doc/dpsm_managing_hosts_power_firmware.html. Recovering the TEMP image from the PERM image. To recover the TEMP image from the PERM image, you must perform the reject...

  • Page 508

    Committing the TEMP system firmware image. After updating the system firmware and successfully starting the compute node from the temporary (TEMP) image, copy the TEMP image to the permanent (PERM) image by using the diagnostics program commit function. About this task: Note: If you install the comput...

  • Page 509

    2. Verify that power management is set correctly for your IBM Flex System Enterprise Chassis configuration. 3. Verify whether the problem is being experienced on more than one compute node. 4. Perform a test of the failing function on a compute node that is known to be operational. 5. Try the comput...

  • Page 510

    Results: If these steps do not resolve the problem, it is probably a problem with the compute node. See “Connectivity problems” on page 459 for more information. Solving shared power problems. Problems with shared resources might appear to be in the compute node, but might actually be a problem in a i...

  • Page 511

    1. Use the management console to determine if the compute node is recognized. 2. Use the management console to ensure that power-on permission is not denied due to power policy settings. 3. Turn off the compute node. 4. Remove the compute node from the IBM Flex System Enterprise Chassis, and remove ...

  • Page 513: Appendix. Notices

    Appendix. Notices. This information was developed for products and services offered in the U.S.A. The manufacturer may not offer the products, services, or features discussed in this document in other countries. Consult the manufacturer's representative for information on the products and services cu...

  • Page 514

    This information is for planning purposes only. The information herein is subject to change before the products described become available. This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the...

  • Page 515

    Electronic emission notices. When attaching a monitor to the equipment, you must use the designated monitor cable and any interference suppression devices supplied with the monitor. Class A notices. The following Class A statements apply to the IBM servers that contain the POWER7 processor and its fea...

  • Page 516

    Warning: This is a Class A product. In a domestic environment, this product may cause radio interference, in which case the user may be required to take adequate measures. VCCI statement - Japan. The following is a summary of the VCCI Japanese statement in the box above: This is a Class A product bas...

  • Page 517

    Electromagnetic Interference (EMI) statement - Taiwan. The following is a summary of the EMI Taiwan statement above. Warning: This is a Class A product. In a domestic environment this product may cause radio interference in which case the user will be required to take adequate measures. IBM Taiwan co...

  • Page 518

    Germany compliance statement. Deutschsprachiger EU Hinweis: Hinweis für Geräte der Klasse A EU-Richtlinie zur Elektromagnetischen Verträglichkeit. Dieses Produkt entspricht den Schutzanforderungen der EU-Richtlinie 2004/108/EG zur Angleichung der Rechtsvorschriften über die elektromagnetische Verträgl...

  • Page 519

    Electromagnetic Interference (EMI) statement - Russia. Class B notices. The following Class B statements apply to features designated as electromagnetic compatibility (EMC) Class B in the feature installation information. Federal Communications Commission (FCC) statement. This equipment has been tested...

  • Page 520

    Industry Canada compliance statement. This Class B digital apparatus complies with Canadian ICES-003. Avis de conformité à la réglementation d'Industrie Canada. Cet appareil numérique de la classe B est conforme à la norme NMB-003 du Canada. European Community compliance statement. This product is in c...

  • Page 521

    Japanese Electronics and Information Technology Industries Association (JEITA) confirmed harmonics guideline with modifications (products greater than 20 A per phase). IBM Taiwan contact information. Electromagnetic Interference (EMI) statement - Korea. Germany compliance statement. Deutschsprachiger EU...

  • Page 522

    Dieses Gerät ist berechtigt, in Übereinstimmung mit dem Deutschen EMVG das EG-Konformitätszeichen - CE - zu führen. Verantwortlich für die Einhaltung der EMV Vorschriften ist der Hersteller: International Business Machines Corp. New Orchard Road Armonk, New York 10504 Tel: 914-499-1900 Der verantwor...

  • Page 524

    IBM® Printed in USA.