IBM PS700 Problem Determination And Service Manual

Summary of PS700

  • Page 1

    Power Systems: Problem Determination and Service Guide for the IBM Power PS700 (8406-70Y), GI11-9831-00

  • Page 4

    Note: Before using this information and the product it supports, read the information in “Notices” on page 271, “Safety notices” on page v, the IBM Systems Safety Notices manual, G229-9054, and the IBM Environmental Notices and User Guide, Z125–5823. This edition applies to IBM Power Systems server...

  • Page 5

    Contents: Safety notices (v); Chapter 1. Introduction (1); Related documentation (1); Notices and statements (2); Features and specifications (2); Supported DIMMs (4); Blade server control pan...

  • Page 6

    Returning a device or component (234); Removing the blade server from a BladeCenter unit (235); Installing the blade server in a BladeCenter unit (236); Removing and replacing Tier 1 CRUs (237); Removing the blade server cover (237); Installing and cl...

  • Page 7

    Safety notices. Safety notices may be printed throughout this guide: Danger notices call attention to a situation that is potentially lethal or extremely hazardous to people. Caution notices call attention to a situation that is potentially hazardous to people because of some existing condition. ...

  • Page 8

    Danger: When working on or around the system, observe the following precautions: Electrical voltage and current from power, telephone, and communication cables are hazardous. To avoid a shock hazard: Connect power to this unit only with the IBM provided power cord. Do not use the IBM provided power...

  • Page 9

    Observe the following precautions when working on or around your IT rack system: Heavy equipment: personal injury or equipment damage might result if mishandled. Always lower the leveling pads on the rack cabinet. Always install stabilizer brackets on the rack cabinet. To avoid hazardous cond...

  • Page 10

    Caution: Removing components from the upper positions in the rack cabinet improves rack stability during relocation. Follow these general guidelines whenever you relocate a populated rack cabinet within a room or building: Reduce the weight of the rack cabinet by removing equipment starting at the...

  • Page 11

    (L003) or All lasers are certified in the U.S. to conform to the requirements of DHHS 21 CFR Subchapter J for Class 1 laser products. Outside the U.S., they are certified to be in compliance with IEC 60825 as a Class 1 laser product. Consult the label on each part for laser certification numbers and...

  • Page 12

    Caution: Data processing environments can contain equipment transmitting on system links with laser modules that operate at greater than Class 1 power levels. For this reason, never look into the end of an optical fiber cable or open receptacle. (C027) Caution: This product contains a Class 1M laser...

  • Page 13

    Chapter 1. Introduction. This problem determination and service information helps you solve problems that might occur in your PS700 blade server. The information describes the diagnostic tools that come with the blade server, error codes and suggested actions, and instructions for replacing failing c...

  • Page 14

    Additional documents might be included in the online information center and on the IBM BladeCenter Documentation CD. The blade server might have features that are not described in the documentation that comes with the blade server. Occasional updates to the documentation might include information ab...

  • Page 15

    Core electronics: 64-bit POWER7 processors (12S technology); four-core, single-socket (4-way) processors at 3.0 GHz; 64 GB maximum in 8 very low profile (VLP) DIMM slots; supports 4 GB DDR3 at 1066 MHz and 8 GB DDR3 at 800 MHz; P5IOC2 I/O hub. On-board, integrated features: two 1 Gb Ethernet port...

  • Page 16

    Supported DIMMs. Each planar in the PS700 blade server contains eight very low profile (VLP) memory connectors for registered dual inline memory modules (RDIMMs). The maximum size for a single DIMM is 8 GB. The total memory capacity for the PS700 ranges from a minimum of 4 GB to a maximum of 64 GB. See c...
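The memory limits stated above (eight VLP connectors, at most 8 GB per DIMM, 4 GB to 64 GB in total) can be expressed as a quick sanity check. The following is only a sketch: the function name and constants are hypothetical, and it does not model any DIMM population-order or pairing rules, which are not shown in this excerpt.

```python
# Hypothetical helper; the limits are taken only from the text above.
SLOTS = 8              # VLP DIMM connectors per planar
MAX_DIMM_GB = 8        # largest supported single DIMM
MIN_TOTAL_GB = 4       # stated minimum total capacity
MAX_TOTAL_GB = 64      # stated maximum total capacity (8 slots x 8 GB)


def check_memory_config(dimm_sizes_gb):
    """Return a list of problems; an empty list means the stated limits are met."""
    problems = []
    if len(dimm_sizes_gb) > SLOTS:
        problems.append(f"{len(dimm_sizes_gb)} DIMMs exceed the {SLOTS} available slots")
    for size in dimm_sizes_gb:
        if size > MAX_DIMM_GB:
            problems.append(f"{size} GB DIMM exceeds the {MAX_DIMM_GB} GB per-DIMM maximum")
    total = sum(dimm_sizes_gb)
    if not MIN_TOTAL_GB <= total <= MAX_TOTAL_GB:
        problems.append(f"total of {total} GB is outside {MIN_TOTAL_GB}-{MAX_TOTAL_GB} GB")
    return problems
```

For example, `check_memory_config([8] * 8)` returns an empty list, because eight 8 GB DIMMs reach exactly the 64 GB maximum.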

  • Page 17

    Blade server control panel buttons and LEDs. Blade server control panel buttons and LEDs provide operational controls and status indicators. Note: Figure 2 shows the control-panel door in the closed (normal) position. To access the power-control button, you must open the control-panel door. 1 media-t...

  • Page 18

    2 Information LED: When this amber LED is lit, it indicates that information about a system error for the blade server has been placed in the management-module event log. The information LED can be turned off through the Web interface of the management module or through IBM Director Console. 3 blade...

  • Page 19

    You can start the blade server in any of the following ways: Start the blade server by pressing the power-control button on the front of the blade server. The power-control button is behind the control panel door, as described in “Blade server control panel buttons and LEDs” on page 5. After you p...

  • Page 20

    System-board layouts. Illustrations show the connectors and LEDs on the system board. The illustrations might differ slightly from your hardware. System-board connectors. Blade server components attach to the connectors on the system board. Figure 3 shows the connectors on the base unit system board i...

  • Page 21

    System-board LEDs. Use the illustration of the LEDs on the system board to identify a light emitting diode (LED). Remove the blade server from the BladeCenter unit, open the cover, press the blue button to see any error LEDs that were turned on during error processing, and use Figure 5 to identify th...

  • Page 22

    Table 3. PS700 LEDs (continued). Callout and base unit LEDs: 8, CIOv (1Xe) expansion card connector LED; 9, high-speed (CFFh) expansion card connector LED; 10, HDD2 LED; 11, DIMM 5-8 LEDs.

  • Page 23

    Chapter 2. Diagnostics. Use the available diagnostic tools to help solve any problems that might occur in the blade server. The first and most crucial component of a solid serviceability strategy is the ability to accurately and effectively detect errors when they occur. While not all errors are a th...

  • Page 24

    Power-on self-test (POST) progress codes (checkpoints), error codes, and isolation procedures: The POST checks out the hardware at system initialization. IPL diagnostic functions test some system components and interconnections. The POST generates eight-digit checkpoints to mark the progress of pow...

  • Page 25

    Linux on Power service and productivity tools include hardware diagnostic aids and productivity tools, and installation aids. The installation aids are provided in the IBM Installation Toolkit for Linux on Power, a set of tools that aids the installation of Linux on IBM servers with Power architectu...

  • Page 26

    Location codes. Location codes identify components of the blade server. Location codes are displayed with some error codes to identify the blade server component that is causing the error. See “System-board connectors” on page 8 for component locations. Notes: 1. Location codes do not indicate the lo...

  • Page 27

    Table 4. Location codes (continued). Columns: Components, Physical location code, CRU LED. Row: Firmware version, Um-Y1. Reference codes. Reference codes are diagnostic aids that help you determine the source of a hardware or operating system problem. To use reference codes effectively, use them in conjunction with othe...

  • Page 28

    System reference codes (SRCs). System reference codes indicate a server hardware or software problem that can originate in hardware, in firmware, or in the operating system. A blade server component generates an error code when it detects a problem. An SRC identifies the component that generated the ...

  • Page 29

    SRC formats. SRCs are strings of either six or eight alphanumeric characters. The first two characters designate the reference code type. The first character indicates the type of error. In a few cases, the first two characters indicate the type of error: 1xxxxxxx - System power control network (sp...
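The format rule above (six or eight alphanumeric characters, with the leading characters identifying the type) can be sketched as a small classifier. The prefix list below is an assumption assembled only from the SRC family names that appear in the table titles of this guide; the function name is hypothetical, and the list is not a complete or authoritative catalog of SRC types.

```python
import re

# (prefix, family) pairs; names follow the table titles in this guide.
# Longer, more specific prefixes come first so they win over shorter ones.
SRC_FAMILIES = [
    ("A700", "licensed internal code SRC (A700yyyy)"),
    ("B200", "logical partition SRC (B200xxxx)"),
    ("B700", "licensed internal code SRC (B700xxxx)"),
    ("A1", "service processor SRC (A1xxyyyy)"),
    ("AA", "partition firmware attention code"),
    ("BA", "partition firmware SRC"),
    ("C1", "service processor checkpoint"),
    ("C2", "checkpoint (C2xxxxxx)"),
    ("1", "system power control network SRC (1xxxyyyy)"),
    ("6", "6xxxyyyy SRC"),
    ("B", "service processor early termination SRC (Bxxxxxxx)"),
]

# SRCs are strings of either six or eight alphanumeric characters.
_SRC_PATTERN = re.compile(r"[0-9A-Z]{6}|[0-9A-Z]{8}")


def classify_src(src: str) -> str:
    """Return the family name for an SRC, or "unknown" if no prefix matches."""
    code = src.strip().upper()
    if not _SRC_PATTERN.fullmatch(code):
        raise ValueError(f"not a 6- or 8-character alphanumeric SRC: {src!r}")
    for prefix, family in SRC_FAMILIES:
        if code.startswith(prefix):
            return family
    return "unknown"
```

Matching the longest prefix first matters because several families share a first character: B200xxxx, B700xxxx, and BA codes would otherwise all fall into the generic Bxxxxxxx family.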

  • Page 30

    Table 7. 1xxxyyyy SRCs (continued). Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See Chapter 3, “Parts listing, Type 8406,” on page 229 to d...

  • Page 34

    Look for the rightmost 4 characters (yyyy in 6xxxyyyy) in the error code; this is the reference code. Find the reference code in Table 8. Table 8. 6xxxyyyy SRCs. Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solv...
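The lookup step described above, taking the rightmost four characters of a 6xxxyyyy code as the reference code, can be sketched as follows. The function name is hypothetical and the sample value in the usage note is made up for illustration; it is not an entry from Table 8.

```python
def reference_code(src: str) -> str:
    """Return the rightmost four characters (yyyy) of a 6xxxyyyy SRC."""
    # Tolerate the spaced form "6xxx yyyy" and mixed case.
    code = src.replace(" ", "").upper()
    if len(code) != 8 or not code.isalnum():
        raise ValueError(f"expected an 8-character alphanumeric SRC, got {src!r}")
    if not code.startswith("6"):
        raise ValueError(f"not a 6xxxyyyy SRC: {src!r}")
    return code[-4:]
```

With a made-up input, `reference_code("6abc 12ef")` returns `"12EF"`, which would then be the key to look up in the table.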

  • Page 35

    Table 8. 6xxxyyyy SRCs (continued). Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See Chapter 3, “Parts listing, Type 8406,” on page 229 to d...

  • Page 36

    A1xxyyyy service processor SRCs. An A1xxyyyy system reference code (SRC) is an attention code that offers information about a platform or service processor dump or confirms a control panel function request. Take the steps in the Action column only if the bladesystem appears to hang on an attention co...

  • Page 37

    Table 11. A700yyyy Licensed Internal Code SRCs (continued). Columns: Reference code, Description, Action. A7004721: The World Wide Port Name (WWPN) prefix is not valid. Action: https://www-912.ibm.com/supporthome.nsf/document/51455410. A7004730: Informational system log entry only. Action: No corrective action is required. A70047...

  • Page 38

    Table 12. AA00E1A8 to AA260005 partition firmware attention codes (continued). Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See Chapter 3, “...

  • Page 40

    Bxxxxxxx service processor early termination SRCs. A Bxxxxxxx system reference code (SRC) is an error code that is related to an event or exception that occurred in the service processor firmware. To find a description of an SRC that is not listed in this PS700 blade server documentation, refer to the...

  • Page 41

    B200xxxx logical partition SRCs. A B200xxxx SRC is a logical partition reference code that is related to logical partitioning. Table 14 describes system reference codes that might be displayed if system firmware detects a problem. Suggested actions to correct the problem are also listed. Note: For pr...

  • Page 42

    Table 14. B200xxxx logical partition SRCs (continued). Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See Chapter 3, “Parts listing, Type 8406...

  • Page 52

    Table 15. B700xxxx Licensed Internal Code SRCs (continued). Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See Chapter 3, “Parts listing, Type...

  • Page 61

    Table 16. BA000010 to BA400002 partition firmware SRCs (continued). Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See Chapter 3, “Parts listi...

  • Page 96

    POST progress codes (checkpoints). When you turn on the blade server, the power-on self-test (POST) performs a series of tests to check the operation of the blade server components. Use the management module to view progress codes that offer information about the stages involved in powering on and pe...

  • Page 97

    C1001F00 to C1645300 service processor checkpoints. The C1xx progress codes, or checkpoints, offer information about the initialization of both the service processor and the server. Service processor checkpoints are typical reference codes that occur during the initial program load (IPL) of the serve...
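To decide which checkpoint table a progress code belongs to, one option is to compare it against the code ranges named in this guide's checkpoint table titles (C1001F00 to C1645300 and C2001000 to C20082FF). The sketch below assumes the endpoints can be treated as inclusive hexadecimal bounds; that is an illustrative assumption, and a code inside a range is not guaranteed to have an entry in the corresponding table. The function and dictionary names are hypothetical.

```python
# Ranges copied from the checkpoint table titles; treated as inclusive
# hexadecimal bounds (an assumption made for this sketch).
CHECKPOINT_RANGES = {
    "C1001F00 to C1645300 service processor checkpoints": (0xC1001F00, 0xC1645300),
    "C2001000 to C20082FF checkpoints": (0xC2001000, 0xC20082FF),
}


def checkpoint_table(code: str):
    """Return the matching table title for a progress code, or None."""
    value = int(code.strip(), 16)  # raises ValueError for non-hex input
    for title, (low, high) in CHECKPOINT_RANGES.items():
        if low <= value <= high:
            return title
    return None
```

A code outside both ranges, such as one in a different Cxxx family, yields `None` rather than a guess.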

  • Page 98

    Table 18. C1001F00 to C1645300 checkpoints (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See ...

  • Page 106

    Table 19. C2001000 to C20082FF checkpoints (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See ...

  • Page 107

    Table 19. C2001000 to C20082FF checkpoints (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See ...

  • Page 108

    Table 19. C2001000 to C20082FF checkpoints (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See ...

  • Page 109

    Table 19. C2001000 to C20082FF checkpoints (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See ...

  • Page 110

    Table 19. C2001000 to C20082FF checkpoints (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See ...

  • Page 111

    Table 19. C2001000 to C20082FF checkpoints (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See ...

  • Page 112

    Table 19. C2001000 to C20082FF checkpoints (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See ...

  • Page 113

    Table 19. C2001000 to C20082FF checkpoints (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See ...

  • Page 114

    IPL status progress codes: A server that stalls during an initial program load (IPL) of the operating system indicates a problem with the operating system code or hardware configuration. The Systems Hardware Information Center at http://publib.boulder.ibm.com/infocenter/powersys/v3r1m5/index.jsp des...

  • Page 115

    Table 21 lists the progress codes that might be displayed during the power-on self-test (POST), along with suggested actions to take if the system hangs on the progress code. Take the actions described for a progress code only when you experience a hang condition. In the following ...

  • Page 116

    Table 21. CA000000 to CA2799FF checkpoints (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See ...

  • Page 117

    Table 21. CA000000 to CA2799FF checkpoints (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See ...

  • Page 118

    Table 21. CA000000 to CA2799FF checkpoints (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See ...

  • Page 119

    Table 21. CA000000 to CA2799FF checkpoints (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See ...

  • Page 120

    Table 21. CA000000 to CA2799FF checkpoints (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See ...

  • Page 121

    Table 21. CA000000 to CA2799FF checkpoints (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See ...

  • Page 122

    Table 21. CA000000 to CA2799FF checkpoints (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See ...

  • Page 123

    Table 21. CA000000 to CA2799FF checkpoints (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See ...

  • Page 124

    Table 21. CA000000 to CA2799FF checkpoints (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See ...

  • Page 125

    Table 21. CA000000 to CA2799FF checkpoints (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See ...

  • Page 126

    Table 21. CA000000 to CA2799FF checkpoints (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See ...

  • Page 127

    Table 21. CA000000 to CA2799FF checkpoints (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See ...

  • Page 128

    Table 21. CA000000 to CA2799FF checkpoints (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See ...

  • Page 129

    Table 21. CA000000 to CA2799FF checkpoints (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See ...

  • Page 130

    Table 21. CA000000 to CA2799FF checkpoints (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See ...

  • Page 131

    Table 21. CA000000 to CA2799FF checkpoints (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See ...

  • Page 132

    Table 21. CA000000 to CA2799FF checkpoints (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See ...

  • Page 133

    Table 22. D1001xxx to D1xx3FFF dump codes. If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See chapter 3, “p...

  • Page 134

    Table 22. D1001xxx to D1xx3FFF dump codes (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See c...

  • Page 135

    Table 22. D1001xxx to D1xx3FFF dump codes (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See c...

  • Page 136

    Table 22. D1001xxx to D1xx3FFF dump codes (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See c...

  • Page 137

    Table 22. D1001xxx to D1xx3FFF dump codes (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See c...

  • Page 138

    Table 23 lists the progress codes that might be displayed during the power-on self-test (POST), along with suggested actions to take if the system hangs on the progress code. Take the actions described for a progress code only when you experience a hang condition. Table 23. D1xx3y0...

  • Page 139

    Table 23. D1xx3y01 to D1xx3yF2 checkpoints (continued). If the system hangs on a progress code, follow the suggested actions in the order in which they are listed in the action column until the problem is solved. If an action solves the problem, you can stop performing the remaining actions. See ...

  • Page 140

    D1xx900C to D1xxC003 service processor power-off checkpoints: These D1xx service processor power-off status codes offer information about the status of the service processor during a power-off operation. Table 24 lists the progress codes that might be displayed during the power-on self-test (POST), a...
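
The checkpoint tables in this guide are captioned with code ranges, so a progress code read from the management module can be matched to the table that covers it. The following Python sketch is only an illustration: the function name `lookup_table` is hypothetical, and it assumes that each caption range can be treated as an inclusive numeric interval of eight-digit hexadecimal values, which the guide does not state explicitly. Only the three captions with fully numeric bounds are included; the D1 ranges contain wildcard characters and are left out.

```python
# Ranges copied from the table captions (Table 18, Table 19, Table 21).
# Assumption: the bounds are inclusive and codes compare as hexadecimal numbers.
RANGES = [
    ("Table 18", 0xC1001F00, 0xC1645300),
    ("Table 19", 0xC2001000, 0xC20082FF),
    ("Table 21", 0xCA000000, 0xCA2799FF),
]


def lookup_table(code: str):
    """Return the name of the table whose caption range contains the code, or None."""
    value = int(code, 16)  # accepts upper- or lowercase hex digits
    for name, low, high in RANGES:
        if low <= value <= high:
            return name
    return None


if __name__ == "__main__":
    for sample in ("C1001F00", "c20082ff", "CA000000", "B7001111"):
        print(sample, "->", lookup_table(sample))
```

A code that falls outside every listed range returns `None`, which simply means that this sketch has no table for it.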

  • Page 141

    Service request numbers (SRNs): Service request numbers (SRNs) are error codes that the operating system generates. The codes have three digits, a hyphen, and three or four digits after the hyphen. SRNs can be viewed using the AIX diagnostics or the Linux service aid “diagela” if it is installed. Not...
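
The SRN layout described above (characters, a hyphen, then more characters) can be checked mechanically before looking a code up. The sketch below is a hedged illustration, not part of any IBM tool: `split_srn` and `SRN_PATTERN` are hypothetical names, and the pattern is deliberately looser than the prose, accepting three or four alphanumeric characters on each side of the hyphen because the SRN tables in this guide also list entries such as 2506-8008, 252b-710, and a01-06x.

```python
import re

# Assumption: three or four letters or digits on each side of a single hyphen.
# This is looser than "three digits" so that table entries like 252b-710 match.
SRN_PATTERN = re.compile(r"^([0-9a-z]{3,4})-([0-9a-z]{3,4})$")


def split_srn(srn: str):
    """Split an SRN into (prefix, suffix); return None if it does not fit the pattern."""
    match = SRN_PATTERN.match(srn.strip().lower())
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    for sample in ("103-151", "2506-8008", "651-66A", "103151"):
        print(sample, "->", split_srn(sample))
```

Normalizing to lowercase keeps comparisons simple when codes are copied from different sources with different capitalization.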

  • Page 142

    Table 25. 101-711 through ffc-725 srns (continued) srn ffc description and action 103-151 151 the time-of-day battery failed. 1. Go to “removing the battery” on page 250 to start the battery replacement procedure. 2. Go to “installing the battery” on page 251 to complete the procedure. 109-200 the s...

  • Page 143

    Table 25. 101-711 through ffc-725 srns (continued) srn ffc description and action 651-140 221 display character test failed. Note: diagnostic will provide this srn but there is no action to be taken. Do not perform operator panel test from diagnostics. 651-151 152 2e2 sensor indicates a voltage is o...

  • Page 144

    Table 25. 101-711 through ffc-725 srns (continued) srn ffc description and action 651-613 d01 external cache ecc single-bit error. Go to “performing the checkout procedure” on page 184. 651-614 214 system bus time-out error. Go to “performing the checkout procedure” on page 184. 651-615 292 time-out...

  • Page 145

    Table 25. 101-711 through ffc-725 srns (continued) srn ffc description and action 651-66a 2ce correctable error threshold exceeded. Go to “performing the checkout procedure” on page 184. 651-66b 2cc correctable error threshold exceeded. Go to “performing the checkout procedure” on page 184. 651-674 ...

  • Page 146

    Table 25. 101-711 through ffc-725 srns (continued) srn ffc description and action 651-734 292 intermediate or system bus data parity error. Go to “performing the checkout procedure” on page 184. 651-735 292 intermediate or system bus time-out error. Go to “performing the checkout procedure” on page ...

  • Page 147

    Table 25. 101-711 through ffc-725 srns (continued) srn ffc description and action 651-786 304 214 uncorrectable memory error. Go to “performing the checkout procedure” on page 184. 651-789 2cd 214 uncorrectable memory error. Go to “performing the checkout procedure” on page 184. 651-78a 2ce 214 unco...

  • Page 148

    Table 25. 101-711 through ffc-725 srns (continued) srn ffc description and action 651-841 152 2e2 sensor detected a voltage outside of the normal range. Go to “performing the checkout procedure” on page 184. 651-842 2e1 sensor detected an abnormally high internal temperature. Make sure that: 1. The ...

  • Page 149

    Table 25. 101-711 through ffc-725 srns (continued) srn ffc description and action 652-731 2c8 a non-critical error has been detected: intermediate or system bus address parity error. Schedule deferred maintenance. Go to “performing the checkout procedure” on page 184. 652-732 2c8 a non-critical erro...

  • Page 150

    Table 25. 101-711 through ffc-725 srns (continued) srn ffc description and action 815-100 815 the floating-point processor test failed. Go to “performing the checkout procedure” on page 184. 815-101 815 floating point processor failed. Go to “performing the checkout procedure” on page 184. 815-102 8...

  • Page 151

    Table 25. 101-711 through ffc-725 srns (continued) srn ffc description and action 887-113 887 external loopback (twisted pair) parity test failed. Go to “performing the checkout procedure” on page 184. 887-114 887 ethernet loopback (twisted pair) fairness test failed. Go to “performing the checkout ...

  • Page 152

    Table 25. 101-711 through ffc-725 srns (continued) srn ffc description and action 950-2506 2506 221 missing options resolution for 3gb sas adapter card. Try each of the following steps. After reseating, removing, or replacing a part, retry the operation. 1. Check the bladecenter management-module ev...

  • Page 153

    Table 25. 101-711 through ffc-725 srns (continued) srn ffc description and action 2506-8008 bat a permanent cache battery pack failure occurred. Go to “performing the checkout procedure” on page 184. 2506-8009 bat impending cache battery pack failure. Go to “performing the checkout procedure” on pag...

  • Page 154

    Table 25. 101-711 through ffc-725 srns (continued) srn ffc description and action 2506-9042 - background disk array parity checking detected and corrected errors on specified disk. Go to “performing the checkout procedure” on page 184. 2506-9050 - required cache data can not be located for one or mo...

  • Page 155

    Table 25. 101-711 through ffc-725 srns (continued) srn ffc description and action 252b-710 252b permanent adapter failure. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see “post progress codes (checkpoints)” on page 84. 2. Replace any parts reported b...

  • Page 156

    Table 25. 101-711 through ffc-725 srns (continued) srn ffc description and action 252b-719 252b device bus termination power lost or not detected. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see “post progress codes (checkpoints)” on page 84. 2. Repl...

  • Page 157

    Table 25. 101-711 through ffc-725 srns (continued) srn ffc description and action 256d-201 256d 221 adapter configuration error. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see “post progress codes (checkpoints)” on page 84. 2. Replace any parts repo...

  • Page 158

    Table 25. 101-711 through ffc-725 srns (continued) srn ffc description and action 25c4-xxx 25c4 generic reference for broadcom adapter. Go to “performing the checkout procedure” on page 184. 25c4-201 25c4 configuration error. Go to “performing the checkout procedure” on page 184. 25c4-701 25c4 perma...

  • Page 159

    Table 25. 101-711 through ffc-725 srns (continued) srn ffc description and action 2604-706 2604 error log analysis indicates that a fatal hardware error has occurred for the fibre channel adapter card. This adapter was successfully taken off-line. It will remain off-line until reconfigured or the sy...

  • Page 160

    Table 25. 101-711 through ffc-725 srns (continued) srn ffc description and action 2624-xxx 2624 generic reference for 4x pci-e ddr infiniband host channel adapter - system-board and chassis assembly. Go to “performing the checkout procedure” on page 184. 2624-101 2624 configuration failure - system-...

  • Page 161

    Table 25. 101-711 through ffc-725 srns (continued) srn ffc description and action 2640-134 2640 hardware command or dma failure. Go to “performing the checkout procedure” on page 184. 2640-136 2640 2631 timeout waiting for controller or drive with no busy status. Go to “performing the checkout proce...

  • Page 162

    Table 25. 101-711 through ffc-725 srns (continued) srn ffc description and action 2e10-605 2e10 error log analysis indicates permanent adapter failure is reported on the other port of this adapter. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see “pos...

  • Page 163

    Table 25. 101-711 through ffc-725 srns (continued) srn ffc description and action 2e13-605 2e13 error log analysis indicates permanent adapter failure is reported on the other port of this adapter. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see “pos...

  • Page 164

    Table 25. 101-711 through ffc-725 srns (continued) srn ffc description and action 2e14-605 2e14 error log analysis indicates permanent adapter failure is reported on the other port of this adapter. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see “pos...

  • Page 165

    Table 25. 101-711 through ffc-725 srns (continued) srn ffc description and action 2e15-605 2e15 error log analysis indicates permanent adapter failure is reported on the other port of this adapter. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see “pos...

  • Page 166

    Table 25. 101-711 through ffc-725 srns (continued) srn ffc description and action 2e21-605 2e21 error log analysis indicates permanent adapter failure is reported on the other port of this adapter. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see “pos...

  • Page 167

    Table 25. 101-711 through ffc-725 srns (continued) srn ffc description and action 2e23-106 2e23 external wrap with ip checksum test failure 1. Check the bladecenter management-module event log. If an error was recorded by the system, see “post progress codes (checkpoints)” on page 84. 2. Replace any...

  • Page 168

    Table 25. 101-711 through ffc-725 srns (continued) srn ffc description and action 2e23-604 2e23 error log analysis indicates transmission errors. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see “post progress codes (checkpoints)” on page 84. 2. Repla...

  • Page 169

    A00-FF0 through A24-xxx SRNs: AIX might generate service request numbers (SRNs) from A00-FF0 to A24-xxx. Note: Some SRNs in this sequence might have 4 rather than 3 digits after the dash (–). Table 26 shows the meaning of an x in any of the following SRNs, such as A01-00x. Table 26. Meaning of the la...

  • Page 170

    Table 27. A00-ff0 through a24-xxx srns (continued) srn description fru/action a01-06x time-out error waiting for i/o. 1. Check the bladecenter management-module event log; if an error was recorded by the system, see “post progress codes (checkpoints)” on page 84. 2. If no entry is found, replace the...

  • Page 171

    Table 27. A00-ff0 through a24-xxx srns (continued) srn description fru/action a02-06x memory data error (bad data going to memory). 1. Check the bladecenter management-module event log; if an error was recorded by the system, see “post progress codes (checkpoints)” on page 84. 2. If no entry is foun...

  • Page 172

    Table 27. A00-ff0 through a24-xxx srns (continued) srn description fru/action a03-05x i/o error on non-pci bus. 1. Check the bladecenter management-module event log; if an error was recorded by the system, see “post progress codes (checkpoints)” on page 84. 2. If no entry is found, replace the syste...

  • Page 173

    Table 27. A00-ff0 through a24-xxx srns (continued) srn description fru/action a03-16x i/o expansion unit not in an operating state. 1. Check the bladecenter management-module event log; if an error was recorded by the system, see “post progress codes (checkpoints)” on page 84. 2. If no entry is foun...

  • Page 174

    Table 27. A00-ff0 through a24-xxx srns (continued) srn description fru/action a05-06x system shutdown due to abnormally high internal temperature. 1. Make sure that: a. The room ambient temperature is within the system operating environment. B. There is unrestricted air flow around the system. C. Al...

  • Page 175

    Table 27. A00-ff0 through a24-xxx srns (continued) srn description fru/action a05-21x system shutdown due to over temperature condition. 1. Make sure that: a. The room ambient temperature is within the system operating environment. B. There is unrestricted air flow around the system. C. All system c...

  • Page 176

    Table 27. A00-ff0 through a24-xxx srns (continued) srn description fru/action a0d-09x service processor error accessing vital product data eeprom. 1. Check the bladecenter management-module event log; if an error was recorded by the system, see “post progress codes (checkpoints)” on page 84. 2. If n...

  • Page 177

    Table 27. A00-ff0 through a24-xxx srns (continued) srn description fru/action a0d-36x other ipl diagnostic error. 1. Check the bladecenter management-module event log; if an error was recorded by the system, see “post progress codes (checkpoints)” on page 84. 2. If no entry is found, replace the sys...

  • Page 178

    Table 27. A00-ff0 through a24-xxx srns (continued) srn description fru/action a11-01x a non-critical error has been detected, a cpu internal error. 1. Check the bladecenter management-module event log; if an error was recorded by the system, see “post progress codes (checkpoints)” on page 84. 2. If ...

  • Page 179

    Table 27. A00-ff0 through a24-xxx srns (continued) srn description fru/action a11-550 recoverable errors on resource indicate a trend toward an unrecoverable error. However, the resource could not be deconfigured and is still in use. The system is operating with the potential for an unrecoverable er...

  • Page 180

    Table 27. A00-ff0 through a24-xxx srns (continued) srn description fru/action a12-07x a non-critical error has been detected, a memory bus/switch internal error. 1. Check the bladecenter management-module event log; if an error was recorded by the system, see “post progress codes (checkpoints)” on p...

  • Page 181

    Table 27. A00-ff0 through a24-xxx srns (continued) srn description fru/action a12-16x a non-critical error has been detected, a system bus internal hardware/switch error. 1. Check the bladecenter management-module event log; if an error was recorded by the system, see “post progress codes (checkpoin...

  • Page 182

    Table 27. A00-ff0 through a24-xxx srns (continued) srn description fru/action a13-06x a non-critical error has been detected, a mezzanine bus address parity error. 1. Check the bladecenter management-module event log; if an error was recorded by the system, see “post progress codes (checkpoints)” on...

  • Page 183

    Table 27. A00-ff0 through a24-xxx srns (continued) srn description fru/action a13-16x a non-critical error has been detected, an i/o expansion unit not in an operating state. 1. Check the bladecenter management-module event log; if an error was recorded by the system, see “post progress codes (check...

  • Page 184

    Table 27. A00-ff0 through a24-xxx srns (continued) srn description fru/action a15-12x sensor detected redundant power supply failure. 1. Check the bladecenter management-module event log; if an error was recorded by the system, see “post progress codes (checkpoints)” on page 84. 2. If no entry is fo...

  • Page 185

    Table 27. A00-ff0 through a24-xxx srns (continued) srn description fru/action a15-22x fan failure and over temperature condition. 1. Check the bladecenter management-module event log; if an error was recorded by the system, see “post progress codes (checkpoints)” on page 84. 2. If no entry is found,...

  • Page 186

    Table 27. A00-ff0 through a24-xxx srns (continued) srn description fru/action a1d-05x a non-critical error has been detected, a service processor error accessing special registers. 1. Check the bladecenter management-module event log; if an error was recorded by the system, see “post progress codes ...

  • Page 187

    Table 27. A00-ff0 through a24-xxx srns (continued) srn description fru/action a1d-19x a non-critical error has been detected, a service processor error accessing real time clock/time-of-day clock. 1. Check the bladecenter management-module event log; if an error was recorded by the system, see “post...

  • Page 188

    Table 27. A00-ff0 through a24-xxx srns (continued) srn description fru/action a1d-34x a non-critical error has been detected: wire test error. 1. Check the bladecenter management-module event log; if an error was recorded by the system, see “post progress codes (checkpoints)” on page 84. 2. If no en...

  • Page 189

    Table 27. A00-ff0 through a24-xxx srns (continued) srn description fru/action a24-xxx spurious interrupts have exceeded threshold. 1. Check the bladecenter management-module event log; if an error was recorded by the system, see “post progress codes (checkpoints)” on page 84. 2. Replace part numbers...

  • Page 190

    Table 28. Ssss-102 through ssss-640 srns (continued) srn ffc description and action ssss-108 ssss the bus test failed. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see “post progress codes (checkpoints)” on page 84. 2. Replace any parts reported by th...

  • Page 191

    Table 28. Ssss-102 through ssss-640 srns (continued) srn ffc description and action ssss-122 ssss a scsd reservation conflict error. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see “post progress codes (checkpoints)” on page 84. 2. Replace any parts ...

  • Page 192

    Table 28. Ssss-102 through ssss-640 srns (continued) srn ffc description and action ssss-134 252b software the adapter failed to configure. 1. Check the bladecenter management-module event log. If an error was recorded by the system, see “post progress codes (checkpoints)” on page 84. 2. Replace any...

  • Page 193

    Failing function codes 151 through 2E33: Failing function codes (FFCs) identify a function within the system unit that is failing. Table 29 describes the component that each function code identifies. Note: When replacing a component, perform system verification for the component. See “using the diagn...

  • Page 194

    Table 29. Failing function codes 151 through 2e33 (continued) ffc description and notes 2d7 system-board and chassis assembly (vpd module) 2d9 system-board and chassis assembly (power controller) 2e0 system-board and chassis assembly (fan sensor problem) 2e1 system-board and chassis assembly (therma...

  • Page 195

    Table 29. Failing function codes 151 through 2e33 (continued) ffc description and notes 2d02 system-board and chassis assembly (generic usb reference to controller/adapter) 2e00 qlogic 4gb fibre channel and broadcom 1 gb ethernet combo 2e10 qlogic 4gb fibre channel and broadcom 1 gb ethernet combo 2...

  • Page 196

    See the online information or the BladeCenter Management Module User's Guide for more information about the event log.
    Checkout procedure: The checkout procedure is the sequence of tasks that you should follow to diagnose a problem in the blade server.
    About the checkout procedure: Review this informa...

  • Page 197

    3. If the AIX operating system records a service request number (SRN), see “service request numbers (SRNs)” on page 129.
    4. Check the BladeCenter management-module event log. If an error was recorded by the system, see “POST progress codes (checkpoints)” on page 84 or “system reference codes (SRCs)”...

  • Page 198

    2. Turn off the blade server and wait 45 seconds before proceeding.
    3. Turn on the blade server and establish an SOL session.
    4. Check for the following responses:
       a. Progress codes are recorded in the management-module event log.
       b. Record any messages or diagnostic information that might be in the...

  • Page 199

    3. When testing is complete, press F3 until the Diagnostic Operating Instructions panel is displayed, then press F3 to exit the diagnostic program.
    Starting stand-alone diagnostics from a CD: Perform these procedures to start the stand-alone diagnostics from a CD. These procedures can be used if the ...

  • Page 200

    Starting stand-alone diagnostics from a NIM server: Perform this procedure to start the stand-alone diagnostics from a Network Installation Management (NIM) server. Note: See Network Installation Management in the AIX Information Center for information about configuring the blade server as a NIM serv...

  • Page 201

    7. When testing is complete, press F3 until the Diagnostic Operating Instructions screen is displayed; then press F3 again to exit the diagnostic program.
    Using the diagnostics program: Follow the basic procedures for running the diagnostics program.
    1. Start the diagnostics from the AIX operating sy...

  • Page 202

    Boot problem resolution: Depending on the boot device, a checkpoint might be displayed in the list of checkpoints in the management module for an extended period of time while the boot image is retrieved from the device. This situation is particularly true for CD and network boot attempts. When booti...

  • Page 203

    • Midplane 4. If you are attempting to boot from a hard disk drive: a. Verify that the hard disk drive is installed. b. Select the CD or DVD drive as the boot device. c. Go to “Performing the checkout procedure” on page 184. d. Reload the operating system onto the hard disk drive if the boot attempt...

  • Page 204

    Drive problems. Identify hard disk drive problem symptoms and what corrective actions to take. • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. • See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components ...

  • Page 205

    Management module service processor problems. Determine if a problem is a management module service processor problem and, if so, the corrective action to take. • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. • See Chapter 3, “Par...

  • Page 206

    Microprocessor problems. Identify microprocessor problem symptoms and what corrective actions to take. • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. • See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which com...

  • Page 207

    The following table shows the syntax of a nine-word B700xxxx SRC as it might be displayed in the event log of the management module. The first word of the SRC in this example is the message identifier, B7001111. This example numbers each word after the first word to show relative word positions. Th...
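    The word layout of a nine-word SRC described above can be illustrated with a minimal Python sketch. It assumes that each word is eight hexadecimal digits and that words are separated by whitespace; the function name split_src, the numbering of the remaining words from 2 to 9, and the sample value are hypothetical and are not taken from the manual.

```python
import re

# Assumption: each SRC word is exactly eight hexadecimal digits.
WORD_PATTERN = re.compile(r"[0-9A-Fa-f]{8}")


def split_src(src_text: str) -> dict:
    """Split a nine-word SRC into its message identifier and numbered words.

    The first word is treated as the message identifier. The remaining
    eight words are keyed 2 through 9 (an illustrative numbering choice).
    """
    words = src_text.split()
    if len(words) != 9:
        raise ValueError(f"expected 9 words, got {len(words)}")
    for word in words:
        if not WORD_PATTERN.fullmatch(word):
            raise ValueError(f"not an 8-digit hexadecimal word: {word!r}")
    return {
        "message_id": words[0].upper(),
        "words": {pos: w.upper() for pos, w in enumerate(words[1:], start=2)},
    }


if __name__ == "__main__":
    # Hypothetical event-log text; only the identifier B7001111 comes from the manual.
    sample = "B7001111 00000002 00000003 00000004 00000005 00000006 00000007 00000008 00000009"
    parsed = split_src(sample)
    print(parsed["message_id"])
    print(parsed["words"])
```

    Keeping the message identifier separate from the numbered words mirrors how the manual's example distinguishes the first word from the words that follow it.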

  • Page 208

    Optional device problems. Identify optional device problem symptoms and what corrective actions to take. • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. • See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which c...

  • Page 209

    • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. • See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which components are FRUs. • If an action step is preceded by “(Trained service t...

  • Page 210

    POWER Hypervisor (PHYP) problems. The POWER Hypervisor (PHYP) provides error diagnostics with associated error codes and fault isolation procedures for troubleshooting. When the POWER7 hypervisor error analysis determines a specific fault, the hypervisor logs an error code that identifies a failing c...

  • Page 211

    Table 33. POWER Hypervisor isolation procedures (continued) • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. • See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which components are ...

  • Page 212

    Service processor problems. The baseboard management controller (BMC) is a flexible service processor that provides error diagnostics with associated error codes, and fault isolation procedures for troubleshooting. Note: Resetting the service processor causes a POWER7 reset/reload, which generates a ...

  • Page 225

    Software problems. Use this information to recognize software problem symptoms and to take corrective actions. • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. • See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine w...

  • Page 226

    Light path diagnostics. Light path diagnostics is a system of LEDs on the control panel and on the system board of the blade server. When an error occurs, LEDs are lit throughout the blade server. If the control panel indicates an error, use the descriptions of the LEDs to diagnose the problem and ta...

  • Page 227

    Table 34 shows LED descriptions. Table 34. PS700 LEDs. Callout: Base unit LEDs. 1: 3V lithium battery LED; 2: DIMM 1-4 LEDs; 3: Management card LED; 4: Light path power LED; 5: System board LED; 6: HDD1 LED; 7: Interposer LED; 8: CIOV (1Xe) expansion card connector LED; 9: High-speed (CFFh) expansion card connector LED...

  • Page 228

    Table 35 describes the LEDs on the system board and suggested actions for correcting any detected problems. Table 35. Light path diagnostic LED descriptions • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. • See Chapter 3, “Parts ...

  • Page 229

    Table 35. Light path diagnostic LED descriptions (continued) • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved. • See Chapter 3, “Parts listing, Type 8406,” on page 229 to determine which components are CRUs and which components are...

  • Page 230

    Isolating firmware problems. You can use this procedure to isolate firmware problems. To isolate a firmware problem, follow the procedure until the problem is solved. 1. If the blade server is operating, shut down the operating system and turn off the blade server. 2. Turn on the blade server. If the...

  • Page 232

    4. To check the vfchost mapping, enter the lsmap command as follows: lsmap -vadapter vfchost2 -npiv The output might look like the following example: Name Physloc ClntID ClntName ClntOS --------------- ---------------------------------------- ------- ---------- ------ vfchost2 U7895.42X.9999999-V1-C...

  • Page 233

    See the documentation for the management module to learn more. Starting the TEMP image. Start the TEMP image before you update the firmware. Perform the following procedure to start the TEMP image. 1. Access the advanced management module. See the BladeCenter Management Module Command-Line Interface ...

  • Page 234

    2. From the Function Selection menu, select Task Selection and press Enter. 3. From the Tasks Selection List menu, select Update and Manage System Flash and press Enter. The Update and Manage System Flash menu is displayed. The top of the window displays the system firmware level for the PERM and t...

  • Page 235

    2. Verify that power management is set correctly for your BladeCenter unit configuration. 3. Verify whether the problem is being experienced on more than one blade server. 4. Perform a test of the failing function on a blade server that is known to be operational. 5. Try the blade server in a differ...

  • Page 236

    5. Troubleshoot the diskette drive if it is the only failing component. If there is a diskette in the drive, make sure that: • The diskette is inserted correctly in the drive. • The diskette is good and not damaged; the drive LED light flashes once per second when the diskette is inserted. (Try anot...

  • Page 237

    If these steps do not resolve the problem, it is likely a problem with the blade server. See “Universal Serial Bus (USB) port problems” on page 213 for more information. Solving shared network connection problems. Problems with BladeCenter shared resources might appear to be in the blade server, but ...

  • Page 238

    To check the general function of shared BladeCenter power resources, perform the following procedure. 1. Verify that the LEDs on all the BladeCenter power modules are lit. 2. Verify that power is being supplied to the BladeCenter unit. 3. Verify that the installation of the blade server type is supp...

  • Page 239

    7. Replace the monitor cable, if applicable. 8. Replace the monitor. 9. Replace the management module. See the online information center or the Problem Determination and Service Guide or the Hardware Maintenance Manual and Troubleshooting Guide for your BladeCenter unit. Solving undetermined problem...

  • Page 240

    If the problem is solved when you remove an i/o expansion option from the blade server but the problem recurs when you reinstall the same expansion option, suspect the expansion option; if the problem recurs when you replace the expansion option with a different one, suspect the system-board and cha...

  • Page 241

    Chapter 3. Parts listing, Type 8406. The parts listing identifies each replaceable part and its part number. Figure 7 shows replaceable components that are available for the PS700 blade server. Figure 7. Parts illustration, Type 8406. PS700 base unit with cover. © Copyright IBM Corp. 2010, 2011.

  • Page 242

    Replaceable components are of three types: • Tier 1 customer replaceable unit (CRU): Replacement of Tier 1 CRUs is your responsibility. If IBM installs a Tier 1 CRU at your request, you will be charged for the installation. • Tier 2 customer replaceable unit: You may install a Tier 2 CRU yourself or...

  • Page 243

    Table 36. Parts listing, Type 8406 (continued). Column headings: Index, Description, CRU part number, FRU part number, Failing function code (FFC), (Tier 1), (Tier 2). Hard drive filler: 40K5928; Service label: 46K5891; IBM FRU/CRU label: 46K5893; OEM IBM FRU/CRU label: 46K5894; Cover warning label: 90P4799; Miscellaneous parts kit: 3...

  • Page 245

    Chapter 4. Removing and replacing blade server components. Use this information to remove and replace components of the PS700 blade server that are replaceable. Replaceable components are of three types: • Tier 1 customer replaceable unit (CRU): Replacement of Tier 1 CRUs is your responsibility. If i...

  • Page 246

    System reliability guidelines. Follow these guidelines to help ensure proper cooling and system reliability. • Verify that the ventilation holes on the blade server are not blocked. • Verify that you are maintaining proper system cooling in the unit. Do not operate the BladeCenter unit without a blad...

  • Page 247

    Removing the blade server from a BladeCenter unit. Remove the blade server from the BladeCenter unit to access options, connectors, and system-board indicators. Attention: • To maintain proper system cooling, do not operate the BladeCenter unit without a blade server, expansion unit, or blade filler ...

  • Page 248

    Installing the blade server in a BladeCenter unit. Install the blade server in a BladeCenter unit to use the blade server. Statement 21. CAUTION: Hazardous energy is present when the blade server is connected to the power source. Always replace the blade server cover before installing the blade server...

  • Page 249

    5. Verify that the release handles on the blade server are in the open position (perpendicular to the blade server, as shown in 1 in Figure 9 on page 236). 6. If you installed a filler blade or another blade server in the bay from which you removed the blade server, remove it from the bay. 7. Slide ...

  • Page 250

    Perform the following procedure to open and remove the blade server cover. 1. Read the safety topic and the “Installation guidelines” on page 233. 2. Shut down the operating system, turn off the blade server, and remove the blade server from the BladeCenter unit. See “Removing the blade server from ...

  • Page 251

    Installing and closing the blade server cover. Install and close the cover of the blade server before you insert the blade server into the BladeCenter unit. Do not attempt to override this important protection. Statement 21. CAUTION: Hazardous energy is present when the blade server is connected to th...

  • Page 252

    3. Pivot the cover to the closed position until the releases (as shown by 2) click into place in the cover. 4. Install the blade server into the BladeCenter unit. See “Installing the blade server in a BladeCenter unit” on page 236. Removing the bezel assembly. Remove the bezel assembly. 1. Read the ...

  • Page 253

    1. Connect the control-panel cable (1 in Figure 13) to the control-panel connector (2) on the system board. 2. Carefully slide the bezel assembly (4) onto the blade server until the two bezel-assembly releases (3) click into place in the bezel assembly. 3. Install and close the blade server c...

  • Page 254

    Perform the following procedure to remove the drive. 1. Back up the data from the drive to another storage device. 2. Read the safety topic and the “Installation guidelines” on page 233. 3. Shut down the operating system, turn off the blade server, and remove the blade server from the BladeCenter un...

  • Page 255

    All drive connectors are on the same bus. If the two drives are both SAS hard disk drives, you can use them to implement and manage a redundant array of independent disks (RAID) level-1 array. See “Configuring a RAID array” on page 268 for information about RAID configuration. To install a drive, co...

  • Page 256

    8. Install the blade server into the BladeCenter unit. See “Installing the blade server in a BladeCenter unit” on page 236. Removing a memory module. You can remove a very low profile (VLP) dual-inline memory module (DIMM). 1. Read the safety topic and the “Installation guidelines” on page 233. 2. Sh...

  • Page 257

    Installing a memory module. Install dual inline memory modules (DIMMs) in the blade server. Table 37 shows allowable placement of DIMM modules: Table 37. Memory module combinations. DIMM count; PS700 base blade planar (P1) DIMM slots 1 2 3 4 5 6 7 8. 2: X X; 4: X X X X; 6: X X X X X X; 8: X X X X X X X X. See “...

  • Page 258

    6. Remove the bezel. See “Removing the bezel assembly” on page 240. 7. Locate the DIMM connectors on the system board. See the illustration in “System-board connectors” on page 8. Determine the connector into which you will install the DIMM. 8. Touch the static-protective package that contains the pa...

  • Page 259

    Removing a CIOV form-factor expansion card. You can remove a CIOV form-factor expansion card from the 1Xe connector. 1. Read the safety topic and the “Installation guidelines” on page 233. 2. Shut down the operating system, turn off the blade server, and remove the blade server from the BladeCenter u...

  • Page 260

    To install a CIOV form-factor expansion card, complete the following steps: 1. Read the safety topic and the “Installation guidelines” on page 233. 2. Shut down the operating system, turn off the blade server, and remove the blade server from the BladeCenter unit. See “Removing the blade server from...

  • Page 261

    Removing a combination-form-factor expansion card. Complete this procedure to remove a combination-form-factor expansion card. 1. Read the safety topic and the “Installation guidelines” on page 233. 2. Shut down the operating system, turn off the blade server, and remove the blade server from the bla...

  • Page 262

    1. Read the safety topic and the “Installation guidelines” on page 233. 2. Shut down the operating system, turn off the blade server, and remove the blade server from the BladeCenter unit. See “Removing the blade server from a BladeCenter unit” on page 235. 3. Open and remove the blade server cover....

  • Page 263

    3. Carefully lay the blade server on a flat, static-protective surface, with the cover side up. 4. Open and remove the blade server cover. See “Removing the blade server cover” on page 237. 5. Locate the battery on the system board. See “System-board connectors” on page 8 for the location of the bat...

  • Page 264

    CAUTION: When replacing the lithium battery, use only IBM part number 33F8354 or an equivalent type battery recommended by the manufacturer. If your system has a module containing a lithium battery, replace it only with the same module type made by the same manufacturer. The battery contains lithium...

  • Page 265

    Perform the following procedure to remove the disk drive tray. 1. Read the safety topic and the “Installation guidelines” on page 233. 2. Shut down the operating system, turn off the blade server, and remove the blade server from the BladeCenter unit. See “Removing the blade server from a BladeCente...

  • Page 266

    To install the disk drive tray, complete the following steps: 1. Place the drive tray (1 in Figure 25) into position on the system board and install the four screws to secure it. 2. Install the disk drive that was removed from the drive tray. See “Installing a drive” on page 242 for instructions. 3...

  • Page 267

    Removing the Tier 2 management card. You can remove this Tier 2 CRU yourself or request IBM to remove it, at no additional charge, under the type of warranty service that is designated for the blade server. Remove the management card to replace the card or to reuse the card in a new system board and ...

  • Page 268

    Installing the Tier 2 management card. You can install this Tier 2 CRU yourself or request IBM to install it, at no additional charge, under the type of warranty service that is designated for the blade server. Use this procedure to install the management card into the currently installed system boar...

  • Page 269

    7. Install and close the blade server cover. See “Installing and closing the blade server cover” on page 239. Statement 21. CAUTION: Hazardous energy is present when the blade server is connected to the power source. Always replace the blade server cover before installing the blade server. 8. Install...

  • Page 270

    Before you complete this procedure, install the management card, as described in “Installing the management card.” PowerVM is one of the Capacity on Demand advanced functions. Capacity on Demand advanced functions are also referred to as Virtualization Engine systems technologies or Virtualization Eng...

  • Page 271

    Note: When you request the activation code, you must supply the information that is emphasized in the following example. CoD VET Information. System type: 7895; System serial number: 12-34567; Anchor card CCIN: 52ef; Anchor card serial number: 01-231s000; Anchor card unique identifier: 30250812077c3228; r...

  • Page 272

    chvet -o e -k <activation_code> where <activation_code> is the activation code. For example, if the activation code is 4d8d6e7a81409365ca1f000028200041fd, enter the following command: chvet -o e -k 4d8d6e7a81409365ca1f000028200041fd b. In an IVM session, validate that the code entry is successful by using the lsvet -t hist command. ...
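    As an illustration of the command form above, the following Python sketch assembles the chvet command line from an activation code. It assumes that activation codes consist only of hexadecimal digits, as in the example above; the function name build_chvet_command is hypothetical, and the sketch only builds the text of the command without running it.

```python
import string


def build_chvet_command(activation_code: str) -> str:
    """Return the chvet command line for the given activation code.

    Assumption: an activation code contains only hexadecimal digits,
    as in the manual's example. The command is not executed here.
    """
    code = activation_code.strip()
    if not code:
        raise ValueError("activation code is empty")
    if any(ch not in string.hexdigits for ch in code):
        raise ValueError("activation code should contain only hexadecimal digits")
    return f"chvet -o e -k {code}"


if __name__ == "__main__":
    # Activation code taken from the example in the manual.
    print(build_chvet_command("4d8d6e7a81409365ca1f000028200041fd"))
```

    Checking the code before it is typed into the session can catch a stray space or a mistyped character early; the lsvet -t hist command remains the way to confirm that the entry succeeded.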

  • Page 273

    4. Does the blade server have Fibre Channel adapters? Yes: See “Save vfchost map data” on page 218. Then, continue with the next step. No: Continue with the next step. 5. Shut down the operating system, turn off the blade server, and remove the blade server from the BladeCenter unit. See “Removing the bla...

  • Page 274

    14. Write the machine type, model number, and serial number of the blade server on the repair identification (RID) tag that comes with the replacement system-board and chassis assembly. This information is on the identification label that is behind the control-panel door on the front of the blade se...

  • Page 275

    Chapter 5. Configuring. Update the firmware and use the management module and the System Management Services (SMS) to configure the PS700 blade server. Updating the firmware. IBM periodically makes firmware updates available for you to install on the blade server, the management module, or expansion c...

  • Page 276

    5. Install the firmware update with one of the following methods: • Install the firmware with the in-band diagnostics of your AIX system, as described in “Using the AIX diagnostics to install the server firmware update through AIX.” • Install the firmware with the update_flash command on AIX: cd /tmp/...

  • Page 277

    Using the SMS utility. Use the System Management Services (SMS) utility to perform a variety of configuration tasks on the PS700 blade server. Starting the SMS utility. Start the SMS utility to configure the blade server. 1. Turn on or restart the blade server, and establish an SOL session with it. Se...

  • Page 278

    The CE login must have a role of Run Diagnostics and a primary group of system. This enables the CE login to perform the following tasks: • Run the diagnostics, including the service aids, certify, and format. • Run all the operating-system commands that are run by system group users. • Configure...

  • Page 279

    Blade server Ethernet controller enumeration. The enumeration of the Ethernet controllers in a blade server is operating-system dependent. You can verify the Ethernet controller designations that a blade server uses through the operating-system settings. The routing of an Ethernet controller to a par...

  • Page 280

    Table 38. MAC addressing scheme for physical and logical Host Ethernet Adapters (continued). Column headings: Node; Name in management module; Relation to the MAC that is listed on the PS700 label; Example. Logical HEA port; MAC +17; Same as last MAC address on the label; 00:1a:64:44:0e:c8 to 00:1a:64:44:0e:d4. For more inform...
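    Table 38 expresses adapter MAC addresses as offsets from an address on the PS700 label. The arithmetic behind an entry such as "MAC +17" can be sketched in Python as follows. This is a minimal illustration only: the function name mac_add and the sample base address are hypothetical, and the sketch does not state which label address serves as the base for any particular port.

```python
def mac_add(mac: str, offset: int) -> str:
    """Return the MAC address that lies `offset` addresses after `mac`.

    The address is treated as a 48-bit integer written as six
    colon-separated hexadecimal octets.
    """
    octets = mac.split(":")
    if len(octets) != 6 or any(len(o) != 2 for o in octets):
        raise ValueError(f"not a colon-separated MAC address: {mac!r}")
    value = int("".join(octets), 16) + offset
    if not 0 <= value < (1 << 48):
        raise ValueError("result is outside the 48-bit MAC address range")
    raw = f"{value:012x}"
    return ":".join(raw[i:i + 2] for i in range(0, 12, 2))


if __name__ == "__main__":
    # Hypothetical base address chosen for illustration only.
    base = "00:1a:64:44:0e:c4"
    print(mac_add(base, 4))
```

    Treating the address as one integer lets a carry propagate across octets, so an offset added to an address ending in ff rolls over into the next octet.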

  • Page 281

    d. Click BladeCenter PS700 to display the list of downloadable files for the blade server.

  • Page 283

    Appendix. Notices. This information was developed for products and services offered in the U.S.A. The manufacturer may not offer the products, services, or features discussed in this document in other countries. Consult the manufacturer's representative for information on the products and services cu...

  • Page 284

    The manufacturer's prices shown are the manufacturer's suggested retail prices, are current and are subject to change without notice. Dealer prices may vary. This information is for planning purposes only. The information herein is subject to change before the products described become available. Th...

  • Page 285

    Electronic emission notices. When attaching a monitor to the equipment, you must use the designated monitor cable and any interference suppression devices supplied with the monitor. Class A Notices. The following Class A statements apply to the IBM servers that contain the POWER7 processor and its fea...

  • Page 286

    Warning: This is a Class A product. In a domestic environment, this product may cause radio interference, in which case the user may be required to take adequate measures. VCCI Statement - Japan. The following is a summary of the VCCI Japanese statement in the box above: This is a Class A product bas...

  • Page 287

    Electromagnetic Interference (EMI) Statement - Taiwan. The following is a summary of the EMI Taiwan statement above. Warning: This is a Class A product. In a domestic environment this product may cause radio interference in which case the user will be required to take adequate measures. IBM Taiwan co...

  • Page 288

    Germany Compliance Statement. Deutschsprachiger EU Hinweis: Hinweis für Geräte der Klasse A EU-Richtlinie zur Elektromagnetischen Verträglichkeit. Dieses Produkt entspricht den Schutzanforderungen der EU-Richtlinie 2004/108/EG zur Angleichung der Rechtsvorschriften über die elektromagnetische Verträgl...

  • Page 289

    Electromagnetic Interference (EMI) Statement - Russia. Class B Notices. The following Class B statements apply to features designated as electromagnetic compatibility (EMC) Class B in the feature installation information. Federal Communications Commission (FCC) statement. This equipment has been tested...

  • Page 290

    Industry Canada Compliance Statement. This Class B digital apparatus complies with Canadian ICES-003. Avis de conformité à la réglementation d'Industrie Canada. Cet appareil numérique de la classe B est conforme à la norme NMB-003 du Canada. European Community Compliance Statement. This product is in c...

  • Page 291

    Japanese Electronics and Information Technology Industries Association (JEITA) Confirmed Harmonics Guideline with Modifications (products greater than 20 A per phase). IBM Taiwan Contact Information. Electromagnetic Interference (EMI) Statement - Korea. Germany Compliance Statement. Deutschsprachiger EU...

  • Page 292

    Dieses Gerät ist berechtigt, in Übereinstimmung mit dem Deutschen EMVG das EG-Konformitätszeichen - CE - zu führen. Verantwortlich für die Einhaltung der EMV Vorschriften ist der Hersteller: International Business Machines Corp., New Orchard Road, Armonk, New York 10504, Tel: 914-499-1900. Der verantwor...

  • Page 294

    Printed in USA GI11-9831-00