IBM NeXtScale nx360 M4 Installation And Service Manual

Summary of NeXtScale nx360 M4

  • Page 1

    IBM NeXtScale nx360 M4 Type 5455 Installation and Service Guide.

  • Page 4

    Note: Before using this information and the product it supports, read the general information in Appendix D, “Getting help and technical assistance,” on page 373, “Notices” on page 377, the Warranty Information document, and the Safety Information and Environmental Notices and User Guide documents on...

  • Page 5

    Contents
    • Safety (page vii)
    • Guidelines for trained service technicians (page viii)
    • Inspecting for unsafe conditions (page viii)
    • Guidelines for servicing electrical equipment (page ix)
    • Safety statements (page x)
    • Chapter 1. The IBM NeXtScale nx360 M4 compute...

  • Page 6

    • Chapter 5. Removing and replacing components (page 93)
    • Installation tools (page 93)
    • Installing an optional device (page 93)
    • Installation guidelines (page 93)
    • System reliability guidelines (page 95)
    • Handling static-sensitive devic...

  • Page 7

    • European Union EMC Directive conformance statement (page 381)
    • Germany Class A statement (page 382)
    • Japan VCCI Class A statement (page 383)
    • Japan Electronics and Information Technology Industries Association (JEITA) statement (page 383)
    • Korea Communications Commission ...

  • Page 9

    Safety
    Before installing this product, read the Safety Information. Antes de instalar este produto, leia as informações de segurança. Læs sikkerhedsforskrifterne, før du installerer dette produkt. Lees voordat u dit product installeert eerst de veiligheidsvoorschriften. Ennen kuin asennat tämän tuot...

  • Page 10

    Les sikkerhetsinformasjonen (safety information) før du installerer dette produktet. Antes de instalar este produto, leia as informações sobre segurança. Antes de instalar este producto, lea la información de seguridad. Läs säkerhetsinformationen innan du installerar den här produkten. Bu ürünü kurm...

  • Page 11

    an unsafe condition, you must determine how serious the hazard is and whether you must correct the problem before you work on the product. Consider the following conditions and the safety hazards that they present:
    • Electrical hazards, especially primary power. Primary voltage on the frame can caus...

  • Page 12

    • Before you work on the equipment, disconnect the power cord. If you cannot disconnect the power cord, have the customer power off the wall box that supplies power to the equipment and lock the wall box in the off position.
    • Never assume that power has been disconnected from a circuit. Check it to...

  • Page 13

    DANGER: Electrical current from power, telephone, and communication cables is hazardous. To avoid a shock hazard:
    • Do not connect or disconnect any cables or perform installation, maintenance, or reconfiguration of this product during an electrical storm.
    • Connect all power cords to a properly wire...

  • Page 14

    Statement 3. CAUTION: When laser products (such as CD-ROMs, DVD drives, fiber optic devices, or transmitters) are installed, note the following:
    • Do not remove the covers. Removing the covers of the laser product could result in exposure to hazardous laser radiation. There are no serviceable parts i...

  • Page 15

    Statement 5. CAUTION: The power control button on the device and the power switch on the power supply do not turn off the electrical current supplied to the device. The device also might have more than one power cord. To remove all electrical current from the device, ensure that all power cords are d...

  • Page 16

    Statement 12. CAUTION: The following label indicates a hot surface nearby.
    Statement 26. CAUTION: Do not place any object on top of rack-mounted devices.
    Statement 27. CAUTION: Hazardous moving parts are nearby.
    Rack Safety Information, Statement 2

  • Page 17

    DANGER
    • Always lower the leveling pads on the rack cabinet.
    • Always install stabilizer brackets on the rack cabinet.
    • Always install servers and optional devices starting from the bottom of the rack cabinet.
    • Always install the heaviest devices in the bottom of the rack cabinet.

  • Page 19

    Chapter 1. The IBM NeXtScale nx360 M4 Compute Node Type 5455
    The IBM NeXtScale nx360 M4 Compute Node Type 5455 is a high-availability, scalable compute node that is optimized to support the next-generation microprocessor technology and is ideally suited for medium and large businesses. The IBM NeXtS...

  • Page 20

    included in the compute node documentation. To obtain the most up-to-date documentation for this product, go to http://publib.boulder.ibm.com/infocenter/flexsys/information/index.jsp. You can subscribe to information updates that are specific to your compute node at http://www.ibm.com/support/mynot...

  • Page 21

    Hardware and software requirements
    The IBM Documentation CD requires the following minimum hardware and software:
    • Microsoft Windows or Red Hat Linux
    • 100 MHz microprocessor
    • 32 MB of RAM
    • Adobe Acrobat Reader 3.0 (or later) or ...

  • Page 22

    Related documentation
    This Installation and Service Guide contains general information about the server, including how to set up and cable the server, how to install supported optional devices, how to configure the server, and information to help you solve problems yourself and information for servic...

  • Page 23

    The following notices and statements are used in this document:
    • Note: These notices provide important tips, guidance, or advice.
    • Important: These notices provide information or advice that might help you avoid inconvenient or problem situations.
    • Attention: These notices indicate potential dama...

  • Page 24

    • System error LEDs
    • Software RAID supportability for RAID level-0, RAID level-1, or RAID level-10
    • Hardware RAID supportability for RAID level-0, RAID level-1, RAID level-5, or RAID level-10
    • Wake on LAN (WOL)
    Drive expansion bays (depending on the model): Supports up to eight 3.5-inch SATA (if ...

  • Page 25

    Environment:
    Server off (note 7):
    • Temperature: 5°C to 45°C (41°F to 113°F)
    • Relative humidity: 8% to 85%
    • Maximum dew point: 27°C (80.6°F)
    Storage (non-operating):
    • Temperature: 1°C to 60°C (33.8°F to 140.0°F)
    • Maximum altitude: 3,050 m (10,000 ft)
    • Relative humidity: 5% to 80%
    • Maximum dew point:...
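
    The server-off limits quoted above can be checked mechanically. The following is a minimal sketch, not part of the IBM guide: the names Envelope, SERVER_OFF, c_to_f, and within are hypothetical, and only the three server-off limits that are fully visible in the text (temperature, relative humidity, maximum dew point) are modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    """Environmental limits for one state (hypothetical helper type)."""
    temp_min_c: float
    temp_max_c: float
    rh_min_pct: float
    rh_max_pct: float
    dew_point_max_c: float


# Server-off limits as quoted in the text: 5-45 degC, 8-85 % RH, dew point <= 27 degC.
SERVER_OFF = Envelope(5.0, 45.0, 8.0, 85.0, 27.0)


def c_to_f(celsius: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9.0 / 5.0 + 32.0


def within(env: Envelope, temp_c: float, rh_pct: float, dew_point_c: float) -> bool:
    """Return True when all three readings fall inside the envelope."""
    return (
        env.temp_min_c <= temp_c <= env.temp_max_c
        and env.rh_min_pct <= rh_pct <= env.rh_max_pct
        and dew_point_c <= env.dew_point_max_c
    )
```

    For example, within(SERVER_OFF, 22.0, 40.0, 10.0) is True, while a 50°C reading falls outside the envelope. The c_to_f helper reproduces the Fahrenheit figures given in parentheses in the text.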

  • Page 26

    A. Conductive materials (conductive flooring, conductive footwear on all personnel who go into the datacenter; all mobile furnishings and equipment will be made of conductive or static dissipative materials). B. During maintenance on any hardware, a properly functioning wrist strap must be used by a...

  • Page 27

    monitoring, and alerting function. If an environmental condition exceeds a threshold or if a system component fails, the IMM lights LEDs to help you diagnose the problem, records the error in the IMM event log, and alerts you to the problem. Optionally, the IMM also provides a virtual presence capab...

  • Page 28

    enclosure. This policy is enforced when the initial power is applied to the IBM NeXtScale n1200 Enclosure or when a compute node is inserted into the IBM NeXtScale n1200 Enclosure. The following settings for this policy are available:
    – Basic power management
    – Power module redundancy
    – Power module...

  • Page 29

    Major components of the storage tray
    Use this information to locate the major components on the storage tray. The storage tray is installed on the top of a compute node. Each storage tray supports up to seven 3.5-inch LFF SATA hard disk drives. The ServeRAID adapter can be connected from compute node...

  • Page 30

    Major components of the GPU tray
    Use this information to locate the major components on the GPU tray. The GPU tray is installed on the top of a compute node. Each GPU tray supports up to two full-height, full-length graphics processing unit (GPU) enclosures. The following illustration shows the maj...

  • Page 31

    Power, controls, and indicators
    Use this information to view power features, turn on and turn off the compute node, and view the functions of the controls and indicators.
    Compute node controls, connectors, and LEDs
    Use this information for details about the controls, connectors, and LEDs. The follow...

  • Page 32

    Power button/LED: When the compute node is connected to power through the IBM NeXtScale n1200 Enclosure, press this button to turn on or turn off the compute node. This button is also the power LED. This green LED indicates the power status of the compute node:
    • Flashing rapidly: The LED flashes ra...

  • Page 33

    Ethernet link activity/status LED: When any of these LEDs is lit, it indicates that the server is transmitting to or receiving signals from the Ethernet LAN that is connected to the Ethernet port that corresponds to that LED.
    Management connector: Use this connector to connect the server to a network...

  • Page 34

    1. Wait until the power LED on the compute node flashes slowly before you press the power button. While the IMM2 in the compute node is initializing and synchronizing with the chassis management module, the power LED flashes rapidly, and the power button on the compute node does not respond. This pr...
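
    The wait-then-press rule in this step can be sketched as a small polling loop. This is an illustration only, under stated assumptions: the PowerLed enum, ready_to_power_on, and wait_until_ready are hypothetical names, read_led stands in for whatever mechanism reports the LED state, and the 120-second timeout is an arbitrary placeholder rather than a figure from the guide. Only the two LED states described in the text are modeled.

```python
import time
from enum import Enum
from typing import Callable


class PowerLed(Enum):
    # Only the two states described in the text are modeled here.
    FLASHING_RAPIDLY = "flashing rapidly"  # IMM2 still initializing; button ignored
    FLASHING_SLOWLY = "flashing slowly"    # safe to press the power button


def ready_to_power_on(led: PowerLed) -> bool:
    """The power button responds only once the LED flashes slowly."""
    return led is PowerLed.FLASHING_SLOWLY


def wait_until_ready(
    read_led: Callable[[], PowerLed],
    timeout_s: float = 120.0,   # hypothetical timeout, not from the guide
    poll_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the LED state until it flashes slowly or the timeout elapses."""
    waited = 0.0
    while waited <= timeout_s:
        if ready_to_power_on(read_led()):
            return True
        sleep(poll_s)
        waited += poll_s
    return False
```

    The sleep function is injectable so that the loop can be exercised without real delays.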

  • Page 35

    System-board external connectors
    The following illustration shows the external connectors on the system board.
    [Illustration labels: power distribution board connector; DIMM 1; DIMM 2; DIMM 3; DIMM 4; PCI riser connector 2; DIMM 8; DIMM 7; microprocessor 1; microprocessor 2; operator information panel; SATA connector; LED signal con...]

  • Page 36

    System-board switches and jumpers
    The following illustration shows the location and description of the switches and jumpers.
    [Figure 9. External connectors on system board. Labels: Ethernet 1 connector; Ethernet 2 connector; management connector; KVM connector]

  • Page 37

    Note: If there is a clear protective sticker on the top of the switch blocks, you must remove and discard it to access the switches.
    Note: 1. Before you change any switch settings or move any jumpers, turn off the server. Review the information in “Safety” on page vii, “Installation guidelines” on p...

  • Page 38

    Any error LED can be lit after AC power has been removed from the system-board tray so that you can isolate a problem. After AC power has been removed from the system-board tray, power remains available to these LEDs for up to 90 seconds. To view the error LEDs, press and hold the light path button ...

  • Page 39

    Chapter 2. Configuration information and instructions
    This chapter provides information about updating the firmware and using the configuration utilities.
    Updating the firmware
    Use this information to update the system firmware.
    Important: 1. Some cluster solutions require specific code levels or co...

  • Page 40

    When you replace a device in the server, you might have to update the firmware that is stored in memory on the device or restore the pre-existing firmware from a CD or DVD image. The following list indicates where the firmware is stored:
    • UEFI firmware is stored in ROM on the system board.
    • IMM2 f...

  • Page 41

    – Remotely viewing video with graphics resolutions up to 1600 x 1200 at 75 Hz, regardless of the system state
    – Remotely accessing the server, using the keyboard and mouse from a remote client
    – Mapping the CD or DVD drive, diskette drive, and USB flash drive on a remote client, and mapping ISO and ...

  • Page 42

    Table 1. Server configuration and applications for configuring and managing RAID arrays (continued)
    Server configuration | RAID array configuration (before operating system is installed) | RAID array management (after operating system is installed)
    ServeRAID-C100 | HII | MegaRAID Storage Manager (MSM), Mega...

  • Page 43

    • Includes an online readme file with links to tips for your hardware and operating-system installation
    Setup and configuration overview
    Use this information for the ServerGuide setup and configuration. When you use the ServerGuide Setup and Installation CD, you do not need setup diskettes. You can ...

  • Page 44

    • View and change assignments for devices and I/O ports
    • Set the date and time
    • Set and change passwords
    • Set the startup characteristics of the server and the order of startup devices
    • Set and change settings for advanced hardware features
    • View, set, and change settings for power-management f...

  • Page 45

    Select this choice to view information about the UEFI 1.10 and UEFI 2.0 compliant adapters and drivers installed in the server.
    – Processors: Select this choice to view or change the processor settings.
    – Memory: Select this choice to view or change the memory settings.
    – Devices and I/O Ports: Select ...

  • Page 46

    - Network Configuration: Select this choice to view the system management network interface port, the IMM MAC address, the current IMM IP address, and host name; define the static IMM IP address, subnet mask, and gateway address; specify whether to use the static IP address or have DHCP assign the IM...

  • Page 47

    Wake on LAN functions. For example, you can define a startup sequence that checks for a disc in the CD-RW/DVD drive, then checks the hard disk drive, and then checks a network adapter. This choice is on the full Setup utility menu only.
    • Boot Manager: Select this choice to view, add, delete, or chan...

  • Page 48

    Select this choice to save the changes that you have made in the settings.
    • Restore Settings: Select this choice to cancel the changes that you have made in the settings and restore the previous settings.
    • Load Default Settings: Select this choice to cancel the changes that you have made in the sett...

  • Page 49

    • Change the position of the power-on password switch (enable switch 3 of the system board switch block (SW4)) to bypass the password check (see “System-board switches and jumpers” on page 18 for more information).
    Attention: Before you change any switch settings or move any jumpers, turn off the ser...

  • Page 50

    The power-on password override switch does not affect the administrator password.
    Administrator password: If an administrator password is set, you must type the administrator password for access to the full Setup utility menu. You can use any combination of 6 to 20 printable ASCII characters for the...
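
    The password rule stated above (6 to 20 printable ASCII characters) is easy to pre-check before typing a candidate into the Setup utility. This is a minimal sketch, assuming that "printable ASCII" means code points 0x20 through 0x7E, space included; the guide does not spell out the exact range, and the function name is hypothetical.

```python
MIN_LEN = 6
MAX_LEN = 20


def is_valid_setup_password(candidate: str) -> bool:
    """Check the 6-to-20 printable-ASCII rule quoted in the text.

    Assumption: printable ASCII is taken as 0x20 (space) through 0x7E (~).
    """
    if not (MIN_LEN <= len(candidate) <= MAX_LEN):
        return False
    return all(0x20 <= ord(ch) <= 0x7E for ch in candidate)
```

    A five-character string or one containing a non-ASCII letter such as "ä" is rejected; any 6-to-20 character string drawn from the assumed range is accepted.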

  • Page 51

    Changing the Power Policy option to the default settings after loading UEFI defaults
    The default settings for the Power Policy option are set by the IMM2. To change the Power Policy option to the default settings, complete the following steps.
    1. Turn on the server. Note: Approximately 20 seconds af...

  • Page 52

    • Environmental monitor with fan speed control for temperature, voltages, fan failure, power supply failure, and power backplane failure.
    • Intelligent Platform Management Interface (IPMI) Specification V2.0 and Intelligent Platform Management Bus (IPMB) support.
    • Invalid system configuration (conf...

  • Page 53

    • Uploading a diskette image to the IMM memory and mapping it to the server as a virtual drive
    The blue-screen capture feature captures the video display contents before the IMM restarts the server when the IMM detects an operating-system hang condition. A system administrator can use the blue-scree...

  • Page 54

    4. On the next screen, select Integrated Management Module.
    5. On the next screen, select Network Configuration.
    6. Find the IP address and write it down.
    7. Exit from the Setup utility.
    Logging on to the web interface
    Use this information to log on to the web interface. To log on to the IMM web i...

  • Page 55

    Note: Approximately 5 to 10 seconds after the server is connected to power, the power-control button becomes active.
    2. When the prompt Setup is displayed, press F1.
    3. From the Setup utility main menu, select Boot Manager.
    4. Select Add Boot Option; then, select Generic Boot Option > Embedded Hy...

  • Page 56

    You can activate the Features on Demand (FoD) software upgrade key for RAID that is integrated in the integrated management module. For more information and instructions for activating the Features on Demand RAID software key, see the IBM Features on Demand User’s Guide. To download the document, g...

  • Page 57

    Note: Changes are made periodically to the IBM website. The actual procedure might vary slightly from what is described in this document.
    Installing a newer version
    To locate and install a newer version of IBM Systems Director, complete the following steps:
    1. Check for the latest version of IBM Sys...

  • Page 58

    The ASU is an online tool that supports several operating systems. Make sure that you download the version for your operating system. You can download the ASU from the IBM website. To download the ASU and update the UUID, complete the following steps.
    Note: Changes are made periodically to the IBM ...

  • Page 59

    imm_password: The IMM account password (1 of 12 accounts). The default value is PASSW0RD (with a zero 0 not an O).
    Note: If you do not specify any of these parameters, ASU will use the default values. When the default values are used and ASU is unable to access the IMM using the online authenticated ...

  • Page 60

    imm_password: The IMM account password (1 of 12 accounts). The default value is PASSW0RD (with a zero 0 not an O).
    The following commands are examples of using the userid and password default values and not using the default values:
    Example that does not use the userid and password default values: as...

  • Page 61

    – device.cat
    • For Linux based operating systems:
    – cdc_interface.sh
    4. After you install ASU, type the following commands to set the DMI:
    asu set SYSTEM_PROD_DATA.SysInfoProdName [access_method]
    asu set SYSTEM_PROD_DATA.SysInfoSerialNum [access_method]
    asu set SYSTEM_PROD_DATA.SysEncloseAssetTag [a...

  • Page 62

    asu set SYSTEM_PROD_DATA.SysInfoSerialNum
    asu set SYSTEM_PROD_DATA.SysEncloseAssetTag
    • Online KCS access (unauthenticated and user restricted): You do not need to specify a value for access_method when you use this access method. The KCS access method uses the IPMI/KCS interface. This method requir...

  • Page 63

    --user --password
    Examples that do use the userid and password default values:
    asu set SYSTEM_PROD_DATA.SysInfoProdName --host
    asu set SYSTEM_PROD_DATA.SysInfoSerialNum --host
    asu set SYSTEM_PROD_DATA.SysEncloseAssetTag --host
    • Bootable media: You can also build a bootable media using the applicati...
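
    The asu set invocations above follow one pattern, so the argument list can be assembled programmatically. This is a sketch only, under several assumptions: the helper name build_asu_set is hypothetical; the mixed-case spelling of the setting names and the position of the value (directly after the setting name) are assumed, because the placeholders are missing from this summary; and the host, user, and password values in the usage note are invented placeholders. Only the set command and the --host, --user, and --password options that appear in the text are used.

```python
from typing import List, Optional

# Setting names as they appear in the text (letter case is an assumption).
PROD_NAME = "SYSTEM_PROD_DATA.SysInfoProdName"
SERIAL_NUM = "SYSTEM_PROD_DATA.SysInfoSerialNum"
ASSET_TAG = "SYSTEM_PROD_DATA.SysEncloseAssetTag"


def build_asu_set(
    setting: str,
    value: str,
    host: Optional[str] = None,
    user: Optional[str] = None,
    password: Optional[str] = None,
) -> List[str]:
    """Build an argument list for an 'asu set' call.

    With no host, user, or password, no access method is given, which
    matches the KCS case described in the text. Omitting user and
    password leaves ASU to fall back on its default values.
    """
    argv = ["asu", "set", setting, value]
    if host is not None:
        argv += ["--host", host]
    if user is not None:
        argv += ["--user", user]
    if password is not None:
        argv += ["--password", password]
    return argv
```

    For example, build_asu_set(SERIAL_NUM, "EXAMPLE-SERIAL", host="192.0.2.10") yields a list that can be passed to subprocess.run on a system where the asu executable is installed; nothing here runs the tool itself.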

  • Page 65

    Chapter 3. Troubleshooting
    This chapter describes the diagnostic tools and troubleshooting information that are available to help you solve problems that might occur in the server. If you cannot diagnose and correct a problem by using the information in this chapter, see Appendix D, “Getting help an...

  • Page 66

    To download the latest version of DSA code and the Dynamic System Analysis Installation and User's Guide, go to http://www.ibm.com/support/entry/portal/docdisplay?lndocid=serv-dsa.
    4. Check for and apply code updates. Fixes or workarounds for many problems might be available in updated UEFI firmwa...

  • Page 67

    b. Make sure that the server, operating system, and software are installed and configured correctly. Many configuration problems are caused by loose power or signal cables or incorrectly seated adapters. You might be able to solve the problem by turning off the server, reconnecting cables, reseating...

  • Page 68

    Checkout procedure
    The checkout procedure is the sequence of tasks that you should follow to diagnose a problem in the server.
    About the checkout procedure
    Before you perform the checkout procedure for diagnosing hardware problems, review the following information:
    • Read the safety information that...

  • Page 69

    Performing the checkout procedure
    Use this information to perform the checkout procedure. To perform the checkout procedure, complete the following steps:
    1. Is the server part of a cluster?
    • No: Go to step 2.
    • Yes: Shut down all failing servers that are related to the cluster. Go to step 2.
    2. Co...

  • Page 70

    The integrated management module II (IMM2) combines service processor functions, video controller, and remote presence and blue-screen capture features in a single chip. The IMM provides advanced service-processor control, monitoring, and alerting function. If an environmental condition exceeds a th...

  • Page 71

    – DSA Preboot: The DSA Preboot diagnostic program is stored in the integrated USB memory on the server. DSA Preboot collects and analyzes system information to aid in diagnosing server problems, as well as offering a rich set of diagnostic tests of the major components of the server. DSA Preboot collects...

  • Page 72

    The following illustration shows the locations of the power-supply LEDs on the AC power supply. The following table describes the problems that are indicated by various combinations of the power-supply LEDs on an AC power supply and suggested actions to correct the detected problems. AC power-supply...

  • Page 73

    AC power-supply LEDs: AC | DC | Error (!) | Description | Action | Notes
    On | Off | Off | Power supply not fully seated, faulty system board, or the power supply has failed. | 1. Reseat the power supply. 2. Follow actions in “Power problems” on page 72. 3. Follow actions in “Solving power problems” on page 75 until th...

  • Page 74

    log from the Setup utility (see “Starting the Setup utility” on page 26). For more information about POST error codes, see Appendix B, “UEFI (POST) error codes,” on page 309.
    • System-event log: This log contains POST and system management interrupt (SMI) events and all events that are generated by ...

  • Page 75

    If you have installed Dynamic System Analysis (DSA) Portable, you can use it to view the system-event log (as the IPMI event log), or the IMM event log (as the ASM event log), the operating-system event logs, or the merged DSA log. You can also use DSA Preboot to view these logs, although you must r...

  • Page 76

    Table 3. Methods for viewing event logs (continued)
    Condition | Action
    The server is hung, and no communication can be made with the IMM. | • If DSA Preboot is installed, restart the server and press F2 to start DSA Preboot and view the event logs (see “Running DSA Preboot diagnostic programs” on page 6...

  • Page 77

    • Kernel modules (available in DSA Portable only)
    • Light path diagnostics status
    • Network interfaces and settings
    • Performance data and details about processes that are running
    • RAID controller configuration
    • Service processor (integrated management module) status and configuration
    • System con...

  • Page 78

    – Checkpoint panel
    – I2C bus
    – SAS and SATA drives
    If you are unable to restart the server or if you need comprehensive diagnostics, use DSA Preboot. For more information and to download the utilities, go to http://www.ibm.com/support/entry/portal/docdisplay?lndocid=serv-dsa.
    Running DSA Preboot di...

  • Page 79

    Aborted: The test could not proceed because of the server configuration. Additional information concerning test failures is available in the extended diagnostic results for each test.
    Viewing the test log results and transferring the DSA collection
    Use this information to view the test log results a...

  • Page 80

    Troubleshooting by symptom
    Use the troubleshooting tables to find solutions to problems that have identifiable symptoms. If you cannot find a solution to the problem in these tables, see DSA messages for information about testing the server and “Running DSA Preboot diagnostic programs” on page 60 fo...

  • Page 81

    Table 4. Hard disk drive symptoms and actions (continued)
    • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved.
    • If an action step is preceded by “(Trained technician only),” that step must be performed only by a trained technician.
    • ...

  • Page 82

    Intermittent problems
    Use this information to solve intermittent problems.
    • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved.
    • If an action step is preceded by “(Trained technician only),” that step must be performed only by a trai...

  • Page 83

    • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved.
    • If an action step is preceded by “(Trained technician only),” that step must be performed only by a trained technician.
    • Go to the IBM support website to check for technical i...

  • Page 84

    Memory problems
    Use this information to solve memory problems.
    • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved.
    • If an action step is preceded by “(Trained technician only),” that step must be performed only by a trained technici...

  • Page 86

    Monitor and video problems
    Use this information to solve monitor and video problems. Some IBM monitors have their own self-tests. If you suspect a problem with your monitor, see the documentation that comes with the monitor for instructions for testing and adjusting the monitor. If you cannot diagno...

  • Page 88

    Network connection problems
    Use this information to solve network connection problems.
    • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved.
    • If an action step is preceded by “(Trained technician only),” that step must be performed on...

  • Page 90

    Power problems
    Use this information to solve power problems.
    • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved.
    • If an action step is preceded by “(Trained technician only),” that step must be performed only by a trained technician...

  • Page 93

    Software problems
    Use this information to solve software problems.
    • Follow the suggested actions in the order in which they are listed in the Action column until the problem is solved.
    • If an action step is preceded by “(Trained technician only),” that step must be performed only by a trained tech...

  • Page 94

    Power problems can be difficult to solve. For example, a short circuit can exist anywhere on any of the power distribution buses. Usually, a short circuit will cause the power subsystem to shut down because of an overcurrent condition. To diagnose a power problem, use the following general procedure...

  • Page 95

    Table 5. Components associated with power rail errors (continued)
    Pwr rail error in the IMM event log | Components
    Pwr rail H error |
    • Hard disk drive power cable
    • Hard disk drives
    • Hard disk drive backplane
    or
    • PCI adapter power cable
    • Adapter installed in PCI riser-card assembly 2
    • PCI riser-car...

  • Page 96

    – The Ethernet transmit/receive activity LED is lit when the Ethernet controller sends or receives data over the Ethernet network. If the Ethernet transmit/receive activity LED is off, make sure that the hub and network are operating and that the correct device drivers are installed.
    • Check the LAN act...

  • Page 97

    Problem determination tips
    Because of the variety of hardware and software combinations that can be encountered, use the following information to assist you in problem determination. If possible, have this information available when requesting assistance from IBM. The model name and serial number are loc...

  • Page 98

    • IMM firmware level
    • Adapters and attachments, in the same locations
    • Address jumpers, terminators, and cabling
    • Software versions and levels
    • Diagnostic program type and version level
    • Configuration option settings
    • Operating-system control-file setup
    See Appendix D, “Getting help and techni...

  • Page 99

    4. Locate the UEFI boot backup jumper (JP2) on the system board.
    5. Move the UEFI boot backup jumper (JP2) from pins 1 and 2 to pins 2 and 3 to enable the UEFI recovery mode.
    6. Reinstall the server cover; then, reconnect all power cords.
    7. Restart the server. The system begins the power-on self-te...

  • Page 100

    12. Reinstall the cover (see “Installing the compute node cover” on page 111).
    13. Reconnect the power cord and any cables that you removed.
    14. Restart the server. The system begins the power-on self-test (POST). If this does not recover the primary bank, continue with the following steps.
    15. Remo...

  • Page 101

    Nx-boot failure
    Configuration changes, such as added devices or adapter firmware updates, and firmware or application code problems can cause the server to fail POST (the power-on self-test). If this occurs, the server responds in either of the following ways:
    • The server restarts automatically and...

  • Page 103

    Chapter 4. Parts listing, IBM NeXtScale nx360 M4 Compute Node Type 5455
    The following replaceable components are available for the IBM NeXtScale nx360 M4 Compute Node Type 5455 server, except as specified otherwise in “Replaceable s...

  • Page 104

    The following table lists the part numbers for the server replaceable components.
    Table 6. Parts listing, Type 5455
    Columns: Index; Description; CRU part number (Tier 1); CRU part number (Tier 2)
    3; 1.8-inch SSD cable assembly (software RAID); 00AM452
    3; 1.8-inch SSD cable assembly (hardware RAID); 00AM453
    4; 2...

  • Page 105

    Table 6. Parts listing, Type 5455 (continued)
    Columns: Index; Description; CRU part number (Tier 1); CRU part number (Tier 2)
    7; Microprocessor, Intel Xeon E5-2609 v2 2.5 GHz, 10 MB, 1333 MHz, 80 W (4 core); 00Y2779
    7; Microprocessor, Intel Xeon E5-2620 v2 2.1 GHz, 15 MB, 1600 MHz, 80 W (6 core); 00Y2780
    7; Mic...

  • Page 106

    Table 6. Parts listing, Type 5455 (continued)
    Columns: index; description; CRU part number (Tier 1); CRU part number (Tier 2)

    11    Hard disk drive, 3.5-inch 2 TB, 6 Gbps 512e near-line SATA    00FN124
    11    Hard disk drive, 3.5-inch 3 TB, 6 Gbps 512e near-line SATA    00FN139
    11    Hard disk drive, 3.5-inch 4 TB, 6 Gbps...

  • Page 107

    Table 6. Parts listing, Type 5455 (continued)
    Columns: index; description; CRU part number (Tier 1); CRU part number (Tier 2)

    Node planar tray    00KA917
    Configuration cable    00AM460
    Memory key, blank USB for VMware ESXi downloads    42D0545
    T8 Torx screwdriver (provided on the back of the chassis)    00FK488
    Therma...

  • Page 108

    If you need help with your order, call the toll-free number that is listed on the retail parts page, or contact your local IBM representative for assistance.

    Power cords

    For your safety, a power cord with a grounded attachment plug is provided to use with this product. To avoid electrical shock, alw...

  • Page 109

    Power cord part number used in these countries and regions 39m5151 abu dhabi, bahrain, botswana, brunei darussalam, channel islands, china (hong kong s.A.R.), cyprus, dominica, gambia, ghana, grenada, iraq, ireland, jordan, kenya, kuwait, liberia, malawi, malaysia, malta, myanmar (burma), nigeria, o...

  • Page 111

    Chapter 5. Removing and replacing components

    Use this information to remove and replace the server components. The types of replaceable components are:
    - Structural parts: Purchase and replacement of structural parts (components, such as chassis assembly, top cover, and bezel) is your responsibility...

  • Page 112

    - Read the safety information in “Safety” on page vii and “Handling static-sensitive devices” on page 95. This information will help you work safely.
    - Make sure that the devices that you are installing are supported. For a list of supported optional devices for the compute node, see http://www.Ibm....

  • Page 113

    And operating system support hot-swap capability, you can remove or install the component while the compute node is running. (orange can also indicate touch points on hot-swap components.) see the instructions for removing or installing a specific hot-swap component for any additional procedures tha...

  • Page 114

    - Handle the device carefully, holding it by its edges or its frame.
    - Do not touch solder joints, pins, or exposed circuitry.
    - Do not leave the device where others can handle and damage it.
    - While the device is still in its static-protective package, touch it to an unpainted metal surface on the ...

  • Page 115

    Attention:
    - To maintain proper system cooling, do not operate the IBM NeXtScale n1200 Enclosure without a compute node or node bay filler installed in each node bay.
    - When you remove the compute node, note the node bay number. Reinstalling a compute node into a different node bay from the one it w...

  • Page 116

    If you are installing a compute node model without an integrated ethernet controller, you must install a network interface adapter before you install the compute node in the chassis for management network communication. For a list of supported optional devices for the compute node, see http://www.Ib...

  • Page 117

    Table 9. Compute nodes supported (low-line AC input, with 900-watt power supply x6)
    Columns: microprocessor SKU (W); number of microprocessors; non-redundant or N+1 with OVS (note 1), N=5; N+1 redundant, N=5; N+N redundant, N=3; N+N redundant with OVS (note 1), N=3

    50    1    12    12    9    11
    50    2    12    12    6    10
    60    1    12    12    7    9
    60    2    12    9     5    7
    70    1    12...
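
The rows of Table 9 that are visible in this excerpt can be treated as a simple lookup. The following is a minimal sketch, not an IBM tool: the names `POLICIES`, `TABLE_9_ROWS`, and `max_compute_nodes` are hypothetical, the column order is read from the flattened table header, and rows that are truncated in the excerpt are deliberately left out.

```python
from typing import Dict, Tuple

# Redundancy policies, in the column order of Table 9 (names are hypothetical).
POLICIES = (
    "non_redundant_or_n_plus_1_ovs",  # non-redundant or N+1 with OVS, N=5
    "n_plus_1",                       # N+1 redundant, N=5
    "n_plus_n",                       # N+N redundant, N=3
    "n_plus_n_ovs",                   # N+N redundant with OVS, N=3
)

# (microprocessor SKU in watts, microprocessors per node) -> supported
# compute nodes per policy. Only rows visible in the excerpt are transcribed.
TABLE_9_ROWS: Dict[Tuple[int, int], Tuple[int, int, int, int]] = {
    (50, 1): (12, 12, 9, 11),
    (50, 2): (12, 12, 6, 10),
    (60, 1): (12, 12, 7, 9),
    (60, 2): (12, 9, 5, 7),
}


def max_compute_nodes(sku_watts: int, microprocessors: int, policy: str) -> int:
    """Look up the supported node count for one table row and one policy."""
    row = TABLE_9_ROWS.get((sku_watts, microprocessors))
    if row is None:
        raise KeyError("row is not part of the transcribed excerpt")
    # POLICIES.index raises ValueError for an unknown policy name.
    return row[POLICIES.index(policy)]
```

For example, `max_compute_nodes(50, 2, "n_plus_n")` reads the 50 W, two-microprocessor row under N+N redundancy. Always confirm a planned configuration against the full table in the guide.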

  • Page 118

    Note: 1. Ovs (oversubscription) of the power system allows for more efficient use of the available system power. Table 11. Compute nodes + two 130-watt 2 gpus supported (high-line ac input, with 1300-watt power supply x6) microprocessor sku (w) # of microprocessor(s) non- redundant or n+1 with ovs 1...

  • Page 119

    Table 12. Compute nodes + two 225-watt 2 gpus supported (high-line ac input, with 1300-watt power supply x6) (continued) microprocessor sku (w) # of microprocessor(s) non- redundant or n+1 with ovs 1 , n=5 n+1 redundant, n=5 n+n redundant, n=3 n+n redundant with ovs 1 , n=3 60 1 6 6 5 6 2 6 6 4 + 1 ...

  • Page 120

    Table 13. Compute nodes + two 235-watt 2 gpus supported (high-line ac input, with 1300-watt power supply x6) microprocessor sku (w) # of microprocessor(s) non- redundant or n+1 with ovs 1 , n=5 n+1 redundant, n=5 n+n redundant, n=3 n+n redundant with ovs 1 , n=3 50 1 6 6 5 + 1 microprocessor node 6 ...

  • Page 121

    Table 14. Compute nodes + two 300-watt 2 gpus supported (high-line ac input, with 1300-watt power supply x6) microprocessor sku (w) # of microprocessor(s) non- redundant or n+1 with ovs 1 , n=5 n+1 redundant, n=5 n+n redundant, n=3 n+n redundant with ovs 1 , n=3 50 1 6 6 4 + 2 microprocessor node 5 ...

  • Page 122

    Table 15. 1300-watt power supply supportability quantity of 1300-watt power supplies fpc power bank non-redundant n+1 redundant n+n redundant 2 support non-support 3 4 5 6 support note: when setting power redundancy through fpc after nodes are powered on, it is possible that the current power bank i...

  • Page 123

    Process takes approximately 90 seconds. The power led flashes rapidly, and the power button on the compute node does not respond until this process is complete. 5. Turn on the compute node (see “turning on the compute node” on page 15 for instructions). 6. Make sure that the power led on the compute...

  • Page 124

    4. Pull the storage tray out of the compute node. If you are instructed to return the storage tray, follow all packaging instructions, and use any packaging materials for shipping that are supplied to you. Installing a storage tray into a compute node use this information to install a storage tray i...

  • Page 125

    Table 16. Hard disk drive configuration for storage tray legend: hdd=> hard disk drive internal storage node hdd quantity 7 6 5 4 3 2 1 0 drive bay 0 hdd hdd hdd hdd hdd hdd hdd filler drive bay 1 hdd hdd hdd hdd hdd hdd filler filler drive bay 2 hdd hdd hdd hdd hdd filler filler filler drive bay 3 ...

  • Page 126

    5. Hold the front of the compute node and slide the storage tray forward to the closed position, until it clicks into place. Removing a gpu tray from a compute node use this information to remove a gpu tray from a nextscale nx360 m4 compute node. Before you remove a gpu tray from a compute node, com...

  • Page 127

    4. Pull the gpu tray out of the compute node. If you are instructed to return the gpu tray, follow all packaging instructions, and use any packaging materials for shipping that are supplied to you. Installing a gpu tray into a compute node use this information to install a gpu tray in a nextscale nx...

  • Page 128

    4. Hold the front of the compute node and slide the gpu tray forward to the closed position, until it clicks into place. 5. Reinstall the pci riser-cage assembly in the gpu tray (see “replacing a pci riser-cage assembly in the gpu tray” on page 157). 6. Replace the cover (see “installing the compute...

  • Page 129

    Caution: the following label indicates a hot surface nearby. Statement 21 caution: hazardous energy is present when the compute node is connected to the power source. Always replace the compute node cover before installing the compute node. To remove the compute node cover, complete the following st...

  • Page 130

    If you are replacing (installing) a cover, be sure to have the system service label kit available for use during the procedure, (see chapter 4, “parts listing, ibm nextscale nx360 m4 compute node type 5455,” on page 85). Attention: you cannot insert the compute node into the ibm nextscale n1200 encl...

  • Page 131

    After you install the compute node cover, install the compute node into the chassis (see “installing a compute node in a chassis” on page 97 for instructions). Removing the air baffle use this information to remove the air baffle. To remove the air baffle, complete the following steps: 1. Read the s...

  • Page 132

    Replacing the air baffle use this information to install the air baffle. To install the air baffle, complete the following steps: 1. Read the safety information that begins on “safety” on page vii and “installation guidelines” on page 93. 2. Turn off the compute node and peripheral devices and disco...

  • Page 133

    8. Turn on the peripheral devices and the compute node. Removing a raid adapter battery holder use this information to remove a raid adapter battery holder. If a raid adapter battery is installed remotely near the fan cage and you need to replace it, complete the following steps: 1. Read the safety ...

  • Page 134

    4. Replace the cover (see “Installing the compute node cover” on page 111).
    5. Slide the server into the rack.
    6. Reconnect the power cords and all external cables, and turn on the server and peripheral devices.

    Removing the PCI riser filler

    Use this information to remove the PCI riser filler. To ...

  • Page 135

    Replacing the pci riser filler use this information to install the pci riser filler. To install the pci riser filler, complete the following steps: 1. Read the safety information that begins on “safety” on page vii and “installation guidelines” on page 93. 2. Turn off the compute node and peripheral...

  • Page 136

    5. Remove the filler from the gpu tray and set it aside. Attention: for proper cooling and airflow, replace the filler before you turn on the compute node. Operating the compute node with the filler removed might damage gpu tray components. Replacing the filler on to the gpu tray use this informatio...

  • Page 137

    5. Reinstall the cover (see “installing the compute node cover” on page 111). 6. Slide the compute node into the rack. 7. Reconnect the power cords and any cables that you removed. 8. Turn on the peripheral devices and the compute node. Removing the front handle use this information to remove the fr...

  • Page 138

    1. Locate the screw that attaches the handle to the compute node. 2. Using a phillips screwdriver, remove the screw from the front handle and save the screw in a safe place. Use the same screw when you install a front handle. If you are instructed to return the front handle, follow all packaging ins...

  • Page 139

    1. Orient the front handle so that the blue release latch is toward the middle of the compute node. 2. Align the hole in the handle with the hole on the compute node where the handle is installed. 3. Using a phillips screwdriver, install the phillips #2 screw that secures the handle. Install the scr...

  • Page 140

    Figure 37. Removing a hard disk drive cage (3.5-inch). Callouts: 3.5-inch hard disk drive cage screws.
    Figure 38. Removing a hard disk drive cage (2.5-inch). Callouts: 2.5-inch hard disk drive cage screws, pin, pin hole, back of chassis, T8 Torx screwdriver.

  • Page 141

    1. Remove the cover (see “removing the compute node cover” on page 110). 2. Remove the easy-swap hard disk drives and hard disk drive bay fillers (see “removing and installing drives” on page 145). 3. Remove the hard disk drive backplate (see “removing the hard disk drive backplate” on page 142). 4....

  • Page 142

    Figure 40. Installing a hard disk drive cage (3.5-inch). Callouts: 3.5-inch hard disk drive cage screws.
    Figure 41. Installing a hard disk drive cage (2.5-inch). Callouts: 2.5-inch hard disk drive cage screws, pin, pin hole, back of chassis, T8 Torx screwdriver.

  • Page 143

    1. Remove the cover (see “removing the compute node cover” on page 110). 2. Position the cage in the bezel at an angle and rotate the cage into position on the system board. 3. Align the cage with the screw holes on the system board. 4. Using a phillips (for 3.5-inch hard disk drive cage) or t8 torx...

  • Page 144

    Before you remove the operator information panel, read “safety” on page vii and “installation guidelines” on page 93. To remove the operator information panel, complete the following steps. 1. Use a t8 torx (part number 00fk488, provided on the back of the chassis) screwdriver to remove the screw fr...

  • Page 145

    If you are instructed to return the operator information panel, follow all packaging instructions, and use any packaging materials for shipping that are supplied to you. Installing the operator information panel use this information to install the operator information panel. Before you install the o...

  • Page 146

    3. Install the connector of the operator information panel on the system board. 4. Use a t8 torx (part number 00fk488, provided on the back of the chassis) screwdriver to install the screw to the operator information panel. Removing the power paddle card from the gpu tray use this information to rem...

  • Page 147

    8. If you are instructed to return the power paddle card, follow all packaging instructions, and use any packaging materials for shipping that are supplied to you. Replacing the power paddle card on to the gpu tray use this information to install the power paddle card on to the gpu tray. To install ...

  • Page 148

    Removing the system battery use this information to remove the cmos battery. The following notes describe information that you must consider when replacing the battery. V ibm has designed this product with your safety in mind. The lithium battery must be handled correctly to avoid possible danger. I...

  • Page 149

    b. Use one finger to tilt the battery horizontally out of its socket, pushing it away from the socket. Attention: Do not tilt or push the battery with excessive force.
    c. Use your thumb and index finger to lift the battery from the socket. Attention: Do not lift the battery by using excessive ...

  • Page 150

    To install the replacement system battery, complete the following steps: 1. Follow any special handling and installation instructions that come with the replacement battery. 2. Read the safety information that begins on “safety” on page vii and “installation guidelines” on page 93. 3. Turn off the s...

  • Page 151

    5. Carefully open the retaining clips on each end of the dimm connector and remove the dimm. Attention: to avoid breaking the retaining clips or damaging the dimm connectors, open and close the clips gently. 6. If you are instructed to return the dimm, follow all packaging instructions, and use any ...

  • Page 152

    X8 = x8 organization x16 = x16 organization - v is the sdram and support component supply voltage (vdd) v blank = 1.5 v specified v l = 1.35 v specified, 1.5 v operable note: values for these voltages are “specified” which means the device characteristics such as timing are supported at this voltage...

  • Page 153

    Speed is set to Max performance and LV-DIMM power is set to Enhance performance mode. The 1.35 V UDIMMs, RDIMMs, or LRDIMMs will function at 1.5 V.
    - The compute node supports a maximum of 8 dual-rank UDIMMs, with one UDIMM per channel.
    - The compute node supports a maximu...

  • Page 154

    Dimm installation sequence depending on the server model, the server may come with a minimum of one 4 gb dimm installed in slot 4. When you install additional dimms, install them in the order shown in the following table to optimize system performance. In general, all channels on the memory interfac...

  • Page 155

    Table 18. Normal mode DIMM installation sequence

    Number of installed microprocessors    DIMM connector population sequence
    One microprocessor installed           4, 3, 1, 2
    Two microprocessors installed          4, 5, 3, 6, 1, 8, 2, 7

    Memory mirrored channel

    Memory mirrored channel mode replicates and stores data on two p...
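
The normal-mode population order in Table 18 can be expressed as a short lookup, which may help when scripting an installation checklist. This is a minimal sketch: the connector numbers are copied from Table 18, while the names `NORMAL_MODE_SEQUENCE` and `connectors_to_populate` are hypothetical and not part of any IBM utility. It does not cover memory mirrored channel mode.

```python
from typing import Dict, List

# Normal-mode DIMM connector population order from Table 18, keyed by the
# number of installed microprocessors.
NORMAL_MODE_SEQUENCE: Dict[int, List[int]] = {
    1: [4, 3, 1, 2],
    2: [4, 5, 3, 6, 1, 8, 2, 7],
}


def connectors_to_populate(microprocessors: int, dimm_count: int) -> List[int]:
    """Return the connectors to fill, in order, for the given DIMM count."""
    if microprocessors not in NORMAL_MODE_SEQUENCE:
        raise ValueError("expected 1 or 2 installed microprocessors")
    sequence = NORMAL_MODE_SEQUENCE[microprocessors]
    if not 0 <= dimm_count <= len(sequence):
        raise ValueError("DIMM count exceeds the connectors listed in Table 18")
    # The first dimm_count entries of the sequence are the connectors to use.
    return sequence[:dimm_count]
```

For example, `connectors_to_populate(2, 3)` returns the first three connectors of the two-microprocessor sequence.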

  • Page 156

    6. Touch the static-protective package that contains the DIMM to any unpainted metal surface on the outside of the server. Then, remove the DIMM from the package.
    7. Turn the DIMM so that the alignment slot aligns correctly with the alignment tab.
    8. Insert the DIMM into the connector by aligning the...

  • Page 157

    3. Carefully lay the compute node on a flat, static-protective surface, orienting the compute node with the bezel pointing toward you. To remove the optional 3.5-inch hard disk drive hardware raid cage, complete the following steps. 1. Remove the cover (see “removing the compute node cover” on page ...

  • Page 158

    3. Remove the easy-swap hard disk drive #7 (see “removing a 3.5-inch hard disk drive” on page 145). 4. Using a phillips screwdriver, remove the four screws from the cage and rotate the cage from under the bezel; then, remove the cage from the compute node at an angle. If you are instructed to return...

  • Page 159

    1. Remove the cover (see “removing the compute node cover” on page 110). 2. Position the hardware raid cage in the bezel at an angle and rotate the cage into position on the system board. 3. Align the cage with the screw holes on the system board. 4. Using a phillips screwdriver, insert the 4 screws...

  • Page 160

    After you install the hard disk drive cage, complete the following steps: 1. Install the cover onto the compute node (see “installing the compute node cover” on page 111 for instructions). 2. Install the compute node into the chassis (see “installing a compute node in a chassis” on page 97 for instr...

  • Page 161

    1. Remove the cover (see “removing the compute node cover” on page 110). 2. Unlatch and slide out slightly the easy-swap hard disk drive and hard disk drive bay filler, (just enough to disengage the drive or filler). 3. Unlatch the release latch and lift out the hard disk drive backplate. If you are...

  • Page 162

    3. Carefully lay the compute node on a flat, static-protective surface, orienting the compute node with the bezel pointing toward you. To install the hard disk drive backplate, complete the following steps. 1. Remove the cover (see “removing the compute node cover” on page 110). 2. Align the backpla...

  • Page 163

    1. Install the cover onto the compute node (see “installing the compute node cover” on page 111 for instructions). 2. Install the compute node into the chassis (see “installing a compute node in a chassis” on page 97 for instructions). Removing and installing drives use this information to remove an...

  • Page 164

    If you are instructed to return the component or optional device, follow all packaging instructions, and use any packaging materials for shipping that are supplied to you. Installing a 3.5-inch hard disk drive use this information to install a 3.5-inch sas/sata hard disk drive. Before installing a 3...

  • Page 165

    8. Check the hard disk drive status indicator to make sure that the hard disk drive is operating correctly. After you replace a failed hard disk drive, the green activity led flashes as the disk spins up. The yellow led turns off after approximately 1 minute. If the new drive starts to rebuild, the ...

  • Page 166

    4. Remove the cover (see “Removing the compute node cover” on page 110).
    5. Pull the plunger of the 2.5-inch hard disk drive cage outward and rotate the cage upward.
    6. Gently push the latch outward until the latch hole releases the screw; then, remove the hard disk drive.
    7. Pull the p...

  • Page 167

    If you are instructed to return the component or optional device, follow all packaging instructions, and use any packaging materials for shipping that are supplied to you. Installing a 2.5-inch hard disk drive use this information to install a 2.5-inch hard disk drive. The following notes describe t...

  • Page 168

    6. Remove the filler panel, if one is present. 7. Touch the static-protective package that contains the disk drive to any unpainted metal surface on the server; then, remove the disk drive from the package. 8. Align the drive with the bay of the hard disk drive cage; then, carefully slide the drive ...

  • Page 169

    A. After you install the hard disk drive, check the disk drive status leds to verify that the hard disk drive is operating correctly. If the yellow hard disk drive status led is lit continuously, that drive is faulty and must be replaced. If the green hard disk drive activity led is flashing, the dr...

  • Page 170

    7. Pull the plunger of the 1.8-inch hard disk drive cage outward and rotate the cage downward until the cage snaps into place. If you are instructed to return the component or optional device, follow all packaging instructions, and use any packaging materials for shipping that are supplied to you. I...

  • Page 171

    Attention: static electricity that is released to internal server components when the server is powered-on might cause the server to halt, which might result in the loss of data. To avoid this potential problem, always use an electrostatic-discharge wrist strap or other grounding system when you wor...

  • Page 172

    Note: a. After you install the hard disk drive, check the disk drive status leds to verify that the hard disk drive is operating correctly. If the yellow hard disk drive status led is lit continuously, that drive is faulty and must be replaced. If the green hard disk drive activity led is flashing, ...

  • Page 173

    5. If an adapter is installed in the pci riser-cage assembly, disconnect any cables that are connected to the adapter. 6. Remove the adapter, if one is present, from the pci riser-cage assembly (see “removing an adapter/gpu adapter” on page 159). 7. Set the adapter and the pci riser-cage assembly as...

  • Page 174

    9. Replace the cover (see “installing the compute node cover” on page 111). 10. Slide the server into the rack. 11. Reconnect the power cords and any cables that you removed. 12. Turn on the peripheral devices and the server. Removing a pci riser-cage assembly in the gpu tray note: pci riser-cage br...

  • Page 175

    7. If a gpu adapter is installed in the pci riser-cage assembly, disconnect any cables that are connected to the adapter. 8. Remove the gpu adapter, if one is present, from the pci riser-cage assembly (see “removing an adapter/gpu adapter” on page 159). 9. Set the gpu adapter and the pci riser-cage ...

  • Page 176

    2. Turn off the server and peripheral devices and disconnect all power cords. 3. Remove the cover (see “removing the compute node cover” on page 110). 4. Install the gpu adapter in the new pci riser-cage assembly (see “replacing an adapter/gpu adapter” on page 160). 5. Remove the pci filler panel, i...

  • Page 177

    Removing an adapter/gpu adapter use this information to remove an adapter/gpu adapter. To remove an adapter/gpu adapter, complete the following steps: 1. Read the safety information that begins on “safety” on page vii and “installation guidelines” on page 93. 2. Turn off the server and peripheral de...

  • Page 178

    If you are instructed to return the adapter/gpu adapter, follow all packaging instructions, and use any packaging materials for shipping that are supplied to you. Replacing an adapter/gpu adapter the following notes describe the types of adapters that the server supports and other information that y...

  • Page 179

    Figure 88. Adapter installation. Callouts: PCIe adapter, PCI riser-card assembly.
    Figure 89. GPU adapter installation (with front PCI riser assembly). Callouts: GPU adapter, front PCI riser assembly, expansion slot covers, power connectors.
    Figure 90 (callouts: GPU adapter, rear PCI riser assembly, power connectors). GPU adapter installatio...

  • Page 180

    Attention: when you install an adapter/gpu adapter, make sure that the adapter/gpu adapter is correctly seated in the riser-cage assembly and that the riser-cage assembly is securely seated in the riser-cage connector on the system board before you turn on the server. An incorrectly seated adapter m...

  • Page 181

    1. Remove the cover (see “removing the compute node cover” on page 110). 2. Locate the usb connector on the system board (see “system-board internal connectors” on page 16). 3. Pull the usb flash drive out of the connector. If you are instructed to return the usb flash drive, follow all packaging in...

  • Page 182

    2. If the compute node is installed in an ibm nextscale n1200 enclosure, remove it (see “removing a compute node from a chassis” on page 96 for instructions). 3. Carefully lay the compute node on a flat, static-protective surface, orienting the compute node with the bezel pointing toward you. This c...

  • Page 183

    1. Install the cover onto the compute node (see “installing the compute node cover” on page 111 for instructions). 2. Install the compute node into the chassis (see “installing a compute node in a chassis” on page 97 for instructions). Removing and replacing tier 2 crus you may install a tier 2 cru ...

  • Page 184

    4. Remove the air baffle (see “removing the air baffle” on page 113). 5. Locate the microprocessor to be removed (see “system-board internal connectors” on page 16). 6. Remove the heat sink. Attention: do not touch the thermal material on the bottom of the heat sink. Touching the thermal material wi...

  • Page 185

    C. Open the microprocessor retainer. Attention: do not touch the microprocessor contacts. Contaminants on the microprocessor contacts, such as oil from your skin, can cause connection failures between the contacts and the socket. 8. Remove the microprocessor from the socket. A. Select the empty inst...

  • Page 186

    D. Lift the microprocessor out of the socket. 9. Install the new microprocessor (see “replacing a microprocessor and heat sink”). Attention: if you are replacing a microprocessor, use the empty installation tool that comes with the new microprocessor to remove the microprocessor. 10. If you do not i...

  • Page 187

    - Be extremely careful: the microprocessor socket contacts are very fragile. Do not touch the microprocessor socket contacts. Contaminants on the microprocessor contacts or microprocessor socket contacts, such as oil from your skin, can cause connection failures between the contacts and the socket. ...

  • Page 188

    - The microprocessor speeds are automatically set for this server; therefore, you do not have to set any microprocessor frequency-selection jumpers or switches.
    - If the thermal-grease protective cover (for example, a plastic cap or tape liner) is removed from the heat sink, do not touch the thermal...

  • Page 189

    A. Open the packaging that contains the new microprocessor installation tool assembly and carefully remove the installation tool assembly from the package. Note: do not touch the microprocessor contacts. Contaminants on the microprocessor contacts, such as oil from your skin, can cause connection fa...

  • Page 190

    Attention:
    - Do not press the microprocessor into the socket.
    - Make sure that the microprocessor is oriented and aligned correctly in the socket before you try to close the microprocessor retainer.
    - Do not touch the thermal material on the bottom of the heat sink or on top of the microprocessor. T...

  • Page 191

    Attention: when you handle static-sensitive devices, take precautions to avoid damage from static electricity. For details about handling these devices, see “handling static-sensitive devices” on page 95. 9. Close the microprocessor socket release levers and retainer: a. Close the microprocessor ret...

  • Page 192

    A. Remove the plastic protective cover from the bottom of the heat sink. B. Position the heat sink over the microprocessor. The heat sink is keyed to assist with proper alignment. C. Align and place the heat sink on top of the microprocessor in the retention bracket, thermal material side down. D. P...

  • Page 193

    Thermal grease the thermal grease must be replaced whenever the heat sink has been removed from the top of the microprocessor and is going to be reused or when debris is found in the grease. When you are installing the heat sink on the same microprocessor that it was removed from, make sure that the...

  • Page 194

    6. Install the heat sink onto the microprocessor as described in 10 on page 173. Removing the compute node use this information to remove the compute node. Note: 1. This procedure should be performed only by trained service technicians. 2. Before you replace the system board, make sure that you back...

  • Page 195

    See “system-board layouts” on page 16 for more information about the locations of the connectors, jumpers, and leds on the system board. To remove the compute node, complete the following steps: 1. Remove the cover (see “removing the compute node cover” on page 110). 2. Remove all of the installed c...

  • Page 196

    Installing the compute node use this information to install the compute node. Note: this procedure should be performed only by trained service technicians. Before you install the compute node, complete the following steps: 1. Read “safety” on page vii and “installation guidelines” on page 93. 2. If ...

  • Page 197

    Install all of the components in the following list that you removed from the old compute node onto the new compute node: v dimms (see “installing a memory module” on page 133). V air baffles (see “replacing the air baffle” on page 114). V hard disk drives and hard disk drive fillers (see “installin...

  • Page 198

    4. Update the universal unique identifier (uuid) and the vital product data (vpd). Use the advanced settings utility to update the uuid and vpd in the uefi-based compute node (see “updating the universal unique identifier (uuid)” on page 39). 5. Update the compute node with the latest firmware or re...

  • Page 199

    Cabling hard disk drive with serveraid sas/sata controller the internal routing and connectors for the hard disk drive with serveraid sas/sata controller. The following illustrations show the internal routing and connectors for the 3.5-inch, 2.5-inch, and 1.8-inch hard disk drive models respectively...

  • Page 200

    Notes: 0 6 1 2 3 4 5 7 figure 115. 3.5-inch hard disk drive with serveraid sata controller cable connection (no hardware raid support for hard disk drive #7) 0 6 1 2 3 4 5 7 figure 116. 3.5-inch hard disk drive with serveraid sata controller cable connection (hardware raid is supported for all hard ...

  • Page 201

    1. For 3.5-inch hard disk drive model, the serveraid sata controller is available only with the storage tray installed. 2. You must remove hard disk drive #6 (if there is one installed) before connecting/disconnecting the configuration cable. See “major components of the storage tray” on page 11 for...

  • Page 203

    Appendix a. Integrated management module ii (imm2) error messages this section details the integrated management module ii (imm2) error messages. When a hardware event is detected by the integrated management module ii (imm2) on the server, the integrated management module ii (imm2) logs that event ...

  • Page 204

    Cim information provides the prefix of the message id and the sequence number that is used by the cim message registry. Snmp trap id the snmp trap id that is found in the snmp alert management information base (mib). Automatically contact service if this field is set to yes , and you have enabled el...

  • Page 205

    User modifies the Ethernet port duplex setting.
    Severity: Info
    Alert category: None
    Serviceable: No
    CIM information: Prefix: IMM and ID: 0004
    SNMP trap ID:
    Automatically notify support: No
    User response: Information only; no action is required.

    40000005-00000000 Ethernet MTU setting modified from [a...

  • Page 206

    Automatically notify support: No
    User response: Information only; no action is required.

    4000000b-00000000 IP address of default gateway modified from [arg1] to [arg2] by user [arg3].
    Explanation: This message is for the use case where a user modifies the default gateway IP address of a management c...
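
The message texts in this appendix use numbered placeholders such as [arg1], [arg2], and [arg3]. The following minimal sketch shows how such a template could be filled in when post-processing exported log text. The function name `render_message` and the sample values are hypothetical; the IMM2 performs this substitution itself, and this sketch makes no claim about how it does so internally.

```python
import re
from typing import Sequence

# Matches placeholders of the form [arg1], [arg2], ...
PLACEHOLDER = re.compile(r"\[arg(\d+)\]")


def render_message(template: str, args: Sequence[str]) -> str:
    """Replace each [argN] placeholder with the N-th value (1-based)."""

    def substitute(match: "re.Match") -> str:
        index = int(match.group(1)) - 1
        if index < 0 or index >= len(args):
            raise ValueError("no value supplied for " + match.group(0))
        return str(args[index])

    return PLACEHOLDER.sub(substitute, template)


# Template text taken from event 4000000b-00000000; the values are made up.
template = "IP address of default gateway modified from [arg1] to [arg2] by user [arg3]."
print(render_message(template, ["192.0.2.1", "192.0.2.254", "USERID"]))
```

A missing value raises an error instead of leaving a placeholder in the output, so an incomplete record is noticed early.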

  • Page 207

    User response: complete the following steps until the problem is solved: 1. Make sure that the correct login id and password are being used. 2. Have the system administrator reset the login id or password. 40000011-00000000 security: login id: [arg1] had [arg2] login failures from cli at [arg3]. Exp...
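User responses in this listing are written as numbered steps run together on one line. The sketch below, assuming steps are numbered consecutively from 1 and each number is followed by a period and a space, separates such a response into individual steps. The function name `split_steps` is a hypothetical helper, and lettered sub-steps (a., b.) are left inside their parent step.

```python
import re

# A step marker is a 1-2 digit number followed by ". ", at the start of the
# text or after whitespace.
_STEP_RE = re.compile(r"(?:^|\s)(\d{1,2})\.\s+")


def split_steps(response: str) -> list[str]:
    """Return the numbered steps of a flattened user response, in order."""
    steps = []
    expected = 1      # next step number we are willing to accept
    last_end = None   # where the text of the current step begins
    for match in _STEP_RE.finditer(response):
        if int(match.group(1)) != expected:
            continue  # a stray number inside a step, not a new step
        if last_end is not None:
            steps.append(response[last_end:match.start()].strip())
        last_end = match.end()
        expected += 1
    if last_end is not None:
        steps.append(response[last_end:].strip())
    return steps
```

Text before the first step, such as "complete the following steps until the problem is solved:", is discarded.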

  • Page 208

    Snmp trap id: automatically notify support: no user response: information only; no action is required. 40000017-00000000 enet[[arg1]] ip-cfg:hstname=[arg2], ip@=[arg3] ,netmsk=[arg4], gw@=[arg5]. Explanation: this message is for the use case where a management controller ip address and configurati...

  • Page 209

    2. Make sure that the imm ethernet-over-usb interface is enabled. 3. Reinstall the rndis or cdc_ether device driver for the operating system. 4. Disable the watchdog. If there was an operating-system error, check the integrity of the installed operating system. 4000001d-00000000 watchdog [arg1] fail...

  • Page 210

    Alert category: none serviceable: no cim information: prefix: imm and id: 0033 snmp trap id: automatically notify support: no user response: information only; no action is required. 40000022-00000000 ssl data in the management controller [arg1] configuruation data is invalid. Clearing configuration ...

  • Page 211

    40000027-00000000 platform watchdog timer expired for [arg1]. Explanation: this message is for the use case when an implementation has detected a platform watchdog timer expired severity: error alert category: system - os timeout serviceable: no cim information: prefix: imm and id: 0039 snmp trap id...

  • Page 212

    4000002d-00000000 ddns setting changed to [arg1] by user [arg2]. Explanation: ddns setting changed by user severity: info alert category: none serviceable: no cim information: prefix: imm and id: 0045 snmp trap id: automatically notify support: no user response: information only; no action is requir...

  • Page 213

    40000034-00000000 ipv6 static ip configuration disabled by user [arg1]. Explanation: ipv6 static assignment method is disabled by user severity: info alert category: none serviceable: no cim information: prefix: imm and id: 0052 snmp trap id: automatically notify support: no user response: informati...

  • Page 214

    Automatically notify support: no user response: information only; no action is required. 4000003b-00000000 dhcpv6 failure, no ip address assigned. Explanation: the dhcpv6 server fails to assign an ip address to a management controller. Severity: warning alert category: none serviceable: no cim informat...

  • Page 215

    User response: information only; no action is required. 40000041-00000000 cim/xml http port number changed from [arg1] to [arg2] by user [arg3]. Explanation: a user has modified the cim http port number severity: info alert category: none serviceable: no cim information: prefix: imm and id: 0065 snm...

  • Page 216

    40000047-00000000 led [arg1] state changed to [arg2] by [arg3]. Explanation: a user has modified the state of an led severity: info alert category: none serviceable: no cim information: prefix: imm and id: 0071 snmp trap id: automatically notify support: no user response: information only; no action...

  • Page 217

    Explanation: a user configured an ldap miscellaneous setting severity: info alert category: none serviceable: no cim information: prefix: imm and id: 0077 snmp trap id: automatically notify support: no user response: information only; no action is required. 4000004e-00000000 serial redirection set b...

  • Page 218

    User response: information only; no action is required. 40000054-00000000 server [arg1] [arg2] cleared by user [arg3]. Explanation: a user cleared a server power action. Severity: info alert category: none serviceable: no cim information: prefix: imm and id: 0084 snmp trap id: automatically notify s...

  • Page 219

    Alert category: none serviceable: no cim information: prefix: imm and id: 0090 snmp trap id: automatically notify support: no user response: information only; no action is required. 4000005b-00000000 secure web services (https) [arg1] by user [arg2]. Explanation: a user enables or disables secure we...

  • Page 220

    40000061-00000000 license key for [arg1] removed by user [arg2]. Explanation: a user removes a license key severity: info alert category: none serviceable: no cim information: prefix: imm and id: 0097 snmp trap id: automatically notify support: no user response: information only; no action is requir...

  • Page 221

    Snmp trap id: automatically notify support: no user response: information only; no action is required. 40000068-00000000 user [arg1] custom privileges set: [arg2]. Explanation: user account privileges assigned severity: info alert category: none serviceable: no cim information: prefix: imm and id: 0...

  • Page 222

    Alert category: none serviceable: no cim information: prefix: imm and id: 0110 snmp trap id: automatically notify support: no user response: information only; no action is required. 4000006f-00000000 alert recipient number [arg1] updated: name=[arg2], deliverymethod=[arg3], address=[arg4], includelo...

  • Page 223

    40000075-00000000 the measured power value exceeded the power cap value. Explanation: power exceeded cap severity: warning alert category: warning - power serviceable: no cim information: prefix: imm and id: 0117 snmp trap id: 164 automatically notify support: no user response: information only; no ...

  • Page 224

    4000007c-00000000 dynamic power savings mode has been turned off by user [arg1]. Explanation: dynamic power savings mode turned off by user severity: info alert category: none serviceable: no cim information: prefix: imm and id: 0124 snmp trap id: automatically notify support: no user response: info...

  • Page 225

    40000083-00000000 the new minimum power cap value has returned below the power cap value. Explanation: minimum power cap exceeds power cap recovered severity: info alert category: warning - power serviceable: no cim information: prefix: imm and id: 0131 snmp trap id: 164 automatically notify support...

  • Page 226

    Alert category: system - other serviceable: no cim information: prefix: imm and id: 0137 snmp trap id: 22 automatically notify support: no user response: 1. Turn off the server and disconnect it from the power source. You must disconnect the server from ac power to reset the imm. 2. After 45 seconds...

  • Page 227

    Alert category: warning - temperature serviceable: yes cim information: prefix: plat and id: 0490 snmp trap id: 12 automatically notify support: no user response: 1. Make sure there is a node filler correctly installed for the empty node slot. 2. Make sure the air baffles are placed and correctly in...

  • Page 228

    80010701-1001ffff numeric sensor [numericsensorelementname] going high (upper non-critical) has asserted. (pci riser 1 temp) explanation: this message is for the use case when an implementation has detected an upper non-critical sensor going high has asserted. Severity: warning alert category: warni...

  • Page 229

    3. Make sure the fans are operating, and there are no obstructions to the airflow (both front and rear of the server). 4. Reduce the ambient temperature. The system must be operating within the specifications. (see features and specifications for more information) 80010701-1a01ffff numeric sensor [n...

  • Page 230

    1. Make sure there is a node filler correctly installed for the empty node slot. 2. Make sure the air baffles are placed and correctly installed; and make sure the node cover is installed and completely closed. 3. Make sure the fans are operating, and there are no obstructions to the airflow (both f...

  • Page 231

    Cim information: prefix: plat and id: 0494 snmp trap id: 0 automatically notify support: no user response: 1. Make sure there is a node filler correctly installed for the empty node slot. 2. Make sure the air baffles are placed and correctly installed; and make sure the node cover is installed and c...

  • Page 232

    Implementation has detected an upper critical sensor going high has asserted. Severity: error alert category: critical - temperature serviceable: yes cim information: prefix: plat and id: 0494 snmp trap id: 0 automatically notify support: no user response: 1. Make sure there is a node filler correct...

  • Page 233

    80010b01-0701ffff numeric sensor [numericsensorelementname] going high (upper non-recoverable) has asserted. (ambient temp) explanation: this message is for the use case when an implementation has detected an upper non-recoverable sensor going high has asserted. Severity: error alert category: criti...

  • Page 234

    3. Make sure the fans are operating, and there are no obstructions to the airflow (both front and rear of the server). 4. Reduce the ambient temperature. The system must be operating within the specifications. (see features and specifications for more information) 80010b01-1001ffff numeric sensor [n...

  • Page 235

    1. Make sure there is a node filler correctly installed for the empty node slot. 2. Make sure the air baffles are placed and correctly installed; and make sure the node cover is installed and completely closed. 3. Make sure the fans are operating, and there are no obstructions to the airflow (both f...

  • Page 236

    Automatically notify support: no user response: no action; information only. 80030012-0601ffff sensor [sensorelementname] has deasserted. (smm mode/smm monitor) explanation: this message is for the use case when an implementation has detected a sensor has deasserted. Severity: info alert category: s...

  • Page 237

    Alert category: warning - other serviceable: yes cim information: prefix: plat and id: 0520 snmp trap id: 60 automatically notify support: no user response: 1. Restart imm. If the error persists, proceed to step 2. 2. Update to the latest level of imm/uefi code, then proceed to step 3. 3...

  • Page 238

    6. Replace the pci adapter and make sure the pci adapter is functioning normally. 80070201-1102ffff sensor [sensorelementname] has transitioned to critical from a less severe state. (pci 2 temp) explanation: this message is for the use case when an implementation has detected a sensor transitioned t...

  • Page 239

    Alert category: critical - voltage serviceable: yes cim information: prefix: plat and id: 0522 snmp trap id: 1 automatically notify support: no user response: 1. Check the system-event log. 2. Check for an error led on the system board. 3. Replace any failing device. 4. Check for a server firmware u...

  • Page 240

    80070214-2201ffff sensor [sensorelementname] has transitioned to critical from a less severe state. (tpm lock) explanation: this message is for the use case when an implementation has detected a sensor transitioned to critical from less severe. Severity: error alert category: critical - other servic...

  • Page 241

    80070228-2e01ffff sensor [sensorelementname] has transitioned to critical from a less severe state. (ipmb io error) explanation: this message is for the use case when an implementation has detected a sensor transitioned to critical from less severe. Severity: error alert category: critical - other s...

  • Page 242

    2. Make sure the air baffles are placed and correctly installed; and make sure the node cover is installed and completely closed. 3. Make sure the fans are operating, and there are no obstructions to the airflow (both front and rear of the server). 4. Reduce the ambient temperature. The system must ...

  • Page 243

    6. Replace the pci adapter and make sure the pci adapter is functioning normally. 80070614-2201ffff sensor [sensorelementname] has transitioned to non-recoverable. (tpm phy pres set) explanation: this message is for the use case when an implementation has detected a sensor transitioned to non-recove...

  • Page 244

    Alert category: critical - memory serviceable: yes cim information: prefix: plat and id: 0810 snmp trap id: 41 automatically notify support: no user response: 1. Check the system-event log for dimm failure events (uncorrectable or pfa) and correct the failures. 2. Re-enable mirroring in the setup ut...

  • Page 245

    Uefi(post) error code for this event can be found in the logged imm message text. Please refer to the uefi(post) error code in the "uefi(post) error code" section of the information center for the appropriate user response. Firmware error : sys boot status : 806f000f-220102ff subsystem [memoryelemen...

  • Page 246

    Alert category: critical - other serviceable: yes cim information: prefix: plat and id: 0766 snmp trap id: 50 automatically notify support: no user response: this is a uefi detected event. The uefi(post) error for this event can be found in the logged imm message text. Please refer to the uefi(post)...
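Several entries in this listing show an snmp trap id next to an alert category. The lookup below is a small sketch built only from pairs that appear together in the excerpts above (for example, 41 with "critical - memory" and 164 with "warning - power"). It is an observation from this listing, not the full alert mib, and the names `TRAP_ID_TO_CATEGORY` and `category_for_trap` are hypothetical.

```python
from typing import Optional

# Pairs observed together in the excerpts of this listing; the complete
# mapping is defined by the SNMP alert MIB, which is not reproduced here.
TRAP_ID_TO_CATEGORY = {
    0: "critical - temperature",
    1: "critical - voltage",
    12: "warning - temperature",
    22: "system - other",
    41: "critical - memory",
    50: "critical - other",
    60: "warning - other",
    164: "warning - power",
}


def category_for_trap(trap_id_text: str) -> Optional[str]:
    """Map a trap id field to an alert category, or None if blank or unknown."""
    text = trap_id_text.strip()
    if not text.isdigit():
        return None  # many informational events leave the trap id empty
    return TRAP_ID_TO_CATEGORY.get(int(text))
```

The lookup runs from trap id to category because some entries in a given category (for example "system - other") are listed with an empty trap id.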

  • Page 247

    806f0021-2201ffff fault in slot [physicalconnectorsystemelementname] on system [computersystemelementname]. (no op rom space) explanation: this message is for the use case when an implementation has detected a fault in a slot. Severity: error alert category: critical - other serviceable: yes cim inf...

  • Page 248

    3. Update the server firmware (uefi and imm) and adapter firmware. Important: some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the...

  • Page 249

    Snmp trap id: automatically notify support: no user response: no action; information only. 806f0028-2101ffff sensor [sensorelementname] is unavailable or degraded on management system [computersystemelementname]. (tpm cmd failures) explanation: this message is for the use case when an implementation...

  • Page 250

    Snmp trap id: 41 automatically notify support: yes user response: 1. Check the ibm support website for an applicable retain tip or firmware update that applies to this memory error. 2. Swap the affected dimms (as indicated by the error leds on the system board or the event logs) to a different memor...

  • Page 251

    Disconnect and reconnect the server to the power source and restart the server. 8. (trained service technician only) replace the affected microprocessor. 806f010c-2004ffff uncorrectable error detected for [physicalmemoryelementname] on subsystem [memoryelementname]. (dimm 4) explanation: this messag...

  • Page 252

    2. Swap the affected dimms (as indicated by the error leds on the system board or the event logs) to a different memory channel or microprocessor. 3. If the problem follows the dimm, replace the failing dimm. 4. (trained technician only) if the problem occurs on the same dimm connector, check the di...

  • Page 253

    806f010c-2581ffff uncorrectable error detected for [physicalmemoryelementname] on subsystem [memoryelementname]. (all dimms) explanation: this message is for the use case when an implementation has detected a memory uncorrectable error. Severity: error alert category: critical - memory serviceable: ...

  • Page 254

    Snmp trap id: 5 automatically notify support: yes user response: 1. Run the hard disk drive diagnostic test on drive n. 2. Reseat the following components: a. Hard disk drive (wait 1 minute or more before reinstalling the drive) b. Cable from the system board to the backplane 3. Replace the followin...

  • Page 255

    Snmp trap id: 5 automatically notify support: yes user response: 1. Run the hard disk drive diagnostic test on drive n. 2. Reseat the following components: a. Hard disk drive (wait 1 minute or more before reinstalling the drive) b. Cable from the system board to the backplane 3. Replace the followin...

  • Page 256

    Snmp trap id: 5 automatically notify support: yes user response: 1. Run the hard disk drive diagnostic test on drive n. 2. Reseat the following components: a. Hard disk drive (wait 1 minute or more before reinstalling the drive) b. Cable from the system board to the backplane 3. Replace the followin...

  • Page 257

    806f0123-2101ffff reboot of system [computersystemelementname] initiated by [watchdogelementname]. (ipmi watchdog) explanation: this message is for the use case when an implementation has detected a reboot by a watchdog occurred. Severity: info alert category: system - other serviceable: no cim info...

  • Page 258

    1. Make sure that the latest levels of firmware and device drivers are installed for all adapters and standard devices, such as ethernet, scsi, and sas. Important: some cluster solutions require specific code levels or coordinated code updates. If the device is part of a cluster solution, verify tha...

  • Page 259

    806f020d-0402ffff failure predicted on drive [storagevolumeelementname] for array [computersystemelementname]. (computer hdd1) explanation: this message is for the use case when an implementation has detected an array failure is predicted. Severity: warning alert category: system - predicted failure...

  • Page 260

    806f020d-0406ffff failure predicted on drive [storagevolumeelementname] for array [computersystemelementname]. (1u storage hdd1) explanation: this message is for the use case when an implementation has detected an array failure is predicted. Severity: warning alert category: system - predicted failu...

  • Page 261

    806f020d-040affff failure predicted on drive [storagevolumeelementname] for array [computersystemelementname]. (1u storage hdd5) explanation: this message is for the use case when an implementation has detected an array failure is predicted. Severity: warning alert category: system - predicted failu...

  • Page 262

    Cim information: prefix: plat and id: 0136 snmp trap id: 41 automatically notify support: no user response: note: each time you install or remove a dimm, you must disconnect the server from the power source; then, wait 10 seconds before restarting the server. 1. Check the ibm support website for an ...

  • Page 263

    Connector. If the connector contains any foreign material or is damaged, replace the system board. 6. (trained service technician only) remove the affected microprocessor and check the microprocessor socket pins for any damaged pins. If any damage is found, replace the system board. 7. (trained servic...

  • Page 264

    Snmp trap id: 41 automatically notify support: no user response: note: each time you install or remove a dimm, you must disconnect the server from the power source; then, wait 10 seconds before restarting the server. 1. Check the ibm support website for an applicable retain tip or firmware update th...

  • Page 265

    6. (trained service technician only) remove the affected microprocessor and check the microprocessor socket pins for any damaged pins. If any damage is found, replace the system board. 7. (trained service technician only) if the problem is related to microprocessor socket pins, replace the system boar...

  • Page 266

    Alert category: system - other serviceable: no cim information: prefix: plat and id: 0131 snmp trap id: automatically notify support: no user response: 1. Make sure the dimm is installed correctly. 2. If the dimm was disabled because of a memory fault (memory uncorrectable error or memory logging li...

  • Page 267

    Alert category: system - other serviceable: no cim information: prefix: plat and id: 0131 snmp trap id: automatically notify support: no user response: 1. Make sure the dimm is installed correctly. 2. If the dimm was disabled because of a memory fault (memory uncorrectable error or memory logging li...

  • Page 268

    Serviceable: yes cim information: prefix: plat and id: 0062 snmp trap id: 40 automatically notify support: no user response: 1. Check the cpu led. See more information about the cpu led in light path diagnostics. 2. Check for a server firmware update. Important: some cluster solutions require specif...

  • Page 269

    5. (trained technician only) remove the affected microprocessor and check the microprocessor socket pins for any damaged pins. If any damage is found, replace the system board. 6. (trained technician only) replace the affected microprocessor. 806f050c-2002ffff memory logging limit reached for [physica...

  • Page 270

    6. (trained technician only) replace the affected microprocessor. 806f050c-2005ffff memory logging limit reached for [physicalmemoryelementname] on subsystem [memoryelementname]. (dimm 5) explanation: this message is for the use case when an implementation has detected that the memory logging limit ...

  • Page 271

    806f050c-2008ffff memory logging limit reached for [physicalmemoryelementname] on subsystem [memoryelementname]. (dimm 8) explanation: this message is for the use case when an implementation has detected that the memory logging limit has been reached. Severity: warning alert category: warning - memo...

  • Page 272

    Snmp trap id: 5 automatically notify support: no user response: 1. Make sure that the raid adapter firmware and hard disk drive firmware are at the latest level. 2. Make sure that the sas cable is connected correctly. 3. Replace the sas cable. 4. Check backplane cable connection. 5. Replace the raid...

  • Page 273

    3. Replace the sas cable. 4. Check backplane cable connection. 5. Replace the raid adapter. 6. Replace the hard disk drive that is indicated by a lit status led. 806f050d-0407ffff array [computersystemelementname] is in critical condition. (1u storage hdd2) explanation: this message is for the use c...

  • Page 274

    806f050d-040bffff array [computersystemelementname] is in critical condition. (1u storage hdd6) explanation: this message is for the use case when an implementation has detected that an array is critical. Severity: error alert category: critical - hard disk drive serviceable: yes cim information: pr...

  • Page 275

    Code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code. 4. Remove components one at a time, restarting the server each time, to see if the problem goes away. 5. If the problem remains, (trained ...

  • Page 276

    806f060d-0402ffff array [computersystemelementname] has failed. (computer hdd1) explanation: this message is for the use case when an implementation has detected that an array failed. Severity: error alert category: critical - hard disk drive serviceable: yes cim information: prefix: plat and id: 01...

  • Page 277

    Snmp trap id: 5 automatically notify support: yes user response: 1. Make sure that the raid adapter firmware and hard disk drive firmware are at the latest level. 2. Make sure that the sas cable is connected correctly. 3. Replace the sas cable. 4. Replace the raid adapter. 5. Replace the hard disk d...

  • Page 278

    806f060d-040bffff array [computersystemelementname] has failed. (1u storage hdd6) explanation: this message is for the use case when an implementation has detected that an array failed. Severity: error alert category: critical - hard disk drive serviceable: yes cim information: prefix: plat and id: ...

  • Page 279

    806f070c-2004ffff configuration error for [physicalmemoryelementname] on subsystem [memoryelementname]. (dimm 4) explanation: this message is for the use case when an implementation has detected a memory dimm configuration error. Severity: error alert category: critical - memory s...

  • Page 280

    Cim information: prefix: plat and id: 0126 snmp trap id: 41 automatically notify support: no user response: make sure that dimms are installed in the correct sequence and have the same size, type, speed, and technology. One of the dimms : 806f070d-0401ffff rebuild in progress for array in system [co...

  • Page 281

    806f070d-0407ffff rebuild in progress for array in system [computersystemelementname]. (1u storage hdd2) explanation: this message is for the use case when an implementation has detected that an array rebuild is in progress. Severity: info alert category: system - other serviceable: no cim informati...

  • Page 282

    806f072b-2101ffff a successful software or firmware change was detected on system [computersystemelementname]. (imm promotion/imm recovery) explanation: this message is for the use case when an implementation has detected a successful software or firmware change. Severity: info alert category: syste...

  • Page 283

    Or coordinated code updates. If the device is part of a cluster solution, verify that the latest level of code is supported for the cluster solution before you update the code. 5. Make sure that the installed dimms are supported and configured correctly. 6. (trained technician only) replace the syst...

  • Page 284

    Cim information: prefix: plat and id: 0142 snmp trap id: 22 automatically notify support: no user response: 1. Reseat the dimm, and then restart the server. 2. Replace dimm n. (n = dimm number) 806f090c-2003ffff [physicalmemoryelementname] on subsystem [memoryelementname] throttled. (dimm 3) explana...

  • Page 285

    Implementation has detected memory has been throttled. Severity: warning alert category: system - other serviceable: yes cim information: prefix: plat and id: 0142 snmp trap id: 22 automatically notify support: no user response: 1. Reseat the dimm, and then restart the server. 2. Replace dimm n. (n ...

  • Page 286

    Cim information: prefix: plat and id: 0146 snmp trap id: 0 automatically notify support: no user response: 1. Make sure that the fans are operating, that there are no obstructions to the airflow, that the air baffles are in place and correctly installed, and that the server cover is installed and co...

  • Page 287

    Cim information: prefix: plat and id: 0146 snmp trap id: 0 automatically notify support: no user response: 1. Make sure that the fans are operating, that there are no obstructions to the airflow, that the air baffles are in place and correctly installed, and that the server cover is installed and co...

  • Page 288

    81010002-0701ffff numeric sensor [numericsensorelementname] going low (lower non-critical) has deasserted. (cmos battery) explanation: this message is for the use case when an implementation has detected a lower non-critical sensor going low has deasserted. Severity: info alert category: warning - v...

  • Page 289

    81010701-1001ffff numeric sensor [numericsensorelementname] going high (upper non-critical) has deasserted. (pci riser 1 temp) explanation: this message is for the use case when an implementation has detected an upper non-critical sensor going high has deasserted. Severity: info alert category: warn...
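In this listing, events that assert a condition and events that report its recovery appear to share an identifier apart from the leading digits: 80010701-1001ffff (asserted) and 81010701-1001ffff (deasserted), or 806f020d-040affff and 816f020d-040affff. The sketch below relies only on that observed pattern; it is an assumption drawn from the entries shown here, not a documented rule, and `split_event_id` and `recovery_counterpart` are hypothetical helper names.

```python
import re

# Event ids in this listing are two groups of eight hex digits.
_EVENT_ID_RE = re.compile(r"^([0-9a-f]{8})-([0-9a-f]{8})$")


def split_event_id(event_id: str) -> tuple[str, str]:
    """Validate an event id and return its two eight-digit halves."""
    match = _EVENT_ID_RE.match(event_id.strip().lower())
    if match is None:
        raise ValueError(f"not an event id: {event_id!r}")
    return match.group(1), match.group(2)


def recovery_counterpart(event_id: str) -> str:
    """Return the id of the matching recovery event for an '80...' event id.

    Assumption: the recovery event differs only in its first two digits
    (80 becomes 81), as seen in the pairs present in this listing.
    """
    head, tail = split_event_id(event_id)
    if not head.startswith("80"):
        raise ValueError(f"expected an id starting with 80: {event_id!r}")
    return f"81{head[2:]}-{tail}"
```

A log-review script could use this to check whether a warning such as a pci riser temperature event was later followed by its recovery event.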

  • Page 290

    81010701-2d01ffff numeric sensor [numericsensorelementname] going high (upper non-critical) has deasserted. (pch temp) explanation: this message is for the use case when an implementation has detected an upper non-critical sensor going high has deasserted. Severity: info alert category: warning - te...

  • Page 291

    81010901-1002ffff numeric sensor [numericsensorelementname] going high (upper critical) has deasserted. (pci riser 2 temp) explanation: this message is for the use case when an implementation has detected an upper critical sensor going high has deasserted. Severity: info alert category: critical - t...

  • Page 292

    81010902-0701ffff numeric sensor [numericsensorelementname] going high (upper critical) has deasserted. (sysbrd 12v) explanation: this message is for the use case when an implementation has detected an upper critical sensor going high has deasserted. Severity: info alert category: critical - voltage...

  • Page 293

    81010b01-1002ffff numeric sensor [numericsensorelementname] going high (upper non-recoverable) has deasserted. (pci riser 2 temp) explanation: this message is for the use case when an implementation has detected an upper non-recoverable sensor going high has deasserted. Severity: info alert category...

  • Page 294

    81030006-2101ffff sensor [sensorelementname] has asserted. (sig verify fail) explanation: this message is for the use case when an implementation has detected a sensor has asserted. Severity: info alert category: system - other serviceable: no cim information: prefix: plat and id: 0508 snmp trap id:...

  • Page 295

    Automatically notify support: no user response: no action; information only. 81070201-0302ffff sensor [sensorelementname] has transitioned to a less severe state from critical. (cpu 2 overtemp) explanation: this message is for the use case when an implementation has detected a sensor transition to l...

  • Page 296

    81070202-1501ffff sensor [sensorelementname] has transitioned to a less severe state from critical. (pib fault) explanation: this message is for the use case when an implementation has detected a sensor transition to less severe from critical. Severity: info alert category: critical - voltage servic...

  • Page 297

    8107021b-0301ffff sensor [sensorelementname] has transitioned to a less severe state from critical. (cpu 1 qpilinkerr) explanation: this message is for the use case when an implementation has detected a sensor transition to less severe from critical. Severity: info alert category: critical - other s...

  • Page 298

    81070301-1102ffff sensor [sensorelementname] has deasserted the transition to non-recoverable from a less severe state. (pci 2 temp) explanation: this message is for the use case when an implementation has detected that the sensor transition to non-recoverable from less severe has deasserted. Severi...

  • Page 299

    816f0007-0301ffff [processorelementname] has recovered from ierr. (cpu 1) explanation: this message is for the use case when an implementation has detected a processor recovered - ierr condition. Severity: info alert category: critical - cpu serviceable: no cim information: prefix: plat and id: 0043...

  • Page 300

    Implementation has detected a fault condition in a slot has been removed. Severity: info alert category: critical - other serviceable: no cim information: prefix: plat and id: 0331 snmp trap id: 50 automatically notify support: no user response: no action; information only. One of pci error : 816f00...

  • Page 301

    Degraded/unavailable/failure. Severity: info alert category: warning - other serviceable: no cim information: prefix: plat and id: 0399 snmp trap id: 60 automatically notify support: no user response: no action; information only. 816f0107-0301ffff an over-temperature condition has been removed on [p...

  • Page 302

    Severity: info alert category: critical - memory serviceable: no cim information: prefix: plat and id: 0139 snmp trap id: 41 automatically notify support: no user response: no action; information only. 816f010c-2005ffff uncorrectable error recovery detected for [physicalmemoryelementname] on subsyst...

  • Page 303

    816f010d-0401ffff the drive [storagevolumeelementname] has been enabled. (computer hdd0) explanation: this message is for the use case when an implementation has detected a drive was enabled. Severity: info alert category: critical - hard disk drive serviceable: no cim information: prefix: plat and ...

  • Page 304

    User response: no action; information only. 816f010d-0408ffff the drive [storagevolumeelementname] has been enabled. (1u storage hdd3) explanation: this message is for the use case when an implementation has detected a drive was enabled. Severity: info alert category: critical - hard disk drive serv...

  • Page 305

    816f0113-0301ffff system [computersystemelementname] has recovered from a bus timeout. (cpu 1 peci) explanation: this message is for the use case when an implementation has detected that a system has recovered from a bus timeout. Severity: info alert category: critical - other serviceable: no cim inf...

  • Page 306

    Alert category: system - other serviceable: no cim information: prefix: plat and id: 0390 snmp trap id: automatically notify support: no user response: no action; information only. 816f0207-0301ffff [processorelementname] has recovered from frb1/bist condition. (cpu 1) explanation: this message is f...

  • Page 307

    Snmp trap id: 27 automatically notify support: no user response: no action; information only. 816f020d-0404ffff failure no longer predicted on drive [storagevolumeelementname] for array [computersystemelementname]. (computer hdd3) explanation: this message is for the use case when an implementation ...

  • Page 308

    User response: no action; information only. 816f020d-040affff failure no longer predicted on drive [storagevolumeelementname] for array [computersystemelementname]. (1u storage hdd5) explanation: this message is for the use case when an implementation has detected an array failure is no longer predi...

  • Page 309

    816f030c-2004ffff scrub failure for [physicalmemoryelementname] on subsystem [memoryelementname] has recovered. (dimm 4) explanation: this message is for the use case when an implementation has detected a memory scrub failure recovery. Severity: info alert category: critical - memory serviceable: no ...

  • Page 310

    816f040c-2001ffff [physicalmemoryelementname] enabled on subsystem [memoryelementname]. (dimm 1) explanation: this message is for the use case when an implementation has detected that memory has been enabled. Severity: info alert category: system - other serviceable: no cim information: prefix: plat...

  • Page 311

    816f040c-2007ffff [physicalmemoryelementname] enabled on subsystem [memoryelementname]. (dimm 7) explanation: this message is for the use case when an implementation has detected that memory has been enabled. Severity: info alert category: system - other serviceable: no cim information: prefix: plat...

  • Page 312

    816f0507-2584ffff [processorelementname] has recovered from a configuration mismatch. (all cpus) explanation: this message is for the use case when an implementation has recovered from a processor configuration mismatch. Severity: info alert category: critical - cpu serviceable: no cim information: ...

  • Page 313

    816f050c-2006ffff memory logging limit removed for [physicalmemoryelementname] on subsystem [memoryelementname]. (dimm 6) explanation: this message is for the use case when an implementation has detected that the memory logging limit has been removed. Severity: info alert category: warning - memory ...

  • Page 314

    816f050d-0403ffff critical array [computersystemelementname] has deasserted. (computer hdd2) explanation: this message is for the use case when an implementation has detected that a critical array has deasserted. Severity: info alert category: critical - hard disk drive serviceable: no cim informa...

  • Page 315

    816f050d-0409ffff critical array [computersystemelementname] has deasserted. (1u storage hdd4) explanation: this message is for the use case when an implementation has detected that a critical array has deasserted. Severity: info alert category: critical - hard disk drive serviceable: no cim infor...

  • Page 316

    816f0607-2584ffff an sm bios uncorrectable cpu complex error for [processorelementname] has deasserted. (all cpus) explanation: this message is for the use case when an sm bios uncorrectable cpu complex error has deasserted. Severity: info alert category: critical - cpu serviceable: no cim informati...

  • Page 317

    816f060d-0406ffff array in system [computersystemelementname] has been restored. (1u storage hdd1) explanation: this message is for the use case when an implementation has detected that a failed array has been restored. Severity: info alert category: critical - hard disk drive serviceable: no cim in...

  • Page 318

    816f060d-040cffff array in system [computersystemelementname] has been restored. (1u storage hdd7) explanation: this message is for the use case when an implementation has detected that a failed array has been restored. Severity: info alert category: critical - hard disk drive serviceable: no cim in...

  • Page 319

    816f070c-2006ffff configuration error for [physicalmemoryelementname] on subsystem [memoryelementname] has deasserted. (dimm 6) explanation: this message is for the use case when an implementation has detected a memory dimm configuration error has deasserted. Severity: info alert category: critical -...

  • Page 320

    816f070d-0403ffff rebuild completed for array in system [computersystemelementname]. (computer hdd2) explanation: this message is for the use case when an implementation has detected that an array rebuild has completed. Severity: info alert category: system - other serviceable: no cim information: p...

  • Page 321

    816f070d-0409ffff rebuild completed for array in system [computersystemelementname]. (1u storage hdd4) explanation: this message is for the use case when an implementation has detected that an array rebuild has completed. Severity: info alert category: system - other serviceable: no cim information:...

  • Page 322

    Severity: info alert category: system - other serviceable: no cim information: prefix: plat and id: 0060 snmp trap id: automatically notify support: no user response: no action; information only. One of the cpus: 816f0813-2581ffff system [computersystemelementname] has recovered from an uncorrectabl...

  • Page 323

    Cim information: prefix: plat and id: 0143 snmp trap id: automatically notify support: no user response: no action; information only. 816f090c-2004ffff [physicalmemoryelementname] on subsystem [memoryelementname] is no longer throttled. (dimm 4) explanation: this message is for the use case when an ...

  • Page 324

    816f0a07-0302ffff the processor [processorelementname] is no longer operating in a degraded state. (cpu 2) explanation: this message is for the use case when an implementation has detected a processor is no longer running in the degraded state. Severity: info alert category: warning - cpu serviceabl...

  • Page 325

    User response: no action; information only. 816f0a0c-2006ffff an over-temperature condition has been removed on the [physicalmemoryelementname] on subsystem [memoryelementname]. (dimm 6) explanation: this message is for the use case when an implementation has detected an over temperature condition f...

  • Page 326

    3. (trained technician only) replace the system board (see removing the system board and installing the system board). (n = microprocessor number) 816f0a13-0302ffff

  • Page 327

    Appendix b. Uefi (post) error codes this section details the uefi (post) error codes. Uefi (post) diagnostic error codes can be generated when the server starts up or while the server is running. Uefi (post) codes are logged in the integrated management module ii (imm2) event log in the server. For ...
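The UEFI (POST) codes that appear in this appendix, such as I.18005, I.3818003, S.2018001, and S.68005, consist of a single letter, a period, and a run of digits, and the entries repeat the code in square brackets. The following is a minimal Python sketch for normalising such a code before searching a log. The function name is hypothetical, and accepting hexadecimal letters in the numeric part is an assumption rather than something this listing states.

```python
import re

# One letter, a period, then digits; surrounding square brackets are optional.
# Allowing a-f in the numeric part is an assumption, not taken from the guide.
UEFI_CODE = re.compile(r"^\[?([A-Za-z])\.([0-9A-Fa-f]+)\]?$")


def parse_uefi_code(text):
    """Return (prefix_letter, number) for a code such as 'I.18005' or '[i.18005]'."""
    match = UEFI_CODE.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised UEFI code: {text!r}")
    return match.group(1).upper(), match.group(2).upper()
```

Normalising to upper case lets the bracketed and unbracketed spellings of the same code compare equal.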

  • Page 328

    4. Check ibm support site for an applicable service bulletin or firmware update that applies to this processor error. 5. (trained service technician only) replace mismatching processor. Inspect processor socket and replace the system board first if socket is damaged. I.18005 [i.18005] a discrepancy ...

  • Page 329

    1. Verify that the processor is a valid option that is listed as a server proven device for this system. If not, remove the processor and install a server proven one. 2. Verify that matching processors are installed in the correct processor sockets according to the service information for this produ...

  • Page 330

    1. Verify that matching processors are installed in the correct processor sockets according to the service information for this product. 2. Check ibm support site for an applicable service bulletin or firmware update that applies to this processor error. 3. (trained service technician only) replace ...

  • Page 331

    3. If error persists, or boot is unsuccessful, (trained service technician only) replace the system board. I.3818003 [i.3818003] the crtm flash driver could not lock the secure flash region. Explanation: crtm could not lock secure flash region severity: info user response: complete the following ste...

  • Page 332

    S.2018001 [s.2018001] an uncorrected pcie error has occurred at bus % device % function %. The vendor id for the device is % and the device id is %. Explanation: pcie uncorrected error detected severity: error user response: complete the following steps: 1. If this node and/or any attached cables we...

  • Page 333

    2. If user did not intentionally trigger the reboots, check logs for probable cause. 3. Undo recent system changes (settings or devices added). If there were no recent system changes, remove all options then remove cmos battery for 30 seconds to clear cmos contents. Verify that the system boots. Then, re-inst...

  • Page 334

    Found, correct and retry with the same dimm. (note: event log may contain a recent 00580a4 event denoting detected change in dimm population that could be related to this problem.) 2. If no problem is observed on the dimm connectors or the problem persists, replace the dimm identified by lightpath a...

  • Page 335

    S.68005 [s.68005] an error has been detected by the iio core logic on bus %. The global fatal error status register contains %. The global non-fatal error status register contains %. Please check error logs for the presence of additional downstream device error data. Explanation: critical ioh-pc...

  • Page 336

    Severity: warning user response: complete the following steps: 1. Go to system settings > settings > driver health status list and find a driver/controller reporting configuration required status. 2. Search for the driver menu from system settings and change settings appropriately. 3. Save settings ...

  • Page 337

    2. Reset the imm from the fpc. 3. Use fpc to remove aux power from the node. This will reboot the entire node. 4. Check ibm support site for an applicable service bulletin or firmware update that applies to this error. 5. Reflash imm firmware. 6. Remove and re-install cmos battery for 30 seconds to ...

  • Page 338

    1. If the node has recently been installed, moved, serviced, or upgraded, verify that the dimm is properly seated and visually verify that there is no foreign material in any dimm connector on that memory channel. If either of these conditions is found, correct and retry with the same dimm. (note: e...

  • Page 339

    Appendix c. Dsa diagnostic test results after running the dsa diagnostic tests, use this information to resolve any issues that were found. Dsa broadcom network test results the following messages can result when you run the broadcom network test. 405-000-000 brcm:testcontrolregisters test passed ex...
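The DSA result codes visible in this listing follow a three-group pattern such as 405-000-000, 405-801-000, and 405-902-000. In the entries shown here, a middle group of 000 accompanies "test passed", a middle group starting with 8 accompanies "test aborted", and one starting with 9 accompanies "test failed". The following is a minimal Python sketch of that observation. The mapping is inferred only from the entries visible in this summary, not from a documented rule, and the function name is hypothetical.

```python
import re

# Three groups of three digits, e.g. "405-801-000".
DSA_CODE = re.compile(r"^(\d{3})-(\d{3})-(\d{3})$")


def classify_dsa_result(code):
    """Classify a DSA result code by its middle group (inferred pattern only)."""
    match = DSA_CODE.match(code.strip())
    if match is None:
        raise ValueError(f"not a recognised DSA result code: {code!r}")
    middle = match.group(2)
    if middle == "000":
        return "passed"
    if middle.startswith("8"):
        return "aborted"
    if middle.startswith("9"):
        return "failed"
    # Anything else is outside the pattern seen in this listing.
    return "unknown"
```

A result of "unknown" means the code falls outside the observed pattern, so look it up in the full guide rather than guessing.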

  • Page 340

    405-801-000 brcm:testmiiregisters test aborted explanation: the mii register test was canceled. Severity: warning serviceable: no recoverable: no automatically notify support: no 405-802-000 brcm:testeeprom test aborted explanation: the eeprom test was canceled. Severity: warning serviceable: no rec...

  • Page 341

    405-902-000 brcm:testeeprom test failed explanation: a failure was detected while testing non-volatile ram. Severity: error serviceable: yes recoverable: no automatically notify support: no user response: complete the following steps: 1. Check component firmware level and upgrade if necessary. The i...

  • Page 342

    In the dsa diagnostic event log within the firmware/vpd section for this component. 2. Rerun the test. 3. If failure remains, refer to "troubleshooting by symptom" in the system "installation and service guide" for the next corrective action. Dsa brocade test results the following messages can resul...

  • Page 343

    Explanation: the test was canceled. Severity: warning serviceable: no recoverable: no automatically notify support: no 218-804-000 brocade:externalethloopbacktest aborted explanation: the test was canceled. Severity: warning serviceable: no recoverable: no automatically notify support: no 218-805-00...

  • Page 344

    218-904-000 brocade:externalethloopbacktest failed explanation: a failure was detected during the loopback test. Severity: error serviceable: yes recoverable: no automatically notify support: no user response: complete the following steps: 1. Check or replace sfp/cable. 2. Rerun the test. 3. Verify ...

  • Page 345

    6. If failure remains, refer to "troubleshooting by symptom" in the system "installation and service guide" for the next corrective action. Dsa cpu stress test results the following messages can result when you run the cpu stress test. 089-000-000 cpu stress test passed explanation: cpu stress test ...

  • Page 346

    2. Make sure that the dsa diagnostic code is at the latest level. 3. Run the test again. 4. Check system firmware level and upgrade if necessary. The installed firmware level can be found in the dsa diagnostic event log within the firmware/vpd section for this component. 5. Run the test again. 6. If...

  • Page 347

    Severity: error serviceable: yes recoverable: no automatically notify support: no user response: complete the following steps: 1. Check component firmware level and upgrade if necessary. The installed firmware level can be found in the dsa diagnostic event log within the firmware/vpd section for thi...

  • Page 348

    5. If the problem remains, contact your technical-service representative. Dsa hard drive test results the following messages can result when you run the hard drive test. 217-000-000 hdd test passed explanation: hdd stress test passed. Severity: event serviceable: no recoverable: no automatically not...

  • Page 349

    406-800-000 ianet:registers test aborted explanation: registers test was canceled. Severity: warning serviceable: no recoverable: no automatically notify support: no 406-801-000 ianet:eeprom test aborted explanation: eeprom test was canceled. Severity: warning serviceable: no recoverable: no automat...

  • Page 350

    Recoverable: no automatically notify support: no user response: complete the following steps: 1. Check component firmware level and upgrade if necessary. The installed firmware level can be found in the dsa diagnostic event log within the firmware/vpd section for this component. 2. Rerun the test. 3...

  • Page 351

    Serviceable: no recoverable: no automatically notify support: no 408-800-000 mlnx:mlnx_diagnostictestethernetport test aborted explanation: port test was canceled. Severity: warning serviceable: no recoverable: no automatically notify support: no 408-801-000 mlnx:mlnx_diagnostictestibport test abort...

  • Page 352

    Serviceable: no recoverable: no automatically notify support: no 201-811-000 standalone memory test aborted explanation: unable to locate smbios key "_sm_". Severity: warning serviceable: no recoverable: no automatically notify support: no user response: complete the following steps: 1. Perform the ...

  • Page 353

    201-812-001 standalone memory test aborted explanation: memory test is not supported for this system. Severity: warning serviceable: no recoverable: no automatically notify support: no user response: complete the following steps: 1. Perform the actions mentioned one at a time and try the test after ...

  • Page 354

    Recoverable: no automatically notify support: no user response: complete the following steps: 1. Perform the actions mentioned one at a time and try the test after each action. 2. If the problem remains, contact your technical-service representative. 3. Turn off the system and disconnect it from pow...

  • Page 355

    2. If the problem remains, contact your technical-service representative. 3. Turn off the system and disconnect it from power. Wait for 45 seconds. Reseat dimm(s). Reconnect it to power. 4. Make sure that dsa and bios/uefi are at the latest level. 201-815-000 standalone memory test aborted explanati...

  • Page 356

    201-816-001 standalone memory test aborted explanation: program error with full memory menu option selection. Severity: warning serviceable: no recoverable: no automatically notify support: no user response: complete the following steps: 1. Perform the actions mentioned one at a time and try the tes...

  • Page 357

    Automatically notify support: no user response: complete the following steps: 1. Perform the actions mentioned one at a time and try the test after each action. 2. If the problem remains, contact your technical-service representative. 3. Turn off the system and disconnect it from power. Wait for 45 ...

  • Page 358

    2. If the problem remains, contact your technical-service representative. 3. Turn off the system and disconnect it from power. Wait for 45 seconds. Reseat dimm(s). Reconnect it to power. 4. Make sure that dsa and bios/uefi are at the latest level. 201-820-000 standalone memory test aborted explanati...

  • Page 359

    201-821-001 standalone memory test aborted explanation: variable range mtrr registers are larger than fixed range mtrr registers. Severity: warning serviceable: no recoverable: no automatically notify support: no user response: complete the following steps: 1. Perform the actions mentioned one at a ...

  • Page 360

    Automatically notify support: no user response: complete the following steps: 1. Perform the actions mentioned one at a time and try the test after each action. 2. If the problem remains, contact your technical-service representative. 3. Turn off the system and disconnect it from power. Wait for 45 ...

  • Page 361

    1. Perform the actions mentioned one at a time and try the test after each action. 2. If the problem remains, contact your technical-service representative. 3. Turn off the system and disconnect it from power. Wait for 45 seconds. Reseat dimm(s). Reconnect it to power. 4. Make sure that dsa and bios...

  • Page 362

    2. If the problem remains, contact your technical-service representative. 3. Turn off the system and disconnect it from power. Wait for 45 seconds. Reseat dimm(s). Reconnect it to power. 4. Make sure that dsa and bios/uefi are at the latest level. 201-827-001 standalone memory test aborted explanati...

  • Page 363

    201-844-002 standalone memory test aborted explanation: chipset error: problem in masking msr machine check control mask registers. Severity: warning serviceable: no recoverable: no automatically notify support: no user response: complete the following steps: 1. Perform the actions mentioned one at ...

  • Page 364

    Recoverable: no automatically notify support: no user response: complete the following steps: 1. Perform the actions mentioned one at a time and try the test after each action. 2. If the problem remains, contact your technical-service representative. 3. Turn off the system and disconnect it from pow...

  • Page 365

    3. Turn off the system and disconnect it from power. Wait for 45 seconds. Reseat dimm(s). Reconnect it to power. 4. Make sure that dsa and bios/uefi are at the latest level. 201-860-001 standalone memory test aborted explanation: no oem0 type 1 found. Severity: warning serviceable: no recoverable: n...

  • Page 366

    Recoverable: no automatically notify support: no user response: complete the following steps: 1. Perform the actions mentioned one at a time and try the test after each action. 2. If the problem remains, contact your technical-service representative. 3. Turn off the system and disconnect it from pow...

  • Page 367

    3. Turn off the system and disconnect it from power. Wait for 45 seconds. Reseat dimm(s). Reconnect it to power. 4. Make sure that dsa and bios/uefi are at the latest level. 201-863-000 standalone memory test aborted explanation: no ibmerror key in oem1 structure. Severity: warning serviceable: no r...

  • Page 368

    Recoverable: no automatically notify support: no user response: complete the following steps: 1. Perform the actions mentioned one at a time and try the test after each action. 2. If the problem remains, contact your technical-service representative. 3. Turn off the system and disconnect it from pow...

  • Page 369

    3. Turn off the system and disconnect it from power. Wait for 45 seconds. Reseat dimm(s). Reconnect it to power. 4. Make sure that dsa and bios/uefi are at the latest level. 201-865-003 standalone memory test aborted explanation: no xsecsrat key in oem0 structure. Severity: warning serviceable: no r...

  • Page 370

    201-867-000 standalone memory test aborted explanation: efi/sal: buffer not allocated. Severity: warning serviceable: no recoverable: no automatically notify support: no user response: complete the following steps: 1. Perform the actions mentioned one at a time and try the test after each action. 2....

  • Page 371

    User response: complete the following steps: 1. Perform the actions mentioned one at a time and try the test after each action. 2. If the problem remains, contact your technical-service representative. 3. Turn off the system and disconnect it from power. Wait for 45 seconds. Reseat dimm(s). Reconnec...

  • Page 372

    2. If the problem remains, contact your technical-service representative. 3. Turn off the system and disconnect it from power. Wait for 45 seconds. Reseat dimm(s). Reconnect it to power. 4. Make sure that dsa and bios/uefi are at the latest level. 201-869-003 standalone memory test aborted explanati...

  • Page 373

    201-871-000 standalone memory test aborted explanation: data mis-compare encountered. Severity: warning serviceable: no recoverable: no automatically notify support: no user response: complete the following steps: 1. Perform the actions mentioned one at a time and try the test after each action. 2. ...

  • Page 374

    User response: complete the following steps: 1. Perform the actions mentioned one at a time and try the test after each action. 2. If the problem remains, contact your technical-service representative. 3. Turn off the system and disconnect it from power. Wait for 45 seconds. Reseat dimm(s). Reconnec...

  • Page 375

    2. If the problem remains, contact your technical-service representative. 3. Turn off the system and disconnect it from power. Wait for 45 seconds. Reseat dimm(s). Reconnect it to power. 4. Make sure that dsa and bios/uefi are at the latest level. 201-878-003 standalone memory test aborted explanati...

  • Page 376

    3. Turn off the system and disconnect it from power. Wait for 45 seconds. Reseat dimm(s). Reconnect it to power. 4. Make sure that dsa and bios/uefi are at the latest level. 201-886-000 standalone memory test aborted explanation: memory upper limit is less than 16 mbytes. Severity: warning serviceab...

  • Page 377

    201-899-002 standalone memory test aborted explanation: memory diagnostics test aborted by user. Severity: warning serviceable: no recoverable: no automatically notify support: no 201-899-003 standalone memory test aborted explanation: memory diagnostics test aborted by user. Severity: warning servi...

  • Page 378

    1. Perform the actions mentioned one at a time and try the test after each action. 2. If the problem remains, contact your technical-service representative. 3. Turn off the system and disconnect it from power. Wait for 45 seconds. Reseat dimm(s). Reconnect it to power. 4. Make sure that dsa and bios...

  • Page 379

    Serviceable: yes recoverable: no automatically notify support: no user response: complete the following steps: 1. Ensure that all memory is enabled by checking the "available system memory" in the "resource utilization" section of the dsa diagnostic event log. 2. If necessary, access the configurati...

  • Page 380

    Automatically notify support: no 409-805-000 nvidia::diagnosticserviceprovider::matrix test aborted explanation: nvidia gpu matrix test was canceled. Severity: warning serviceable: no recoverable: no automatically notify support: no 409-806-000 nvidia::diagnosticserviceprovider::binomial test aborte...

  • Page 381

    4. Rerun the diagnostics, using the same gpu, on a system that is known to be working. A variety of system issues can cause diagnostic failure. 5. If the problem remains, contact your ibm technical-support representative. 409-906-000 nvidia::diagnosticserviceprovider::binomial test failed explanation:...

  • Page 382

    215-804-000 optical drive test aborted explanation: optical drive test aborted. The media tray is open. Severity: warning serviceable: yes recoverable: no automatically notify support: no user response: complete the following steps: 1. Close the media tray and wait for 15 seconds for the media to be...

  • Page 383

    166-801-001 imm i2c test aborted explanation: imm returned incorrect response length. Severity: warning serviceable: yes recoverable: no automatically notify support: no user response: perform the actions mentioned one at a time and try the test after each action: 1. Turn off the system and disconne...

  • Page 384

    166-808-001 imm i2c test aborted explanation: reservation canceled or invalid reservation id. Severity: warning serviceable: yes recoverable: no automatically notify support: no user response: perform the actions mentioned one at a time and try the test after each action: 1. Turn off the system and ...

  • Page 385

    2. Make sure that dsa and bmc/imm are at the latest level. 166-815-001 imm i2c test aborted explanation: invalid data field in request. Severity: warning serviceable: yes recoverable: no automatically notify support: no user response: perform the actions mentioned one at a time and try the test afte...

  • Page 386

    Recoverable: no automatically notify support: no user response: perform the actions mentioned one at a time and try the test after each action: 1. Turn off the system and disconnect it from power. Wait for 45 seconds. Reconnect it to power. 2. Make sure that dsa and bmc/imm are at the latest level. ...

  • Page 387

    A time and try the test after each action: 1. Turn off the system and disconnect it from power. Wait for 45 seconds. Reconnect it to power. 2. Make sure that dsa and bmc/imm are at the latest level. 3. Run the test again. 4. If failure remains, refer to "troubleshooting by symptom" in the system "in...

  • Page 388

    5. If the failure remains, refer to "troubleshooting by symptom" in the system "installation and service guide" for the next corrective action. 264-903-000 tape test failed explanation: tape test failed. Media is not detected. Severity: error serviceable: yes recoverable: no automatically notify sup...

  • Page 389

    Automatically notify support: no user response: complete the following steps: 1. Clean the tape drive using the appropriate cleaning media and install new media. 264-908-000 tape test failed explanation: an error was found in getting tape capacity. Severity: error serviceable: yes recoverable: no au...

  • Page 391

    Appendix d. Getting help and technical assistance if you need help, service, or technical assistance or just want more information about ibm products, you will find a wide variety of sources available from ibm to assist you. Use this information to obtain additional information about ibm and ibm pro...

  • Page 392

    You can solve many problems without outside assistance by following the troubleshooting procedures that ibm provides in the online help or in the documentation that is provided with your ibm product. The documentation that comes with ibm systems also describes the diagnostic tests that you can perfo...

  • Page 393

    To create a personalized support web page, go to http://www.ibm.com/support/mynotifications. From this personalized page, you can subscribe to weekly email notifications about new technical documents, search for information and downloads, and access various administrative services. Software service...

  • Page 395

    Notices this information was developed for products and services offered in the U.S.A. Ibm may not offer the products, services, or features discussed in this document in other countries. Consult your local ibm representative for information on the products and services currently available in your a...

  • Page 396

    Adobe and postscript are either registered trademarks or trademarks of adobe systems incorporated in the united states and/or other countries. Cell broadband engine is a trademark of sony computer entertainment, inc., in the united states, other countries, or both and is used under license therefrom...

  • Page 397

    Ibm makes no representation or warranties regarding non-ibm products and services that are serverproven®, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. These products are offered and warranted solely by third parties. Ibm makes no repr...

  • Page 398

    Table 21. Limits for particulates and gases (continued) contaminant limits 1 ashrae 52.2-2008 - method of testing general ventilation air-cleaning devices for removal efficiency by particle size. Atlanta: american society of heating, refrigerating and air-conditioning engineers, inc. 2 the deliques...

  • Page 399

    Communications. Operation of this equipment in a residential area is likely to cause harmful interference, in which case the user will be required to correct the interference at his own expense. Properly shielded and grounded cables and connectors must be used in order to meet fcc emission limits. I...

  • Page 400

    Germany class a statement deutschsprachiger eu hinweis: hinweis für geräte der klasse a eu-richtlinie zur elektromagnetischen verträglichkeit dieses produkt entspricht den schutzanforderungen der eu-richtlinie 2004/108/eg zur angleichung der rechtsvorschriften über die elektromagnetische verträglich...

  • Page 401

    Japan vcci class a statement this is a class a product based on the standard of the voluntary control council for interference (vcci). If this equipment is used in a domestic environment, radio interference may occur, in which case the user may be required to take corrective actions. Japan electroni...

  • Page 402

    Taiwan class a compliance statement

  • Page 403

    German ordinance for work gloss statement the product is not suitable for use with visual display work place devices according to clause 2 of the german ordinance for work with visual display units. Das produkt ist nicht für den einsatz an bildschirmarbeitsplätzen im sinne § 2 der bildschirmarbeitsv...

  • Page 405

    Index a abr, automatic boot recovery 82 ac power-supply leds 53 ac power-supply leds 53 accessible documentation 380 activity led 13 adapter/gpu adapter removing 159 replacing 160 administrator password 32 air baffle removing 113 replacing 114 asm event log 57 assertion event, system-event log 55 as...

  • Page 406

    Error symptoms (continued) mouse 64 network connection 70 optional devices 70 power 72 serial port 73 serverguide 74 software 75 usb port 75 usb-device 64 video 68, 75 errors format, dsa code 60 ethernet controller 77 ethernet controller 8 ethernet controller configuration 22 european union emc dire...

  • Page 407

    Leds (continued) activity 13 check log 13 locator 13 on the system board 20 power 13 power-supply 53 system error 13 legacy operating system requirement 25 load-sharing power throttling 8 locator led 13 logging 36 m major components gpu tray 12 storage tray 11 system board 10 memory specifications 5...

  • Page 408

    Replacing (continued) pci riser filler 117 pci riser-cage assembly 155, 157 power paddle card on to the gpu tray 129 raid adapter battery holder 115 structural parts 110 tier 1 crus 125 tier 1 crus, replacement 125 tier 2 crus 165 requirements hardware 3 software 3 returning component 96 device 96 r...

  • Page 410

    Part number: 00kc216 printed in usa (1p) p/n: 00kc216.