Is Your HP Blade System Healthy?
Only 20% of downtime results from hardware failure
such as a broken processor or a failed disc.
The remaining 80% of unplanned downtime is
caused by problems managing the environment or
oversights. This means that the majority of downtime
is ultimately preventable.
Common causes of downtime in Blade environments
are:
- Out-of-date software, drivers, BIOS, patches and
firmware
- Software compatibility issues
- Suboptimal configurations & configuration issues
- Service management & procedural issues
HP's Blade System Health Check has been designed
specifically for blade environments to help
address some of the common issues behind the
80% of downtime that is preventable, giving you
greater reliability.
More and more companies are purchasing Blades as a means of increasing asset utilisation and leveraging the cost advantages that owning fewer assets brings.
Significant downtime in blade environments is caused by the added complexity that blades and virtualisation bring. Moving to a bladed environment brings a new set of challenges for IT departments, because the technology contained within the chassis is highly interdependent. Changing one parameter within a blade enclosure can have unforeseen consequences in another. A blade enclosure contains more than 40 firmware, software and microprogram components, all of which need to be compatible with each other. Changing one often requires analysis of the impact on all the others. Upgrades and part replacements therefore require thorough planning and testing, following a disciplined process.
Typical examples of preventable blade system failures include:
Media company – implemented a firmware upgrade without considering the interdependencies. This led to a mismatch of firmware levels, causing 70 blade servers to become intermittently unavailable and affecting call centre operations at peak time.
Financial company – a blade server failed and was replaced with a spare by the customer. The customer did not have a firmware/driver update policy and the replacement blade was at a higher revision of firmware and was therefore incompatible with the existing infrastructure. This led to a severely degraded service whilst updates were retrospectively implemented to allow the blade infrastructure to work properly.
Firmware and Patches
Are you sure that your Blade System is running all current versions of firmware? Do you find it difficult to determine the most appropriate revisions for your specific environment?
HP estimates that approximately 30% of all unplanned downtime in Blade System environments is caused, directly or indirectly, by out-of-date patches and firmware. Even the apparently straightforward matter of swapping one blade for another can therefore lead to reduced performance or an outage. Replacement parts often have the latest firmware installed, which may be incompatible with the firmware within the existing enclosure. What seems like a routine replacement or upgrade can have a major impact on system performance, application availability and the smooth running of your business systems. Rectifying outages caused by incompatibility can also be very time-consuming, compounding the problem.
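As an illustration of the kind of check that can catch such a mismatch before a replacement blade goes into service, the sketch below compares the firmware reported by a spare against an agreed baseline for the enclosure. It is a minimal example under stated assumptions, not an HP tool: the component names, version numbers and the functions `parse_version` and `compare_to_baseline` are all hypothetical, and it assumes versions are simple dotted numbers.

```python
from dataclasses import dataclass


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted numeric version such as '2.10' into (2, 10).

    Assumption: versions contain only digits and dots.
    """
    return tuple(int(part) for part in text.split("."))


@dataclass(frozen=True)
class Mismatch:
    component: str
    installed: str   # empty string when the component was not reported
    baseline: str
    direction: str   # "older", "newer" or "missing"


def compare_to_baseline(installed: dict[str, str],
                        baseline: dict[str, str]) -> list[Mismatch]:
    """List every baseline component whose installed version differs."""
    mismatches = []
    for component, wanted in sorted(baseline.items()):
        found = installed.get(component)
        if found is None:
            mismatches.append(Mismatch(component, "", wanted, "missing"))
            continue
        found_v, wanted_v = parse_version(found), parse_version(wanted)
        if found_v == wanted_v:
            continue
        direction = "newer" if found_v > wanted_v else "older"
        mismatches.append(Mismatch(component, found, wanted, direction))
    return mismatches


# Hypothetical baseline agreed for the enclosure, and a spare blade
# that arrived with a later system ROM than the baseline expects.
baseline = {
    "system_rom": "2.10",
    "management_processor": "1.50",
    "network_adapter": "7.4.2",
}
spare_blade = {
    "system_rom": "2.30",
    "management_processor": "1.50",
    "network_adapter": "7.4.2",
}

for m in compare_to_baseline(spare_blade, baseline):
    print(f"{m.component}: installed {m.installed}, "
          f"baseline {m.baseline} ({m.direction})")
```

Reporting "newer" as a mismatch is deliberate here: as the financial company example shows, a part at a higher revision than the rest of the infrastructure can be just as disruptive as one that is out of date.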