These are people who have either engineered airline IT networks or actually worked on British Airways' systems in the past.

What I've heard is a lot of confusion and scepticism at the idea that a local power surge could have wreaked such havoc.
There is also confusion as to why back-up systems didn't do their job.
Only the people in the room know exactly what happened, so these views are based on the information made public, and bucketfuls of IT experience, including at BA.
One put it like this: "BA has two data centres near Heathrow, about a kilometre apart, so how could a power surge affect both?"
Then there are all the fail-safes in place.
The two data centres mirror each other, I'm told, so when one collapses the other should take over.
All the big installations have back-up power. If the mains fails, a UPS (uninterruptible power supply) kicks in. It's basically a big battery that keeps things ticking over until the power comes back on, or a diesel generator is fired up.
This UPS is meant to take the hit from any "surge", so the servers don't have to.
All the big servers and large routers, I'm told, also have dual power supplies fed from different sources.
I'm also told that, certainly a while ago, they used to have regular outages to confirm all the back-up bits were working. And daily inspections of the computer room. There is no reason to think these were stopped.
It's not even clear who was monitoring the system at the crucial time. Was it a contractor? How much experience did they have?
The point is this: certainly up until a while ago, British Airways' IT systems had a variety of safety nets in place to protect them from big dumps of uncontrolled power, and to get things back on their feet quickly if there was any problem. I'm assuming those safety nets are still there, so why did they fail? And did human error play a part in all this?

British Airways chief executive Alex Cruz told me recently that the company has launched an exhaustive investigation into what went wrong, although no-one can say when it will report back, or whether the findings will ever be made public.
If BA wants to repair its reputation, its owner IAG needs to convince the public that making hundreds of IT staff redundant last year did not leave it woefully short of experts who could have fixed the meltdown sooner. And that it won't happen again - at least not on this epic scale.
Mr Cruz was adamant, by the way, that the outsourcing did not contribute in any way to this mess.
BBC