Most services hit by Microsoft Azure outage back online

Microsoft said power was restored to the affected infrastructure in its data centre after temperatures returned to normal operating limits. PHOTO: REUTERS

SINGAPORE - Most Web services that were hit by the outage of Microsoft Azure’s cloud services on Wednesday are back online after power was restored to affected sections of its infrastructure.

Microsoft said on Thursday that its Azure cloud services have broadly recovered after a cooling unit failure at a South-east Asia data centre caused widespread outages across multiple Web services the previous day. It did not specify which data centre was affected, but the computing giant has a data centre in Singapore.

On its website, Microsoft said power was restored to the affected infrastructure in its data centre after temperatures returned to normal operating limits.

The statement added that Azure services gradually recovered as its underlying compute and storage scale units came back online.

A power surge in the South-east Asian region on Wednesday caused some cooling units to go offline, resulting in increased temperatures in that data centre, said Microsoft. The company “proactively powered down a number of compute and storage units to avoid damage to hardware and reduce cooling system load”.

A check by The Straits Times on Thursday afternoon found that the websites of Esplanade and Nanyang Technological University, which were down on Wednesday, were now accessible.

When contacted, a Central Provident Fund Board spokesman said its engineers had deployed a workaround and that most visitors had been able to access its digital services since noon on Wednesday.

According to an update on EZ-Link’s Facebook page at 2.40pm on Thursday, all services related to SimplyGo ez-link concession cards and EZ-Link Wallet had resumed.

“While we have broadly recovered, a small subset of services is still working on post-recovery checks, and we are closely monitoring the data centre metrics for storage and compute resources to ensure they continue to show as healthy,” the Microsoft statement read. Microsoft added that it would communicate directly with Azure users who are still affected by the outage via the Azure Portal Service Health Alerts.

Such outages of cloud services cannot be entirely prevented despite best efforts, one expert said.

Mr Ian Lim, field chief security officer at cyber-security firm Palo Alto Networks, said: “Cloud platforms and the data centres that host them are built with reliability and redundancy by design. Despite our best efforts, cloud outages are still unavoidable.”

Mr Lim said businesses that use such cloud services should not rely on just one company for their operations, so as to reduce the risk of service outages in the event of a data centre failure. “Using a multi-cloud approach will allow for businesses to have some backup systems so that they can guard against a similar service disruption like on Wednesday,” he said.

“But doing so will mean that they take on the added cost of hiring or training staff so they are well versed in the operations of the various cloud service providers.”
