The University of Texas at Austin

ITS Services Status

RESOLVED: UT-Virtual ESXi Server Crash

Service Restored

Networking and Systems have concluded their review of the UT-V network service outage on Thursday, January 30th, 2014.

Details of the outage, including the timeline, impact, root cause analysis, and lessons learned can be found at: https://wikis.utexas.edu/x/jA95Aw or https://wikis.utexas.edu/display/networking/UT-V+Networking+Failure+Analysis+-+2014-01-30

We apologize for the service interruption and are taking steps to review the system to avoid or reduce the impact of similar problems in the future.


A Root Cause Analysis report from ITS Networking is underway.


Steps have been taken to mitigate the issue until the root cause can be addressed. VM owners have been notified of the second crash via the technical contact email addresses on record for the affected VMs.


At this time, there is no indication of service outages associated with this UT-V Host failure.
Affected VMs have been identified. Owners of affected VMs are being notified.


Another UT-V ESXi host crashed at approximately 9:28pm, likely due to the same issue that was diagnosed this afternoon. VMs hosted by the server have been automatically restarted on other hosts. Affected VM owners will be contacted, but they should also check their services for potential impact.


The root cause of the server crash has been diagnosed.


Affected VM owners are being identified and contacted.


The server which crashed was in the Commodity2 cluster.
57 VMs were affected.


At noon today (Thu. Jan 30) one of our UT-Virtual ESXi servers crashed. Virtual Machines on the affected server went down with the server, and the High Availability component restarted them on other servers in the cluster. Other ESXi servers are unaffected. We are compiling a list of affected VMs and opening a ticket with VMware to determine the cause of the crash.

Last updated on Mar 13, 2014 11:26 PM