The University of Texas at Austin

ITS Services Status


RESOLVED: Central Storage Failure
Service Restored

The DDN array is stabilized.


Vendor support continues Root Cause Analysis.
Replacement parts have arrived, and the vendor support and ITS-Storage teams are determining the optimal maintenance window for part replacement.


Thursday June 27 2013 Morning Update:
 
Root Cause Analysis for the storage outage continues.
Vendor is coordinating replacement parts and installation for storage device components.
Next Update: Monday July 1, 2013


Root Cause Analysis for the storage outage is in process.

Next update Thursday morning.


1:15pm update
• UT-V administrators have completed analysis of impacted VMs.
• VM Owners have been contacted by UT-V admins and are asked to verify that their VMs are operating correctly
• Root Cause Analysis for the storage outage is in process

Next Update: 3:30pm


11:35am update
• Restoration for impacted services is complete
• UT-V Administrators continue to analyze the impacted VMs to verify functionality
• VM Owners have been contacted by UT-V admins and are asked to verify that their VMs are operating correctly
• Root Cause Analysis for the storage outage is in process

Updates will now transition to 1-hour intervals.


11:05am Update
            • The Central Storage device that failed is now running at full capacity.
            • Log review is underway with vendor support to determine a Root Cause for the failure event.
            • Health Check of impacted VMs and services continues, with the following updates:
                    Austin Disk: All-clear
                    Wikis: All-clear
                    Blogs: All-clear
                    Splunk: All-clear
                    DocRepo: All-clear
                    WorldSpace: Down.  Mitigation underway.
                    UT Lists: All-clear.
                    Webspace:  Health Check complete. In process of transparently restarting service.


10:20am Update
            The Central Storage device that failed is running in a degraded state. 
            Vendor support is on-site, working with ITS-Systems-Storage staff in the University Data Center.
            Health Check of impacted VMs and services continues, with the following updates:
                    Austin Disk:  Up and Running.  Health Check continues.
                    MySQL: Up and Running.  Health Check continues.
                    UTEBS: Up and Running.  Health Check continues.
                    UTLists: Down.  Mitigation underway.
                    Webspace: Up and Running.  Health Check continues.


9:45 am Update
          Health Check of the impacted VMs continues. Owners of the impacted VMs have been contacted and asked to verify the health of their UT-V VMs.
          Known impacted major services include:
                 
                Austin Disk
                Authentication for O365, both Fat Client (Outlook) and Outlook Web Access. (Note that AEMS was not impacted.)
                Wikis
                Blogs
                MySQL
                Doc Repo
                ID Photos
                Whips
                WebSpace
        The service outages from this morning all appear to be related to the central storage outage, and NOT a network issue.


Austin Disk Services was affected by the storage issue. Shares appear responsive now but diagnostics are ongoing.


9:26am Update:  Multiple VMs running on the UT-V service are impacted by the Central Storage outage.  UT-V is reconnected to the storage and triage of VM health is underway.


A DDN storage controller crashed at 8:50 AM.  Multiple services are affected.

Systems staff are investigating.


Last updated on Jul 11, 2013 5:03 PM
