ITS Services Status
UPDATE 8/20/2013 at 4:24 p.m.
ITS Systems has worked with the vendor to identify the root cause of the failure and will be documenting the incident in a failure report to be shared with campus via it-updates and Tech Staff Forums per the ITS Incident Communication Policy.
Systems staff believe they have identified the application that caused excessive processes to be created in the ITS Central RAC Oracle database environment, resulting in service unavailability.
At this time all database schemas in that environment should be available.
ITS Systems staff will continue to monitor the situation.
Oracle support has been contacted and a case has been opened at the highest level of urgency to assist in resolving this problem. The instability on one node has caused instability on other nodes as well. FIS has been affected. Updates will follow at 1:00 or sooner, depending on activities and resolution.
ITS Systems continues to work on resolving the problem.
36 databases are available and responding (never impacted).
14 databases are unavailable (on the overloaded node).
Automated monitoring alerted Systems staff that a node within the ITS Central RAC service became unavailable at 10:34. The Oracle environment has a connection limit set to stop denial-of-service attacks. This limit has not been exceeded in the past; however, Fall activities may have caused it to be exceeded, resulting in database unavailability. The Oracle DBAs are increasing the connection limit in the Central RAC pool and monitoring it. This will require a reload of the Oracle databases on all 3 nodes in the cluster, which may result in additional outages.
An update will be provided at 11:30 or earlier, depending on problem resolution.
At 10:34 one of the clustered nodes in the ITS Oracle Central RAC became unavailable. Systems staff are investigating. An update will be issued at 10:45.
Last updated on Aug 20, 2013 4:26 PM