Last updated at 2017-10-11 05:35:18

Service Outage Report - 20171011

Summary

This outage had several overlapping causes, which I found while checking the mudfish servers. The main culprits were the DB server and the authentication server: many new users have recently started using mudfish, and the number of users keeps increasing. The DB server was heavily overloaded yesterday and began handling requests with delays. Those delays in turn led to a TIME-WAIT issue on the authentication server. :-(

Outage Time

  • 2017-10-10 9:30 PM ~ 2017-10-11 1:00 AM (KST)
    • The outage lasted around three and a half hours.
  • 2017-10-11 12:00 PM ~ 2017-10-11 1:00 PM (KST)
    • We performed preventive maintenance (PM) during this window to keep the issue from recurring.

Done

  • Our DB server has been scaled up to handle more simultaneous requests.
  • The authentication server has been patched to reuse TIME-WAIT sockets more quickly.

Service Outage Report - 20171011 (last edited 2017-10-11 05:35:18 by loxch)