Tuesday, April 14, 2009

A major outage of the ooma service

About the April 13th Service Outage

dennis.peng | April 13, 2009 @ 8:30pm

A major outage of the ooma service began today (April 13th) at approximately 11AM (Pacific time). This outage affected all ooma customers and was triggered by Internet connectivity issues at our upstream service provider. While we do not yet have all the details about the problems that affected the Internet connection, we do know that they were not isolated to ooma, but they did effectively cut off the ooma service from the outside world.

Between 2PM and 3PM, Internet connectivity was slowly restored to our service. However, the flood of ooma Hubs coming back online created an immense load on our provisioning systems. We rushed to add capacity to the system, but the nature of the network outage had interfered with the system’s ability to recover by itself. At 2:30PM, we began throttling Hub connections to our servers. This allowed the system to settle down and work through the backlog of requests. As the systems stabilized, we slowly began opening up blocks of users to connect to the ooma service.
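The staged reconnection described above (throttle everyone, then open up blocks of users as the system stabilizes) can be sketched roughly as follows. This is only an illustration of the general technique, not ooma's actual implementation: the names `block_for`, `AdmissionGate`, and `NUM_BLOCKS`, the use of a hash to assign Hubs to blocks, and the block count are all assumptions made for the example.

```python
import hashlib

NUM_BLOCKS = 10  # hypothetical: how many blocks the Hub population is split into


def block_for(hub_id: str, num_blocks: int = NUM_BLOCKS) -> int:
    """Map a Hub ID to a stable block number using a hash of the ID."""
    digest = hashlib.sha256(hub_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_blocks


class AdmissionGate:
    """Admit only Hubs whose block has been opened so far."""

    def __init__(self, num_blocks: int = NUM_BLOCKS) -> None:
        self.num_blocks = num_blocks
        self.open_blocks = 0  # start fully throttled: no block may connect

    def open_next_block(self) -> None:
        """Open one more block, e.g. once the request backlog has drained."""
        if self.open_blocks < self.num_blocks:
            self.open_blocks += 1

    def allows(self, hub_id: str) -> bool:
        """Return True if this Hub's block is currently allowed to connect."""
        return block_for(hub_id, self.num_blocks) < self.open_blocks
```

In this sketch an operator (or a monitoring loop) would call `open_next_block()` each time load returns to a safe level, so reconnecting devices arrive in waves rather than all at once.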

As of 5PM, most service has been restored. Some Hubs may not recover automatically right away - if your ooma service is still down, please reboot your ooma Hub. Unplug power from the back of the device, wait one minute, and then plug the power back in. The ooma Hub may need to download a software update, but it should go into service shortly thereafter.

Rest assured that we are taking this outage very seriously. Discussions have already started on how to make the service resilient to a similar event in the future. ooma currently has one data center, located on the West Coast. We had planned to light up a second data center in the Midwest or on the East Coast this year, and this outage has served as a stark reminder for us to get moving on that. This has also served as a good opportunity for us to re-evaluate our contingency and business continuity plans.

We know that phone service is a critical function that everyone relies on. We apologize for the outage this afternoon and thank you for your patience as we work through some growing pains.

---

This is taken from Betanews:

Participants in Ooma's ongoing Twitter discussion yesterday blamed the service outage on a bigger service failure at Internap, a data center, co-location, and content delivery services provider based in Atlanta. They said the outage affected multiple VoIP services and other major businesses, including Google's Gmail and RIM's BlackBerry service. Indeed, BlackBerry users did report a service backlog during roughly the same timeframe yesterday, though no mail ended up being lost. Internap has not made any public statements regarding a service outage, and Betanews has contacted Internap for clarification.
