Outages


Websites (other than wiki and gallery) and mail were out a couple times this afternoon. The parent Xen host crashed twice and was rebooted by RimuHosting. Sigh. Hopefully it’ll stay up this time.

Update 12-25: I must have jinxed it. Host continued to have issues and was out overnight. (Sleigh riding with Santa?) Seems to be back up now after a Xen upgrade.

Another four months, another outage. Comcast was down for a couple hours, presumably due to snow. Gallery and wiki, which are hosted in the basement on carrot and tomato, were inaccessible.

Snowing Morning

There was also a gallery and wiki outage about two weeks ago due to an unexpected IP address change.

As usual, mail, blogs, and other websites were unaffected. They’re hosted on kiwi, which is located in a data center in Dallas.

I’ll be moving the servers rhubarb, carrot, and tomato back down to the basement today (basement renovation is nearly done!). Consequently the Gallery and Wiki will be down for a few hours.

Mail, blogs, and other websites will be unaffected. They’re hosted on kiwi, which is located in a data center in Dallas.

Update 7:35 p.m. PDT: Back online!

While we were out of town a power loss took down the gallery and wiki. They’re on UPSes but apparently not big enough ones!

A disk failure crashed carrot around 6:00 PDT. Everything is back up as of 16:30 PDT and the RAID array is rebuilding in the background.

I’ll be moving the servers rhubarb, carrot, and tomato today (basement renovation!). Consequently the Gallery and Wiki will be down for a few hours.

Mail, blogs, and other websites will be unaffected. They’re hosted on kiwi, which is located in a data center in Dallas.

Update 1:30 PDT: Move complete!

RimuHosting is moving our main server, kiwi, to a different cabinet on Monday, January 21st @ 4 PM PST. Services will be down for approximately half an hour.

The Gallery and Wiki are unaffected. Inbound email will be accepted but not delivered until the outage is over.

Update: Maintenance was delayed by 24 hours.

The main server, kiwi, now has a little more than twice the memory it used to. This should improve website responsiveness and put an end to the sporadic outages of the email spam filter. The upgrade involved a few reboots over the weekend.

Gallery and Wiki were down for ~10 hours. The usual drill: Comcast changed my IP address and everything hosted in the basement became inaccessible. I didn’t notice right away because I had forgotten to update the health checks to specifically test the Gallery and Wiki now that they’re hosted separately from everything else. Then again, Esmae keeps us so busy I probably wouldn’t have noticed the alert emails pouring in…
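For illustration, here is a minimal sketch of a health check that tests each service on its own, so an outage of just the Gallery or Wiki is not masked by everything else being up. The URLs, the `SERVICES` table, and the function names are all hypothetical; the actual checks aren't described here.

```python
from typing import Callable, Dict, List
from urllib.request import urlopen

# Hypothetical service names and URLs, for illustration only.
SERVICES: Dict[str, str] = {
    "gallery": "http://example.org/gallery/",
    "wiki": "http://example.org/wiki/",
    "blog": "http://example.org/blog/",
}


def http_ok(url: str, timeout: float = 10.0) -> bool:
    """Return True if the URL answers with a 2xx status within the timeout."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # Covers connection failures, timeouts, and HTTP error statuses.
        return False


def failed_services(
    services: Dict[str, str],
    check: Callable[[str], bool] = http_ok,
) -> List[str]:
    """Check every service separately and return the names that failed."""
    return sorted(name for name, url in services.items() if not check(url))


if __name__ == "__main__":
    down = failed_services(SERVICES)
    if down:
        print("DOWN: " + ", ".join(down))
```

The point of the design is that each separately hosted service gets its own entry, so moving one to a different machine means updating one line rather than remembering to add a new check.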

On July 25th rhubarb, the firewall machine in the basement, had a spectacular hard disk crash. It sounded like a cat fight and startled us out of bed! 8-O

Rhubarb kept routing traffic, so I was lazy and didn’t get around to rebuilding on a new disk until today. Carrot, which hosts the Gallery and the Wiki, was inaccessible during rhubarb’s rebuild. Kiwi, which hosts everything else and lives in a data center in Texas, was unaffected.

The gap in the traffic graph below reflects the time during which rhubarb had no working hard disk and couldn’t record log files. If you had been attempting to hack into my network, that would have been a good time to do so surreptitiously!

(Traffic graph: red-month.png)

All services are now back up and running. I revived carrot (that’s the server in the basement). Everything that couldn’t be migrated to kiwi (that’s the new hosted server) is once again running on carrot. This includes the gallery and all Tomcat webapps (Wiki, Calendar).
I had liked the idea of no longer maintaining a server in the house, so I’m still looking into ways of migrating those.

While reviving carrot, I learned that there was no disk corruption. Carrot had long ago become unbootable and I just never noticed! In September 2006 the boot menu was incorrectly written. I don’t have logs to confirm it, but I assume this was the first time carrot had rebooted since then.

On Friday morning the server in my basement overheated and became completely unresponsive. It failed to properly reboot afterwards, perhaps due to slight corruption of the files needed for booting or a latent misconfiguration. I opted to accelerate the move to the new hosted server rather than try to bring the old server back to life. Mail and static web content were migrated by Saturday March 3rd. Blogs and some smaller features followed the next day. I’ve been ironing out the kinks and bringing remaining pieces back online throughout the week. Still down are:

  • Gallery
  • Web mail
  • Wiki

Web mail will be restored before long. The others will need more time. :-( If you see anything else amiss, please assume I’m unaware of the issue and let me know.

2007-Feb-08 Outage

All services down from 2:45 to 6:15 PST (plus up to fifteen minutes for DNS propagation). Comcast changed my IP address (again!). Faster recovery this time because I set up some monitoring and shortened the DNS timings. But I would rather have stayed in bed. Still looking into options for solving the problem permanently.
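As a rough sketch of why monitoring plus shorter DNS timings speeds up recovery: a monitor notices the new IP address, the DNS record gets updated, and resolvers that cached the old address catch up within one TTL. The function names, the injected callables, and the five-minute check interval below are assumptions for illustration; only the fifteen-minute propagation figure comes from the post above.

```python
from typing import Callable


def check_for_ip_change(
    last_known_ip: str,
    lookup_public_ip: Callable[[], str],
    update_dns: Callable[[str], None],
) -> str:
    """Compare the current public IP with the last known one.

    Both callables are hypothetical stand-ins: one asks some external
    source for the current address, the other pushes a new DNS record.
    Returns the current IP so the caller can remember it for next time.
    """
    current_ip = lookup_public_ip()
    if current_ip != last_known_ip:
        update_dns(current_ip)
    return current_ip


def worst_case_stale_minutes(check_interval_minutes: int, dns_ttl_minutes: int) -> int:
    """Upper bound on how long clients may keep using the old address.

    Worst case: the IP changes just after a check, so detection takes a
    full interval, and a resolver cached the old record just before the
    update, so it keeps it for a full TTL. Time spent applying the DNS
    update itself is ignored here.
    """
    return check_interval_minutes + dns_ttl_minutes


if __name__ == "__main__":
    # Assumed 5-minute check interval with the 15-minute DNS timing.
    print(worst_case_stale_minutes(5, 15))
```

This only bounds the automated part of the recovery. If the DNS update is done by hand, the time it takes to wake up and respond to the alert dominates.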

2007-Feb-06 Outage

All services down from 1:45 to 17:06 PST (plus up to two hours for DNS propagation). Comcast changed my IP address (again). It looks like they offer static IPs now; I’ll look into getting one.

http/https (web) services down from 7:28 to 9:07 PDT. Same as last week.

Update 2006-Jun-07: As a work-around, Tomcat is now being bounced on an automatic basis. This is keeping Apache stable at the price of a few minutes of unresponsiveness each night for the Java webapps (Calendar, Wiki, etc.). This was implemented in early May and seems to be working.
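A scheduled bounce like this can be sketched in a few lines. The stop and start commands, the pause length, and the idea of running it nightly from a scheduler such as cron are assumptions; the actual mechanism isn't described above.

```python
import subprocess
import time
from typing import Callable, List

# Hypothetical commands; the real service script path may differ.
STOP_CMD: List[str] = ["/etc/init.d/tomcat", "stop"]
START_CMD: List[str] = ["/etc/init.d/tomcat", "start"]


def bounce_tomcat(
    run: Callable[[List[str]], object] = subprocess.check_call,
    pause_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Stop Tomcat, wait briefly so it can shut down, then start it again.

    With the default runner, subprocess.check_call raises
    CalledProcessError if a command exits non-zero, so a failed stop
    is reported instead of being silently followed by a start.
    """
    run(STOP_CMD)
    sleep(pause_seconds)
    run(START_CMD)


if __name__ == "__main__":
    bounce_tomcat()
```

The webapps are unavailable between the stop and the end of Tomcat's startup, which matches the few minutes of nightly unresponsiveness mentioned above.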

2006-Apr-28 Outage

http/https (web) services down from 12:24 to 20:53 PDT. Somehow Apache’s getting completely tied up waiting for Tomcat. Added timeouts and periodic connection recycling. This happened once earlier this year and I forgot to log it.