Black Wednesday

RSS feed from Michal Frackowiak's blog




05 Mar 2009 09:21

There are weeks when nothing exciting (or fatal) happens. But there are days when so much happens that you start to wonder whether it is really a coincidence. Yesterday was one of those days, and it was not even Friday the 13th. So here is what happened:

  • I stayed at home; Lukasz was at the office. All of a sudden the network went down. Later we learned that half of our city (the part that gets its internet from TP S.A.) had problems.
  • Later Piotr called to say that our office server was down and could not boot up. One of the disks in the RAID 10 array had failed, and for some reason GRUB could not boot. It booted later after Piotr did some magic; now we just need to replace one drive ASAP.
  • At 15:30 local time I got an alert email that Wikidot.com was down. I immediately tried to log in to the server: nothing. Ping: yes, alive. But all services were down.
  • After a few minutes we knew we had to act. Piotr started reassigning the IP addresses of the web server to a backup server. It failed. It looks like the router could not handle this in real time this time.
  • Restarting the main server did not help. We had a similar issue some time ago, so we started the rescue mode (the server boots from a rescue Linux image; this is largely automated by SoftLayer). The server came up. A year ago, what prevented the system from booting was a forced fsck on one of the drives, which required a key press or so (as told by the SoftLayer support team). So we started disk checks. And this took almost an hour! S#*t!
  • Meanwhile a friend called me because his car had broken down just 20 meters from our parking lot and he could not move it, so I went to help him.
  • The server came back up, and everything was back to normal. Situation under control.

I am not afraid of fatal Fridays any more. I fear Wednesdays.

