09 Mar 2009 10:38
So, as you have probably already read in my previous blog post, we are thinking about redesigning the Wikidot infrastructure to make it more failure-proof, elastic and efficient. One of the concepts we have is to use one of the "cloud solutions", e.g. Amazon AWS (i.e. EC2 + S3).
So here are a few facts:
- Within 6 months at most we need to reorganize our server infrastructure anyway, simply because demand for Wikidot is growing.
- We want to create a more self-configuring (self-healing) infrastructure, removing as many single points of failure as possible.
- We want to be able to scale just by throwing additional servers into the cluster, possibly within minutes, and be able to throw them away when we do not need them.
- Managing hardware is a pain and we would rather move our effort into higher-level management.
- We want better separation of various services we are running (like daily tasks, log analysis etc.). Some of them deserve separate servers.
Some of the solutions that more or less comply with the above requirements:
- Simply add more hardware: it would work, as it does for most applications, but it is costly and not elastic, and it is difficult to add resources on the fly. That said, server provisioning with SoftLayer is really good; we can have a new server within 2-3 hours.
- Virtualize our own hardware: SoftLayer has an offering here, and we would need to look deeper into it. But is it really what we need? Our hardware would still be a point of failure.
- Use virtualized instances: we could get virtual instances from a 3rd-party provider (is SL going to offer virtual servers?). The problem, however, is that we also need good performance from our boxes, and thus a good degree of control over them.
- Use a "pure cloud solution" like Google App Engine. GAE is out of the question because we would need to rewrite our code, and I am not sure it can run something as complex as Wikidot, with a lot of background services etc.
- Use Amazon EC2: guess what, it looks like an optimal solution.
There is more info about EC2 here. Basically you can rent instances (virtual servers) on a per-hour basis ($0.10 - $0.80 per hour), there is a nice API to manage your instances, storage, IP addresses etc. EC2 deserves a separate article obviously. I can only say that:
- I am using EC2 + S3 + SQS in one of my other projects and it rocks in terms of performance and scalability. A properly-designed application can handle millions of visits per day without much magic.
- Pricing is nice, but that does not necessarily mean we would save any $$ by moving to AWS. You pay only for what you use, with no up-front fees or plans. Good for small startups that grow over time.
- AWS meets most of the requirements listed above.
One of the nice things about EC2 is that you can get a new server within 2-5 minutes, use it as long as you wish, and terminate it. Everything is automated. There are dozens of Linux images available, tons of documentation and support from the community and Amazon itself (this one is paid extra).
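To get a feel for what per-hour pricing means in practice, here is a rough back-of-the-envelope sketch in Python. The function name and the rule that every started hour is billed in full are assumptions made for illustration; only the $0.10 - $0.80 hourly range comes from the figures above, and real bills also include storage and traffic.

```python
import math


def estimate_cost(hourly_rate_usd, instances, hours_running):
    """Rough instance cost in USD.

    Assumption (for illustration only): each started hour is billed
    in full, and storage/bandwidth charges are ignored.
    """
    billed_hours = math.ceil(hours_running)
    return round(hourly_rate_usd * instances * billed_hours, 2)


# A short experiment: two instances at the top quoted rate for 1.5 hours.
short_test = estimate_cost(0.80, 2, 1.5)

# One always-on instance at the lowest quoted rate for a 30-day month.
always_on = estimate_cost(0.10, 1, 24 * 30)

print(short_test, always_on)
```

The same arithmetic shows why the model suits bursty workloads: a server that exists only for a couple of hours costs a few dollars, while one that runs all month is paid for all month.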
Since we already had experience with AWS, we decided to run a couple of tests last Friday, just to get a glimpse of the situation. So what we did was:
- We set up a simple cluster configuration (1 front-end web server + 1 database server)
- We installed Wikidot on it using a fresh database dump.
- We tried to simulate read-only traffic by taking access logs from Wikidot.com and throwing the requests at the test server in parallel.
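The log replay step can be sketched roughly as follows. This is not the script we actually used, just a minimal illustration in Python: it assumes the access log is in the common/combined format (the request appears as "GET /path HTTP/1.1"), it replays only GET requests to keep the traffic read-only, and the helper names, worker count and test host are all hypothetical.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from urllib.error import HTTPError, URLError
from urllib.request import urlopen

# Assumed log format: the request is quoted as "GET /path HTTP/1.1".
REQUEST_RE = re.compile(r'"GET (\S+) HTTP/[\d.]+"')


def extract_get_paths(log_lines):
    """Return the paths of all GET requests found in the log lines."""
    paths = []
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            paths.append(match.group(1))
    return paths


def fetch(base_url, path, timeout=10):
    """Request one path; return the HTTP status, or None on network failure."""
    try:
        with urlopen(base_url + path, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        return error.code  # 4xx/5xx responses still tell us something
    except (URLError, OSError):
        return None


def replay(base_url, paths, workers=20):
    """Send the requests in parallel using a thread pool; keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: fetch(base_url, p), paths))
```

Usage would look something like `replay("http://test-server.example", extract_get_paths(open("access.log")))`, where both the host and the file name are placeholders. Counting the returned status codes and timing the whole run gives a first, crude picture of how the test cluster copes.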
The only thing we are still concerned about is database performance in the virtualized (Xen) environment, over network-attached drives. Although in our tests the database was doing really fine, we need to do more read+modify tests.
Amazon AWS has been getting a lot of attention recently, and the recent changes and future enhancements look really promising.
After performing the tests we terminated the servers — Amazon charged us about $3 ;-)
BTW: This opens one more interesting case for us. SaaS with Wikidot — would you like to get your very own Wikidot installation within minutes, hosted on a virtual server instance? Yes, we know how to do this, and we might automate the whole process someday.