Redis Performance on EC2 (aka weekend project coming)

RSS feed from Michal Frackowiak's blog


24 Apr 2009, 10:26 UTC

The weekend is coming and I have a very small pet project for it. I would rather keep the idea private for now, but it involves processing hundreds of entries per second and analyzing data from multiple sources. It will have a dead-simple web interface.

The nature of the project requires a really fast data backend, capable of storing and retrieving a few thousand items per second. The dataset would be approximately 5GB, with an average item size of 0.5KB.
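To get a feel for those figures, here is a rough back-of-the-envelope sketch. The 5GB and 0.5KB values come from the paragraph above; the rate of 3000 items per second is only an assumed reading of "a few thousand", and binary units (1KB = 1024 bytes) are assumed as well.

```python
# Back-of-the-envelope sizing from the figures quoted in the post.
DATASET_BYTES = 5 * 1024**3   # "approximately 5GB" (binary units assumed)
AVG_ITEM_BYTES = 512          # "average item size of 0.5KB"
ASSUMED_RATE = 3000           # items/s: an assumption standing in for "a few thousand"

# How many items the dataset holds at that average size.
item_count = DATASET_BYTES // AVG_ITEM_BYTES

# How long it would take to write every item once at the assumed rate.
load_minutes = item_count / ASSUMED_RATE / 60

print(f"items in dataset: {item_count:,}")
print(f"minutes to write them all at {ASSUMED_RATE}/s: {load_minutes:.0f}")
```

So the dataset is on the order of ten million items, and writing all of them once at the assumed rate takes about an hour.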

When it came to choosing tools, after brief consideration I picked Sinatra for the web interface and Redis as a memory-only (with disk dumps) key-value datastore. Redis should be capable of handling 100 000 requests per second and deals well with large datasets, so it fits perfectly. It also differs from Memcached and MemcacheDB in offering higher-level structures such as Lists and Sets, plus basic sorting and selection commands.

There has been a lot of hype recently about (distributed) key-value storage. For more information I recommend a nice article by Richard Jones, "Anti-RDBMS: A list of distributed key-value stores". The HighScalability blog also has plenty of references and articles.

Redis looks like the way to go. The only problem is that the whole dataset (database) must fit in RAM; otherwise performance may degrade terribly because of swapping. Raw throughput itself is not an issue: you would need several concurrent clients to actually hit it as a limit.

Anyway, I initially wanted to deploy the project on Amazon EC2 because of the hyped scalability, price, etc. But here comes a surprise: the performance simply sucks. I guess this is because the instances share common hardware, so the memory bandwidth actually available to you may be limited.

Here are my results from running:

./redis-benchmark -n 100000

Amazon Small instance ($0.10/h)

====== PING ======
  100042 requests completed in 11.95 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

8369.61 requests per second

====== SET ======
  100023 requests completed in 12.13 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

8247.28 requests per second

====== GET ======
  100004 requests completed in 14.26 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

7010.94 requests per second

====== INCR ======
  100000 requests completed in 14.40 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

6945.89 requests per second

====== LPUSH ======
  100000 requests completed in 12.24 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

8171.27 requests per second

====== LPOP ======
  100000 requests completed in 14.22 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

7033.83 requests per second

The Small instance is a no-go if you want to use it for Redis. Keep in mind that it is AMD-based, and in general the High CPU instances (with Intel Xeons) dramatically outperform their AMD brothers.
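For anyone who wants to compare runs programmatically, here is a minimal sketch that parses output in the format shown above and recomputes the rate from the summary line. The regular expression and the `parse` helper are my own illustration, not part of redis-benchmark; the sample text is copied from the first two Small instance blocks.

```python
import re

# Two blocks copied verbatim from the Small instance run above.
SAMPLE = """\
====== PING ======
  100042 requests completed in 11.95 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

8369.61 requests per second

====== SET ======
  100023 requests completed in 12.13 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

8247.28 requests per second
"""

# One match per test: name, request count, elapsed seconds, reported rate.
BLOCK = re.compile(
    r"====== (\w+) ======\s+"
    r"(\d+) requests completed in ([\d.]+) seconds"
    r".*?([\d.]+) requests per second",
    re.DOTALL,
)

def parse(text):
    """Return {test_name: (requests, seconds, reported_rps)}."""
    return {
        name: (int(n), float(secs), float(rps))
        for name, n, secs, rps in BLOCK.findall(text)
    }

results = parse(SAMPLE)
for name, (n, secs, rps) in results.items():
    # The recomputed value differs slightly because the seconds are rounded.
    print(f"{name}: {n / secs:.0f} req/s recomputed, {rps:.0f} req/s reported")
```

The recomputed and reported rates agree to within a fraction of a percent, so the "requests per second" line is simply requests completed divided by elapsed time.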

Amazon High CPU Medium ($0.20/h)

====== PING ======
  100007 requests completed in 6.52 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

15333.79 requests per second

====== SET ======
  100006 requests completed in 2.22 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

44986.95 requests per second

====== GET ======
  100009 requests completed in 2.21 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

45252.94 requests per second

====== INCR ======
  100000 requests completed in 2.35 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

42625.75 requests per second

====== LPUSH ======
  100009 requests completed in 2.24 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

44686.78 requests per second

====== LPOP ======
  100011 requests completed in 2.28 seconds
  50 parallel clients
  3 bytes payload
  keep alive: 1

43787.66 requests per second

This is much better, but it still sucks. For a similar price you could get a dedicated box at SoftLayer, our current provider, with more than double the performance AND good upgrade options.
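Putting the two EC2 runs side by side, a short sketch like the following makes the gap explicit. All rates and hourly prices are the ones reported above; the helper names `speedup` and `requests_per_dollar` are just illustrative, and the per-dollar figure assumes the benchmark rate could be sustained for a full hour.

```python
# Requests per second as reported in the two benchmark runs above.
SMALL = {"PING": 8369.61, "SET": 8247.28, "GET": 7010.94,
         "INCR": 6945.89, "LPUSH": 8171.27, "LPOP": 7033.83}
HIGH_CPU_MEDIUM = {"PING": 15333.79, "SET": 44986.95, "GET": 45252.94,
                   "INCR": 42625.75, "LPUSH": 44686.78, "LPOP": 43787.66}

# Hourly prices in USD, as quoted in the headings above.
PRICE_SMALL = 0.10
PRICE_HIGH_CPU_MEDIUM = 0.20

def speedup(test):
    """How many times faster High CPU Medium is than Small for one test."""
    return HIGH_CPU_MEDIUM[test] / SMALL[test]

def requests_per_dollar(rps, price_per_hour):
    """Requests served per USD, assuming the rate is sustained for an hour."""
    return rps * 3600 / price_per_hour

for test in SMALL:
    small_rpd = requests_per_dollar(SMALL[test], PRICE_SMALL)
    medium_rpd = requests_per_dollar(HIGH_CPU_MEDIUM[test], PRICE_HIGH_CPU_MEDIUM)
    print(f"{test:5s} speedup {speedup(test):.1f}x, "
          f"per-dollar ratio {medium_rpd / small_rpd:.1f}x")
```

On these numbers the High CPU Medium instance is roughly five and a half to six and a half times faster on everything except PING, at twice the hourly price.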

Surprisingly, the more expensive EC2 instances did not deliver much higher performance, and were in every respect slower than any decent dedicated box. You can find more benchmarks on the Redis website. Our office quad-core server also managed about 100 000 inserts per second.

I know the strength of Amazon is not exactly "inexpensive hardware" but rather flexibility, the range of added services, and probably easier administration. Still, there are kinds of services you really do not want to put in a virtualized environment. Talking to "bare metal" is extremely important when running Redis, and probably any memory-intensive software.

Also, since Redis datasets must fit in memory, it would be nice to be able to get cheap boxes (slow drives are OK) with lots of RAM. So it is worth considering whether Amazon EC2 really is the best option.

Still, I am considering running the project on EC2 in the initial period, but you really need to be careful about the choice.

How does this relate to Wikidot?

When I was testing EC2 instances with PostgreSQL installed and populated with a copy of the Wikidot.com database, I was getting only 50% of the performance of our dedicated server, even on the fastest instances, for queries that certainly used only cached data. So it looks like moving our database server to EC2 would significantly decrease our performance, which is not acceptable at the moment. This post on the Amazon forums suggests memory bandwidth problems in EC2 instances.

I previously presented a possible migration to Amazon EC2 services. After a while, it looks like our whole database / webserver infrastructure would need to be reconsidered to benefit from the EC2 architecture. In the end we will need to partition our datasets (sharding) and probably change how uploaded files are stored, but honestly I would rather postpone that moment as long as I can, as long as we still have plenty of options within our current setup.
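For readers unfamiliar with the term, sharding in its simplest form means mapping each record to one of several databases by a stable function of its key. The sketch below is a generic illustration only, not how Wikidot partitions (or plans to partition) its data; the `shard_for` helper, the shard count of 4 and the site names are all hypothetical.

```python
import hashlib

def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a stable hash.

    hashlib is used instead of the built-in hash() because the latter is
    salted per process for strings, so it would not give a stable mapping.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count

# Hypothetical partition keys, e.g. one per hosted site.
for site in ("site-a", "site-b", "site-c"):
    print(site, "-> shard", shard_for(site, 4))
```

The obvious drawback of plain modulo hashing is that changing the shard count remaps most keys, which is one reason such a migration is not something to rush into.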


BTW: a weekend (short) project is the kind of project that should take only a few days to complete, or at least to reach a reasonably functional working prototype. It should be fun and educational, and give a chance to explore new solutions and technologies. Ideally, I would welcome more people on board.


rating: 0, tags: amazon ec2 redis sinatra


What about Sun's (new) VDC (Virtual Data Center)?
Helmuti_pdorf, 29 Apr 2009, 07:04 UTC

I do not know if you have read this.
Today I read in Sun's "News" about their Virtual Data Center (with "cloud computing" on Solaris, Linux or Windows), under the title "Give room (make place) - Amazon…"

Have you seen this?

Here is some info:
http://www.infoworld.com/t/platforms/sun-challenges-amazon-cloud-dominance-053

http://www.informationweek.com/news/software/hosted/showArticle.jhtml?articleID=215802006&pgno=1&queryText=&isPrev

I myself have not done anything with Java or Sun systems so far.

Regards
Helmut


Cloud Computing
Helmuti_pdorf, 14 May 2009, 13:20 UTC

Interesting (new) document found:

[http://www.sun.com/offers/docs/cloud_computing_primer.pdf]

