My pseudo-vacation seemed like a good time to run some performance/stress tests on EC2. As I expected, this gave me a chance to fix some problems that I’d never seen on my own machines at Red Hat, or in smaller-scale testing, so it was definitely worthwhile just for that. The real goal, though, was to establish some performance baselines and measure overhead for some of the CloudFS functionality, so that I can make sure we’re not paying too dearly as that functionality becomes more complex.

I set up three m1.large servers and one m1.large client, all running the official Fedora 14 x86_64 images for EC2. Actually, I tried running Amazon’s own Linux distro, and even figured out how to make the GlusterFS packaging recognize it as a copy of RHEL (mostly a matter of /etc/system-release instead of redhat-release), but then the performance was dreadful. This is similar to what happened when I ran similar tests on Rackspace a few weeks back. I could debug either case, but that doesn’t seem like a high priority when I already have something that works pretty well. To continue the methodological description…
- My own build of GlusterFS+CloudFS, all with -O2
- Simple three-way DHT setup (no replication) with write-behind and io-cache disabled
- iozone -c -e -r 1m -s 1280m -i 0 -i 1 -l 30
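As a rough check on how much data this iozone invocation touches, the `-s` and `-l` values can be multiplied out and compared with the memory on the four machines. This is only a sketch under two assumptions that are not stated in the post: that iozone's `m` suffix means MiB, and that each m1.large instance has 7.5 GiB of RAM (Amazon's published figure for that instance type). Under those assumptions the data set comes out somewhat larger than total memory, in the same ballpark as the rough figure quoted in the next paragraph.

```python
# Assumptions (not from the post): "1280m" means 1280 MiB per file,
# and each m1.large instance has 7.5 GiB of RAM.
THREADS = 30               # iozone -l 30
FILE_SIZE_MIB = 1280       # iozone -s 1280m
INSTANCES = 4              # three servers plus one client
RAM_PER_INSTANCE_GIB = 7.5

# Each thread works on its own file, so the total is threads * file size.
total_data_gib = THREADS * FILE_SIZE_MIB / 1024
total_ram_gib = INSTANCES * RAM_PER_INSTANCE_GIB

# How far the data set exceeds combined memory, as a percentage.
excess_pct = (total_data_gib / total_ram_gib - 1) * 100

print(f"data set:  {total_data_gib:.1f} GiB")
print(f"total RAM: {total_ram_gib:.1f} GiB")
print(f"data exceeds RAM by about {excess_pct:.0f}%")
```

Changing `RAM_PER_INSTANCE_GIB` or reading the sizes as decimal units shifts the percentage by a few points, which is why the result should be read as an estimate.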
I used 30 threads to get a decent distribution across the three servers, and moderate parallelism on the client. Thirty threads times the 1280 MB file size means that the total data is about 20% more than the combined memory of all four systems; this doesn’t entirely eliminate cache effects, but does bring them down to real-world levels. Here are the results for a “vanilla” config, for vanilla plus only the “cloud” (tenant-isolation) translator, and for vanilla plus only the “crypt” (at-rest encryption) translator.
| (MB/s) | Initial Write | Re-write | Initial Read | Re-read |
|---|---|---|---|---|
| Baseline | 47.6 | 84.8 | 82.2 | 82.2 |
| +cloud | 55.2 | 85.9 | 82.1 | 82.1 |
| +crypt | 37.1 | 37.8 | 30.9 | 31.0 |
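To make the overhead easier to read off, the table can be turned into percentage changes relative to the baseline row. The sketch below uses only the numbers from the table; the `results` dictionary and the `relative_change` helper are illustrative names, not part of GlusterFS, CloudFS, or iozone.

```python
# Throughput in MB/s, copied from the table above.
results = {
    "Baseline": {"Initial Write": 47.6, "Re-write": 84.8,
                 "Initial Read": 82.2, "Re-read": 82.2},
    "+cloud":   {"Initial Write": 55.2, "Re-write": 85.9,
                 "Initial Read": 82.1, "Re-read": 82.1},
    "+crypt":   {"Initial Write": 37.1, "Re-write": 37.8,
                 "Initial Read": 30.9, "Re-read": 31.0},
}


def relative_change(config, baseline="Baseline"):
    """Return the percent change in throughput versus the baseline, per phase.

    Negative values mean the configuration is slower than the baseline.
    """
    base = results[baseline]
    return {
        phase: (results[config][phase] / base[phase] - 1) * 100
        for phase in base
    }


for config in ("+cloud", "+crypt"):
    changes = relative_change(config)
    summary = ", ".join(f"{phase}: {pct:+.1f}%" for phase, pct in changes.items())
    print(f"{config}: {summary}")
```

Since these are single runs on EC2, small differences (a percent or two either way) are better treated as noise than as real gains or losses.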
The baseline figure is actually pretty respectable, though the delta between first and subsequent writes was a bit surprising. There was definitely some evidence of contention/starvation, as one server or another would sit at zero blocks read/written for several seconds at a time. As it turns out, the distribution between the three servers was also surprisingly uneven, so the third server would always finish significantly before the other two. That brings the results down a bit, but it’s more realistic than if I’d forced a more even distribution.
The biggest surprise was that the cloud-translator results were actually slightly better than the baseline. Mostly I think that’s just the variability of performance that EC2 is known for, but at least it shows that the overhead for adding the translator is below the noise. That’s good news; we’ll see if the situation remains the same when I add UID/GID mapping.

The general lousiness of the crypto results was not a surprise at all. The starvation behavior was even more pronounced, with servers sometimes sitting idle for most of a minute before getting more work to do (remember that this is pure client-side encryption). The single glusterfs process on the client was pegged during writes, and nearly so during reads. That’s actually good news too, since it implies that there should be a pretty easy-to-achieve performance boost from multi-threading the encryption/decryption stage, either with the io-threads translator or by some other means.

The best news of all, in my opinion, is that all of these tests managed to complete without failures or data errors. I did have one memory leak in the crypt translator, which I fixed, but things looked very good after that.
That should be enough performance testing for this year. ;) When things pick up after the weekend, I’ll probably have more results like these to report.