Monthly Archives: December 2012

Quick post with some updates

So Neringa and I will head to the US in January (and will likely take a short vacation while there).

I will be doing a few classes while out there.

New York City http://nyccqrs.eventbrite.com/ (sorry for the late notice, that's why prices are lower than normal)

There will likely also be a short workshop the following Thursday and Friday in Seattle after the P&P event; this one will probably be a community workshop.

Austin Texas Jan 21-23 http://austincqrs.eventbrite.com/

Also coming out this week will be two huge (as in importance of content) pieces of documentation on the Event Store. Links will be on this blog and http://geteventstore.com/blog/

Silly Benchmark

So people often ask me crazy questions (I think it's a test to see whether you have thought about the scenario or not). One of those questions was “How does the ES work when I put 4 MB events into it?” I figured I would run that test today. Actually not bad really (internally we have a max size set up, though!).

I will need to rerun the test on a gigabit network, in fact. It saturated the 100 Mbit link between my laptop and desktop during testing (99% usage); the ~10,600 KB/s transfer rate reported below works out to roughly 85 Mbit/s of payload before protocol overhead.

C:\Users\Greg\Downloads\xampp\apache\bin>ab -n 200 -c 10 -k http://192.168.3.3:2113/streams/huge-1/event/1
This is ApacheBench, Version 2.3 <$Revision: 1373084 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 192.168.3.3 (be patient)
Completed 100 requests
Completed 200 requests
Finished 200 requests
Server Software: Mono-HTTPAPI/1.0
Server Hostname: 192.168.3.3
Server Port: 2113

Document Path: /streams/huge-1/event/1
Document Length: 4194431 bytes

Concurrency Level: 10
Time taken for tests: 76.898 seconds
Complete requests: 200
Failed requests: 1
(Connect: 0, Receive: 0, Length: 1, Exceptions: 0)
Write errors: 0
Non-2xx responses: 1
Keep-Alive requests: 199
Total transferred: 834767314 bytes
HTML transferred: 834691769 bytes
Requests per second: 2.60 [#/sec] (mean)
Time per request: 3844.920 [ms] (mean)
Time per request: 384.492 [ms] (mean, across all concurrent requests)
Transfer rate: 10601.03 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0     0   0.5      0      4
Processing:  1464  3804 834.7   3674   8549
Waiting:       93   391 963.5    112   5455
Total:       1464  3804 834.7   3674   8549

Percentage of the requests served within a certain time (ms)
50% 3674
66% 3732
75% 3782
80% 3818
90% 3944
95% 4885
98% 8072
99% 8403
100% 8549 (longest request)


Just remember: even though these questions may be crazy, we still look into them 🙂

Next Week Projections Talk

Next week I will be doing a talk on using the Event Store as a read model. I have spent so much time over the last few years talking about the benefits of Event Sourcing as a write model, but there are some quite unique scenarios where it makes a ton of sense as a read model as well.

It will also be interesting for me, as many people have misunderstood the concept of the “projections library” in the Event Store. In particular I often get asked the question “How do I query all of my states to get the ones with the last name ‘xxx’?” While this *can* be done in the projections library, you probably should not be doing it there; it sounds more like you want a document db/kv store. This is something we have not been good at explaining to people as of yet (largely because we have been so focused on the primary use case of the write side).

Projections are about stream operations, repartitioning, and the ability to do temporal queries. Many very complex systems do only a very few simple things. The canonical example of this is a system handling temperature sensors or price streams from the stock market. All the system does is, say, some moving averages or candlesticking of prices, yet it takes nine months to build. Why? Because of all the other stuff associated with the problem (sure, it's easy until you have 20k+ updates/second and you need to be available). One of the main use cases for projections is these kinds of problems, where you can drop in five lines of JavaScript and the other gunk is already built in for you.


Some benchmarking today

EventStore running AtomPub feeds over HTTP (feed hits, no internal caching).

Still some room for improvement but not bad overall.

greg@ouroboros:~$ ab -n 50000 -c 15 http://192.168.3.3:2113/streams/package-0
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking 192.168.3.3 (be patient)
Completed 5000 requests
Completed 10000 requests
Completed 15000 requests
Completed 20000 requests
Completed 25000 requests
Completed 30000 requests
Completed 35000 requests
Completed 40000 requests
Completed 45000 requests
Completed 50000 requests
Finished 50000 requests
Server Software: Mono-HTTPAPI/1.0
Server Hostname: 192.168.3.3
Server Port: 2113

Document Path: /streams/package-0
Document Length: 2360 bytes

Concurrency Level: 15
Time taken for tests: 23.436 seconds
Complete requests: 50000
Failed requests: 0
Write errors: 0
Total transferred: 134900000 bytes
HTML transferred: 118000000 bytes
Requests per second: 2133.50 [#/sec] (mean)
Time per request: 7.031 [ms] (mean)
Time per request: 0.469 [ms] (mean, across all concurrent requests)
Transfer rate: 5621.28 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0     0   0.0      0      1
Processing:     1     7   7.0      6    132
Waiting:        1     7   7.0      6    132
Total:          1     7   7.0      6    132

Percentage of the requests served within a certain time (ms)
50% 6
66% 7
75% 7
80% 8
90% 9
95% 10
98% 20
99% 24
100% 132 (longest request)

Large Amounts of Memory with GC

Oftentimes we have to use large amounts of memory with the Event Store (up to 70 GB in our current running environments). I have gotten the question many times of how we do this without getting killed by the garbage collector.

When the garbage collector runs it will stop all of your threads, iterate through your heaps, and figure out what to kill and what not to based upon what is rooted via other references or current stacks; http://msdn.microsoft.com/en-us/magazine/bb985010.aspx is a pretty good description of how it works. There have also recently been changes to the collector to make it much better, which http://blogs.msdn.com/b/dotnet/archive/2012/07/20/the-net-framework-4-5-includes-new-garbage-collector-enhancements-for-client-and-server-apps.aspx discusses.

There are, however, two ways that you can help lower the impact of the garbage collector on you. The first is to use the Large Object Heap (http://msdn.microsoft.com/en-us/magazine/cc534993.aspx). The Large Object Heap is for large objects and does not get compacted. This can dramatically lower the cost of your memory use: allocate big chunks and reuse them (or just allocate big chunks for things that are actually big, like, say, an index of a file). Always try to use value types here and don't hold references if you can (it takes a lot of work off the GC; 5 million longs are much easier to deal with than 5 million objects). There is a drawback here, however, in that very often you will end up with a fragmented heap if you aren't careful.
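As a rough sketch of the idea (not the Event Store's actual code; the class name and sizes here are made up), keeping an index as one big reusable array of value types means a single object lands on the Large Object Heap (allocations of roughly 85,000 bytes or more go there) and the GC has one thing to track instead of millions:

// One large allocation made up front and reused, instead of millions of small objects.
public sealed class StreamIndex
{
    // ~40 MB of longs in a single array: one object on the LOH for the GC to scan.
    private readonly long[] _positions = new long[5000000];
    private int _count;

    public void Append(long logPosition)
    {
        _positions[_count++] = logPosition; // value types, no per-entry allocations
    }

    public long Get(int sequenceNumber)
    {
        return _positions[sequenceNumber];
    }
}

Because the array is allocated once and kept for the lifetime of the process, the fragmentation drawback mentioned above is largely avoided.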

The other mechanism for avoiding managed-memory problems is to stop using managed memory and use unmanaged memory instead. The main thing that needs to be loaded into memory for something like the Event Store is the data on disk. The OS can cache this for you as well, but you can force it to happen by putting it into unmanaged memory. Unmanaged memory is not used enough from managed code; think about all the places where you need megabytes of memory. The ES uses unmanaged memory: very often, if you see it using 16 GB of memory, it has a managed heap of around 200 MB plus unmanaged memory for all the rest. This helps to keep GC rates down.
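To illustrate the unmanaged approach, here is a minimal sketch (again, not the Event Store's actual implementation; the class and the idea of caching a whole file in one buffer are just an example) using Marshal.AllocHGlobal so the cached bytes never live on the managed heap:

using System;
using System.IO;
using System.Runtime.InteropServices;

// Caches a file's contents in unmanaged memory so the GC never has to scan the bytes.
public sealed class UnmanagedFileCache : IDisposable
{
    private readonly IntPtr _buffer;
    private readonly int _length;

    public UnmanagedFileCache(string path)
    {
        byte[] data = File.ReadAllBytes(path);   // transient managed copy
        _length = data.Length;
        _buffer = Marshal.AllocHGlobal(_length); // unmanaged allocation, invisible to the GC
        Marshal.Copy(data, 0, _buffer, _length); // move the bytes out of the GC's reach
    }

    public int Length { get { return _length; } }

    public byte ReadByte(int offset)
    {
        return Marshal.ReadByte(_buffer, offset);
    }

    public void Dispose()
    {
        Marshal.FreeHGlobal(_buffer); // unmanaged memory must be freed explicitly
    }
}

The trade-off is that you take over lifetime management yourself (note the explicit free), which is why this tends to be worth it only for large, long-lived data such as file caches.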

We have not seen a huge perf difference between the LOH and unmanaged memory for things that live a long time. For short-lived things, beware the LOH (it is not compacted), though I have not yet tested with the new GC to see if that has changed.