Legal Requirements

Recently I gave a keynote presentation at DDDx in London. You can watch it here if you want: https://skillsmatter.com/skillscasts/8082-stop-over-engineering

Throughout the talk I put forward the idea that many scenarios should not be handled in software at all but should instead be pushed up to the business-process level. This can often remove much of the complication in the software. In other words, let humans do the truly complex things manually and handle the 99% of relatively simple cases first. Obviously there is more to it than this, but feel free to watch the talk.

A few people were watching the talk over their lunch break today and discussing it. They came back to me with an interesting and quite good question.

What about legal requirements?

We can still do similar things even when dealing with legal requirements. To illustrate the point I will take the most byzantine, complex legal scenario I know of: the US tax code.

"The Code has grown so long that it has become challenging even to figure out how long it is. A search of the Code conducted in the course of preparing this report turned up 3.7 million words. A 2001 study published by the Joint Committee on Taxation put the number of words in the Code at that time at 1,395,000. A 2005 report by a tax research organization put the number of words at 2.1 million, and notably, found that the number of words in the Code has more than tripled since 1975."

https://www.irs.gov/pub/tas/08_tas_arc_msp_1.pdf

If we were building software to, say, make filings or to audit filings, we could still do so without supporting every possible legal requirement. As I said in the talk, there are a few things to look at here.

The first, in terms of analysis, is that instead of talking about everything that can happen in the tax code we should discuss "how do I know that we are still in a happy case?". A very simple analysis will show you that a huge portion of returns use only a few rules: Earned Income Credit, the mortgage deduction, dependents, and maybe a trivial amount of interest declared on bank accounts. From what I can see this covers > 60% of all returns received. I don't have exact data on what percentage of returns each rule appears on, but the vast majority of rules are used on only a tiny percentage of returns (<1%).

We also need to remember that our computer system is part of a larger business process. We may automate away 99% of the work on these returns, leaving 1% to be dealt with in other ways (humans). To be fair, for tax returns you probably want to automate away 99.5%-99.9%. The last 0.5%-0.1% of the actual returns is where the real complexity is. Don't do it; just leave it to the humans. We should be focusing on "how do I know this is something I know how to do vs something I should give to a human?", not on "what are all the edge conditions and complexity?"
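To make that "am I still in a happy case" check concrete, here is a minimal sketch of the routing decision. All of the type and rule names are invented for illustration; the point is only that the software decides whether it recognizes everything on a return and hands anything else to a person.

using System.Collections.Generic;
using System.Linq;

public record TaxReturn(IReadOnlyCollection<string> RulesUsed);

public enum Route { Automated, ManualReview }

public static class FilingRouter
{
    // The rules we know how to handle automatically (the "happy case").
    private static readonly HashSet<string> Supported = new HashSet<string>
    {
        "EarnedIncomeCredit", "MortgageDeduction", "Dependents", "BankInterest"
    };

    // If a return uses any rule we do not recognize, give it to a human
    // instead of teaching the software that rule.
    public static Route Classify(TaxReturn filing) =>
        filing.RulesUsed.All(Supported.Contains)
            ? Route.Automated
            : Route.ManualReview;
}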

The tax code is also a really good example of why you would not want to automate away that last 0.5%-0.1%. It changes a lot: there is roughly one change per day made to the US tax code. Imagine trying to keep your software up to date! The basic rules that most returns use change rarely.

Rob, Ayende, .NET

In regard to: https://ayende.com/blog/174433/re-why-you-cant-be-a-good-net-developer which is in regard to http://codeofrob.com/entries/why-you-cant-be-a-good-.net-developer.html

Ayende,

While I find most of your posts to be well reasoned, I feel we may end up agreeing to disagree here. I understand completely your argument that good software can be written on the .NET platform. I understand as well that you can build things like RavenDB on the .NET platform, including CoreCLR. My team and I have also built such software.

Your conclusions however are absolutely wrong. We have been supporting a linux operations story for coming up on four years. We actually prefer to run on linux as opposed to windows (though we are written in C#). This is for many reasons, but the largest is the quality of operations people and the maturity of devops tools on the linux side.

We have experienced where your logic falls apart.

Neither RavenDB nor EventStore should be written in C#.

A different environment such as C/C++/Go/Rust would be far superior to having been written in C#. Cross compiling C# is a pain in the ass to say the least, whether we talk about supporting linux via mono or .NET Core; both are very immature solutions compared to the alternatives. The amount of time saved by writing C# code is lost on debugging issues and having unusual viewpoints of them. The amount of time saved by writing in C# is completely lost from an operational perspective of having something that is not-so-standard. We have not yet talked about the amount of time spent dealing with things like packaging for 5+ package managers and making things idiomatic for varying distributions.

Maybe you think you will avoid this by including instructions for how to install RavenDB on *x* distribution. Admins want packages. They want things they understand. Most linux admins will flat out refuse to install coreclr (rightly, as it is immature) or mono (rightly, as it's a shitshow).

The costs associated with this are prohibitive. At the end of the day the kind of code you discuss could likely be implemented by mediocre C/Go developers, but you need to hire the very top of the line C# developers. Finding a C developer who understands the concept of a B+tree is far easier than finding a C# developer who does. Finding a C developer who has some clue about how memory works and is able to optimize is normal.

When you get into looking at TCO, the amount of time spent dicking around optimizing the GC and understanding/supporting your underlying abstractions means that having written in C# ends up only negligibly less expensive than writing in a systems programming language in the first place.

Once you start looking at support operations (trust me we have experience on this), having written in a systems programming language will be far cheaper.

TLDR: The fact that you write systems code in C# (or Java!) does not refute Rob. You are in fact writing systems code in spite of being in C#. This is the biggest mistake we associate with Event Store, and you should recognize the same. We keep things running, but we recognize that we would have been better off with other decisions. Given our investment we cannot easily change, but given a chance to start from scratch we would make different ones.

 

CLion review

I have been using CLion on and off for about two months now. This is just a quick review of it.

The good.

I really like it as an editor. It is quick and intuitive to get into.
If you have background with intellij/R# the learning curve is quite low.
CMake support
Does a great job picking up subtle little issues in code
Auto add includes🙂

The bad.

Compared with Sublime, for example, there are far fewer plugins available for CLion
Getting existing projects working is a PITA
It feels really weird to install the jvm so I can edit my C code

I want to expand on the last point a little. Like many tools, if you start from scratch with it, everything works wonderfully. If however you have an old project and try to move it to CLion, it's not so simple. As an example, in private eye everything is done in makefiles, including a bunch of cross-platform type stuff. Getting this working in CLion is non-trivial, so I tended to just use it as an editor, editing particular files (luckily there are only a few). It may be that I am just not an advanced enough user.

Summary:

It's a nice tool. I haven't fully moved over from vim, but for a new, larger project I am working on I am starting with CLion and sometimes editing in vim. Ask me my opinion again in 3-6 months, after I've become a more advanced user.

Your Tools Control You

It's been a while since I have blogged. I will be writing a few posts over the next few weeks.

Hopefully over time you as a developer change your opinions. One thing I have drastically changed my opinion on over time is tooling. As many know, I used to work heavily on a tool called Mighty Moose (it still works and is OSS). One of the things I built into Mighty Moose was a pretty cool graphing feature that I thought was the bee's knees at the time I built it. You can see it and a lot of other cool features Svein and I built into it here:

http://continuoustests.com

One thing that was interesting with the graphs was that people would bring them up on their own code and file support complaints that the nodes were too small, because there were 500 boxes in the graph (one user got an OutOfMemory exception; in debugging it turned out to be a graph with over 50k nodes in it). This was never a problem with the graph; it was a problem with the code.

I still believe such tools can have value and help make better software; however, I don't really use them any more. In fact I have gone from being very pro-tooling to being very anti-tooling. Tools control your thinking process. I can probably look at the output of your process and figure out what tools you were using!

A classic example of this, which many have worked with, is a typical .NET 1.1 win/webforms app written in Visual Studio. VS has a great (I mean seriously wonderful) step debugger built into it. The problem is that people tend to use it. A typical workflow would be: change some code, hit F5, start stepping through. One of the other cool features, truly innovative at the time, was the ability to easily change code on the fly.

Speaking of ASP.NET, have you ever looked at what came out over HTTP from it? If you have ever wondered what a dementor's kiss feels like, I reckon it's similar.


The problem in this example comes when you are then given the code that was developed in this way. You suddenly find out that this is the only workflow that will actually work with the code. You will find other artifacts as well, such as nested loops and longer functions, because the tools work better that way! It's quite annoying to think about step-into vs step-over.

This is a classic case of tools controlling your output.

Other examples of this can be seen in tools like IntelliJ or ReSharper. A typical smell of such tooling is that interfaces are backwards in a domain model. Instead of the domain model defining the contract it wants and an implementer adapting to that interface, they "extract interface" from the implementer. This is quite backwards, but a typical smell.
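As a hedged sketch of the difference (every name here is invented for illustration):

using System.Collections.Generic;

// Minimal domain stand-ins for the sketch.
public record CustomerId(int Value);
public record Customer(CustomerId Id, string Name);

// Backwards: "Extract Interface" run on the implementation, so the contract
// mirrors the implementation and leaks persistence details into the domain.
public interface ISqlCustomerRepository
{
    IDictionary<string, object> GetCustomerRow(int id);   // rows and raw ids, not domain language
}

// The other way around: the domain model declares the contract it wants and
// an implementer adapts itself to that contract.
public interface ICustomerDirectory
{
    Customer FindById(CustomerId id);
}

public class SqlCustomerDirectory : ICustomerDirectory
{
    public Customer FindById(CustomerId id)
    {
        // SQL details stay behind the domain-owned contract.
        return new Customer(id, "loaded from the database");
    }
}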

Another example can be seen in the code of people using containers. Ask yourself: do you use almost exclusively constructor injection? Have you constructor-injected a dependency that you only used in one method? Does the granularity actually match there?
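A hedged illustration of the granularity question (again, all names are invented):

// The email sender is injected for the lifetime of the object, but only one
// method ever uses it.
public interface IEmailSender { void Send(string to, string body); }
public record Order(string CustomerAddress);

public class OrderService
{
    private readonly IEmailSender _email;   // held for every caller of this class

    public OrderService(IEmailSender email) => _email = email;

    public void Place(Order order) { /* never touches _email */ }

    public void NotifyShipped(Order order) => _email.Send(order.CustomerAddress, "Shipped!");
}

// Passing the collaborator to the one method that needs it arguably matches
// the real granularity better.
public class OrderServiceAlternative
{
    public void Place(Order order) { /* ... */ }

    public void NotifyShipped(Order order, IEmailSender email) =>
        email.Send(order.CustomerAddress, "Shipped!");
}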

Granted, most of these things are small. But these small changes start adding up.

Back to Mighty Moose. I found it changed the way I was writing code. One thing I got lazy with was the structure of my code in terms of files, because I had graphs to navigate. I also got lazy about tests, because the minimizer would not run them most of the time. I even got a bit dangerous trusting my risk metrics.

Today I have gone back to a plain text editor.
I write unit tests but don’t TDD (TDD has its effects)
I like REPLs
I try to log like I am debugging a problem (it should normally be debuggable without a debugger; let's just say that when running the code involves burning EPROMs you think more about it first)

What's interesting is that I find I am almost back to where I was 15 years ago, writing embedded code on 68030/68040s with a VT terminal plugged in for log messages. I see more and more push back on big tooling, and it's a good thing (of course the "pushbacks" keep adding more and more functionality and becoming IDEs themselves! Kind of reminds me of the 3-year-old 50kloc "micro ORMs").

p.s. I’m still looking for a nice text editor setup for F# with support for type providers etc if someone knows one.

The Test that Cried Wolf

There was an interesting post on the BDD list today which is a pretty common question:

TLDR: I want to automate receiving an SMS in my test to verify that my SMS send with <vendor> worked. What is the best way to do this?

An answer came back that you can use Twilio and receive the message through their API.
This is in general a terrible idea and you should avoid it.
The argument quickly came back that it's easy and relatively cheap to automate, so why not?

STOP

People have a mistaken view that something being cheap and simple to automate makes that thing a good idea to automate. The reason it's so terrible to automate the sending of a text message has nothing to do with the cost of the initial automation (though it's not as simple as people think; I have done it!). The reason it's so terrible is that it will become the Test-That-Cried-Wolf.

Let's start with the service you will use to receive text messages (in this case Twilio):

http://status.twilio.com/services/incoming-sms

1 day, 23 hours ago     This service is operating normally at this time.
2 days ago      We are investigating a higher than normal error rate in TwiML and StatusCallback webhooks
1 week, 6 days ago      This service is operating normally, and was not impacted by the POST request issue.
1 week, 6 days ago      We are investigating an issue with POST requests to /Messages and /SMS/Messages.
2 weeks, 1 day ago      Twilio inbound and outbound messaging experienced an outage from 1.30 to 1.34pm PDT. The service is operating normally at this time.
2 weeks, 1 day ago      Our messaging service is currently impacted. We are investigating and will provide further updates as soon as possible.
2 weeks, 1 day ago      All queued messages have been delivered. All inbound messages are being delivered normally.
2 weeks, 1 day ago      All inbound messages are being delivered normally. Our engineers are still working on delivering queued messages. We expect this to be resolved before 6pm PDT
2 weeks, 1 day ago      A percentage of incoming long code messages, that were received between 3.02pm and 3.45pm are queued for delivery. Our engineers are actively investigating the situation.
2 weeks, 2 days ago     A number of Twilio services experienced degraded network connectivity from 8:47am PT to 8:50am PT.  All services are now operating normally.
2 weeks, 2 days ago     This service is operating normally at this time.
2 weeks, 2 days ago     We are getting reports of elevated errors. Our Engineering Team is aware and are working to resolve.
2 weeks, 5 days ago     This service is operating normally at this time.
2 weeks, 5 days ago     We are investigating a problem where webhooks in response to incoming SMS or MMS messages may be delayed or may be made multiple times.

What happens when your service that you only use for receiving SMS in your test is having a problem? Test Fails.
What happens when your service sending the SMS is having a problem? Test Fails.
There are at minimum two other providers here. Test Fails.
Anyone who has owned a phone knows that SMS are not always delivered immediately. How long do you wait? Test Fails.
Anyone who has owned a phone knows that SMS is not guaranteed delivery. Test Fails.

Start adding these up, and if you run your tests on a regular basis you can easily expect 1-2 failures per week. On most teams I deal with, a failed test gets looked at immediately to figure out why it's failing. In all of these cases the failure will have nothing to do with anything in your code; it is a transient issue (quite likely not impacting production). How many times will you research this problem before you say "well, it does that all the time"?
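As a back-of-the-envelope sketch of where that 1-2 failures per week comes from (the failure rates and run counts below are assumptions for illustration, not measurements):

using System;

double perHopFailure = 0.005;   // assumed transient failure rate per dependency
int hops = 4;                   // e.g. sender API, vendor, carrier, receiving API (assumed)
int runsPerWeek = 10 * 7;       // suite runs 10 times a day (assumed)

double passProbability = Math.Pow(1 - perHopFailure, hops);
double expectedFalseFailures = runsPerWeek * (1 - passProbability);

Console.WriteLine($"{expectedFalseFailures:F1} false failures per week"); // ~1.4, none caused by your code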

The cost of such tests is not in their initial implementation but in their false positives. When >90% of the test failures have nothing to do with your system, the failures will GET IGNORED. What's the point of having a test when you ignore its failures? These are the tests-that-cry-wolf and they should be avoided. There is a place for such tests: on the operations side, where any crying-wolf is a possible production issue and WILL be investigated.

Another Security Model

I had an interesting question when sitting with a client today. The Event Store internally supports role-based security through ACLs; they would prefer to use a claims-based system with it. An interesting idea; is there a reasonably easy way to do this? Well, yes, but it requires a bit of coding around it (this would be a nice thing to have in a library somewhere *wink wink*).

The general idea with claims-based security is that something else will do the authentication and the application will act only on a series of claims that it is given. In this particular example they want to control access to streams based upon claims about the user, and to do it in a reasonably generic way.

As an example for a user you may receive the following claims.

{
    organization : 37,
    department : 50,
    team : 3,
    user : 12
}

What they want is to be able to use these in conjunction with streams to determine whether or not a given user should have access to the stream (and to be reasonably dynamic about it).

Obviously we will not be able to do this easily with the internal security (well, you could, but it would be very ugly), but it can be built relatively easily on top. It is quite common, for instance, to run Event Store only on localhost and to only expose a proxy publicly, and this kind of thing can be done in the proxy. While not an ideal solution, it can get us pretty close to what we want.

If we just wanted to work with, say, a single claim "can read event streams", we could simply do this in the proxy directly and check the claim before routing the request. Chances are, however, you want to do quite a bit more with this and make it more dynamic, which is where the conversation went: in particular, what about per-stream and per-stream-type setup, dynamically? Well, we could start using the stream metadata for this.
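For completeness, the simple single-claim case mentioned above might look roughly like this in the proxy (the claim name is invented); the rest of this post is about making it more dynamic with stream metadata.

using System.Security.Claims;

// Checked in the proxy before the request is forwarded on to Event Store.
public static class ReadGate
{
    public static bool CanReadStreams(ClaimsPrincipal user) =>
        user.HasClaim("permission", "can-read-event-streams");
}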

For a resource (stream metadata)

{
    organization : 37,
    department : 50,
    team : 13,
    user : 121
}

Now we could try taking the intersection of this with the claims provided on the user.

The intersection would result in

{
    organization : 37,
    department : 50,
}

We might include something along the lines of

{
    approve = "organization,department",
    somethingelse = "organization,department,team",
    delete = "user"
}

Where the code would then compare the intersection to the verb you were trying to use (you must have all of the listed claims). This is a reasonably generic way of handling things, but we can go one step further and add a bit more.

/streams/account-defaultsetting

{
    approve = "organization,department",
    somethingelse = "organization,department,team",
    delete = "user"
}

This is now defined in a default which will be merged with the stream metadata (much like how ACLs work). If a value is provided in the stream metadata it will override the default (based on the type of stream). This allows us to easily set up defaults for streams as well. The logic in the proxy is roughly as follows (a sketch of the merge and check steps follows the list):

Read {type}-defaultsetting (likely cached)
Read streammetadata
Merge streammetadata + {type-defaultsetting} to effective metadata
Calculate intersection of effective metadata with user info
Check intersection vs required permission for operation
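Putting the last three steps together, a minimal sketch might look like the following. It assumes the default settings and the stream metadata have already been read; everything here, including the shapes of the dictionaries, is illustrative rather than an actual Event Store API.

using System.Collections.Generic;
using System.Linq;

public static class StreamAuthorization
{
    // Step 3: stream metadata overrides the per-stream-type default, verb by verb.
    public static Dictionary<string, string[]> MergeEffective(
        IReadOnlyDictionary<string, string[]> typeDefaults,    // from {type}-defaultsetting
        IReadOnlyDictionary<string, string[]> streamRules)     // from the stream's own metadata
    {
        var effective = typeDefaults.ToDictionary(kv => kv.Key, kv => kv.Value);
        foreach (var kv in streamRules)
            effective[kv.Key] = kv.Value;
        return effective;
    }

    // Steps 4 and 5: intersect the user's claims with the attributes on the
    // resource, then require every attribute the verb demands ("must have all").
    public static bool IsAllowed(
        IReadOnlyDictionary<string, string> userClaims,         // organization=37, department=50, ...
        IReadOnlyDictionary<string, string> resourceAttributes, // organization/department/team/user on the stream
        IReadOnlyDictionary<string, string[]> effectiveRules,   // verb -> required attribute names
        string verb)
    {
        var intersection = resourceAttributes
            .Where(kv => userClaims.TryGetValue(kv.Key, out var v) && v == kv.Value)
            .Select(kv => kv.Key)
            .ToHashSet();

        return effectiveRules.TryGetValue(verb, out var required)
            && required.All(intersection.Contains);
    }
}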

This provides a reasonably generic solution that is quite useful in many circumstances. The one issue with it is that if someone roots your box they can access the data directly, without permissions, as they can bypass the proxy and talk directly on localhost (to be fair, you probably have bigger problems at this point). It is however a reasonable solution for many situations.

Ouro’s Birthday

We are now over 3 years into working on Event Store! Ouro's second birthday will be happening in London on Sept 17. In general Ouro's birthday is always a lot of fun!

This year we will have some short talks by various contributors on the OSS side (what they have done, what they are working on, etc.) and I will do a talk on major functionality completed/on the way, including showing off some new goodies.

There will be plenty of people from the community there and I am sure lots of good discussions. Afterwards there will be a bit of a celebration with free food/beer and of course some Ouro swag. Come on out and hang out for the evening! RSVP required: http://geteventstore.com/two-years-on/#tickets

Sublime is Sublime Closing

Well, it's an early morning; I can blame the travel from London for that. I managed to struggle through to the end of the second period watching the Canadiens game last night. I was a bit worried entering the third, but was quite happy to see they had won when I woke up🙂

In this post I just want to sum up the other posts from the Sublime series as well as add a few tidbits. In the post series we have learned how to set up Sublime for .NET development. We have covered how to set up project/solution support, how to get intellisense and some basic refactoring, and even how to get automated builds and tests running (all on linux).

We have also looked at a lot of other things built on top of Sublime that are fairly useful if you are doing other types of development, such as javascript or html5. Many of these tools far outclass the Visual Studio equivalents and are usable with many other environments (such as a ruby backend).

I have personally given up on using Visual Studio as a whole. I will however keep a VM with it for some very specific tasks that it does well (such as line-by-line debugging). These are not things I use in my daily workflow, but they are nice to have when you absolutely need them.

Some other changes have come about from using Sublime as my primary editor. A big one is that when I am writing one-off code (which I do a lot) I do not bother creating project or solution files any more. I instead just create C# files and then either invoke the compiler directly from the command line or create a small makefile. It sounds odd, but it's actually much simpler than creating project/solution files overall.

There will also be much going on in this space coming up. As of now the Sublime plugin supports maybe 20% of what OmniSharp is capable of. There will be quite a bit of further support coming. As an example, I was looking the other day at supporting run-tests-in-context from inside of Sublime (in a test -> run the test, on a fixture -> run the fixture). There is also much coming for refactoring support, and my guess is that you will see even more coming in on this due to NRefactory moving to Roslyn. I think within a year you will find most of this tooling built in.

Another thing that I added to Sublime, though there isn't really an official plugin for it yet, is SublimeREPL + scriptcs. I find it quite common to grab a function and work it out in the REPL first and then move it back into the code. A perfect example of this happened to me while in London: I was trying to combine two Uris and was getting some odd behaviour. Three minutes in the REPL showed exactly what the issue was.
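As a generic example of the kind of surprise a REPL surfaces quickly (not necessarily the exact issue I hit), combining Uris in .NET depends on whether the base ends with a trailing slash:

using System;

// Typed straight into a C# REPL: the relative part replaces the last path
// segment unless the base Uri ends with a slash.
var withoutSlash = new Uri(new Uri("http://example.com/api/v1"), "users");
var withSlash    = new Uri(new Uri("http://example.com/api/v1/"), "users");

Console.WriteLine(withoutSlash); // http://example.com/api/users
Console.WriteLine(withSlash);    // http://example.com/api/v1/users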

Moving to Sublime will change the way that you work a bit, but it is definitely worth trying. Remember that a primary benefit of working in this way is that everything you are doing is a composition of pieces that will also apply to any other code you happen to be working on (whether it's C/Ruby/Erlang/even F#).

Banking Example Again

I was reading through this yesterday on my way out of London. Go on, take a minute and read it.

http://hackingdistributed.com/2014/04/06/another-one-bites-the-dust-flexcoin/

I do find it funny that the bitcoin exchanges were taken down by such things, but the article is pretty ridiculous in how it presents its problem/solution. Banks don't actually work as described in this post. There is not a "balance" column in an account table as presented, unless the developers just had no clue what they were doing.

mybalance = database.read("account-number")
newbalance = mybalance - amount
database.write("account-number", newbalance)
dispense_cash(amount) // or send bitcoins to customer

This is absurd. Your balance, while perhaps denormalized onto your account, is really the result of an equation (a summation of the value of your transactions). All of the problems discussed would just go away if the system had been designed to record a journal properly (and since the journal is append-only, most other issues would go away as well).
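A minimal sketch of the point (names invented, and this says nothing about the concurrency control you would still want around the append): the account keeps an append-only journal and the balance is derived from it, rather than being a mutable column that is read, decremented and written back.

using System;
using System.Collections.Generic;
using System.Linq;

public record Transaction(DateTimeOffset At, decimal Amount, string Description);

public class Account
{
    private readonly List<Transaction> _journal = new List<Transaction>();

    // The balance is an equation over the journal, not a stored column.
    public decimal Balance => _journal.Sum(t => t.Amount);

    public void Deposit(decimal amount, string description) =>
        _journal.Add(new Transaction(DateTimeOffset.UtcNow, amount, description));

    public void Withdraw(decimal amount, string description)
    {
        if (Balance < amount)
            throw new InvalidOperationException("insufficient funds");

        // Append only: nothing is overwritten, so there is no lost update to a balance column.
        _journal.Add(new Transaction(DateTimeOffset.UtcNow, -amount, description));
    }
}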

I have always hated that the typical example of distributed transactions is transferring money between two accounts. Banks don't work this way!