Monthly Archives: October 2014

The Test that Cried Wolf

There was an interesting post on the BDD list today asking a pretty common question:

TL;DR: I want to automate receiving an SMS in my test to verify that my SMS send with <vendor> worked. What is the best way to do this?

An answer came back that you can use Twilio and receive the message through their API.
This is, in general, a terrible idea and you should avoid it.
The argument quickly came back that it's easy and relatively cheap to automate, so why not?

STOP

People have a mistaken view that something being cheap and simple to automate makes it a good idea to automate. The reason it's so terrible to automate the end-to-end sending of a text message has nothing to do with the cost of the initial automation (though it's not as simple as people think; I have done it!). The reason it's so terrible is that it will become the Test-That-Cried-Wolf.

Let’s start with the service you will use to receive text messages (in this case Twilio). Here is its recent status history:

http://status.twilio.com/services/incoming-sms

1 day, 23 hours ago     This service is operating normally at this time.
2 days ago      We are investigating a higher than normal error rate in TwiML and StatusCallback webhooks
1 week, 6 days ago      This service is operating normally, and was not impacted by the POST request issue.
1 week, 6 days ago      We are investigating an issue with POST requests to /Messages and /SMS/Messages.
2 weeks, 1 day ago      Twilio inbound and outbound messaging experienced an outage from 1.30 to 1.34pm PDT. The service is operating normally at this time.
2 weeks, 1 day ago      Our messaging service is currently impacted. We are investigating and will provide further updates as soon as possible.
2 weeks, 1 day ago      All queued messages have been delivered. All inbound messages are being delivered normally.
2 weeks, 1 day ago      All inbound messages are being delivered normally. Our engineers are still working on delivering queued messages. We expect this to be resolved before 6pm PDT
2 weeks, 1 day ago      A percentage of incoming long code messages, that were received between 3.02pm and 3.45pm are queued for delivery. Our engineers are actively investigating the situation.
2 weeks, 2 days ago     A number of Twilio services experienced degraded network connectivity from 8:47am PT to 8:50am PT.  All services are now operating normally.
2 weeks, 2 days ago     This service is operating normally at this time.
2 weeks, 2 days ago     We are getting reports of elevated errors. Our Engineering Team is aware and are working to resolve.
2 weeks, 5 days ago     This service is operating normally at this time.
2 weeks, 5 days ago     We are investigating a problem where webhooks in response to incoming SMS or MMS messages may be delayed or may be made multiple times.

What happens when the service that you only use for receiving SMS in your test is having a problem? Test Fails.
What happens when the service you use for sending the SMS is having a problem? Test Fails.
There are at minimum two other providers between those two. What happens when one of them is having a problem? Test Fails.
Anyone who has owned a phone knows that SMS messages are not always delivered immediately. How long do you wait? Whatever timeout you pick, some message will arrive after it. Test Fails.
Anyone who has owned a phone knows that SMS delivery is not guaranteed. Test Fails.

Start adding these up and, if you run your tests on a regular basis, you can easily expect 1-2 failures/week. On most teams I deal with, a failed test gets looked at immediately to figure out why it's failing. In all of these cases the failure will have nothing to do with anything in your code and will be a transient issue (quite likely not impacting production). How many times will you research this problem before you say “well, it does that all the time”?
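To make the "adding these up" concrete, here is a rough back-of-the-envelope sketch. Every number in it is a made-up assumption for illustration (the per-run failure rates and the 50 runs/week are not measurements from Twilio or anyone else), and it assumes the failure sources are independent, which real outages often are not. The point is only that several individually small risks compound quickly.

```python
from math import prod

# ASSUMPTION: illustrative per-run probabilities that each external
# dependency breaks the test. None of these are measured values.
ASSUMED_FAILURE_RATES = {
    "receiving service has a problem": 0.005,
    "sending service has a problem": 0.005,
    "intermediate provider #1 has a problem": 0.0025,
    "intermediate provider #2 has a problem": 0.0025,
    "message arrives after the test's timeout": 0.01,
    "message is never delivered": 0.005,
}

# ASSUMPTION: the suite runs about ten times per working day.
ASSUMED_RUNS_PER_WEEK = 50


def combined_failure_probability(rates):
    """Chance that at least one independent failure source trips in a single run."""
    return 1 - prod(1 - rate for rate in rates)


p_per_run = combined_failure_probability(ASSUMED_FAILURE_RATES.values())
expected_weekly_failures = p_per_run * ASSUMED_RUNS_PER_WEEK

print(f"chance a single run fails for external reasons: {p_per_run:.1%}")
print(f"expected external failures per week: {expected_weekly_failures:.2f}")
```

With these assumed inputs the result lands in the 1-2 failures/week range, and none of those failures say anything about your code. Plug in your own run frequency and your own guesses to see how it looks for your team.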

The cost of such tests is not in their initial implementation but in their false positives. When >90% of the test failures have nothing to do with your system, the failures will GET IGNORED. What’s the point of having a test when you ignore the failures? These are the tests-that-cry-wolf and should be avoided. There is a place for such tests: on the operations side, where any crying-wolf is a possible production issue and WILL be investigated.