The Climate Change Simulation, which was using the processing power of thousands of people's PCs to predict climate change, has had to restart the whole simulation after noticing an error in its software. It makes me wonder about the accuracy of simulation programmes, and about the problems of testing them.
The organisers explain: “For those of you who are interested, the problem was a single entry in a file header, which meant that the model started reading from the wrong point in the file. Because the data and the dates in the file were OK, the problem was far from obvious.”
The experiment ran an identical model, but with different parameters, on each contributing computer. As of today, it claims that 11,878,377.156 model years have been run on 41,551 trickling machines.
The organisers say:
“..we probably couldn’t have picked up the problem more than a week earlier. It is obvious now that too many models were warming up too fast over the 20th century, but we needed a reasonable number to have got through the 1970s to be able to see this wasn’t just due to chance…. [then] ….it was clear that all the front-runners were crashing on the same date in 2013…”
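The kind of bug the organisers describe – a single wrong entry in a file header shifting where the model starts reading, while the data itself still looks plausible – is easy to reproduce. A toy sketch in Python (the file layout and field names here are invented for illustration, not the real climate model format):

```python
import struct

# Hypothetical layout: a 4-byte integer header holding the byte offset
# at which the data records start, followed by one float per model month.
def write_file(path, values, header_offset):
    with open(path, "wb") as f:
        f.write(struct.pack("<i", header_offset))
        f.write(struct.pack(f"<{len(values)}f", *values))

def read_file(path):
    with open(path, "rb") as f:
        (offset,) = struct.unpack("<i", f.read(4))
        f.seek(offset)  # trust the header blindly
        data = f.read()
        return struct.unpack(f"<{len(data) // 4}f", data)
```

With the correct header offset the reader returns the real series; with the offset one entry too large it silently skips the first value and still returns perfectly valid-looking floats. Nothing crashes, which is exactly why "the problem was far from obvious".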
I've thought a lot about testing recently. I run a website for a FTSE 100 company, for use in emergencies, so it has to have high availability. Inspired by articles like this, I wrote an end-to-end testing robot in Perl, which automatically logs on to the site every hour and tests that its basic functions are working. That's a start, but not a very comprehensive one.
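The idea behind such a robot is simple enough to sketch. My version is in Perl, but here is the shape of it in Python; the URL and the expected page markers are placeholders, not the real site:

```python
import urllib.request

# Hypothetical values -- stand-ins for the real emergency site.
SITE_URL = "https://example.com/login"
EXPECTED_MARKERS = ["Emergency dashboard", "Log out"]

def page_looks_healthy(html, markers=EXPECTED_MARKERS):
    """The check only passes if every marker we know should be on the
    page actually appears in what the robot fetched."""
    return all(marker in html for marker in markers)

def run_check(url=SITE_URL):
    # Fetch the page; a network error or a missing marker both count
    # as failure, which is the point of an end-to-end check.
    try:
        with urllib.request.urlopen(url, timeout=30) as resp:
            html = resp.read().decode("utf-8", errors="replace")
    except OSError:
        return False
    return page_looks_healthy(html)
```

A cron job calling `run_check` hourly, with an alert on `False`, is the whole robot. It tests presence, log-in and expected content, and nothing more.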
Ruby on Rails includes some quite interesting testing facilities in its framework, and CodeIgniter (which I'm using for another project) has just introduced a unit testing class – which I'm currently puzzling over. Not the code, which is excellent, but how best to use it in my project.
As I understand it, there are two main types of testing: unit testing looks at each piece of code, bottom up, and checks that each individual function does what it is supposed to do; end-to-end testing looks at the site as a whole, and checks that it reacts as it is expected to.
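The unit-testing half of that distinction is easy to show concretely. A minimal example using Python's built-in `unittest` (the function and its contract are invented for illustration):

```python
import unittest

def apply_discount(price, percent):
    """The unit under test: one small function with a clear contract."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)

class TestApplyDiscount(unittest.TestCase):
    # Unit tests work bottom up: each test pins one function's
    # behaviour against an answer we already know.
    def test_normal_discount(self):
        self.assertEqual(apply_discount(200.0, 25), 150.0)

    def test_rejects_bad_percent(self):
        with self.assertRaises(ValueError):
            apply_discount(100.0, 150)

if __name__ == "__main__":
    unittest.main()
```

The crucial ingredient is that the expected answer is known in advance – which is exactly what a simulation, by definition, does not give you.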
Which makes me wonder how you test a simulation… After all, with my site, I am testing for some pretty simple things. Is the site there? Can my robot log in to it? When it does, will it see what I know should be there, and on the correct page?
But with a complex simulation like the Climate Prediction one, how do you do this? Since you don't know the answers in advance (that's why you're doing the simulation), you can only spot an error if the output is so obviously wrong that it strikes you as counter-intuitive – in other words, if it falls within the bounds of human intuition. (That is, if you have some idea what the answer ought to be.) Without this, how can you do an end-to-end test?
One approach would be to do a run with dummy data, where you know what the outcome should be: but even that has its limitations. As the Climate Prediction site says, “Because the data and the dates in the file were OK, the problem was far from obvious.” How far do you have to go with testing?
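The dummy-data idea amounts to a known-answer test: feed the simulator a case simple enough to solve exactly, and check that it lands on the exact answer. A toy sketch, assuming a stand-in "simulation" of exponential decay (nothing like the real climate code, but the testing pattern is the same):

```python
import math

def simulate_decay(y0, k, t_end, dt):
    """Toy 'simulation': forward-Euler integration of dy/dt = -k*y.
    It stands in for a model whose output we cannot predict in general."""
    y, t = y0, 0.0
    while t < t_end - 1e-12:
        y += dt * (-k * y)
        t += dt
    return y

def known_answer_test(tolerance=0.01):
    # End-to-end check against a case with a known analytic solution,
    # y(t) = y0 * exp(-k*t).  If the simulator drifts, this catches it.
    y0, k, t_end = 1.0, 0.5, 2.0
    simulated = simulate_decay(y0, k, t_end, dt=0.001)
    exact = y0 * math.exp(-k * t_end)
    return abs(simulated - exact) < tolerance
```

It only validates the simulator on cases simple enough to solve by hand, which is precisely its limitation: the header bug above would slip straight past it if the dummy file happened to be laid out the same way as the broken one.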
Leaving aside entirely questions about the accuracy of the model or the assumptions you have made, how do you guard against the sort of annoying software errors which (once noticed) are obvious and elementary? (And which everyone who has ever written a programme knows are frequent and inevitable…)
I have no answers to these questions, though they are important and I shall continue to ponder them. Thanks to Mark of Classilog for drawing this to my attention, in the unlikely surroundings of a Humphrey Lyttleton concert in a deconsecrated church.