Coincidence is a fascinating concept: I finished lecture 11, "Testing and Debugging", of MIT's interesting CS course yesterday and had an amazing example in practice today. The coincidence was a bug that was identified in an app I wrote. It did not make any sense at first: some data was written that had nothing to do with the expected data. I could not guess where it had come from. I started to look at the function responsible for writing that piece of output. As the course states, "reading" your code is very important, but you can easily get into the habit of just taking things for granted.

Trying to understand the function, I quickly saw that the specific output got written to a temporary file first. You see, as code grows and apps age, you often forget the inner workings of each and every function.

The thing to remember when debugging is that you need to be SYSTEMATIC. In the course a great example is shown of the binary search approach (the "Silly" program): start in the middle to rule out half of the code. Of the half left, start again in the middle, etc.

I started to look for a pattern to see when this bug might have slipped in. Having complete statistics (I am thankful for them now), and starting somewhere in the middle of last week's usage, I was quickly able to pinpoint the exact time and date when the issue started.
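To make the idea concrete, here is a minimal Python sketch of a binary search over a usage log. It is not the code I actually ran: the `first_bad` helper, the log layout, and the dates are made up for illustration, and it assumes the records are in chronological order and that every record after the first bad one is bad too.

```python
from datetime import date, timedelta


def first_bad(records, is_bad):
    """Return the index of the first record for which is_bad() is true.

    Assumes records are in chronological order and that once the bug
    appears, every later record is bad as well. Returns len(records)
    when no record is bad.
    """
    lo, hi = 0, len(records)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(records[mid]):
            hi = mid        # bug already present here: look earlier
        else:
            lo = mid + 1    # still fine here: look later
    return lo


# Hypothetical usage log: one (day, output_was_wrong) entry per day.
start = date(2024, 1, 1)
log = [(start + timedelta(days=i), i >= 9) for i in range(14)]

idx = first_bad(log, lambda rec: rec[1])
print("first bad entry:", log[idx][0])
```

With 14 entries this takes at most four checks instead of up to 14, which is why starting in the middle pays off as the history gets longer.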

The important question, always

Then I asked: "What changed around that date?" I started to think and talk to the people involved: the permissions of the temporary file that got written to had changed! A quick test script confirmed I could not overwrite the temp file anymore! Bingo!
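A test script like that can be tiny. The sketch below is a Python version of the idea, not my original script: `can_overwrite` is a hypothetical helper, and the demo runs against a throwaway file because the real temp file path is not relevant here.

```python
import os
import stat
import tempfile


def can_overwrite(path):
    """Return True if the current user can open an existing file for writing."""
    try:
        # O_WRONLY without O_CREAT or O_TRUNC: the file is neither created
        # nor emptied, we only test whether opening it for writing works.
        fd = os.open(path, os.O_WRONLY)
    except OSError:
        return False
    os.close(fd)
    return True


# Demo on a throwaway file that stands in for the real temp file.
fd, tmp_path = tempfile.mkstemp()
os.close(fd)
print("before permission change:", can_overwrite(tmp_path))

# Make the file read-only for its owner to mimic the permission change.
# Note: a privileged user such as root bypasses these bits.
os.chmod(tmp_path, stat.S_IRUSR)
print("after permission change: ", can_overwrite(tmp_path))

# Restore write permission so the file can be removed on every platform.
os.chmod(tmp_path, stat.S_IRUSR | stat.S_IWUSR)
os.remove(tmp_path)
```

Run it as the same user the app runs as, otherwise the result says nothing about what the app is allowed to do.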

The course and the real-life example taught me some things:

  1. read along till the end
    The directory the temp file was in had enough permissions to write to, so that alone was not the issue. Only when I read further down in the code did I see that the file was not getting cleaned up, a fact I discovered later than I could have. So although the binary search described above is good to do anyway, reading to the end first might have saved me a bit of time. It is obvious, but reading rewards!
  2. reproduce the issue
    Once you have a basic understanding of the issue, try to reproduce the problem as early as possible in the process and with the minimum number of inputs (this saves time). As the course says: once reproduced, you are more than halfway there. I often use a test.php to play around with calling functions, to isolate how they work and really understand them.
  3. search algorithm
    The binary search approach in action (see above): you can quickly narrow down where the issue might be.
  4. print, print, print
    As explained in the CS course: print variables in your code. Actually, do as much logging as you can. Bugs will show up; even the best programmers get "bugged" (and if you could foresee those mistakes you wouldn't let them crawl in!). Although outputting extra information to the console, tables, or files might cost time to set up the first time, it will save time later and might limit the damage done. I actually think a good approach is to design a test suite and use a global variable to switch it on and off. Be ready when you receive the call ;)
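As an illustration of the "print, print, print" point, here is a minimal Python sketch of logging that one global switch turns on and off. The `DEBUG` flag, the logger name, and the `build_output` function are all hypothetical; they only show the pattern, not how my app does it.

```python
import logging

# Hypothetical global switch: set to False to silence the debug output.
DEBUG = True

logging.basicConfig(
    level=logging.DEBUG if DEBUG else logging.WARNING,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
log = logging.getLogger("report")


def build_output(rows):
    """Join rows into one comma-separated line, logging input and result."""
    log.debug("build_output called with %d rows", len(rows))
    result = ",".join(str(row) for row in rows)
    log.debug("build_output result: %r", result)
    return result


build_output([1, 2, 3])
```

The debug calls stay in the code permanently; flipping the switch is all it takes to see what a function received and produced when the next bug report comes in.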

Defensive programming

As John Guttag says in the course: "defensive programming is all about facilitating validation and debugging!" And with that you prevent future headaches :)
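Applied to a temp file like the one in my bug, a defensive version of the write step could look like the Python sketch below. It is an assumption of how one might do it, not the app's real code: `write_temp_output` and the file name are invented, and the read-back check is just one possible validation.

```python
import os
import tempfile


def write_temp_output(path, data):
    """Write data to path and fail loudly if the file does not contain it."""
    if not isinstance(data, str):
        raise TypeError("data must be text")

    # If the file cannot be opened for writing, open() raises an OSError
    # here instead of letting stale content go unnoticed.
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(data)

    # Validate the result: read the file back and compare.
    with open(path, encoding="utf-8") as fh:
        written = fh.read()
    if written != data:
        raise RuntimeError(f"{path} does not contain the data just written")
    return path


# Demo in a throwaway directory (hypothetical file name).
tmp_dir = tempfile.mkdtemp()
target = os.path.join(tmp_dir, "output.tmp")
write_temp_output(target, "expected data")
os.remove(target)
os.rmdir(tmp_dir)
```

The point is that a failed or partial write stops the program at the spot where it happens, rather than showing up later as unrelated data in the output.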

Finally

... be patient: analyzing a bug can take hours, sometimes even days if you (wisely) walk away from it from time to time. However, I found the process very satisfying, intriguing even. Applying a systematic approach to narrow down an issue step by step: I think it is in the programmer's nature to just love that :)

 


Bob Belderbos

Software Developer, Pythonista, Data Geek, Student of Life.