“We did an exceptionally good job on the design through the Agile and TDD practices our company had recently adopted. The design evolved well through our iterative and incremental process, and with a lot of negative testing we ensured that our software was robust. A few weeks after the production deployment, the customer started reporting a good number of errors, with a variety of exceptions and stack traces, one after another. The maintenance phase became a nightmare: support engineers spent a lot of time debugging the issues, the development team delivered patches with quick or partial fixes, and we almost got into a never-ending report-and-fix cycle.”
Stories like this are not uncommon in the IT services industry. Poor exception handling can prove very costly, costly enough that your customer decides not to extend the contract with you. You might have done a great job with TDD and negative testing to ensure the robustness of your software. But exceptions are what they are, exceptions, and they need careful attention to be handled properly in your software.
Here are some pitfalls (the "Why" part of it) that many of us fall into with Exception Handling.
--We tend to be reactive rather than proactive: we don’t handle exceptions properly until some serious error is reported. Even then, we limit the bug fix to the particular error reported.
--‘Satisfy the Compiler’ approach – We think we are good as long as we satisfy the compiler.
--‘These exceptions rarely occur’ approach – We think certain exceptions are unlikely to occur because we assume an ideal production environment.
--Misconception that Exception Handling is only a developer tool for debugging and tracing the root cause - Yes, it is that, but it is not limited to that.
--‘Swallowing Exceptions’ - We do nothing more than print a Stack Trace or an Error Message. Exception Handling is not just Exception Logging.
--'I deal with my code only & I only deal with my code' approach - Without understanding the BIG picture of where a module/method fits in the application flow, we handle an exception where it need not be handled OR we don't handle an exception where it needs to be.
(I am standing up here to admit - "Yes, I made those mistakes too".)
This is not about developers lacking technical knowledge of the subject. As developers, we generally know the technical facts of Exception Handling, but we sometimes don't ask these very important questions about its implementation:
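To make the ‘Swallowing Exceptions’ pitfall concrete, here is a minimal sketch in Python (chosen only for illustration; the post itself is language-neutral). The names `OrderLoadError`, `parse_quantity` and the logger name `orders` are hypothetical. The first function prints the error and carries on; the second logs it once with context and raises a domain-level error that a caller higher up in the application flow can act on.

```python
import logging

logger = logging.getLogger("orders")  # hypothetical logger name


class OrderLoadError(Exception):
    """Hypothetical domain-level error that callers can act on."""


def parse_quantity_swallowing(raw):
    # Pitfall: the failure is printed and then lost.
    try:
        return int(raw)
    except ValueError as exc:
        print(exc)   # the caller receives None and fails later, far from the cause
        return None


def parse_quantity(raw):
    # Log once with context, then raise an error the caller can handle.
    try:
        return int(raw)
    except ValueError as exc:
        logger.warning("invalid quantity %r", raw)
        raise OrderLoadError(
            f"quantity must be a whole number, got {raw!r}"
        ) from exc
```

Whether the right reaction is to raise, retry, or substitute a default depends on where the method sits in the application flow, which is the point of the last pitfall in the list.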
--What do these Exceptions mean to the Software or the Customer’s business or the Workflow of the Application? [Beyond the development cycle]
--What has to be done if these Exceptions occur in production?
----Do they need to be recovered from?
------If so, how can they be recovered from? And what's next? Is it "Fail-Safe" or "Control Change"?
----Do they NOT need to be recovered from in the software? If so, what’s the action expected from the user?
------Can they be safely ignored and printed to the Application Log? If so, which Application Log?
------If the user has to take appropriate action, what is the meaningful information to be printed to the Log? And again, which Log in this case (Different Logs have different meanings to the business users)?
--Is enough attention being paid to using Asserts ("Fail-Safe" & "Prevention rather than Cure")?
--Is enough attention given to Testing the Exception Handling? (Also, not all Exceptions can be produced or reproduced in a Testing or Staging environment. But the application design shouldn’t ignore this.)
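One way to answer several of the questions above in code is sketched below, again in Python and purely as an illustration. Everything here is an assumption made for the example: the two exception classes, the `fetch_rate` function, and the split between an `app` log for support engineers and an `audit` log for business users. A transient failure is retried and then handled fail-safe with a default value; a data error is not recoverable in software, so the user is told what to do and the error is re-raised; an assert guards against a programming mistake.

```python
import logging

app_log = logging.getLogger("app")      # hypothetical: technical detail for support
audit_log = logging.getLogger("audit")  # hypothetical: messages business users act on


class TransientError(Exception):
    """Hypothetical: a failure that may succeed on retry (e.g. a timeout)."""


class DataError(Exception):
    """Hypothetical: bad input that only the user can correct."""


def fetch_rate(source, currency, default_rate, attempts=3):
    """Return a rate from source(currency); retry transient failures,
    then fall back to default_rate (the fail-safe path)."""
    assert attempts > 0, "attempts must be positive"  # prevention rather than cure
    for attempt in range(1, attempts + 1):
        try:
            return source(currency)
        except TransientError:
            app_log.warning("rate lookup failed (attempt %d of %d)",
                            attempt, attempts)
        except DataError:
            # Not recoverable in software: say what to fix, then re-raise.
            audit_log.error("Unknown currency %r; correct the order and resubmit",
                            currency)
            raise
    app_log.error("using default rate for %s after %d attempts",
                  currency, attempts)
    return default_rate
```

Whether a default rate is acceptable at all is a business decision, not a technical one, which is why these questions need to be asked with the customer rather than settled inside a catch block.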
So, let us expect and respect those exceptions:
✓ We need to design Exception Handling hand-in-hand with the Application Design. Just like the Application Design, the Exception Handling design needs to evolve too; it is an integrated aspect of the Application Design rather than an add-on.
✓ We need to analyze the Exception flow on the assumption that the exceptions have happened, just as we analyze the normal flow in general Application Design. In layman's terms, we can call it a “pessimistic design”.
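"Assume the exception has happened" can also be practiced in tests by injecting the failure instead of waiting for production to produce it. The sketch below uses Python's standard `unittest.mock`; the function `save_report`, its `storage` collaborator and the `pending` list are hypothetical names invented for this example.

```python
from unittest import mock


def save_report(storage, report, pending):
    """Hypothetical function under test: if storage is unavailable,
    keep the report in `pending` for a later retry and return False."""
    try:
        storage.write(report)
        return True
    except OSError:
        pending.append(report)
        return False


def test_save_report_when_disk_is_full():
    storage = mock.Mock()
    # Injected fault: the write fails as if the disk were full.
    storage.write.side_effect = OSError("No space left on device")
    pending = []

    assert save_report(storage, "q3", pending) is False
    assert pending == ["q3"]  # nothing was silently lost
    storage.write.assert_called_once_with("q3")
```

A test like this covers a condition that is hard to reproduce in a Testing or Staging environment, and it documents the recovery decision the design has made.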
Well and good if your team is already composed of design gurus and seasoned architects who do this job to perfection. But if that is not the case, it is definitely something your team has to work on, and many things will fall into place once it does.
Having said that, building software that is 100% robust with respect to Exceptions is, in reality, very difficult. Maybe it would happen over a long maintenance cycle. What we must try to do is respect the Exceptions, improve the robustness of the software, and get as close as possible to perfection.
Does your team understand what those Exceptions mean after the development cycle? Are your team members in sync on the Exception Handling mechanism? Is there consistency in handling Exceptions?
Does your Test Planning have a place for testing Exceptions?
Have you collaborated with the business side (customer) on handling these unusual conditions? Is your implementation in sync with their expectations?
Note: Though I have tried to take a language-neutral and technology-neutral stand here to highlight the importance of Exception Handling, I have been a little biased towards user-level application development. System-level programming is a totally different subject, with traps, page faults, invalid memory references etc. to handle. The closer we get to those 0s and 1s in our programming, the greater the demands for robustness, which goes with the saying “It doesn’t matter how strong your walls are when your foundation is weak”. (I have not done much system programming, as a matter of fact.)
"A pessimistic approach is of great importance in designing software that is highly robust"
(Attribution: Images on this post are downloaded from FreeDigitalPhotos.Net)