On several occasions I have observed that there are SQL and non-SQL developers who don’t know how to troubleshoot a programming problem in general, or a SQL-related issue in particular. I’m not referring here to the complex problems that typically require the expertise of a specialist, but to simple day-to-day situations: troubleshooting an error thrown by the database engine, an error in the logic, a performance issue, unavailability of resources, etc. I’m not necessarily talking about the people posting questions on forums, professional networks or blogs, even if in many situations they could have found an answer by doing a little research, but about developers I have seen at work. It’s true that there are also many cases in which the software throws an error message that gives no hint where to start, or in which the error is reported at a different line than the one where it actually occurs, leading the developer down a false trail.
Before going into detail, let’s take a short look at troubleshooting and what it means. Paraphrasing Wikipedia’s general definition, troubleshooting in IT is a type of problem solving applied to software and infrastructure-related issues. Software issues refer not only to the various types of errors thrown by software applications, but also to functional, rendering or configuration errors, performance issues, data quality issues, etc. Infrastructure-related issues could refer to the IT infrastructure: network, information systems, processes, methods or methodologies used. In this post I will refer only to software issues, even if the techniques for troubleshooting them could also be applied to infrastructure issues.
Polya’s Approach to Problem Solving
In his book 'How To Solve It', G. Polya, a well-known Hungarian mathematician, advances a natural four-step approach to solving a problem: 1. understanding the problem, 2. devising a plan, 3. carrying out the plan, and 4. looking back [1]. Polya’s approach could be used for all types of problems, including IT problems, and even if I find it too high-level for solving this type of problem, it is a cornerstone on which more detailed approaches could be built. Let’s look briefly at each of Polya’s four steps.
1. Understanding the problem
Understanding the problem comes down to identifying what is known (the data, the actual facts) and what is not known (what causes the issue and how it will be solved). Somebody once said that a problem well understood is half solved, and there are quite good chances of arriving at the wrong solution if the problem is not well understood. In mathematics the problem is defined beforehand together with the whole range of constraints; in IT, when troubleshooting, the problem first needs to be defined. In the context of this post the problem revolves around a technical or business issue appearing in the form of an error message, an unexpected or wrong application behavior, a wrong process, etc. Thus the actual facts could come down to the error message, current vs. expected behavior, tools used, high/low-level design, business logic, affected objects, specific constraints, etc.
2. Devising a plan
In this step one attempts to find the connection between the data and the unknown, looking at the problem from different angles in order to obtain an idea of the solution, to make a plan [1]. We have a plan when we know which steps we have to follow in order to identify the issue (solve the problem). The steps don’t have to be too detailed, but they must be actionable; the plan need not be complete, but it should serve as a base that can evolve with time, for example when new information or results are found. There could be multiple directions to look into, for example based on a list of possible causes, the constraints the various features come with, different features for implementing the same thing, etc.
Naturally, the first questions a developer should ask are: have I seen this issue before, in the same or a slightly modified form? Could the problem be broken down into smaller (known) problems? Could anything useful be derived from the data? Have all the essential notions involved in the problem been considered [1]? Essential notions are always a thing to look into, mainly because I would say that many issues derive from feature constraints or from the misuse of features. Tools like brainstorming, checklists, root-cause analysis and conceptual mapping could be used here, in fact any tool which could help us track the essential notions and the relations between them.
3. Carrying out the plan
Once the plan is sketched, we can go on and approach each of its branches, performing the successive steps in one branch until we reach an end-point (a point from which we can’t go further). There could be branches going nowhere, multiple solutions, or no apparent solution to the problem. Everything is possible… Most likely, while carrying out the plan we will discover other intermediary steps and other branches (alternative ways of arriving at the same result or of approaching different constraints).
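The idea of working through the branches of a plan, and of adding branches discovered along the way, can be sketched roughly as a worklist. This is only an illustration under simplifying assumptions: each branch is collapsed into a single check, and the function carry_out, the three checks and the "slow query" scenario are all hypothetical names invented for the example.

```python
from collections import deque


def carry_out(plan):
    """Work through a plan given as a list of (branch_name, check) pairs.

    Each check() returns (solved, new_branches), where new_branches is a
    list of further (branch_name, check) pairs discovered along the way.
    Returns the name of the branch that solved the problem (or None) and
    a log of every branch that was tried.
    """
    pending = deque(plan)
    log = []
    while pending:
        name, check = pending.popleft()
        solved, new_branches = check()
        log.append((name, solved))
        if solved:
            return name, log
        # Branches discovered while carrying out the plan are queued too.
        pending.extend(new_branches)
    return None, log  # every branch led nowhere: no apparent solution


# Hypothetical checks for a slow query (assumptions for illustration only).
def check_missing_index():
    # Dead end, but it points to another direction worth looking into.
    return False, [("stale statistics", check_statistics)]


def check_locking():
    return False, []


def check_statistics():
    return True, []


plan = [("missing index", check_missing_index), ("locking", check_locking)]
solution, log = carry_out(plan)
print(solution)
print(log)
```

The log kept by the sketch is also what the fourth step builds on: it records which directions were tried and which led nowhere.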
4. Looking back
According to Polya, this step comes down to examining the solution [1]: reviewing the argumentation used and the solution’s construction, and asking whether the solution is optimal, whether it could be reused to solve other types of problems, or whether it could be improved or refactored. This is actually a step many developers completely ignore: they found a solution, it’s working, so their work is done! No, even when pressed for time these aspects of problem solving should also be considered. From my point of view this step also includes documenting the issue and, in special cases, communicating the solution to the circle of professionals (e.g. in terms of best practices or lessons learned, why not in a blog post, etc.). Topics like optimality and refactoring are quite complex and deserve a post of their own, therefore I will limit myself to mentioning that they typically consider the solution from the point of view of performance, complexity, (re)usability and design, the developer having to trade off between these and other similar (quality) dimensions.
Beyond Polya’s Approach
A natural question: do we really have to follow this approach?! Come on, there will be cases when you’ll have the solution without explicitly attempting to define the problem or devise a plan, or merely by listing the scope and the constraints! Unconsciously we actually follow the first three steps, but forget or completely ignore the fourth, and I feel that applying Polya’s approach brings some “conscious thought” into this process that could help us make the most of it.
In many cases the solution will be there in the documentation, giving developers explicit or implicit hints about the areas in which to search. For example, in case of an error related to a query, a first input is the error message returned by the database engine. Fortunately, RDBMS vendors like Microsoft and Oracle also provide a longer description for each error, which helps in understanding what the error message is about. This is the happiest case; there are also many software tools that, after running for half an hour, return a fuzzy error message (e.g. ‘an error occurred!’) and nothing more.
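As a minimal sketch of treating the engine’s error message as the first known fact, the snippet below captures it instead of letting it get lost. It assumes SQLite through Python’s standard sqlite3 module purely for convenience (drivers for the engines mentioned above expose errors differently), and the helper run_query, the orders table and the misspelled column are hypothetical.

```python
import sqlite3


def run_query(conn, sql):
    """Run a query and return (rows, error); exactly one of them is None."""
    try:
        return conn.execute(sql).fetchall(), None
    except sqlite3.Error as exc:
        # Keep the exception type and the engine's own message: together
        # they are the starting point for searching the documentation.
        return None, f"{type(exc).__name__}: {exc}"


# Hypothetical in-memory table used only for the illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.execute("INSERT INTO orders (amount) VALUES (10.5)")

# The column name is misspelled on purpose to provoke an engine error.
rows, error = run_query(conn, "SELECT amout FROM orders")
print(error)

# The corrected query returns rows and no error.
rows_ok, error_ok = run_query(conn, "SELECT amount FROM orders")
print(rows_ok)
```

Here the message names the offending identifier, which is the happy case described above; a wrapper that swallowed the exception and reported only ‘an error occurred!’ would throw away exactly this fact.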
References:
[1] G. Polya (1973) How To Solve It: A New Aspect of Mathematical Method, 2nd Ed. Stanford University. ISBN: 0-691-08097-6.