04 February 2017

Killing Sessions - Killing ‘em Softly and other Snake Stories

Introduction

    There are many posts on the web advising succinctly how to resolve a blocking situation by terminating a session via kill command, though few of them warn about its use and several important aspects that need to be considered. The command is powerful and, using an old adagio, “with power comes great responsibility”, responsibility not felt when reading between the lines. The easiness with people treat the topic can be seen in questions like “is it possibly to automate terminating sessions?” or in explicit recommendations of terminating the sessions when dealing with blockings.

   A session is created when a client connects to a RDBMS (Relational Database Management System) like SQL Server, being nothing but an internal logical representation of the connection. It is used further on to perform work against the database(s) via (batches of) SQL statements. Along its lifetime, a session is uniquely identified by an SPID (Server Process ID) and addresses one SQL statement at a time. Therefore, when a problem with a session occurs, it can be traced back to a query, where the actual troubleshooting needs to be performed.

   Even if each session has a defined scope and memory space, and cannot interact with other sessions, sessions can block each other when attempting to use the same data resources. Thus, a blocking occurs when one session holds a lock on a specific resource and a second session attempts to acquire a conflicting lock type on the same resource. In other words, the first session blocks the second session from acquiring a resource. It’s like a drive-in to a fast-food in which autos must line up into a queue to place an order. The second auto can’t place an order until the first don’t have the order – is blocked from placing an order. The third auto must wait for the second, and so on. Similarly, sessions wait in line for a resource, fact that leads to a blocking chain, with a head (the head/lead blocking) and a tail (the sessions that follow). It’s a FIFO (first in, first out) queue and using a little imagination one can compare it metaphorically with a snake. Even if imperfect, the metaphor is appropriate for highlighting some important aspects that can be summed up as follows:

  • Snakes have their roles in the ecosystem
  • Not all snakes are dangerous
  • Grab the snake by its head
  • Killing ‘em Softly
  • Search for a snake’s nest
  • Snakes can kill you in sleep
  • Snake taming

   Warning: snakes as well blockings need to be handled by a specialist, so don’t do it by yourself unless you know what are you doing!

Snakes have their roles in the ecosystem

    Snakes as middle-order predators have an important role in natural ecosystems, as they feed on prey species, whose numbers would increase exponentially if not kept under control. Fortunately, natural ecosystems have such mechanism that tend to auto-regulate themselves. Artificially built ecosystems need as well such auto-regulation mechanisms. As a series of dynamical mechanisms and components that work together toward a purpose, SQL Server is an (artificial) ecosystem that tends to auto-regulate itself. When its environment is adequately sized to handle the volume of information or data it must process then the system will behave smoothly. As soon it starts processing more data than it can handle, it starts misbehaving to the degree that one of its resources gets exhausted.

   Just because a blocking occurs doesn’t mean that is a bad thing and needs to be terminated. Temporary blockings occur all the time, as unavoidable characteristic of any RDBMS with lock-based concurrency like SQL Server. They are however easier to observe in systems with heavy workload and concurrent access. The more users in the system touch the same data, the higher the chances for a block to occur. A good design database and application architecture typically minimize blockings’ occurrence and duration, making them almost unobservable. At the opposite extreme poor database design combined with poor application design can make from blockings a DBA’s nightmare. Persistent blockings can be a sign of poor database or application design or a sign that one of the environment’s limits was reached. It’s a sign that something must be done. Restarting the SQL server, terminating sessions or adding more resources have only a temporary effect. The opportunity lies typically in addressing poor database and application design issues, though this can be costlier with time.

Not all snakes are dangerous

    A snake’s size is the easiest characteristic on identifying whether a snake is dangerous or not. Big snakes inspire fear for any mortal. Similarly, “big” blockings (blockings consuming an important percentage of the available resources) are dangerous and they have the potential of bringing the whole server down, eating its memory resources slowly until its life comes to a stop. It can be a slow as well a fast death.

   Independently of their size, poisonous snakes are a danger for any living creature. By studying snakes’ characteristics like pupils’ shape and skin color patterns the folk devised simple general rules (with local applicability) for identifying whether snakes are poisonous or not. Thus, snakes with diamond-shaped pupils or having color patterns in which red touches yellow are likely/believed to be poisonous. By observing the behavior of blockings and learning about SQL Server’s internals one can with time understand the impact of each blocking on server’s performance.

Grab the snake by its head

    Restraining a snake’s head assures that the snake is not able to bite, though it can be dangerous, as the snake might believe is dealing with a predator that is trying to hurt it, and reach accordingly. On the other side troubleshooting blockings must start with the head, the blocking session, as it’s the one which created the blocking problem in the first place.

    In SQL Server sp_who and its alternative sp_who2 provide a list of all sessions, with their status, SPID and a reference with the SPID of the session blocking it. It displays thus all the blocking pairs. When one deals with a few blockings one can easily see whether the sessions form a blocking chain. Especially in environments under heavy load one can deal with a handful of blockings that make it difficult to identify all the formed blocking chains. Identifying blocking chains is necessary because by identifying and terminating directly the head blocking will often make the whole blocking chain disappear. The other sessions in the chain will perform thus their work undisturbed.

    Going and terminating each blocking session in pairs as displayed in sp_who is not recommended as one terminates more sessions than needed, fact that could have unexpected repercussions. As a rule, one should restore system’s health by making minimal damage.

    In many cases terminating the head session will make the blocking chain disperse, however there are cases in which the head session is replaced by other session (e.g. when the sessions involve the same or similar queries). One will need to repeat the needed steps until all blocking chain dissolve.

Killing ‘em Softly 

   Killing a snake, no matter how blamable the act, it is sometimes necessary. Therefore, it should be used as ultimate approach, when there is no other alternative and when needed to save one’s or others’ life. Similarly killing a session should be done only in extremis, when necessary. For example, when server’s performance has deprecated considerably affecting other users, or when the session is hanging indefinitely.


    Kill command is powerful, having the power of a hammer. The problem is that when you have a hammer, every session looks like a nail. Despite all the aplomb one has when using a tool like a hammer, one needs to be careful in dealing with blockings. A blocking not addressed correspondingly can kick back, and in special cases the bite can be deadly, for system as well for one’s job. Killing the beast is the easiest approach. Kill one beast and another one will take its territory. It’s one of the laws of nature applicable also to database environments. The difference is that if one doesn’t addresses the primary cause that lead to a blocking, the same type of snake more likely will appear repeatedly.


    Unfortunately, the kill command is no bulletproof for terminating a session, it may only severe the snake. As the documentation warns, there can be cases in which the method won’t have any effect on the blocking, the blocking continuing to room around. So, might be a good idea to check whether the session disappeared and keep an eye on it until it totally disappeared. Especially when dealing with a blocking chain it can happen that the head session is replaced by another session, which probably was waiting for the same resources as the previous head session. It may happen that one deals with two or more blocking chains independent from each other. Such cases appear seldom but are possible.


     Killing the head session with a blocking without gathering some data provides less opportunities for learning, for understanding what’s happening in your system, of identifying what caused the blocking to occur. Therefore, before jumping to kill a session, collect the data you need for further troubleshooting.

Search for a snake’s nest 

   With the warning that unless one deals with small snakes, might not be advisable in searching for a snake’s nest, the idea behind this heuristic is that with a snake’s occurrence more likely there is also a nest not far away, where several other snakes might hide. Similarly, a query that causes permanent blockings might be the indication for code that generates a range of misbehaving queries. It can be same code or different pieces of code. One can attempt to improve the performance of a query that leads to blockings by adding more resources on the server or by optimizing SQL Server’s internals, though one can’t compensate for poor programming. When possible, one needs to tackle the problem at the source, otherwise performance improvements are only temporary.

Snakes can kill you in sleep 

   When wondering into the wild as well when having snakes as pets one must take all measures to assure that nobody’s health is endangered. Same principle should apply to databases as well, and the first line of defense resides in actively monitoring the blockings and addressing them timely as they occur. Being too confident that nothing happens and no taking the necessary precautions can prove to be a bad strategy when a problem occurs. In some situations, the damage might be acceptable in comparison with the effort and costs needed to build the monitoring infrastructure, though for critical systems it can come with important costs.

Snakes’ Taming 

   Having snakes as pets doesn’t seem like a good idea, and there are so many reasons why one shouldn’t do it (see PETA’s reasons)! On the other side, there are also people with uncommon hobbies, that not only limit themselves at having a snake pet, but try to tame them, to have them behave like pets. There are people who breed snakes to harness their venom for various purposes, occupation that requires handling snakes closely. There are also people who brought their relation with snakes at level of art, since ancient Egypt snake charming being a tradition in countries from Southeast Asia, Middle East, and North Africa. Even if not all snakes are tameable, snake’s taming and charming is possible. In the process the tamer must deprogram or control snakes’ behavior, following a specific methodology in a safe environment.

    No matter how much one tries to avoid persistent blockings, one can learn from troubleshooting blockings, about their sources, behavior as well about own limitations. One complex blocking can be a good example with which one can test his knowledge about SQL Server internals as well about applications’ architecture. Each blocking provides a scenario in which one can learn something.

    When fighting with a blocking, it’s wise to do it within a safe environment, typically a test or development environment. Fighting with it in a production environment can cause unnecessary stress and damage. So, if you don’t have a safe environment in which to carry the fight, then build one and try to keep the same essential characteristics as in production environment!

   There will be also situations in which one must fight with a blocking in the production environment. Then, be careful in not damaging the data as well the environment, and take all the needed precautions!


Conclusion

    The comparison between snakes and blockings might not be perfect, though hopefully it will imprint in reader’s mind the dangers of handling blockings inappropriately and increase the awareness in what concerns related topics.

No comments: