30 January 2011

What’s New #3: SQL Server Resources

    During the past weeks I found several interesting learning resources on SQL Server 2008 topics, so here they are:

    For those interested in upgrading their knowledge to SQL Server 2008 R2, check the Microsoft SQL Server 2008 R2 Update for Developers Training Course videos on Channel9. They cover topics like Location Awareness, CLR Integration, StreamInsight, Reporting Services, Application and Multi-Server Management, plus improvements to BLOB, T-SQL and the tools for Excel 2010 or SharePoint 2010.

    The ER diagrams of SQL Server’s DMVs and DMFs are available for download as PDF at SQL Server 2008 Systems Views Map, with equivalent maps for SQL Server 2005 and SQL Server 2008 R2. Even if not all objects and possible relations are shown, they could still save you a lot of effort, as the diagrams depict the relations between the most important entities and the cardinality existing between them, plus the additional constraints participating in the joins. The entities represented are: Objects, Types and Indexes, Linked Servers, CLR, Database Mirroring, Service Broker, Resource Governor, Transactions, Databases and Storage, Traces and Events, Execution Environment, Server information.

    There are several free SQL Server books on SQLServerCentral.com coming from Redgate: A. Kuznetsov’s Defensive Database Programming, B. McGehee’s SQL Server Maintenance Plans, How to become an Exceptional DBA, Mastering SQL Server Profiler and Brad’s Sure Guide to SQL Server 2008, J. Magnabosco’s Protecting SQL Server Data, G. Fritchey’s SQL Server Execution Plans and SQL Server Crib Sheet Compendium (coauthor with A. Prasanna), R. Landrum’s SQL Server Tacklebox, plus several volumes of ‘Best of SQL Server Central’ authors in Vol. 1-4, Vol. 5, Vol. 6 and Vol. 7, and several other books to come.

    It seems there are also a few new books on the Scribd website: R. Colledge’s SQL Server Administration in Action, R. Vieira’s Professional Microsoft SQL Server 2008 Programming, M. Lee and M. Mansfield’s SQL Server 2008 Administration Instant Reference, plus many other books that can be retrieved by searching for SQL Server 2008 on the respective site.


    There are several papers coming from Microsoft and several professionals. They are worth checking:
-  K. Lassen’s paper on Using Management Data Warehouse for Performance Monitoring, in which he provides several best practices for performance management of SQL Server 2008. Besides an introduction to the Management Data Warehouse features, he also provides an extension of the built-in functionality for indexes, plus several functions and queries.
-  D. Kiely's paper on SQL Server 2008 R2 Security Overview for Database Administrators covers some of the most important security features in SQL Server 2008. There is a similar paper targeting SQL Server 2005.
-  B. Beauchemin’s paper on SQL Server 2005 Security Best Practices - Operational and Administrative Tasks describes the best practices for setting up and maintaining security in SQL Server 2005.
-  SQL Server 2008 Compliance Guide: a paper written to help organizations and individuals understand how to use the features of the Microsoft® SQL Server® 2008 database software to address their compliance needs.
-  SQL Server 2008 Upgrade Technical Reference Guide: essential phases, steps and best practices to upgrade existing instances of SQL Server 2000 and 2005 to SQL Server 2008.
-  Technet Wiki Checklists on Database Engine Security, Database Engine Connections and Data Access (probably more to come).
-  Microsoft also maintains a list of White Papers on SQL Server 2008 and on SQL Server 2005.

07 January 2011

Pulling the Strings of SQL Server–Part IV: Spaces, Trimming, Length and Comparisons

    In the previous post on concatenation, I talked about the importance of spaces and other delimiters in making the output of concatenations more “readable”. Besides their importance in natural language, spaces have further implications for the way strings are stored and processed. As remarked in the introductory post on this topic, two types of spaces stand out from the crowd: the trailing spaces, found at the right end of a string, and the leading spaces, found at the left end of a string. There are few cases in which trailing spaces are of any use, therefore databases like SQL Server usually ignore them. The philosophy regarding leading spaces is slightly different, because there are cases in which they are used to align text to the right, though some tools cut off the leading spaces as well. When no such tools are available, or when either of the two types of spaces is not cut off, we’ll have to do it ourselves, and here we come to the first topic of this post, trimming.

Trimming

   Trimming is the operation of removing the spaces found at the ends of a string. Unlike other programming languages, which use only one function for this purpose (e.g. the Trim function in VB or Oracle), SQL Server makes use of two functions: LTrim, used to trim the spaces found at the left end of the string, and RTrim, used to trim the spaces found at the right end of the string.
-- trimming a string
SELECT LTrim(' this is a string ') Example1 -- left trimming
, RTrim(' this is a string ') Example2 -- right trimming
, LTrim(RTrim(' this is a string ')) Example3 -- left & right trimming
    As can be seen, it’s not so easy to identify the differences in the output; maybe the next function will help to show that there actually is a difference.

Note:
    If it looks like the two trimming functions are not working with strings having leading or trailing spaces, then maybe you are not dealing with a space character but rather with other characters like CR, LF, CRLF or similar, which are sometimes rendered like a space.
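    The situation described in the note can be reproduced with a short sketch. The sample string is made up for illustration, and the nested Replace calls are only one possible way of dealing with such characters; note that they remove every CR, LF and Tab in the string, not only the ones found at its ends:

```sql
-- trimming a string that starts with a Tab and ends with CR + LF
DECLARE @string varchar(50)
SET @string = Char(9) + ' this is a string ' + Char(13) + Char(10)

SELECT Len(LTrim(RTrim(@string))) Length1 -- 21: the Tab and CR/LF block the trimming
, Len(LTrim(RTrim(Replace(Replace(Replace(@string, Char(13), ''), Char(10), ''), Char(9), '')))) Length2 -- 16: control characters removed first
```

    If control characters inside the string need to be preserved, a more selective approach is needed.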

Length

   Before approaching other operations with strings, it’s useful (actually necessary, as we will see) to get a glimpse of the way we can determine the length of a string value, in other words how many characters it has. This is possible by using the Len function:
-- length of a string
SELECT Len('this is a string') Length1 -- simple string
, Len('this is a string ') Length2 --ending in space
, Len(' this is a string') Length3 --starting with a space
, Len(' this is a string ') Length4 --starting & ending with a space
, Len(LTrim(' this is a string ')) Length5 --length & left trimming
, Len(RTrim(' this is a string ')) Length6 --length & right trimming
, Len(LTrim(RTrim(' this is a string '))) Length7 --length, left & right trimming
    In order to understand the above results, one observation is necessary: if a string ends with one or more spaces, the Len function ignores them. This doesn’t happen with the leading spaces, which have to be removed explicitly if needed.
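    When the trailing spaces do need to be counted, two workarounds are commonly used. The sketch below assumes non-Unicode varchar data in a single-byte collation, where DataLength (which returns bytes) matches the number of characters; the sentinel character '.' is an arbitrary choice:

```sql
-- length of a string including the trailing spaces
DECLARE @string varchar(50)
SET @string = 'this is a string  ' -- ends in two spaces

SELECT Len(@string) Length1 -- 16: trailing spaces ignored
, DataLength(@string) Length2 -- 18: number of bytes stored
, Len(@string + '.') - 1 Length3 -- 18: the sentinel protects the trailing spaces
```

    For nvarchar values DataLength returns two bytes per character, so the result would have to be halved or the sentinel approach used instead.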

Comparisons

    A comparison points out the differences or similarities between two values, involving at minimum two expressions that reduce at runtime to a data type, plus the comparison operator. This means that each member of a comparison can include any valid combination of functions, as long as they reduce to compatible data types. As far as the comparison of strings is concerned, things are relatively simple, the comparison being allowed independently of whether the strings have fixed or varying length. Relatively simple, because if we went into details we’d need to talk about character sets (also called character encodings or character maps) and other string goodies the ANSI SQL standard(s) come with, including a set of rules that dictate the behavior of comparisons. So, let’s keep things as simple as possible. As per the above attempt at a definition, a comparison typically tests for equality or difference, based on the equal (“=”) and not equal (“<>” or “!=”) operators. Here are some simple examples:
-- sample comparisons
SELECT CASE WHEN 'abc' != 'abc ' THEN 1 ELSE 0 END Example1
, CASE WHEN ' abc' != 'abc' THEN 1 ELSE 0 END Example2
, CASE WHEN ' ' != '' THEN 1 ELSE 0 END Example3
-- error comparison
, CASE WHEN 'abc' != NULL THEN 1 ELSE 0 END Example4
, CASE WHEN 'abc' = NULL THEN 1 ELSE 0 END Example5
-- adequate NULL comparison
, CASE WHEN 'abc' IS NOT NULL THEN 1 ELSE 0 END Example6
, CASE WHEN 'abc' IS NULL THEN 1 ELSE 0 END Example7

comparisons - set 1
   The first three examples demonstrate again the behavior of leading and trailing spaces. The next two examples, even if they seem quite logical in terms of natural language semantics, are wrong from the point of view of SQL semantics, because a comparison in which one of the values is NULL evaluates to NULL (unknown) rather than to true or false. As a result, the CASE expressions in the 4th and 5th examples both fall through to the ELSE branch and return 0. The last two examples show how NULLs should be handled in comparisons with the help of the IS operator and its negation, IS NOT.
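   The NULL behavior can be made visible with a small sketch. It assumes the default SET ANSI_NULLS ON setting; the @value variable is introduced only for illustration, and IsNull and Coalesce are shown as two possible ways of replacing a NULL before comparing:

```sql
-- comparisons involving NULL (assuming SET ANSI_NULLS ON)
DECLARE @value varchar(10) -- not initialized, therefore NULL

SELECT CASE
    WHEN @value = 'abc' THEN 'true'
    WHEN NOT (@value = 'abc') THEN 'false'
    ELSE 'unknown'
  END Example1 -- 'unknown': neither the comparison nor its negation is true
, CASE WHEN IsNull(@value, '') = 'abc' THEN 1 ELSE 0 END Example2 -- 0: NULL replaced by an empty string
, CASE WHEN Coalesce(@value, 'abc') = 'abc' THEN 1 ELSE 0 END Example3 -- 1: NULL replaced by 'abc'
```

   Which replacement value makes sense depends on the meaning NULL has in the respective data.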

   As in the case of numeric values, the comparison between two strings can be expressed by using the “less than” (“<”) and “greater than” (“>”) operators, alone or in combination with the equality operator (“<=”, “>=”) or the negation operator (“!>”, “!<”) (see comparison operators in MSDN). Typically a SQL Server database is case insensitive, so there will be no difference between the following strings: “ABC”, “abc”, “Abc”, etc. Here are some examples:
-- sample comparisons (case sensitive)
SELECT CASE WHEN 'abc' < 'ABC' THEN 1 ELSE 0 END Example1
, CASE WHEN 'abc' > 'abc' THEN 1 ELSE 0 END Example2
, CASE WHEN 'abc' >= 'abc ' THEN 1 ELSE 0 END Example3
, CASE WHEN 'abc' <> 'ABC' THEN 1 ELSE 0 END Example4
, CASE WHEN 'abc' > '' THEN 1 ELSE 0 END Example5
, CASE WHEN ' ' > '' THEN 1 ELSE 0 END Example6

comparisons - case insensitiveness
    The case sensitivity can be changed at column (attribute), database or server level. As we don’t deal with a table and we don’t want to complicate the queries too much, let’s consider changing the sensitivity at database level. So, if you are using a non-production database, try the following script in order to enable case sensitivity:
--enabling case sensitivity for a database
ALTER DATABASE <your database>
COLLATE Latin1_General_CS_AS
    Use the following script to disable it:
--disabling case sensitivity for a database
ALTER DATABASE <your database>
COLLATE Latin1_General_CI_AS
    In order to test the behavior of case sensitivity, first enable the sensitivity and then rerun the previous set of examples (involving case sensitivity).
comparisons - case sensitiveness
    After that you can disable the case sensitivity again by running the last script. Please note that if your database has a different collation, you’ll have to change the scripts accordingly in order to restore your database’s original collation.
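    When changing the database collation is not an option, the sensitivity can also be tried out on a single expression with the COLLATE clause. This is only a sketch of an alternative to the scripts above; the two collation names are the ones already used in this post and might need to be adapted to your environment:

```sql
-- case sensitivity at expression level
SELECT CASE WHEN 'abc' = 'ABC' COLLATE Latin1_General_CI_AS THEN 1 ELSE 0 END Example1 -- 1: case insensitive
, CASE WHEN 'abc' = 'ABC' COLLATE Latin1_General_CS_AS THEN 1 ELSE 0 END Example2 -- 0: case sensitive
```

    The explicit collation applies only to the comparison in which it appears, so nothing has to be switched back afterwards.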

03 January 2011

Pulling the Strings of SQL Server–Part V: Character Indexes

    A string is in fact a concatenation or chain of characters, each character in the string having a position referred to as the character’s index. In programming it’s useful to identify the first occurrence of a character or string in forward or backward order, and with a few simple tricks it’s possible to identify all the occurrences of a character or of a whole string within a string. Such functions appear in some of the well-known programming languages under names like InStr or IndexOf for forward processing, and InStrRev or LastIndexOf for backward processing. SQL Server has only one function for this purpose, namely CharIndex, used to retrieve the position at which a character is found, starting from a given position. If no occurrence is found, the function returns 0, while if no start position is given, the search starts at the beginning of the string. There are several other remarks that can be found in MSDN, and it’s actually always a good idea to consult the documentation from time to time, not only when meeting a new function. But let’s get back to the CharIndex function with a few examples based on searching for a character within a given string:
-- searching a character into a string
DECLARE @string varchar(50)
SET @string = 'this is a string'

SELECT CharIndex(' ', @string) Example1
, CharIndex(NULL, @string) Example2
, CharIndex(NULL, @string, NULL) Example3
, CharIndex(NULL, NULL, NULL) Example4

CharIndex Example Set 1
   While in the first example the function is used to retrieve the first occurrence of a space in the given string, the last three examples highlight the behavior of the function when one of the parameters is NULL, in which case the function returns NULL. This means that when a different behavior is expected, it’s your responsibility to handle the NULL values accordingly.

   Here’s a second set of examples based on the search of a string within another string. As can be seen there isn’t a difference at all in how the function is applied:
-- searching a string into a string
DECLARE @string varchar(50)
DECLARE @search varchar(50)

SET @string = 'this_is_another_string'

SET @search = 'is'

SELECT @string Example1 -- simple string
, CharIndex(@search, @string) Example2 -- occurring position
, CharIndex(@search, @string, 5) Example3 -- second occurring position

CharIndex Example Set 2
   Of the three examples, the third deserves special attention because it attempts to retrieve the second occurrence of the string by using a hardcoded starting value. In fact the search could start one position after the first occurrence, as in the example below:
-- searching a string into a string in forward order
DECLARE @string varchar(50)
DECLARE @search varchar(50)

SET @string = 'this_was_a_string'

SET @search = '_'

SELECT @string Example1 -- simple string
, CharIndex(@search, @string) Example2 -- 1st occurring position (forward)
, CharIndex(@search, @string, CharIndex(@search, @string)+1) Example3 -- 2nd occurring position (forward)
, CharIndex(@search, @string, CharIndex(@search, @string, CharIndex(@search, @string)+1)+1) Example4 -- 3rd occurring position (forward)

CharIndex Example Set 3

    The logic could be repeated over and over again in order to retrieve the n-th position, list all the positions of occurrence or the total number of occurrences.
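    Instead of nesting the calls, the repetition can be expressed as a loop. The following is a minimal sketch rather than a definitive implementation: the @position and @occurrences variables are introduced only for illustration, and the positions are simply printed:

```sql
-- listing all the occurring positions & their total number
DECLARE @string varchar(50)
DECLARE @search varchar(50)
DECLARE @position int
DECLARE @occurrences int

SET @string = 'this_was_a_string'
SET @search = '_'
SET @occurrences = 0
SET @position = CharIndex(@search, @string) -- 1st occurring position

WHILE @position > 0 -- 0 means no (further) occurrence
BEGIN
   SET @occurrences = @occurrences + 1
   PRINT 'Occurrence ' + Cast(@occurrences as varchar(10)) + ' at position ' + Cast(@position as varchar(10))

   -- continue the search one position after the last occurrence
   SET @position = CharIndex(@search, @string, @position + 1)
END

PRINT 'Total occurrences: ' + Cast(@occurrences as varchar(10))
```

    For the given values the loop prints the positions 5, 9 and 11, followed by a total of 3. When one of the two strings is NULL the loop body is never entered, as the condition evaluates to unknown.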

    Even if SQL Server doesn’t provide a function for retrieving the first occurrence in backward order, in other words searching from the end toward the start, there is a simple trick that might do it: SQL Server provides a Reverse function, which reverses the string provided as parameter, so the search can be performed on the reversed string and the resulting position translated back to the original string.
-- 1st occurring position backwards
DECLARE @string varchar(50)
DECLARE @search varchar(50)

SET @string = 'this_was_a_string'
SET @search = '_'

SELECT Len(@string) - CharIndex(@search, Reverse(@string))+1 Example


  Unfortunately, the logic needed to retrieve the second and following occurrences starting from a given position is not so easily readable. Therefore, for the sake of “simplicity”, here’s the logic for reverse processing incorporated in a CharIndexRev function:
-- reverse CharIndex
CREATE FUNCTION dbo.CharIndexRev(
  @search varchar(50)
, @string varchar(500)
, @start int = 0)
RETURNS int
AS
BEGIN
   RETURN Len(@string) - CharIndex(@search, Reverse(@string), CASE WHEN IsNull(@start, 0) != 0 THEN Len(@string) - @start+1 ELSE 0 END)+1
END
And here’s the function at work:
-- searching a string into a string in reverse order
DECLARE @string varchar(50)
DECLARE @search varchar(50)

SET @string = 'this_was_a_string'
SET @search = '_'

SELECT dbo.CharIndexRev(NULL, @string, 0) Example1
, dbo.CharIndexRev(@search, NULL, 0) Example2
, dbo.CharIndexRev(NULL, NULL, 0) Example3
, dbo.CharIndexRev(@search, @string, 0) Example4
, dbo.CharIndexRev(@search, @string, dbo.CharIndexRev(@search, @string, 0)-1) Example5
, dbo.CharIndexRev(@search, @string, dbo.CharIndexRev(@search, @string, dbo.CharIndexRev(@search, @string, 0)-1)-1) Example6

CharIndex Example Set 4
    As can be seen, the output of the last three examples matches the reversed output of the previous set, (11, 9, 5) vs. (5, 9, 11), while the first three examples illustrate the behavior of the function when one of the parameters is NULL.
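    One thing to keep in mind is that when the searched character is not found, CharIndex returns 0 and the formula used in CharIndexRev then yields Len(@string)+1 instead of 0. Below is a sketch of a variant that returns 0 in such cases, mirroring CharIndex. The name CharIndexRevSafe is hypothetical, and, like the original function, the sketch assumes a single-character search value and a string without trailing spaces:

```sql
-- reverse CharIndex returning 0 when nothing is found (hypothetical variant)
CREATE FUNCTION dbo.CharIndexRevSafe(
  @search varchar(50)
, @string varchar(500)
, @start int = 0)
RETURNS int
AS
BEGIN
   DECLARE @position int

   -- position within the reversed string (NULL when a parameter is NULL)
   SET @position = CharIndex(@search, Reverse(@string), CASE WHEN IsNull(@start, 0) != 0 THEN Len(@string) - @start+1 ELSE 0 END)

   -- 0 means not found, otherwise translate back to the original string
   RETURN CASE WHEN @position = 0 THEN 0 ELSE Len(@string) - @position + 1 END
END
GO

-- the variant at work
SELECT dbo.CharIndexRevSafe('_', 'this_was_a_string', 0) Example1 -- 11: last occurrence
, dbo.CharIndexRevSafe('x', 'this_was_a_string', 0) Example2 -- 0: character not found
, dbo.CharIndexRevSafe('_', 'this_was_a_string', 4) Example3 -- 0: no occurrence up to position 4
```

    GO is the batch separator used by tools like Management Studio; it is needed here because CREATE FUNCTION has to be the only statement in its batch.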

01 January 2011

Book Review: Head First SQL

   I love O’Reilly’s Head First book series which, besides the book on SQL, has acquired over its short existence several other titles like the ones on Data Analysis, Statistics and Excel, topics strongly correlated with the theme of the current blog, but also tangentially related topics like Project Management, Design Patterns, Object-Oriented Analysis and Design, Software Development, Web Design, Programming, PHP & MySQL, HTML with CSS & XHTML, C#, Java, AJAX, JavaScript, Python, Servlets and JSP, EJB or Rails. (The book series contains more titles, which seem to be available also at Head First Labs together with other supporting information.) What I like about this book series is that it approaches the themes from a basic level, using a rich set of “brain-friendly” visual representations, a conversational and personalized style, challenging food for thought, plus some advice on how to use the book’s format, facts that allow the reader to make the most of the learning process. In fact the books are approachable for any newbie in the IT branch, so I recommend the whole set of books to beginners as well as to experienced programmers, considering that everybody has something to learn, at least from the presentational format of the books.

   Coming back to the current title, Head First SQL, it covers in a nutshell almost everything somebody has to know about database design, DML and DDL statements, database objects and security. It focuses only on the SQL Standard (I wish I knew which one, as there are several versions/revisions, see SQL 1999), so it could be used for all types of RDBMS that implement the standard, even if the book is built around MySQL. A consequence of this is that vendor-specific flavors won’t be found in here, so the book should be used in combination with the documentation or the books that target the vendor-specific functionality. There are also several topics that are mentioned only briefly rather than covered: reserved words and special characters, temporary tables, data conversion, some numeric functions and operators, indexes, etc.

     I have to remark that some of the definitions are somewhat ambiguous, and I feel that in some cases additional specifications/clarifications are necessary. Examples in this direction are the very first important definitions: the one of the database, seen as “a container that holds tables and SQL structures related to those tables”, the one of a column, seen as a piece of data stored by a table, or the one of a row, seen as a single set of columns that describe attributes of a single thing. The first definition is quite acceptable if the existence of non-relational databases is ignored (it’s also true that the book targets RDBMS, even if this is not evident), however the other two definitions are too vague, at least from my point of view. It’s also true that more rigor and detail would involve the introduction of other concepts (e.g. entity, matrix, etc.), requiring more space and maybe complicating the topics. The reader could intuit from the explanatory text and examples what it’s all about, however expecting the reader to bridge such gaps should be avoided, especially when we consider the intrinsic nature of this book in particular, and of IT books in general. Anyway, this shouldn’t stop you from discovering and enjoying the book!

Notes:

1. As I learned the hard way, not all book formats and writing styles match everybody’s taste and expectations. The simplistic and visually rich format of the book could annoy as well, so it might be a good idea to browse the book on Scribd first before buying it.

2. The source code is available here.