Thursday, September 12, 2013

Alice's Adventures in SQL-NULL-Land

Programming in a relational database environment is usually a very predictable affair.   Everyone understands the concept of rows and columns in tables.  SQL has only four operations to learn (five if you count MERGE).  And thanks to products like Excel, business users know the concept, too, and we all have a common language of tables, rows, and columns.

But, for programmers, there is one thing that remains tricky, and it's a little thing if it was a thing at all, and if we're not careful, we're soon tumbling down Alice's rabbit hole.  That one little thing is the NULL.  And the NULL is at its trickiest when we use it in logic.

Here's a sample Oracle PL/SQL anonymous block.  Before you cut and paste this snippet of code into SQLPLUS or another SQL tool of your choice, answer this question: what will it print?

set serveroutput on 
declare

   the_letter varchar2(1) := null; 

begin

   if the_letter not in ( 'A', 'B', 'C')   then 
        dbms_output.put_line('if true');
   else 
       dbms_output.put_line('else false');
   end if; 

end;
This bit of code cuts to the point right away. Execute it, and the output is
else false
There's no mistake here, "else false" is the code-correct answer. Sure, the value in the_letter may not be an A or a B or a C, and perhaps you expect it to print "if true".  This result may not be what you want or expect, but this code always prints "else false". And in SQL-NUL-Land, that is the correct answer.  Here's another code snippet:
set serveroutput on 
declare

   alice boolean;

begin 

    alice := 'RABBIT' = NULL;
  
   if ( alice ) then 
      dbms_output.put_line('The rabbit is null');
   end if;

   if NOT (alice) then 
      dbms_output.put_line('The rabbit is not null');
   end if;

end;
Execute that, and what do you see? It's like listening to John Cage's 4'33": Nothing. And in that nothing is everything.

We are taught that binary variables have two states: TRUE or FALSE.  And that's true, or not false, except when handling NULLs.  Our binary variable ALICE actually has three states: TRUE, FALSE, or NULL.  As programmers, we need to properly code for the three possible states of a binary variable.

There are a several ways of handling this, here are few suggestions:
  • Use the VALUE, COALESCE, or NVL function  to assign a default value to any variable that might be null.
  • Leave an extra ELSE on all IF/THEN/ELSE constructs.
  • Explicitly test for NULL values. Notice that we explicitly test for NULL using the keyword "is" and not the equal "=" sign.
  • Properly initialize all variables when we declare them.
Here's a code snippet showing all three suggestions.
declare
   alice boolean;
begin
   alice := 'RABBIT' = NULL;
 
   if NOT( coalesce(alice,false) ) then
      dbms_output.put_line('The rabbit is not null');
   end if;
       
   if (alice) then 
      dbms_output.put_line('The rabbit is not null');
   elsif NOT(alice) then
      dbms_output.put_line('The rabbit is not null');
   else -- Catch the third state
      dbms_output.put_line('Alice is undefined, neither true nor false');
   end if;

   if alice is null then -- Explicitly test for null
      dbms_output.put_line('Alice is undefined, neither true nor false');
   end if;
      

end;
Yes, the examples are a bit contrived, and of course, you wouldn't really code something like
alice := 'RABBIT' = NULL;
But null values can occur in odd ways. Perhaps a function returning a value behaves badly and returns a null? Or a SQL select doesn't satisfy the where clause? Consider this snippet, based on a real problem:
declare

   the_name varchar2(32);

   is_bedrock_citizen boolean;

begin

   -- Function returning who is using this routine
   the_user := get_citizen_name();  

   -- See if they live in Bedrock
   is_bedrock_citizen := the_user in ('FRED','WILMA','BARNEY','BETTY'); 

   if NOT is_bedrock_citizen then
       raise_application_error(-20001,
                               'You are not a Bedrock citizen.');
   else
       dbms_output.put_line('You are a Bedrock citizen.');
       -- More code below...
    
   end if;

end;
When the get_citizen_name() function works properly and returns someone's name, the program works as its designers intended. The name is compared to the list of Bedrock citizens and assigns a TRUE or FALSE value to the is_bedrock_citizen variable. Foks who live outside of Bedrock get the raise_application_error message and the program stops;  Wilma, Fred, Barney, and Betty get the "You are a citizen" message and the program continues.

But what happens when get_citizen_name() misbehaves and returns a NULL value? Now we have a logical comparison using a NULL value, and the result is neither TRUE nor FALSE, the logical value is NULL. The logical test of NOT is_bedrock_citizen fails and the else block executes. So, whether this user was a citizen or not, if get_citizen_name() misbehaves, then everyone is a citizen!

Good luck, and don't go down this rabbit hole!  Initialize your variables, and be prepared to handle NULLs.

No comments:

Post a Comment