Saturday, December 14, 2013

Regular Expressions: Analyzing PL/SQL Source

We're going to examine a few themes that we've touched on in earlier posts.  In Get Regular (Expressions) we took a first look at regular expressions, and in Recursive SQL we used the Oracle DBA_DEPENDENCIES view for some sample data.  We'll take another look at a couple of views in the Oracle 11g database, and we'll use regular expressions to dig a little deeper into these views.

We have an old package that's been running for over 15 years.  It's gotten rather long in the tooth, and we need to replace it.   First, we need to find all the other objects that use this package. Oracle's DBA_DEPENDENCIES view will show every object that directly references a package. 

describe dba_dependencies
Name                 Null     Type          
-------------------- -------- ------------- 
OWNER                NOT NULL VARCHAR2(30)  
NAME                 NOT NULL VARCHAR2(30)  
TYPE                          VARCHAR2(18)  
REFERENCED_OWNER              VARCHAR2(30)  
REFERENCED_NAME               VARCHAR2(64)  
REFERENCED_TYPE               VARCHAR2(18)  
REFERENCED_LINK_NAME          VARCHAR2(128) 
DEPENDENCY_TYPE               VARCHAR2(4)   
We plug our schema name and package name in as the referenced_owner and referenced_name, run the query, and the database displays the information we're looking for.
select owner, name, type  
from dba_dependencies
where referenced_owner = 'MY_SCHEMA'
  and referenced_name = 'MY_PACKAGE';
And we see that three other objects use our package:
OWNER       NAME      TYPE             
-------     ------    ------------    
SCHEMA1     PCK1      PACKAGE BODY           
SCHEMA4     PROC1     PROCEDURE        
SCHEMA5     FUNC1     FUNCTION       
But we need to dig a little deeper. Our package has many procedures and functions defined within it. Some of the package functions are obsolete, and some of the package functions need to be rewritten in a new package.  Not only do we want to know what other objects are dependent on our package, we want to know which of the package's functions and procedures are called. The DBA_SOURCE view has all the PL/SQL code compiled into the database.
describe dba_source
Name  Null Type           
----- ---- -------------- 
OWNER      VARCHAR2(30)   
NAME       VARCHAR2(30)   
TYPE       VARCHAR2(12)   
LINE       NUMBER         
TEXT       VARCHAR2(4000) 
We will use two regular expression functions to examine the source.   We will use REGEXP_LIKE to identify the source text that has one of the functions. REGEXP_LIKE returns true if the pattern is found in TEXT and false otherwise.  The lower-case "i" is the match parameter indicating that we want a case-insensitive search.   PL/SQL source is not case sensitive, so the case-insensitive search allows us to find text that is upper, lower, or camel-case.
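For instance, a quick check from DUAL shows the effect of the match parameter; the source line in the literal is just an illustration:

select case
          when regexp_like('v_rate := My_Package.My_Func1(p_id);',
                           'my_package\.my_func1', 'i')
          then 'match'
          else 'no match'
       end as result
from dual;

Without the 'i' parameter, this mixed-case text would not match the lower-case pattern.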

We will use REGEXP_SUBSTR to identify the function or procedure.  We'll start searching at the first position of the TEXT, and we'll return the first occurrence we find.  Again, we will use a case-insensitive search.  Using these two functions, we can perform the same sort of analysis that we could do by grepping through a code tree on disk. In this example, we're looking to see if specific functions or procedures are used. Here's our query to perform this analysis against the code compiled into the database:
select distinct owner, name, type, 
       regexp_substr(text,
                    'my_package\.(my_func1|my_func2|my_proc1|my_proc2)',1,1,'i') calls
from dba_source
where regexp_like(text,
                  'my_package\.(my_func1|my_func2|my_proc1|my_proc2)','i') 
 ;
The results of the query are every package, procedure, or function that uses the package, and the query identifies what functions and procedures the calling object uses:
OWNER       NAME      TYPE             CALLS
-------     ------    ------------     -------------------
SCHEMA1     PCK1      PACKAGE BODY     my_package.my_func1
SCHEMA1     PCK1      PACKAGE BODY     MY_PACKAGE.MY_FUNC1
SCHEMA4     PROC1     PROCEDURE        my_package.my_proc2
SCHEMA5     FUNC1     FUNCTION         MY_PACKAGE.MY_FUNC2
This query is a useful summary of what's calling my_package and which procedures in my_package are used. We can get more detail from dba_source if we need it. For example, dba_source includes the line number, so the query
select distinct owner, name, type, line, 
       regexp_substr(text,
                    'my_package\.(my_func1|my_func2|my_proc1|my_proc2)',1,1,'i') calls
from dba_source
where regexp_like(text,
                  'my_package\.(my_func1|my_func2|my_proc1|my_proc2)','i') 
 ;
would return
OWNER       NAME      TYPE             LINE      CALLS
-------     ------    ------------     ----      -------------------
SCHEMA1     PCK1      PACKAGE BODY     330       my_package.my_func1
SCHEMA1     PCK1      PACKAGE BODY     621       my_package.my_func1
SCHEMA1     PCK1      PACKAGE BODY     721       MY_PACKAGE.MY_FUNC1
SCHEMA4     PROC1     PROCEDURE        45        my_package.my_proc2
SCHEMA5     FUNC1     FUNCTION         87        MY_PACKAGE.MY_FUNC2
This query shows the line number where the calling object uses our package. This is useful if we need some context to see why the calling object uses our package and what the calling package is doing with the results.

The above examples show us where specific procedures and functions are used in other objects. This is useful if we want a narrow search, but often we want to see every reference to a function, procedure, or even a publicly-defined variable in our package. The regular expression for this search is even simpler: PL/SQL names consist of letters, numbers, and a few special characters, so when the regular expression can't match any more characters against [a-z0-9_$#]+, we have the whole name. Here's the query:
select distinct owner, name, type, 
       regexp_substr(text,
                    'my_package\.[a-z0-9_$#]+',1,1,'i') calls
from dba_source
where regexp_like(text,
                  'my_package\.[a-z0-9_$#]+','i') 
 ;
DBA_DEPENDENCIES and DBA_SOURCE are good repositories to help us understand what objects are using our code and how those objects are using it.  And the regular expression functions make searching through these views much easier. Anytime we can get the database to do the work for us, "that's a good thing".

Wednesday, December 4, 2013

A Simple Web Client

Cloud computing and software-as-a-service providers have lots of capacity and attractive rates, so many organizations are taking their applications to the cloud.  After an application moves to the cloud, many users still need to pull some of its data to feed other systems.  In this post, we'll look at a simple PL/SQL web client that connects to a web service and downloads data.

For our data,  we'll use the Data Download Program at the Federal Reserve Bank.  There's a lot of statistical data on the Federal Reserve site, and we will use that data as the source for our simple web client.   The data files are small and free to download, so the Bank gives us a good source of data to use.

From the Data Download Program we click on the Foreign Exchange Rates (G.5 / H.10) link, then we get a page that lets us choose which package we would like.  The H10 radio button is selected by default, so clicking the Download button takes us to the download page.  We could download a CSV into our browser and save the result as an OpenOffice or Excel spreadsheet.  But for production purposes, following these links and downloading data through a web browser is a tedious chore. Notice the Direct Download for Automated Systems link?   We'll copy that link and use it for our demonstration. 

create or replace procedure simple_web_client as
    
  timeout number := 60;  -- timeout in seconds

  /* Below is the URL from the Direct Download for Automated Systems.  Just copy the link
     and paste the text.
  */
  v_url varchar2(1024) :=
  'http://www.federalreserve.gov/datadownload/Output.aspx?rel=H10&series=122e3bcb627e8e53f1bf72a1a09cfb81&lastObs=10&from=&to=&filetype=csv&label=include&layout=seriescolumn&type=package';
   
  v_line_count number := 0;
  v_line_in varchar2(1024);
  v_header varchar2(1024);

  /* UTL_HTTP.REQ is set by the BEGIN_REQUEST function.  REQ includes 
     the URL, method, and http_version.
  */
  req   UTL_HTTP.REQ;

  /* UTL_HTTP.RESP is set by the BEGIN_RESPONSE function.  RESP includes   
     the status, a brief message, and the http version.
  */
  resp  UTL_HTTP.RESP;
    
begin
    
  utl_http.set_transfer_timeout(timeout);         -- set the timeout before opening the request
  req := utl_http.begin_request(v_url);           -- Pass our URL, get a req object back
  
  utl_http.set_header(req, 'User-Agent', 'Mozilla/4.0');

  resp := utl_http.get_response(req);             -- Pass our req object, get a resp object back
  utl_http.read_line(resp, v_header, true);       -- Read line of resp into v_header

  dbms_output.put_line('CSV Header = ' || v_header);
  v_line_count := v_line_count + 1;

  loop

    utl_http.read_line(resp, v_line_in, true);    -- csv data
    v_line_count := v_line_count + 1;
    dbms_output.put_line('CSV Data = ' || v_line_in);

  end loop;

exception
  when utl_http.end_of_body then                  -- end of the data
      dbms_output.put_line('Lines read, including header = ' || v_line_count );
      utl_http.end_response(resp);                -- close the response object

  when others then
    dbms_output.put_line('Unhandled exception, lines = ' || v_line_count  );
    dbms_output.put_line(sqlerrm);
    dbms_output.put_line(dbms_utility.format_error_backtrace);
 
    -- Do not leave anything open.  Close everything and ignore errors.
    begin
       utl_http.end_response(resp);
       utl_http.end_request(req);
    exception
       when others then null;
    end;
    
end simple_web_client;
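To compile and run the procedure from SQL*Plus or SQL Developer, something like this is enough (a minimal sketch; serveroutput must be on to see the dbms_output lines):
/
set serveroutput on size unlimited
exec simple_web_client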

If we connect properly, our output will look something like this:
anonymous block completed
CSV Header = "Series Description","Nominal Broad Dollar Index ","Nominal Major Currencies Dollar Index ","Nominal Other Important Trading Partners Dollar Index "
CSV Data = "Unit:","Index:_1973_Mar_100","Index:_1973_Mar_100","Index:_1997_Jan_100"
CSV Data = "Multiplier:","1","1","1"
CSV Data = "Currency:","NA","NA","NA"
CSV Data = "Unique Identifier: ","H10/H10/JRXWTFB_N.B","H10/H10/JRXWTFN_N.B","H10/H10/JRXWTFO_N.B"
CSV Data = "Time Period","JRXWTFB_N.B","JRXWTFN_N.B","JRXWTFO_N.B"
CSV Data = 2013-10-14,ND,ND,ND
CSV Data = 2013-10-15,101.1463,75.7635,128.2686
CSV Data = 2013-10-16,101.0694,75.7839,128.0691
CSV Data = 2013-10-17,100.4167,74.9717,127.6647
CSV Data = 2013-10-18,100.3842,74.8882,127.7015
CSV Data = 2013-10-21,100.6396,75.0142,128.1114
CSV Data = 2013-10-22,100.3595,74.7093,127.8816
CSV Data = 2013-10-23,100.4550,74.8150,127.9575
CSV Data = 2013-10-24,100.5485,74.8527,128.1189
CSV Data = 2013-10-25,100.5134,74.9733,127.8807
Lines read, including header = 16

The Oracle utl_http package makes reading data from a web server as simple as reading data from a file. With a simple web client, we can read any data that's available from a web server. Our example used CSV-formatted data, but we could download XML data just as easily. And for a real hands-free operation, schedule your client using DBMS_SCHEDULER or DBMS_JOB.

There's just one caveat when using the utl_http functions: depending on your DBAs and your installation, you may not have access to use the network functions. If you get an error like
ORA-24247: network access denied by access control list (ACL) 
then it's time to visit the DBAs and request that they add your schema or your user account to the network ACL.
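The exact fix belongs to the DBAs, but on 11g it usually amounts to creating an ACL and assigning it to the host, roughly like the sketch below; the ACL file name, schema, and host here are examples only:

begin
  dbms_network_acl_admin.create_acl(
     acl         => 'web_client_acl.xml',
     description => 'Allow outbound HTTP for the web client',
     principal   => 'MY_SCHEMA',          -- schema that runs simple_web_client
     is_grant    => true,
     privilege   => 'connect');
  dbms_network_acl_admin.assign_acl(
     acl  => 'web_client_acl.xml',
     host => 'www.federalreserve.gov');
  commit;
end;
/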

Monday, November 25, 2013

Favicons

Many years ago, I posted pictures of our bulldog Petunia and her daughter Cactus on the web. I hosted the site on my Linux PC running Apache, PHP, and MySQL. One of the things I did for this site was create a favicon using a head shot of Petunia. It was pretty cute.

Fast-forward ten years, and it's time to create a favicon again. In the meantime, I've forgotten where to put the favicon in the HTML code; in fact, I've forgotten what it is called. A quick search on the Wayback Machine wasn't much help; the venerable internet archive didn't capture a lot of the database-generated content of my old website.

After a bit of googling and poking through code on other people's web pages, I'm back in business. So, to start: it's called a favicon, short for "favorite icon". It's the little image file that appears to the left of the page's title in a browser tab, or to the left of the page's title in the browser's bookmarks.  Look for the "LL" to the left of "Lost Learnings" in the title to see this blog's favicon.

Next, where do we put it? The HTML header holds the favicon location. The image should be called favicon.ico, and depending on where your images are located, the code will look something like this, assuming our images are stored in the images directory:
<head>
  <!-- Other head tags -->
  <link href="images/favicon.ico" type="image/ico" rel="shortcut icon"  />
</head>
Where do we get a favicon? If we're working on a corporate website, there's usually a corporate favicon that we use. Browsing the HTML header of our corporate pages with Firebug or  Chrome's developer tools will quickly reveal the location of the corporate favicon.

If we're working on our personal website, then we can use an image editing program like GIMP to create a favicon. GIMP is a very powerful image editor with lots of features. If you use GIMP to create a favicon, there are a couple of things to keep in mind. First, the favicon.ico file should be small and square. I have found that 32x32 pixels works well; it's small, but has enough detail to avoid ragged edges. Second, before we save our image, we need to flatten it so that the image does not have multiple layers. Finally, export the file as favicon.ico. This is the required name, and GIMP will convert the file during the export process.

I would like to add a favicon to my pages on Oracle's APEX demo site, so let's see how we would do this in APEX. First, we need to upload the favicon.ico image file:
  • From Shared Components, click on Images link under Files
  • Next, click on the Create button
  • Select No Application Associated to make the image available to all applications within the workspace, or select a specific application ID.
  • Use the Browse button to find the favicon.ico on your computer
  • Click the Upload button
Now that the favicon is loaded into the Apex Shared Components, we're ready to use it:  Navigate to Edit Page, and click the Edit icon in the Page region of Page Rendering. In the HTML Header section, add the first line if you associated the favicon with an application id in the upload step. Add the second line if you did not associate the favicon with an application:
<link href="#APP_IMAGES#favicon.ico" rel="shortcut icon" type="image/ico" /> 

<link href="#WORKSPACE_IMAGES#favicon.ico" rel="shortcut icon" type="image/ico" /> 
Finally, let's add the favicon to our blog on blogspot.com. From the blog's main page, click on the Layout link. In the upper left corner of the layout, find the region labeled Favicon. Click the Edit link, upload the favicon.ico file using the Browse button, then click the Save button.

Now we're ready to view our pages with their favicons. If the favicon does not appear, clear the browser's cache or use shift-refresh to see the favicon.

Sunday, November 17, 2013

Commitment: Commits in APEX

Commitment is such a loaded word these days. Maybe we humans can't quite get it right, but certainly we can count on our computers to get it right. After all, they're binary machines, 0 or 1, false or true: absolute logic. Or can we?

While debugging a PL/SQL block in APEX, I noticed that the code was marking the input data as processed, even if the procedure terminated with a raise_application_error.  I was focused on debugging the error causing the raise_application_error, so I didn't care that the input was marked as processed.  After fixing the problem, then it occurred to me -- wait, a minute -- why was the input marked as processed when the transaction terminated in error?

When we do transaction processing, we expect the transaction to be atomic.  Either all parts of the transaction go in, or none of them do.  Imagine going to the ATM, moving $50 from savings to checking, and the move fails in the middle of the financial transaction.   That would leave the savings debited $50, nothing credited to checking, and $50 floating in the bit-ether.  If we do transaction processing correctly, then either the money is successfully moved, or the money is left in savings. Nothing is left half-done.

Back to Apex.  I clearly did not have a correctly-working transaction.    There was nothing in my procedure that would do a commit.  But, somewhere, somehow, something was committing the database changes, even though I didn't expect it to.

There's a lot going on here.  Partly it's semantics and syntax.  My APEX code was not doing anything I hadn't done in COBOL/DB2 with host variables, and the COBOL/DB2 transactions always worked correctly.  Partly, it's an understanding, or misunderstanding of what's happening in Apex.   Let's start with syntax and semantics first.

Imagine we have COBOL/DB2 program and some PL/SQL in an Apex block. Both code snippets will do the same thing. Here's the COBOL:
   EXEC SQL
      INSERT INTO MYTABLE(MY_COLUMN) VALUES(1)
   END-EXEC.

   EXEC SQL
      SELECT IBMREQD
      INTO :WS-DUMMY
      FROM SYSIBM.SYSDUMMY1
   END-EXEC.

   EXEC SQL
      ROLLBACK
   END-EXEC.
Now let's look at the PL/SQL code in APEX. Each of these anonymous blocks runs in its own APEX process on the same page after the page is submitted.
begin

   insert into mytable (my_column) values(1);

end;

begin 

   select dummy
   into :p1_dummy
   from dual;

end;

begin

   rollback;

end;
Well, that's pretty simple. Except for wrapping the PL/SQL inside an anonymous block and wrapping the COBOL up for the DB2 pre-processor, these code snippets should do the same thing. But they don't. Run each piece of code several times. The COBOL example will never add a row to MYTABLE, while the Oracle APEX page adds a row to MYTABLE. Notice I said "adds a row"; more on that later.

This isn't an Oracle/DB2 issue.  Log in to an Oracle SQLPLUS session and try the PL/SQL code snippets in SQLPLUS.  The SQLPLUS session yields the same results as the COBOL test.

Syntax and semantics: in an ideal world, there's a tight correlation between syntax and semantics.  We would like things that look the same to behave the same. Our variables, :WS-DUMMY and :p1_dummy, look like two ordinary "host variables". An astute reader of APEX documentation will call :p1_dummy a "session variable", and that's our first clue. Session variables may look like host variables (the syntax is the same), but session variables do not behave like host variables (the semantics differ). Mostly they do behave the same, with an important exception: changing a session variable causes APEX to take a commit.

 Back to "adds a row":  only one row is added.  If we run our code snippet in Apex a few times, and examine the table, we find there is one row.  Why aren't there a few rows?  After all, didn't we select data into :p1_dummy each time?  That's the second point:  changing a session variable causes APEX to take a commit. If we move the same value into a session variable, if the value does not change, then APEX does not take a commit.   This means that our code will behave differently depending on the data!

For gremlins, this is a good thing.  For developers, this is not so good.  Here are a few tips to avoiding obscure errors caused by buggy transaction processing:
  1. During development, turn on DEBUG.  Examine the debug logs, looking for text like Session State: Saved Item "P1_DUMMY" New Value="".  Anytime we see this, APEX has taken a commit.
  2. Use local variables declared in the anonymous block rather than session variables. Use session variables to store values that will be displayed on the web page, or to save values that you need between page submits. Otherwise, use local variables.
  3. If you pass session variables as IN OUT arguments to an external procedure, changing the parameter values in the procedure is the same as changing the session variables.
  4. APEX only performs commits after a process is run. If you find an error, roll back the transaction and set an error before your process exits (see the sketch after this list).  This is true for both anonymous blocks in APEX and external procedures called by APEX.
  5. If you need a commit, explicitly code the commit. Do not assume that APEX will take a commit just because the procedure set a session variable.  If the value of the session variable didn't change, APEX won't take a commit.
  6. Be aware of how APEX processes tabular forms (more on that below).
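Tips 4 and 5 in practice look something like the following sketch; the table and item names are made up. The process does its own DML, commits explicitly on success, and rolls back and re-raises on failure so that later processes never see a half-finished transaction.

begin
   insert into order_log (order_id, status) values (:p1_order_id, 'NEW');
   update orders set status = 'QUEUED' where order_id = :p1_order_id;
   commit;       -- explicit commit; do not count on a session-state save to commit for us
exception
   when others then
      rollback;  -- undo everything this process did
      raise;     -- surface the error so APEX stops and later processes do not run
end;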
Searching the APEX documentation is rather discouraging. Searching for keywords like COMMIT and TRANSACTION does not turn up any applicable information. Searching the Internet was more fruitful.  Daniel McGhan blogged this topic in August, 2012, and he notes seven situations where APEX takes a commit.


The Oracle forums are another good source of information.  According to an Oracle employee responding to a question in the Oracle forums:


Commits are issued when your code does so explicitly, when a page show or accept request reaches the end, or when anything within your code causes session state to be saved. Session state-altering actions include PL/SQL assignments to (or "select into") bind variables which are page- or application-item references, calls to session-state updating APIs, and page or application computations.


Our pages often include some simple items (text boxes, select lists, date pickers, etc.) plus a tabular form for repeating elements.  If we're going to process the page properly as one transaction, we need to know if there are any implicit commits in the APEX Page Processing post-submit processes.   If we create a tabular region, we find that APEX adds two post-submit processes: ApplyMRU and ApplyMRD.
Ideally, we should be able to add our own post-submit process and have all of the post-submit processes run as one transaction. Either everything is committed, or none of it is committed. We do not want half a transaction.

In the absence of good documentation to guide us, let's do some testing. We'll test in APEX 4.2; the results may differ in other versions. We can test for implicit commits by building a one-page application with a tabular region on the DEMO_CUSTOMER table. For our first test, let's try some inserts and updates. Our three post-submit processes will be:

  1. ApplyMRU
  2. Rollback
  3. ApplyMRD

Click the Submit button, wait for the page to redisplay, and the envelope, please:  any updates or added rows are committed.  The rollback has no effect. This is not the behavior we expect; we expect a rollback to undo any inserts or updates from ApplyMRU.

For our second test, let's try some deletes. We'll move our rollback step after the ApplyMRD. Our three post-submit processes will look like this:

  1. ApplyMRU
  2. ApplyMRD
  3. Rollback

Check a few delete boxes, click the Delete button, confirm that we want to delete the rows, wait for the page to refresh, and the envelope, please:  APEX displays a message saying the rows are deleted, but in fact, the deleted rows are still in the table. The rollback worked as we expected.

There are two important points: First, ApplyMRU takes an implicit commit. Second, ApplyMRU and ApplyMRD do not behave the same way! ApplyMRU and ApplyMRD are not executed at the same time; their processing is driven by either the Submit button or the Delete button, so we don't need to worry about running them together if the tabular form is the only updatable element on the page.

But we do need to be careful if we run other processes after the Submit button is clicked. If ApplyMRU runs first and commits changes, and if a second process runs and fails, then we have a half-completed transaction. Clearly, that's a bad thing. So, to the list of suggestions above, let's add two more:
  1. Order post-submit processes so that ApplyMRU and ApplyMRD run after any other process. If an earlier process raises an error, ApplyMRU and ApplyMRD will not run, and there will not be a half-completed transaction.
  2. In some cases, it will make more sense to split a page into two (or more) pages and save all the database processing until the last page. APEX can save session variables on the earlier pages, and the session variables are available on the last page when all the database updates occur.
APEX is a powerful development environment. However, if we want our transactions to process properly, and if we don't want to spend time tracking down obscure bugs, then we need to be aware of when APEX takes implicit commits.



Tuesday, November 12, 2013

Quirks Mode

"The good thing about standards is that there are so many to choose from."  I had a good laugh when I read that years ago. Andrew Tanenbaum, in his book Computer Networks, is quite correct.  It was true of computer networks twenty years ago (remember SNA, DECNet, Arpanet, and BITNET, just to name a few), and it's true of web standards and web browsers today.

After laying out a page with <div> tags and testing the page with Internet Explorer, Chrome, Safari, and Firefox, one tester reported the page was not displaying properly.  A bit of investigation revealed that the tester was using Internet Explorer 9.  As of this writing, IE9 is only one release behind the current IE10, so we were puzzled why IE9 did not render the page as expected.   Further investigation revealed that IE9 behaves differently when viewing intranet and internet pages.  For pages served within our corporate firewall, IE9 runs in "quirks mode" and behaves like IE 5!  "The good thing about standards..."  For pages served from the external network, IE9 behaves like a modern browser.  Google "IE quirks mode", and you will find lots of discussion regarding IE in quirks mode and standards mode.

We are using APEX to deliver the problem page, so we'll examine one solution to handling this problem in APEX.  Our page just displays information with some links; we are not collecting any information or interacting with the end user.  This makes things much simpler. Our solution will consist of three parts:
  • we will detect the browser and save the browser in an Apex item
  • we will use two APEX regions, one for newer browsers and a second for older browsers
  • Each region will have a Condition, and we will display the region (or not) depending on the browser
I discussed browser detection in an earlier post, so I grabbed some code from the demo page. After adding a hidden item named "P1_BROWSER" to the first region, add this PL/SQL anonymous block to a "Before Header" process on the APEX page:
declare

   v_browser_string varchar2(512) := null;

begin

   v_browser_string := upper(owa_util.get_cgi_env('HTTP_USER_AGENT') ) ;

   if instr(v_browser_string,'MOZILLA/4.0') > 0       -- older browser
         or instr(v_browser_string,'MSIE 9') > 0 then -- IE9
      :p1_browser := 'OLD';
   else
      :p1_browser := 'NEW';
   end if;
    

end;

Unfortunately, when we have an IE9 browser, we can't tell what document mode the browser is in.  We could examine the client's IP address, too, but now we're introducing network variables into our code. And the end-user can switch document modes, too.   So, we'll just treat IE9 as an old browser.

Our modern browser region was laid out in three columns with <div> tags.  For older browsers, we can lay out the page with <table> tags:
<table width=100%>
  <tbody>
   <tr>
      <td width=20%>
         <!-- Column 1 content -->
      </td>
      <td width=40%>
         <!-- Column 2 content -->
      </td>
      <td width=40%>
         <!-- Column 3 content -->
      </td>
   </tr>
   </tbody>
</table>      
Next, we need to add a Condition to the modern browser region and the old browser region. The condition should be "PL/SQL Function Body Returning a Boolean", and the code should look like the first line for the modern browser region or the second line for the old browser region.
return :p1_browser = 'NEW';

return :p1_browser <> 'NEW';
Now when a browser requests our page, we detect the browser and display the region that is appropriate for that browser. Modern browsers get the <div> layout, older browsers get the <table> layout, and our end-users see the same layout.

Finally, a rant and a plug. When I read Tanenbaum's Computer Networks, I was really impressed by how well written the book is.  Computer Networks was a pleasure to read; I cannot say that about many technical books.  Why shouldn't a technical book read as well as a good novel?  Sure, we're all geeks, and we love to learn; but I could read Tanenbaum's book just for the fun of reading it.

Tuesday, November 5, 2013

Retirement

Cloud computing is the rage these days. As we migrate old systems to the cloud, we need to retire the old system that we left behind. This is usually a several-step process: we start with the easiest step to implement and undo, and we finish with the most difficult step to undo.

Assuming that the retired application has its own schema, the easiest first step is to simply revoke all the privileges on the underlying database schema. We leave the data and the programs in place, and we simply deny all users other than the owner access to the database objects. If we're retiring a large system, there will be hundreds of grants to revoke. We would like an easy way to find all the grants and revoke them. Also, we would like an easy way to undo these changes!

I work in an Oracle shop, so we'll use the Oracle system catalog as an example. Using two views in the Oracle catalog, we can find all the information we need to revoke every grant on the retired schema's objects.  The USER_TAB_PRIVS view shows all the objects where the schema is the object owner, grantor, or grantee.  We want to revoke every grant where the schema is the grantor and the owner of the object.   From USER_TAB_PRIVS we get the object name and the privilege.  The USER_OBJECTS view shows us all the objects that belong to the schema.  From USER_OBJECTS we get the object type, and we use the object type to skip PACKAGE BODYs and TYPE BODYs, because the grants belong to the PACKAGE or TYPE, not the PACKAGE BODY or TYPE BODY.

Here's our script.  We open a cursor on the join of the two views, we concatenate a string to revoke the privilege, and then we execute the string.   The SPOOL command writes the results to an output file. SQL Developer does not support the spool command, so log in to the retired schema and run this script using sqlplus.   If you're a bit nervous about running the script under the wrong user or schema, change the USER function to the quoted name of the schema instead.
set serveroutput on
spool 'Revoked_Grants.txt' 
begin

    for c in (

        select 'REVOKE ' || a.privilege ||
                  ' ON ' || a.table_name ||
                  ' FROM ' || a.grantee as revoke_command
        from user_tab_privs a
        inner join user_objects b
          on a.table_name = b.object_name
        where b.object_type not in ('PACKAGE BODY', 'TYPE BODY')
          and a.owner  =  user
          and a.grantor = user) loop
    
        dbms_output.put_line(c.revoke_command);

        begin
           execute immediate c.revoke_command;
        exception
           when others then dbms_output.put_line(' *** ' || sqlerrm);
        end;
   
    end loop;
   
end;
/
spool off
When we run this script, the spool command writes a file listing every grant that we revoked. In addition to logging the results for audit purposes, we can use this file to reverse the changes. Simply edit the file, change the REVOKEs to GRANTs, change the FROMs to TOs, save the file, and process it through  sqlplus. Now we have an easy way to revoke all the privileges and an easy way to restore them if necessary.
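If we would rather capture the undo script up front than edit the spool file afterwards, the same join (run before the revokes, while the grants still exist) can spool ready-to-run GRANT statements; the file name and formatting settings here are just one way to do it:
set heading off
set feedback off
set pagesize 0
spool 'Restore_Grants.sql'
select 'GRANT ' || a.privilege ||
       ' ON '   || a.table_name ||
       ' TO '   || a.grantee || ';'
from user_tab_privs a
inner join user_objects b
  on a.table_name = b.object_name
where b.object_type not in ('PACKAGE BODY', 'TYPE BODY')
  and a.owner   = user
  and a.grantor = user;
spool off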


Tuesday, October 29, 2013

How to make license plates

Database sequences

So, you're going to the big house. You're going up the river. You're doing time.  You're going to be making license plates, and you want to impress the warden? [1]

Back in the late 1970's, the license plate on my car read "CAR 999". I always liked that my car's plate read C-A-R.  Creating the numbers from 001 through 999 is pretty easy, but creating the letters from AAA through ZZZ is a little trickier.  In this post, we'll discuss how to create unique identification numbers, and we'll virtually re-create the late-1970's State of Maryland license plates with three letters followed by three numbers.

The license plate problem is one example of generating unique identification numbers.  We go to school, we get an id.  We go to work, we get an id.  We have an id for the computer, we have an id on our driver's license, we have a Social Security number, and our insurance policies have id numbers.  Sometimes the ids are just numbers, but more often the id is a mix of letters and numbers.   Using letters instead of or in addition to numbers greatly expands the number of ids.  On a car's license plate, the letters and numbers take the same amount of space.  But, the letters from AAA to ZZZ give us 17,576 possible combinations, while the numbers from 000 to 999 give us only 1000 combinations. Using only numbers, we quickly run out of identification numbers.

When we create ids, we can not reuse a given sequence of letters and numbers. We need to know what we used previously.  There are a few ways of tackling this problem. In our license plate example, we could pre-generate all the possible plate numbers, store them in a table, and mark them "in-use" as we use them. Or, we could store the last plate in a table, generate the next plate based on the last plate, and update the table with the latest plate. But, the easiest solution is to use database sequences.   A database sequence is an object defined in the database that automatically increments each time we consume a number.  We don't have to pre-generate plate numbers, we don't have to update tables or keep track of anything. The database does the work for us. And that's a good thing.

We'll start by creating two sequences, one for the letters and one for the numbers. More about why the letters start with 676 and finish with 17575 later:

create sequence numbers_seq 
   start with 1 maxvalue 999 cycle nocache;

create sequence letters_seq 
   start with 676 maxvalue 17575 nocache cycle;

We can use numbers_seq directly, but we must convert letters_seq to three characters.  If we think about the letters of the alphabet as a base 26 counting system, then we need to convert base 10 digits to base 26 digits.  We learned to do this in Computer Science 101, when we converted base 10 numbers to base 16 numbers and back again.  The solution is the same, only the base is different. This function accepts a base 10 number and returns a three-character base 26 representation:

create or replace function seq2letters(p_seq in number)
    return varchar2 is
    
  v_letters varchar2(3) default null;
  v_n integer := p_seq;
  v_r integer;
  digits varchar2(26) := 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
  base integer  := length(digits);
begin
   while ( v_n > 0) loop
      v_r := mod(v_n,base);
      v_letters := substr(digits,v_r + 1, 1) || v_letters;
      v_n := floor( v_n / base );
      end loop;
   return v_letters;
end seq2letters;

For debugging purposes, we'll create a function to do the opposite conversion, too.  This is useful when we decide on the start and maxvalue values for letters_seq, because it tells us what the base 10 numbers are for BAA and ZZZ.

create or replace function letters2num(  p_letters in varchar2)
    return number
is
  v_length integer := length(p_letters);
  v_num number := 0;
  v_i integer;
  digits varchar2(26) := 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
  base integer  := length(digits);
  v_digit varchar2(1);  
  v_digit_value integer;
  v_power integer := 0;
begin
  for v_i in reverse 1 .. v_length loop
    v_digit := substr(p_letters,v_i,1);
    v_digit_value := instr(digits, v_digit);
    v_num := v_num + (v_digit_value - 1) * base ** v_power;
    v_power := v_power + 1;
  end loop;
  return v_num;
end letters2num;

Let's test our two new functions.  We want to find the base 10 equivalents of BAA and ZZZ using letters2num.   Then, we'll test seq2letters to make sure we get our starting letters back.

select letters2num('BAA') as "START", letters2num('ZZZ') as "FINISH"
from dual;

START      FINISH
-----      ------
  676       17575

select seq2letters(676) first_plate, seq2letters(17575) last_plate
from dual;

FIRST_PLATE     LAST_PLATE
-----------     ----------
BAA             ZZZ


Perfect! We have the tools to translate numbers between base 10 and base 26.

Next, we'll create two functions that return the next letters and the next number.  The numbers function simply returns the next value from numbers_seq.   The letters function is a little more complicated. Whenever the plate number is 1, we need to get the next letters.  If the plate number is not 1, then we reuse the current letters.  In either case, the letters function uses seq2letters to convert the sequence value to three letters.  Here's the code:

create or replace function get_numbers return number is
    
begin
    
    return numbers_seq.nextval;

end get_numbers;

create or replace function get_letters ( p_num number ) return varchar is
    
    v_letters_10 number;
    v_letters_26 varchar2(3);
begin
    
    if p_num = 1 then  
       
        v_letters_10 := letters_seq.nextval;
       
    else

       select last_number - 1
       into v_letters_10
       from user_sequences
       where sequence_name = 'LETTERS_SEQ';


    end if;
        
    v_letters_26 := seq2letters(v_letters_10);
        
    return v_letters_26;
       
end get_letters;
 


Now we have all the code we need to generate a license plate.  First we call the numbers function, then we call the letters function.  In this demo, hosted on Oracle's APEX demonstration website, we'll create a 1970's Maryland license plate. And what does a vintage late-1970s Maryland plate look like?  Check eBay: it's more than an auction site, it's a museum as well!  The old plates were pretty simple: large red letters and numbers with a red border around the whole plate.  Try out the demonstration; every click of the Make a Plate button makes another license plate. Try running the demonstration in two or more windows; you'll never make the same plate twice.  Well, at least not until you get to plate ZZZ 999 and the sequences roll over.
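Putting the two functions together, one call to each gives us a plate; this is a sketch built on the sequences and functions above, and the lpad and spacing are just formatting choices:

declare
   v_number number;
   v_plate  varchar2(10);
begin
   v_number := get_numbers;                      -- next value from numbers_seq, 1 .. 999
   v_plate  := get_letters(v_number) || ' ' ||
               lpad(to_char(v_number), 3, '0');  -- e.g. 'BAA 001'
   dbms_output.put_line('New plate: ' || v_plate);
end;
/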


[1] American euphemisms for going to prison.


Tuesday, October 22, 2013

Recursive SQL

Our shop recently upgraded to Oracle 11g Release 2.  Having used DB2 for many years, I noticed two new 11gR2 features right away.   First, Oracle 11gR2 supports aliasing the columns of common table expressions.  Second, Oracle supports SQL-99 syntax for recursion using common table expressions.

Let's start with some data. In the Oracle editor of your choice, run the following script. The script creates a few objects, and the dependency information they generate in the system catalog gives us the data we'll use to solve a practical problem.
create table t1 ( v1 varchar2(1));

create view v1 as select * from t1;

create view v2 as select * from v1;

create view v3 as select * from v2;

create view v4 as select * from v1; 
The script creates five new objects in the system catalog. The DBA_DEPENDENCIES view shows the dependency of one object on another. In our example, view V1 depends on table T1. View V2 depends on view V1. View V3 depends on view V2.  View V4 depends on view V1. Every object, whether it is a table, view, package, or object of any type, has a row in DBA_DEPENDENCIES if the object depends on another object, or if the object has other objects depending on it.

The USER_DEPENDENCIES view is a subset of DBA_DEPENDENCIES; DBA_DEPENDENCIES may not be available to everyone, so these examples will use USER_DEPENDENCIES. The two columns of interest to us are the NAME and REFERENCED_NAME columns. A row in this view means that object NAME depends on object REFERENCED_NAME. In the following query, we see every dependency that we described in the previous paragraph.

select name, referenced_name
from user_dependencies
where referenced_name in ('V1','T1','V2','V3') 
order by 1;

NAME    REFERENCED_NAME
-----   ---------------
V1      T1
V2      V1
V3      V2
V4      V1
The USER_DEPENDENCIES view shows us all the direct dependents of an object. The USER_DEPENDENCIES view shows that view V4 depends on view V1.  USER_DEPENDENCIES does not show that view V4 indirectly depends on table T1.  If you don't believe that, drop table T1 and then run a query against view V4.  Not so good!

A shop with many tables, views, packages, procedures, functions, and triggers, will have an extensive hierarchy of dependencies. So, while this view shows the direct dependents, it does not show all the dependents. Does view V4 depend on table T1? Before we drop table T1, we need to know the answer to that question!

This problem is easily solved with recursive SQL queries. We'll take a look at two ways of handling this. First, we'll examine how to answer this question using recursion with common table expressions, à la DB2 or Oracle 11gR2. Here's the query:
with recurs (name, rname, lvl ) as (
   select name, referenced_name, 1
   from user_dependencies
   where referenced_name = 'T1'

   union all

   select a.name, a.referenced_name, r.lvl + 1
   from user_dependencies a,
        recurs  r
   where r.name = a.referenced_name
)
select name, rname, lvl
from recurs; 

NAME    RNAME   LVL
----    -----   ---
V1      T1      1
V2      V1      2
V4      V1      2
V3      V2      3
The first new feature in 11gR2 is aliasing the common table expression columns. We have declared RECURS as a common table expression with three columns: NAME, RNAME, and LVL. Oracle 11gR1 supported common table expressions, but 11gR1 did not support aliasing the column names.

The next new feature is the recursive query. We have two sub-queries unioned together. The first query is our starting point: REFERENCED_NAME = 'T1'. The second query is the recursion. We select rows from a join of the USER_DEPENDENCIES view and the RECURS common table expression. The rows are linked by joining the REFERENCED_NAME of USER_DEPENDENCIES to the NAME of RECURS. In other words, we join the child row from USER_DEPENDENCIES to the parent row from RECURS. The LVL column shows the depth of the hierarchy.

DB2 LUW has supported common table expressions and recursion for at least 10 years. Oracle's pre-11gR2 databases supported recursive queries in a limited way using the START WITH ... CONNECT BY PRIOR syntax.   Run the following query, and we get the same results, although not in the same order. LEVEL is an Oracle pre-defined column showing the depth of the hierarchy. In the previous query, we found this by keeping track of the level (LVL) ourselves.


select name, referenced_name, level
from user_dependencies
start with referenced_name = 'T1'
connect by prior name = referenced_name;

NAME    REFERENCED_NAME  LEVEL
----    ---------------  -----
V1      T1               1
V2      V1               2
V3      V2               3
V4      V1               2
Like Oracle's outer join operator "(+)" from the last post, START WITH ... CONNECT BY PRIOR is very common in Oracle shops, so it's important to understand it. Also, it's very concise, always a benefit to those among us who have ten thumbs on the keyboard.

Recursive queries are another good tool to have in our toolbox.  Trying to sort out the object dependencies is just one use.  Another common problem is sorting out the system privileges.  Many applications define their own authorization schemes, granting access to roles or groups, then including those roles/groups in other roles/groups.  Recursive queries can answer the question "what users have authorization to which tables", and these queries can show the underlying hierarchy.
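For example, assuming an application role named MY_APP_ROLE (a made-up name), a recursive query over DBA_ROLE_PRIVS walks the role hierarchy down to the users who ultimately hold the role:

with role_tree (grantee, granted_role, lvl) as (
   select grantee, granted_role, 1
   from dba_role_privs
   where granted_role = 'MY_APP_ROLE'

   union all

   select p.grantee, p.granted_role, r.lvl + 1
   from dba_role_privs p,
        role_tree r
   where p.granted_role = r.grantee
)
select grantee, granted_role, lvl
from role_tree;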

But before we go, for our example query to be really useful, we need to account for the rest of the columns in DBA_DEPENDENCIES. In addition to the object's name, DBA_DEPENDENCIES includes the object's owner and the object's type. If we don't account for each object's type and owner, we quickly end up with a recursive loop, especially if our objects include package specs and package bodies. Here's the improved query:
with recurs (name, owner, type, rname, rowner, rtype,  lvl ) as (
   select name, owner, type, referenced_name, referenced_owner, referenced_type, 1
   from dba_dependencies
   where referenced_name = 'T1'

   union all

   select a.name, a.owner, a.type, a.referenced_name, a.referenced_owner, a.referenced_type, r.lvl + 1
   from dba_dependencies a,
        recurs  r
   where r.name = a.referenced_name
     and r.owner = a.referenced_owner
     and r.type = a.referenced_type 
) cycle name, owner, type set cycle_detected to '1' default '0'
   
select * 
from recurs; 
Now when we join a parent row to a child row, we're joining on all the appropriate columns, and a recursive loop is less likely. Another improvement in this query is the use of the CYCLE keyword. CYCLE instructs the database to detect when our query has previously visited the row defined by name, owner, and type. When the database detects this condition -- a recursive loop -- the database sets the CYCLE_DETECTED column to 1 and does not descend further. Now we have a really useful query.

More Reading

For more information, the vendors' publications have some good examples:

Also, see my blog entry discussing recursion to implement aggregation in SQL Server.

Monday, October 14, 2013

We gather together to join these two tables...

Joining tables using SQL is a pretty straightforward operation.  Usually we join tables where the values in one column equal the values in a second column, and the join returns only those rows where the column values in both tables match.  Let's start with some data:
EMPLOYEES TABLE
---------------

ID      NAME               DEPT_ID
--      ----------------   -------
1       Fred Flintstone    3
2       Barney Rubble      3
3       Wilma Flintstone   -
4       Betty Rubble       -
5       Mister Slate       1

DEPARTMENT TABLE
----------------

ID      NAME
--      ------------
2       Sales
1       Front Office
3       Gravel Pit
Here's a simple query with the results.  The query answers the question "display all the employees with their departments".
select d.name dept_name,
       e.name employee
from department d,
     employees e
where d.id = e.dept_id;

DEPT_NAME       EMPLOYEE
------------    ----------------
Gravel Pit      Fred Flintstone
Gravel Pit      Barney Rubble
Front Office    Mister Slate
Notice that not every row in our source tables appears in the query results. Wilma and Betty are not employees, and the Sales department has no employees. Depending on our application, this might be what we want, but in some cases, we want to see the rows where we didn't get matches. For example, if the question is "display all departments and their employees", then we would expect to see the Sales department without any employees.

For many programmers, the natural response is to write a program in a high-level language using two loops.  The outer loop opens a cursor on the department table, fetches data, and for each row fetched starts an inner loop.  The inner loop opens a cursor on the employee table using the department id as a predicate.  In SQL parlance, we have constructed an OUTER JOIN.
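In PL/SQL, that nested-loop approach looks something like this sketch, using the sample tables above; notice that a department with no employees needs extra bookkeeping that the loops do not give us for free:

begin
   for d in (select id, name from department) loop
      for e in (select name from employees where dept_id = d.id) loop
         dbms_output.put_line(d.name || ' : ' || e.name);
      end loop;
      -- a department with no employees prints nothing here; we would need a
      -- flag or a count to report it, which is exactly the work an outer join
      -- does for us
   end loop;
end;
/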

Proponents of the "solve the problem with SQL" philosophy are surely gnashing their teeth at this solution.  There are several ways of using SQL to write this query.   We'll take a look at the most common way of handling it now, and we'll step into the way-back machine to take a look at two older ways of solving this problem.  I don't advocate using the older methods, but if you work in a shop that adopted DB2 or Oracle early, you will see them and need to understand them.

The easiest and best method is to use SQL's OUTER JOIN keywords. In this example, we'll use a LEFT OUTER JOIN. A left outer join will return the rows from the table on the left side of the join, even if those rows don't satisfy the join condition. This query shows all departments and their employees:
select d.name dept_name,
       e.name employee
from department d 
left outer join employees e
  on d.id = e.dept_id;

DEPT_NAME       EMPLOYEE
-------------   ---------------
Gravel Pit      Fred Flintstone
Gravel Pit      Barney Rubble
Front Office    Mister Slate
Sales           - 
Now the report shows the Sales department, even though the Sales department has no employees.

The early SQL databases that I worked with, SQL/DS and early versions of DB2, did not support the OUTER JOIN syntax.  To construct a LEFT OUTER JOIN, we had to construct the inner join and UNION those results with the missing rows from the left-hand table.  In the second part of the query, notice the correlated sub-query between the department and employees tables. This query yields the same results as the last query:
select d.name dept_name,
       e.name employee
from department d,
     employees e
where d.id = e.dept_id 

union all 

select d.name dept_name,
       null
from department d
where not exists ( select 1
                   from employees e
                   where d.id = e.dept_id 
                );

DEPT_NAME       EMPLOYEE
-------------   ---------------
Gravel Pit      Fred Flintstone
Gravel Pit      Barney Rubble
Front Office    Mister Slate
Sales           - 
That's a lot of code, and not everyone is comfortable with the correlated sub-query, but it is preferable to coding loops and constructing joins in a high-level language.

Oracle used the "(+)", the outer join operator, to construct an outer join without needing to union the missing rows.   Oracle has deprecated this syntax and recommends that programmers use the OUTER JOIN keywords construction instead.   The deprecated syntax seems a bit clumsy until you get accustomed to it, and if you're working in a shop that has a long history with Oracle, you will see a lot of it. 

Here's the query using Oracle's "(+)" outer join operator. Notice that the "(+)" is applied to the columns of the employee table if we want all the rows from the department table. The syntax and semantics don't correlate well; it is certainly one reason why Oracle deprecated this construction.
select d.name dept_name,
       e.name employee
from department d,
     employees e
where d.id = e.dept_id(+);
 
DEPT_NAME       EMPLOYEE
-------------   ---------------
Gravel Pit      Fred Flintstone
Gravel Pit      Barney Rubble
Front Office    Mister Slate
Sales           - 
We do get the correct results, though, and programmers must have seen this as a big advantage over union-ing two queries together as in our example above.

The left outer join is just the start; we can also construct right outer joins and full outer joins using any of these methods. The preferred way is to use SQL's RIGHT OUTER JOIN or FULL OUTER JOIN keywords for any new query. But if you find yourself maintaining old queries, it's a good idea to understand the other methods of constructing outer joins, too.
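For instance, a full outer join over the same sample tables would list the empty Sales department and the department-less Wilma and Betty in one result (a quick sketch):

select d.name dept_name,
       e.name employee
from department d
full outer join employees e
  on d.id = e.dept_id;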


Sunday, October 6, 2013

Rolling your own locks: Everything old is new again

No matter what language we program in, what operating system we use, or what database we use, we frequently face the same problems.  Yesterday's CICS/COBOL programmers and today's web developers both face a common problem: how to handle database locking.  Fortunately, there is a solution common to both environments.  We will use an old CICS strategy to solve an interactive web site problem.

The pseudo-conversational nature of CICS has bedeviled many a COBOL programmer.  Processing transactions in a batch program is fairly simple:
  1. Declare a cursor, including a for update of column_name1, column_name2 clause
  2. Open the cursor
  3. Fetch rows
  4. Update where current of cursor
  5. Close cursor
  6. Commit the updates
The for update of clause instructs the database to take locks; these locks are released at the commit step or when the program terminates. The locking mechanism and database transactions ensure that we process data updates while maintaining data integrity.   A loop like this can process tens of thousands of records in just a few minutes.
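In PL/SQL, the batch pattern looks roughly like this sketch; the ACCOUNT table (introduced below) and the interest calculation are only for illustration:

declare
   cursor c_acct is
      select id, savings
      from account
      for update of savings;        -- rows are locked when the cursor is opened
begin
   for r in c_acct loop
      update account
      set savings = savings * 1.01  -- e.g. post monthly interest
      where current of c_acct;
   end loop;
   commit;                          -- releases the locks
end;
/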

But in the online world of CICS, the program terminates every time a screen is displayed.  Any locks taken are released when the program terminates.  So, how do you write an online CICS program that maintains data integrity?

Web programmers have the same problem when programming interactive web pages.  In the CICS world, we called it pseudo-conversational.  In the HTML world, we call it stateless.  It's really the same problem: in both environments, after displaying the page or the screen, we do not have a way to hold a lock.  And in fact, we do not want to hold a lock.  Remember the last time you went online and made a purchase?  It took several minutes to complete that purchase.  Imagine holding a lock on a table row or a table page for several minutes.   Such a system would quickly grind to a halt waiting for resources to unlock.

So, we adopt a strategy called optimistic locking.   We'll assume that the data we displayed in an online screen, whether it's a web page or a CICS screen, does not go stale. Rather than hold a lock after displaying a screen and while waiting for a user to continue, we simply test to see if the data went stale when we're ready to update our data.  The logic looks something like this:
  1. Select data from database
  2. Display data on the screen; program terminates and waits for user input
  3. User updates the data on the screen
  4. Verify the original data has not changed, and apply the user's updates to the database
Let's start with some data.  The Bedrock Bank ACCOUNT table has four columns:  id (a surrogate key), name, checking, savings.

ID    NAME            CHECKING   SAVINGS
--    --------------  --------   -------
1     Flintstone           100       100
2     Rubble               100       100

Suppose Fred and Wilma go to the Bedrock Bank ATM to withdraw 70 rocks from savings.  They each go to the ATM, display their account information, and attempt to make the withdrawal.  According to each display, there should be enough rocks in the savings account to withdraw 70 rocks.  First, Wilma withdraws 70 rocks, and all is good.  Then Fred attempts to withdraw 70 rocks. Should the Flintstone account go 40 rocks in the red?  Of course not!  The problem is that Fred made his withdrawal based on stale information.  His display says there are 100 rocks in savings, but that's old information now.  There are only 30 rocks in savings.  The ATM needs to implement controls to ensure data integrity.

In the Oracle APEX world, automatic row processing takes care of this behind the scenes.  We can write web-based programs as if we didn't need to worry about locking.  But in some cases our interactions with the database will be complicated enough that we need to implement the locking ourselves.  Fortunately, it's not too difficult; we just need a way to know if the data went stale before we update it.

For our first solution, we'll take a page from CICS methodology.  We'll add a last update timestamp to the ACCOUNT table.  Notice that we're adding a complete timestamp, not just a date field. A date field is not fine-grained enough for our purpose.
ID    NAME            CHECKING   SAVINGS    LAST_UPDATE_TS
--    --------------  --------    ------    -----------------------
1     Flintstone           100       100    2013-09-15.12.02.57.000321
2     Rubble               100       100    2013-08-18.15.21.01.000123

We make two assumptions: (1)  every insert or update always updates the LAST_UPDATE_TS column.  We can do this automatically with a trigger, or we can require every application to do this.  (2) Every transaction reads the LAST_UPDATE_TS on a select, and every update/delete transaction includes the LAST_UPDATE_TS as a predicate in the where clause.
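If we take the trigger route for assumption (1), a minimal version looks like this; the trigger name is arbitrary:

create or replace trigger account_ts_trg
before insert or update on account
for each row
begin
   :new.last_update_ts := systimestamp;   -- every insert or update stamps the row
end;
/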

Now, imagine Wilma and Fred at the ATM, each about to withdraw 70 rocks:
  1. Fred slides his card, the ATM displays 100 rocks and saves the LAST_UPDATE_TS.
  2. Wilma slides her card, the ATM displays 100 rocks and saves the LAST_UPDATE_TS.
  3. Wilma withdraws 70 rocks, and the ATM executes the following:
    update account
    set savings = savings - 70,
        last_update_ts = systimestamp
    where id = 1
      and last_update_ts = timestamp '2013-09-15 12:02:57.000321'
    
  4. Fred attempts to withdraw 70 rocks, and the ATM executes the following:
    update account
    set savings = savings - 70,
        last_update_ts = systimestamp
    where id = 1
      and last_update_ts = timestamp '2013-09-15 12:02:57.000321'
    
  5. Fred's attempt fails.  Wilma's transaction changed both the SAVINGS column and the LAST_UPDATE_TS column. The ATM displays a message telling Fred his account balance has changed and re-displays the Flintstone account balance, now 30 rocks.
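
How does the ATM know that Fred's update hit a stale row?  In PL/SQL, one way is to check sql%rowcount after the update: zero rows updated means someone changed the row after we displayed it.  Here's a minimal sketch, assuming the saved timestamp is available in a bind variable named :v_last_update_ts (the variable name is just an illustration):
declare
  stale_data exception;
begin
  update account
  set    savings        = savings - 70,
         last_update_ts = systimestamp
  where  id             = 1
    and  last_update_ts = :v_last_update_ts;

  -- zero rows updated means the where clause no longer matches: stale data
  if sql%rowcount = 0 then
    raise stale_data;
  end if;

  commit;
exception
  when stale_data then
    rollback;
    -- re-select the account and re-display the current balance to Fred
end;
The same sql%rowcount test works for any stale-data predicate, including the checksum approach below.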
Not every shop timestamps rows of data, so in the Oracle Apex world, there is another way to detect stale data.  The wwv_flow_item.md5 function accepts up to 50 arguments and returns an MD5 checksum.  We can substitute the MD5 checksum for the LAST_UPDATE_TS in the above example.  We don't save the MD5 checksum in the table; we just compute it on the fly.  When Fred and Wilma slide their cards, the ATM executes the following SQL:
select name,
       checking,
       savings,
       wwv_flow_item.md5(name, checking, savings)
into  :v_name,
      :v_checking,
      :v_savings,
      :v_hidden_md5
from account
where id = 1

And the update would look like:
update account
set savings = savings - 70
where id = 1
  and wwv_flow_item.md5(name, checking, savings) = :v_hidden_md5

Notice that we compute the MD5 checksum from the current data in both SQL statements.  When Wilma completes her transaction before Fred, the MD5 checksum changes, and Fred's transaction fails, just as it did before.

Now we have two tools to handle online transactions in the stateless world of web programming.  We can use either row timestamps or MD5 checksums to identify stale data.  Just remember to roll back the whole transaction if any part of it fails.

More Reading

If you're rolling your own HTML pages outside of Apex and you need to compute checksums, the Oracle dbms_obfuscation_toolkit package includes an md5 function.  The md5 function only accepts one argument, so just concatenate the values together.  PL/SQL happily concatenates numbers and strings, so we don't need to worry about dissimilar types:
select name,
       checking,
       savings,
       dbms_obfuscation_toolkit.md5(input_string => name || checking || savings)
into  :v_name,
      :v_checking,
      :v_savings,
      :v_hidden_md5
from account
where id = 1
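
And, following the same pattern as the Apex example, the matching update would presumably recompute the checksum in the where clause.  This is a sketch only, reusing the :v_hidden_md5 bind variable saved by the select above:
update account
set savings = savings - 70
where id = 1
  and dbms_obfuscation_toolkit.md5(input_string => name || checking || savings) = :v_hidden_md5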



Sunday, September 29, 2013

How did I do that?

One of my goals in this blog is documenting things I do frequently enough that they're easy to do, yet not so frequently that I can do them without a quick reference check.

So, here's the start of a growing list.

My last post discussed the REXX programming language.  This time I'll discuss REXX's exception handling as a segue to two Oracle PL/SQL functions.

REXX's exception handling includes the signal on statement to trap specific exceptions, and the SIGL variable stores the program's line number where the exception occurred.  Here's an example:
/* Demonstrate Rexx error handling*/
signal on syntax

say ' 50 / 0  = ' 50 / 0

exit 0

/* Error handler */
syntax:
  Say 'Error on line' SIGL':' sourceline(SIGL) 
  exit 1
Execute this code, and we get the line number where the error occurred; the sourceline function returns the program source, too:
tom@Pavilion-dv8000:~$ rexx error.rx
Error on line 4: say ' 50 / 0  = ' 50 / 0
This is really useful, and not all runtime environments are so friendly when you do something foolish. The COBOL runtime library would print the address where the machine code failed. Not very friendly, but at least you could use the offset to search through the program listing to find where the error occurred.

Some runtime environments do even less. Oracle's PL/SQL language supports exception handling, but older releases of the database did not include a way to determine the line number of the exception.  If you found yourself maintaining code that was hundreds or thousands of lines long, tracking down the exception could be quite challenging.  Thankfully, the database now includes a function, dbms_utility.format_error_backtrace (introduced in Oracle 10g), that returns the line number where the exception occurred.  Here's a quick example using the divide by zero problem:
declare
    x integer := 10;
    y integer := 0;
    z integer := null;
begin
    z := x / y;
   
    dbms_output.put_line('Program never gets here.');
exception 
    when others then 
    dbms_output.put_line(sqlerrm);
    dbms_output.put_line(dbms_utility.format_error_backtrace());
    
end;
Execute this code in SQL*Plus or as an APEX script, and you get the following message:
ORA-01476: divisor is equal to zero
ORA-06512: at line 6

Statement processed. 0.01 seconds
The example is trivial, but knowing exactly where things went wrong is really useful when there are lots of places for code to go wrong.

Another useful Oracle function is dump. Have you ever wondered what was stored internally for an international character? Or have you ever wondered why two strings that looked the same weren't the same? The dump function displays the internal coding of a character or a string. If you need to figure out why 'string' is not equal to 'string ', dump can help. Here's an example; notice the second parameter. The '16' indicates that we want the data displayed as hex numbers. The '1016' indicates that we want hex numbers plus the character set name.
select dump('string',16), dump('string ',1016) 
from dual;
Execute this example, and you'll see the following:
DUMP('STRING',16)               DUMP('STRING',1016)
Typ=96 Len=6: 73,74,72,69,6e,67 Typ=96 Len=7 CharacterSet=AL32UTF8: 73,74,72,69,6e,67,20
Now it's easy to see that the second 'string' has an extra space character. Also, we can see that this database (hosted at apex.oracle.com) is using the AL32UTF8 (Unicode UTF-8) character encoding.

Friday, September 20, 2013

REXX, the REstructured eXtended eXecutor

In many ways, Unix and Linux provide a very friendly development environment for programmers.  There's a large suite of system utilities that accomplish all sorts of things, from sorting to cutting to joining.  And all these utilities can be glued together with pipes, either ad hoc on the command line or as part of a shell script.  While I've done a fair amount of shell programming, I'm not a big fan of either ksh or bash as a scripting language.  The shell scripting languages are quite archaic, and I'd really prefer something more modern.

Many years ago, I learned to program in the EXEC-2 language, a scripting language used by IBM's VM/CMS operating system.  EXEC-2 and the Unix shell languages share many of the same virtues and vices, and if you learned one you could easily understand the other.  I liked the quick edit-test development cycle, the ability to directly interface with the operating system, the ability to interface directly with the system editor, the ability to create new system commands, and the clumsy but powerful control structures.  You could do a lot of clever things in EXEC-2 or the shell; but it always felt a bit clumsy.

And then IBM introduced the rest of the world to REXX, included as the System Product Interpreter in VM/SP Release 3. Wow!  Everything I loved about EXEC-2 or shell in a modern language that resembled PL/I or the data step of the SAS system.

REXX was the brainchild of IBMer Mike Cowlishaw.  Cowlishaw's aims for REXX were to create a language that was easy to learn and easy to use, a language that was powerful enough for serious program development, a language that favored the REXX programmer over the REXX implementer.  And he succeeded.  REXX is an easy-to-learn, interpreted language with nearly all the control structures found in PL/I.  And, unlike the old EXEC-2 or ksh/bash scripting languages, there's a really strong correlation between syntax and semantics.

So, let's talk about REXX.   We'll install Rexx on a Linux Ubuntu system, and we'll demonstrate a little Rexx code, and we'll conclude with some pointers to more information.

Starting off, I thought I'd check the Ubuntu Software Center, but I couldn't find a version of REXX there.  However, googling "rexx site:packages.ubuntu.com" turned up a link to http://packages.ubuntu.com/lucid/regina-rexx.  Clicking the link gives us:

[screenshot: the regina-rexx package page on packages.ubuntu.com]

Perfect!  We have a build for Ubuntu 12.04.  We could download and install it, but let's let apt-get do the work for us.  After su'ing to root, just run apt-get install regina-rexx:

[screenshot: running apt-get install regina-rexx as root]

Answer Y to continue, and in a few minutes, the Regina Rexx interpreter is installed on your system.  And, voila, you're ready to run the traditional Hello World test:

[screenshot: running the Hello World and 22/7 examples]

Yup, the traditional Hello, World works.  Even better, Rexx evaluates 22/7 and prints a decimal approximation of pi.  Neither EXEC-2 nor the shell will do that so easily.

What does a short Rexx program look like?  Our first example demonstrates an easy trick to make a Rexx program execute like any bash or ksh shell program in your $PATH. The program also demonstrates the Rexx parse instruction.
#!/usr/bin/rexx
/* This is a Rexx comment. The #! is used solely to make our 
   program execute like any ksh or bash script.  
*/
parse arg first_name '('common_name')' last_name . 

say 'First name: ' first_name
say 'Common name:' common_name
say 'Last name:  ' last_name

exit 0
Shell programmers will recognize the very unRexx-like first line. For ksh or bash programs, the first line is a comment (#) followed by a bang (!) that points to the interpreter. This is usually /bin/bash or /bin/ksh. Luckily, we can put the pathname to our Rexx interpreter in the #! path, and it works the same.  If you're not sure where the Rexx interpreter is installed, use the whereis command.  Just remember to chmod +x the Rexx script, and then execute it.  Here's a sample:

[screenshot: a sample run of the parse example]

Two other noteworthy points in this program:  the parse instruction and the exit command.

The parse instruction will save you more time than almost any other feature of REXX.  Parse is extremely powerful, and it splits strings into variables better than almost any other programming language I can think of.

The exit instruction returns a completion code to the operating system.  In Unix or Linux, you can check these codes with the $? variable in your shell.  If you execute a program from a REXX program, you can check its return code with the predefined rc variable.

One more short program.  From Robert Sedgewick's book Algorithms, we'll translate the greatest common divisor program from Pascal to REXX.  Here's the REXX code:
/* Example from R. Sedgewick's _Algorithms_ */
/* Find the greatest common divisor using   */
/* Euclid's algorithm.                      */
trace r
arg x y

say "The greatest common denominator of" x "and" y "is" gcd(x,y)

exit 0

/* Internal function */
gcd:procedure 
arg u, v

if v = 0 then 
   return u
else
   return gcd(v, u//v)
 

and here's the output, run on a Windows PC:

c:\> example.rexx 12 8
     5 *-* arg x y
       >>>   "12"
       >>>   "8"
     7 *-* say "The greatest common denominator of" x "and" y "is" gcd(x,y)
       >V>   "12"
       >V>   "8"
    12 *-* gcd:
       >V>   "12"
       >V>   "8"
       *-*  procedure
    13 *-*  arg u, v
       >>>    "12"
       >>>    "8"
    15 *-*  if v = 0 then
       >V>    "8"
    19 *-*  return gcd(v, u//v)
    12 *-*  gcd:
       >V>    "8"
       >V>    "12"
       >V>    "8"
       *-*   procedure
    13 *-*   arg u, v
       >>>     "8"
       >>>     "4"
    15 *-*   if v = 0 then
       >V>     "4"
    19 *-*   return gcd(v, u//v)
    12 *-*   gcd:
       >V>     "4"
       >V>     "8"
       >V>     "4"
       *-*    procedure
    13 *-*    arg u, v
       >>>      "4"
       >>>      "0"
    15 *-*    if v = 0 then
       >V>      "0"
    16 *-*    return u
       >V>      "4"
The greatest common denominator of 12 and 8 is 4
     9 *-* exit 0

From the top, a few new instructions.  First, the trace instruction:  trace r instructs REXX to display each clause and its results as it executes the code.  The arg instruction assigns the command-line arguments to variables.  The say instruction writes to the console.  The gcd: label starts the internal routine, and the procedure instruction makes the routine's variables local.  GCD is a function because it is invoked with parentheses and returns a value with the return instruction.  Notice that gcd is recursive, and that REXX traces all the recursion for us.

So far, we've talked about Classic Rexx, the ANSI standard REXX language as originally conceived by Cowlishaw.  There are two other versions that are worth exploring.  NetREXX is an alternative to Java, and runs on any system with a JVM.  ooREXX is the open source version of IBM's object-oriented REXX interpreter.  REXX is implemented on many different systems, too.  In addition to Linux, REXX was the system interpreter for OS/2 and VM/CMS, and REXX is also available for Windows, Amiga, and the other IBM mainframe operating systems.

Frederick Brooks, in No Silver Bullet: Essence and Accidents of Software Engineering, argues that there is no silver bullet that will magically solve all the problems inherent in software development.  I think he's right:  in the search for the silver bullet, we've spent the past 40 years reinventing the wheel.  So, I won't claim that REXX is the silver bullet, but Mike Cowlishaw has certainly created the shiniest bullet we're likely to see for a very long time.

Resources

The Rexx Language Association promotes the use of Rexx, sponsors an annual symposium, and its website includes a forum.  You'll find information about NetRexx and ooRexx here, too.

Mark Hessling's home page, www.rexx.org, includes all the Rexx projects that he has developed or is maintaining, including Regina Rexx.  In addition to the Regina Rexx interpreter I installed above, you'll also find goodies like Rexx libraries to connect to databases, develop network programs, and write PDF files.  If you were a fan of the VM/CMS XEDIT editor, THE (The Hessling Editor) is Mark Hessling's implementation of the classic CMS editor.

The classic book on Rexx,  The Rexx Language: A Practical Approach to Programming, by Rexx creator Mike Cowlishaw, is a short and very readable introduction to the Rexx language.

The IBM Infocenters for z/VM  and z/OS include a Rexx Reference and a Rexx User's Guide.  Just go to the Infocenter of your choice, and search for "rexx reference" or "rexx users guide".

The Design of the REXX Language by M. F. Cowlishaw.  This academic article appeared in The IBM Systems Journal Vol. 23 No. 4, 1984.

The Man Behind Rexx: z/Journal Visits IBM Fellow Mike Cowlishaw in this 2004 interview.