Sunday, January 12, 2014

Oracle Pipelined Functions

I've had a few opportunities to use and write PL/SQL pipelined functions.   From a developer's point of view, a pipelined function is just a PL/SQL function with a new way of returning data.  From the function user's point of view, a pipelined function looks like a view on steroids. 

Let's start with the user's point of view first.   Why use a pipelined function?
  • The function can implement complicated business logic that might not be feasible in a view.
  • The  function can extend security. Create the function with definer's rights and grant execute to specific invokers to give the invokers access to data they might not have.
  • The function can implement additional security, restricting the data returned to the user.
  • The function can accept arguments, looking like a parameterized view.
  • The function can create rows of data and log errors.
Applications commonly create data with date information.  Suppose we have the following INCIDENT_LOG  table. Every incident is logged in this table with the date and description:
create table incident_log (
    incident_date date not null,
    incident_description varchar(4000)
    );
From this table, we can easily ask questions like "How many incidents did we have on January 2?" Or "How many incidents have we had?"  But, we can not answer the question "what days did we not have an incident?" We can not answer the question "Including days with no incidents, how many incidents do we average per day?" There is no data in our table if we did not have an incident, so we can not answer the last two questions.

From a developer's point view, to answer the last two questions we need to join our incident table to a table with all possible dates of interest. We could create a new table and populate it with all the dates to cover the range of dates in the INCIDENT_LOG table. For organizations with years or decades of data, that's a big table requiring ongoing maintenance.

Why not just create the dates we need when we need them?  We can do exactly that using a pipelined function. Our function will accept two arguments, a start date and an end date. Our function will return a table of rows with two columns, the date of interest and the name of the day, starting with the start date and finishing with the end date. 

Our pipelined function returns rows from a table, so we need to  create a TYPE of the table to return.  First, we create a row in the table as an OBJECT, then we create the TABLE of row objects.  Here's the code:
create or replace
    type row_of_date as object (
        v_date date,
        v_day  varchar2(12)
    );
/
create or replace
    type table_of_date as table of row_of_date;
/
Next, we create a function named WORK_DATES that returns the working dates and days between two dates.   Let's start with the declarations. Our function accepts two input dates and returns a TABLE_OF_DATE. We created the TABLE_OF_DATE in the code above. Notice, too, that our return is PIPELINED. The function creates one local variable, V_ROW_OF_DATE. This variable is an object, and so we must instantiate it using the default constructor. We do not need to create the constructor, Oracle defines the default constructor for us.
create or replace function work_dates (p_start_dt in VARCHAR2,
                                       p_end_dt in VARCHAR2)
return table_of_date pipelined as
        
    v_row_of_date row_of_date := row_of_date(null, null);

begin
    
    v_row_of_date.v_date := to_date(p_start_dt,'yyyy-mm-dd');
    loop
        
        v_row_of_date.v_day := to_char(v_row_of_date.v_date,'FmDay');
        if v_row_of_date.v_day not in ('Sunday','Saturday') then
            pipe row(v_row_of_date);  -- Pipelined functions return data here
        end if;

        v_row_of_date.v_date := v_row_of_date.v_date + interval '1' day;

        if v_row_of_date.v_date > to_date(p_end_dt,'yyyy-mm-dd') then
            exit;
        end if;

    end loop;

end work_dates;
/
Now, let's examine the executable code.  Our function starts by converting the start date to a date and stores it in v_row_of_date.v_date.  Next, the function determines the day of the week.  Our WORK_DATES function returns working days and skips Sundays and Saturdays.   If our organization's calendar is online, we could add logic here to query the calendar via a TCP/IP connection and skip holidays, too.

The important step is next:  piping the data back to the caller.  We don't use a RETURN statement, instead we use a PIPE ROW statement.

Finally, our function adds 1 day to the date and exits when the range of dates has been piped to the caller.

Using our new function is easy. We just CAST it as a table and query it as a table:
select *
from table(work_dates('2001-01-01','2001-01-09))
 
V_DATE          V_DAY
----------      ----------
01/01/2001      Monday
01/02/2001      Tuesday
01/03/2001      Wednesday
01/04/2001      Thursday
01/05/2001      Friday
01/08/2001      Monday
01/09/2001      Tuesday
Now, back to our queries. What working days in 2012 do not have incidents associated with them?
select v_date, v_day
from table(work_days('2012-01-01','2012-12-31')) a
where not exists ( select 1
                   from incident_log b
                   where a.v_date = b.incident_date
                 );
That was easy.  Next, what is the average number of incidents per working day in 2012? How many incidents were there in 2012?
with by_day as ( 
   select v_date, sum(case
                        when b.incident_date is not null then 1
                        else 0
                      end) daily_count
   from table(work_days('2012-01-01','2012-12-31')) a
   left outer join incident_log b
     on a.v_date = b.incident_date
   group by v_date
)
select sum(daily_count) total_incidents,
       count(*) n_work_days,
       avg(daily_count) incidents_per_day
from by_day;
The common table expression by_day sums up the daily count of incidents; the outer query sums the annual count and average daily count.

Pipelined functions make easy work of writing queries to answer questions where we're missing data.  With the ability to add complicated business logic, extend security, and write logs, pipeline functions are another useful tool in our toolbox.


No comments:

Post a Comment