Our query works well, but sometimes our data may have values that repeat within an id. Because our query controls the recursion with the values that we're concatenating, we might not get the result back that we wanted.
Let's start with last week's sample table, and add a row to it:
insert into sample values(3,'G');
Run the last week's final query, and our results look this:
Id | Concat_value |
---|---|
1 | A,B |
2 | C,D |
3 | E,F,G |
3 | E,F,G |
4 | H |
This is probably not the result we hoped for. So, first question: what did we hope to see? If the correct result is four rows, with the value of "E,F,G" for Id = 3 , then we just need to add the distinct function to the final query.
But maybe the correct answer is four rows, and the concat_value for Id = 3 is "E,F,G,G". In that case, we need to do a little work to our query. As originally written, the query uses the VALUE column to control the recursion. If the values repeat, then we need to introduce a surrogate value to the query. Let's examine the query and let the code comments do the talking:
-- Start with a new common table expression(CTE) that creates a unique surrogate -- value for each value in the VALUE column. The OLAP row_number() function will -- do that easily. with add_surogate (id, value, surogate_value) as ( select id, value, row_number() over (partition by id order by value) from sample ) -- Add the surrogate_value to the rq CTE ,rq ( id, value, surogate_value, concat_value, depth) as ( select a.id, a.value, a.surogate_value, cast( a.value as varchar(256)), 1 from add_surogate a where a.surogate_value = ( select min(surogate_value) -- Start the recursion using the surrogate value from add_surogate b where a.id = b.id ) union all select c.id, c.value, c.surogate_value, -- No changes, concatenate values as before cast( rq.concat_value + ',' + c.value as varchar(256)), rq.depth + 1 from add_surogate c inner join rq on c.id = rq.id -- And control the progress of the recursion using the -- surrogate value and c.surogate_value > rq.surogate_value and rq.depth < 10 ) ,all_rows as ( select id, value, concat_value, depth, max(depth) over (partition by id) max_depth from rq ) select id, concat_value from all_rows where depth = max_depth
The changes from last week's query to this week's query are pretty simple:
- Add a new common table expression to create a surrogate value
- Add the surrogate value to the recursive common table expression
- Use the surrogate value to control the recursion
- Continue to build the result string as before
ID | Concat_value |
---|---|
1 | A,B |
2 | C,D |
3 | E,F,G,G |
4 | H |
More Reading:
The vendors' online documentation have good examples:
- SQL Server 2008 Recursive Queries Using Common Table Expressions
- DB2 Info Center
- Oracle 11gR2 Don't be put off by the term "subquery factoring". It's just recursion.