New SQL Puzzle
June 13th, 2012 — ddelmoliSorry for the drought — to keep everyone’s mind fresh, how about a little puzzle?
Use SQL to create a symmetric Pascal matrix, with the output being (i,j,v). So that a Pascal matrix of size 4 would end up as:
1,1,1
2,1,1
3,1,1
4,1,1
1,2,1
2,2,2
3,2,3
4,2,4
1,3,1
2,3,3
3,3,6
4,3,10
1,4,1
2,4,4
3,4,10
4,4,20
Try to stay with ANSI SQL 2008 if possible — bailing out to other languages and functions is discouraged
Efficiency is an interesting idea here, as it’s pretty easy to do this by brute force calculations of factorials for every entry — but where’s the fun in that?
Extra credit for solutions which can take in the matrix size as some sort of parameter.
June 14th, 2012 at 7:00 am
I put together a solution using WITH blocks and Oracle specific syntax, so a solution is possible using just SQL (but I am not sure if that Oracle specific syntax qualifies as ANSI SQL 2008).
This is my output:
I J V
– — —-
1 1 1
2 1 1
3 1 1
4 1 1
1 2 1
2 2 2
3 2 3
4 2 4
1 3 1
2 3 3
3 3 6
4 3 10
1 4 1
2 4 4
3 4 10
4 4 20
Any interest in seeing the Oracle specific code?
June 14th, 2012 at 7:40 am
Charles, sure! WITH blocks aren’t Oracle-specific (I was able to run my solution in Postgres and Oracle), so I’d love to see what you came up “with”.
June 14th, 2012 at 8:21 am
I have a degree in mathematics, but that degree is very, very rusty. Based on the provided link, we are trying to produce an array like the following, where the value in a cell in the array is the sum of the value immediately above and immediately to the left:
1 1 1 1 1
1 2 3 4 5
1 3 6 10 15
1 4 10 20 35
1 5 15 35 70
The first rows have the following values:
1^0 + 2^0
1^0 + 2^0 + 3^0
1^0 + 2^0 + 3^0 + 4^0
1^0 + 2^0 + 3^0 + 4^0 + 5^0
The second row has the following values:
1^0 + 2^0 + 1^0
1^0 + 2^0 + 3^0 + 1^0 + 2^0 + 1^0 = 1 * 3^0 + 2 * 2^0 + 3 * 1^0
1^0 + 2^0 + 3^0 + 4^0 + 1^0 + 2^0 + 3^0 + 1^0 + 2^0 + 1^0 = 1 * 4^0 + 2 * 3^0 + 3* 2^0 + 4 * 1^0
1^0 + 2^0 + 3^0 + 4^0 + 5^0 + 1^0 + 2^0 + 3^0 + 4^0 + 1^0 + 2^0 + 3^0 + 1^0 + 2^0 + 1^0 = 1 * 5^0 + 2 * 4^0 + 3 * 3^0 + 4 * 2^0 + 5 * 1^0
The third row:
1^0 + 2^0 + 1^0 + 1^0
1^0 + 2^0 + 3^0 + 1^0 + 2^0 + 1^0 + 1^0 + 2^0 + 1^0 + 1^0 = 1 * 3^0 + 3 * 2^0 + 6 * 1^0
1^0 + 2^0 + 3^0 + 4^0 + 1^0 + 2^0 + 3^0 + 1^0 + 2^0 + 1^0 + 1^0 + 2^0 + 3^0 + 1^0 + 2^0 + 1^0 + 1^0 + 2^0 + 1^0 + 1^0 = 1 * 4^0 + 3 * 3^0 + 6 * 2^0 + 10 * 1^0
1^0 + 2^0 + 3^0 + 4^0 + 5^0 + 1^0 + 2^0 + 3^0 + 4^0 + 1^0 + 2^0 + 3^0 + 1^0 + 2^0 + 1^0 + 1^0 + 2^0 + 3^0 + 4^0 + 1^0 + 2^0 + 3^0 + 1^0 + 2^0 + 1^0 + 1^0 + 2^0 + 3^0 + 1^0 + 2^0 + 1^0 + 1^0 + 2^0 + 1^0 + 1^0
Now we can just… cheat and look at the formula from the article that you provided:
(i + j – 2)! / ((i – 1)! * (j – 1)!)
! is short for factorial
4! = 4 * 3 * 2 * 1
3! = 3 * 2 * 1
2! = 2 * 1
1! = 1
Calculating the factorial is a bit of a challenge. The SUM analytic function can be used to generate running sums of a column, but there is no equivalent for running products of a column.
If we calculate the natural log of a column value, generate a running sum of those natural log values, and then generate the anti-log, we can essentially generate a running product (factorial) for a column:
WITH
FACTORIAL AS
(SELECT /*+ MATERIALIZE */
N,
EXP(SUM(LN(N)) OVER (ORDER BY N)) FACTORIAL
FROM
(SELECT
ROWNUM N,
LN(ROWNUM) RNL
FROM
DUAL
CONNECT BY
LEVEL<=20))
SELECT
N,
FACTORIAL F
FROM
FACTORIAL;
N F
– ———-
1 1
2 2
3 6
4 24
5 120
6 720
7 5040
8 40320
9 362880
10 3628800
11 39916800
12 479001600
13 6227020800
14 8.7178E+10
15 1.3077E+12
16 2.0923E+13
17 3.5569E+14
18 6.4024E+15
19 1.2165E+17
20 2.4329E+18
One problem is that with the formula (i + j – 2)!, we could very well need to calculate 0!, which is defined as 1. To accomplish this, we need to add the following inside the WITH block:
UNION ALL
SELECT
0 N,
1 FACTORIAL
FROM
DUAL
Now it is just a matter of defining how large of an array is needed, performing a Cartesian join to achieve the correct number of rows (I * J), and placing the factorial values where necessary:
WITH
FACTORIAL AS
(SELECT /*+ MATERIALIZE */
N,
EXP(SUM(LN(N)) OVER (ORDER BY N)) FACTORIAL
FROM
(SELECT
ROWNUM N,
LN(ROWNUM) RNL
FROM
DUAL
CONNECT BY
LEVEL<=20)
UNION ALL
SELECT
0 N,
1 FACTORIAL
FROM
DUAL),
I AS
(SELECT /*+ MATERIALIZE */
ROWNUM I
FROM
DUAL
CONNECT BY
LEVEL<=4),
J AS
(SELECT /*+ MATERIALIZE */
ROWNUM J
FROM
DUAL
CONNECT BY
LEVEL<=4)
SELECT
I.I,
J.J,
F1.FACTORIAL / (F2.FACTORIAL * F3.FACTORIAL) V
FROM
FACTORIAL F1,
FACTORIAL F2,
FACTORIAL F3,
I,
J
WHERE
F1.N=(I.I + J.J – 2)
AND F2.N=(I.I – 1)
AND F3.N=(J.J – 1)
ORDER BY
J.J,
I.I;
June 14th, 2012 at 2:36 pm
[...] Dominic’s challenge probably would not be much of a challenge if we could just type in formulas like the above into a SQL statement. Give his challenge a try to see if you are able to derive a unique solution to the problem. I probably spent a couple of minutes (maybe 60 seconds with the help of copy and paste) creating the above example using Microsoft Excel, but spent a couple of hours trying to produce a solution that worked using SQL. [...]
June 14th, 2012 at 6:41 pm
A PostgreSQL approach – Mixing analytic functions and recursive subquery factoring. http://rafudb.blogspot.fi/2012/06/pascal-matrix.html
June 14th, 2012 at 8:02 pm
Nice one Timo — glad to see another approach!
June 15th, 2012 at 3:34 am
select i
, j
, ( select exp( sum( ln( level ) ) ) from dual connect by level <= i + j – 2 )
/ ( select exp( sum( ln( level ) ) ) from dual connect by level <= i – 1 )
/ ( select exp( sum( ln( level ) ) ) from dual connect by level <= j – 1 ) v
from ( select mod( level – 1, n ) + 1 i
, ceil( level / n ) j
from ( select 8 n from dual )
connect by level <= n * n
)
June 15th, 2012 at 7:59 am
Anton, good one — I particularly like the single reference to the size of the array. And so far I’ve been impressed with the use of the exp(sum(ln())) construct to calculate the factorials.
June 15th, 2012 at 8:30 am
I think the statement below does the trick:
with r as (select level nr from dual connect by level <= 9 ) , z as (select r.nr row_nr , c.nr col_nr from r r , r c ) , p as (select * from z model dimension by (row_nr, col_nr) measures (cast(null as number) new_val) rules ( new_val[any, any] order by row_nr, col_nr = greatest(case when new_val[cv()-1, cv()] is null then 0 else new_val[cv()-1, cv()] end + case when new_val[cv(),cv()-1] is null then 0 else new_val[cv(),cv()-1] end ,1 ) ) ) select * from p pivot (sum(new_val) for col_nr in (1,2,3,4,5,6,7,8,9)) order by row_nr –, col_nr ;
The pivot is just for formatting the output. If you comment it out, the fixed number for the matrix size can be made dynamic.
June 15th, 2012 at 8:55 am
I admit this is probably pure Oracle (I have no idea which other databases might support the MODEL clause), but it was fun to make so I might as well share it
SQL> variable matrixsize number
SQL>
SQL> begin
2 :matrixsize := 4;
3 end;
4 /
PL/SQL procedure successfully completed.
SQL>
SQL> select
2 i, j, v
3 from dual
4 model
5 dimension by (1 i, 1 j)
6 measures (1 v)
7 rules automatic order (
8 v[for i from 2 to :matrixsize increment 1, 1] = 1,
9 v[1, for j from 2 to :matrixsize increment 1] = 1,
10 v[for i from 2 to :matrixsize increment 1,
11 for j from 2 to :matrixsize increment 1
12 ] = v[cv(i)-1,cv(j)] + v[cv(i),cv(j)-1]
13 )
14 order by
15 i, j;
I J V
———- ———- ———-
1 1 1
1 2 1
1 3 1
1 4 1
2 1 1
2 2 2
2 3 3
2 4 4
3 1 1
3 2 3
3 3 6
3 4 10
4 1 1
4 2 4
4 3 10
4 4 20
16 rows selected.
June 15th, 2012 at 10:06 am
Here is another possible solution that uses SYS_CONNECT_BY_PATH to create formulas and XMLQUERY to calculate the value of those formulas to generate the factorial values.
First, the formulas are built:
SELECT
ROWNUM N,
FF
FROM
(SELECT
’1′||SYS_CONNECT_BY_PATH(ROWNUM,’ * ‘) FF
FROM
DUAL
CONNECT BY
LEVEL<=8
ORDER BY 1);
N FF
- ———————————
1 1 * 1
2 1 * 1 * 2
3 1 * 1 * 2 * 3
4 1 * 1 * 2 * 3 * 4
5 1 * 1 * 2 * 3 * 4 * 5
6 1 * 1 * 2 * 3 * 4 * 5 * 6
7 1 * 1 * 2 * 3 * 4 * 5 * 6 * 7
8 1 * 1 * 2 * 3 * 4 * 5 * 6 * 7 * 8
As shown above, I am only planning to calculate through 8!, but that will later be changed to 20!.
Next, the value of the formulas are calculated:
SELECT
ROWNUM N,
XMLQUERY((FF) RETURNING CONTENT).GETNUMBERVAL() FACTORIAL
FROM
(SELECT
’1′||SYS_CONNECT_BY_PATH(ROWNUM,’ * ‘) FF
FROM
DUAL
CONNECT BY
LEVEL<=8
ORDER BY 1);
N FACTORIAL
- ———-
1 1
2 2
3 6
4 24
5 120
6 720
7 5040
8 40320
The above SQL statement does not calculate 0!, so that problem will be handled later by a DECODE statement.
The following will be used to derive the i and j values for an 8 x 8 matrix:
SELECT
MOD(ROWNUM-1,8)+1 I,
TRUNC((ROWNUM-1)/8)+1 J
FROM
DUAL
CONNECT BY
LEVEL <= 8 * 8;
The final query uses two WITH block subqueries, and scalar subqueries to pull the correct factorial value for the calculation:
WITH
FACTORIAL AS
(SELECT /*+ MATERIALIZE */
ROWNUM N,
XMLQUERY((FF) RETURNING CONTENT).GETNUMBERVAL() FACTORIAL
FROM
(SELECT
’1′||SYS_CONNECT_BY_PATH(ROWNUM,’ * ‘) FF
FROM
DUAL
CONNECT BY
LEVEL<=20
ORDER BY 1)),
NUMBERS AS
(SELECT
MOD(ROWNUM-1,8)+1 I,
TRUNC((ROWNUM-1)/8)+1 J
FROM
DUAL
CONNECT BY
LEVEL <= 8 * 8)
SELECT
N.I,
N.J,
DECODE((N.I + N.J – 2),0,1,
(SELECT
F.FACTORIAL
FROM
FACTORIAL F
WHERE
F.N=(N.I + N.J – 2))) /
DECODE((N.I – 1),0,1,
(SELECT
F.FACTORIAL
FROM
FACTORIAL F
WHERE
F.N=(N.I – 1))) /
DECODE((N.J – 1),0,1,
(SELECT
F.FACTORIAL
FROM
FACTORIAL F
WHERE
F.N=(N.J – 1))) V
FROM
NUMBERS N;
I J V
– — —–
1 1 1
2 1 1
3 1 1
4 1 1
5 1 1
6 1 1
7 1 1
8 1 1
1 2 1
2 2 2
3 2 3
4 2 4
5 2 5
6 2 6
7 2 7
8 2 8
1 3 1
2 3 3
3 3 6
4 3 10
5 3 15
6 3 21
7 3 28
8 3 36
1 4 1
2 4 4
3 4 10
4 4 20
5 4 35
6 4 56
7 4 84
8 4 120
1 5 1
2 5 5
3 5 15
4 5 35
5 5 70
6 5 126
7 5 210
8 5 330
1 6 1
2 6 6
3 6 21
4 6 56
5 6 126
6 6 252
7 6 462
8 6 792
1 7 1
2 7 7
3 7 28
4 7 84
5 7 210
6 7 462
7 7 924
8 7 1716
1 8 1
2 8 8
3 8 36
4 8 120
5 8 330
6 8 792
7 8 1716
8 8 3432
June 16th, 2012 at 6:46 pm
with pascal( i, j, v, n, pv )
as ( select 1, 1, 1, 8, mdsys.sdo_numtab( 1 )
from dual
union all
select case when i = n then 1 else i + 1 end
, case when i = n then j + 1 else j end
, case when i = n or j = 1 then 1 else v + ( select column_value from table( pv ) where rownum < 2 ) end
, n
, case when cardinality( pv ) = n then pv multiset except ( select mdsys.sdo_numtab( column_value ) from table( pv ) where rownum < 2 ) else pv end multiset union mdsys.sdo_numtab( case when i = n or j = 1 then 1 else v + ( select column_value from table( pv ) where rownum < 2 ) end )
from pascal
where i < n or j < n
)
select i, j, v
from pascal
June 16th, 2012 at 7:35 pm
[...] Comments anton on New SQL PuzzleCharles Hooper on New SQL PuzzleKim Berg Hansen on New SQL [...]
June 20th, 2012 at 6:56 am
[...] been impressed with the solutions to my little problem of generating a symmetric Pascal matrix using SQL. Charles Hooper in particular has provided [...]