## New SQL Puzzle

June 13th, 2012 — ddelmoliSorry for the drought — to keep everyone’s mind fresh, how about a little puzzle?

Use SQL to create a symmetric Pascal matrix, with the output being (i,j,v). So that a Pascal matrix of size 4 would end up as:

1,1,1

2,1,1

3,1,1

4,1,1

1,2,1

2,2,2

3,2,3

4,2,4

1,3,1

2,3,3

3,3,6

4,3,10

1,4,1

2,4,4

3,4,10

4,4,20

Try to stay with ANSI SQL 2008 if possible — bailing out to other languages and functions is discouraged

Efficiency is an interesting idea here, as it’s pretty easy to do this by brute force calculations of factorials for every entry — but where’s the fun in that?

Extra credit for solutions which can take in the matrix size as some sort of parameter.

June 14th, 2012 at 7:00 am

I put together a solution using WITH blocks and Oracle specific syntax, so a solution is possible using just SQL (but I am not sure if that Oracle specific syntax qualifies as ANSI SQL 2008).

This is my output:

I J V

– — —-

1 1 1

2 1 1

3 1 1

4 1 1

1 2 1

2 2 2

3 2 3

4 2 4

1 3 1

2 3 3

3 3 6

4 3 10

1 4 1

2 4 4

3 4 10

4 4 20

Any interest in seeing the Oracle specific code?

June 14th, 2012 at 7:40 am

Charles, sure! WITH blocks aren’t Oracle-specific (I was able to run my solution in Postgres and Oracle), so I’d love to see what you came up “with”.

June 14th, 2012 at 8:21 am

I have a degree in mathematics, but that degree is very, very rusty. Based on the provided link, we are trying to produce an array like the following, where the value in a cell in the array is the sum of the value immediately above and immediately to the left:

1 1 1 1 1

1 2 3 4 5

1 3 6 10 15

1 4 10 20 35

1 5 15 35 70

The first rows have the following values:

1^0 + 2^0

1^0 + 2^0 + 3^0

1^0 + 2^0 + 3^0 + 4^0

1^0 + 2^0 + 3^0 + 4^0 + 5^0

The second row has the following values:

1^0 + 2^0 + 1^0

1^0 + 2^0 + 3^0 + 1^0 + 2^0 + 1^0 = 1 * 3^0 + 2 * 2^0 + 3 * 1^0

1^0 + 2^0 + 3^0 + 4^0 + 1^0 + 2^0 + 3^0 + 1^0 + 2^0 + 1^0 = 1 * 4^0 + 2 * 3^0 + 3* 2^0 + 4 * 1^0

1^0 + 2^0 + 3^0 + 4^0 + 5^0 + 1^0 + 2^0 + 3^0 + 4^0 + 1^0 + 2^0 + 3^0 + 1^0 + 2^0 + 1^0 = 1 * 5^0 + 2 * 4^0 + 3 * 3^0 + 4 * 2^0 + 5 * 1^0

The third row:

1^0 + 2^0 + 1^0 + 1^0

1^0 + 2^0 + 3^0 + 1^0 + 2^0 + 1^0 + 1^0 + 2^0 + 1^0 + 1^0 = 1 * 3^0 + 3 * 2^0 + 6 * 1^0

1^0 + 2^0 + 3^0 + 4^0 + 1^0 + 2^0 + 3^0 + 1^0 + 2^0 + 1^0 + 1^0 + 2^0 + 3^0 + 1^0 + 2^0 + 1^0 + 1^0 + 2^0 + 1^0 + 1^0 = 1 * 4^0 + 3 * 3^0 + 6 * 2^0 + 10 * 1^0

1^0 + 2^0 + 3^0 + 4^0 + 5^0 + 1^0 + 2^0 + 3^0 + 4^0 + 1^0 + 2^0 + 3^0 + 1^0 + 2^0 + 1^0 + 1^0 + 2^0 + 3^0 + 4^0 + 1^0 + 2^0 + 3^0 + 1^0 + 2^0 + 1^0 + 1^0 + 2^0 + 3^0 + 1^0 + 2^0 + 1^0 + 1^0 + 2^0 + 1^0 + 1^0

Now we can just… cheat and look at the formula from the article that you provided:

(i + j – 2)! / ((i – 1)! * (j – 1)!)

! is short for factorial

4! = 4 * 3 * 2 * 1

3! = 3 * 2 * 1

2! = 2 * 1

1! = 1

Calculating the factorial is a bit of a challenge. The SUM analytic function can be used to generate running sums of a column, but there is no equivalent for running products of a column.

If we calculate the natural log of a column value, generate a running sum of those natural log values, and then generate the anti-log, we can essentially generate a running product (factorial) for a column:

WITH

FACTORIAL AS

(SELECT /*+ MATERIALIZE */

N,

EXP(SUM(LN(N)) OVER (ORDER BY N)) FACTORIAL

FROM

(SELECT

ROWNUM N,

LN(ROWNUM) RNL

FROM

DUAL

CONNECT BY

LEVEL<=20))

SELECT

N,

FACTORIAL F

FROM

FACTORIAL;

N F

– ———-

1 1

2 2

3 6

4 24

5 120

6 720

7 5040

8 40320

9 362880

10 3628800

11 39916800

12 479001600

13 6227020800

14 8.7178E+10

15 1.3077E+12

16 2.0923E+13

17 3.5569E+14

18 6.4024E+15

19 1.2165E+17

20 2.4329E+18

One problem is that with the formula (i + j – 2)!, we could very well need to calculate 0!, which is defined as 1. To accomplish this, we need to add the following inside the WITH block:

UNION ALL

SELECT

0 N,

1 FACTORIAL

FROM

DUAL

Now it is just a matter of defining how large of an array is needed, performing a Cartesian join to achieve the correct number of rows (I * J), and placing the factorial values where necessary:

WITH

FACTORIAL AS

(SELECT /*+ MATERIALIZE */

N,

EXP(SUM(LN(N)) OVER (ORDER BY N)) FACTORIAL

FROM

(SELECT

ROWNUM N,

LN(ROWNUM) RNL

FROM

DUAL

CONNECT BY

LEVEL<=20)

UNION ALL

SELECT

0 N,

1 FACTORIAL

FROM

DUAL),

I AS

(SELECT /*+ MATERIALIZE */

ROWNUM I

FROM

DUAL

CONNECT BY

LEVEL<=4),

J AS

(SELECT /*+ MATERIALIZE */

ROWNUM J

FROM

DUAL

CONNECT BY

LEVEL<=4)

SELECT

I.I,

J.J,

F1.FACTORIAL / (F2.FACTORIAL * F3.FACTORIAL) V

FROM

FACTORIAL F1,

FACTORIAL F2,

FACTORIAL F3,

I,

J

WHERE

F1.N=(I.I + J.J – 2)

AND F2.N=(I.I – 1)

AND F3.N=(J.J – 1)

ORDER BY

J.J,

I.I;

June 14th, 2012 at 2:36 pm

[...] Dominic’s challenge probably would not be much of a challenge if we could just type in formulas like the above into a SQL statement. Give his challenge a try to see if you are able to derive a unique solution to the problem. I probably spent a couple of minutes (maybe 60 seconds with the help of copy and paste) creating the above example using Microsoft Excel, but spent a couple of hours trying to produce a solution that worked using SQL. [...]

June 14th, 2012 at 6:41 pm

A PostgreSQL approach – Mixing analytic functions and recursive subquery factoring. http://rafudb.blogspot.fi/2012/06/pascal-matrix.html

June 14th, 2012 at 8:02 pm

Nice one Timo — glad to see another approach!

June 15th, 2012 at 3:34 am

select i

, j

, ( select exp( sum( ln( level ) ) ) from dual connect by level <= i + j – 2 )

/ ( select exp( sum( ln( level ) ) ) from dual connect by level <= i – 1 )

/ ( select exp( sum( ln( level ) ) ) from dual connect by level <= j – 1 ) v

from ( select mod( level – 1, n ) + 1 i

, ceil( level / n ) j

from ( select 8 n from dual )

connect by level <= n * n

)

June 15th, 2012 at 7:59 am

Anton, good one — I particularly like the single reference to the size of the array. And so far I’ve been impressed with the use of the exp(sum(ln())) construct to calculate the factorials.

June 15th, 2012 at 8:30 am

I think the statement below does the trick:

with r as (select level nr from dual connect by level <= 9 ) , z as (select r.nr row_nr , c.nr col_nr from r r , r c ) , p as (select * from z model dimension by (row_nr, col_nr) measures (cast(null as number) new_val) rules ( new_val[any, any] order by row_nr, col_nr = greatest(case when new_val[cv()-1, cv()] is null then 0 else new_val[cv()-1, cv()] end + case when new_val[cv(),cv()-1] is null then 0 else new_val[cv(),cv()-1] end ,1 ) ) ) select * from p pivot (sum(new_val) for col_nr in (1,2,3,4,5,6,7,8,9)) order by row_nr –, col_nr ;

The pivot is just for formatting the output. If you comment it out, the fixed number for the matrix size can be made dynamic.

June 15th, 2012 at 8:55 am

I admit this is probably pure Oracle (I have no idea which other databases might support the MODEL clause), but it was fun to make so I might as well share it

SQL> variable matrixsize number

SQL>

SQL> begin

2 :matrixsize := 4;

3 end;

4 /

PL/SQL procedure successfully completed.

SQL>

SQL> select

2 i, j, v

3 from dual

4 model

5 dimension by (1 i, 1 j)

6 measures (1 v)

7 rules automatic order (

8 v[for i from 2 to :matrixsize increment 1, 1] = 1,

9 v[1, for j from 2 to :matrixsize increment 1] = 1,

10 v[for i from 2 to :matrixsize increment 1,

11 for j from 2 to :matrixsize increment 1

12 ] = v[cv(i)-1,cv(j)] + v[cv(i),cv(j)-1]

13 )

14 order by

15 i, j;

I J V

———- ———- ———-

1 1 1

1 2 1

1 3 1

1 4 1

2 1 1

2 2 2

2 3 3

2 4 4

3 1 1

3 2 3

3 3 6

3 4 10

4 1 1

4 2 4

4 3 10

4 4 20

16 rows selected.

June 15th, 2012 at 10:06 am

Here is another possible solution that uses SYS_CONNECT_BY_PATH to create formulas and XMLQUERY to calculate the value of those formulas to generate the factorial values.

First, the formulas are built:

SELECT

ROWNUM N,

FF

FROM

(SELECT

’1′||SYS_CONNECT_BY_PATH(ROWNUM,’ * ‘) FF

FROM

DUAL

CONNECT BY

LEVEL<=8

ORDER BY 1);

N FF

- ———————————

1 1 * 1

2 1 * 1 * 2

3 1 * 1 * 2 * 3

4 1 * 1 * 2 * 3 * 4

5 1 * 1 * 2 * 3 * 4 * 5

6 1 * 1 * 2 * 3 * 4 * 5 * 6

7 1 * 1 * 2 * 3 * 4 * 5 * 6 * 7

8 1 * 1 * 2 * 3 * 4 * 5 * 6 * 7 * 8

As shown above, I am only planning to calculate through 8!, but that will later be changed to 20!.

Next, the value of the formulas are calculated:

SELECT

ROWNUM N,

XMLQUERY((FF) RETURNING CONTENT).GETNUMBERVAL() FACTORIAL

FROM

(SELECT

’1′||SYS_CONNECT_BY_PATH(ROWNUM,’ * ‘) FF

FROM

DUAL

CONNECT BY

LEVEL<=8

ORDER BY 1);

N FACTORIAL

- ———-

1 1

2 2

3 6

4 24

5 120

6 720

7 5040

8 40320

The above SQL statement does not calculate 0!, so that problem will be handled later by a DECODE statement.

The following will be used to derive the i and j values for an 8 x 8 matrix:

SELECT

MOD(ROWNUM-1,8)+1 I,

TRUNC((ROWNUM-1)/8)+1 J

FROM

DUAL

CONNECT BY

LEVEL <= 8 * 8;

The final query uses two WITH block subqueries, and scalar subqueries to pull the correct factorial value for the calculation:

WITH

FACTORIAL AS

(SELECT /*+ MATERIALIZE */

ROWNUM N,

XMLQUERY((FF) RETURNING CONTENT).GETNUMBERVAL() FACTORIAL

FROM

(SELECT

’1′||SYS_CONNECT_BY_PATH(ROWNUM,’ * ‘) FF

FROM

DUAL

CONNECT BY

LEVEL<=20

ORDER BY 1)),

NUMBERS AS

(SELECT

MOD(ROWNUM-1,8)+1 I,

TRUNC((ROWNUM-1)/8)+1 J

FROM

DUAL

CONNECT BY

LEVEL <= 8 * 8)

SELECT

N.I,

N.J,

DECODE((N.I + N.J – 2),0,1,

(SELECT

F.FACTORIAL

FROM

FACTORIAL F

WHERE

F.N=(N.I + N.J – 2))) /

DECODE((N.I – 1),0,1,

(SELECT

F.FACTORIAL

FROM

FACTORIAL F

WHERE

F.N=(N.I – 1))) /

DECODE((N.J – 1),0,1,

(SELECT

F.FACTORIAL

FROM

FACTORIAL F

WHERE

F.N=(N.J – 1))) V

FROM

NUMBERS N;

I J V

– — —–

1 1 1

2 1 1

3 1 1

4 1 1

5 1 1

6 1 1

7 1 1

8 1 1

1 2 1

2 2 2

3 2 3

4 2 4

5 2 5

6 2 6

7 2 7

8 2 8

1 3 1

2 3 3

3 3 6

4 3 10

5 3 15

6 3 21

7 3 28

8 3 36

1 4 1

2 4 4

3 4 10

4 4 20

5 4 35

6 4 56

7 4 84

8 4 120

1 5 1

2 5 5

3 5 15

4 5 35

5 5 70

6 5 126

7 5 210

8 5 330

1 6 1

2 6 6

3 6 21

4 6 56

5 6 126

6 6 252

7 6 462

8 6 792

1 7 1

2 7 7

3 7 28

4 7 84

5 7 210

6 7 462

7 7 924

8 7 1716

1 8 1

2 8 8

3 8 36

4 8 120

5 8 330

6 8 792

7 8 1716

8 8 3432

June 16th, 2012 at 6:46 pm

with pascal( i, j, v, n, pv )

as ( select 1, 1, 1, 8, mdsys.sdo_numtab( 1 )

from dual

union all

select case when i = n then 1 else i + 1 end

, case when i = n then j + 1 else j end

, case when i = n or j = 1 then 1 else v + ( select column_value from table( pv ) where rownum < 2 ) end

, n

, case when cardinality( pv ) = n then pv multiset except ( select mdsys.sdo_numtab( column_value ) from table( pv ) where rownum < 2 ) else pv end multiset union mdsys.sdo_numtab( case when i = n or j = 1 then 1 else v + ( select column_value from table( pv ) where rownum < 2 ) end )

from pascal

where i < n or j < n

)

select i, j, v

from pascal

June 16th, 2012 at 7:35 pm

[...] Comments anton on New SQL PuzzleCharles Hooper on New SQL PuzzleKim Berg Hansen on New SQL [...]

June 20th, 2012 at 6:56 am

[...] been impressed with the solutions to my little problem of generating a symmetric Pascal matrix using SQL. Charles Hooper in particular has provided [...]

August 3rd, 2013 at 11:42 am

Impressive SQL puzzle, I think this will be what I will do over the weekend.