NEOOUG Ohio, ODTUG Kscope12 and something unique from Redgate

If you’ve known me for a long time, and let’s face it, if you’re visiting this blog, you probably do, you know that I’m particularly proud of the work I was able to do at Network Solutions as the Director of Database Engineering and Development.

I had the privilege of working with some of the best software development colleagues I’ve ever encountered, including Pete Fox (Vice President of Engineering), Brian Taylor (Release Management), Eric Faulkner (Product/Portfolio Owner), Mike Cocozza (Application Tier) and Mona Nahavandi (Presentation Tier), Sujata Nakhre (Billing) and Michelle Lee (CRM), as well as a host of other professionals in QA and Operations to bring the Network Solutions website and IT applications to life.

I didn’t know it at the time (2002-2007) because it wasn’t well known yet, but we practiced what I now know to be proto-Agile and proto-Helsinki development of a complex system that was timeboxed and data-centric. We regularly released features in three-month cycles based on release themes made up of capabilities and features which were managed and prioritized by product owners.

Recently I had the privilege of presenting at the Northeast Ohio Oracle User’s Group training day conference in Cleveland, Ohio and during the sessions I actually started to hear the stirrings of the Oracle community becoming more aware of Agile and Helsinki as ways to improve our ability to deliver quality Oracle-based software systems in a more rapid manner.

BTW, I couldn’t have been more impressed with the NEOOUG training day conference — I was highly surprised given that I had assumed attendees would be limited to North East Ohio, and that the conference itself would reflect a minimal ability to host a larger event.

I was wrong.

Held at Cleveland State University, I estimated the crowd at about 100-150 participants in state-of-the-art venues including university classrooms and ballroom spaces. Excellent wifi coverage was the icing on the cake.

I thought the quality of the presentations was pretty good as well — and I’ll definitely be returning in the future, as well as encouraging folks from the Midwest and Great Lakes regions to attend. The folks at NEOOUG told me they’re aiming to be as good as the RMOUG Training Days conference and I’d say they’re clearly on the right track.

As an aside, I think it’s really a good idea for Oracle Technology conferences to be more closely identified with academia through the use of university / college facilities as opposed to the Applications-focused conferences which are more of a vendor event — I think it more clearly aligns the content with the audience and venue.

I gave 2 presentations at the conference: Implementing MapReduce with SQL and PL/SQL; and An Examination of Schema Architecture Patterns for Edition-Based Redefinition.

I’m particularly proud of the MapReduce presentation — it’s evolved nicely since I gave it at RMOUG and I’ll be reprising it at ODTUG KScope12 in June as well, but I was really pleasantly surprised at how well the EBR presentation went.

It’s probably because any discussion of EBR naturally aligns itself with how to use it to support a development process, and so I got to discuss some of my methods for doing database development:

1. Agile-style timeboxed deliveries
2. Helsinki-style database APIs through stored procedures
3. Describing the database as a “data server” accessed via an API (lately I prefer a RESTful API exposed via the APEX listener), instead of a “database server” with tables mapped into client-side ORMs
4. Versioning of database objects — historically by appending version numbers to programmatic objects (packages and procedures), now by using Edition-Based Redefinition
5. Extracting definitions of database objects to the file-system for source-code control check-in
6. DDL auditing everywhere — Development, QA, Production — to track all database structural changes

#4, #5 and #6 have been areas of keen interest to me for over 7 years now — and I’m happy to say that tools and vendors have really been responsive to my requests for improvements. I can’t say enough good things about Kris Rice and Bryn Llewellyn from Oracle, who listen to my requests for features in SQL Developer and how I might use EBR — I’ve seen changes that make SQL Developer get better and better at #4 and #5 in particular – now you can generate individual scripts per object and use the new Cart feature for deployment.

Today I want to talk about another vendor looking to provide capability in #4 and #5 in a unique way: through a live opportunity to engage the Oracle developer community in a rapid feedback loop on features at the upcoming ODTUG KScope12 conference in San Antonio, TX on June 24-28, 2012.

Redgate Software has done a bunch of work in the SQL Server space for years and more recently has upped their game in the Oracle tools world with their Schema Compare, Data Compare and Schema Doc products, as well as hosting the AllThingsOracle.com website. And at KScope they’re looking to really interact with the development community on building Source Code Control for Oracle in a live, agile fashion.

Instead of a simple product demonstration booth, Redgate will be using their vendor space as a mini-development shop, soliciting features from conference attendees and using an agile development process to actualize those features into a prototype product for Oracle Source Code Control.

Needless to say, I’m excited about it — I have definite ideas on what works and what doesn’t work for Oracle source code control and I can’t wait to try and get those things implemented while seeing what other developers and DBAs also want.

If database development is at all important to you, this sounds like a great opportunity to get your ideas out there.

Death of the Enterprise Architect?

Last week I read an interesting article about how cloud computing is changing the role of the enterprise architect and it got me thinking about the bad rap many architects are getting in the brave new agile, cloud, big data world.

From what I’ve been reading, there’s been a bit of a straw man argument going on — enterprise architects are often described as uber-control freaks who attempt to dictate software architectures in a repressive way to implementation teams. Mention the term “reference architecture” and you’ll often raise the hackles of the new developer-led world.

To be sure, there are many enterprise architects who match that description. And they’re the ones who give architects a bad name, just like undisciplined developers can give agile a bad rap too.

I tend to agree with the article in many respects — the idea that software architectures can be fully contained and described in a prescriptive way that potentially limits or increases the cost of software that adds value to the business is something that really isn’t good. To me, it almost always comes back to the value question — does the architect add net positive business value?

One of the best roles I’ve seen for enterprise architects is in attention to technical debt — measuring the growth of it, assisting agile teams in ways to address emerging technical debt, and generally ensuring that the business isn’t accumulating potentially crippling amounts of it. To me, the companies which have these kinds of architects never have to “stop all development” in order to do vendor-software upgrades, re-platforming, or huge re-write / re-factoring efforts.

Again, I don’t think this is about the enterprise architect defining an architecture and enforcing it — it’s a lot more about being an architectural scout: staying out in front of the agile development teams to bring back ideas and help them choose concepts which keep the system flexible in the face of change. Not in defining the ultimate generic architecture — we’ve been down that path before with SOA, “message-bus”, integration servers and broker architectures.

I hope you’re lucky enough to work with such an enterprise architect.

Recursive Subqueries for FizzBuzz

So in yesterday’s post I mentioned that I wasn’t happy with my solution. Among several reasons were the limit on the first 100 numbers, and the lack of using what I think to be a more elegant solution with recursive sub-queries.

Also, I received a comment about problems with the MarkDown plugin — which was also affecting my code posting, so I disabled MarkDown and can now handle code posting much, much better.

I couldn’t let it rest, so I dug into the recursive sub-queries with a vengeance and wrote up something more elegant.

with
fb0 (input) as (
select      'buzz,fizz,fizzbuzz,fizz,buzz,fizz,fizz' input
from        dual
),
fb2 (n,pos,n_start,n_str) as 
(
select      /* Anchor query of root nodes for recursion */
            fb1.column_value n, 
            1 pos,
            fb1.column_value n_start,
            to_char(fb1.column_value) n_str
from        fb0,
            table(sys.odcinumberlist(3,5,6,9,10,12,15)) fb1
where       /* Limit root nodes to ones that match the first word of input */
            case regexp_substr(fb0.input,'\w+')
              when 'fizz' then mod(fb1.column_value,3)
              when 'buzz' then mod(fb1.column_value,5)
              when 'fizzbuzz' them mod(fb1.column_value,15)
            end = 0
union all
select      /* Build "child" nodes by walking fizzbuzz matching to input string */
            case 
              when mod(fb2.n,15) in (5,9) then n+1
              when mod(fb2.n,15) in (3,10) then n+2
              when mod(fb2.n,15) in (0,6,12) then n+3
            end,
            fb2.pos + 1,
            fb2.n_start,
            fb2.n_str || ',' ||
            case 
              when mod(fb2.n,15) in (5,9) then n+1
              when mod(fb2.n,15) in (3,10) then n+2
              when mod(fb2.n,15) in (0,6,12) then n+3
            end
from        fb2,
            fb0
where       case 
              when mod(fb2.n,15) in (0,5,6,10) then 'fizz'
              when mod(fb2.n,15) in (3,9) then 'buzz'
              when mod(fb2.n,15) in (12) then 'fizzbuzz'
            end =
            regexp_substr(fb0.input,'\w+',1,fb2.pos + 1)
),
fb3 (input, n_str, n_length, n_start, min_length, min_start) as 
(
select      /* Measure lengths of resulting matches, discard ones that are not long enough */
            fb0.input,
            fb2.n_str,
            fb2.n - fb2.n_start n_length,
            fb2.n_start,
            min(fb2.n - fb2.n_start) over () min_length,
            min(fb2.n_start) over (partition by fb2.n - fb2.n_start) min_start
from        fb2,
            fb0
where       fb2.pos = nvl(length(regexp_replace(fb0.input,'\w')),0)+1
),
fb4 (input, n_str) as 
(
select      /* Retain the matches which are shortest and start at the smallest value */
            fb3.input,
            fb3.n_str
from        fb3
where       fb3.n_length = fb3.min_length
and         fb3.n_start = fb3.min_start
)
select      /* Display output */
            input,
            n_str
from        fb4;

What’s all the buzz about?

During the course of recent twitter / RSS reading, I came across an interesting post related to the old FizzBuzz problem. The post intrigued me for 2 reasons: it talks about solving a basic problem in Scala, and it emphasizes the use of functional programming. And just for grins and chuckles it calls out OOP based on an old paper from 1984 by John Hughes on Why Functional Programming Matters.

I’m a big fan of functional programming, even if my language of choice, SQL, is a poor example of it. Although, I tend to think functional when I write SQL, I know that the language is limited in that respect.

However, when I think about problems that need to be solved in an Oracle database system, I always try to exhaust SQL-based solutions before looking to PL/SQL (I’ve always enjoyed Tom Kyte’s mantra on the subject). It’s one of the reasons that systems I have influence over have very little imperative PL/SQL constructs.

I also strive to write modular, relatively self-explanatory SQL to combat the reputation of SQL as a “difficult” language — all languages are difficult when you are confronted with them :-)

Anyway, the post under discussion describes an inverse FizzBuzz problem with a description of working through various solutions using the function constructs of Scala, and many people have had a go at solving it.

One of my favorites is in F#, Microsoft’s functional language.

Anyway, here’s my attempt at it — I’m not entirely satisfied with it yet, but it was fun to write:

with 
fb0 as (
select       'fizz,buzz' input
from         dual
),
fb1 as (
select       level n,
             case
               when mod(level,3) = 0 and mod(level,5) <> 0 then 'fizz'
               when mod(level,5) = 0 and mod(level,3) <> 0 then 'buzz'
               when mod(level,3) = 0 and mod(level,5) = 0  then 'fizzbuzz'
             end fb,
             lead(level,1) over (order by level) next_n
from         dual
where        mod(level,3) = 0 or mod(level,5) = 0
connect by   level <= 100
),
fb2 as (
select       connect_by_root n n_root,
             level pos,
             fb1.n,
             fb1.fb
from         fb0,
             fb1
where        level <= nvl(length(regexp_replace(fb0.input,'\w'))+1,1)
start with   fb1.fb = regexp_substr(fb0.input,'\w+')
connect by   prior fb1.next_n = fb1.n
),
fb3 as (
select       listagg(fb2.fb,',') within group (order by fb2.pos) fb_str,
             listagg(fb2.n,',') within group (order by fb2.pos) n_str,
             max(fb2.n) - min(fb2.n) n_length,
             min(fb2.n) n_min
from         fb0,
             fb2
group by     fb2.n_root
having       listagg(fb2.fb,',') within group (order by fb2.pos) = fb0.input
),
fb4 as (
select       fb3.n_length,
             fb3.n_min,
             fb3.fb_str,
             fb3.n_str,
             min(fb3.n_length) over () min_length,
             min(fb3.n_min) over (partition by fb3.n_length) min_start
from         fb3
),
fb5 as (
select       fb4.fb_str,
             fb4.n_str
from         fb4
where        fb4.n_length = fb4.min_length
and          fb4.n_min = fb4.min_start
)
select       *
from         fb5;

I'm not entirely happy with it -- I'd like to combine fb2 with fb3, and fb4 with fb5. I'd also like to rewrite it with recursive subqueries instead of the connect by's and listagg's, but it's not too bad.

One of the things I'm starting to like with the WITH construct is how easy it is to debug the overall statement -- I can just replace the final SELECT with whichever subquery I want to test in order to see what results it's "passing" to the next subquery. Go ahead and try it to see!

The proof is out there

I’d like to thank Jonathan Lewis for taking up my challenge and writing up a proof that there is indeed a formula that solves Cary Millsap‘s string problem for not just circles, but for any regular polygon.

Like Jonathan, I found the problem intriguing, and “wasted” a few hours on a Saturday afternoon discovering the formula.

This blog entry isn’t about the formula or proof — it’s rather about the process I used to discover it.

You see, I didn’t know that such a formula actually existed before I set out — but I was convinced that there might be. After all, it would be neat if there was one, wouldn’t it?

This is a common way that I approach problems, especially in the Oracle programming and tuning world — usually I assume the existence of a solution and then I go about trying to find it. One of the things I’m constantly surprised at is how many people “give up” on these kinds of problems, start guessing, guess wrong, and then decide that such a solution either doesn’t exist, or that they’re not smart enough to find it. This post is an attempt to debunk the idea that finding such solutions requires any special smarts — that today there are so many ways to learn and discover that anyone can do it. It just takes perseverance and a bit of dedication.

I started out on my quest to find the formula for regular polygons by searching Wikipedia for “Regular Polygon”

http://en.wikipedia.org/wiki/Regular_polygon

About half-way down the page, I found the following image:

The little letter a looked exactly like what I was looking for — something that resembled the radius of a circle. Before reading this article I had no idea was it was called — from this article I learned it was called the apothem.

From there, I clicked on the link to the definition of the apothem, trying to see if there was a way to calculate its length somehow and relate it to the perimeter of the polygon.

http://en.wikipedia.org/wiki/Apothem

In the article, a way to calculate the apothem based on the length of the sides and the number of them is presented:

a = s / (2 tan (pi / n))

with the comment that the formula can be used when only the perimeter (p) and the number of sides (n) is known because s = p / n.

From here, the rest becomes relatively simple, as I substituted s = p / n for s in the formula for the apothem.

a = (p / n) / (2 tan (pi / n))

Given a new apothem of length a’ (where a’ = a + h); we can work through the formulas to determine the difference between our new perimeter p’ and the original perimeter p.

We start by isolating the perimeter:

a = p / (2n tan (pi/n))

2an tan (pi/n) = p

or

p = 2an tan(pi/n)

Looking at our new perimeter (p’), using the new apothem (a’):

p’ = 2a’n tan(pi/n)

And we want p’ – p (the difference between the perimeters)

p’ – p = 2a’n tan(pi/n) – 2an tan(pi/n)

Factor out the 2n tan (pi/n) which is common to both terms on the right:

p’ – p = (2n tan(pi/n)) (a’ – a)

And since we know that a’ simply equals a + h (the “height” we’re raising the string):

p’ – p = (2n tan(pi/n))(a + h – a)

(This was my aha! moment :-) , sorry, bad joke)

p’ – p = (2n tan(pi/n))(h)

or

p’ – p = 2nh tan(pi/n)

Voila, a solution that only requires the number of sides and the “height”.

I was satisfied that this solution matched Cary’s just by simply trying out the formula with 1,000-sided polygon and seeing how it matched — Jonathan went beyond and showed how the approximation of the tan function using a Taylor series (actually a MacLaurin series) gets you to 2hpi.

Why all this focus on a simple problem?

Mainly because it’s been interesting to see how people approach solving it and — in my mind, more interesting — how they attempt to generalize it in a way that makes it really applicable.

When I’m confronted with a particularly challenging Oracle problem, I generally assume there’s a pretty easy way out — I can’t be the first person to ever have encountered such a problem, and the database developers have probably thought of it too, so I usually start with the “Big 3″ manuals:

The Concepts Guide, the SQL Reference Guide, and the PL/SQL Packages and Types Reference.

  • Want to tokenize a string? Check out CTX_DOC.TOKENS
  • Want to know the “distance” between 2 strings? Check out UTL_MATCH — complete with Levenshtein and Jaro-Winkler differences
  • Want to do linear algebra? Check out UTL_NLA
  • Want to “smash” an XML document into something you can select against like a table? Check out XMLTABLE
  • Want to implement a k-means clustering algorithm? Check out DBMS_DATA_MINING

Don’t re-invent the wheel, and don’t assume that the problem is unsolvable — while we might not always stand on the shoulders of giants, we should start by assuming our problem isn’t unique, and that the proof or truth is out there…