The proof is out there

I’d like to thank Jonathan Lewis for taking up my challenge and writing up a proof that there is indeed a formula that solves Cary Millsap‘s string problem for not just circles, but for any regular polygon.

Like Jonathan, I found the problem intriguing, and “wasted” a few hours on a Saturday afternoon discovering the formula.

This blog entry isn’t about the formula or proof — it’s rather about the process I used to discover it.

You see, I didn’t know that such a formula actually existed before I set out — but I was convinced that there might be. After all, it would be neat if there was one, wouldn’t it?

This is a common way that I approach problems, especially in the Oracle programming and tuning world — usually I assume the existence of a solution and then I go about trying to find it. One of the things I’m constantly surprised at is how many people “give up” on these kinds of problems, start guessing, guess wrong, and then decide that such a solution either doesn’t exist, or that they’re not smart enough to find it. This post is an attempt to debunk the idea that finding such solutions requires any special smarts — that today there are so many ways to learn and discover that anyone can do it. It just takes perseverance and a bit of dedication.

I started out on my quest to find the formula for regular polygons by searching Wikipedia for “Regular Polygon”

http://en.wikipedia.org/wiki/Regular_polygon

About half-way down the page, I found the following image:

The little letter a looked exactly like what I was looking for — something that resembled the radius of a circle. Before reading this article I had no idea was it was called — from this article I learned it was called the apothem.

From there, I clicked on the link to the definition of the apothem, trying to see if there was a way to calculate its length somehow and relate it to the perimeter of the polygon.

http://en.wikipedia.org/wiki/Apothem

In the article, a way to calculate the apothem based on the length of the sides and the number of them is presented:

a = s / (2 tan (pi / n))

with the comment that the formula can be used when only the perimeter (p) and the number of sides (n) is known because s = p / n.

From here, the rest becomes relatively simple, as I substituted s = p / n for s in the formula for the apothem.

a = (p / n) / (2 tan (pi / n))

Given a new apothem of length a’ (where a’ = a + h); we can work through the formulas to determine the difference between our new perimeter p’ and the original perimeter p.

We start by isolating the perimeter:

a = p / (2n tan (pi/n))

2an tan (pi/n) = p

or

p = 2an tan(pi/n)

Looking at our new perimeter (p’), using the new apothem (a’):

p’ = 2a’n tan(pi/n)

And we want p’ – p (the difference between the perimeters)

p’ – p = 2a’n tan(pi/n) – 2an tan(pi/n)

Factor out the 2n tan (pi/n) which is common to both terms on the right:

p’ – p = (2n tan(pi/n)) (a’ – a)

And since we know that a’ simply equals a + h (the “height” we’re raising the string):

p’ – p = (2n tan(pi/n))(a + h – a)

(This was my aha! moment :-), sorry, bad joke)

p’ – p = (2n tan(pi/n))(h)

or

p’ – p = 2nh tan(pi/n)

Voila, a solution that only requires the number of sides and the “height”.

I was satisfied that this solution matched Cary’s just by simply trying out the formula with 1,000-sided polygon and seeing how it matched — Jonathan went beyond and showed how the approximation of the tan function using a Taylor series (actually a MacLaurin series) gets you to 2hpi.

Why all this focus on a simple problem?

Mainly because it’s been interesting to see how people approach solving it and — in my mind, more interesting — how they attempt to generalize it in a way that makes it really applicable.

When I’m confronted with a particularly challenging Oracle problem, I generally assume there’s a pretty easy way out — I can’t be the first person to ever have encountered such a problem, and the database developers have probably thought of it too, so I usually start with the “Big 3″ manuals:

The Concepts Guide, the SQL Reference Guide, and the PL/SQL Packages and Types Reference.

  • Want to tokenize a string? Check out CTX_DOC.TOKENS
  • Want to know the “distance” between 2 strings? Check out UTL_MATCH — complete with Levenshtein and Jaro-Winkler differences
  • Want to do linear algebra? Check out UTL_NLA
  • Want to “smash” an XML document into something you can select against like a table? Check out XMLTABLE
  • Want to implement a k-means clustering algorithm? Check out DBMS_DATA_MINING

Don’t re-invent the wheel, and don’t assume that the problem is unsolvable — while we might not always stand on the shoulders of giants, we should start by assuming our problem isn’t unique, and that the proof or truth is out there…

RMOUG 12 — Rocky Mountain Training Days

Well, I’ve just returned from a fabulous trip to the Rocky Mountain Oracle User’s Group (RMOUG) training days conference in Denver, Colorado.

It’s been a long time since I’ve attended this conference, but in the fall when I started to plan out which conferences I would begin re-attending in 2012 I gave this one a solid look. Often it’s hard to choose among all of the good conferences, but I thought I’d limit myself to the top 4-5 conferences based on my interests, and this one made the cut.

The main reason I wanted to attend is that I’m targeting conferences that try to bring many groups together instead of being a giant echo chamber for a homogeneous group of people. Some regional Oracle conferences start to seem like DBA-clubs at which the same topics are re-hashed again and again (the basic ones talk about space management and backups, while the advanced ones talk about block dumps and index leaf node management).

When I looked at the agendas for past RMOUG conferences, I was pleasantly surprised to see a wider range of topics that included DBA topics AND developer topics — and it’s this intersection that has me so keenly interested lately. Much of what I’m seeing in the data world is increasing conflict between DBAs and developers that is resulting in poor architectural and process decisions all-around. I’m hoping that conferences like RMOUG can be a venue for facilitating discussions between the camps.

The conference itself was run like a clock — but in an interesting way. It’s the most laid-back well-organized event I’ve ever been to. You know how some conferences are scheduled to the hilt with barely enough time to sprint from session to session and event to event? RMOUG isn’t like that — there’s a quiet competence about how its run so that you don’t miss a thing, and you don’t feel rushed or stressed about it. If you know the people involved, you probably understand why — Tim Gorman, Kellyn Pot’vin, John and Peggy King all exemplify that quite competence and energy that makes sure stuff gets done without a big deal being made of it.

Anyway, I went to some great sessions by Kris Rice (on the Oracle Database Cloud), Graham Wood (on under-utilized Database features), George Thrower (on Ruby!), and Jean-Pierre Dijcks (on the Oracle Big Data Appliance). I also caught up with lots of my OakTable friends.

In Kris’ session I learned a lot about the differences between the Oracle Database Cloud and the Amazon Relational Database Service (RDS) — basically Amazon RDS presents you with an instance and the ability to connect just like any other instance through a machine and port, while the Oracle Database Cloud is more like a schema with more limited access and extreme security (the Oracle Cloud lacks the Amazon security controls around IP address access, so it settles for more constraints around access). Also, file and object deployment is heavily virus scanned on the Oracle Database Cloud, leading to multi-step deployments — Kris showed us using the Deployment Cart functionality of SQL Developer. Also, a lot of emphasis on using RESTful interfaces to the Cloud (with a quick side note on the ability of the APEX listener to expose database capabilities as RESTful interfaces WITHOUT requiring a full APEX installation!).

In Jean-Pierre’s presentation I learned a lot about the connectors from the Big Data Appliance to an Oracle database, including specialized Hadoop finalization jobs which take reduce output and format it specifically for insertion into an Oracle database, along with an optimized hdfs client that will enable using hdfs content as external tables. JP described tests that were able to move 15TB/hr from hdfs on a BDA into an Exadata machine over Infiniband.

My own presentation on writing MapReduce in SQL and PL/SQL was rather sparsely attended, but that’s ok — I understand that presentations on the edge between DBAs and Developers aren’t always the most popular. I think it went well and my audience got some good stuff out of it. It was my first run of this presentation and I’ve taken away some good ideas on how to improve it.

In any event, from Cary Millsap’s keynote (in which I loved the difference between memorizing facts and understanding concepts as a way to “master” material) to Kerry Osborne’s presentation on DIY Exadata, I had an excellent time and will be sure to try and attend next year.

Between the location (Denver and the mountains), venue (Colorado Convention Center), and the excellent content I’d highly recommend putting this one on your list.

I wish I could go inside and see the RMOUG sessions!

To be precise, we’re mistaken

“Bother!  We were mistaken!”
“To be precise: we’re a mistake.”
- Thomson and Thompson, Cigars of the Pharaoh

This post belongs to that category of dusty old bits of knowledge that lay deep in your head — stuff you never really thought would be useful, but you learned it anyway and it’s there, waiting for its turn in the sun.

So a friend of mine comes up to me and says, “What do you know about Excel?”.  I say, “a little”.  And she asked me to come look at a problem she’s having.  She’s got a spreadsheet that she’s using to build a data entry form on (I know), and she’s got a cell where someone can put in an Account Number. 

“Watch”, she says.  And she types in 123456789123456789 and hits enter.

Excel immediately changes the cell contents to 123456789123456000, replacing the final 3 digits (789) with 3 zeros.

“Why is it doing that?’, she asks.

I make one change to the cell, and ask her to try it again.  This time is works without replacing the final 3 digits.

I give you a 4 question quiz:

  1. What did I do?
  2. Why did I do it?
  3. What about Excel was causing it to do what it did?
  4. Why would this knowledge be useful to me at all when dealing with databases?
Posted in Skills. 2 Comments »

Finding new features

Links to 2 things here.  I recently had an opportunity to throw an idea into a debate/quiz going on at Howard Rogers’ site.  I always find Howard’s quizzes to be provocative in that the debate is more interesting than either the question or the answer.  My interpretation of Howard’s question was: “Is there a single metric you can get from an Oracle database that will tell you the general well-being of the server hosting the database?” — no O/S tools allowed, SQL only, with a preference for a point-in-time number not requiring sampling.

Another way to say this is “a quick-and-dirty” answer :-)

Clearly, responsible adults are debating whether or not it’s a good idea to purport to hold up a single, non-sampled metric as the only thing you’ll ever need.  It’s not a good idea.  But if someone only allows you one command, what would you run?

Since Howard asked for the status of the server (not just the database), we need something that looks at the O/S from within Oracle.  We know that Oracle can do that — where do you think it gets sysdate from anyway? :-)

So, how do you find out if Oracle exposes monitoring statistics from the O/S?  I just read the reference manual, with an eye on the V$ views.  Gee, V$OSSTAT looks interesting, doesn’t it?  Oh well — how about:

select value from v$osstat where stat_name = ‘LOAD’;

Simple, and a reasonable thing to look at — as long as you know how many CPUs are on the box (and cpu_count is probably good enough for that).

Most people hadn’t heard of the v$osstat view… It’s “new”.  Which brings me to an excellent podcast from Tom Kyte – in it he talks about “new features”.  He also talks about how people look for the “best” new feature.  Which always depends on their specific needs.  :-)

How do I find new features?  I re-read the Concepts Guide and I always, always re-read the Reference guide (specifically the V$ views).  Here’s the thing, any cool new feature generally has a way to be monitored.  And if it doesn’t then I don’t want to use it :-)  So I can usually reverse back to a new feature by understanding the ways the Oracle engineers have exposed ways to monitor it.

(There are many exceptions to this — especially in the ways that Oracle extends SQL and PL/SQL — for those I look at the SQL Guide with special attention to the functions section, and also the PL/SQL Guide with special attention to the table of contents, and finally the Supplied Packages Guide — lotsa goodies in there).

Yesterday’s Papers

Who wants yesterday’s papers?  Nobody in the world. — Mick Jagger, 1967.

One thing I’ve heard from a lot of DBAs is that their daily work consists continually of solving the same sets of problems over and over again.  Some people are satisfied with this, and others even spend time teaching others how to solve the same problem over and over again too.  I’ve even seen managers encourage this behavior — “let’s share the knowledge on how to do that”.

A big concern I have with this is that problem solving skills can atrophy if they’re not used, which starts to lead to laziness and guessing when confronted with new problems.  It also appears to lead to “the search for an expert” — someone for whom the new problem is “yesterday’s news”.  I think some of this has led to the humorous BAAG (Battle Against Any Guess) party started by Alex Gorbachev.

To me, real creative value consists of eliminating not only the current problem, but also the recurrence of the problem.  Devising a way of having the problem solved automatically, or better yet, preventing it from occurring again shows real creative talent.  Moving beyond that to imagining future problems, solutions and preventative processes is even better.

Sometimes I wish there was a way to teach, quantify or encourage imagination and creativity.  I know, I’m sure there are thousands of experts in “mind-mapping” and other “creative techniques”.  For me, imagination comes when I’m able to clear my mind and not be confronted with continuous minutiae.  I’m thinking about designing an “Imagination Helmet” — something that allows you to block out distractions.  Heck, I bet I can sell them for $20 a pop on HSN.  Maybe I’ll even build an enhanced version with chromatherapy and soothing new age music :-)

Imagination.  Try it when confronted with your next problem.

When thinking about this post, the Rolling Stones song from above kept running through my mind — my father used to sing it around the house when I was a kid.  He’s a veritable expert on music from the 60′s, having won contests in ’62 and ’63 that the local radio station used to run on predicting top 10 lists.  He was amused when I asked him the name of the song that he used to sing and told him why I wanted to know. :-)

One limitation I’ve found about the Internet today is the paucity of pop-culture history — it’s like the world didn’t exist prior to 2000….

Posted in Skills. 1 Comment »