Superscript

…back to Change Management.

In my earlier post, I talked about the strong change management ethic at my current company.  It was interesting to see how my predecessor here adapted database schema evolution to that ethic.

I need to add here that I find “ancient” software development organizational ideas like the “chief programmer” or “surgical” team fascinating, in particular because each model describes “super” programmers who end up responsible for 80-90% of the code anyway through sheer talent and dominant personality.  Almost everywhere I’ve gone has had a local expert who seemed to qualify for that kind of title.  The relevance of ideas from Mills and Brooks is commented on here.

The biggest thing I’ve taken away from those ideas of “super” programmers is that, as a manager, I need to help every member of my staff become as close to “super” as possible.  That means removing as many non-productive, non-programming, non-learning tasks as possible.

Another way to put this: find ways to reduce the effort required for non-programming tasks.

This is where the DDL generation scripts came into the picture.

When I arrived, every programmer hand-coded the scripts which were checked into the source code control system.  Everyone had to remember to include the source code control tags, and script names weren’t obvious.  Also, it wasn’t always clear that every changed object and/or procedure was being captured with every release.

To address those issues, I:

  1. Required that all code checked into the source code control system be generated via the DDL generation scripts.  We modified the scripts to include the source code control tags automatically, and to include things like primary key indexes, grants and initial stats analysis in table creation scripts.  We also injected source code control tags into the stored procedure and package code so that they could be seen when browsing the database.
  2. Modified the scripts to include the object type, schema name and object name in the generated file name.  So, a script that created the SCOTT.EMP table ended up being called mk-tbl-scott-emp.sql (we used mk for create, rm for drop and ch for alter; there’s a sketch of the idea just after this list).
  3. Turned on DDL auditing for all objects in the development database.  This allowed us to double-check and make sure we captured and verified which objects were changed leading up to a release.
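To give a flavor of the approach, here’s a minimal sketch of one of the generation scripts, assuming a 9i+ database where DBMS_METADATA is available (our real scripts were homegrown dictionary queries, and $Header$ is just a stand-in for whatever keyword your source code control system expands):

SET LONG 100000 PAGESIZE 0 LINESIZE 200 TRIMSPOOL ON FEEDBACK OFF
SPOOL mk-tbl-scott-emp.sql
PROMPT REM $Header$
SELECT dbms_metadata.get_ddl('TABLE', 'EMP', 'SCOTT') FROM dual;
REM The dependent pieces that hand-coded scripts tend to forget.
REM (Each get_dependent_ddl call raises an error if there is nothing
REM of that type to show, so a real script would trap that.)
SELECT dbms_metadata.get_dependent_ddl('INDEX', 'EMP', 'SCOTT') FROM dual;
SELECT dbms_metadata.get_dependent_ddl('OBJECT_GRANT', 'EMP', 'SCOTT') FROM dual;
SPOOL OFF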

A note on DDL auditing: I try to have this turned on in every database I work with.  Development, Test, QA, Production, you name it.  I think it’s always useful to have a log of structure changes to the database, and why code up a special trigger when the database will track it automatically?  The only objection I’ve heard to this is space, and the space required is ridiculously small.  Sometimes people complain about performance, to which I say that if you have a system running thousands of DDL statements per day, you’ve got a system that needs work.  Who does DDL as part of an OLTP transaction?
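For the record, turning it on is a one-time setup along these lines; this is a sketch, and it assumes audit_trail=db is already set in the init.ora (which does require a bounce):

REM Statement auditing for the usual DDL suspects.
REM (AUDIT TABLE covers CREATE, DROP and TRUNCATE TABLE, and so on.)
AUDIT TABLE;
AUDIT INDEX;
AUDIT PROCEDURE;
AUDIT SEQUENCE;
AUDIT TRIGGER;

REM Leading up to a release, see what actually changed:
SELECT username, owner, obj_name, action_name, timestamp
FROM dba_audit_trail
WHERE action_name LIKE 'CREATE%'
   OR action_name LIKE 'ALTER%'
   OR action_name LIKE 'DROP%';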

Our biggest remaining challenge with this approach was how to create the “master” installation “super” scripts.  I’ll cover that in a later post…

(One other concern is the sheer number of files this approach can generate.  If Oracle did this for the SYS schema, you’d have over 4,000 individual files.  On the other hand, this works well when you only need to re-create a single object without searching through 4-5 cat scripts to find the object definition.  Other systems use object definition files which are then read by specialized programs to generate installation scripts; that seems like a lot of work for an internal corporate application.)

Change Management

I know, I’m staring at a box with the word Pandora written all over it.

Over the years, I’ve worked on many ways to let database schema evolution and software change management co-exist peacefully.

During the early years, I was happy if the init.ora file was under SCCS control.  I re-read the original OFA document and couldn’t find any reference to pfile standards, but I recall that we thought it would be neat if changes to the init.ora file were tracked and kept under version control with history. 

James Morle has an excellent section in his book, Scaling Oracle8i, which discusses the init.ora file and mentions SCCS here.

This is one of the reasons I don’t use binary spfiles nowadays…

Anyway, back then we didn’t worry so much about schema evolution and change management.  Who would’ve thought that these systems would last long enough to have such a requirement? :-)

I was introduced more formally to database source code control by John Beresniewicz, whom I met during my time as CTO at Savant Corporation.  John was the primary developer of an application (Savant Q) which had a repository and several robust PL/SQL packages.  It was one of the first retail applications I encountered which made heavy use of PL/SQL.  As such, John had everything under source code control using MS SourceSafe.  Since the company was really small, I ended up chipping in on the installation code.

As with many retail applications, the focus of the installation code was on “clean” installations instead of upgrades.  So, our source code control reflected that with heavy control on the creation commands, and specialized files for delta upgrades.  Packaging usually meant making an image of the entire source tree for burning to CDs.

Source code control for packaged software differs somewhat significantly from that for custom internal corporate applications.  Heck, source code control at many corporations can be a hit-and-miss affair for all kinds of code, let alone database schema evolution.

My current company has a very strong source code control ethic, which has led me to work on ways to manage schema evolution in line with that sense of responsibility.

The goals differ somewhat from the ones we used on packaged software.  “Releases” occur frequently, about every 2 months, and the emphasis is on “upgrades”, not installations from scratch.  (Every once in a while, I get asked whether we could create the corporate databases using only the scripts from source code control.  While I’d like to support that, the real need for such a capability is low.  The idea of taking 5-7 year old scripts to create an empty database and re-importing 2-3 TB of data seems silly to me when it would be much, much more practical to practice backups and restorations.)

I intended this topic to be the subject of my paper at Hotsos 2007, but I ended up not being able to attend the conference, and with that the paper never got written. :-)

Two recent blog posts (here and here) by Dominic Brooks got me thinking about this topic again, so I’ll be posting on it over the next few days.  Perhaps with your feedback, I’ll eventually coalesce these posts into a decent paper on the subject.

Whither catproc?

I had an interesting conversation yesterday with one of my folks who was attempting to install APEX into an Oracle 9i database (you know who you are). :-)

One thing I bet Oracle is starting to have to deal with is the fact that a lot of existing databases began life as 7.3 or 8.1 systems and have been upgraded and patched up into current versions, with the end result that they’re probably missing a few things.  We were among many who encountered the issue of upgrading a database that “was originally” created as 32-bit into a 64-bit 10.2.0.3 version.  I gather that the outcry over that was pretty loud.

Anyway, my friend’s installation of APEX into 9i ran into a problem: DBMS_XMLDOM was missing.  He came to me for help after scouring Metalink and the Internet to little avail.
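(For anyone hitting the same symptom, a couple of data dictionary queries along these lines will confirm whether the package is merely invalid or truly absent; DBA_REGISTRY is available from 9.2 on.)

REM Is the package there at all, and in what state?
SELECT owner, object_name, object_type, status
FROM dba_objects
WHERE object_name = 'DBMS_XMLDOM';

REM Was the XML DB component ever installed?
SELECT comp_id, comp_name, version, status
FROM dba_registry
WHERE comp_id = 'XDB';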

So I got to show off my spelunking skills by digging through the scripts in rdbms/admin.

A long time ago, we all used to create databases from scratch using scripts, so we got used to running catalog.sql and then catproc.sql (to install the PL/SQL option), and then other scripts like catrep.sql (to install the replication option).  I used to tell new Oracle consultants that after attending classes and reading manuals, the best way to learn about the database was to read through sql.bsq and catalog.sql.
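For those who never had the pleasure, the ritual looked something like this from a sysdba SQL*Plus session (the ? expands to ORACLE_HOME):

CONNECT / AS SYSDBA
REM The data dictionary views and synonyms:
@?/rdbms/admin/catalog.sql
REM The PL/SQL ("procedural") option:
@?/rdbms/admin/catproc.sql
REM The replication option:
@?/rdbms/admin/catrep.sql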

I started off by grep-ing for DBMS_XMLDOM in all of the .sql files in rdbms/admin, just to make sure we had some file somewhere which would re-create it if we needed it.  Found it in dbmsxmld.sql (a pretty obvious name for it).  I then grep-ed for dbmsxmld.sql to see what called it and came up with catxdbst.sql.  The file names made sense to me, as the dbms file was the base package create and the cat file was the caller — usually catalog.sql and catproc.sql called other cat files to install their stuff.

I kept going just to see where I’d end up and grep-ed for catxdbst.sql and found it called from catqm.sql.  Still no catalog.sql or catproc.sql, but I’m nothing if not persistent, so I kept going and found this in catproc.sql:

Rem XDB Schema creation
Rem Comment this out until catqm.sql is finalized.
Rem @@catqm.sql

I guess it never got finalized :-)

At this point it was pretty clear that we had the necessary code, it was probably just a matter of properly installing it.  The “cowboy” in me wanted to fire up sqlplus as sysdba and just run catqm.sql — but since other people actually use this database, I suggested we open a TAR (ooops!  I mean an SR), to have support walk us through the steps.

Later, out of curiosity, I wandered through the Oracle 9i doc set to see how to manually install XDB / XML DB.  A search on DBMS_XMLDOM listed the XML Developer’s Guide with 16 matches, so I figured it was in there somewhere.  What’s interesting to me is that the installation instructions are in Appendix A.  26 chapters on how to use it, but the installation is an afterthought.  I know, it’s automatically installed, so no need to have that up front.  But there was enough complexity in the installation section that I’m glad I suggested we open the TAR on the installation.

Dressed to the nines

My good friend Moans (who posts to his blog about as often as I do) recently wrote up his current opinion on high availability here.  His post resonated with me, as I’ve had a fair amount of experience with system availability over the years.  While ruminating on his post, I dug up some excellent work on the subject from HP (one of the biggest proponents of 5nines during the First Internet Gilded Age).  This presentation by William Sawyer is particularly good, as is this paper by Tzvi Chumash.

I’m particularly proud of the reliability of the system I’m currently a part of.  From a good design, to talented engineers, to very good operational folks, we have a system that consistently achieves high availability.  One thing that I like about the system is that it isn’t very complex.  And it’s especially simple in its redundancy features.

I have to agree with most of the authors above.  Designing, building and maintaining high-availability features can be very costly and error-prone in and of themselves.  A lot of ideas look good on paper, and sound good coming from the vendor, but in the end, your own people have to own and maintain them.  And if they can’t, for whatever reason, you’ve just wasted a ton of time and money.

This doesn’t mean that you don’t need higher levels of availability — it just means that you need to make sure you capture all of the costs.  That includes training and drills for the people responsible for the system.

High availability is achievable, but a realistic assessment of what it takes to get there, and to stay there, makes the job much easier.

(I also found this article interesting)

CREATE ROLE DATABASE_ENGINEER;

Doug Burns has written an article here that talks about what he is calling a Development DBA.  Further commentary here by Tom Kyte.  I’ve talked about this role in the past (including a paper on the subject at the 2006 Hotsos conference). 

Doug makes some excellent points in his article, describing the triangular relationship between application developers, production DBAs and the development DBAs.  In my paper, I suggest that one of the main challenges with defining the necessary role is the ambiguity of the title.  Tom often calls the role a development DBA or DBA developer.  Personally I like the term Database Engineer, as I think that developers responsible for the databases during development should exercise engineering-based skills.

Doug observes that many of the more experienced DBAs tend to be Production DBAs — and that’s a shame.  I like to say that the skill sets of Database Engineers and DBAs overlap — but that Production DBAs focus on High Availability, Backup / Recovery and Space / System Management, while Database Engineers focus on Schema Design / Development and Database Coding.  Experts are required in both areas.
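In the spirit of the title above, here’s a purely illustrative sketch of how that split might look as actual grants; the privilege list is hypothetical, not a recommendation:

CREATE ROLE database_engineer;
REM Schema design, development and database coding:
GRANT CREATE TABLE, CREATE VIEW, CREATE SEQUENCE, CREATE SYNONYM,
CREATE PROCEDURE, CREATE TRIGGER, CREATE TYPE TO database_engineer;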

About the only thing I disagree with in Doug’s article is the discussion about database ownership and responsibility.  In my opinion, a database engineer should retain ownership and responsibility for the availability and performance of the development database.  That doesn’t mean you prevent developers from accessing it, but it does mean that you have the responsibility to make sure that developers don’t shoot themselves in the foot or prevent other developers from getting their work done.  Doug is right about being helpful — try to understand what the developer is trying to do and work together to figure out a solution — don’t be a “scowling brick wall” :-)  Take pride in the ownership and be part of the solution.