Quick SQLDeveloper Hint — Dial0gInput

The other day I was attempting to debug a SQL statement which was doing a REGEXP_REPLACE and the comments said something like “remove all zero’s from the string”.

However, looking at the code, it seemed to be removing all capital O's from the string instead of 0's — see, even here within WordPress I can't tell the difference between an O (the letter) and a 0 (the number).

At first I thought it a coding mistake until I tried to type both of them into a window within SQL Developer — nope, they looked the same. Must be a font issue.

Fonts can be changed in SQL Developer under the Preferences / Code Editor / Fonts selection — you may want to limit choices to fixed-width fonts by checking “Display Only Fixed-Width Fonts”. The default on my system was DialogInput — which appears to have the O/0 problem. I ended up choosing Consolas 12pt and exiting back out to the editor.

Only to see a horribly grainy looking font.  Yuck!

Back to the Preferences — somewhat hidden under Preferences / Code Editor / Display is an item for “Enable Text Anti-Aliasing” — check it.

Much, much better.

I think that preference item should be moved to the Fonts category, but at least I was able to find it — and my eyes feel a lot better.

Oh yeah, now I can tell zeros from O's again — now on to those pesky 1's and l's … :-)

UPDATE

You may also want to adjust the fonts used for Printing to match your Editor selection — to do so, navigate to Preferences / Code Editor / Printing (and Printing HTML) and change the fonts there as well.

Oracle OpenWorld 2011–Reflective Thoughts

And so another OpenWorld has come and gone, and while I wasn’t able to attend in person this year, I was able to watch most of the keynotes live while following along with my peeps on Twitter.

It’s always interesting to see whether or not people are “impressed” with the announcements from Oracle during OpenWorld — a lot of that depends on your perspective. While the past couple of OpenWorlds brought us Exadata and Exalogic, I felt that there were a LOT of “engineered systems” announced in both the run-up to OpenWorld (Database Appliance, SPARC SuperCluster) and at OpenWorld itself (Exalytics, Big Data Appliance). If you’re keeping score at home, you now have at least the following set of engineered system components to choose from:

Exadata
Exalogic
Exalytics — an OBIEE high-performance system (Essbase, OLAP, TimesTen)
Database Appliance (mid-market 2-node RAC in a box)
Big Data Appliance (Hadoop, NoSQL, R and Infiniband connectivity)
Exadata Storage Expansion Rack
SPARC SuperCluster

I predict that integrators and Oracle sales engineers will be very busy putting together solution portfolios and configurations for large customers.

This bigger set of products also puts more pressure on Oracle to deliver a solid management console that can oversee multi-engineered system landscapes, and while the jury is out on Oracle Enterprise Manager 12c, there were several encouraging bits about it — including the ability to customize the screens and workflows in a whitelabel fashion.

Of course, in addition to the Big Data Appliance, which appears to be Oracle’s way of “legitimizing” Hadoop within the enterprise and providing tighter integration through enhanced connectors over Infiniband, there was another Oracle database product “announced” in the form of Oracle NoSQL. From most accounts, the NoSQL product appears to be a well-engineered key-value store based on the Berkeley DB software.

Then we had the Oracle Public Cloud announcement and theater around Salesforce.com and their keynote. With the cloud, Oracle emphasized their stance on the open, portable nature of Java and how you can easily move onto and off of their cloud. Two things about the Oracle cloud were particularly interesting to me: the possibility of getting access to “public” data sets on the cloud, and the Oracle Social Network.

Larry Ellison demonstrated the Oracle Social Network used within a company sales process as a collaborative activity streaming tool integrated with Oracle’s Fusion applications — which seemed to resonate well with the enterprise customers in attendance.

All in all, a lot of stuff — and I didn’t even cover the Fusion Apps stuff.

One final intriguing thought — now that Oracle has so many different database products: RDBMS, NoSQL, Essbase, TimesTen and Rdb — it will be interesting to see how they “integrate” them, possibly on their cloud. I can imagine a future in which you don’t choose your product, but rather your feature and usage requirements, something like this:

Oracle Public Cloud Data Configuration

Describe your data requirements:

I need high-volume access to keys and values, I am less concerned about consistency
I need tables and columns that I can use to create relations and views to support ad-hoc queries and analysis
I need faceted, multi-dimensional analysis structures to support numerical analysis
I have a lot of documents and my data is basically unstructured.

Describe how you want to access your data:

I need JDBC / SQL connectivity
I need a RESTful API
I need a SOAP API

And then underneath the covers the cloud provisions the correct product for you, while watching your usage to see if it needs to configure a different product…
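That imagined questionnaire could boil down to something like the following toy sketch — entirely hypothetical on my part; the only things taken from reality are the product names mentioned in this post:

```python
# Hypothetical "choose features, not products" provisioning sketch.
# The requirement phrases and the mapping are invented for illustration.
REQUIREMENT_TO_PRODUCT = {
    "key-value, relaxed consistency": "Oracle NoSQL",
    "tables, relations, ad-hoc SQL": "Oracle RDBMS",
    "multi-dimensional numerical analysis": "Essbase / OLAP",
    "mostly unstructured documents": "Big Data Appliance (Hadoop)",
}

def provision(requirement: str) -> str:
    """Pick a backing data product from a stated requirement."""
    # Fall back to the relational database when nothing matches.
    return REQUIREMENT_TO_PRODUCT.get(requirement, "Oracle RDBMS")

print(provision("key-value, relaxed consistency"))  # Oracle NoSQL
```

The interesting part, of course, would be the feedback loop — watching actual usage and re-provisioning onto a different product — which is a much harder problem than a lookup table.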

Exalytics and the cloud / OpenWorld kickoff

At yesterday’s Oracle OpenWorld keynote address, Larry Ellison spent a lot of time reviewing the impressive achievements of both the Exadata database machine and the Exalogic middle-tier application machine — extolling the purpose-built nature of both systems around what he termed to be the foundations of business data processing value. In particular, the parallel nature of scanning and handling large amounts of structured data on Exadata, the ability of Exalogic to run reams of Java, and perhaps most important of all, the fact that Oracle has made a significant bet on Infiniband to rapidly move data around within and between the systems.

Referring to Ethernet as being “from the ’60s”, it’s clear that Oracle thinks it’s time to move past it for handling large data transfers.

It was interesting to see Larry try to have it both ways in the keynote — asking what the purpose is of IBM’s “fastest integer” processor, “that’s great, but it’s the fastest for what?” — and then proceeding to talk about the SPARC SuperCluster as a general-purpose machine.

In any event, the star of the keynote was the 3U rack-mountable Exalytics “Analytic” machine — a high-memory, high-compute “node” that provides OBIEE / Essbase / OLAP folks with their very own engineered system. By cramming memory (1TB) and cpu (40 cores) along with the in-memory TimesTen database technology into the box, Larry described a system that allows for analysis “at the speed of thought”. If you’re already an OBIEE user, this system should provide you with plenty of excitement.

Less clear is how all of these engineered systems (including the yet-to-be-described Big Data Appliance / Hadoop machine on display at OpenWorld) will be put together by integrators and customers to provide an analytic / high-transaction data processing cloud. It’s as if Oracle is slowly replacing each item on its software price list with a hardware “systems” item — it will be interesting to see how these systems get put together into solution portfolios. For example, does it make sense to buy an Exadata half-rack and fill the remaining space in the rack with Exalytics machines? Would such a configuration be supported? Encouraged?

Back to the purpose-built machines — how does this play against the cloud trend of ubiquitous, generic nodes and software tailored to that environment?

I actually think we’ll know a lot more about this interplay (and how Oracle intends to adapt to or shape the discussion) as OpenWorld proceeds — especially around customer expectations for keynotes around big data and the cloud.

In the abstract

During conference season, it can be a challenge to come up with abstracts that you can feel passionate about, while making sure to craft them in a way that is both attractive to selection committees and the audience you feel like you want to reach. I often find that tri-purpose (satisfying myself, a committee, and the potential audience) to be daunting and occasionally conflicting — leading to abstract paralysis.

Starting today, I’m going to work harder at it. If you’ve been to any of my presentations in the recent past, you know that I like to spend more time on what I think are technical “culture” issues rather than examples of how to implement or interpret the technical features of the latest software release. It’s an area that I’m passionate about, and it’s one that I feel is drastically underrepresented and underserved at most technical conferences.

The biggest challenge I have with those kinds of presentations is making them selectable and attractive — for the topics mostly concern our ability to collaborate and communicate effectively in support of our business and mission objectives. And in that case, we all feel (myself included) like we’re from Lake Wobegon.

To me, nowhere is this more apparent than in the discussions about the Agile movement in software development, testing and production operations. Fellow Oak Table member Martin Widlake has some excellent examples of these issues in his two recent blog posts on the subject:

“Friday Philosophy – Why Doesn’t Agile Work?” and “In Defense of Agile Development (and their Ilk)”

(I especially like “Ilk”)

In a small, forgotten corner of the Internet, I belong to a Yahoo! Group (yes, they still exist!) on Agile Databases, which has as its description:

Discussion about Database management with regards to Extreme Programming practices and principles.

You can visit the group here.

In a recent discussion, there was a post from Scott Ambler that I found myself violently agreeing with:

A question was asked about coordinating and scheduling changes made by database and ETL teams with the development teams in order to reduce confusion and churn during development.

Question / Comment: While one or more code iterations are taking place in parallel, the data design and ETL are working on their iteration of the db schema and data, which will be consumed by later code iterations.

Scott’s Comment / Answer: Better yet, this could occur in a “whole team” manner where data-experienced people are embedded in the actual team.  This can improve productivity by reducing overall overhead.  Unfortunately this can be difficult in many companies due to the organizational complexities resulting from the cultural impedance mismatch between data and development professionals.

(Emphasis mine)

I feel like I’ve had the privilege of working in places where those organizational complexities and cultural impedance mismatches were overcome, and I’d love to talk about what I think made that happen.

Now just to write some compelling abstracts on the subject — ideas welcome!

Chrysopylae, Part 2

Part 2 of my review of the Oracle GoldenGate 11g Implementer’s Guide begins with Chapter 5, Configuration Options.

The Configuration options chapter deals with more advanced options like batching, compression, encryption, triggering events, loop and conflict detection and DDL replication.

Batching, including how SQL statements are cached to support it, is thoroughly explained, along with error handling and fallback processing.

Compression is also covered in some detail, with information about how GoldenGate cannot replicate data from Oracle compressed tables (including the EHCC compression from Exadata database machines).

In-flight (message) encryption and at-rest (trail) encryption is covered as well.

Event triggering is covered at a basic level, but gives good insight into what is possible – including the ability to have GoldenGate fire off a shell script in response to a particular set of values being detected by the capture process.

The discussion of bi-directional replication begins with a thorough list of items to be considered, including loops, conflict detection and resolution, sequences and triggers.

Conflict resolution options are slightly limited, and aren’t clearly defined – for example, applying a net difference instead of the after image is only useful in a subset of mathematical operations on numerical columns.  And there is no mention of prioritization by site (by which some sites’ updates always take precedence).  In truth, conflict resolution procedures can get pretty complicated, and I’m surprised there isn’t more information about them in this section or a referral to a later section (for example, Chapter 7 on Advanced Configuration).
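To illustrate why the net-difference approach only fits additive, numerical updates, here's a small sketch of my own (the function name and numbers are mine, not from the book):

```python
# Hypothetical sketch: resolving a replication conflict by applying the
# net difference (after - before) instead of overwriting with the after image.
def apply_net_difference(current, before, after):
    """Fold a replicated numeric change into the local value as a delta."""
    return current + (after - before)

# Two sites concurrently update the same balance of 100:
# site A adds 10 (100 -> 110), site B adds 5 (100 -> 105).
local = 110                                     # site A's copy, post-update
merged = apply_net_difference(local, 100, 105)  # fold in site B's change
print(merged)  # 115: both additions survive; applying the after image would give 105
```

For a non-additive change — say, an overwrite of a status column — this arithmetic has no meaning at all, which is exactly the limitation noted above.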

The section on sequences is equally lacking in options, starting with a rather unclear statement about not supporting the replication of sequence values – what is really meant is that sequences themselves are not synchronized across multiple databases.  And the recommendation to use odd / even strategies is also rather simplistic – missing out on multi-master scenarios.  One can always reserve lower digits to enable more than 2 sites, and technically one can set up sequences to allow for an infinite number of sites as well…
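The "reserve lower digits" generalization of the odd/even trick can be sketched as follows (my own illustration, not the book's); in Oracle terms it amounts to giving each site a sequence with START WITH site_id and INCREMENT BY n_sites:

```python
# Sketch: avoiding multi-master sequence collisions by reserving the
# low digits for a site id, generalizing odd/even to N sites.
def site_sequence(site_id: int, n_sites: int):
    """Yield values unique across n_sites masters (site_id in 0..n_sites-1)."""
    local = 0
    while True:
        yield local * n_sites + site_id
        local += 1

# Site 0 of 4 generates 0, 4, 8, ...; site 1 generates 1, 5, 9, ...
gen = site_sequence(1, 4)
print([next(gen) for _ in range(3)])  # [1, 5, 9]
```

Picking a comfortably large n_sites up front (or striping on a high-order prefix instead) is what lets you add sites later without re-keying anything.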

Trigger handling advice is also rather simplistic – leading to more questions than answers as it talks about disabling triggers during the application of replicated data – there isn’t a mention of how that will affect an active / active system where local transactions are occurring.

There is a good discussion on DDL replication, with the information that the RECYCLEBIN must be disabled.

Chapter 6 – Configuring GoldenGate for HA

This chapter talks about GoldenGate in RAC environments, including the need for shared filesystems, and configuring GoldenGate with VIPs and with clusterware.  Sample scripts and commands are included – overall this chapter stays on point.

Chapter 7 – Advanced Configuration

In reality I’d call this chapter Configuration Details, but it does a very nice job of going through details around how to map objects in a replication configuration, as well as exploring the ability for GoldenGate to detect errors and execute SQL and/or stored procedures in response to those conditions.

Basic transformation is also covered.

Chapter 8 – Managing Oracle GoldenGate

This chapter covers basic command level security and spends a lot of time on the command interpreter GGSCI.  Also nice is a set of scripts and instructions to take performance output and format it for graphing in Excel.

Chapter 9 – Performance Tuning

The performance tuning chapter focuses on how to parallelize replication traffic and thoroughly exploit all system resources to increase throughput.  It also makes mention of the 11.1 release of GoldenGate.  Details about tuning DBFS are also included.

Like a lot of Performance Tuning advice, this section is more about throughput than performance optimization – in that vein it succeeds in covering ways to push more data more quickly.

Chapter 10 – Troubleshooting GoldenGate

The troubleshooting chapter begins with a good section on tracking down why replication may not be working – getting statistics on every process to see if they think they are capturing or sending data.  There is also good information on the CHECKPARAMS command which can be used to validate configuration files and the author also covers potential issues with the command.

The author covers checkpoints and networks as well.

There is a good section on creating exception handlers to capture and diagnose duplicate and missing record errors, including capture of before and after images.

Finally the chapter goes into detail on the LOGDUMP utility which can be used to examine trail files for error conditions.

Summary

Overall I found the book to be a good companion to the GoldenGate manuals and training materials.  It’s obvious that the author has a lot of configuration and operational experience with GoldenGate.  I found the book weak on design and planning for replication environments, so if you’re new to replication I’d suggest adding another book to your library above and beyond this one.