The Social Development Database

I’ve always been fascinated by development databases, sometimes more so than huge, heavily utilized production ones, mainly because I’ve seen how the beginnings of a performance problem, or the start of an elegant solution, take shape within a development database. It’s one of the reasons why I love high levels of visibility through full DDL-auditing within development. I love to SEE what database developers are thinking, and how they are implementing their ideas using specific shapes of data structures.

One of the concepts I’d love to see is a “river of news” panel within development tools to see what is going on within a development database. Some of the good distributed source code control systems do this now.

Here’s a good example of what I mean:

http://github-images.s3.amazonaws.com/blog/2011/mac-screenshots/commits-full.png

Imagine if there were a panel in SQL Developer or TOAD that looked something like this:

This wouldn’t be all that hard to implement if full DDL-auditing was turned on…
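To make the idea a little more concrete, here is a minimal sketch in Python of how audited DDL events could be turned into a newest-first feed. The `DdlEvent` shape, the `river_of_news` function and the sample rows are all hypothetical; a real panel would read from whatever audit trail the database actually exposes.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DdlEvent:
    """One audited DDL statement (hypothetical shape, not an Oracle view)."""
    when: datetime
    user: str
    action: str       # e.g. "CREATE TABLE", "ALTER TABLE"
    object_name: str


def river_of_news(events, limit=10):
    """Return up to `limit` feed lines, newest event first."""
    newest_first = sorted(events, key=lambda e: e.when, reverse=True)
    return [
        f"{e.when:%Y-%m-%d %H:%M} {e.user} {e.action} {e.object_name}"
        for e in newest_first[:limit]
    ]


# Made-up sample data standing in for rows from a DDL audit trail.
events = [
    DdlEvent(datetime(2011, 10, 10, 9, 15), "ALICE", "CREATE TABLE", "ORDERS"),
    DdlEvent(datetime(2011, 10, 10, 11, 2), "BOB", "CREATE INDEX", "ORDERS_IX1"),
    DdlEvent(datetime(2011, 10, 10, 10, 40), "ALICE", "ALTER TABLE", "ORDERS"),
]

for line in river_of_news(events):
    print(line)
```

The interesting part is not the formatting but the ordering: once every DDL statement is captured with a timestamp and a user, the "feed" is just a sorted, truncated view of the audit trail.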

Oracle OpenWorld 2011–Reflective Thoughts

And so another OpenWorld has come and gone, and while I wasn’t able to attend in person this year, I was able to watch most of the keynotes live while following along with my peeps on Twitter.

It’s always interesting to see whether or not people are “impressed” with the announcements from Oracle during OpenWorld — a lot of that depends on your perspective. While the past couple of OpenWorlds brought us Exadata and Exalogic, I felt that there were a LOT of “engineered systems” announced in both the run-up to OpenWorld (Database Appliance, SPARC SuperCluster) and at OpenWorld itself (Exalytics, Big Data Appliance). If you’re keeping score at home, you now have at least the following set of engineered system components to choose from:

Exadata
Exalogic
Exalytics — an OBIEE high-performance system (Essbase, OLAP, TimesTen)
Database Appliance (mid-market 2-node RAC in a box)
Big Data Appliance (Hadoop, NoSQL, R and Infiniband connectivity)
Exadata Storage Expansion Rack
SPARC SuperCluster

I predict that integrators and Oracle sales engineers will be very busy putting together solution portfolios and configurations for large customers.

This bigger set of products also puts more pressure on Oracle to deliver a solid management console that can oversee landscapes made up of multiple engineered systems. While the jury is out on Oracle Enterprise Manager 12c, there were several encouraging bits about it, including the ability to customize the screens and workflows in a white-label fashion.

Of course, in addition to the Big Data Appliance, which appears to be Oracle’s way of “legitimizing” Hadoop within the enterprise, and providing tighter integration through enhanced connectors over Infiniband, there was another Oracle database product “announced” in the form of Oracle NoSQL. From most accounts, the NoSQL product appears to be a well-engineered key-value store system based on the Berkeley DB software.

Then we had the Oracle Public Cloud announcement and theater around Salesforce.com and their keynote. With the cloud, Oracle emphasized their stance on the open, portable nature of Java and how you can easily move onto and off of their cloud. Two things about the Oracle cloud were particularly interesting to me: the possibility of getting access to “public” data sets on the cloud, and the Oracle Social Network.

Larry Ellison demonstrated the Oracle Social Network used within a company sales process as a collaborative activity streaming tool integrated with Oracle’s Fusion applications — which seemed to resonate well with the enterprise customers in attendance.

All in all, a lot of stuff — and I didn’t even cover the Fusion Apps stuff.

One final intriguing thought — now that Oracle has so many different database products: RDBMS, NoSQL, Essbase, TimesTen and Rdb — it will be interesting to see how they “integrate” them, possibly on their cloud. I can imagine a future in which you don’t choose your product, but rather your feature and usage requirements, something like this:

Oracle Public Cloud Data Configuration

Describe your data requirements:

I need high-volume access to keys and values, I am less concerned about consistency
I need tables and columns that I can use to create relations and views to support ad-hoc queries and analysis
I need faceted, multi-dimensional analysis structures to support numerical analysis
I have a lot of documents and my data is basically unstructured

Describe how you want to access your data:

I need JDBC / SQL connectivity
I need a RESTful API
I need a SOAP API

And then underneath the covers the cloud provisions the correct product for you, while watching your usage to see if it needs to configure a different product…
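As a thought experiment only, the selection step could be sketched like this in Python. The requirement keys, the `choose_products` function and the pairing of requirements to products are my own illustrative assumptions, not anything Oracle has announced. I have deliberately left the "documents" requirement unmapped rather than guess at a product.

```python
# Hypothetical mapping from a stated data requirement to a product family
# named in this post; both the keys and the pairings are assumptions.
PRODUCT_FOR_REQUIREMENT = {
    "key_value": "NoSQL",            # high-volume keys and values
    "relational": "RDBMS",           # tables, columns, ad-hoc queries
    "multi_dimensional": "Essbase",  # faceted numerical analysis
}


def choose_products(requirements):
    """Return (products to provision, requirements with no known match)."""
    chosen, unmatched = [], []
    for requirement in requirements:
        product = PRODUCT_FOR_REQUIREMENT.get(requirement)
        if product is None:
            unmatched.append(requirement)
        elif product not in chosen:
            chosen.append(product)
    return chosen, unmatched


products, open_questions = choose_products(
    ["key_value", "relational", "documents"]
)
print(products)        # products the form answers map to
print(open_questions)  # answers this sketch cannot place yet
```

The "watching your usage" part would amount to re-running a function like this against observed workload characteristics instead of checkbox answers.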

Exalytics and the cloud / OpenWorld kickoff

At yesterday’s Oracle OpenWorld keynote address, Larry Ellison spent a lot of time reviewing the impressive achievements of both the Exadata database machine and the Exalogic middle-tier application machine, extolling the purpose-built nature of both systems around what he termed to be the foundations of business data processing value. In particular, he highlighted the parallel nature of scanning and handling large amounts of structured data on Exadata, the ability of Exalogic to run reams of Java, and perhaps most important of all, the fact that Oracle has made a significant bet on Infiniband to rapidly move data around within and between the systems.

Given that Larry referred to Ethernet as being “from the 60’s”, it’s clear that Oracle thinks it’s time to move past it for handling large data transfers.

It was interesting to see Larry try to have it both ways in the keynote — asking what the purpose is of IBM’s “fastest integer” processor, “that’s great, but it’s the fastest for what?” — and then proceed to talk about the SPARC SuperCluster as a general purpose machine.

In any event, the star of the keynote was the 3U rack-mountable Exalytics “Analytic” machine, a high-memory, high-compute “node” that provides OBIEE / Essbase / OLAP folks with their very own engineered system. By cramming memory (1TB) and CPU (40 cores) along with the in-memory TimesTen database technology into the box, Larry described a system that allows for analysis “at the speed of thought”. If you’re already an OBIEE user, this system should provide you with plenty of excitement.

Less clear is how all of these engineered systems (including the yet-to-be-described Big Data Appliance / Hadoop machine on display at OpenWorld) will be put together by integrators and customers to provide an analytic / high-transaction data processing cloud. It’s as if Oracle is slowly replacing each item on its software price list with a hardware “systems” item; it will be interesting to see how these systems get put together into solution portfolios. For example, does it make sense to buy an Exadata half-rack and fill the remaining space in the rack with Exalytics machines? Would such a configuration be supported? Encouraged?

Back to the purpose-built machines — how does this play against the cloud trend of ubiquitous, generic nodes and software tailored to that environment?

I actually think we’ll know a lot more about this interplay (and how Oracle intends to adapt to or shape the discussion) as OpenWorld proceeds — especially around customer expectations for keynotes around big data and the cloud.

In the abstract

During conference season, it can be a challenge to come up with abstracts that you can feel passionate about, while making sure to craft them in a way that is both attractive to selection committees and the audience you feel like you want to reach. I often find that tri-purpose (satisfying myself, a committee, and the potential audience) to be daunting and occasionally conflicting — leading to abstract paralysis.

Starting today, I’m going to work harder at it. If you’ve been to any of my presentations in the recent past, you know that I like to spend more time on what I think are technical “culture” issues rather than examples of how to implement or interpret the technical features of the latest software release. It’s an area that I’m passionate about, and it’s one that I feel is drastically underrepresented and underserved at most technical conferences.

The biggest challenge I have with those kinds of presentations is making them selectable and attractive — for the topics mostly concern our ability to collaborate and communicate effectively in support of our business and mission objectives. And in that case, we all feel (myself included) like we’re from Lake Wobegon.

To me, nowhere is this more apparent than in the discussions about the Agile movement in software development, testing and production operations. Fellow Oak Table member Martin Widlake has some excellent examples of these issues in his two recent blog posts on the subject:

“Friday Philosophy – Why Doesn’t Agile Work?” and “In Defense of Agile Development (and their Ilk)”

(I especially like “Ilk”)

In a small, forgotten corner of the Internet, I belong to a Yahoo! Group (yes, they still exist!) on Agile Databases, which has as its description:

Discussion about Database management with regards to Extreme Programming practices and principles.

You can visit the group here.

In a recent discussion, there was a post from Scott Ambler that I found myself violently agreeing with:

A question was asked about coordinating and scheduling changes made by database and ETL teams with the development teams in order to reduce confusion and churn during development.

Question / Comment: While one or more code iterations are taking place in parallel, the data design and ETL are working on their iteration of the db schema and data, which will be consumed by later code iterations.

Scott’s Comment / Answer: Better yet, this could occur in a “whole team” manner where data-experienced people are embedded in the actual team.  This can improve productivity by reducing overall overhead.  Unfortunately this can be difficult in many companies due to the organizational complexities resulting from the cultural impedance mismatch between data and development professionals.

(Emphasis mine)

I feel like I’ve had the privilege of working in places where those organizational complexities and cultural impedance mismatches were overcome, and I’d love to talk about what I think made that happen.

Now just to write some compelling abstracts on the subject — ideas welcome!

NLS, Part Deux

A guest post today, by Brian Ledbetter, a co-worker at Agilex:

On a customer’s database, we ran across a table that would not migrate.  It was admittedly a log table, containing long chunks of HTTP header data, but whenever we tried importing it into our 11gR2 database, we ended up getting:

IMP-00058: ORACLE error 1461 encountered
ORA-01461: can bind a LONG value only for insert into a LONG column

After looking at the table structure, the first thing we noticed was that there was a VARCHAR2(4000) column in the table.  Considering that this column was already the maximum size (in bytes) for a CHAR-based data type, it became the focus of our attention.

Looking online for solutions, we found references [1] suggesting that Oracle was implicitly converting this column to a VARCHAR2(4000 CHAR) type, creating a column that can contain up to 4 bytes per character.[2]  Because this overflows the 4000 byte limit on column length, Oracle then attempted to implicitly convert the datatype to a LONG VARCHAR2, which is apparently deprecated in 11gR2.[3]  (We’re not sure why Oracle is still trying to make this conversion, if that’s the case.)
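The arithmetic behind that overflow is easy to check outside the database. The following is a minimal Python sketch, assuming the target database stores text as UTF-8 and enforces the 4000-byte column limit; the helper names `utf8_bytes` and `fits_in_column` are made up for illustration.

```python
BYTE_LIMIT = 4000          # column limit in bytes, as described above
MAX_BYTES_PER_CHAR = 4     # worst case for a single character in UTF-8


def utf8_bytes(text: str) -> int:
    """Number of bytes the text occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))


def fits_in_column(text: str, limit: int = BYTE_LIMIT) -> bool:
    """True if the UTF-8 encoding of the text stays within the byte limit."""
    return utf8_bytes(text) <= limit


ascii_value = "a" * 4000            # plain ASCII: 1 byte per character
accented_value = "\u00e9" * 4000    # e-acute: 2 bytes per character

print(utf8_bytes(ascii_value), fits_in_column(ascii_value))
print(utf8_bytes(accented_value), fits_in_column(accented_value))

# Largest character count that fits no matter what the characters are.
safe_chars = BYTE_LIMIT // MAX_BYTES_PER_CHAR
print(safe_chars)
```

A value that fills the column in a single-byte character set can therefore need more than 4000 bytes after conversion, even though the number of characters has not changed.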

Anyway, we tried precreating the table with a CLOB datatype, and that didn’t work either, so as a workaround, we created a copy of the table with the data trimmed to 1000 characters (leaving plenty of room after UTF8 conversion):

create table tabname_migtmp as select col1, col2, substr(col3,1,1000) col3 from tabname;

We then used exp/imp to copy tabname_migtmp over to the 11gR2 server, and inserted the data from it into the final location.

insert into tabname select * from tabname_migtmp;

drop table tabname_migtmp;

[1] http://forums.oracle.com/forums/thread.jspa?threadID=1038043

[2] http://stackoverflow.com/questions/5230346/char-semantics-and-ora-01461

[3] http://forums.oracle.com/forums/thread.jspa?threadID=2230351

See Also: Technote 444171.1, https://supporthtml.oracle.com/ep/faces/secure/km/DocumentDisplay.jspx?id=444171.1