The 3 Letters Which Can Put an American DBA to Sleep are NL…zzzzz

Pity us poor US DBAs — safely secure using our ancient, many-times-upgraded Oracle 6 databases with their US7ASCII character sets.

We knew that ASCII only covered 0-127, but who could blame us when we started putting WE8MSWIN1252 “international” characters into those fields — the database let us, and it felt kind of sexy putting in cool European characters with umlauts and accents on them.

Besides, all of our business was with other American companies, and if someone had some “funny” characters in their name, then they just had to change them!

Of course, all of this is said with tongue firmly planted in cheek.  Nowadays you’d better be able to handle Unicode in your database if you want to have a prayer of not being labeled as something older than teleprinters and typewriters.

I first encountered this situation when working with a US7ASCII database where we started using XMLTYPE columns — little did I know that XMLTYPE columns actually validated the character set of the XML document coming in — one of our fields was the country name.

Everything was fine until February 13th, 2004 — the day ISO added an entry for the Aland Islands… (which has an A with a diacritical ring above it).

We started seeing errors inserting our XML documents — all due to strict validation of the character set.  Did we change character sets?  No — we stopped using the XMLTYPE columns :-(

Fast forward a few years and now I’m lucky enough to work with proper databases created with the AL32UTF8 character set — so now I can store my friend Mogens Noorgard name correctly (or I would if I could spell it�)

However, little did I realize that I needed to declare my columns differently…

You see, back in the day, VARCHAR2(10) meant that I wanted to store up to 10 characters in the column gosh darn it — I didn’t worry about bytes vs. characters — same thing right?

Er, no.

So in a brand new database with an AL32UTF8 character set, why was I getting column length errors trying to insert the string H,U,Y with an umlaut into a VARCHAR2(3) field?

Heck, isn’t Y with an umlaut just another character? Its just WESMSWIN1252 character 255, right?

Don’t tell me it’s a character set issue — I’ve been trying to avoid opening up that NLS manual for years…

Ok, ok — open the manual and start reading about Unicode — specifically UTF-8.  Uh-oh, I read the words “variable-length encoding” and the light starts to dawn…

Turns out that Y with an umlaut is 1 byte in WESMSWIN1252 (specifically 0xFF), but it’s 2 bytes in UTF-8 (0xC3BF).

But didn’t I declare the column to be 3 characters in length?  So why does it care about the underlying encoding?

Enter NLS_LENGTH_SEMANTICS and the fact that the default is set to BYTE.

From the documentation:



Parameter type

Parameter scope
Environment variable, initialization parameter, and ALTER SESSION

Default value

Range of values

By default, the character data types CHAR and VARCHAR2 are specified in bytes, not characters. Hence, the specification CHAR(20) in a table definition allows 20 bytes for storing character data.

This works well if the database character set uses a single-byte character encoding scheme because the number of characters is the same as the number of bytes. If the database character set uses a multibyte character encoding scheme, then the number of bytes no longer equals the number of characters because a character can consist of one or more bytes. Thus, column widths must be chosen with care to allow for the maximum possible number of bytes for a given number of characters. You can overcome this problem by switching to character semantics when defining the column size.

NLS_LENGTH_SEMANTICS enables you to create CHAR, VARCHAR2, and LONG columns using either byte or character length semantics. NCHAR, NVARCHAR2, CLOB, and NCLOB columns are always character-based. Existing columns are not affected.

You may be required to use byte semantics in order to maintain compatibility with existing applications.

NLS_LENGTH_SEMANTICS does not apply to tables in SYS and SYSTEM. The data dictionary always uses byte semantics.

Note that if the NLS_LENGTH_SEMANTICS environment variable is not set on the client, then the client session defaults to the value for NLS_LENGTH_SEMANTICS on the database server. This enables all client sessions on the network to have the same NLS_LENGTH_SEMANTICS behavior. Setting the environment variable on an individual client enables the server initialization parameter to be overridden for that client.



Can anyone tell me why the default would be BYTE?  Why would I want to declare character fields with BYTE lengths?  Thank goodness it’s not in bits…

Anyway, we adjusted our standard to make sure that DDL always specifies BYTE or CHAR in the declaration now:

VARCHAR2(10 CHAR) instead of VARCHAR2(10), so now we can be sure…

More Mass Market Oracle

Almost exactly one year ago, I wrote about the lack of hosted Oracle database packages.  Since then I haven’t seen much movement on that front, until last week, when STRATO AG, a German hosting company started offering a STRATO Oracle Server hosting package based on Parallels Virtuozzo Containers and Oracle Express Edition.  Here’s the press release.  I’m having a bit of trouble finding the package on the STRATO site, but if it’s true it’s another valuable option out there for ways to increase Oracle usage at the low end (where a lot of innovation takes place).  I’m curious how STRATO is handling the Oracle licensing issues…

Adventures in Setting up OEM Agents

So I’ve been working to push out OEM Agents from our Grid Control OMS on Windows XP to a couple of Linux servers running OEL 5.  It’s been a lot of fun, and I thought I’d share of the obstacles I overcame:

1. SSH User Equivalence. 

You need to get this working before anything else will work.  Fortunately there’s a pretty good script in oracle/oms10g/sysman/prov/resources/scripts that will do the hard work for you.  However, my first run of this resulted in (BTW, this is all run from a Cygwin bash shell since all of the scripts are Unix-based):

./ line 17: $’\r’: command not found
./ line 20: $’\r’: command not found
./ line 27: $’\r’: command not found
./ line 231: syntax error near unexpected token `elif’
./ line 231: `   elif ! test -f “$CLUSTER_CONFIGURATION_FI’E”

Couldn’t find much info in Metalink on this — turns out I needed to dos2unix the file before running it.

Also, the documentation on running this script is pretty good — especially this section.

2. You have to be careful with your Cygwin installation — I usually have Cygwin running anyway, and I install the net tools like telnet, ssh, ping, etc. 

However, I got errors saying that the following command in the .sh script was failing:

ping $host -n 5 -w 5

With the following message:

Usage:  ping [-dfqrv] host [packetsize [count [preload]]]

Turns out the Cygwin ping command doesn’t implement the -n and -w options.

Solution was to replace the ping command in the script with /cygdrive/c/windows/system32/ping.exe

3. Also turns out I didn’t actually have a Linux agent staged, even though I had a /oms10g/sysman/agent_download/ directory, I didn’t have an oui/oui_linux.jar file.  Once I downloaded that and unzipped it cleanly, I still got some odd errors.

Searching Metalink, you’ll see some references to an oddly named file: OC4J~OC4J_EMPROV~default_island~1.  It’s a log file kept in /oms10g/opmn/logs and it has a lot of detail about the attempt to deploy the agent.  I was getting the following error:

error:  cannot open zipfile [ /app/oracle/product/agent/tmp/oui/oui_linux.jar ]
unzip:  cannot find /app/oracle/product/agent/tmp/oui/oui_linux.jar, /app/oracle/product/agent/tmp/oui/ or /app/oracle/product/agent/tmp/oui/oui_linux.jar.ZIP.
chmod: cannot access `/app/oracle/product/agent/tmp/oui/Disk1/’: No such file or directory
chmod: cannot access `/app/oracle/product/agent/tmp/oui/Disk1/install/unzip’: No such file or directory

This even though I had the oui_linux.jar file and it clearly transferred to the target machine, I could see it:

$ ls -l
total 42884
———-+ 1 oracle oinstall 43909460 May  5  2006 oui_linux.jar

Odd, why are the permissions 000 instead of 755?

Turns out the permissions on the Windows side were 000 and the script uses scp -p to transfer the file (and retain permissions), so the unjar command on the remote host couldn’t see the file.  I didn’t know you could create a file with 000 permissions on it, although I guess it makes sense — you’d have to chmod it to 000 after creating it though.  A quick chmod command on the Windows side fixed this.

4. Finally I ran into something I haven’t solved yet — I’m running OEL 5 (redhat-5) and the Installer doesn’t recognize that as a supported O/S (it’s looking for redhat-3 and redhat-4).  I’m trying different ways around that but haven’t quite gotten it.


No, I’m not describing anyone’s personality, least of all my own :-)

I should title this series “An Oracle person’s experiences as a Linux SA”….

From my earlier posting, you all know that I like xterms with dark backgrounds and white foregrounds.  I’ve always liked typing that way — probably from my old terminal days.  I gave up on trying to make MS Word (You can do white on blue, albeit it’s a very bright blue) do that — and anyway word processing is different than a terminal to me any day.

Of course, Linux likes to colorize directory listings — and I can’t see the dark, dark blue of directory entries against my black background.  So, on to figuring out why it’s doing that and to make it stop…

LS_OPTIONS looks like it will do the trick, so I set it to –color=none and export it into my environment.  No luck.  Maybe it needs to be set in my login shell, and since it’s my machine and I think everyone should turn it off, I edit /etc/profile and set it there.  Logout / login — still no luck.

Maybe ls is aliased?  Yep — I run “alias ls” and I see that it is.  Since I don’t have an alias file I wonder where global aliases are set.  Maybe /etc/alias or /etc/aliases?  No, no such file exists.

Go back and look at /etc/profile — at the end of the file it goes to /etc/profile.d and runs every .sh file in that directory — odd.  One of those is which of course sets a whole bunch of aliases for ls.  So I rename to, re-login and poof!  No more colors.

I don’t like aliasing commands — leads  to confusion.  About the only commands I alias are cd and vi (points to vim nowadays….).

Edit: I’ve since learned that LS_OPTIONS doesn’t do anything anyway — ls doesn’t look at it.  It’s only used when aliasing ls… (rolls eyes).

Xen VM Cloning

Ok, now on to the cool stuff.  From my earlier post, you may be aware that I have a machine grant (or, to put it another way, domain0) — with 2 Xen VMs (sherman and mcpherson).  Actually, sherman and mcpherson are more like placeholders for VMs — I want to be able to test out different configurations in each one, re-using the same IP address.  So I’ve actually got sherman01 and mcpherson01.  Today I wondered how hard it would be to create sherman02 or simply clone sherman01.  It turned out to be dead simple:

  1. Go to /etc/xen and copy the sherman01 config file to sherman02
  2. Edit the sherman02 config file replacing all references to sherman01 with sherman02
  3. Since I’m using file-backed VMs, copy the files representing the disks for sherman01 to sherman02 equivalents and make sure the config file references them
  4. Shutdown sherman01
  5. Start sherman02

Worked like a charm.