I’ll see your measly 6,000 tps and raise you…

Remember the other day when I was so impressed by updating roughly 6,000 rows per second.  What a rube I was.  That’s nothing.

So, Saturday night (I know, I need to get out more), I decided to work on the data loading problem I described in the last post.  We have roughly 60,000,000 rows to load and are required to get the job done in 3-4 hours.  Round up a bit and that’s 15,000 seconds.   Do the math and we need to be able to load about 4,000 rows per second.  Should be pretty easy — heck, if I can do 6,000 rows per second on the big AIX/EMC box, I should be able to do 4,000 rows per second on this Linux box.

I decided to try external tables for the loading — haven’t had a chance to play with them yet, so why not?  Here’s the documentation link.

So, I create an empty table for my target.  And then I create a 1,000,000 table to use to generate some test data by inserting as selecting 1,000 rows from dba_objects (twice and use a cartesian product to generate 1,000,000 rows).

(At this point, I should have been prepared for the results — but, it was late, I was tired, and I was just focussed on getting to the final step of loading the table.  But, for those of you who want the foreshadowing: the insert as select of 1,000,000 rows from dba_objects took only 8 seconds).

So then I use sqlplus to dump out the 1,000,000 rows to a flat file — took roughly 1 hour (no tuning of sqlplus fetching, forgot to turn termout off, etc..).

At this point it’s 12:30am and I’m really struggling to type commands correctly.  I finally get the external table created and fire up the insert /*+ append */ as select.

Total time to insert the 1,000,000 rows = 5.85 seconds.

I had to rub my eyes a bit to make sure I didn’t make a typo and only insert 100,000 rows.  But, nope — 1,000,000 rows in under 6 seconds.

I almost don’t believe it myself — at the Hotsos conference, Cary Millsap said: “Isn’t Oracle really fast?  Why is it so hard to make it go slow?  How come so many people are good at that?” :-)

So, that’s roughly 170,000 rows per second — or 40X faster than I need it to be.

Looks like we can load 60,000,000 rows in 6 minutes.

Where can I get more of these Linux boxes running 10gR2?

2 Responses to “I’ll see your measly 6,000 tps and raise you…”

  1. Niall Litchfield Says:

    Cool, but don’t forget the indexes – easy to do at 12:30 am.

  2. Roderick Says:

    Long time no hear. Welcome to the blogosphere.

Leave a Reply

Posting code can be a pain. To make sure your code doesn't get eaten, you may want to pre-format it first by using HTML Encoder