Monday, September 15, 2014

Good sources of Oracle information

Pretty often I am being asked about various online and printed source of Oracle-related information. Finally I decided to create a blog post here where I will start to accumulate all the goodies that I just "happen" to find. Maybe I will eventually create even some kind of a guide... Let's see.

Before we start - a couple of disclaimers:
* It is my personal list of favorite resources, so all of the following statements assumed to be prefixed with IMHO :-)
* The list represents my own interests, which are somewhat between DBA and Developer lands (plus a bit of Architect).
* I highly value printed books, because while working on them authors structure (and check!) the material much better comparing to blogs. So, if I mention a name below, you also should check his/her books - they will be worth your money. Even blogs have the advantage of being more interactive (you may get immediate feedback), the quality of books should not be underestimated.

Ok, let's start:

1. My gurus (overall)
- Tom Kyte asktom.oracle.com - no extra explanations needed, I hope ;-)
- Steven Feuerstein http://www.stevenfeuerstein.com/ - one of the main faces behind PL/SQL for the last 20 years
- Jonathan Lewis http://jonathanlewis.wordpress.com/ - after his explanations I finally started to get the Cost-Based Optimizer
- Daniel Morgan http://www.morganslibrary.org/ - huge collection of useful articles and examples! Often more understandable than original Oracle docs.

2. My teachers in the field of performance tuning
- Tanel Poder http://blog.tanelpoder.com/ - I've seen him reading raw trace files! He can decipher X$-objects too.
- Cary Millsap http://carymillsap.blogspot.com/ - extremely rare case of holistic approach to performance tuning (Method-R!)
- Craig Shallahamer http://shallahamer-orapub.blogspot.com/ - another example of very scientific approach to tuning
- Chris Antognini http://antognini.ch/blog/ - a lot of depth in understanding of how Oracle internals work together.

3. My fellows. They do the same kind of job as I do (only sometimes better!)
- Tim Hall http://www.oracle-base.com/
- Adrian Billington http://www.oracle-developer.net/
- Alex Nuijten http://nuijten.blogspot.com/
- Rob van Wijk http://rwijk.blogspot.com/
- Marco Gralike http://www.liberidu.com/blog/

4. DBAs land
- Arup Nanda http://arup.blogspot.com/.
- Alex Faktulin http://afatkulin.blogspot.com/
- Company blog (a lot of smart people up there!): http://www.pythian.com/blog/ Just to name a few: Alex Gorbachev, Michael Abbey, Christo Kutrovski, Jared Still, etc.

Friday, September 12, 2014

Analytic functions in real life

Yesterday while working on a real production report I found a requirement that forced me to come up with a bit nontrivial usage of analytic functions. Here is the problem (translated to SCOTT-level terms):

- I have a table that contains a DATE column (HIREDATE)
- Input parameter is also a date (IN_DT)
- For each department I need to show all rows that have HIREDATE after IN_DT plus the most recent row from the past (only one!).

Obviously, if you split the last condition into two, the solution is reasonably clear:

SELECT empno, ename, deptno, hiredate
FROM scott.emp
WHERE hiredate>:IN_DT
UNION ALL
SELECT empno, ename, deptno, hiredate
FROM (
     SELECT e.*,
            row_number() 
               over(PARTITION BY deptno
                    ORDER BY hiredate desc, 

                             empno desc
                   ) rank_nr
     FROM scott.emp e
     WHERE hiredate<=:IN_DT
     )
WHERE rank_nr = 1

The only problem of this solution - it needs two passed over EMP table. But in my production case that table is huge. Also, just to make life interesting, the result of the specified query has to be joined to 8 other tables to provide the required output. It is understandable, that my first optimization idea was to somehow do a single-pass query. To my own surprise, Oracle analytic functions are flexible enough to help me out:

SELECT empno, ename, deptno, hiredate, rank_nr
FROM
(
SELECT e.*,
      dense_rank()  
        over(PARTITION BY deptno
             ORDER BY 
               CASE WHEN hiredate > :IN_DT THEN '9999'
                    ELSE to_char(hiredate,'YYYYMMDDHH24MISS')||
                           lpad(empno,10,'0')
               END  desc   
            ) rank_nr,
      max(hiredate) 
        over(
             PARTITION BY deptno) max_dt        
FROM scott.emp e
)
WHERE max_dt > :IN_DT AND rank_nr IN (1,2)
      OR
      max_dt < = :IN_DT AND rank_nr = 1

In this solution I two analytical functions
1. DENSE_RANK is forced to order in the following way:
- bring all future dates to the top by assigning them exactly the same value '9999'
- order all other dates by converting them to 'YYYMMDDHH24MISS' format. In that format DESC would bring the most recent date up anyways.
- to resolve conflicts of multiple entries associated with the same date concatenate past dates with padded EMPNO. This way textual sort would correctly work even if the length of EMPNO may differ.
2. MAX() OVER() brings the highest hire date for each department.

Now the outer SELECT has enough information to filter out correct rows: if the highest date if after the parameter, you need to bring ranks 1 and 2 (all future dates + the last past date). Otherwise you need only rank 1 (last past date).

I understand, that this solution may take a couple of seconds to wrap your head around :-). Still - I was able to do exactly what I needed in a single pass!

Summary: if you are working on reports, analytic function is the MUST. That's the only safe way to write efficient code that would consistently work over large data sets.

Tuesday, September 9, 2014

Oracle OpenWorld 2014

Less than 3 weeks left before the biggest Oracle event of the year kicks in - and I bet, Oracle OpenWorld'14 will impress anybody (as usual). I am honored to be selected as one of speakers there. Here are my two talks:

- UGF4428 "Expanding SQL Horizons: PL/SQL User-Defined Functions in the Real World"
  Sunday, Sep 28, 1:30 PM - 2:15 PM
  Moscone South - 310

- CON4429  "Calling SQL from PL/SQL the Right Way"
Thursday, Oct 2, 2:30 PM - 3:15 PM
Moscone South - 308

Everybody is welcome! I really think that these presentations can improve you day-to-day PL/SQL.

P.s. I will do my best to provide a daily update from the ground! There should be a lot of interesting things to share.

Wednesday, August 20, 2014

Time to resurrect blogging!

Sorry for staying quiet for some time, but for some good reasons. Some of these reasons will be explained in later posts, but for now I would like to share the story I wrote upon request of Steven Feuerstein, who asked to tell how PL/SQL impacted our life. Here is my story:

-----------------
I started to work with databases already in college, but more from the data modelling/business analysis angle. When I moved to the USA in 2000 (just after graduation) I quickly recognized that my language skills are too limited to work as BA, so I went to evening Oracle courses in Philadelphia to deepen by technical skills – SQL, PL/SQL, SQL*Plus, into to administration etc. 

And suddenly I realized that I like it even more than playing with ERDs! Also, a bit of a lucky break – one of my teachers recommended me to his former boss. To be fair, I have no idea how I survived that interview – somewhere in the middle of it I realized that this boss is no less than Dr. Paul Dorsey, whose book I was recently quoting in my Master Degree Thesis… Anyways, on February 26th, 2001 Dulcian Inc got a brand new 21-year-old employee with the focus on database development. 

And couple of months later I’ve got a task that really shaped my specialization: I needed to write a module that would check for invalid objects in the schema and compile them in the appropriate order of dependencies. So, the set of tools is obvious (now), but at that time I had to discover: data dictionary views, object collections, recursion, and, finally, Dynamic SQL. For my reader/listeners – that list sounds really familiar, isn’t? :-) Looks like Oracle versions change, but favorite toys stay the same!

After that module started to work (I think, even Dr. Paul got surprised!) my field was settled: solving strange/unsolvable database problems using SQL and PL/SQL. That’s pretty much what I still do now – even 14 years later. Bad news – there are lots of such problems. Good news – it’s fun to solve them (and that’s the way I get new stories for books/presentations!)
---------------------

Not exactly a fairy tale, but it definitely shows that you never know how your career will move forward, especially just after college. By the way, I brought with me to the USA that Russian-translated book by Dr. Paul! It stays right now in the office together with other memorabilia. Here is the scan:


Tuesday, September 3, 2013

Unknown DATAPUMP functionality: getting catalog from the file

To be fair, I am not spending all of my days writing the new PL/SQL book :-) I even work sometimes! And while solving real problems I reasonably often find interesting thing.

Today I was asked by one of my colleagues: is it possible to look inside of the existing DATAPUMP file. This file was a FULL import of client's database, and we needed to check for the existence of a very specific public synonym.

The question was perfectly valid and I started the research. First, I found that IMPDP has parameter SQLFILE=<filename>, that generates all DDL from the dump. It could work out, but the perspective of digging through megabytes of textual data was not very promising, so I continued my research.

Eventually I found a comment in of Oracle-related forums mentioning another IMPDP parameter: MASTER_ONLY=YES. We all know that every time you run IMPDP, it creates a special table that contains the catalog of objects to be processed (names like SYS_IMPORT_TABLE_##, SYS_IMPORT_SCHEMA_##, SYS_IMPORT_FULL_##). As it appears to be, this catalog is human-readable (mostly)! The parameter mentioned above just loads this table - and stops at this point:

C:\>impdp system/system@testDB dumpfile=StrangeFile.dmp logfile=imp_master.log master_only=yes full=y

Import: Release 11.2.0.3.0 - Production on Tue Sep 3 13:09:15 2013

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit
Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Master table "SYSTEM"."SYS_IMPORT_FULL_01" successfully loaded/unloaded
Job "SYSTEM"."SYS_IMPORT_FULL_01" successfully completed at 13:09:32
C:\>

Now the table is ready to be examined! The following query shows the most important data:

select distinct
  object_schema,
  object_name,
  object_type,
  object_tablespace,
  process_order,
  duplicate,
  processing_status,
  processing_state
from system.SYS_IMPORT_FULL_01
where process_order > 0
and object_name is not null
order by object_schema, object_name

Just to make sure that everybody gets limitations here: this method does not provide you DDLs of objects. It allows to find what objects exist in the dump file. So, there will be no guessing anymore, which is always a good thing!

By the way, getting DDLs not is also trivial now:
* take required object names and put it into INCLUDE clause
* add SQLFILE clause to generate DDLs instead of firing them

Monday, August 26, 2013

Oracle 12c OFFSET...FETCH.. bug?

Working on the chapter that covers best ways of manipulating with object collection. Of course, if you talk about pagination of the rowset, it is very tempting to test Oracle 12c new feature - OFFSET…FETCH.

Unfortunately, immediately got an issue. It seems to me that passing a variable to FETCH is not supported:

SQL> declare
  2      type table_tt is table of employees%rowtype;
  3      v_tt table_tt;
  4
  5      v_limit_nr number:=10;
  6      v_counter_nr number:=0;
  7  begin
  8      select *
  9      bulk collect into v_tt
 10      from employees
 11      offset v_counter_nr*v_limit_nr rows
 12      fetch next v_limit_nr rows only; -- variable 
 13  end;
 14  /
    type table_tt is table of employees%rowtype;
                              *
ERROR at line 2:
ORA-03113: end-of-file on communication channel
Process ID: 3060
Session ID: 20 Serial number: 35307
SQL>

Doing the same with hard-coded value works fine:

 ... 
 12      fetch next 10 rows only; -- hard-code
 13  end;
 14  /
PL/SQL procedure successfully completed.
SQL>

Couldn’t find any reference to this anywhere, so it looks like a bug. Before I log it, could anybody confirm/deny the issue for different environment?

I am running:
- the official release of Oracle 12c
- on MS Windows 2012 (64-bit)
- on the top of VMWare ESXi 5.1
- VMX-9 version of the file

Update: got that issue confirmed as a bug #17404511. Thanks a lot to everybody who provided their test-cases. Will keep you posted.

Staying quiet... For a good reason :-)

Please, don't worry - I didn't suddenly disappear from the surface of the world of databases. It just happened that after some years of staying outside of publishing business I've been recruited to make a new book.The tentative name/dates are: PL/SQL Performance Tuning Guide: Tips and Techniques (Oracle Press, ETA late spring 2014).

Good news: my publisher authorized me to use interesting findings as spoilers. So, prepare yourselves for new discoveries in the land of PL/SQL!