Tuesday, September 3, 2013

Unknown DATAPUMP functionality: getting the catalog from the file

To be fair, I am not spending all of my days writing the new PL/SQL book :-) I even work sometimes! And while solving real problems, I reasonably often find interesting things.

Today one of my colleagues asked me whether it is possible to look inside an existing DATAPUMP file. The file in question was a full dump of a client's database, and we needed to check for the existence of a very specific public synonym.

The question was perfectly valid, so I started the research. First, I found that IMPDP has the parameter SQLFILE=<filename>, which generates all DDL from the dump. It could have worked, but the prospect of digging through megabytes of textual data was not very promising, so I continued my research.
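For reference, such a call would look something like this (the file names here are just placeholders):

C:\>impdp system/system@testDB dumpfile=StrangeFile.dmp sqlfile=full_ddl.sql logfile=imp_sql.log full=y

Nothing is actually imported - the DDL that would have been executed is simply written into full_ddl.sql in the server directory (DATA_PUMP_DIR by default).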

Eventually I found a comment on one of the Oracle-related forums mentioning another IMPDP parameter: MASTER_ONLY=YES. We all know that every time you run IMPDP, it creates a special "master" table that contains the catalog of objects to be processed (with names like SYS_IMPORT_TABLE_##, SYS_IMPORT_SCHEMA_##, SYS_IMPORT_FULL_##). As it turns out, this catalog is (mostly) human-readable! The parameter mentioned above just loads this table - and stops at that point:

C:\>impdp system/system@testDB dumpfile=StrangeFile.dmp logfile=imp_master.log master_only=yes full=y

Import: Release 11.2.0.3.0 - Production on Tue Sep 3 13:09:15 2013

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit
Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
Master table "SYSTEM"."SYS_IMPORT_FULL_01" successfully loaded/unloaded
Job "SYSTEM"."SYS_IMPORT_FULL_01" successfully completed at 13:09:32
C:\>

Now the table is ready to be examined! The following query shows the most important data:

select distinct
  object_schema,
  object_name,
  object_type,
  object_tablespace,
  process_order,
  duplicate,
  processing_status,
  processing_state
from system.SYS_IMPORT_FULL_01
where process_order > 0
and object_name is not null
order by object_schema, object_name

Just to make sure everybody understands the limitations here: this method does not give you the DDL of the objects. It only lets you find out which objects exist in the dump file. So, there will be no more guessing, which is always a good thing!
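For example, the check for our public synonym boiled down to something as simple as this (the synonym name below is, of course, made up):

select object_schema, object_name, object_type, processing_status
from system.SYS_IMPORT_FULL_01
where object_type like '%SYNONYM%'
and object_name = 'MY_PUBLIC_SYNONYM';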

By the way, getting the DDLs is now also trivial (a sketch follows the list):
* take the required object names and put them into the INCLUDE clause
* add the SQLFILE clause to generate the DDLs instead of firing them
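A rough sketch of those two steps (the object name and file names here are illustrative; depending on the import mode, the object path may have to be spelled differently - the view DATABASE_EXPORT_OBJECTS lists the valid paths). Putting the parameters into a parameter file, say extract_ddl.par, avoids the command-line quoting headaches:

dumpfile=StrangeFile.dmp
sqlfile=synonym_ddl.sql
full=y
include=SYNONYM:"IN ('MY_PUBLIC_SYNONYM')"

C:\>impdp system/system@testDB parfile=extract_ddl.par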

Monday, August 26, 2013

Oracle 12c OFFSET...FETCH... bug?

I am working on the chapter that covers the best ways of manipulating object collections. Of course, when you talk about pagination of a rowset, it is very tempting to test the new Oracle 12c feature - OFFSET...FETCH.

Unfortunately, I immediately ran into an issue. It seems that passing a variable to FETCH is not supported:

SQL> declare
  2      type table_tt is table of employees%rowtype;
  3      v_tt table_tt;
  4
  5      v_limit_nr number:=10;
  6      v_counter_nr number:=0;
  7  begin
  8      select *
  9      bulk collect into v_tt
 10      from employees
 11      offset v_counter_nr*v_limit_nr rows
 12      fetch next v_limit_nr rows only; -- variable 
 13  end;
 14  /
    type table_tt is table of employees%rowtype;
                              *
ERROR at line 2:
ORA-03113: end-of-file on communication channel
Process ID: 3060
Session ID: 20 Serial number: 35307
SQL>

Doing the same with a hard-coded value works fine:

 ... 
 12      fetch next 10 rows only; -- hard-code
 13  end;
 14  /
PL/SQL procedure successfully completed.
SQL>

I couldn't find any reference to this anywhere, so it looks like a bug. Before I log it, could anybody confirm/deny the issue in a different environment?

I am running:
- the official release of Oracle 12c
- on MS Windows 2012 (64-bit)
- on the top of VMWare ESXi 5.1
- VMX-9 version of the file
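In the meantime, the classic pre-12c pagination pattern still works, so here is the rough workaround sketch I am using for now (same variables, same EMPLOYEES table; ordering by EMPLOYEE_ID is just for the example):

declare
    type table_tt is table of employees%rowtype;
    v_tt table_tt;

    v_limit_nr   number:=10;
    v_counter_nr number:=0;
begin
    -- emulate OFFSET/FETCH via ROW_NUMBER until the bug is fixed
    select *
    bulk collect into v_tt
    from employees
    where employee_id in
          (select employee_id
           from (select employee_id,
                        row_number() over (order by employee_id) rn
                 from employees)
           where rn >  v_counter_nr*v_limit_nr
           and   rn <= (v_counter_nr+1)*v_limit_nr);

    dbms_output.put_line('Fetched: '||v_tt.count||' rows');
end;
/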

Update: the issue has been confirmed as bug #17404511. Thanks a lot to everybody who provided their test cases. I will keep you posted.

Staying quiet... For a good reason :-)

Please, don't worry - I didn't suddenly disappear from the surface of the world of databases. It just happened that after some years of staying outside of the publishing business, I've been recruited to write a new book. The tentative name/dates are: PL/SQL Performance Tuning Guide: Tips and Techniques (Oracle Press, ETA late spring 2014).

Good news: my publisher authorized me to use interesting findings as spoilers. So, prepare yourselves for new discoveries in the land of PL/SQL!

Tuesday, April 30, 2013

Database roles - 2. Big...Mistakes?

Looks like my Friday post about BigData got some attention. In that case - let's continue! While looking at tons of articles about Big-anything, I've noticed a consistent pattern - everybody is talking about multi-node clusters of database servers: you constantly hear about horizontal scalability, sharding, node failure tolerance, etc. But let's ask the question - is it only me who feels that we are over-complicating our development patterns? Maybe we just don't know how to properly use the available hardware resources?

Before going any further, let me tell you a "war story": a couple of years ago, a system built by us at Dulcian was compared to a similar system (similar in both scope and functionality). At one of the tender kick-off meetings, our competitors loudly accused us of lying in the proposal! The reasoning was very simple - we claimed that our current required hardware was one database server (16 cores total) utilized at 15% and two application servers (4 cores each) utilized at 5%. For our competitors such a footprint was plainly impossible - to do exactly the same job they had been burning through a couple of racks of servers!
Yeah, sure - on request our servers were counted, and the count matched what was in the proposal!

That's the price to pay for architectural mistakes: for "them" it was the "new norm" to require a separate application server for every 100 users - and for years nobody questioned that ratio! So, let me ask - did our IT industry fall into the same trap and become accustomed to bad performance? Did we forget to wonder WHY we need that extra app server for every 100 users?

As far as I can see from attending different conferences, current software/database architects prefer to have the same problems as their neighbors (a pretty good "get-out-of-jail" card to show to management, isn't it?) rather than doing proper tuning of their systems. It is very rare for a company to do a real, honest-to-God performance review - and even if it happens, it usually comes at the wrong time and against the will of the current IT staff.

Recently I talked to a number of top specialists who make a living out of fixing somebody else's systems - and all of them repeat the same story:
  • Once upon a time there was an IT system that performed badly
  • A "contemporary architect" proposed a complete rewrite of the whole system using "the new way"
  • Management looked at the cost of the rewrite and went ballistic!
  • Somebody on the IT staff finally proposed hiring a performance tuning expert/guru-DBA - at least to be able to say next time that they had tried every possible avenue!
  • The "hired gun" found a lot of "interesting things" within a couple of weeks (wasting about 3/4 of the time fighting uncooperative locals). Some of those issues were fixable - some of them were not. But still, the system started working faster (and often faster than anybody expected).
  • The "hired gun" got a big check (and curses behind his back from the IT staff)
  • The "contemporary architect" got a pink slip (and also curses behind his back from IT management)
As you see, nobody is happy at the end! But the whole situation started because the local IT specialists were not able to correctly evaluate the existing resources and solutions. Yes, we all would like to be trend-compliant, but jumping into a system overhaul without a good reason is as dangerous as falling behind the technology curve.

Summary: your IT may not be as big as you think!


Monday, April 29, 2013

Dynamic SQL and RESULT_CACHE

Over the weekend one of my friends asked me for a clarification on RESULT_CACHE - a really interesting feature that allows Oracle to cache the results of PL/SQL function calls in the SGA.

The point of confusion was the fact that between 11gR1 and 11gR2 the Oracle development team significantly rethought this feature: in R1, developers needed to explicitly specify the objects to be monitored for cache invalidation (via the RELIES_ON clause), while starting with R2, Oracle takes care of it automatically (the RELIES_ON clause is completely ignored).
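Just to illustrate the difference, here is a minimal sketch (the table and function names are only examples) of what the 11gR1 syntax looked like - in 11gR2 the very same RELIES_ON clause is parsed and silently ignored, because the dependency tracking became automatic:

create or replace function f_getName_tx (i_empno number)
return varchar2
result_cache relies_on (emp) -- needed in 11gR1 to get invalidation on EMP changes; ignored starting with 11gR2
is
    v_out_tx varchar2(256);
begin
    select ename into v_out_tx from emp where empno = i_empno;
    return v_out_tx;
end;
/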

Interestingly enough, Oracle will monitor not only direct dependencies! If there are calls to other objects via Dynamic SQL inside the function, those will be monitored too. Last year I had a whole section on this topic at ODTUG KScope'12 ("Dynamic SQL in 11g World")! Here is a snippet from my white-paper:

Integration of Dynamic SQL and RESULT_CACHE

Another good thing about the current implementation of Dynamic SQL is that Oracle’s PL/SQL team actively integrates it with other advanced features. For example, the “result cache” introduced in 11gR1 (and significantly rewritten in 11gR2) is smart enough not only to auto-detect hard-coded dependencies, but also to recognize and record on-the-fly calls made via Dynamic SQL. The following example includes a function that gets the current row count of a given table:

create or replace function f_getCount_nr (i_tab_tx varchar2)
return number
result_cache
is
    v_sql_tx varchar2(256);
    v_out_nr number;
begin
    execute immediate 
      'select count(*) from '||i_tab_tx into v_out_nr;
    return v_out_nr;
end;

Step #1 is to confirm that (a) the result cache actually works and (b) it recognized the on-the-fly dependency.

SQL> select f_getCount_nr('EMP') from dual;
F_GETCOUNT_NR('EMP')
--------------------
                  14

SQL> select ro.id,  ro.name,  do.object_name
  2  from   v$result_cache_objects    ro,
  3         v$result_cache_dependency rd,
  4         dba_objects               do
  5  where  ro.id = rd.result_id
  6  and    rd.object_no = do.object_id;

ID NAME                                                   OBJECT_NAME
-- -----------------------------------------------------  ---- 
1  "SCOTT"."F_GETCOUNT_NR"::8."F_GETCOUNT_NR"#8440... #1  EMP
1  "SCOTT"."F_GETCOUNT_NR"::8."F_GETCOUNT_NR"#8440... #1  F_GETCOUNT_NR

SQL>select f_getCount_nr('EMP') from dual;
F_GETCOUNT_NR('EMP')
--------------------
                  14

SQL> select *
  2  from v$result_cache_statistics
  3  where  name in ('Create Count Success','Find Count');
ID  NAME                 VALUE
--- -------------------- ------
5   Create Count Success 1
7   Find Count           1
SQL>

Oracle successfully recognized the EMP table as a cache dependency and was able to return the value from the cache when the function was called a second time. Now, to test cache invalidation, I will insert a new row into the EMP table and re-fire the function F_GETCOUNT_NR:

SQL> insert into emp(empno) values (100);
1 row created.
SQL> commit;
Commit complete.

SQL> select f_getCount_nr('EMP') from dual;
F_GETCOUNT_NR('EMP')
--------------------
                  15

SQL> select id, name, value
  2  from v$result_cache_statistics
  3  where  name in ('Create Count Success',
  4                   'Find Count','Invalidation Count');

ID  NAME                 VALUE
--- -------------------- ------
5   Create Count Success 2
7   Find Count           1
8   Invalidation Count   1
SQL>

This time, Oracle successfully detected the data change and invalidated the previously cached information. Now let's introduce a new dynamic dependency (on the DEPT table) and see whether the result cache successfully recognizes the difference.

SQL> select f_getCount_nr('DEPT') from dual;
F_GETCOUNT_NR('DEPT')
---------------------
                    4
SQL> select id,  name,  object_name
  2  from   v$result_cache_objects    ro,
  3         v$result_cache_dependency rd,
  4         dba_objects               do
  5  where  ro.id = rd.result_id
  6  and    rd.object_no = do.object_id;

ID  NAME                                                            OBJECT_NAME
--- --------------------------------------------------------------- -------------
3   "SCOTT"."F_GETCOUNT_NR"::8."F_GETCOUNT_NR"#8440831613f0f5d3 #1  EMP
3   "SCOTT"."F_GETCOUNT_NR"::8."F_GETCOUNT_NR"#8440831613f0f5d3 #1  F_GETCOUNT_NR
4   "SCOTT"."F_GETCOUNT_NR"::8."F_GETCOUNT_NR"#8440831613f0f5d3 #1  DEPT
4   "SCOTT"."F_GETCOUNT_NR"::8."F_GETCOUNT_NR"#8440831613f0f5d3 #1  F_GETCOUNT_NR

SQL> select id, name, value
  2  from v$result_cache_statistics
  3  where  name in ('Create Count Success',
  4                   'Find Count','Invalidation Count');

ID  NAME                 VALUE
--- -------------------- ------
5   Create Count Success 3
7   Find Count           1
8   Invalidation Count   1
SQL>

As you can see, a variation of the cached result with a dependency on DEPT (rather than EMP) was immediately recognized. This means that Dynamic SQL is indeed fully integrated into the overall on-the-fly caching mechanism.
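On a side note: while experimenting with these examples, it is handy to be able to reset the cache between runs. The standard DBMS_RESULT_CACHE package covers that - a quick sketch (run it as a suitably privileged user):

set serveroutput on

begin
    dbms_result_cache.flush;  -- wipe out all cached results (and, by default, the statistics)
end;
/

begin
    dbms_result_cache.memory_report(detailed => true);  -- print a cache usage report via DBMS_OUTPUT
end;
/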

Friday, April 26, 2013

Database roles: BigData, BigUsers, Big...Problems?

Today at lunch we had an interesting discussion about the role of databases in contemporary IT development - and there were a lot of buzzwords thrown across the table: Hadoop, NoSQL, BigData, BigUsers (reasonably small data + a high number of concurrent users), ACID compliance, etc. To be fair, a bit too many buzzwords - I have a very bad feeling that even contemporary architects have stopped understanding the core concepts behind their work!

Let's start from the key point - we have databases to manage DATA. And one of the key elements of this task is to make sure that data is reliably stored and retrieved. And here is the catch - what do we mean by reliable? Or, to be precise - what happens to you/your company/your customers if some piece of the data is lost forever/unrecoverable? The answer to this question drives the whole technology stack! For example, if you work with medical/legal/official data, a small chunk of lost information (if noticed) could mean litigation at best and people's lives at worst!

Let's be clear - the majority of current NoSQL DB solutions are explicitly not ACID-compliant (or at least not 100% ACID-compliant). For example, I found a pretty good analysis of MongoDB and CouchDB - and it is clear that even their proponents say that there are always trade-offs between performance and data reliability. Some articles even suggest a double-environment implementation, where you have a NoSQL database for non-critical data plus an RDBMS for critical data.

Just to clarify, here is what I mean by ACID compliance:
  • Atomicity requires that each transaction is executed in its entirety, or fails without any change being applied.
    • I.e. if you have a successful INSERT and a successful DELETE in the same transaction, you will have both or neither of them committed (see the sketch right after this list).
  • Consistency requires that the database only passes from one valid state to the next, without intermediate points.
    • I.e. it is impossible to catch the database in a state where some rules (for example, a PK) are not yet enforced for the stored data.
  • Isolation requires that if transactions are executed concurrently, the result is equivalent to their serial execution. A transaction cannot see the partial result of another one.
    • I.e. each transaction works in its own realm until it tries to commit the data.
  • Durability means that the result of a committed transaction is permanent, even if the database crashes immediately afterwards or in the event of a power loss.
    • I.e. it is impossible to have a situation where the application/user thinks the data is committed, but after a power failure it is gone.
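To make the atomicity point a bit more tangible, here is a minimal PL/SQL sketch (against the demo EMP table; the values are made up): either both changes become permanent together, or the exception handler makes sure that neither of them is applied.

begin
    insert into emp (empno, ename) values (9999, 'TEMP');
    delete from emp where empno = 7934;
    commit;           -- both changes are made permanent together
exception
    when others then
        rollback;     -- ... or neither of them survives
        raise;
end;
/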
As we can see from that list, all of these requirements are technically very challenging to implement, especially with a high number of concurrent users and significant data volumes - that's why Oracle went to such extremes with its UNDO/REDO/log mechanisms. But that's the price to pay for being sure that if you saved the data, it would NEVER disappear.

I understand that there are environments where that small chance of data loss can be, if not ignored, then at least tolerated: we all know that Craigslist runs on MongoDB - so, what's the impact of one lost ad? Somebody might get annoyed, but that's all!

Although, when I start hearing about medical systems being built on NoSQL solutions, I start to get nervous. Maybe in a couple of years, before going to the doctor, I will first check what kind of software they use! Just to feel safer...

Wednesday, April 24, 2013

Performance tuning. Spending time is NOT OK (if you do not know exactly why)

Yet another performance tuning story, similar to the one that happened about a month ago. The sad thing is that something was coded years ago and we never questioned the time it spent... Just to keep it short:
  • once upon a time, there was a very time-consuming module.
  • eventually we were forced to take a much closer look at WHY it was so time-consuming.
  • we found, in one of our own views, a function call that was completely unnecessary.
  • this function was so light that initially we ignored it altogether, but later we realized that we had been calling it 80,000 times - and at that point even the lightest cost adds up.
  • removing this function (all the needed data was already available via joined tables) took the cost of a class from 40 seconds down to 3.
  • Profit! :-)
We knew that the module was extremely complicated and that a lot of database operations were involved - so we assumed that the time it took was absolutely fine! Of course, we hit a case where assumptions are the worst enemies of real knowledge... Especially when we are talking about performance tuning - because unless we have proven where exactly we are losing N seconds in module X, we cannot say that module X cannot take less than N seconds. Those N seconds could be related to a different module, an outdated coding technique, a structural change, etc. Only by knowing what exactly is going on can we decide whether it is OK or not.

Summary: A couple of lessons learned:
  1. If something does not perform well in the existing system - check it (unless it is something known and documented)! You may think that the time loss is inevitable, but who knows?
  2. Function calls in SELECT statements can cause a lot of issues even if each call is light - you must think not only about the cost of a single function call, but also about how many times this function will be fired.
By the way, at the recent IOUG Collab'13, in the presentation "Top 5 Issues that Cannot be Resolved by DBAs (other than missed bind variables)", I covered the second topic (number of calls) in a bit more depth. Below is a snippet from my white-paper:

Number of calls in SELECT clause

There are multiple ways of ensuring that a function referenced in the SELECT clause is not fired more often than needed. Unfortunately, few developers are even aware of this problem. My recommendation is to include the following set of examples in any PL/SQL class, because they explicitly illustrate the difference between real understanding and guessing.

First, set up a basic environment to count the total number of calls: a package variable to store the counter, a simple function, and a checker procedure to display/reset the counter:

create or replace package misha_pkg is
    v_nr number:=0;   -- counter of function calls
end;

create or replace function f_change_tx (i_tx varchar2)
return varchar2 is
begin
    misha_pkg.v_nr:=misha_pkg.v_nr+1;   -- register the call
    return lower(i_tx);
end;

create or replace procedure p_check is
begin
     dbms_output.put_line('Fired:'||misha_pkg.v_nr);
     misha_pkg.v_nr:=0;   -- reset the counter after displaying it
end;

Second, run a very simple query against the table EMP, where the function above is applied to EMP.JOB. Let us keep in mind that there are 14 rows in total in the table EMP:

SQL> select empno, ename, f_change_tx(job) job_change_tx
  2  from emp;
   ...
14 rows selected.
SQL> exec p_check
Fired:14
PL/SQL procedure successfully completed.

If you just use the function, it will be fired for every row. But we know that there are only 5 distinct JOB values, so we should try to decrease the number of calls. In Oracle 11gR2, there is a very interesting internal operation called “scalar sub-query caching” that is used while processing SQL queries. It allows Oracle to internally reuse previously calculated results of scalar sub-queries if they are invoked multiple times within the same query. The following example tests whether using this operation helps:

SQL> select empno, ename, (select f_change_tx(job) from dual)
  2  from emp;
  ...
14 rows selected.
SQL> exec p_check
Fired:5
PL/SQL procedure successfully completed.
SQL>

The result shows that it did help. Now only five distinct calls are registered, which is exactly as needed. Although, since we are discussing caching, why not use it explicitly? There is another very powerful feature called the “PL/SQL function result cache.” The following example enables it on the function, and then the same query is run twice:

create or replace function f_change_tx (i_tx varchar2)
return varchar2 result_cache is
begin
    misha_pkg.v_nr:=misha_pkg.v_nr+1;   
    return lower(i_tx);
end;

SQL> select empno, ename, f_change_tx(job) from emp;
...
14 rows selected.
SQL> exec p_check
Fired:5
SQL> select empno, ename, f_change_tx(job) from emp;
...
14 rows selected.
SQL> exec p_check
Fired:0

The result is impressive! While the first run matches the sub-query caching, the second run is a fantastic example of great performance tuning - everything works as needed, but nothing is being done (actually, this is not 100% true, since the cache still has to be retrieved, but for practical purposes it is a very simple PK lookup).
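And if you want to double-check that the second run was indeed served from the cache rather than from the function, the same statistics view used earlier in this blog tells the story (assuming the result cache was empty before the test):

select id, name, value
from v$result_cache_statistics
where name in ('Create Count Success','Find Count');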