Friday, May 24, 2024

The Power of Proactive Monitoring. Part 4

 In the previous post I introduced one of my favorite performance debugging tools - Hierarchical Profiler. Since Oracle engineers were so proactive that instrumented SQL and PL/SQL engines with a clear mechanism to understand who calls whom inside of the database, it is just great that we (as developers) can have access to the same level of internal information. That information indeed tells us complete and true story of what is happening - not guesses, not estimates, real fingerprints of every involved element.

Now let's get a bit more real and look at the problem that I bet a lot of people encounter if they start using user-defined PL/SQL functions within SQL - it is a very powerful technique, but it also can cause a lot of damage if used inappropriately/without understanding of the internals.

Imagine the following scenario:

  1. Our help-desk started to get a lot of performance complains related to the same module 
  2. Corresponding developer checked 10046 trance and couldn't find anything suspicious
  3. The ticket is elevated to a senior performance expect (well, me :) )
And the very first thing I notice about the core query is that is contains a user-defined PL/SQL function. I immediately ask the developer - do you KNOW how many times that function is executed during the normal activity? The answer I've got was: "Well, it shouldn't be too much..."

Let's pause for a sec. We are soo accustomed to Cost-Based optimizer to do various miracles - but the database engine works only as good as data we provided. And it is important to check whether its logic matches ours. Here was the clear case of my developer "assume" instead of "check". So, the right solution is to get some hard statistics: I wrapped the suspect SQL call into HProf start/stop (below is a simplified case):

SQL> exec dbms_hprof.start_profiling ('IO', 'HProf_Case1.txt');
SQL> declare
  2      v_tx varchar2(32767);
  3  begin
  4      select listagg(owner_tx,',') within group (order by 1)
  5      into v_tx
  6      from (
  7          select distinct scott.f_change_tx(owner) owner_tx
  8          from scott.test_tab
  9          );
 10  end;
 11  /
SQL> exec dbms_hprof.stop_profiling;

Just to be clear - the table TEST_TAB has 50000 rows with only 26 distinct owners, and the function F_CHANGE_TX does basic formatting, nothing special... The HPROF report surprised everybody - 50K calls to the function!


This means that time was wasted on a very cheap function which was just fired lots and lots of times, because original developer just guessed the query behavior. He knew that the function would do basic formatting of the repeated value, so the output always will be the same - but he forgot to tell that to Oracle. And without that information the database fired mentioned above function for every row!

The solution was pretty simple - the query was rewritten to help CBO:

SQL> exec dbms_hprof.start_profiling ('IO', 'HProf_Case1_fix.txt');
SQL> declare
  2      v_tx varchar2(32767);
  3  begin
  4      select listagg(owner_tx,',') within group (order by 1)
  5      into v_tx
  6      from (
  7          select  scott.f_change_tx(owner) owner_tx
  8          from (select distinct owner 
  9                 from scott.test_tab)
 10           );
 11  end;
 12  /
SQL> exec dbms_hprof.stop_profiling

And here is a proof: 26 function calls, overall - 28 times faster!


IMHO, that was simple enough and straightforward enough illustration of how simple it is to find a problem if you have the right tool! And it is also a great illustration of the power of knowing something (rather than guessing)
 




No comments: