Tuesday, June 9. 2009
ORA-04031 Errors and Monitoring ... Posted by Barb Matthews in 11g at 07:00
ORA-04031 Errors and Monitoring Shared Pool Subpool Memory Utilization with sgastatx.sql by Tanel Poder
Don't miss Tanel Poder's upcoming seminar Advanced Oracle Troubleshooting Seminar, to be held in Denver June 15-17. If you're on the fence about whether to take this intensive class, take a look at some class feedback.
Since Oracle 9.2 the shared pool can be “partitioned” into multiple parts. This was probably done for relieving shared pool latch contention for crappy applications (which use shared pool latches too much due bad cursor or connection management).
The “partitions” are called shared pool subpools and there can be up to 7 subpools. Each subpool is protected by a separate shared pool latch and each subpool has its own freelists and LRU list. If you are interested in more details, a good starting point is this whitepaper by Oracle.
There are a few different ways for detecting how many subpools you have in use. The more convenient ones are here:
You could query X$KGHLU which has a line for each shared pool subpool and (from 10g) also java pool if it’s defined:
SQL> select count(distinct kghluidx) num_subpools
2 from x$kghlu
3 where kghlushrpool = 1;
The “kghlushrpool” column, which is 1 for shared pool subheaps and 0 for java pool, isn’t there in 9i (and in 9i the java pool apparently is not reported in x$kghlu anyway).
The reason I don’t just count all matching lines from x$kghlu but use count distinct instead is that in Oracle 10.2.0.1 there are 4x more lines reported in this x$table. There’s an additional concept called sub-sub-pool starting from 10.2 where each shared pool sub-pool is split futher into 4 areas (allocations with different expected lifetime/durations go into different sub-sub-pools, but the same sub-pool latch protects all activity in sub-sub pools too). But in 10.2.0.1 the x$kghlu reports all sub-sub-pools too for some reason. The whitepaper from Oracle mentioned above explains this in more detail.
So from above output I see that in my instance all 7 shared pool subpools are in use. Oracle determines the number of needed subpools (during instance startup) based on your shared pool size and cpu_count. IIRC in 9.2 if you had 4 CPUs or more AND the shared_pool_size was bigger than 256 MB then 2 subpools were used, in 10g shared_pool_size had to be bigger for that, 512 MB I think and in 11g its 1GB. I don’t recall the exact threshold values and that’s not really important as you can see yourself how many subpools are in use with the above query.
For the sake of this experiment, I set the _kghdsidx_count variable to 7, this parameter can be used to force the number of subpools you want. In 9.2 days it was actually quite common to set this back to 1 IF you had ORA-4031 errors AND the reason was diagnosed to be free space imbalance between subpools. However since 10g this has been almost unnecessary as Oracle has improved their heap management algorithms.
SQL> @pd kghdsidx
NAME VALUE DESCRIPTION
--------------------------------------------- ------------------------------ ------------------
_kghdsidx_count 7 max kghdsidx count
So far the two above approaches have required access to X$ tables which usually means you need to be logged on as SYSDBA. What if you don’t have such access?
In such case you can work this out pretty reliably by looking into how many of the shared pool latches are actually in use. All 7 latches are always there, even if you have less subpools in use, that number is hardcoded into Oracle. But you can see how many latches have a significant number of gets against them.
In my case it's evident that all latches are in use; they all have a significant number of gets against them:
SQL> select child#, gets
2 from v$latch_children
3 where name = 'shared pool'
4 order by child#;
7 rows selected.
It’s ok to see some latch gets against the latches of unused subheaps, but this number should be much much smaller than others. The reason appears to be that all subheap latches are taken when shared pool is allocated and when shared pool resize operations are done.
For example, this is what I see after setting the number of shared pool subpools to 2 in my test database (and running some hard parsing workload):
And now to the troubleshooting part!
Note that this article doesn’t aim to explain all the basics of ORA-4031 troubleshooting, I’ll talk about the subpool utilization imbalance problem only. If you haven’t read MetaLink Note 396940.1 - “Troubleshooting and Diagnosing ORA-4031 Error” yet, I recommend to do this first and then read my comments here.
As you know, ORA-4031 errors look like this:
ORA-04031: "unable to allocate n bytes of shared memory ("shared pool", "object_name", "alloc type(2,0)" ...)
“n” shows how many bytes we tried to allocate when ended up with the failure. Italic strings can show various different values but essentially they’re just some metadata describing for what did we try to allocate that memory.
Note the two bold pieces. The “shared pool” means that we tried to make the allocation from shared pool (if you have problems with other pools you can see there “large pool”, “streams pool”, “java pool” as well).
The “2″ in “(2,0)” means that the failure happened in shared pool sub pool number 2 and the “0″ shows sub-sub-pool number 0.
Sometimes the error happens just due heavily undersized shared pool (combined bad cursor management or some incorrect parameter values). In such cases you would see the shared pool free memory drop to near-zero in V$SGASTAT.
However, sometimes you can have ORA-4031’s even when you see plenty of free space available in V$SGASTAT. What’s the issue with that?
This case happens mainly for two reasons:
1) Shared pool free memory fragmentation
There is no big enough free chunk available even after flushing out unpinned chunks from LRU list. In other words, you have a lot of small free chunks scattered around in different places in shared pool but there is no single big enough chunk available for acommodating our allocation. I will talk about troubleshooting this problem in a separate post.
2) Unbalanced memory usage / free memory in different shared pool subpools
This is what I’m explaining in this post.
So, how to monitor which subpool has how much free memory available?
V$SGASTAT unfortunately just shows a sum of all subpools:
SQL> select from v$sgastat
2 where pool = 'shared pool'
3 and name = 'free memory';
POOL NAME BYTES
------------ -------------------------- ----------
shared pool free memory 188017360
However, when we look into the source code of GV$SGASTAT we see this (output edited for readability):
SQL> @v gv$sgastat
SQL> desc x$ksmss
There’s an interesting column, KSMDSIDX column - and it’s also what I was looking for! This column contains the subpool number in it!
Before I query by that column, remember that I had set the number of subpools back to 2 in my test instance:
SQL> @pd kghdsidx
Ok, lets see what values we have in that column:SQL> select distinct ksmdsidx from x$ksmss;
Hmm… 1 and 2 look ok as I have subpool 1 and 2 defined in the instance, but why is there a subpool 0 also reported? (from 10g anyway).
This is due to a little feature in Oracle. When you start the instance in 10g, then not all memory reserved for shared pool is immediately given to subpool heaps. Some memory is reserved for individual subpool growth. This allows some subpools to grab more memory than others if they have more allocations after instance startup. This may be useful in cases where due some specific issue some subpool always needs much more memory than others. On the other hand, I have not seen a subpool heap give memory back to some other subpool so if one subpool allocates all of the reserved memory after instance start due some application startup activity, then the other pools may remain too small for the whole lifetime of the instance.
So, if you have ORA-4031 out of shared pool memory errors or suspect that shared pool memory pressure is the cause of some performance problem (like shared pool latch contention and excessive library cache evictions/reloads) then you’d want to monitor shared pool memory breakdown at the subheap level.
And (finally) I can introduce a little script sgastatx.sql which queries X$KSMSS and formats the output for better readability.
The script takes one parameter, what memory allocation reasons to report (% would report all):
I will start with “total” which just reports me the shared pool totals and doesn’t break down by allocation reason.
SQL> @sgastatx total
The bold part above is the total memory reserved for shared pool. The “0 - unused” is the not-yet-used-for-any-subheap part of the memory. And 1 & 2 are the allocations to subheap 1 and 2.
By the way I can confirm these numbers by querying v$sgainfo:
SQL> select from v$sgainfo;
SQL> select current_size from v$sga_dynamic_components where component = 'shared pool';
Continuing with the examples, usually I would be interested in seeing how much free memory each subpool has in it:
SQL> @sgastatx "free memory"
And when there’s not enough free memory in some subpool then you can run sgastatx with % parameter to report all memory users in that subpool. I removed some lines from output for brevity:SQL> @sgastatx %
Of course sometimes you’d want to know how the memory usage breakdown changes over time, for that you’d need to write a little collector script which dumps the data into some table and visualize it later on, like I have done for regular V$SGASTAT data with my PerfSheet tool
Friday, April 3. 2009
Another Great Book by Robert Freeman ... Posted by Barb Matthews in 11g, Books at 06:51
Robert's latest technical book about Oracle is available. It's called OCP Oracle Database 11g Certified Professional Study Guide. It's a prep guide for the Oracle Database 11g OCP Exam (Exam 1Z0-053). Robert will be teaching a corresponding class, so watch our training schedule for an announcement.
Check out this and other great books on our books link.
Tuesday, March 31. 2009
Oracle 11g: Reading alert log via ... Posted by Barb Matthews in 11g at 06:08
Oracle has done some major improvements in the diagnosability infrastructure in version 11g. Here’s one little detail.
Before Oracle 11g it is possible to access the alert log via SQL using an external table or a pipelined function which in turn uses utl_file.
After reading the text you need to parse it to extract the information you need from there.
Starting from 11g Oracle does all this work for you. There is a fixed table X$DBGALERTEXT, when you query it, Oracle reads the log.xml from alert directory (which contains all the data what alert.log does), parses it and returns the details back as rows:
SQL> select message_text from X$DBGALERTEXT where rownum <= 20;
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Shared memory segment for instance monitoring created
Picked latch-free SCN scheme 2
Using LOG_ARCHIVE_DEST_10 parameter default value as USE_DB_RECOVERY_FILE_DEST
Autotune of undo retention is turned on.
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up ORACLE RDBMS Version: 126.96.36.199.0.
Using parameter settings in client-side pfile /u01/app/oracle/admin/LIN11G/pfile/init.ora on machine linux03
System parameters with non-default values:
processes = 150
memory_target = 404M
control_files = "/u01/oradata/LIN11G/control01.ctl"
control_files = "/u01/oradata/LIN11G/control02.ctl"
control_files = "/u01/oradata/LIN11G/control03.ctl"
db_block_size = 8192
20 rows selected.
This is the text representation, but you can get individual details from other columns as listed below:
SQL> desc X$DBGALERTEXT
Name Null? Type
------------------------------- -------- ----------------------------
1 ADDR RAW(4)
2 INDX NUMBER
3 INST_ID NUMBER
4 ORIGINATING_TIMESTAMP TIMESTAMP(3) WITH TIME ZONE
5 NORMALIZED_TIMESTAMP TIMESTAMP(3) WITH TIME ZONE
6 ORGANIZATION_ID VARCHAR2(64)
7 COMPONENT_ID VARCHAR2(64)
8 HOST_ID VARCHAR2(64)
9 HOST_ADDRESS VARCHAR2(16)
10 MESSAGE_TYPE NUMBER
11 MESSAGE_LEVEL NUMBER
12 MESSAGE_ID VARCHAR2(64)
13 MESSAGE_GROUP VARCHAR2(64)
14 CLIENT_ID VARCHAR2(64)
15 MODULE_ID VARCHAR2(64)
16 PROCESS_ID VARCHAR2(32)
17 THREAD_ID VARCHAR2(64)
18 USER_ID VARCHAR2(64)
19 INSTANCE_ID VARCHAR2(64)
20 DETAILED_LOCATION VARCHAR2(160)
21 PROBLEM_KEY VARCHAR2(64)
22 UPSTREAM_COMP_ID VARCHAR2(100)
23 DOWNSTREAM_COMP_ID VARCHAR2(100)
24 EXECUTION_CONTEXT_ID VARCHAR2(100)
25 EXECUTION_CONTEXT_SEQUENCE NUMBER
26 ERROR_INSTANCE_ID NUMBER
27 ERROR_INSTANCE_SEQUENCE NUMBER
28 VERSION NUMBER
29 MESSAGE_TEXT VARCHAR2(2048)
30 MESSAGE_ARGUMENTS VARCHAR2(128)
31 SUPPLEMENTAL_ATTRIBUTES VARCHAR2(128)
32 SUPPLEMENTAL_DETAILS VARCHAR2(128)
33 PARTITION NUMBER
34 RECORD_ID NUMBER
There’s also a fixed table X$DBGDIREXT, which returns all file and directory names under [diagnostic_dest]/diag directory:
SQL> select lpad(' ',lvl,' ')||logical_file file_name
2 from X$DBGDIREXT
3 where rownum <=20;
20 rows selected.
If you’re building some custom alert log monitoring, then starting from 11g these x$ tables can help you. On the other hand, I would prefer to monitor logfiles using plain and simple scripts as accessing this X$ table requires the instance to be up and operational. The better use case I see is that if you don’t have access to OS filesystem and nevertheless want to see alert log contents, that can be the easiest option in 11g…
Thursday, November 6. 2008
Oracle Database 11g New Features - ... Posted by Barb Matthews in 11g at 16:03
You've missed the chance to take Robert Freeman's Advanced Backup and Recovery Using Oracle 10g RMAN (for now, anyway). The good news is, Robert will be teaching again, this time on a topic that folks are finally ready to sink their teeth into - Oracle Database 11g New Features. If you've been debating whether you should migrate to RDBMS 11g, take Robert's class and find out all the new features that make it worth the upgrade effort.
Robert will teach this class in Salt Lake City from February 23-27, and in Denver from March 2-5, so check out the training schedule and sign up before the classes fill! Students will receive a complimentary copy of Robert's book Oracle Database 11g New Features. Classes that Robert teaches include: Backup & Recovery Using Oracle 10g RMAN, Oracle 10gR2 DBA New Features, Oracle 10gR2 Database Administration I, and Oracle Database 11g New Features.
Robert has a new book coming out! Check out OCP: Oracle Database 11g Administrator Certified Professional Study Guide You can pre-order it on Amazon for a reduced price.
And while we're at it, here's some insight from Robert about what it's like to write a book. Since Robert has written 13 books, he ought to know!
Writing books is a love hate affair for me. Sometimes I love the work, sometimes I hate the work. For 3-4 months it's deadline, schedule, write, deadline, write, deadline, etc... then, there are the edits.
Every chapter you write goes to a technical editor and then a copy editor. These folks take your hard work and turn it on its ear. In the end, it makes a much better product, but it also tends to make authors sick (at least it does me sometimes). Once these daemons of the dark have done their work, you have to go back through, review their changes and questions and fix any mistakes. There is more to the process, but maybe I'll comment on later processes as we get on down the line.
Book writing is an amazing process, and it's a wonder, but mistakes do still get though, sometimes big ones. Nothing makes me sicker than to pull a brand spanking new book out of the box, open it up and then find the first mistake. Ackkkkkk!!!!
Then, there are the reviews. Oh man, maybe that's a whole different discussion too. Suffice to say, your first sweet review makes for a great day. The first sour review makes you swear you will never write another word. They can be painful.
For those of you who think that those of us who write get rich off of the deal, please think again. I've written 13 books now (I think).... for the first 4 I didn't see any royalties. Yes, I did get some advances, but if you figure out how many hours you write and divide it by the advances... well, I just calculated it out and it comes to maybe 15 to 20 dollars an hour (and thats being very liberal with how much time I spend per chapter). I can tell you that if you have a popular book, you can make more than that, but you don't sell books in the millions like Steven King. Royalty checks are nice, but there is never any guarentee.
So, why do I write? One, I like it. Two, I have to admit that the emails I get on a regular basis about my RMAN book in particualr are very motivating. I've had people tell me that my book pulled their database from certain death. That is very motivating. I also find it keeps me up on the latest and greatest. I think it's too easy to be complacent in our DBA jobs. Doing what we do, the normal way we do it. Well, the technology and the job is a-changin'... I want to be there as it does and be a part of it.
Finally, lest you think writing is easy work, I can probably find a writing project for you if you are glutting for punishment and can write. You will hate me after six months probably.
« previous page (Page 1 of 1, totaling 4 entries) next page »
Syndicate This Blog
Steven Chan about Collaborate 10 Database SIG Meeting by Barbara Matthews
Sun, 09.05.2010 23:52
Barb, you take amazing notes. Thanks for sharing them with the members of our EBS communi ty who couldn't make it [...]
Jack about Installations Gone Wild - Guest Author Lon White
Fri, 04.09.2009 11:06
Lon, I am getting same RW-2 0019 error while installing Or acle Release 12 on Linux. I ha ve setup my network as D [...]
Dave Hiller about OAM Patch Search Responsibility - Guest Author, Kathy Duret
Mon, 09.02.2009 10:36
This is extremely helpful!