Exadata networking
It continues to amaze me how a company can spend big bucks on a very complex system such as an Oracle Exdata, and expect a single DBA to administer it. In my opinion, it is at least a 2.5 person job:
1) DBA to administer Oracle databases, ASM and grid infrastructure
2) Systems administrator to maintain Linux/Solaris OS, users, local files systems, security, etc..
3) Network administrator. This is probably the most over looked responsibility. Exadata networking is quite complex and requires a competent network guru.
New book on SQLT
Congratulations to Stelios Charalambides on his new book about SQLT to be published. You can pre-order here: http://www.apress.com/9781430248095
Carlos Sierra has done a masterful job in creating the tool and maintaining it through the years. Although I haven’t seen Stelios’ book yet, my guess is that he will dissect SQLT and share some tips and tricks on how to get the most out it.
How to create in 5 minutes a SQL Tuning Test Case using SQLTXPLAIN
Cool video from Carlos Sierra on using SQLT test cases.
How to create in 5 minutes a SQL Tuning Test Case using SQLTXPLAIN.
Exadata Smart Flash Logging
I’ve had the privilege of applying BP12 and working with the new Smart Flash Logging feature. As some of you may already know, this feature allows for asynch redo writes to both spinning disks and flash cache for redo writes. The first write to complete wins and processing continues. Some key things to know about Smart Flash Logging are:
- You must be on 11.2.2.2.4 cell patch
- You must be using Oracle RDBMS 11.2.0.2.0 BP11 or later although support for BP9 is there with a hidden parameter.
- It is not a redo log mirror. It only mirrors the tail end of the redo logs. This allows for a small foot print at 512MB
- It is not a read cache. Redo log reads are done solely from spinning disks.
- Each 512MB log is divided into 32MB chunks across all 16 flash disks.
- Is used as a circular buffer and reused after data is verified to have been written to disk.
cellcli syntax:
CREATE FLASHLOG [ALL [FLASHDISK]] [attribute=value] [,attribute=value] ... Examples: CREATE FLASHLOG ALL CREATE FLASHLOG ALL SIZE=1G CREATE FLASHLOG CELLDISK='fd1,fd2' CREATE FLASHLOG CELLDISK='fd1,fd2' SIZE=1G LIST FLASHLOG [attibute_list] [DETAIL] LIST FLASHLOG LIST FLASHLOG DETAILS LIST FLASHLOG ATTIBUTES name, efficiency Note: Attributes can be name, cellDisk, creationTime, degradedCelldisks, effectiveSize, efficiency, id, size, status DROP FLASHLOG DROP FLASHLOG FORCE
Note: Using the FORCE option should be used with caution, and only if DROP FLASHLOG fails due to the existence of saved redo.
As you might imagine, there are also new metrics which can be viewed via cellcli LIST METRIC CURRENT where OBJECTTYPE=’FLASHLOLG’
If you are using this new feature, please do let us know about your experiences.
–Rich
Replacing an Exadata disk…things you need to check
I recently assisted a customer who had a predictive disk failure in his Exadata. A new disk was sent out and the failing disk was removed and the new disk inserted. The whole process should be as easy as that, but in this case it wasn’t. The celldisks, and griddisks that were part of the old disk had been recreated automatically, but the newly created griddisks were not added back to the ASM diskgroup. Working through Doc ID 1281395.1 for manually adding back celldisks and griddisks did the trick. We only needed to modify the ASM diskgroups and add the disks back in as follows:
alter diskgroup DATA_DG01 add disk ‘o/*/DATA_DG01_CD_05_hr01cel02′ rebalance power 11;
alter diskgroup FRA_DG01 add disk ‘o/*/FRA_DG01_CD_05_hr01cel02′ rebalance power 11;
alter diskgroup SYSTEMDG add disk ‘o/*/SYSTEMDG_CD_05_hr01cel02′ rebalance power 11;
We did this on 3 separate ASM instances so the rebalance would not be serialized. We also chose to do it at a time of day when database resource needs were the least.
Of course you should also check the progress of the rebalance by querying gv$asm_operation
Pet peeves I have with housekeeping
I cannot stand it when people do not take the time to clean up old trace files, trim excessive alert logs, audit logs, cores, etc….
On a test server I just reviewed I found gigabytes of junk:
[oracle@testdb01 trace]$ pwd
/u01/app/oracle/diag/asm/+asm/+ASM1/trace
[oracle@testdb01 trace]$ du -h .
1.3M ./cdmp_20110819001755
1.4M ./cdmp_20110911034245
1.6M ./cdmp_20110929142949
11G
I think some common sense housekeeping is in order(especially since the internal disk on Exadata db nodes are not particularly large).
[oracle@testdb01 trace]$ crontab -l
..
###############################################
#
# General housekeeping
#
###############################################
# Rotate the listener log
00 00 1 * * /usr/local/home/oracle/tools/lsnr_log_arch.ksh 2>&1
# Cleanup ASM trace files, core files and audit trails
13 02 * * * (cd /u01/app/oracle/diag/asm/+asm/+ASM1/trace && find . -name \*.trc -type f -mtime +90 -exec rm -f {} \;)
14 02 * * * (cd /u01/app/oracle/diag/asm/+asm/+ASM1/trace && find . -mtime +90 -type f -exec rm -f {} \;)
15 02 * * * (cd /u01/app/11.2.0/grid/rdbms/audit && find . -name \*.aud -type f -mtime +90 -exec rm -f {} \;)
….
This is by no means a complete housekeeping crontab, but it gives you the idea of what I’m talking about. Do you really need trace files older than 90 days? I’ve never seen a valid case for keeping them longer than a few weeks in my experiences. Your audit requirements might prevent you from removing audit logs, but you could probably sweep them to an archive destination on an NFS mount or send them to tape for longer term storage.
Just keep it clean. When Oracle Support asks for you to upload your alert logs and trace files, I doubt they want ALL of your trace files in an 11GB hairball/tar ball.
Expert Oracle Exadata….a must read.
Folks,
If you haven’t yet purchased a copy of Expert Oracle Exadata, you are missing out. This is a well written book by Kerry Osborne, Randy Johnson and Tanel Poder with commentary from Kevin Closson. I’ve read most of it and find the content to be very relevant with excellent examples and even free scripts used throughout.
crs_stat crsstat and crsctl
I’m still seeing a lot of folks using the Oracle supplied scripts for checking on the status of grid services. They work well, but I find them difficult to read due to formatting:
Example:
[root@testdb01 richtmp]# crs_stat -t Name Type Target State Host ------------------------------------------------------------ dbfs_mount local_resource ONLINE OFFLINE ora....105.vip ora....t2.type ONLINE OFFLINE ora.DATA.dg ora....up.type ONLINE ONLINE richtmp02 ora....ER.lsnr ora....er.type ONLINE ONLINE richtmp02 ora....N1.lsnr ora....er.type ONLINE ONLINE richtmp02 ora....N2.lsnr ora....er.type ONLINE ONLINE richtmp02 ora....N3.lsnr ora....er.type ONLINE ONLINE richtmp02 ora....DP.lsnr ora....er.type ONLINE ONLINE richtmp02 ora.RECO.dg ora....up.type ONLINE ONLINE richtmp02 ora....EMDG.dg ora....up.type ONLINE ONLINE richtmp02 ora....NG.lsnr ora....er.type ONLINE ONLINE richtmp02 ora.asm ora.asm.type ONLINE ONLINE richtmp02 ora....vc1.svc ora....ce.type ONLINE OFFLINE ora....vc2.svc ora....ce.type ONLINE OFFLINE ora.coe.db ora....se.type ONLINE ONLINE richtmp02 ora.coedb.db ora....se.type ONLINE OFFLINE ora.dbfs.db ora....se.type ONLINE OFFLINE ora....svc.svc ora....ce.type ONLINE OFFLINE ora.dbm.db ora....se.type ONLINE OFFLINE ora....b01.vip ora....t1.type ONLINE ONLINE richtmp02 ora....SM2.asm application ONLINE ONLINE richtmp02 ora....02.lsnr application ONLINE ONLINE richtmp02 ora....02.lsnr application ONLINE ONLINE richtmp02 ora....02.lsnr application ONLINE ONLINE richtmp02 ora....b02.gsd application OFFLINE OFFLINE ora....b02.ons application ONLINE ONLINE richtmp02 ora....b02.vip ora....t1.type ONLINE ONLINE richtmp02 ora.eons ora.eons.type OFFLINE OFFLINE ora.gsd ora.gsd.type OFFLINE OFFLINE ora....network ora....rk.type ONLINE ONLINE richtmp02 ora....network ora....rk.type ONLINE ONLINE richtmp02 ora.oc4j ora.oc4j.type OFFLINE OFFLINE ora.ons ora.ons.type ONLINE ONLINE richtmp02 ora....ry.acfs ora....fs.type ONLINE OFFLINE ora.rich.db ora....se.type ONLINE OFFLINE ora.scan1.vip ora....ip.type ONLINE ONLINE richtmp02 ora.scan2.vip ora....ip.type ONLINE ONLINE richtmp02 ora.scan3.vip ora....ip.type ONLINE ONLINE richtmp02 Not very friendly is it? Let's try something a bit better: [root@testdb01 richtmp]# ./crsstat HA Resource Target State ----------- ------ ----- dbfs_mount ONLINE OFFLINE ora.10_141_138_105.vip ONLINE OFFLINE ora.DATA.dg ONLINE ONLINE on richtmp02 ora.LISTENER.lsnr ONLINE ONLINE on richtmp02 ora.LISTENER_SCAN1.lsnr ONLINE ONLINE on richtmp02 ora.LISTENER_SCAN2.lsnr ONLINE ONLINE on richtmp02 ora.LISTENER_SCAN3.lsnr ONLINE ONLINE on richtmp02 ora.LISTENER_SDP.lsnr ONLINE ONLINE on richtmp02 ora.RECO.dg ONLINE ONLINE on richtmp02 ora.SYSTEMDG.dg ONLINE ONLINE on richtmp02 ora.TESTING.lsnr ONLINE ONLINE on richtmp02 ora.asm ONLINE ONLINE on richtmp02 ora.coe.coesvc1.svc ONLINE OFFLINE ora.coe.coesvc2.svc ONLINE OFFLINE ora.coe.db ONLINE ONLINE on richtmp02 ora.coedb.db ONLINE OFFLINE ora.dbfs.db ONLINE OFFLINE ora.dbfs.dbfssvc.svc ONLINE OFFLINE ora.dbm.db ONLINE OFFLINE ora.richtmp01.vip ONLINE ONLINE on richtmp02 ora.richtmp02.ASM2.asm ONLINE ONLINE on richtmp02 ora.richtmp02.LISTENER_DMORLDB02.lsnr ONLINE ONLINE on richtmp02 ora.richtmp02.LISTENER_SDP_DMORLDB02.lsnr ONLINE ONLINE on richtmp02 ora.richtmp02.TESTING_DMORLDB02.lsnr ONLINE ONLINE on richtmp02 ora.richtmp02.gsd OFFLINE OFFLINE ora.richtmp02.ons ONLINE ONLINE on richtmp02 ora.richtmp02.vip ONLINE ONLINE on richtmp02 ora.eons OFFLINE OFFLINE ora.gsd OFFLINE OFFLINE ora.net1.network ONLINE ONLINE on richtmp02 ora.net2.network ONLINE ONLINE on richtmp02 ora.oc4j OFFLINE OFFLINE ora.ons ONLINE ONLINE on richtmp02 ora.registry.acfs ONLINE OFFLINE ora.rich.db ONLINE OFFLINE ora.scan1.vip ONLINE ONLINE on richtmp02 ora.scan2.vip ONLINE ONLINE on richtmp02 ora.scan3.vip ONLINE ONLINE on richtmp02
This is much easier on the eyes. Thanks to a simple script as follows:
[root@testdb01 richtmp]# cat crsstat
#!/bin/bash
#
# Sample 11g CRS resource status query script
#
# Description:
# - Returns formatted version of crs_stat -t, in tabular
# format, with the complete rsc names and filtering keywords
# - The argument, $RSC_KEY, is optional and if passed to the script, will
# limit the output to HA resources whose names match $RSC_KEY.
# Requirements:
# - $ORACLE_HOME should be set in your environment. I'm using the ASM1 home/Grid Home for this.
RSC_KEY=$1
QSTAT=-u
AWK=/usr/bin/awk # if not available use /usr/xpg4/bin/awk
# Table header:echo ""
$AWK \
'BEGIN {printf "%-45s %-10s %-18s\n", "HA Resource", "Target", "State";
printf "%-45s %-10s %-18s\n", "-----------", "------", "-----";}'
# Table body:
$ORACLE_HOME/bin/crs_stat $QSTAT | $AWK \
‘BEGIN { FS=”=”; state = 0; }
$1~/NAME/ && $2~/’$RSC_KEY’/ {appname = $2; state=1};
state == 0 {next;}
$1~/TARGET/ && state == 1 {apptarget = $2; state=2;}
$1~/STATE/ && state == 2 {appstate = $2; state=3;}
state == 3 {printf “%-45s %-10s %-18s\n”, appname, apptarget, appstate; state=0;}’
=======================================================================================
Do your comrades a favor and provide them useful and readable output from your commands. Formatting is important. Don’t forget to use a fixed font like courier new also.
Exadata and non-RAC
I have a client who is choosing to not use RAC on their Exadata system. They have asked what should be done with the first 2 cell disks, which are smaller than the other 10 on each cell. I advised that they should recreate the celldisks to be of equal sizes for balance.
You just never know what might happen on an Exadata system.
cellcli collated metric….
I’ve been searching around for a command line tool that can gather Oracle Exadata cell metrics without having to use a gui. My dreams have been asnwered by Guy Harrison with his cellcli.pl tool he cooked up.
The tool is a simple perl script wrapping cellcli. It can do aggregate metrics across all cell in you Exadata rack, show individual metric and one of my favs: show the full metric defintion(I don’t like having to remember all the cell metric prefixes).