ASM由于其高度的封装性,使得我们很难知道窥探其内部的原理。可以通过一下视图和数据字典来来查看ASM 的信息。
一. 相关视图和数据字典
V$ASM_DISKGROUP | X$KFGRP | performs disk discovery and lists diskgroups |
V$ASM_DISKGROUP_STAT | X$KFGRP_STAT | diskgroup stats without disk discovery |
V$ASM_DISK | X$KFDSK, X$KFKID | performs disk discovery, lists disks and their usage metrics |
V$ASM_DISK_STAT | X$KFDSK_STAT, X$KFKID | lists disks and their usage metrics |
V$ASM_FILE | X$KFFIL | lists ASM files, including metadata/asmdisk files |
V$ASM_ALIAS | X$KFALS | lists ASM aliases, files and directories |
V$ASM_TEMPLATE | X$KFTMTA | lists the available templates and their properties |
V$ASM_CLIENT | X$KFNCL | lists DB instances connected to ASM |
V$ASM_OPERATION | X$KFGMG | lists rebalancing operations |
N.A. | X$KFKLIB | available libraries, includes asmlib path |
N.A. | X$KFDPARTNER | lists disk-to-partner relationships |
N.A. | X$KFFXP | extent map table for all ASM files |
N.A. | X$KFDAT | extent list for all ASM disks |
N.A. | X$KFBH | describes the ASM cache (buffer cache of ASM in blocks of 4K (_asm_blksize) |
N.A. | X$KFCCE | a linked list of ASM blocks. to be further investigated |
This list is obtained querying v$fixed_view_definitionwhere view_name like '%ASM%' which exposes all the v$ and gv$ views with theirdefinition. Fixed tables are exposed by querying v$fixed_table where name like'x$kf%' (ASM fixed tables use the 'X$KF' prefix).
SQL>select * fromv$fixed_view_definition whereview_name like '%ASM%';
SQL>select * from sys.v$fixed_tablewhere name like 'X$KF%' ;
Noteon 11g there are additional V$views: , and X$tables: *
New in 11g:
V$ASM_ATTRIBUTE | X$KFENV | ASM attributes, the X$ table shows also 'hidden' attributes |
V$ASM_DISK_IOSTAT | X$KFNSDSKIOST | I/O statistics |
N.A. | X$KFDFS |
|
N.A. | X$KFDDD |
|
N.A. | X$KFGBRB |
|
N.A. | X$KFMDGRP |
|
N.A. | X$KFCLLE |
|
N.A. | X$KFVOL |
|
N.A. | X$KFVOLSTAT |
|
N.A. | X$KFVOFS |
|
N.A. | X$KFVOFSV |
|
二. Striping and Mirroring with ASM, extentsand allocation units
Abasic example, using ASM and normal redundancy: the available storage, say 64HDs over FC SAN, are used to create the main DB diskgroup: DATADG. DATADG islogically divided into 2 evenly sized groups of disks: 32 disks in failgroupN.1 and 32 in failgroup N.2. Oracle datafiles created in DATADG are 'striped'into smaller pieces, extents of 1MB in size. Extents are allocated to thestorage in 2 (mirrored) allocation units (AU): one AU in failgroup N.1 theother in failgroup N.2.
Allocation Units
EveryASM disk is divided into allocation units (AU). An AU is the fundamental unitof allocation within a disk group. A file extent consists of one or more AU. AnASM file consists of one or more file extents.
When you create a disk group, youcan set the ASM AU size to be between 1 MB and 64 MB in powers of two, such as,1, 2, 4, 8, 16, 32, or 64. Larger AU sizes typically provide performanceadvantages for data warehouse applications that use large sequential reads.
默认的AU 大小是1M。
三. X$KFFXP
ThisX$ table contains the mapping between files, extents and allocation units. Itallows to track the position of all the extents of a given file striped andmirrored across storage.
Note:RDBMS read operations access only the primary extent of a mirrored couple(unless there is an IO error) . Write operations instead write all mirroredextents to disk.
ADDR | x$ table address/identifier |
INDX | row unique identifier |
INST_ID | instance number (RAC) |
NUMBER_KFFXP | ASM file number. Join with v$asm_file and v$asm_alias |
COMPOUND_KFFXP | File identifier. Join with compound_index in v$asm_file |
INCARN_KFFXP | File incarnation id. Join with incarnation in v$asm_file |
PXN_KFFXP | Progressive file extent number |
XNUM_KFFXP | ASM file extent number (mirrored extent pairs have the same extent value) |
GROUP_KFFXP | ASM disk group number. Join with v$asm_disk and v$asm_diskgroup |
DISK_KFFXP | Disk number where the extent is allocated. Join with v$asm_disk |
AU_KFFXP | Relative position of the allocation unit from the beginning of the disk. The allocation unit size (1 MB) in v$asm_diskgroup |
LXN_KFFXP | 0->primary extent, ->mirror extent, 2->2nd mirror copy (high redundancy and metadata) |
FLAGS_KFFXP | N.K. |
CHK_KFFXP | N.K. |
SIZE_KFFXP | 11g, to support variable size AU, integer value which marks the size of the extent in AU size units. |
Example1 - reading ASM files with direct OS access
(1)Find the 2 mirrored extents of an ASM file (thespfile in this example)
sys@+ASM1>selectGROUP_KFFXP,DISK_KFFXP,AU_KFFXP from x$kffxp where number_kffxp=(selectfile_number from v$asm_alias where name='spfiletest1.ora');
GROUP_KFFXP DISK_KFFXP AU_KFFXP
----------- ---------- ----------
1 20 379
1 3 101
(2)find the diskname
sys@+ASM1> select disk_number,path fromv$asm_disk where GROUP_NUMBER=1 anddisk_number in (3,20);
DISK_NUMBER PATH
---------------------------------------------------
3 /dev/mpath/itstor417_2p1
20 /dev/mpath/itstor419_2p1
(3)access the data directly from disk with dd
ddif=/dev/mpath/itstor417_2p1 bs=1024k count=1 skip=101|strings|more
四. X$KFDAT
This X$ table contains details of all allocation units (free and used).
ADDR | x$ table address/identifier |
INDX | row unique identifier |
INST_ID | instance number (RAC) |
GROUP_KFDAT | diskgroup number, join with v$asm_diskgroup |
NUMBER_KFDAT | disk number, join with v$asm_disk |
COMPOUND_KFDAT | disk compund_index, join with v$asm_disk |
AUNUM_KFDAT | Disk allocation unit (relative position from the beginning of the disk), join with x$kffxp.au_kffxp |
V_KFDAT | V=this Allocation Unit is used; F=AU is free |
FNUM_KFDAT | file number, join with v$asm_file |
I_KFDAT | N.K. |
XNUM_KFDAT | Progressive file extent number join with x$kffxp.pxn_kffxp |
RAW_KFDAT | raw format encoding of the disk,and file extent information |
Example2 - listallocation units of a given file from x$kfdat
similarly to example 1 above, another wayto retrieve ASM file allocation maps:
sys@+ASM1> selectGROUP_KFDAT,NUMBER_KFDAT,AUNUM_KFDAT from x$kfdat where fnum_kfdat=(selectfile_number from v$asm_alias where name='spfiletest1.ora');
GROUP_KFDAT NUMBER_KFDAT AUNUM_KFDAT
----------- ------------ -----------
1 3 101
1 20 379
Example3 - from strace data of an oracle userprocess
(1)from the strace file of a user(shadow) process identify IO operations:
ex: strace-p 30094 2>&1|grep -v time
read64(15,"#\242\0\0\33\0@\2\343\332\177\303s\5\1\4\211\330\0\0\0"..., 8192,473128960) = 8192
itis a read operation of 8KB (oracle block) at the offset 473128960 (=451 MB +27*8KB) from file descriptor FD=15
(2)using /proc/30094/fd -> findFD=15 is /dev/mpath/itstor420_1p1
(3)I find the group and disk number ofthe file:
sys@+ASM1> selectGROUP_NUMBER,DISK_NUMBER from v$asm_disk
where path='/dev/mpath/itstor420_1p1';
GROUP_NUMBER DISK_NUMBER
------------ -----------
1 30
(4)using the disk number, group numberand offset (from strace above) I find the file number and extent number:
sys@+ASM1> select number_kffxp,XNUM_KFFXP from x$kffxp where group_kffxp=1 and disk_kffxp=20 and au_kffxp=451;
NUMBER_KFFXP XNUM_KFFXP
------------ ----------
268 17
(5)from v$asm_file fnum=268 is file ofthe users' tablesspace:
sys@+ASM1> select name from v$asm_aliaswhere FILE_NUMBER=268
NAME
------------------------------
USERS.268.612033477
sys@DB> select file#,name fromv$datafile where upper(name) like '%USERS.268.612033477';
FILE# NAME
------------------------------------------------------------------
9 +TEST1_DATADG1/test1/datafile/users.268.612033477
(6)from dba extents finally find theowner and segment name relative to the original IO operation:
sys@TEST1> selectowner,segment_name,segment_type from dba_extents
where FILE_ID=9 and 27+17*1024*1024 betweenblock_id and block_id+blocks;
OWNER SEGMENT_NAME SEGMENT_TYPE
------------------------------------------------------------ ------------------
SCOTT EMP TABLE
五. X$KFDPARTNER
ThisX$ table contains the disk-to-partner (1-N) relationship. Two disks of a givenASM diskgroup are partners if they each contain a mirror copy of the sameextent. Therefore partners must belong to different failgroups of the samediskgroup. From a few liveexamples I can see that typically disks have 10 partners each atdiskgroup creation and fluctuate around 10 partners following ASM operations.This mechanism is in place to reduce the chance of losing both sides of themirror in case of double disk failure.
ADDR | x$ table address/identifier |
INDX | row unique identifier |
INST_ID | instance number (RAC) |
GRP | diskgroup number, join with v$asm_diskgroup |
DISK | disk number, join with v$asm_disk |
COMPOUND | disk identifier. Join with compound_index in v$asm_disk |
NUMBER_KFDPARTNER | partner disk number, i.e. disk-to-partner (1-N) relationship |
MIRROR_KFDPARNER | =1 in a healthy normal redundancy config |
PARITY_KFDPARNER | =1 in a healthy normal redundancy config |
ACTIVE_KFDPARNER | =1 in a healthy normal redundancy config |
六. X$KFFIL and metadata files
Three types of metadata:
(1)diskgroup metadata: files with NUMBER_KFFIL <256 ASM metadata andASMlog files. These files have high redundancy (3 copies) and block size =4KB.
1)ASM log files are used for ASMinstance and crash recovery when a crash happens with metadata operations (seebelow COD and ACD)
2)at diskgroup creation 6 files withmetadata are visible from x$kffil
(2)disk metadata: disk headers (typically the first 2 AU of each disk)are not listed in x$kffil (they appear as file number 0 in x$kfdat). Containdisk membership information. This part of the disk has to be 'zeroed out'before the disk can be added to ASM diskgroup as a new disk.
(3)file metadata: 3 mirrored extents with file metadata, visible fromx$kffxp and x$kfdat
Example: list all files,system and users' with their sizes:
SYS@+ASM2(rac2)> select group_kffil group#,number_kffil file#, filsiz_kffil filesize_after_mirr, filspc_kffilraw_file_size from x$kffil;
GROUP# FILE# FILESIZE_AFTER_MIRRRAW_FILE_SIZE
---------- ---------- --------------------------------
1 1 2097152 2097152
1 2 1048576 1048576
1 3 88080384 89128960
1 4 1392640 2097152
1 5 1048576 1048576
1 6 1048576 1048576
1 256 545267712 547356672
1 257 629153792 631242752
1 258 351281152 353370112
1 259 7872512 8388608
1 260 15319040 16777216
Example: List all filesincluding metadata allocated in the ASM diskgroups
SYS@+ASM2(rac2)> select group_kfdat group#,FNUM_KFDAT file#, sum(1) AU_used from x$kfdat where v_kfdat='V' group by group_kfdat,FNUM_KFDAT,v_kfdat;
GROUP# FILE# AU_USED
---------- ---------- ----------
1 0 2
1 1 2
1 2 1
1 3 85
1 4 2
1 5 1
1 6 1
1 256 522
1 257 602
1 258 337
1 259 8
Descriptionof metadata files
This paragraph is from: Oracle AutomaticStorage Management, Oracle Press Nov 2007, N. Vengurlekar, M. Vallath, R.Long
(1). File#0, AU=0: disk header (disk name, etc), Allocation Table (AT)and Free Space Table (FST)
(2). File#0, AU=1: Partner Status Table (PST)
(3). File#1: File Directory (files and their extent pointers)
(4). File#2: Disk Directory
(5). File#3: Active Change Directory (ACD) The ACD is analogous to a redolog, where changes to the metadata are logged. Size=42MB * number of instances
(6). File#4: Continuing Operation Directory (COD). The COD is analogousto an undo tablespace. It maintains the state of active ASM operations such asdisk or datafile drop/add. The COD log record is either committed or rolledback based on the success of the operation.
(7). File#5: Template directory
(8). File#6: Alias directory
(9). 11g, File#9: Attribute Directory
(10). 11g, File#12: Stalenessregistry, created when needed to track offline disks
七. DBMS_DISKGROUP, an internal ASM package
dbms_diskgroupis an Oracle 'internal package' (it doesn't show up as an object being that ASMhas no dictionary) called dbms_diskgroup. It is used to access the ASM withfilesystem-like calls. 11g asmcmd uses this package to implement the cpcommand. A list of procedures:
dbms_diskgroup.open(:fileName, :openMode, :fileType, :blkSz, :hdl,:plkSz, :fileSz) |
dbms_diskgroup.createfile(:fileName, :fileType, :blkSz, :fileSz, :hdl, :plkSz, :fileGenName) |
dbms_diskgroup.close(:hdl) |
dbms_diskgroup.read(:hdl, :offset, :blkSz, :data_buf) |
dbms_diskgroup.commitfile(:handle) |
dbms_diskgroup.resizefile(:handle,:fsz) |
dbms_diskgroup.remap(:gnum, :fnum, :virt_extent_num) |
dbms_diskgroup.getfileattr(:fileName, :fileType, :fileSz, :blkSz) |
dbms_diskgroup.checkfile(?) |
dbms_diskgroup.patchfile(?) |
八. ASM parameters and underscore parameters
可以使用SQL 查看ASM 参数:
/* Formatted on 2011/8/30 16:28:54(QP5 v5.163.1008.3004) */
SELECT a.ksppinm"Parameter", c.ksppstvl"Instance Value"
FROMx$ksppi a,x$ksppcv b,x$ksppsv c
WHERE a.indx = b.indx AND a.indx = c.indx ANDksppinm LIKE '%asm%'
ORDER BY a.ksppinm;
或者:
select * fromall_parameters where name like '%asm%';
Value | |
_asm_acd_chunks | 1 |
_asm_allow_only_raw_disks | TRUE |
_asm_allow_resilver_corruption | FALSE |
_asm_ausize | 1048576 |
_asm_blksize | 4096 |
_asm_disk_repair_time | 14400 |
_asm_droptimeout | 60 |
_asm_emulmax | 10000 |
_asm_emultimeout | 0 |
_asm_kfdpevent | 0 |
_asm_libraries | ufs (may differ if asmlib is used) |
_asm_maxio | 1048576 |
_asm_stripesize | 131072 |
_asm_stripewidth | 8 |
_asm_wait_time | 18 |
_asmlib_test | 0 |
_asmsid | asm |
asm_diskgroups | list of diskgroups to be mounted at startup |
asm_diskstring | search path for physical disks to be used with ASM |
asm_power_limit | default rebalance power value |
注意这里的_asm_ausize =1M
Oracle 11g 里新增加的参数:
Value | |
_asm_compatibility | 10.1 |
_asm_dbmsdg_nohdrchk | FALSE |
_asm_droptimeout | removed in 11g |
_asm_kfioevent | 0 |
_asm_repairquantum | 60 |
_asm_runtime_capability_volume_support | FALSE |
_asm_skip_resize_check | FALSE |
_lm_asm_enq_hashing | TRUE |
asm_preferred_read_failure_groups |
九. ASM-related acronyms 相关名词解释
(1). PST - Partner Status Table. Maintains info on disk-to-diskgroupmembership.
(2). COD - Continuing Operation Directory. The COD structuremaintains the state of active ASM operations or changes, such as disk ordatafile drop/add. The COD log record is either committed or rolled back basedon the success of the operation. (source Oracle whitepaper)
(3). ACD - Active Change Directory. The ACD is analogous to a redolog, where changes to the metadata are logged. The ACD log record is used todetermine point of recovery in the case of ASM operation failures or instancefailures. (source Oracle whitepaper)
(4). OSM Oracle Storage Manager, legacy name, synonymous of ASM
(5). CSS Cluster Synchronization Services. Part of Oracleclusterware, mandatory with ASM even in single instance. CSS is used toheartbeat the health of the ASM instances.
(6). RBAL - Oracle backgroud process. In an ASM instance coordinatedrebalancing operations. In a DB instance, opens and mount diskgroups from thelocal ASM instance.
(7). ARBx - Oracle backgroud processes. In an ASM instance, a slavefor rebalancing operations
(8). PSPx - Oracle backgroud processes. In an ASM instance, ProcessSpawners
(9). GMON - Oracle backgroud processes. In an ASM instance,diskgroup monitor.
(10). ASMB - Oracle backgroudprocess. In an DB instance, keeps a (bequeath) persistent DB connection to thelocal ASM instance. Provides hearthbeat and ASM statistics. During a diskgrouprebalancing operation ASM communicates to the DB AU changes via this connection.
(11). O00x - Oracle backgroudprocesses. Slaves used to connected from the DB to the ASM instance for 'shortoperations'.