Undo retention: resolving disk space exhaustion caused by an oversized UNDO tablespace in Oracle 10g


yesu898 2012-11-23 16:42:31 Category: Oracle

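Oracle 10g introduced automatic tuning of undo retention. The intent is to avoid ORA-01555 errors as far as possible, but automation sometimes outsmarts itself: this feature can leave the undo tablespace over-used, with space that is never reclaimed.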

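In Oracle 10gR2, whenever automatic undo management is in use, automatic undo retention tuning is enabled regardless of the undo_retention setting. Every 30 seconds the MMON process computes a tuned undo retention from maxquerylen and sets the system's undo retention to that value. If the datafiles of the undo tablespace cannot autoextend, bug 5387030 may be triggered: the tuned undo retention grows very large, and the undo tablespace cannot reclaim space for a long time.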
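Because the bug depends on whether the undo datafiles can autoextend, it may help to check that first. The query below is a minimal sketch, assuming access to the DBA_DATA_FILES and DBA_TABLESPACES dictionary views; the column aliases size_mb and max_mb are only illustrative.

```sql
-- List every datafile that belongs to an undo tablespace and show
-- whether it can autoextend (AUTOEXTENSIBLE is 'YES' or 'NO').
select f.tablespace_name,
       f.file_name,
       f.autoextensible,
       round(f.bytes / 1024 / 1024)    as size_mb,
       round(f.maxbytes / 1024 / 1024) as max_mb
from   dba_data_files f
       join dba_tablespaces t
         on t.tablespace_name = f.tablespace_name
where  t.contents = 'UNDO'
order  by f.tablespace_name, f.file_name;
```

Files reported with AUTOEXTENSIBLE = 'NO' are the ones that match the condition described above.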

The following query shows the tuned undo retention value:

select tuned_undoretention, maxquerylen, maxqueryid from v$undostat;

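In one of our cases this value peaked at 345600 seconds, i.e. 96 hours. With a fairly busy transaction load the undo tablespace quickly reached 100% usage, and the monitoring SMS alerts kept going off.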
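To make such values easier to read, the seconds can be converted to hours in the query itself (345600 / 3600 = 96). This is only a sketch built on the same V$UNDOSTAT view; the alias tuned_hours is made up for illustration.

```sql
-- One row per statistics interval, newest first, with the tuned
-- retention shown both in seconds and in hours.
select begin_time,
       end_time,
       maxquerylen,
       tuned_undoretention,
       round(tuned_undoretention / 3600, 1) as tuned_hours
from   v$undostat
order  by begin_time desc;
```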

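Once the cause is known, the possible fixes follow:

(1) Patches are available for 10.2.0.2 and 10.2.0.3, and the bug is fixed in 10.2.0.4. The recommended fix is to schedule downtime and apply the patch.

(2) Set the hidden parameter _smu_debug_mode=33554432, which changes the tuned_undoretention calculation to max(maxquerylen secs + 300, undo_retention). Not recommended.

(3) Set the hidden parameter _undo_autotune=false, which turns off automatic undo retention tuning. Not recommended.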

References:
Note:461480.1 FAQ – Automatic Undo Management (AUM) / System Managed Undo (SMU)
Note:240746.1 10g NEW FEATURE on AUTOMATIC UNDO RETENTION
Bug 5387030 – Automatic tuning of undo_retention causes unusual extra space allocation

============================================================

Resolving disk space exhaustion caused by an oversized UNDO tablespace in Oracle 10g

2011-08-24 14:21 frankfan126 ChinaUnix blog

This article analyzes why an oversized UNDO tablespace in Oracle 10g can exhaust disk space and describes how to fix it.


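While running an Oracle 10g database, we hit a crash caused by the UNDO tablespace growing until the disk ran out of space. Analysis showed two main causes:

1. A heavy transaction load makes the Oracle undo tablespace autoextend until it occupies an excessive amount of disk space;

2. Large transactions were never committed, or the space they used was never shrunk back.

Note: this is a fairly common situation in Oracle administration; keep a close eye on disk space during routine maintenance.

Oracle 10g has the Automatic Undo Retention Tuning feature. The undo_retention parameter you set is only a guideline (the default is 900 seconds); Oracle adjusts undo retention automatically, and may exceed the undo_retention setting, to keep ORA-01555 errors from occurring. The tuned_undoretention column of V$UNDOSTAT shows the retention time that Oracle computes from sampled transaction activity (if the datafiles cannot autoextend, remaining free space is also taken into account). The view keeps UNDO tablespace statistics for the last 4 days; for older data, query DBA_HIST_UNDOSTAT. The tuned_undoretention column exists only from 10g onward and is not present in 9i.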

1) Check the retention value

show parameter undo_retention

Check the retention time that Oracle has computed automatically

select tuned_undoretention, maxquerylen, maxqueryid from v$undostat;

2) Change the retention value

ALTER SYSTEM SET undo_retention=10800 SCOPE=BOTH;

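For a database whose transaction load is unevenly distributed, automatic tuning creates a potential problem: undo space can be used up during batch processing, and it then stays fully occupied instead of being released.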
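One way to see whether undo space is really being held is to break the undo extents down by status. The query below is a sketch assuming access to the DBA_UNDO_EXTENTS view, whose STATUS column distinguishes ACTIVE, UNEXPIRED and EXPIRED extents; the aliases size_mb and extent_count are illustrative only.

```sql
-- Space held in each undo tablespace, grouped by extent status.
-- A large UNEXPIRED total after a batch run suggests that the
-- retention period, not active transactions, is holding the space.
select tablespace_name,
       status,
       count(*)                        as extent_count,
       round(sum(bytes) / 1024 / 1024) as size_mb
from   dba_undo_extents
group  by tablespace_name, status
order  by tablespace_name, status;
```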

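There are three ways to work around or disable automatic undo retention tuning in 10g:

(1) Patches are available for 10.2.0.2 and 10.2.0.3, and the bug is fixed in 10.2.0.4. The recommended fix is to schedule downtime and apply the patch.

(2) Set the hidden parameter _smu_debug_mode=33554432, which changes the tuned_undoretention calculation to max(maxquerylen secs + 300, undo_retention). Not recommended.

SQL> alter system set "_smu_debug_mode" = 33554432;

(3) Set the hidden parameter _undo_autotune=false, which turns off automatic undo retention tuning. Not recommended.

SQL> alter system set "_undo_autotune" = false;

Source: Metalink Note 420525.1, Automatic Tuning of Undo_retention Causes Space Problems.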

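Resolution steps:

1. Start SQL*Plus and log in to the database as sys.

# su - oracle
$ sqlplus / as sysdba

2. Find the name of the UNDO tablespace that the current instance is using: show parameter undo_tablespace.

3. Confirm that the UNDO tablespace exists;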

SQL> select name from v$tablespace;  
NAME  
------------------------------  
.......  
UNDOTBS1

4. Check how much space the UNDO tablespace occupies and where its datafiles are stored;

SQL> select file_name, bytes/1024/1024 from dba_data_files where tablespace_name like 'UNDOTBS%';

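5. Check rollback segment usage and which users are currently using rollback segments. If any users are active, it is best to pick another time (especially in production).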

SQL> select s.username, u.name from v$transaction t,v$rollstat r, v$rollname u,v$session s  
where s.taddr=t.addr and t.xidusn=r.usn and r.usn=u.usn order by s.username;

6. Check the status of the UNDO segments;

SQL> select usn,xacts,rssize/1024/1024/1024,hwmsize/1024/1024/1024,shrinks from v$rollstat order by rssize;

7. Create a new UNDO tablespace with autoextend settings;

SQL> create undo tablespace undotbs2 datafile '/opt/oracle/oradata/ge01/UNDOTBS2.dbf' size 100m reuse autoextend on next 50m maxsize 5000m; 
Tablespace created.

8. Switch the instance to the new UNDO tablespace dynamically, recording the change in the spfile as well;

SQL> alter system set undo_tablespace=undotbs2 scope=both;  
System altered.

9. Wait until all UNDO segments in the original UNDO tablespace are OFFLINE;

SQL> select usn,xacts,status,rssize/1024/1024,hwmsize/1024/1024, shrinks from v$rollstat order by rssize;

10. Run the query again to confirm that the UNDO segments of the new UNDO tablespace are all ONLINE;

SQL> select usn,xacts,status,rssize/1024/1024,hwmsize/1024/1024, shrinks from v$rollstat order by rssize;

11. Drop the original UNDO tablespace;

SQL> drop tablespace undotbs1 including contents;  
Tablespace dropped.

12. Confirm that the drop succeeded;

SQL> select name from v$tablespace;  
NAME  
------------------------------  
.......  
UNDOTBS2  
12 rows selected.

13. Update the pfile

SQL> create pfile from spfile;  
File created.

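14. Delete the datafile of the original UNDO tablespace. Its file name is the one returned by the datafile query earlier in these steps.

# rm $ORACLE_BASE/oradata/$ORACLE_SID/undotbs01.dbf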
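Before dropping the old tablespace and removing its file, the wait in step 9 can be cross-checked from the data dictionary. As far as I know, V$ROLLSTAT mainly lists segments that are online or pending offline, so a segment that has already gone offline may simply no longer appear there. The sketch below assumes access to DBA_ROLLBACK_SEGS and uses the old tablespace name UNDOTBS1 from the steps above; adjust it to your own name.

```sql
-- Status of every undo segment that lives in the old tablespace.
-- The tablespace can be dropped safely once none of them is ONLINE.
select segment_name,
       status
from   dba_rollback_segs
where  tablespace_name = 'UNDOTBS1'
order  by segment_name;
```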

That concludes this walkthrough of resolving disk space exhaustion caused by an oversized UNDO tablespace in Oracle 10g. We hope you find it useful.

===========================================================================

The following is the official definition of the MAXQUERYLEN column.

Identifies the length of the longest query (in seconds) executed in the instance during the period. You can use this statistic to estimate the proper setting of the UNDO_RETENTION initialization parameter. The length of a query is measured from the cursor open time to the last fetch/execute time of the cursor. Only the length of those cursors that have been fetched/executed during the period are reflected in the view.

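Based on the above, I still believe that within a single sampling interval the longest query masks all the others.
So a 1500-second query will inevitably mask a 1400-second one.

However, my own requirement is this: find the SQL statements that ran for more than 1200 seconds, along with the executing user and the execution time.

Using v$undostat may not be the best method, but it is a workable one. It does, of course, have to be combined with other views such as v$active_session_history and v$sqlarea.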
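A rough sketch of that approach follows. It assumes that MAXQUERYID in v$undostat holds the SQL_ID of the longest query in each interval and that the cursor is still cached in v$sqlarea; neither is guaranteed, which is why an outer join is used. Only the longest query of each interval is visible, so a 1400-second query running alongside a 1500-second one will not show up.

```sql
-- Intervals whose longest query exceeded 1200 seconds, with the SQL
-- text when the cursor is still in the shared pool (NULL otherwise).
select u.begin_time,
       u.end_time,
       u.maxquerylen,
       u.maxqueryid,
       s.sql_text
from   v$undostat u
       left join v$sqlarea s
         on s.sql_id = u.maxqueryid
where  u.maxquerylen > 1200
order  by u.maxquerylen desc;
```

The executing user is not available from these two views; as noted above, that part would have to come from v$active_session_history or a similar source.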