关于read by other session跟cr request retry的问题

今天有个同事说有张表通过索引可以查出指定的数据,如果走全表扫描等上非常久也跑不出数据,加rownum<3这样的也出不来,我去看了后台的$session_wait,event事件是read by other session,看到这种情况后,我去另外一台(两台RAC的)看了一下,另外一台关于这张表的等待事件是cr request retry,这个是什么问题,我无从入手,第一次碰到,请教有经验的大哥、大姐们一下。谢谢

0

1个回答

[code="java"]ODS系统比较慢,要求诊断一下,给ODS系统做了AWR。首先交代一下ODS环境:

  (1)环境:RedHat 5.1+Oracle10.2.0.4 RAC(两节点)

  (2)现象:系统异常慢

  由于当时质疑人资管控系统数据库有问题,而人资管控系统是在节点2实例上,所以查看节点2实例的top5等待事件:

  节点2:

Top 5 Timed Events Avg %Total
~~~~~~~~~~~~~~~~~~ wait Call
~~~~~~~~~~~~~~~~~~ wait Call
Event Waits Time (s) (ms) Time Wait Class
Event Waits Time (s) (ms) Time Wait Class



gc buffer busy 2,346,848 24,567 10 33.2 Cluster
gc buffer busy 2,346,848 24,567 10 33.2 Cluster
read by other session 1,941,773 18,734 10 25.3 User I/O
read by other session 1,941,773 18,734 10 25.3 User I/O
db file scattered read 673,646 10,209 15 13.8 User I/O
db file scattered read 673,646 10,209 15 13.8 User I/O
CPU time 6,627 9.0
CPU time 6,627 9.0
log file sync 129,112 4,068 32 5.5 Commit
log file sync 129,112 4,068 32 5.5 Commit

  从top 5 event来看,主要是read类型的event比较多

  db file sequential read 顺序读导致

  gc buffer busy 产生的根本原因是Oracle 的cache fusion机制,简单来说cache fusion的主要目的

  1、减少各个节点对数据库的读写,提交效率,内存之间的数据同步要比磁盘到内存快很多

  2、保持数据的完整性,当某个节点执行查询时首先在本节点buffer中是否存在,如果不存在就会到另外节点buffer去查找,

  如果存在大量的cache交换就会产生gc buffer busy

  3、read by other session

  When information is requested from the database, Oracle will first read the data from disk into the database buffer cache.

  If two or more sessions request the same information, the first session will read the data into the buffer cache while other sessions wait.

  In previous versions, this wait was classified under the "buffer busy waits" event.

  However, in Oracle 10.1 and higher, this wait time is now broken out into the "read by other session" wait event.

  Excessive waits for this event are typically due to several processes repeatedly reading the same blocks,

  e.g. many sessions scanning the same index or performing full table scans on the same table.

  Tuning this issue is a matter of finding and eliminating this contention.

  该event是10.1以后新引入的,之前归类于buffer busy waits,Oracle对event的划分越来越细,

  产生原因:对同一个索引进行的大量扫描或对同一个表进行全表扫描

  查询等待的文件

  SELECT p1 "file#", p2 "block#", p3 "class#"
  FROM v$session_wait
  WHERE event = 'read by other session';

  查询热点块对应的对象

  SQL> SELECT relative_fno, owner, segment_name, segment_type
  FROM dba_extents
  WHERE file_id = &file
  AND &block BETWEEN block_id AND block_id + blocks - 1

  查看Top Sql 发现如下比较可疑,单次执行要花费20多分钟,

  查看表的索引无异常,查看执行计划是index range scan,貌似没有问题

  该表的数据量比较大且变化比较频繁,怀疑是由于索引引起的热点块,于是将其改为

  全表扫描,这样可以避免索引的热点块产生

  经过一段时间观察系统正常

  SQL ordered by Elapsed Time DB/Inst: BILLBJ/billbj1 Snaps: 15955-15956
  -> Resources reported for PL/SQL code includes the resources used by all SQL
  statements called by the code.
  -> % Total DB Time is the Elapsed Time of the SQL statement divided
  into the Total Database Time multiplied by 100
  Elapsed CPU Elap per % Total
  Time (s) Time (s) Executions Exec (s) DB Time SQL Id
  ---------- ---------- ------------ ---------- ------- -------------
  10,435 709 8 1304.3 66.8 5uc64k11j23wx
  Module: JDBC Thin Client
  select a.day day,sum(a.data) data,sum(a.indata) indata from bandwidth_d
  ailytable a where a.day between :v_start_date and :v_end_date and a.channelId
  in (select * from the (select cast(:v_channelId as channelIdArray) from dual))
  and a.nodeId in (select * from the (select cast(:v_nodeId as nodeIdArray) from d

  有时候产生同一个问题的几个原因可能是相悖的,比如read by other session,所以我们不能局限于某一个方面,此案例中db file sequential read wait也说明顺序读等待过多,最后的解决方法也说明查询未必适用索引就好。

  其实read by other session等待事件是要解决热点块问题,而解决热点块问题首先要减低IO次数,我们来看看IO情况:

~~~~~~~~~~~~ Per Second Per Transaction
~~~~~~~~~~~~ Per Second Per Transaction
--------------- ---------------
--------------- ---------------
Redo size: 28,546.06 881.39
Redo size: 28,546.06 881.39
Logical reads: 37,292.67 1,151.45
Logical reads: 37,292.67 1,151.45
Block changes: 151.32 4.67
Block changes: 151.32 4.67
Physical reads: 2,872.88 88.70
Physical reads: 2,872.88 88.70
Physical writes: 2.64 0.08
Physical writes: 2.64 0.08
User calls: 114.65 3.54
User calls: 114.65 3.54
Parses: 60.26 1.86
Parses: 60.26 1.86
Hard parses: 28.64 0.88
Hard parses: 28.64 0.88
Sorts: 15.86 0.49
Sorts: 15.86 0.49
Logons: 0.42 0.01
Logons: 0.42 0.01
Executes: 106.84 3.30
Executes: 106.84 3.30
Transactions: 32.39
Transactions: 32.39

  很明显逻辑读比较多,另外等待事件cr request retry也是造成性能差的一个原因。我们可以通过分割应用,对大表分区等方法进行系统优化。[/code]

你可以参照下!!

0
Csdn user default icon
上传中...
上传图片
插入图片
抄袭、复制答案,以达到刷声望分或其他目的的行为,在CSDN问答是严格禁止的,一经发现立刻封号。是时候展现真正的技术了!