iteye_18080 2011-11-20 09:24
浏览 375
已采纳

关于read by other session跟cr request retry的问题

今天有个同事说有张表通过索引可以查出指定的数据,如果走全表扫描等上非常久也跑不出数据,加rownum<3这样的也出不来,我去看了后台的$session_wait,event事件是read by other session,看到这种情况后,我去另外一台(两台RAC的)看了一下,另外一台关于这张表的等待事件是cr request retry,这个是什么问题,我无从入手,第一次碰到,请教有经验的大哥、大姐们一下。谢谢

  • 写回答

1条回答

  • Java面试专栏 2011-11-20 09:42
    关注

    [code="java"]ODS系统比较慢,要求诊断一下,给ODS系统做了AWR。首先交代一下ODS环境:

      (1)环境:RedHat 5.1+Oracle10.2.0.4 RAC(两节点)

      (2)现象:系统异常慢

      由于当时质疑人资管控系统数据库有问题,而人资管控系统是在节点2实例上,所以查看节点2实例的top5等待事件:

      节点2:

    Top 5 Timed Events Avg %Total
    ~~~~~~~~~~~~~~~~~~ wait Call
    ~~~~~~~~~~~~~~~~~~ wait Call
    Event Waits Time (s) (ms) Time Wait Class
    Event Waits Time (s) (ms) Time Wait Class



    gc buffer busy 2,346,848 24,567 10 33.2 Cluster
    gc buffer busy 2,346,848 24,567 10 33.2 Cluster
    read by other session 1,941,773 18,734 10 25.3 User I/O
    read by other session 1,941,773 18,734 10 25.3 User I/O
    db file scattered read 673,646 10,209 15 13.8 User I/O
    db file scattered read 673,646 10,209 15 13.8 User I/O
    CPU time 6,627 9.0
    CPU time 6,627 9.0
    log file sync 129,112 4,068 32 5.5 Commit
    log file sync 129,112 4,068 32 5.5 Commit

      从top 5 event来看,主要是read类型的event比较多

      db file sequential read 顺序读导致

      gc buffer busy 产生的根本原因是Oracle 的cache fusion机制,简单来说cache fusion的主要目的

      1、减少各个节点对数据库的读写,提交效率,内存之间的数据同步要比磁盘到内存快很多

      2、保持数据的完整性,当某个节点执行查询时首先在本节点buffer中是否存在,如果不存在就会到另外节点buffer去查找,

      如果存在大量的cache交换就会产生gc buffer busy

      3、read by other session

      When information is requested from the database, Oracle will first read the data from disk into the database buffer cache.

      If two or more sessions request the same information, the first session will read the data into the buffer cache while other sessions wait.

      In previous versions, this wait was classified under the "buffer busy waits" event.

      However, in Oracle 10.1 and higher, this wait time is now broken out into the "read by other session" wait event.

      Excessive waits for this event are typically due to several processes repeatedly reading the same blocks,

      e.g. many sessions scanning the same index or performing full table scans on the same table.

      Tuning this issue is a matter of finding and eliminating this contention.

      该event是10.1以后新引入的,之前归类于buffer busy waits,Oracle对event的划分越来越细,

      产生原因:对同一个索引进行的大量扫描或对同一个表进行全表扫描

      查询等待的文件

      SELECT p1 "file#", p2 "block#", p3 "class#"
      FROM v$session_wait
      WHERE event = 'read by other session';

      查询热点块对应的对象

      SQL> SELECT relative_fno, owner, segment_name, segment_type
      FROM dba_extents
      WHERE file_id = &file
      AND &block BETWEEN block_id AND block_id + blocks - 1

      查看Top Sql 发现如下比较可疑,单次执行要花费20多分钟,

      查看表的索引无异常,查看执行计划是index range scan,貌似没有问题

      该表的数据量比较大且变化比较频繁,怀疑是由于索引引起的热点块,于是将其改为

      全表扫描,这样可以避免索引的热点块产生

      经过一段时间观察系统正常

      SQL ordered by Elapsed Time DB/Inst: BILLBJ/billbj1 Snaps: 15955-15956
      -> Resources reported for PL/SQL code includes the resources used by all SQL
      statements called by the code.
      -> % Total DB Time is the Elapsed Time of the SQL statement divided
      into the Total Database Time multiplied by 100
      Elapsed CPU Elap per % Total
      Time (s) Time (s) Executions Exec (s) DB Time SQL Id
      ---------- ---------- ------------ ---------- ------- -------------
      10,435 709 8 1304.3 66.8 5uc64k11j23wx
      Module: JDBC Thin Client
      select a.day day,sum(a.data) data,sum(a.indata) indata from bandwidth_d
      ailytable a where a.day between :v_start_date and :v_end_date and a.channelId
      in (select * from the (select cast(:v_channelId as channelIdArray) from dual))
      and a.nodeId in (select * from the (select cast(:v_nodeId as nodeIdArray) from d

      有时候产生同一个问题的几个原因可能是相悖的,比如read by other session,所以我们不能局限于某一个方面,此案例中db file sequential read wait也说明顺序读等待过多,最后的解决方法也说明查询未必适用索引就好。

      其实read by other session等待事件是要解决热点块问题,而解决热点块问题首先要减低IO次数,我们来看看IO情况:

    ~~~~~~~~~~~~ Per Second Per Transaction
    ~~~~~~~~~~~~ Per Second Per Transaction
    --------------- ---------------
    --------------- ---------------
    Redo size: 28,546.06 881.39
    Redo size: 28,546.06 881.39
    Logical reads: 37,292.67 1,151.45
    Logical reads: 37,292.67 1,151.45
    Block changes: 151.32 4.67
    Block changes: 151.32 4.67
    Physical reads: 2,872.88 88.70
    Physical reads: 2,872.88 88.70
    Physical writes: 2.64 0.08
    Physical writes: 2.64 0.08
    User calls: 114.65 3.54
    User calls: 114.65 3.54
    Parses: 60.26 1.86
    Parses: 60.26 1.86
    Hard parses: 28.64 0.88
    Hard parses: 28.64 0.88
    Sorts: 15.86 0.49
    Sorts: 15.86 0.49
    Logons: 0.42 0.01
    Logons: 0.42 0.01
    Executes: 106.84 3.30
    Executes: 106.84 3.30
    Transactions: 32.39
    Transactions: 32.39

      很明显逻辑读比较多,另外等待事件cr request retry也是造成性能差的一个原因。我们可以通过分割应用,对大表分区等方法进行系统优化。[/code]

    你可以参照下!!

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论

报告相同问题?

悬赏问题

  • ¥15 求chat4.0解答一道线性规划题,用lingo编程运行,第一问要求写出数学模型和lingo语言编程模型,第二问第三问解答就行,我的ddl要到了谁来求了
  • ¥15 Ubuntu在安装序列比对软件STAR时出现报错如何解决
  • ¥50 树莓派安卓APK系统签名
  • ¥15 maple软件,用solve求反函数出现rootof,怎么办?
  • ¥65 汇编语言除法溢出问题
  • ¥15 Visual Studio问题
  • ¥20 求一个html代码,有偿
  • ¥100 关于使用MATLAB中copularnd函数的问题
  • ¥20 在虚拟机的pycharm上
  • ¥15 jupyterthemes 设置完毕后没有效果