weixin_39605004
2020-11-21 20:08

Adding `dsqrtEvaluator` on X86

This adds an X86 version of getSupportsHardwareSQRT() that returns true, so that on X86 recognizedCallTransformer can reduce calls to StrictMath.sqrt() and Math.sqrt() to dsqrt.

X86 also has a function, inlineMathSQRT(), which previously handled calls to StrictMath.sqrt() and Math.sqrt() and special-cased a constant argument. dsqrtEvaluator does not include that constant handling, because the constant case is handled in treeSimplification (i.e. dsqrt is reduced to dconst). The following trace log was generated from my own test case [1] using a JVM that includes the newly added dsqrtEvaluator and has inlineMathSQRT removed:

  • Post Inlining Trees: for testsqrt.sq(D)D (Trees before treeSimplification)

n1n       BBStart <block_2>                                                                   [0x7eff1d5e5370] bci=[-1,0,17] rc=0 vc=14 vn=- li=- udi=- nc=0
n42n      treetop                                                                             [0x7eff1d5e6040] bci=[-1,3,17] rc=0 vc=14 vn=- li=- udi=- nc=1
n9n         aladd                                                                             [0x7eff1d5e55f0] bci=[-1,3,17] rc=1 vc=14 vn=- li=- udi=- nc=2
n7n           aconst 0x964300 (java/lang/StrictMath.class) (classPointerConstant sharedMemory )  [0x7eff1d5e5550] bci=[-1,3,17] rc=1 vc=14 vn=- li=- udi=- nc=0 flg=0x10000
n8n           lconst 48                                                                       [0x7eff1d5e55a0] bci=[-1,3,17] rc=1 vc=14 vn=- li=- udi=- nc=0
n43n      treetop                                                                             [0x7eff1d5e6090] bci=[-1,0,17] rc=0 vc=14 vn=- li=- udi=- nc=1
n3n         dconst 42.25 [0x4045200000000000]                                                 [0x7eff1d5e5410] bci=[-1,0,17] rc=2 vc=14 vn=- li=- udi=- nc=0
n10n      dstore  <auto slot>[#360  Auto] [flags 0x6 0x0 ]                                  [0x7eff1d5e5640] bci=[-1,6,17] rc=0 vc=14 vn=- li=- udi=- nc=1
n6n         dsqrt ()                                                                          [0x7eff1d5e5500] bci=[-1,3,17] rc=1 vc=14 vn=- li=- udi=- nc=1 flg=0x40000
n3n           ==>dconst 42.25 [0x4045200000000000]
n15n      NULLCHK on n11n [#31]                                                               [0x7eff1d5e57d0] bci=[-1,11,21] rc=0 vc=14 vn=- li=- udi=- nc=1
n14n        calli  java/io/PrintStream.println(D)V[#364  virtual Method -240] [flags 0x500 0x0 ] ()  [0x7eff1d5e5780] bci=[-1,11,21] rc=1 vc=15 vn=- li=- udi=- nc=3 flg=0x20
n13n          aloadi  <vft-symbol>[#295  Shadow] [flags 0x18607 0x0 ]                         [0x7eff1d5e5730] bci=[-1,11,21] rc=1 vc=14 vn=- li=- udi=- nc=1
n11n            aload  java/lang/System.out Ljava/io/PrintStream;[#361  final Static] [flags 0x20307 0x0 ]  [0x7eff1d5e5690] bci=[-1,7,21] rc=2 vc=14 vn=- li=- udi=- nc=0
n11n          ==>aload
n12n          dload  <auto slot>[#360  Auto] [flags 0x6 0x0 ]                               [0x7eff1d5e56e0] bci=[-1,10,21] rc=1 vc=14 vn=- li=- udi=- nc=0
n17n      dreturn                                                                             [0x7eff1d5e5870] bci=[-1,15,22] rc=0 vc=14 vn=- li=- udi=- nc=1
n16n        dload  <auto slot>[#360  Auto] [flags 0x6 0x0 ]                                 [0x7eff1d5e5820] bci=[-1,14,22] rc=1 vc=14 vn=- li=- udi=- nc=0
n2n       BBEnd </block_2> =====                                                              [0x7eff1d5e53c0] bci=[-1,15,22] rc=0 vc=14 vn=- li=- udi=- nc=0
  • Trees after treeSimplification: for testsqrt.sq(D)D

n1n       BBStart <block_2>                                                                   [0x7eff1d5e5370] bci=[-1,0,17] rc=0 vc=22 vn=- li=- udi=- nc=0
n42n      treetop                                                                             [0x7eff1d5e6040] bci=[-1,3,17] rc=0 vc=22 vn=- li=- udi=- nc=1
n9n         aladd                                                                             [0x7eff1d5e55f0] bci=[-1,3,17] rc=1 vc=22 vn=- li=- udi=- nc=2
n7n           aconst 0x964300 (java/lang/StrictMath.class) (classPointerConstant sharedMemory )  [0x7eff1d5e5550] bci=[-1,3,17] rc=1 vc=22 vn=- li=- udi=- nc=0 flg=0x10000
n8n           lconst 48 (highWordZero )                                                       [0x7eff1d5e55a0] bci=[-1,3,17] rc=1 vc=22 vn=- li=- udi=- nc=0 flg=0x4000
n43n      treetop                                                                             [0x7eff1d5e6090] bci=[-1,0,17] rc=0 vc=22 vn=- li=- udi=- nc=1
n3n         dconst 42.25 [0x4045200000000000]                                                 [0x7eff1d5e5410] bci=[-1,0,17] rc=1 vc=22 vn=- li=- udi=- nc=0
n10n      dstore  <auto slot>[#360  Auto] [flags 0x6 0x0 ]                                  [0x7eff1d5e5640] bci=[-1,6,17] rc=0 vc=22 vn=- li=- udi=- nc=1
n6n         dconst 6.5 [0x401a000000000000] ()                                                [0x7eff1d5e5500] bci=[-1,3,17] rc=1 vc=22 vn=- li=- udi=- nc=0 flg=0x40000
n15n      NULLCHK on n11n [#31]                                                               [0x7eff1d5e57d0] bci=[-1,11,21] rc=0 vc=22 vn=- li=- udi=- nc=1
n14n        calli  java/io/PrintStream.println(D)V[#364  virtual Method -240] [flags 0x500 0x0 ] ()  [0x7eff1d5e5780] bci=[-1,11,21] rc=1 vc=22 vn=- li=- udi=- nc=3 flg=0x20
n13n          aloadi  <vft-symbol>[#295  Shadow] [flags 0x18607 0x0 ]                         [0x7eff1d5e5730] bci=[-1,11,21] rc=1 vc=22 vn=- li=- udi=- nc=1
n11n            aload  java/lang/System.out Ljava/io/PrintStream;[#361  final Static] [flags 0x20307 0x0 ] (X>=0 sharedMemory )  [0x7eff1d5e5690] bci=[-1,7,21] rc=2 vc=22 vn=- li=- udi=- nc=0 flg=0x100
n11n          ==>aload
n12n          dload  <auto slot>[#360  Auto] [flags 0x6 0x0 ]                               [0x7eff1d5e56e0] bci=[-1,10,21] rc=1 vc=22 vn=- li=- udi=- nc=0
n17n      dreturn                                                                             [0x7eff1d5e5870] bci=[-1,15,22] rc=0 vc=22 vn=- li=- udi=- nc=1
n16n        dload  <auto slot>[#360  Auto] [flags 0x6 0x0 ]                                 [0x7eff1d5e5820] bci=[-1,14,22] rc=1 vc=22 vn=- li=- udi=- nc=0
n2n       BBEnd </block_2> =====                                                              [0x7eff1d5e53c0] bci=[-1,15,22] rc=0 vc=22 vn=- li=- udi=- nc=0

[1] Test case I used to generate the log:

```
public class testsqrt
   {
   public static void main(String[] args)
      {
      double total = 0;
      for (int i = 0; i < 5; i++)
         total += sq();
      System.out.println(total);
      }

   public static double sq()
      {
      double res = StrictMath.sqrt(42.25);
      System.out.println(res);
      return res;
      }
   }
```
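As a sanity check on the folded constant, the two bit patterns in the trace (0x4045200000000000 for `dconst 42.25` and 0x401a000000000000 for `dconst 6.5`) can be reproduced in plain Java. This is only a minimal sketch; the class and method names (`SqrtFoldCheck`, `sqrtBits`) are made up for illustration and are not part of the PR.

```java
public class SqrtFoldCheck {
    // Bit patterns copied from the trace log: the dsqrt operand and the folded result.
    static final long INPUT_BITS = 0x4045200000000000L;   // dconst 42.25
    static final long FOLDED_BITS = 0x401a000000000000L;  // dconst 6.5

    // Raw IEEE 754 bits of StrictMath.sqrt(x) (hypothetical helper).
    static long sqrtBits(double x) {
        return Double.doubleToLongBits(StrictMath.sqrt(x));
    }

    public static void main(String[] args) {
        double input = Double.longBitsToDouble(INPUT_BITS);
        System.out.println(input);                              // 42.25
        System.out.println(Long.toHexString(sqrtBits(input)));  // 401a000000000000
        System.out.println(sqrtBits(input) == FOLDED_BITS);     // true
    }
}
```

Because 6.5 * 6.5 is exactly 42.25, the square root is exact here, so the constant produced at compile time and the value computed at run time must have identical bits.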

This question originates from the open-source project: eclipse/omr

8 answers

  • weixin_39600400 5 months ago

    I'm not sure we should rely on the const being folded if the result of the dsqrt would be incorrect when folding does not happen. If it would be incorrect on OpenJ9 but not on OMR, then OpenJ9 should retain that logic in its own dsqrt evaluator.

  • weixin_39926311 5 months ago

    I'm not sure that relying on the const to be folded is something we should do if the result of the dsqrt will be incorrect without it happening..

    Can you explain? Why would evaluating the const and applying the square root instruction be incorrect? How is that different than passing a const through a parameter and doing a square root on the parameter?

    Reduction of the IL

      dsqrt
        dconst x

    to a dconst x^0.5 is the responsibility of the simplifier, not the codegen, IMO. The code generator should be dead simple and avoid optimizations like this, which should be cross-platform. There is no need to duplicate the logic across different codegens, which is why Power and Z don't have such const cases in their dsqrtEvaluators.
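The division of labour described above can be illustrated with a toy model. The sketch below is not OMR code: `Node`, its factory methods, and `simplify` are hypothetical stand-ins, written in Java only to show a platform-independent pass folding `dsqrt(dconst x)` into a `dconst` so that an evaluator would only ever see the non-constant case.

```java
public class DsqrtFoldSketch {
    // Hypothetical toy IL node; it does not mirror OMR's real node class.
    static final class Node {
        final String opcode;  // "dconst", "dload" or "dsqrt"
        final double value;   // meaningful for dconst only
        final Node child;     // meaningful for dsqrt only

        private Node(String opcode, double value, Node child) {
            this.opcode = opcode;
            this.value = value;
            this.child = child;
        }

        static Node dconst(double v) { return new Node("dconst", v, null); }
        static Node dload() { return new Node("dload", 0.0, null); }
        static Node dsqrt(Node c) { return new Node("dsqrt", 0.0, c); }
    }

    // Platform-independent folding: dsqrt(dconst x) becomes dconst sqrt(x).
    // Any other dsqrt is kept and left for the code generator.
    static Node simplify(Node n) {
        if (n.opcode.equals("dsqrt")) {
            Node child = simplify(n.child);
            if (child.opcode.equals("dconst")) {
                return Node.dconst(StrictMath.sqrt(child.value));
            }
            return Node.dsqrt(child);
        }
        return n;
    }

    public static void main(String[] args) {
        Node folded = simplify(Node.dsqrt(Node.dconst(42.25)));
        System.out.println(folded.opcode + " " + folded.value);  // dconst 6.5

        Node kept = simplify(Node.dsqrt(Node.dload()));
        System.out.println(kept.opcode);                          // dsqrt
    }
}
```

With the fold done once in shared code, each platform's evaluator can stay a straightforward mapping from the node to a square-root instruction.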

  • weixin_39600400 5 months ago

    So if we are going to rely on the folding in the simplifier, can we please make sure that the special-case handling in the X codegen is covered by the code in the simplifier, so we can be sure the overall handling of the evaluation remains unchanged?

  • weixin_39605004 5 months ago

    We added a few JIT tests named jit_recognizedMethod to test the corner cases of dsqrtEvaluator in [1] and launched a dependent build. The jit_recognizedMethod_0 variation runs with '-Xint' to confirm that the test itself works correctly. The jit_recognizedMethod_1 variation runs with the options -Xjit:count=1,disableAsyncCompilation; we checked locally that with these options the test performs the transformation to a dsqrt node and that the dsqrt node is reduced to dconst. The tests all pass on X86.

    Therefore, we can say that the transformation in recognizedCallTransformer and the reduction in treeSimplification give the correct result.

    [2]-[5] are some example tests that we launched in the dependent build [1].

    [1] https://github.com/eclipse/openj9/pull/7609#issuecomment-549400562
    [2] https://ci.eclipse.org/openj9/job/Test_openjdk8_j9_sanity.functional_x86-64_linux_Personal/415/console
    [3] https://ci.eclipse.org/openj9/job/Test_openjdk8_j9_sanity.functional_x86-64_linux_xl_Personal/185/console
    [4] https://ci.eclipse.org/openj9/job/Test_openjdk8_j9_sanity.functional_x86-64_mac_Personal/202/console
    [5] https://ci.eclipse.org/openj9/job/Test_openjdk8_j9_sanity.functional_x86-64_windows_Personal/303/console
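For readers without access to the linked builds, the kind of corner-case check described above might look roughly like the sketch below. It is not the actual jit_recognizedMethod test; the class name, the helper names, and the chosen inputs are assumptions made for illustration. It compares Math.sqrt against StrictMath.sqrt bit for bit, so the same class could be run once under -Xint and once under the JIT options mentioned above.

```java
public class SqrtCornerCases {
    // Illustrative inputs: zeros, exact and inexact roots, negatives,
    // the extremes of the double range, infinities and NaN.
    static final double[] INPUTS = {
        0.0, -0.0, 1.0, 2.0, 42.25, -1.0,
        Double.MIN_VALUE, Double.MAX_VALUE,
        Double.POSITIVE_INFINITY, Double.NEGATIVE_INFINITY, Double.NaN
    };

    // Bitwise comparison; doubleToLongBits maps every NaN to one canonical value.
    static boolean sameBits(double a, double b) {
        return Double.doubleToLongBits(a) == Double.doubleToLongBits(b);
    }

    // True when both library entry points give the same result for every input.
    static boolean allAgree() {
        for (double x : INPUTS) {
            if (!sameBits(Math.sqrt(x), StrictMath.sqrt(x))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("all agree: " + allAgree());
        System.out.println("sqrt(-0.0) keeps its sign: " + sameBits(Math.sqrt(-0.0), -0.0));
        System.out.println("sqrt(-1.0) is NaN: " + Double.isNaN(Math.sqrt(-1.0)));
    }
}
```

Comparing bit patterns rather than using `==` is deliberate: `==` treats 0.0 and -0.0 as equal and never treats NaN as equal to itself, so it would hide exactly the cases such a test is meant to catch.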

  • weixin_39600400 5 months ago

    -omr build all

  • weixin_39926311 5 months ago

    -omr build all

  • weixin_39926311 5 months ago

    I think this one is good to go. The changes are all x86.

  • weixin_39600400 5 months ago

    The AppVeyor failure is an infra issue. 390 and ppc are not affected by the change, so I agree this is good to go.
