weixin_39941262
2020-11-21 20:58

run2 re-miniAOD setup based on UL inputs

  • a run2_miniAOD_UL process modifier is introduced for the UL reminiAOD needs.
    • note that process modifiers have been preferred over eras for a while; I took the chance to switch to the new way
    • I expect that one modifier should be enough for all years; year-specific config modifications should be possible with standard modifiers in combination, e.g. as (ctpps_2016 & run2_miniAOD_UL).toModify
      • if this is not enough, e.g. for 2016, we can add run2_miniAOD_UL16extra, but hopefully this can be avoided
  • only minimal updates to miniAOD parameters compared to the defaults are made to have the workflow running
    • rerun container discriminators as in run2_miniAOD_94XFall17; please check and confirm whether this needs a simple change (a more complex one would better be done by you in a follow-up PR)
    • drop trying to add pixelClusterTagInfos as in other run2 reminis because this collection is not available in 106X AODs (added in 11_X)
    • PUPPI re-addition will be needed after a fixed version of #29254 is available
    • I did not add the photon/electron correctors, assuming that these will be added by EGM either in #29526
  • tentative data and MC workflows are set up for 4 periods of Run2 (2 for 2016, 2017, and 2018)
    • the data workflow numbering takes the existing "legacy" remini as "10" as a reference (even though 0 is implicit due to the numeric nature) and assigns this UL to be "11": e.g. 136.8311 (= 136.83110) -> 136.83111. This minimizes conflicts with some nanoAOD workflows and uses a common pattern
    • the MC workflow numbering starts from 1325.5 as a base (this corresponds to the old 2016 MC remini) and appends "160" (1325.516) and "161" (1325.5161) for the 2016 pre and post-VFP, respectively; similarly "17" and "18" are used for the 2017 and 2018 UL workflows. This was an attempt to establish a somewhat intuitive pattern and avoid conflicts with existing wfs.
  • the input datasets are not available at CERN in most cases, so a request will need to be made (somehow, for data the cmsDriver is not generating a site=T2_CH_CERN requirement and the workflows can run out of the box).
    • for TTbar_13_reminiaod2017UL_INPUT I couldn't find an AODSIM relval with PU and used a RECO input instead
  • 136.88811, the 2018D UL remini workflow, is added to the short matrix
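
The numbering scheme described in the workflow bullets above can be sketched in a few lines of Python. This is only an illustration of the pattern, not code from the PR: the helper names (`ul_data_workflow`, `ul_mc_workflow`) and the period labels used as dictionary keys are hypothetical, and IDs are handled as strings so that no floating-point rounding is involved.

```python
# Illustrative sketch of the workflow numbering pattern described above.
# The function names and period labels are hypothetical, not part of CMSSW.

MC_BASE = "1325.5"  # base taken from the old 2016 MC remini workflow

# Suffix appended to the MC base for each UL period (labels are assumptions).
MC_SUFFIX = {
    "2016preVFP": "160",
    "2016postVFP": "161",
    "2017": "17",
    "2018": "18",
}


def ul_data_workflow(legacy_id: str) -> str:
    """Derive the UL remini data workflow ID from a legacy remini ID.

    The legacy ID is read as ending in "10" (the trailing 0 is implicit),
    and the UL workflow replaces that ending with "11".
    """
    explicit = legacy_id + "0"  # make the implicit trailing zero explicit
    if not explicit.endswith("10"):
        raise ValueError(f"{legacy_id} does not follow the legacy remini pattern")
    return explicit[:-2] + "11"


def ul_mc_workflow(period: str) -> str:
    """Build the UL remini MC workflow ID for a given period label."""
    # Trailing zeros carry no meaning in a numeric workflow ID, so drop them.
    return (MC_BASE + MC_SUFFIX[period]).rstrip("0")


if __name__ == "__main__":
    print(ul_data_workflow("136.8311"))
    for period in MC_SUFFIX:
        print(period, ul_mc_workflow(period))
```

Keeping the IDs as strings makes the implicit trailing zero easy to reason about: "1325.5" plus "160" is the same workflow number as 1325.516, while "161" gives 1325.5161.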

A summary of limited validation for 1325.517 (MC 2017), comparing the original 10_6_X MINIAOD with the remini in CMSSW_11_1_X_2020-05-04-1100 using FWLite: the differences are apparently in recognizable places where changes were introduced since 10_6_X

  • PUPPI updates are visible, including bugfixes in the MET uncertainties and corrections
  • muon beta variables are filled
  • tau ID variables are hard to compare with a plain branch diff due to reshuffling of the IDs introduced in 11_1_X

Please let me know if you can make a PhEDEx request for the datasets proposed here in the tests, or if an alternative can be transferred (I can then update the PR).

This question originates from the open-source project: cms-sw/cmssw


30 answers

  • weixin_39668527 5 months ago

    Sorry, I just saw your comment about PhEDEx. I think it should be fine even if the input is not at CERN. We can switch these lines to search at other sites: https://github.com/slava77/cmssw/blob/eac0dc26d24ad80f85d4b4bf638ce1b931feac0f/Configuration/PyReleaseValidation/python/MatrixUtil.py#L180-L185 However, for data AOD we probably need to ask computing to lock them at the site. What do you think?

  • weixin_39962889 5 months ago

    A new Pull Request was created by (Slava Krutelyov) for master.

    It involves the following packages:

    • Configuration/ProcessModifiers
    • Configuration/PyReleaseValidation
    • PhysicsTools/PatAlgos

    Can you please review it and eventually sign? Thanks. This is something you requested to watch as well. You are the release manager for this.

    cms-bot commands are listed here

  • weixin_39941262 5 months ago

    please test

  • weixin_39962889 5 months ago

    The tests are being triggered in jenkins. https://cmssdt.cern.ch/jenkins/job/ib-run-pr-tests/6146/console Started: 2020/05/07 06:19

  • weixin_39962889 5 months ago

    +1
    Tested at: eac0dc26d24ad80f85d4b4bf638ce1b931feac0f
    https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-4d00f1/6146/summary.html
    CMSSW: CMSSW_11_1_X_2020-05-06-2300
    SCRAM_ARCH: slc7_amd64_gcc820

  • weixin_39962889 5 months ago

    Comparison job queued.

  • weixin_39962889 5 months ago

    Comparison is ready https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-4d00f1/6146/summary.html

    Comparisons for the following workflows were not done due to a missing matrix map:
    • /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-4d00f1/136.88811_RunJetHT2018D_reminiaodUL+RunJetHT2018D_reminiaodUL+REMINIAOD_data2018UL+HARVEST2018_REMINIAOD_data2018UL

    Comparison Summary:
    • No significant changes to the logs found
    • Reco comparison results: 4 differences found in the comparisons
    • DQMHistoTests: Total files compared: 34
    • DQMHistoTests: Total histograms compared: 2697527
    • DQMHistoTests: Total failures: 44
    • DQMHistoTests: Total nulls: 0
    • DQMHistoTests: Total successes: 2697164
    • DQMHistoTests: Total skipped: 319
    • DQMHistoTests: Total Missing objects: 0
    • DQMHistoSizes: Histogram memory added: 0.0 KiB (33 files compared)
    • Checked 147 log files, 16 edm output root files, 34 DQM output files

  • weixin_39941262 5 months ago

    what is the strategy of introducing new input files for new workflows in the IB infrastructure? Can you trigger transfers while the PR is in progress or does it happen after the integration? Please let me know. Thank you.

  • weixin_39941262 5 months ago

    assign xpog in case there are comments/concerns on the strategy and the setup.

  • weixin_39962889 5 months ago

    New categories assigned: xpog

    You have been requested to review this Pull request/Issue and eventually sign. Thanks

  • weixin_39941262 5 months ago

    +1

    for https://github.com/cms-sw/cmssw/pull/29756 eac0dc2 - tests passed OK, including the new workflow

  • weixin_39993623 5 months ago

    +upgrade

  • weixin_39620629 5 months ago
    * rerun container discriminators as in run2_miniAOD_94XFall17   please check and confirm if this needs a simple change (a more complex one would better be done by you in a follow up PR)
    

    I took a look at this only now. Rerunning tau discriminators as in run2_miniAOD_94XFall17 is the correct way to deal with the tau discriminator data format in 10_6_X UL AOD input samples, which differs from the one expected by patTaus in 11_1_X. However, the rerun discriminators will be those of the release used for reMiniAOD rather than the ones in the input. This is not necessarily wrong (I think Tau POG might like it), but I wonder if it is what was expected. Regardless, I think that there is a way to also read discriminators in the old (pre 11_1_X) format.

  • weixin_39941262 5 months ago

    This is not necessarily wrong (I think Tau POG might like it), but I wonder if it is what was expected. Regardless, I think that there is a way to read also discriminators in old (pre 11_1_X) format.

    Thank you for checking. If Tau POG will like it, I would then accept this as expected. 😄 In the context of 10_6_X, this will be a bit of a waste of CPU, but that's a rather small fraction.

    For 11_X, rerunning may be more forward looking so that we do not commit ourselves to support reading the old discriminators in the future.

  • weixin_39681171 5 months ago

    Reading the old data format is indeed possible. We recently used this functionality to fix two of the matrix workflows. Slava might remember. But I agree with Michal that rerunning the IDs in the latest version might be preferred. If the decision depends on the ID, a hybrid solution could also be possible, as switching between reading the old or the new format is just a matter of python configuration.

  • weixin_39620629 5 months ago

    This is not necessarily wrong (I think Tau POG might like it), but I wonder if it is what was expected. Regardless, I think that there is a way to read also discriminators in old (pre 11_1_X) format.

    Thank you for checking. If Tau POG will like it, I would then accept this as expected. 😄 In the context of 10_6_X, this will be a bit of a waste of CPU, but that's a rather small fraction.

    For 11_X, rerunning may be more forward looking so that we do not commit ourselves to support reading the old discriminators in the future.

    I agree with the above arguments. We are checking with conveners.

    One more issue: it will be needed to adjust also NanoAOD with the new modifier. But I think it is anyway beyond the scope of this PR, isn't it?

  • weixin_39941262 5 months ago

    One more issue: it will be needed to adjust also NanoAOD with the new modifier. But I think it is anyway beyond the scope of this PR, isn't it?

    I'm assuming that it can be done in a follow up. I added xpog to the list of signatures earlier. So, if some updates are needed in this PR from xpog review, I can update.

  • weixin_39672979 5 months ago

    Do you have any comments ? xpog:
    pdmv:
    analysis:

  • weixin_39668527 5 months ago

    +1

  • weixin_39620629 5 months ago

    This is not necessarily wrong (I think Tau POG might like it), but I wonder if it is what was expected. Regardless, I think that there is a way to read also discriminators in old (pre 11_1_X) format.

    Thank you for checking. If Tau POG will like it, I would then accept this as expected. 😄 In the context of 10_6_X, this will be a bit of a waste of CPU, but that's a rather small fraction.

    For 11_X, rerunning may be more forward looking so that we do not commit ourselves to support reading the old discriminators in the future.

    This is just a confirmation from the conveners that Tau POG likes it.

  • weixin_39672979 5 months ago

    do you have any comments? analysis:
    xpog:

  • weixin_39722070 5 months ago

    +xpog

    yes, for NanoAOD there will be a follow up in a separate pull request, that will come once the updates for MiniAOD are integrated

  • weixin_39672979 5 months ago

    +operations

  • weixin_39672979 5 months ago

    merge

    Have you copied the missing input files to CERN? https://github.com/cms-sw/cmssw/pull/29756#issuecomment-625238866

  • weixin_39941262 5 months ago
    * PUPPI re-addition will be needed after a fixed version of #29254 is available
    

    ah. So, the tests were not redone while the PUPPI PR was merged. This broke the tests in the IB

  • weixin_39939668 5 months ago

    The relevant lines to change for these workflows are likely these:
    • https://github.com/cms-sw/cmssw/pull/29254/files#diff-76dede28e7751d7bb6c5c9542de4155cR430
    • https://github.com/cms-sw/cmssw/pull/29254/files#diff-416ad16d53535660648a4a4bb98636d6R33
    • https://github.com/cms-sw/cmssw/pull/29254/files#diff-76dede28e7751d7bb6c5c9542de4155cR374

    If you could take care of that, it would be awesome.

  • weixin_39978863 5 months ago

    +1

  • weixin_39962889 5 months ago

    This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will be automatically merged.

  • weixin_39962889 5 months ago

    The code-checks are being triggered in jenkins.

  • weixin_39962889 5 months ago

    +code-checks

    Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-29756/15191

    • This PR adds an extra 80KB to repository

    • There are other open Pull requests which might conflict with changes you have proposed:

    • File Configuration/PyReleaseValidation/python/relval_steps.py modified in PR(s): #29630
