weixin_39887546
2020-12-28 20:01

Options for selecting TCGA + non-TCGA studies

After we removed the option to select all studies and replaced it with TCGA PanCancer Atlas studies, a number of users complained.

Here are two users describing their use cases:

“I have been studying some genes with very low mutation frequencies, and had previously determined that some mutations were present in very specific, non-TCGA studies. So if the search is only limited to the TCGA, then those mutations are missed. Manually selecting all studies is really a chore to be sure of not missing some rare mutations. These mutations could hold functional clues to how the enzyme works in cancer, so missing out on them could be detrimental.”

"We generally select all studies when doing a query in cBioPortal. Usually the situation for when we are performing the query is related to variant curation; we are looking for reported cases of a particular variant in any type of cancer. It is easy for us to select all the studies and figure out which are duplicate cases based on the sample ID, rather than selecting certain studies which would limit acquisition of experience with the variant. The TCGA PanCancer studies alone would be insufficient for this same limitation."

To solve this (addressing the use cases above), one option would be providing a button to allow users to select non-redundant TCGA + non-TCGA studies or a "good default set of studies" as proposed in #3395.
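The second use case above relies on spotting duplicate cases by sample ID after querying every study. A minimal sketch of that check, assuming the query results have already been flattened into `(study_id, sample_id)` pairs; the function name and the sample data are illustrative and not part of cBioPortal:

```python
from collections import defaultdict


def find_duplicate_samples(records):
    """Return sample IDs that appear in more than one study.

    records: iterable of (study_id, sample_id) pairs.
    Result maps each duplicated sample ID to a sorted list of study IDs.
    """
    studies_by_sample = defaultdict(set)
    for study_id, sample_id in records:
        studies_by_sample[sample_id].add(study_id)
    return {
        sample_id: sorted(studies)
        for sample_id, studies in studies_by_sample.items()
        if len(studies) > 1
    }


# Illustrative data only: one sample reported by two overlapping studies.
records = [
    ("brca_tcga", "TCGA-A1-A0SB-01"),
    ("brca_tcga_pan_can_atlas_2018", "TCGA-A1-A0SB-01"),
    ("brca_tcga_pan_can_atlas_2018", "TCGA-A1-A0SD-01"),
    ("example_non_tcga_study", "P-0000001-T01"),
]
duplicates = find_duplicate_samples(records)
print(duplicates)
```

A curated, non-redundant study set would make this post-processing step unnecessary for most queries.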

/product

This question originates from the open-source project: cBioPortal/cbioportal

9 answers

  • weixin_39887546 weixin_39887546 2020-12-28 20:01

    as commented in https://github.com/cBioPortal/cbioportal/issues/3395#issuecomment-473596022, here is a list of 155 studies: default_studies_list_20190326.txt

    Please:

    • [ ] Make the "Quick select" buttons configurable, e.g. add the button names and studies to the properties or JSON configuration files
      • by default there are no quick select buttons
    • [ ] Add a new button (i.e. in properties) "A curated set of 155 studies" after "TCGA PanCancer Atlas studies"
      • /product please propose a better name if you can
    • [ ] MSK portal will only have the "TCGA PanCancer Atlas studies" button
    • [ ] Public portal will have both buttons
    • [ ] Change https://www.cbioportal.org/ln?q=TP53:MUT to use the new list

    Since this is a whitelist, we will need to update it whenever new studies are released.
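Keeping a whitelist current means comparing it against whatever studies the portal actually serves. A rough sketch of such an audit, assuming both lists are available as plain study-ID strings; `audit_whitelist` and the sample IDs are hypothetical, and a study reported as "unreviewed" is only a candidate for curation, not automatically a member of the non-redundant set:

```python
def audit_whitelist(whitelist, available_study_ids):
    """Compare a curated whitelist with the study IDs currently on the portal.

    Returns a dict with:
      "stale":      whitelisted IDs that no longer exist on the portal
      "unreviewed": portal IDs that nobody has considered for the whitelist yet
    """
    whitelisted = set(whitelist)
    available = set(available_study_ids)
    return {
        "stale": sorted(whitelisted - available),
        "unreviewed": sorted(available - whitelisted),
    }


# Illustrative data only.
whitelist = ["brca_tcga_pan_can_atlas_2018", "utuc_mskcc_2013"]
available = ["brca_tcga_pan_can_atlas_2018", "utuc_mskcc_2015", "brca_tcga"]
report = audit_whitelist(whitelist, available)
print(report)
```

Running a check like this whenever studies are released would flag both mistyped IDs and newly added studies that still need a curation decision.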

  • weixin_39887546 weixin_39887546 2020-12-28 20:01

    I am wondering if we should use this list for Quick Search too... It is slower than just TCGA pancancer studies, but not too bad: https://www.cbioportal.org/results/mutations?session_id=5c8daff7e4b046111fee2481

  • weixin_39636333 weixin_39636333 2020-12-28 20:01

    Do you think these quick search buttons (PanCan and now the curated set) are important for portal instances besides MSKCC and public? The configuration is a little awkward: 1. because we have to use structured data to represent the list, and 2. because in the PanCan case, the PanCan studies button can really only be shown and defined at run time.

    Simplest solution would be to just have a flag that turns these on for our portals. Let me know what you think.

  • weixin_39887546 weixin_39887546 2020-12-28 20:01
    1. Maybe we should define them in the frontend config? e.g.
    json
    {
      "quickSelect": [
        {
          "name": "TCGA PanCancer Atlas studies",
          "studyIds": [...],
          "description": "33 TCGA PanCancer Atlas studies"
        },
        {
          "name": "Curated set of non-redundant studies",
          "studyIds": [...],
          "description": "155 studies that are manually curated, including TCGA and non-TCGA studies with no overlapping samples"
        }
      ]
    }

    2. I think we should put the name of the pancan studies as well.
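A hand-edited config like the one proposed above is easy to get subtly wrong, for example with a misspelled key. The following sketch shows one way a deployment script could check it before use. It assumes the `quickSelect` / `name` / `studyIds` / `description` shape from the proposal; `validate_quick_select` is a hypothetical helper, not an existing cBioPortal function:

```python
import json

REQUIRED_KEYS = {"name", "studyIds", "description"}


def validate_quick_select(config_text):
    """Parse the config and return a list of problems (empty if it looks fine)."""
    config = json.loads(config_text)
    problems = []
    for index, button in enumerate(config.get("quickSelect", [])):
        missing = REQUIRED_KEYS - button.keys()
        if missing:
            problems.append(f"button {index}: missing keys {sorted(missing)}")
            continue
        study_ids = button["studyIds"]
        if len(study_ids) != len(set(study_ids)):
            problems.append(f"button {index}: duplicate study IDs")
    return problems


# Illustrative config with a deliberately misspelled "description" key.
example = """
{
  "quickSelect": [
    {"name": "Example button", "studyIds": ["study_a", "study_b"], "descreption": "two studies"}
  ]
}
"""
problems = validate_quick_select(example)
print(problems)
```

Syntax errors such as a missing comma would already surface as a `json.JSONDecodeError` from `json.loads`, so the helper only needs to cover structural mistakes.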
  • weixin_39636333 weixin_39636333 2020-12-28 20:01

    It is defined in the frontend config. The problem is that on dashi we could have a JSON file sitting on disk, but in AWS we cannot do this because instances are ephemeral. So the JSON configuration needs to live in a repository, just like portal.properties. Will discuss with ino tomorrow.
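One way to serve both setups is a loader that accepts either a local path or a URL pointing at a file in a repository. This is only a sketch under that assumption: `load_quick_select` is a hypothetical name, the config shape follows the `quickSelect` proposal earlier in the thread, and the demo uses a temporary file so it runs without network access:

```python
import json
import tempfile
from pathlib import Path
from urllib.request import urlopen


def load_quick_select(source):
    """Load quick select buttons from an http(s) URL or a local file path."""
    if source.startswith(("http://", "https://")):
        # Ephemeral instances: fetch the file from wherever the repository serves it.
        with urlopen(source, timeout=10) as response:
            text = response.read().decode("utf-8")
    else:
        # Long-lived hosts: read the file from disk.
        text = Path(source).read_text(encoding="utf-8")
    return json.loads(text)["quickSelect"]


# Demo with a local temporary file (illustrative content only).
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "quick_select.json"
    config = {
        "quickSelect": [
            {"name": "Example button", "studyIds": ["study_a"], "description": "one study"}
        ]
    }
    path.write_text(json.dumps(config), encoding="utf-8")
    buttons = load_quick_select(str(path))

print(buttons[0]["name"])
```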

  • weixin_39887546 weixin_39887546 2020-12-28 20:01

    Maybe we can keep some public configuration files (without private keys) on GitHub?

  • weixin_39636333 weixin_39636333 2020-12-28 20:01

    Do you think the tooltip for the curated set should indicate something about the non-overlapping nature, i.e. that this is what drove the curation of the set?

  • weixin_39887546 weixin_39887546 2020-12-28 20:01

    I've updated the name and description above.

  • weixin_39887546 weixin_39887546 2020-12-28 20:01

    utuc_mskcc_2013 should be utuc_mskcc_2015

    https://github.com/cBioPortal/cbioportal/issues/5889

