Hi Brad,
Sorry to trouble you again! I have run into an error while trying out the SV prioritisation:

[2017-01-16T13:15Z] Prioritize: simplified annotation output
[2017-01-16T13:15Z] Sort VCF by reference
[2017-01-16T13:15Z] tabix index 1258_290816_M-sort-call-effects-simple-prep.vcf.gz
[2017-01-16T13:15Z] Prioritize: convert to tab delimited
[2017-01-16T13:15Z] SV genotyping with svtyper
[2017-01-16T14:05Z] Sort VCF by reference
[2017-01-16T14:05Z] tabix index 1258_290816_M-sort-M1258_290816-svs-prep-1258_290816_M-std-wgts-prep-combined-filter-backfilter-effects-wgts-prep.vcf.gz
[2017-01-16T14:05Z] Prioritize: simplified annotation output
[2017-01-16T14:06Z] Sort VCF by reference
[2017-01-16T14:06Z] tabix index 1258_290816_M-sort-M1258_290816-svs-prep-1258_290816_M-std-wgts-prep-combined-filter-backfilter-effects-wgts-prep-simple-prep.vcf.gz
[2017-01-16T14:06Z] Prioritize: convert to tab delimited
[2017-01-16T14:06Z] Prioritize: simplified annotation output
[2017-01-16T14:06Z] Sort VCF by reference
[2017-01-16T14:06Z] tabix index somaticSV-1258_290816_M-simple-prep.vcf.gz
[2017-01-16T14:06Z] Prioritize: convert to tab delimited
[2017-01-16T14:06Z] Combine prioritized from multiple callers
Traceback (most recent call last):
  File "/mnt/projects/dlho/tancrc/bcbio_pipeline/bin/bcbio_nextgen.py", line 230, in <module>
    main(**kwargs)
  File "/mnt/projects/dlho/tancrc/bcbio_pipeline/bin/bcbio_nextgen.py", line 43, in main
    run_main(**kwargs)
  File "/mnt/projects/dlho/tancrc/bcbio_pipeline/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 43, in run_main
    fc_dir, run_info_yaml)
  File "/mnt/projects/dlho/tancrc/bcbio_pipeline/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 87, in _run_toplevel
    for xs in pipeline(config, run_info_yaml, parallel, dirs, samples):
  File "/mnt/projects/dlho/tancrc/bcbio_pipeline/anaconda/lib/python2.7/site-packages/bcbio/pipeline/main.py", line 180, in variant2pipeline
    samples = structural.run(samples, run_parallel, "ensemble")
  File "/mnt/projects/dlho/tancrc/bcbio_pipeline/anaconda/lib/python2.7/site-packages/bcbio/structural/__init__.py", line 146, in run
    for xs in to_process.values()))
  File "/mnt/projects/dlho/tancrc/bcbio_pipeline/anaconda/lib/python2.7/site-packages/bcbio/distributed/multi.py", line 28, in run_parallel
    return run_multicore(fn, items, config, parallel=parallel)
  File "/mnt/projects/dlho/tancrc/bcbio_pipeline/anaconda/lib/python2.7/site-packages/bcbio/distributed/multi.py", line 86, in run_multicore
    for data in joblib.Parallel(parallel["num_jobs"])(joblib.delayed(fn)(x) for x in items):
  File "/mnt/projects/dlho/tancrc/bcbio_pipeline/anaconda/lib/python2.7/site-packages/joblib/parallel.py", line 800, in __call__
    while self.dispatch_one_batch(iterator):
  File "/mnt/projects/dlho/tancrc/bcbio_pipeline/anaconda/lib/python2.7/site-packages/joblib/parallel.py", line 658, in dispatch_one_batch
    self._dispatch(tasks)
  File "/mnt/projects/dlho/tancrc/bcbio_pipeline/anaconda/lib/python2.7/site-packages/joblib/parallel.py", line 566, in _dispatch
    job = ImmediateComputeBatch(batch)
  File "/mnt/projects/dlho/tancrc/bcbio_pipeline/anaconda/lib/python2.7/site-packages/joblib/parallel.py", line 180, in __init__
    self.results = batch()
  File "/mnt/projects/dlho/tancrc/bcbio_pipeline/anaconda/lib/python2.7/site-packages/joblib/parallel.py", line 72, in __call__
    return [func(*args, **kwargs) for func, args, kwargs in self.items]
  File "/mnt/projects/dlho/tancrc/bcbio_pipeline/anaconda/lib/python2.7/site-packages/bcbio/utils.py", line 51, in wrapper
    return apply(f, *args, **kwargs)
  File "/mnt/projects/dlho/tancrc/bcbio_pipeline/anaconda/lib/python2.7/site-packages/bcbio/distributed/multitasks.py", line 228, in detect_sv
    return structural.detect_sv(*args)
  File "/mnt/projects/dlho/tancrc/bcbio_pipeline/anaconda/lib/python2.7/site-packages/bcbio/structural/__init__.py", line 165, in detect_sv
    for svdata in caller_fn(items):
  File "/mnt/projects/dlho/tancrc/bcbio_pipeline/anaconda/lib/python2.7/site-packages/bcbio/structural/prioritize.py", line 41, in run
    data = _cnv_prioritize(data)
  File "/mnt/projects/dlho/tancrc/bcbio_pipeline/anaconda/lib/python2.7/site-packages/bcbio/structural/prioritize.py", line 189, in _cnv_prioritize
    df = supported[pcall["variantcaller"]]"fn"
  File "/mnt/projects/dlho/tancrc/bcbio_pipeline/anaconda/lib/python2.7/site-packages/bcbio/structural/prioritize.py", line 152, in _cnvkit_prioritize
    mdf = mdf[mdf["gene"].str.contains("|".join(genes))]
  File "/mnt/projects/dlho/tancrc/bcbio_pipeline/anaconda/lib/python2.7/site-packages/pandas/core/strings.py", line 1471, in contains
    regex=regex)
  File "/mnt/projects/dlho/tancrc/bcbio_pipeline/anaconda/lib/python2.7/site-packages/pandas/core/strings.py", line 231, in str_contains
    regex = re.compile(pat, flags=flags)
  File "/mnt/projects/dlho/tancrc/bcbio_pipeline/anaconda/lib/python2.7/re.py", line 194, in compile
    return _compile(pattern, flags)
  File "/mnt/projects/dlho/tancrc/bcbio_pipeline/anaconda/lib/python2.7/re.py", line 249, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/mnt/projects/dlho/tancrc/bcbio_pipeline/anaconda/lib/python2.7/sre_compile.py", line 583, in compile
    "sorry, but this version only supports 100 named groups"
AssertionError: sorry, but this version only supports 100 named groups
I am using lumpy, manta, and CNVkit callers. Thanks for looking into this for me!
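In case it helps narrow things down: the last bcbio frame builds one regular expression with `"|".join(genes)`, and the Python 2.7 `re` module refuses to compile a pattern with more than 100 groups. My guess, and it is only a guess, is that some names in the gene list contain parentheses, which the regex engine reads as groups. The sketch below uses made-up gene labels (I have not checked the real list) and a hypothetical helper `filter_rows` to show how escaping each name keeps the group count at zero. It is not taken from the bcbio code.

```python
import re

# Hypothetical gene labels; the real priority list is not shown in the log above.
genes = ["TP53", "BRCA1", "HLA(A)", "IGH(V)"]


def count_groups(pattern):
    """Return how many capturing groups the regex engine sees in a pattern."""
    return re.compile(pattern).groups


# Joined as-is, every "(...)" inside a name is parsed as a capturing group.
raw_pattern = "|".join(genes)

# Escaping each name first makes the parentheses literal characters.
escaped_pattern = "|".join(re.escape(g) for g in genes)


def filter_rows(rows, genes):
    """Keep rows whose gene field contains any of the names, matched literally."""
    matcher = re.compile("|".join(re.escape(g) for g in genes))
    return [row for row in rows if matcher.search(row)]


if __name__ == "__main__":
    print(count_groups(raw_pattern), count_groups(escaped_pattern))
    print(filter_rows(["HLA(A),FOO", "XYZ"], genes))
```

If the names really do contain parentheses, the same escaping could presumably be applied to the pattern before it is handed to `str.contains`, but I have not tested that inside the pipeline.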
This question comes from the open source project: bcbio/bcbio-nextgen