BIO-03 Diversity of marine host-associated microbiomes
Symbiont-Screener: a reference-free tool to separate host sequences
from symbionts for error-prone long reads
Mengyang Xu*, BGI-Qingdao, Qingdao, 266555, China; BGI-Shenzhen, Shenzhen, 518083, China
Lidong Guo, BGI-Qingdao, Qingdao, 266555, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
Chengcheng Shi, BGI-Qingdao, Qingdao, 266555, China
Xiaochuan Liu, BGI-Qingdao, Qingdao, 266555, China
Yanwei Qi, BGI-Qingdao, Qingdao, 266555, China
Jianwei Chen, BGI-Qingdao, Qingdao, 266555, China
Jinglin Han, BGI-Qingdao, Qingdao, 266555, China
Li Deng, BGI-Qingdao, Qingdao, 266555, China
Xin Liu, BGI-Qingdao, Qingdao, 266555, China; BGI-Shenzhen, Shenzhen, 518083, China; State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, 518083, China
Guangyi Fan, BGI-Qingdao, Qingdao, 266555, China; BGI-Shenzhen, Shenzhen, 518083, China; State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, 518083, China
Decontamination is necessary to eliminate the effect of foreign genomes on symbiosis research and biomedical discoveries, yet directly extracting host reads from error-prone long-read data without reference genomes remains challenging. Here we present Symbiont-Screener, a reference-free approach that identifies raw host long reads using a trio-based screening model, which exploits strobemers and unsupervised clustering to overcome high error rates. When applied to simulated and real contaminated datasets, it outperforms other de novo decontamination tools and achieves decontamination precision and recall comparable to those of state-of-the-art reference-based classifiers, which promises high-quality reconstruction of the host genome. The analysis code is available at https://github.com/BGI-Qingdao/Symbiont-Screener. This research was supported by the National Natural Science Foundation of China (Grant No. 32100514).
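For readers unfamiliar with strobemers, the following is a minimal sketch of a simplified order-2 randstrobe construction, together with a helper that counts how many of a read's strobemers fall in a given marker set. It is an illustration only and is not taken from the Symbiont-Screener code: the function names (`randstrobes`, `marker_hits`), the parameters (`k`, `w_min`, `w_max`), and the hashing rule used to pick the second strobe are all assumptions made for this example.

```python
import hashlib


def _hash(s: str) -> int:
    """Deterministic 64-bit hash of a string (illustrative choice)."""
    return int.from_bytes(hashlib.blake2b(s.encode(), digest_size=8).digest(), "big")


def randstrobes(seq: str, k: int, w_min: int, w_max: int) -> set:
    """Return the set of simplified order-2 randstrobes of `seq`.

    Strobe 1 is the k-mer starting at position i. Strobe 2 is the k-mer
    starting in the window [i + w_min, i + w_max] that minimises the hash
    of the two strobes concatenated. This is a simplified linking rule,
    assumed here for illustration.
    """
    out = set()
    n = len(seq) - k + 1  # number of k-mer start positions
    for i in range(n):
        lo = i + w_min
        hi = min(i + w_max, n - 1)
        if lo > hi:  # no room left for a second strobe
            break
        s1 = seq[i:i + k]
        j = min(range(lo, hi + 1), key=lambda p: _hash(s1 + seq[p:p + k]))
        out.add(s1 + seq[j:j + k])
    return out


def marker_hits(read: str, markers: set, k: int, w_min: int, w_max: int) -> int:
    """Count distinct strobemers of `read` that occur in `markers`.

    `markers` is a hypothetical set of marker strobemers, for example
    ones derived from parental data in a trio design.
    """
    return len(randstrobes(read, k, w_min, w_max) & markers)
```

Because the two strobes are separated by a variable gap, a seed of this kind can still match when an insertion or deletion falls between the strobes, which is the property that makes strobemers attractive for error-prone long reads. Per-read hit counts such as those returned by `marker_hits` are one kind of feature that an unsupervised clustering step could operate on.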