chrombert_imputation_cistrome
Generate prediction results (a full BigWig file or a table) from ChromBERT, given a cell type name, regions, and a regulator.
Note
Either --o-bw or --o-table must be provided, depending on the format in which you want the results.
chrombert_imputation_cistrome [OPTIONS] SUPERVISED_FILE --o-bw BW_PATH --o-table TABLE_PATH --finetune-ckpt CKPT --prompt-kind KIND
Options
- supervised_file
Path to the supervised file.
- --o-bw
Path of the output BigWig file.
- --o-table
Path to the output table, if you want the results as a table.
- --prompt-kind
Prompt data class. Choose from cistrome or expression. This option is required.
- --basedir
Base directory for the required files. Default is set to the value of DEFAULT_BASEDIR.
- -g, --genome
Genome version, for example hg38 or mm10. Only hg38 is currently supported. Default is hg38.
- --pretrain-ckpt
Path to the pretrain checkpoint. Optional if it can be inferred from other arguments.
- -d, --hdf5-file
Path to the HDF5 file that contains the dataset. Optional if it can be inferred from other arguments.
- -hr, --high-resolution
Use 200-bp resolution instead of 1-kb resolution. Caution: 200-bp resolution is intended for a future release of ChromBERT and is not available yet.
- --finetune-ckpt
Path to the finetune checkpoint. Optional.
- --prompt-dim-external
Dimension of external data. Use 512 for scGPT and 768 for ChromBERT’s embedding. Default is 512.
- --prompt-celltype-cache-file
Path to the cell-type-specific prompt cache file. Optional.
- --prompt-regulator-cache-file
Path to the regulator prompt cache file. Optional.
- --prompt-celltype
The cell-type-specific prompt. For example, dnase:k562 for a cistrome prompt and k562 for an expression prompt. It can also be provided in the supervised file if the format supports it. Optional.
- --prompt-regulator
The regulator prompt, which determines the kind of output. For example, ctcf or h3k27ac. It can also be provided in the supervised file if the format supports it. Optional.
- --batch-size
Batch size. Default is 8.
- --num-workers
Number of workers for the dataloader. Default is 8.
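The options above can be combined into a single invocation. The following is a minimal Python sketch that assembles such a command line, not part of ChromBERT itself. The helper `build_command` and all file names (`supervised.tsv`, `ctcf_k562.bw`, `finetune.ckpt`) are hypothetical placeholders; only the flag names and the example prompts (`dnase:k562`, `ctcf`) come from this page. The sketch assumes the note above means that at least one of --o-bw and --o-table is needed, and it does not cover --high-resolution.

```python
import shlex


def build_command(supervised_file, prompt_kind, o_bw=None, o_table=None, **options):
    """Assemble an argument list for chrombert_imputation_cistrome.

    Hypothetical helper for illustration; it only arranges the documented
    flags and does not call ChromBERT.
    """
    # Assumption: at least one output path is needed (see the note above).
    if o_bw is None and o_table is None:
        raise ValueError("provide at least one of o_bw or o_table")
    # --prompt-kind is documented as required, with two allowed values.
    if prompt_kind not in ("cistrome", "expression"):
        raise ValueError("prompt_kind must be 'cistrome' or 'expression'")

    args = ["chrombert_imputation_cistrome", supervised_file]
    if o_bw is not None:
        args += ["--o-bw", o_bw]
    if o_table is not None:
        args += ["--o-table", o_table]
    args += ["--prompt-kind", prompt_kind]

    # Remaining keyword names map to long flags,
    # for example finetune_ckpt becomes --finetune-ckpt.
    for name, value in options.items():
        args += ["--" + name.replace("_", "-"), str(value)]
    return args


cmd = build_command(
    "supervised.tsv",                # hypothetical supervised file
    prompt_kind="cistrome",
    o_bw="ctcf_k562.bw",             # hypothetical output path
    finetune_ckpt="finetune.ckpt",   # hypothetical checkpoint path
    prompt_celltype="dnase:k562",
    prompt_regulator="ctcf",
)

# Print the command as it could be typed in a shell.
print(shlex.join(cmd))
```

If ChromBERT is installed and the paths exist, the same list could be passed to `subprocess.run(cmd, check=True)` instead of being printed.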