chrombert_get_cistrome_emb
Extract cistrome embeddings from ChromBERT.
chrombert_get_cistrome_emb [OPTIONS] SUPERVISED_FILE IDS... -o ONAME
Options
- SUPERVISED_FILE
Path to the supervised file.
- IDS
IDs to extract, given either in GSM ID format or in regulator:cellline format. To generate a cache file for prompts, use the regulator:cellline format.
- -o, --oname
Path to the output HDF5 file. This option is required.
- --basedir
Base directory for the required files. Defaults to the value of DEFAULT_BASEDIR.
- -g, --genome
Genome version, such as hg38 or mm10. Only hg38 is currently supported. Default is hg38.
- -k, --ckpt
Path to the pretrained or fine-tuned checkpoint. Optional if it can be inferred from other arguments.
- --meta
Path to the meta file. Optional if it can be inferred from other arguments.
- --mask
Path to the matrix mask file. Optional if it can be inferred from other arguments.
- -d, --hdf5-file
Path to the HDF5 file that contains the dataset. Optional if it can be inferred from other arguments.
- -hr, --high-resolution
Use 200-bp resolution instead of 1-kb resolution. Caution: 200-bp resolution is intended for a future release of ChromBERT and is not available yet.
- --batch-size
Batch size. Default is 8.
- --num-workers
Number of workers for the dataloader. Default is 8.
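Example
The synopsis and options above can be sketched as a small Python helper that assembles the argument list for this command and checks which of the two ID formats each ID appears to use. This is an illustrative sketch only and not part of ChromBERT: `classify_id` and `build_command` are hypothetical helpers, the file names and IDs (`regions.csv`, `GSM1234567`, `ctcf:k562`, `cistrome_emb.hdf5`) are placeholders, and the GSM pattern ("GSM" followed by digits, matched case-insensitively) is an assumption about how such IDs look.

```python
import re
import shlex

# Assumption: a GSM ID is "GSM" followed by digits (case-insensitive match).
GSM_PATTERN = re.compile(r"^GSM\d+$", re.IGNORECASE)


def classify_id(identifier):
    """Guess the format of one ID: 'regulator:cellline', 'gsm', or 'unknown'."""
    parts = identifier.split(":")
    if len(parts) == 2 and all(parts):
        return "regulator:cellline"
    if GSM_PATTERN.match(identifier):
        return "gsm"
    return "unknown"


def build_command(supervised_file, ids, oname,
                  genome="hg38", batch_size=8, num_workers=8):
    """Build the argument list in the order shown in the synopsis."""
    if not ids:
        raise ValueError("at least one ID is required")
    return [
        "chrombert_get_cistrome_emb",
        "-g", genome,
        "--batch-size", str(batch_size),
        "--num-workers", str(num_workers),
        supervised_file,
        *ids,
        "-o", oname,  # the output HDF5 path is required
    ]


# Placeholder inputs; replace them with real paths and IDs.
ids = ["GSM1234567", "ctcf:k562"]
cmd = build_command("regions.csv", ids, "cistrome_emb.hdf5")

for identifier in ids:
    print(identifier, "->", classify_id(identifier))

# Print a shell-safe command line; run it yourself, or pass `cmd` to
# subprocess.run(cmd, check=True) once ChromBERT is installed.
print(shlex.join(cmd))
```

The helper only builds and prints the command. Passing the list form to `subprocess.run` avoids shell quoting issues if a path contains spaces.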