chrombert_make_dataset

Generate general datasets for ChromBERT from bed files.

chrombert_make_datasets [OPTIONS] BED

Options

BED

Path to the bed file.

-o, --oname

Path to the output file. Stdout if not specified. Must end with .tsv or .txt.

--mode

Mode to generate the dataset. Choices are:

  • region: only consider overlap between input regions to determine the label generated. Useful for narrowPeak-like input.

  • all: report all overlapping status like bedtools intersect -wao. You should determine the label column by yourself.

Default is region.

--center

If used, only consider the center of the input regions.

--label

If mode is not region, this column will be used as the label. Default is the 4th column (1-based).

--no-filter

Do not filter the regions that are not overlapped.

--basedir

Base directory for the required files. Default is set to the value of DEFAULT_BASEDIR.

-g, --genome

Genome version. For example, hg38 or mm10. Only hg38 is supported now. Default is hg38.

-hr, --high-resolution

Use 200-bp resolution instead of 1-kb resolution. Caution: 200-bp resolution is preparing for the future release of ChromBERT, which is not available yet.