DNA Diffusion

Generative modeling of regulatory DNA sequences with diffusion probabilistic models.


Documentation: https://pinellolab.github.io/DNA-Diffusion

Source Code: https://github.com/pinellolab/DNA-Diffusion


Introduction

DNA-Diffusion is a diffusion-based model for generating 200 bp cell type-specific synthetic regulatory elements.
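To make the data shape concrete, here is a minimal sketch of how a 200 bp sequence could be represented as a numeric matrix. It assumes a one-hot encoding with channel order A, C, G, T; the encoding, the channel order, and the helper names `one_hot_encode` and `one_hot_decode` are illustrative assumptions and are not taken from this repository.

```python
import random

# Assumption: four channels in the order A, C, G, T.
ALPHABET = "ACGT"


def one_hot_encode(seq):
    """Return a 4 x len(seq) matrix (list of rows), one row per nucleotide."""
    seq = seq.upper()
    unknown = set(seq) - set(ALPHABET)
    if unknown:
        raise ValueError(f"unsupported characters: {sorted(unknown)}")
    return [[1 if base == nuc else 0 for base in seq] for nuc in ALPHABET]


def one_hot_decode(matrix):
    """Map each column back to the nucleotide with the largest value."""
    length = len(matrix[0])
    return "".join(
        ALPHABET[max(range(len(ALPHABET)), key=lambda row: matrix[row][i])]
        for i in range(length)
    )


# Hypothetical example: a random 200 bp sequence, encoded and decoded again.
random.seed(0)
example = "".join(random.choice(ALPHABET) for _ in range(200))
encoded = one_hot_encode(example)
decoded = one_hot_decode(encoded)
```

Decoding by taking the largest value per column also works on real-valued matrices, which is one plausible way to turn a denoised sample back into a nucleotide string.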

Installation

Our preferred package / project manager is uv. To install the necessary packages, run:

uv sync

This will create a virtual environment in .venv and install all dependencies listed in the pyproject.toml file.

Usage

Sequence Generation

We provide a basic config file for generating sequences with the diffusion model; it produces 1000 sequences per cell type. Base generation uses a guidance scale of 1.0, which can be tuned in sample.py via the cond_weight_to_metric parameter. To generate sequences, run:

uv run sample.py
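For intuition about what the guidance scale controls, here is a minimal sketch of one common classifier-free guidance formulation, in which the conditional and unconditional noise predictions are blended linearly. Whether sample.py combines the predictions in exactly this way is an assumption; the function name `guided_noise` and the toy values are hypothetical.

```python
def guided_noise(eps_uncond, eps_cond, guidance_scale=1.0):
    """Blend unconditional and conditional noise predictions.

    Assumed formulation: eps = eps_uncond + scale * (eps_cond - eps_uncond).
    A scale of 0.0 ignores the cell-type label, 1.0 returns the conditional
    prediction unchanged, and larger values push further toward the label.
    """
    if len(eps_uncond) != len(eps_cond):
        raise ValueError("predictions must have the same length")
    return [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Toy predictions for three positions (hypothetical values).
eps_uncond = [0.0, 1.0, -1.0]
eps_cond = [1.0, 1.0, 0.0]

unguided = guided_noise(eps_uncond, eps_cond, guidance_scale=0.0)
baseline = guided_noise(eps_uncond, eps_cond, guidance_scale=1.0)
stronger = guided_noise(eps_uncond, eps_cond, guidance_scale=2.0)
```

Under this formulation, raising the scale above 1.0 typically trades sample diversity for stronger agreement with the conditioning label.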

Training

We also provide a basic config file for training the diffusion model. To train the model, run:

uv run train.py

Contributors ✨

Thanks goes to these wonderful people (emoji key):

Lucas Ferreira da Silva 🤔 💻
Luca Pinello 🤔
Simon 🤔 💻

This project follows the all-contributors specification. Contributions of any kind welcome!