Add DAG-GFlowNet (Bayesian Structure learning, Deleu et al., 2022) #296

hyeok9855 · 2025-03-26T14:30:01Z

PR Summary

This PR includes the BayesianStructure env to reproduce DAG-GFlowNet.

You can find a script for training with an MLP in tutorials/examples/train_bayesian_structure.py.

Issues

Training is very slow.

TODO in some following PRs

Make it run faster
Use linear transformer policy
Use ModifiedDB
Reproduce the results obtained with modified DB for larger graphs (#nodes=20 and 50)


hyeok9855 · 2025-03-28T14:09:19Z

The below error is resolved in #299

@josephdviviano
An error occurs when using the replay buffer:

python tutorials/examples/train_bayesian_structure.py

  3%|███████▊ | 30/1000 [05:01<2:42:32, 10.05s/it]
Traceback (most recent call last):
  File "/home/sanghyeok/GFN/torchgfn-dag/torchgfn/tutorials/examples/train_bayesian_structure.py", line 519, in <module>
    main(args)
  File "/home/sanghyeok/GFN/torchgfn-dag/torchgfn/tutorials/examples/train_bayesian_structure.py", line 397, in main
    loss = gflownet.loss(env, training_samples, recalculate_all_logprobs=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sanghyeok/GFN/torchgfn-dag/torchgfn/src/gfn/gflownet/trajectory_balance.py", line 70, in loss
    _, _, scores = self.get_trajectories_scores(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sanghyeok/GFN/torchgfn-dag/torchgfn/src/gfn/gflownet/base.py", line 204, in get_trajectories_scores
    log_pf_trajectories, log_pb_trajectories = self.get_pfs_and_pbs(
                                               ^^^^^^^^^^^^^^^^^^^^^
  File "/home/sanghyeok/GFN/torchgfn-dag/torchgfn/src/gfn/gflownet/base.py", line 185, in get_pfs_and_pbs
    return get_trajectory_pfs_and_pbs(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sanghyeok/GFN/torchgfn-dag/torchgfn/src/gfn/utils/prob_calculations.py", line 45, in get_trajectory_pfs_and_pbs
    log_pf_trajectories = get_trajectory_pfs(
                          ^^^^^^^^^^^^^^^^^^^
  File "/home/sanghyeok/GFN/torchgfn-dag/torchgfn/src/gfn/utils/prob_calculations.py", line 72, in get_trajectory_pfs
    raise AssertionError("Something wrong happening with log_pf evaluations")

The assertion is raised in get_trajectory_pfs in src/gfn/utils/prob_calculations.py.

You can reproduce the error by running the following on this branch:

python tutorials/examples/train_bayesian_structure.py

josephdviviano

Some intermediate comments.

src/gfn/gym/bayesian_structure.py

josephdviviano · 2025-04-11T19:01:26Z

src/gfn/gym/helpers/bayesian_structure/priors.py

+        return self._log_prior
+
+
+class FairPrior(BasePrior):


can we get docstrings explaining the math behind all priors please?

The priors are taken directly from the original repo without modifications. (And there's no docstring there, either.) My understanding of this task is just as bad as yours.

Perhaps we should ask the author for assistance with this?

josephdviviano · 2025-04-11T19:04:42Z

src/gfn/gym/helpers/bayesian_structure/sampling.py

+    return order
+
+
+def sample_from_linear_gaussian(


what is this used for? the return datastruct is strange and I would expect the topological sort to be potentially slow?

what is this used for?

This function generates the data from the true DAG, and the data is required for the reward function (scorer).

the return datastruct is strange

Could you elaborate more?

I would expect the topological sort to be potentially slow?

This code is almost identical to the one in the original repo. And this is called only once at the initialization of the scorer object, so I think it will be okay.
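For readers following this thread, the data-generation step described above can be sketched as ancestral sampling: visit the nodes in topological order and draw each one as a linear function of its parents plus Gaussian noise. This is an illustration only, not the code in sampling.py. The function names, the weight-matrix input, the shared noise_std, and the plain NumPy array return type are all assumptions made for the example.

```python
import numpy as np


def topological_order(adjacency: np.ndarray) -> list:
    """Kahn's algorithm; adjacency[i, j] != 0 means there is an edge i -> j."""
    n = adjacency.shape[0]
    in_degree = (adjacency != 0).sum(axis=0)
    ready = [i for i in range(n) if in_degree[i] == 0]
    order = []
    while ready:
        node = ready.pop()
        order.append(node)
        for child in np.flatnonzero(adjacency[node]):
            in_degree[child] -= 1
            if in_degree[child] == 0:
                ready.append(int(child))
    if len(order) != n:
        raise ValueError("graph contains a cycle")
    return order


def sample_linear_gaussian(weights, num_samples, noise_std=0.1, rng=None):
    """Hypothetical sampler: weights[i, j] is the coefficient of parent i in node j."""
    rng = np.random.default_rng() if rng is None else rng
    data = np.zeros((num_samples, weights.shape[0]))
    for node in topological_order(weights):
        # Parents are already sampled; non-parents have weight 0 and contribute nothing.
        mean = data @ weights[:, node]
        data[:, node] = mean + rng.normal(0.0, noise_std, size=num_samples)
    return data


# Chain 0 -> 1 -> 2 with coefficients 2.0 and -1.0.
weights = np.array([[0.0, 2.0, 0.0],
                    [0.0, 0.0, -1.0],
                    [0.0, 0.0, 0.0]])
data = sample_linear_gaussian(weights, num_samples=2000, rng=np.random.default_rng(0))
```

Because the sort runs once per dataset, its cost should be small next to training, which is consistent with the point made above.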

josephdviviano · 2025-04-11T19:05:22Z

src/gfn/gym/helpers/bayesian_structure/scores.py

+    return ld
+
+
+class BaseScore(ABC):


Is this base class necessary?

Yes, if we want to support other scores later (e.g., BDe score). Note that this structure (BaseScore - BGeScore) was also brought over from the original repo.
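To make the design argument concrete, here is a minimal sketch of why a shared base class helps when more than one score is planned. It is not the interface in scores.py: the class names, the local_score and graph_score methods, and the toy subclass are all hypothetical. The only outside fact it relies on is that scores such as BGe and BDe are decomposable, meaning the graph score is a sum of per-node terms.

```python
from abc import ABC, abstractmethod


class ScoreSketch(ABC):
    """Hypothetical base class: subclasses score one node given its parents."""

    @abstractmethod
    def local_score(self, node, parents):
        """Return the score contribution of `node` with parent tuple `parents`."""

    def graph_score(self, parent_sets):
        # A decomposable score is the sum of the per-node local scores.
        return sum(self.local_score(n, ps) for n, ps in parent_sets.items())


class EdgeCountPenalty(ScoreSketch):
    """Toy stand-in for a real score; it only penalizes the number of parents."""

    def __init__(self, penalty=1.0):
        self.penalty = penalty

    def local_score(self, node, parents):
        return -self.penalty * len(parents)


# Graph with edges 0 -> 1, 0 -> 2, 1 -> 2, given as node -> parents.
score = EdgeCountPenalty(penalty=0.5).graph_score({0: (), 1: (0,), 2: (0, 1)})
```

With this shape, adding another score only requires a new subclass with its own local_score, and the environment can keep calling the shared method.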

tutorials/examples/train_bayesian_structure.py

josephdviviano · 2025-04-11T19:10:24Z

src/gfn/gym/helpers/bayesian_structure/priors.py

+        self._log_prior = all_parents * math.log(p) + (
+            self.num_variables - all_parents - 1
+        ) * math.log1p(-p)
+        return self._log_prior


This is the second definition from here, right?

https://en.wikipedia.org/wiki/Erd%C5%91s%E2%80%93R%C3%A9nyi_model#Definition

Well, I'm not sure either. This is taken from the original repo without modification.
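As a reading aid for the snippet above: the expression matches the per-edge form of the Erdős–Rényi model, G(n, p), which is the second definition on the linked page, under the assumption that entry k of the result is the log-probability of one node having a specific set of k parents among its n - 1 candidates. That interpretation is inferred from the formula and is not confirmed by the original authors. The standalone sketch below (the function name is hypothetical) reproduces the expression and checks that the implied distribution over parent sets sums to one.

```python
import math

import numpy as np


def edge_bernoulli_log_prior(num_variables, p):
    """Entry k: log-probability of one specific parent set of size k.

    Assumes each of the (num_variables - 1) candidate edges into a node
    is present independently with probability p.
    """
    all_parents = np.arange(num_variables)  # k = 0, ..., n - 1
    return all_parents * math.log(p) + (
        num_variables - all_parents - 1
    ) * math.log1p(-p)


n = 5
log_prior = edge_bernoulli_log_prior(num_variables=n, p=0.3)

# There are C(n - 1, k) parent sets of size k, so the total mass is a binomial sum.
total_mass = sum(math.comb(n - 1, k) * math.exp(log_prior[k]) for k in range(n))
```

Here total_mass equals 1 up to floating-point error, which is what the binomial theorem predicts if the reading above is right.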

Copilot

Pull Request Overview

This PR adds support for DAG-GFlowNet to reproduce Bayesian Structure learning as described in Deleu et al. (2022) by introducing a new BayesianStructure environment along with several helper modules for scoring, sampling, priors, and graph generation. Additional updates improve device propagation, platform‐specific multiprocessing handling, and overall consistency of the codebase.

Updated probability distribution computation in modules to include parameter assertions and refined exploration handling.
Added new helper modules under gfn/gym/helpers/bayesian_structure for scoring (BGeScore), sampling, priors, graph generation, and data factories.
Refactored various environment, actions, and container functions to consistently propagate the device and enhance multiprocessing compatibility.

Reviewed Changes

Copilot reviewed 34 out of 34 changed files in this pull request and generated no comments.

Show a summary per file

File	Description
src/gfn/modules.py	Refined the probability distribution function with parameter assertions and logic.
src/gfn/gym/hypergrid.py	Updated the multiprocessing start method to check the platform and force device usage.
src/gfn/gym/helpers/bayesian_structure/*.py	Introduced new scorer, sampling, priors, graph generation, evaluation, and factories.
src/gfn/gym/graph_building.py	Improved node/edge creation with explicit device handling and refactored action logic.
src/gfn/(containers	env
pyproject.toml	Added pgmpy dependency with a version constraint.

Comments suppressed due to low confidence (2)

src/gfn/gym/helpers/bayesian_structure/scores.py:189

[nitpick] Consider renaming 'tmp_var' to a more descriptive name (e.g., 'adjusted_sample_size') to improve readability.

tmp_var = self.num_samples + self.alpha_w - self.num_nodes + num_parents

src/gfn/gym/graph_building.py:348

[nitpick] Although noted by the TODO comment, replacing the hard-coded upper limit '10' with a configurable parameter would enhance flexibility and clarity of the code.

n_nodes = np.random.randint(1, 10)  # TODO: make the max n_nodes a parameter
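One possible way to act on this nitpick is sketched below. The helper name and the max_n_nodes parameter are hypothetical and not part of graph_building.py. The sketch keeps the exclusive upper bound of the original call, so the default still yields values from 1 to 9.

```python
from typing import Optional

import numpy as np


def sample_n_nodes(max_n_nodes: int = 10, rng: Optional[np.random.Generator] = None) -> int:
    """Draw a node count in [1, max_n_nodes), mirroring np.random.randint(1, 10)."""
    rng = np.random.default_rng() if rng is None else rng
    # Generator.integers excludes the upper bound by default.
    return int(rng.integers(1, max_n_nodes))


draws = [sample_n_nodes(rng=np.random.default_rng(seed)) for seed in range(200)]
```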

younik

I didn't check the helpers yet

src/gfn/gym/bayesian_structure.py

saleml

This is a very strong PR. Very well written, and follows the spirit of the library to perfection! I agree that we need to investigate why it is slow.
Before merging, could you please modify README.md?
In the section "Other environments available in the package include:", it would be nice to talk about this environment (and other graph environments while you're at it :)), and describe it as coming from the Deleu et al. paper. In fact this is our least toy example (so kudos for that Sanghyeok and Abhijith!), and I strongly believe it should be highlighted/advertised more. You should even provide some example commands (both at the top of the file and in the README) that could be run.

What do you think?

saleml · 2025-04-19T19:44:55Z

pyproject.toml

@@ -49,6 +49,7 @@ tox = { version = "*", optional = true }

 # scripts dependencies.
 matplotlib = { version = "*", optional = true }
+pgmpy = { version = "<1.0.0", optional = true }


Any reason we want this to be <1.0.0?

There are some differences between versions 0.x and 1.0.0:

Difference in attribute names (see the error below)

LinearGaussianCPD in version 0.x accepts "variance" as input, but in 1.0.0, it accepts standard deviation.

Since the original DAG-GFN repo uses version 0.x, I believe it's reasonable to follow their settings. Additionally, pgmpy was upgraded to 1.0.0 recently (Apr 1st), and I feel reluctant to use it.

Traceback (most recent call last):
  File "/home/sanghyeok/GFN/torchgfn-dag/torchgfn/tutorials/examples/train_bayesian_structure.py", line 391, in <module>
    main(args)
  File "/home/sanghyeok/GFN/torchgfn-dag/torchgfn/tutorials/examples/train_bayesian_structure.py", line 188, in main
    scorer, _, gt_graph = get_scorer(
                          ^^^^^^^^^^^
  File "/home/sanghyeok/GFN/torchgfn-dag/torchgfn/src/gfn/gym/helpers/bayesian_structure/factories.py", line 75, in get_scorer
    graph, data, score = get_data(
                         ^^^^^^^^^
  File "/home/sanghyeok/GFN/torchgfn-dag/torchgfn/src/gfn/gym/helpers/bayesian_structure/factories.py", line 48, in get_data
    data = sample_from_linear_gaussian(graph, num_samples=num_samples, rng=rng)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/sanghyeok/GFN/torchgfn-dag/torchgfn/src/gfn/gym/helpers/bayesian_structure/sampling.py", line 46, in sample_from_linear_gaussian
    cpd.mean[0], cpd.variance, size=(num_samples,)
    ^^^^^^^^
AttributeError: 'LinearGaussianCPD' object has no attribute 'mean'
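The variance versus standard deviation point above is easy to get wrong in any sampling code, independent of pgmpy. The sketch below uses only NumPy and does not reproduce the pgmpy API. The helper name is hypothetical. It shows the conversion needed when a value is stored as a variance but the sampler expects a standard deviation, as NumPy's Generator.normal does through its scale argument.

```python
import math

import numpy as np


def sample_gaussian(mean, variance, num_samples, rng):
    """Draw samples given a variance; NumPy's `scale` is a standard deviation."""
    return rng.normal(mean, math.sqrt(variance), size=(num_samples,))


rng = np.random.default_rng(0)
samples = sample_gaussian(mean=1.0, variance=4.0, num_samples=20000, rng=rng)
empirical_variance = samples.var()
```

With 20000 draws, empirical_variance should land close to 4.0. Passing the variance directly as the scale would instead give a variance near 16.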

tutorials/examples/train_bayesian_structure.py

josephdviviano and others added 14 commits March 24, 2025 16:59

changes to extend

e4d2ba4

minor bugs

0941177

WIP_Creating DAG Env

c0e6141

reward method added for DAG Env

099bb64

renaming: causal_dag -> bayesian_structure

b18a74a

refactor __init__ of BayesianStructure env and add some refactorings to follow torchgfn conventions

b4b93d8

minor refactorings in BayesianStructure env

1ff00ab

use Actions first and then convert it to GraphActions following RingGraphBuilding

3cf7a09

added BGe Score for reward

1a7379a

relocate helper modules

0874e07

added helper functions for score calculation and data generation

eeee3c7

some refactorings

f43991e

fix BGe score

31a900c

add training script with TB

6ba6bd2

hyeok9855 requested review from josephdviviano and younik March 26, 2025 14:30

hyeok9855 marked this pull request as draft March 26, 2025 14:30

change to GNN policy

54817fe

hyeok9855 added 2 commits April 2, 2025 23:37

increase learning rate for logZ and fix epsilon scheduling

1b3c0cb

add evaluation metrics

232f1a2

hyeok9855 mentioned this pull request Apr 2, 2025

Fix extend of GraphStates with 2d batch_shape #299

Merged

hyeok9855 closed this in #299 Apr 3, 2025

hyeok9855 reopened this Apr 3, 2025

hyeok9855 self-assigned this Apr 3, 2025

hyeok9855 added 5 commits April 3, 2025 20:07

minor refactorings

8165b1d

edit pyproject.toml

e04f779

Merge branch 'master' into dag_gfn

0518245

add assertions in estimator

c6162f2

Merge branch 'master' into dag_gfn

e2d16a5

josephdviviano requested changes Apr 11, 2025

saleml mentioned this pull request Apr 15, 2025

DAG environment #26

Closed

hyeok9855 added 10 commits April 16, 2025 01:07

Merge branch 'master' into dag_gfn

49a37f8

Merge branch 'hyeok9855/refactor-graph' into dag_gfn

92c5bc3

apply changes to DAG-GFN envs and training scripts

f06bf54

vectorize masks

c12fc61

Merge branch 'hyeok9855/refactor-graph' into dag_gfn

03bcb7c

use batchified to_dense_adj

d95e988

rollback score calculation

4e0a8cd

fix device

9aff0ed

change default experimental settings

5eb5ad9

Merge branch 'hyeok9855/refactor-graph' into dag_gfn

292b998

hyeok9855 requested a review from Copilot April 17, 2025 09:44

Copilot AI reviewed Apr 17, 2025

Merge branch 'master' into dag_gfn

2720a06

younik reviewed Apr 18, 2025

Reflect Omar's review

55b12a9

saleml reviewed Apr 19, 2025

hyeok9855 added 2 commits May 5, 2025 16:58

use numpy random number generator

fdd2efe

minor edit in the expected results

7e92b6f

hyeok9855 mentioned this pull request May 8, 2025

Add batch_ptrs to GraphState #308

Merged

Merge branch 'master' into dag_gfn

5ceadfc

hyeok9855 mentioned this pull request May 14, 2025

fix batch graph state #314

Merged

hyeok9855 added 7 commits May 20, 2025 23:40

Merge branch 'master' into dag_gfn

fff7dc1

Merge branch 'master' into dag_gfn

8c710df

Merge branch 'master' into dag_gfn

84d0dbb

Merge branch 'master' into dag_gfn

9ca5f77

refactor following the changes in master branch

ff4b69f

add Jensen-Shannon divergence metric

ab6be36

minor refactoring

e3b0bab
