feat: more flexible helper for batch reduction #155

jokasimr · 2025-05-05T13:14:04Z

This PR implements a shortcut through the workflow that is useful in several common scenarios involving multiple samples, angles and potentially multiple runs per angle.

Copilot

Pull Request Overview

This PR adds a more flexible helper for batch reduction by refactoring the dataset computation workflow and its associated tests. Key changes include:

Replacing the function orso_datasets_from_measurements with a more versatile from_measurements that supports both list and mapping inputs.
Adjusting the test cases to validate the new helper behavior and result types.
Moving the RawChopper class from the amor module to the reflectometry types module and updating corresponding imports.

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.


File	Description
tests/tools_test.py	Updated tests to use from_measurements and verify expected parameters.
src/ess/reflectometry/workflow.py	Adjusted imports to use the new location of RawChopper and streamline workflow.
src/ess/reflectometry/types.py	Added RawChopper class to reflectometry/types to replace its previous location.
src/ess/reflectometry/tools.py	Renamed the dataset helper and refactored scaling logic with additional helper functions.
src/ess/amor/types.py	Removed duplicate RawChopper class.
src/ess/amor/load.py	Removed the import and reference to RawChopper.

Copilot · 2025-05-06T07:29:03Z

src/ess/reflectometry/tools.py

+    return datasets if names is None else dict(zip(names, datasets, strict=True))
+
+
+def _workflow_needs_quantity_A_even_if_quantitiy_B_is_set(workflow, A, B):


The function name '_workflow_needs_quantity_A_even_if_quantitiy_B_is_set' has a typo in 'quantitiy'. Consider correcting it to '_workflow_needs_quantity_A_even_if_quantity_B_is_set' for clarity.

Suggested change

-def _workflow_needs_quantity_A_even_if_quantitiy_B_is_set(workflow, A, B):
+def _workflow_needs_quantity_A_even_if_quantity_B_is_set(workflow, A, B):

nvaytet · 2025-05-06T15:03:27Z

src/ess/reflectometry/tools.py

+                wf[Filename[SampleRun]] = parameters[Filename[SampleRun]]
+        return wf
+
+    if scale_to_overlap:


As discussed during the standup, I don't really get why we need the helper function, as opposed to just having different workflows.

You would have a workflow that computes a reflectivity curve.

If you want to compute this for multiple runs, you just map over the run numbers or filenames, and then use compute_mapped.

If you want to scale_to_overlap, you have a workflow that basically does the mapping and adds a provider that either just scales the results or scales and merges. You then compute a different result (e.g. CombinedScaledReflectivityOverQ).

You can then view the graph of everything that is being done. With the helper function you can't see it all, only pieces, and those pieces are probably not easy to get to either if you just installed the package.

In any case, we should avoid having a long discussion here on GitHub; this was just meant as a starting point for an in-person discussion.

It's possible we can implement the same thing using sciline map reduce. Let us treat the helper in this PR as a reference implementation and try to re-implement it as one big workflow.

nvaytet

After an in-person discussion, it seems difficult to avoid using a helper function to loop over files and to use a single large pipeline for all cases instead.

We already use mapping to combine events from multiple files. A second mapping over different angles that uses the same mapping parameter, Filename[SampleRun], would not work.

Making Filename a scope with two parameters, Filename[SampleRun, Angle], could maybe work, but that would require changing it in essreduce, in the types common to all other workflows.

In addition, scale_to_overlap sometimes needs to scale the event weights before the final 1D R(Q) result is computed, so it is not as simple as inserting a provider at the end of the workflow to apply the scaling.

With all this taken into account, the helper function seems like a good approach.

nvaytet · 2025-05-20T13:00:24Z

src/ess/reflectometry/tools.py

@@ -279,18 +284,18 @@ def combine_curves(
    )


-def orso_datasets_from_measurements(
+def from_measurements(


Can we think of a better name? I don't have great suggestions; I thought of something like batch_reduction, but it's not super descriptive either...

nvaytet · 2025-05-20T13:05:20Z

src/ess/reflectometry/tools.py

+            # Check if any of the targets need ReducibleData if
+            # ReflectivityOverQ already exists.
+            # If they don't, we can avoid recomputing ReducibleData.
+            targets = target if hasattr(target, '__len__') else (target,)


Is there ever a danger that someone passes a string as target, so that the __len__ check treats it as a collection of targets? I guess if they passed a string, they would run into other problems earlier (target not found in graph)?
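
The concern can be reproduced in isolation. The sketch below is only an illustration: `as_targets_len_check` wraps the expression quoted in the diff, while `as_targets_strict` and the placeholder class `ReflectivityOverQ` are hypothetical names introduced here, not code from the PR. Whether a string can ever reach this line in practice is not established in this thread.

```python
def as_targets_len_check(target):
    """The normalization quoted in the diff, wrapped in a function."""
    return target if hasattr(target, '__len__') else (target,)


def as_targets_strict(target):
    """Hypothetical alternative: only tuples and lists count as collections."""
    return tuple(target) if isinstance(target, (tuple, list)) else (target,)


class ReflectivityOverQ:
    """Placeholder class standing in for a workflow key."""


# A plain class has no __len__, so it is wrapped in a one-element tuple.
single = as_targets_len_check(ReflectivityOverQ)

# A string does have __len__, so it is passed through unchanged, and
# iterating over it yields its individual characters.
from_string = as_targets_len_check("abc")
characters = list(from_string)

# The stricter variant wraps the string as a single target instead.
wrapped_string = as_targets_strict("abc")
```

With the stricter variant a string stays one (probably invalid) target, so any later "target not found" error would name the whole string rather than a single character.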

jokasimr added 2 commits May 5, 2025 15:10

feat: more flexible helper for batch reduction

edcf4c2

fix: remove duplicate

7dbbf2c

MridulS requested a review from Copilot May 6, 2025 07:28

Copilot AI reviewed May 6, 2025


jokasimr added 8 commits May 6, 2025 09:38

update docs

25add6b

spelling

eda422e

fix: add theta to reference, can be useful in some contexts

98a9389

fix: handle case when SampleRotation etc are set in workflow

743743a

fix: add parameters before setting filenames

45bcd0e

docs: fix

7fb806b

tests

f097188

Merge branch 'main' into helper-for-reducing-multiple

4f9cbd2

jokasimr added this to Development Board May 6, 2025

github-project-automation bot moved this to In progress in Development Board May 6, 2025

jokasimr moved this from In progress to Selected in Development Board May 6, 2025

nvaytet reviewed May 6, 2025


Merge branch 'main' into helper-for-reducing-multiple

7c1750d

nvaytet reviewed May 20, 2025
