
configurable rollouts actor #23


Merged

merged 22 commits into main on May 29, 2025

Conversation

@ollmer (Collaborator) commented May 14, 2025

No description provided.

@ollmer changed the base branch from oleh_exps to main on May 15, 2025, 09:15
"output_tokens": sum([llm_call.output_length_tokens for llm_call in llm_calls]),
"overflow": 0, # TODO: should we treat max_loops stop as overflow?
}
return samples, metrics

Could we also return the resulting tape? That could be useful if one has a different implementation of the "has_error" metric. For instance, I am using this definition of a tape error, which catches more cases:

def tape_contains_an_error(tape: WebTape) -> bool:
    """
    Returns true if the tape ends with an error, i.e. if one of the following is true:
    - the last step is an LLMOutputParsingFailureAction
    - the tape metadata has an error
    - the last step is a PageObservation with an error
    """
    last_step = tape.steps[-1]
    return (
        isinstance(last_step, LLMOutputParsingFailureAction)
        or tape.metadata.result.get("error") is not None
        or (isinstance(last_step, PageObservation) and last_step.error)
    )

It could also be useful to compute additional stats on the tape without modifying this function for each statistic we would like to add.
For instance, I am also computing n_llm_calls, n_error_steps, n_page_observations, and n_steps for the tape.
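
As an illustration, with the tape returned alongside the metrics, such counters could live in a small helper outside the rollout function. This is a rough sketch only: it reuses the step types from the snippet above, and the assumption that LLM calls are stored under step.metadata.other["llm_call"] comes from the extraction loop shown later in this thread.

def extra_tape_stats(tape: WebTape) -> dict[str, int]:
    """Per-tape counters computed outside the rollout function (illustrative sketch)."""
    return {
        "n_steps": len(tape.steps),
        # steps that recorded an LLM call in their metadata (assumed storage convention)
        "n_llm_calls": sum(1 for s in tape.steps if s.metadata.other.get("llm_call")),
        # here an "error step" is approximated as a parsing failure; adjust to taste
        "n_error_steps": sum(1 for s in tape.steps if isinstance(s, LLMOutputParsingFailureAction)),
        "n_page_observations": sum(1 for s in tape.steps if isinstance(s, PageObservation)),
    }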

@ollmer (Collaborator, Author) replied:

Good point. Also, this function signature is a contract that should be universal enough and not bound to TapeAgents. So I would propose returning an object of a new class RolloutResult with the fields samples, metrics, and rollout_artifacts: dict. This way we can put the tape in result.rollout_artifacts["tape"].
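
A minimal sketch of what that container could look like; the class and field names follow the proposal above, while the concrete types and defaults are assumptions:

from dataclasses import dataclass, field
from typing import Any


@dataclass
class RolloutResult:
    """Rollout output container not tied to TapeAgents internals (illustrative sketch)."""

    samples: list[dict[str, Any]]  # training samples, as returned today
    metrics: dict[str, Any]  # e.g. success, output_tokens, overflow
    rollout_artifacts: dict[str, Any] = field(default_factory=dict)  # framework-specific extras

Framework-specific objects then travel alongside the generic fields:

result = RolloutResult(samples=samples, metrics=metrics, rollout_artifacts={"tape": tape})
tape = result.rollout_artifacts["tape"]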

assert tape is not None, "No tape generated"
has_errors = any([1 for s in tape.steps if s.llm_dict().get("error")])
has_answer = any([isinstance(s, StopStep) for s in tape.steps])
_, llm_calls = agent.reuse(tape)


is this better than just running this?

    llm_calls = []
    for step in tape.steps:
        if "llm_call" not in step.metadata.other or step.metadata.other["llm_call"] is None:
            continue
        llm_call = step.metadata.other["llm_call"]
        if isinstance(llm_call, dict):
            llm_call = LLMCall(**llm_call)
        llm_calls.append(llm_call)

@ollmer (Collaborator, Author) replied:

I just copied our approach to generating training data from TapeAgents' agent.make_training_data(): https://github.com/ServiceNow/TapeAgents/blob/main/tapeagents/agent.py#L862. We can discuss whether we really need it. The double-check that the tape was produced by the same agent is not necessary here, imo.

@rizar rizar merged commit 21d3072 into main May 29, 2025