Added new curriculum mdp that allows modification on any environment parameters #2777

Open: ooctipus wants to merge 4 commits into main from feat/modify_env_param_curriculum

@ooctipus (Contributor) commented Jun 26, 2025

Description

This PR adds two curriculum MDP terms that can change any parameter of an env instance: modify_term_cfg and modify_env_param.

modify_env_param is the more general version: it can override any value that belongs to env, but it requires the user to know the full path to the value.

modify_term_cfg only works with manager terms, but it is more user-friendly because it simplifies path specification. For example, instead of writing "observation_manager.cfg.policy.joint_pos.noise", you write "observations.policy.joint_pos.noise", consistent with the Hydra override style.

Besides the path to the value, modify_fn and modify_params are also needed to tell the term how to modify the value.

Demo 1: difficulty-adaptive modification for all Python native data types

# iv -> initial value, fv -> final value
def initial_final_interpolate_fn(env: ManagerBasedRLEnv, env_id, data, iv, fv, get_fraction):
    iv_, fv_ = torch.tensor(iv, device=env.device), torch.tensor(fv, device=env.device)
    # get_fraction is a Python expression string, evaluated here with `env` in scope
    fraction = eval(get_fraction)
    # linear interpolation between the initial and final value
    new_val = fraction * (fv_ - iv_) + iv_
    if isinstance(data, float):
        return new_val.item()
    elif isinstance(data, int):
        return int(new_val.item())
    elif isinstance(data, (tuple, list)):
        raw = new_val.tolist()
        # assume iv is sequence of all ints or all floats:
        is_int = isinstance(iv[0], int)
        casted = [int(x) if is_int else float(x) for x in raw]
        return tuple(casted) if isinstance(data, tuple) else casted
    else:
        raise TypeError(f"Does not support the type {type(data)}")

(float)

joint_pos_unoise_min_adr = CurrTerm(
    func=mdp.modify_term_cfg,
    params={
        "address": "observations.policy.joint_pos.noise.n_min",
        "modify_fn": initial_final_interpolate_fn,
        "modify_params": {"iv": 0., "fv": -.1, "get_fraction": "env.command_manager.get_command('difficulty')"}
    }
)

(tuple or list)

command_object_pose_xrange_adr = CurrTerm(
    func=mdp.modify_term_cfg,
    params={
        "address": "commands.object_pose.ranges.pos_x",
        "modify_fn": initial_final_interpolate_fn,
        "modify_params": {"iv": (-.5, -.5), "fv": (-.75, -.25), "get_fraction": "env.command_manager.get_command('difficulty')"}
    }
)
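
The interpolation in Demo 1 can also be sketched without torch or eval. The version below is only an illustration, not the PR's code: it assumes the fraction is supplied by a hypothetical zero-argument callable `get_fraction` instead of an expression string, and the name `interpolate` is made up for this example.

```python
from typing import Callable, Sequence, Union

Number = Union[int, float]
Value = Union[Number, Sequence[Number]]


def interpolate(data: Value, iv: Value, fv: Value, get_fraction: Callable[[], float]) -> Value:
    """Linearly interpolate from iv to fv, returning the same kind of value as `data`."""
    fraction = float(get_fraction())

    def lerp(a: Number, b: Number) -> float:
        return a + fraction * (b - a)

    if isinstance(data, float):
        return lerp(iv, fv)
    if isinstance(data, int):
        return int(lerp(iv, fv))
    if isinstance(data, (tuple, list)):
        # As in the demo, assume iv is a sequence of all ints or all floats.
        is_int = isinstance(iv[0], int)
        values = [int(lerp(a, b)) if is_int else lerp(a, b) for a, b in zip(iv, fv)]
        return tuple(values) if isinstance(data, tuple) else values
    raise TypeError(f"Does not support the type {type(data)}")


# Halfway through the curriculum, the x range from the demo has moved halfway.
half_way = interpolate((-0.5, -0.5), (-0.5, -0.5), (-0.75, -0.25), lambda: 0.5)
```

Passing a callable keeps the fraction source explicit and avoids evaluating a string, at the cost of being harder to express in a serialized config.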

Demo 3: overriding an entire term based on the env step counter rather than adaptively

def value_override(env: ManagerBasedRLEnv, env_id, data, new_val, num_steps):
    if env.common_step_counter > num_steps:
        return new_val
    return mdp.modify_term_cfg.NO_CHANGE

object_pos_curriculum = CurrTerm(
    func=mdp.modify_term_cfg,
    params={
        "address": "commands.object_pose",
        "modify_fn": value_override,
        # the key must match the `num_steps` argument of value_override
        "modify_params": {"new_val": <new_observation_term>, "num_steps": 120000}
    }
)

Demo 4: overriding a Tensor field within some arbitrary class not visible from term_cfg
(you can see that 'address' is not as nice as with mdp.modify_term_cfg)

def resample_bucket_range(env: ManagerBasedRLEnv, env_id, data, static_friction_range, dynamic_friction_range, restitution_range, num_steps):
    if env.common_step_counter > num_steps:
        range_list = [static_friction_range, dynamic_friction_range, restitution_range]
        ranges = torch.tensor(range_list, device="cpu")
        new_buckets = math_utils.sample_uniform(ranges[:, 0], ranges[:, 1], (len(data), 3), device="cpu")
        return new_buckets
    return mdp.modify_env_param.NO_CHANGE

object_physics_material_curriculum = CurrTerm(
    func=mdp.modify_env_param,
    params={
        "address": "event_manager.cfg.object_physics_material.func.material_buckets",
        "modify_fn": resample_bucket_range,
        # the key must match the `num_steps` argument of resample_bucket_range
        "modify_params": {"static_friction_range": [.5, 1.], "dynamic_friction_range": [.3, 1.], "restitution_range": [0.0, 0.5], "num_steps": 120000}
    }
)

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist

  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the changelog and the corresponding version in the extension's config/extension.toml file
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

@ooctipus (Contributor Author) commented

@jtigue-bdai Feel free to view and provide some feedback

@jtigue-bdai (Collaborator) left a comment

Thanks for this @ooctipus, we don't currently have tests for mdp terms but do you think you could put together a unit test for this? Because it has the potential for touching so many things I think it would be good to get some unit tests for it.

Reads `cfg.params["address"]`, replaces only the first occurrence of "s."
with "_manager.cfg.", and then behaves identically to ModifyEnvParam.

for example: command_manager.cfg.object_pose.ranges.xpos -> commands.object_pose.ranges.xpos

Collaborator:

In the example here can you show an example use of this so the syntax is clear?
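
A minimal sketch of the rewrite the quoted docstring describes, assuming it amounts to a plain `str.replace` with a count of 1. The helper name `term_address_to_env_address` is hypothetical and is not taken from the PR.

```python
def term_address_to_env_address(address: str) -> str:
    """Expand a term-style address into a full env address.

    Only the first occurrence of "s." is replaced, so
    "observations." becomes "observation_manager.cfg.".
    """
    return address.replace("s.", "_manager.cfg.", 1)


full = term_address_to_env_address("commands.object_pose.ranges.pos_x")
```

Under this assumption the short form "commands.object_pose.ranges.pos_x" expands to "command_manager.cfg.object_pose.ranges.pos_x". The rule relies on the first path segment being a plural name such as observations or commands; an address whose first "s." appears later in the path would be rewritten at that later position.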

Comment on lines 45 to 48
This term compiles getter/setter accessors for a target attribute (specified by
`cfg.params["address"]`) the first time it is called, then on each invocation
reads the current value, applies a user-provided `modify_fn`, and writes back
the result.

Collaborator:

Can you add an example code snippet on how you would use this?
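
The read, modify, write-back cycle from the quoted docstring could look roughly like the sketch below. It is not the PR's implementation: the class `ModifyParamSketch`, the module-level `NO_CHANGE` sentinel, and the stand-in env built from `SimpleNamespace` are all assumptions made for illustration, and only attribute paths are handled.

```python
from types import SimpleNamespace

NO_CHANGE = object()  # sentinel: modify_fn returns this to leave the value untouched


class ModifyParamSketch:
    def __init__(self, address, modify_fn, modify_params=None):
        self.address = address
        self.modify_fn = modify_fn
        self.modify_params = modify_params or {}
        self._accessors = None  # compiled lazily on the first call

    def _compile(self, env):
        # Walk every path segment except the last to find the container.
        *parents, last = self.address.split(".")
        container = env
        for name in parents:
            container = getattr(container, name)
        return (lambda: getattr(container, last)), (lambda val: setattr(container, last, val))

    def __call__(self, env, env_ids=None):
        if self._accessors is None:
            self._accessors = self._compile(env)
        getter, setter = self._accessors
        new_val = self.modify_fn(env, env_ids, getter(), **self.modify_params)
        if new_val is not NO_CHANGE:
            setter(new_val)


def override_after(env, env_ids, data, new_val, num_steps):
    # Step-gated override in the style of Demo 3.
    return new_val if env.common_step_counter > num_steps else NO_CHANGE


env = SimpleNamespace(common_step_counter=0, cfg=SimpleNamespace(noise=SimpleNamespace(n_min=0.0)))
term = ModifyParamSketch("cfg.noise.n_min", override_after, {"new_val": -0.1, "num_steps": 10})

term(env)
before = env.cfg.noise.n_min  # step counter is below the threshold
env.common_step_counter = 11
term(env)
after = env.cfg.noise.n_min  # step counter is past the threshold
```

Compiling the accessors once keeps the per-call cost to one getter call, one `modify_fn` call, and at most one setter call.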

Comment on lines 132 to 179
if isinstance(self.container, tuple):
    getter = lambda: self.container[self.last]

    def setter(val):
        tuple_list = list(self.container)
        tuple_list[self.last] = val
        self.container = tuple(tuple_list)

elif isinstance(self.container, dict):
    getter = lambda: self.container[self.last]

    def setter(val):
        self.container[self.last] = val

elif isinstance(self.container, object):
    getter = lambda: getattr(self.container, self.last)

    def setter(val):
        setattr(self.container, self.last, val)

Collaborator:

Do we need to add a condition for single values (i.e. int, float, bool, etc.), or does the object condition handle this?

@ooctipus (Contributor Author), Jun 26, 2025

We don't need a separate condition for single values, because the object condition handles them: the type check applies to the container, not to the last element of the path.

For example, observations.policy.joint_pos.unoise.n_min has the value -0.1, so self.container becomes unoise, an object, and self.last becomes n_min.

As far as I can think of, there are three kinds of container: tuple, dict, or object. So the conditions should be complete.
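
The point about scalar leaves can be illustrated with a small stand-alone sketch. The `Accessor` class below is hypothetical and only mirrors the three branches of the quoted snippet; the containers are plain Python stand-ins rather than real Isaac Lab configs.

```python
from types import SimpleNamespace


class Accessor:
    """Dispatch on the type of the container, never on the type of the leaf value."""

    def __init__(self, container, last):
        self.container = container
        self.last = last

    def kind(self):
        if isinstance(self.container, tuple):
            return "tuple"
        if isinstance(self.container, dict):
            return "dict"
        return "object"

    def get(self):
        if isinstance(self.container, (tuple, dict)):
            return self.container[self.last]
        return getattr(self.container, self.last)

    def set(self, val):
        if isinstance(self.container, tuple):
            # Tuples are immutable: rebuild and rebind this accessor's own reference.
            items = list(self.container)
            items[self.last] = val
            self.container = tuple(items)
        elif isinstance(self.container, dict):
            self.container[self.last] = val
        else:
            setattr(self.container, self.last, val)


# A float leaf (n_min) lives on an object container (unoise), so the object branch applies.
unoise = SimpleNamespace(n_min=-0.1, n_max=0.1)
scalar_leaf = Accessor(unoise, "n_min")
scalar_leaf.set(-0.2)

ranges = {"pos_x": (-0.5, 0.5)}
dict_leaf = Accessor(ranges, "pos_x")
dict_leaf.set((-0.75, -0.25))

tuple_leaf = Accessor((1, 2, 3), 1)
tuple_leaf.set(9)
```

In this sketch the tuple branch only updates the accessor's own copy; whatever object holds the original tuple would still need the new tuple written back to it.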

@ooctipus ooctipus requested a review from pascal-roth as a code owner June 27, 2025 19:34
@ooctipus ooctipus force-pushed the feat/modify_env_param_curriculum branch from e28803f to 8957e93 Compare June 27, 2025 19:35
ooctipus added 2 commits June 27, 2025 14:26
…hing, and wrote test for this modify_env_param and modify_term_cfg
@ooctipus ooctipus force-pushed the feat/modify_env_param_curriculum branch from 8957e93 to 60a0b87 Compare June 27, 2025 21:27

2 participants