Skip to content

Test installer's raid1 feature #311

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 17 commits into
base: master
Choose a base branch
from
Open

Conversation

ydirson
Copy link
Contributor

@ydirson ydirson commented May 21, 2025

This builds on top of #226

@ydirson ydirson requested review from glehmann and stormi May 21, 2025 12:48
@ydirson ydirson force-pushed the installer-features/raid1 branch from 950e085 to b2bf560 Compare May 22, 2025 08:41
@ydirson
Copy link
Contributor Author

ydirson commented May 22, 2025

Last push rebases, and fixes the issues in AnswerFile dict typechecking

@ydirson ydirson force-pushed the installer-features/raid1 branch 5 times, most recently from 6c68ae2 to 14d1ffc Compare May 22, 2025 10:34
ydirson added 9 commits May 26, 2025 10:38
- adding it behind the scene makes it more difficult to write tests for IPv6
- only needed for install, not upgrade or restore

Signed-off-by: Yann Dirson <[email protected]>
Signed-off-by: Yann Dirson <[email protected]>
This is useful for a lambda passed to @pytest.mark.answerfile, where in
some variants of a test we want to add an element, but nothing in other
variants (eg. a <raid> block)

Signed-off-by: Yann Dirson <[email protected]>
This is a preparation for type hint addition, where we cannot mutate
in-place a variable to another type: we have to build an object of the
correct return type incrementally.

Signed-off-by: Yann Dirson <[email protected]>
Issue raised by type checkers.

Signed-off-by: Yann Dirson <[email protected]>
Type checkers today are unable to determine that `defn` does not contain
an `attrib` member, this prevents them from checking our dict would
provide compatible data (which is does not).

Signed-off-by: Yann Dirson <[email protected]>
@ydirson ydirson force-pushed the installer-features/raid1 branch from 14d1ffc to 2edba63 Compare May 26, 2025 08:41
ydirson added 8 commits May 26, 2025 15:27
Note the `type: ignore[call-arg]` in test_fixtures is here to overcome a
bug/limitation in mypy 1.15.0: in callable_marker() it seems to consider
that `value(**params)` produces at least one argument.

Signed-off-by: Yann Dirson <[email protected]>
Signed-off-by: Yann Dirson <[email protected]>
Signed-off-by: Yann Dirson <[email protected]>
The helper_vm_with_plugged_disk fixture for tune_firstboot was only able
to attach a single disk.  For RAID support we need several disks, and that
fixture would fail.

Signed-off-by: Yann Dirson <[email protected]>
Using a temporary variable is unnecessary and hurts readability.

Also use logger formatting as designed.

Signed-off-by: Yann Dirson <[email protected]>
Code will be more readable when we start manipulating other info about
system disks.

Signed-off-by: Yann Dirson <[email protected]>
Adds a new system_disk test parameter to distinguish raid1 setup from
(pre-existing) single-disk one.  Adds Alpine-specific setup for
tune_firstboot to manually assemble the RAID.

Signed-off-by: Yann Dirson <[email protected]>
@ydirson ydirson force-pushed the installer-features/raid1 branch from 2edba63 to b54c656 Compare May 26, 2025 13:27
etree = ET.ElementTree(self._defn_to_xml_et(self.defn))
etree.write(filename)

# chainable mutators for lambdas

def top_append(self, *defs):
def top_append(self, *defs: Union[SimpleAnswerfileDict, None, ValueError]) -> Self:
assert not isinstance(self.defn['CONTENTS'], str), "a toplevel CONTENTS must be a list"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't we check that self.defn['CONTENTS'] is a list or a Sequence here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal of this is to be as close as possible for checking that self.defn is indeed of a subclass of its formal type hint. I had a quick try at actually writing a type hint for this but got into problems (which does not mean it is not feasible, I chose concentrated on other things at that time).

self.defn is typed SimpleAnswerfileDict, so self.defn['CONTENTS'] is NotRequired[Union[str, "SimpleAnswerfileDict", Sequence["SimpleAnswerfileDict"]]].

The code assumes a "toplevel" is rather typed Required[Union["SimpleAnswerfileDict", Sequence["SimpleAnswerfileDict"]]], so:

  • if we stick we assert I missed an assert "CONTENTS" in self.defn first
  • the problem with asserting the actual types is that isinstance cannot check using those type hints, which is why I chose to check self.defn is just not in the complementary set

But yes, this is a bit hackish and should be better explained in a comment, or simply typed better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pydantic could help here:

class Foo(TypedDict):
    bar: int
    baz: str
    plop: NotRequired[None | str]
foo_validator = TypeAdapter(Foo)
# raise an exception when the arg doesn't match what's declared in Foo
foo_validator.validate_python(dict(bar=1, baz="blah", plop=None))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be more pythonic to keep as much as possible at the type-hint level :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's using the type hints, just replacing the assert with the pydantic validator:

from typing import TypedDict
from typing_extensions import NotRequired
from pydantic import TypeAdapter
import os

class Foo(TypedDict):
    bar: int
    baz: str
    plop: NotRequired[None | str]
foo_validator = TypeAdapter(Foo)

v = None
if os.path.exists("/tmp"):
    v = dict(bar=1, baz="blah", plop=None)
# at that point v is of type dict[str, int | str | None]

v = foo_validator.validate_python(v)
# here v is of type Foo

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, that makes the runtime check more expensive (the reason why type hints are ignored by the interpreter)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but it's still very fast compared to almost everything we're doing in the tests

In [2]: %timeit foo_validator.validate_python(v)
209 ns ± 1.35 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

if item is not None
]
elif key == 'TAG':
pass # already copied
else:
new_defn[key] = value
new_defn[key] = value # type: ignore[literal-required]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it would have been simpler to keep the old implementation (before the previous commit that prepared this one) and just cast the result to the expected type.
Do we have some benefit to do it that way, with a cast and a type: ignore?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think 'pydantic' can validate a TypedDict. We could think of using it, independently of the solution, that both require a cast.

Copy link
Contributor Author

@ydirson ydirson May 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it would have been simpler to keep the old implementation (before the previous commit that prepared this one)

Not sure which commit you mean, its name would help 🙂
oh "do the _normalize_structure copy more manually"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have some benefit to do it that way, with a cast and a type: ignore?

The problem is, it cannot be just a cast. The original implementation gradually mutates a SimpleAnswerfileDict into a AnswerfileDict, and the typechecker is not smart enough.

on:

            defn['CONTENTS'] = [AnswerFile._normalize_structure(item)
                                for item in defn['CONTENTS']]

pyright:

  /home/user/src/xcpng/tests/lib/installer.py:47:13 - error: Could not assign item in TypedDict
    Type "list[AnswerfileDict]" is not assignable to type "str | SimpleAnswerfileDict | Sequence[SimpleAnswerfileDict]"
...
      "list[AnswerfileDict]" is not assignable to "Sequence[SimpleAnswerfileDict]"
        Type parameter "_T_co@Sequence" is covariant, but "AnswerfileDict" is not a subtype of "SimpleAnswerfileDict"
          "CONTENTS" is not required in "SimpleAnswerfileDict"
          "CONTENTS" is an incompatible type
            Type "str | list[AnswerfileDict]" is not assignable to type "str | SimpleAnswerfileDict | Sequence[SimpleAnswerfileDict]"

I tried quite a few things and that quickly becomes a nightmare of casts which defeats the type-checking. Right now I don't see a better way to write _normalize_structure, and that type: ignore is only for mypy, pyright does not seem to have any issue there (which might be another issue, I would expect it to require us to duplicate the known-attributes for AnswerfileDict).

Anyway, if we move those attributes into a separate dict as discussed in a thread below, the need for that override will vanish.

return self

# makes a mutable deep copy of all `contents`
@staticmethod
def _normalize_structure(defn):
def _normalize_structure(defn: Union[SimpleAnswerfileDict, ValueError]) -> AnswerfileDict:
assert isinstance(defn, dict), f"{defn!r} is not a dict"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would raise an AssertionError if defn is a ValueError. Shouldn't we raise the ValueError directly?

Copy link
Contributor Author

@ydirson ydirson May 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, the whole idea of ValueError here is that we'd like to get it thrown in the first place, so we might not need it in the type any more if we switch from ternary operator to dict lookups.

Else yes, that makes sense.

@@ -98,7 +98,15 @@ def installer_iso(request):
@pytest.fixture(scope='function')
def system_disks_names(request):
firmware = request.getfixturevalue("firmware")
yield {"uefi": "nvme0n1", "bios": "sda"}[firmware]
system_disk_config = request.getfixturevalue("system_disk_config")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

system_disk_config could be validated here.

Maybe use an enum to represent the possible values?

Also disk could have a more explicit name like single_disk.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

system_disk_config could be validated here

do you mean type-hinted?

Also disk could have a more explicit name like single_disk.

I am not very happy with disk but those land in the test name, so I'd prefer a short alternative. For context, after piling IPv6 parameter on top of this, the test are like:

tests/install/test.py::TestNested::test_boot_inst[uefi-83nightly-host1-disk-iso-nosr-ipv6static]

And we are not anywhere near what they ought to be ☹️

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

system_disk_config could be validated here

do you mean type-hinted?

I mean validated at run time to make sure that the value is a valid one.

An enum could help also with the static analysis. I think that mypy/pyright are able to check when we are not checking all the possible variants

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That rather sounds like something the type checkers should be aware of: pytest.mark.parametrize indeed declares an enum, but declaring it ourselves for every parameter 1. seems wrong 2. is likely to make the code less readable

{"TAG": "admin-interface", "name": "eth0", "proto": "dhcp"},
{"TAG": "primary-disk",
"guest-storage": "no" if local_sr == "nosr" else "yes",
"CONTENTS": system_disks_names[0]},
"CONTENTS": ("md127" if system_disk_config == "raid1"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of nesting conditional expressions. I think I would prefer something like:

dict(raid1='md127', disk=system_disks_names[0])[system_disk_config]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used those in a few places (I do like getting in a single short line what would otherwise take 4), but in this particular place we'll want to be consistent and not mix the 2 idioms, and I'm not sure how readable that would get with the larger expressions.
OTOH, getting real ValueError raised instead of having to add them in the type hints has a lot of appeal... this needs to be looked at.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why didn't you raise the ValueError in the first place if that's an acceptable option?

I think that

def raise_value_error(v) -> SimpleAnswerfileDict:
    raise ValueError(v)

{"TAG": "raid", "device": "md127",
  "CONTENTS": [
     {"TAG": "disk", "CONTENTS": diskname} for diskname in system_disks_names
 ],
} if system_disk_config == "raid1"
else None if system_disk_config == "disk"
else raise_value_error(f"system_disk_config {system_disk_config!r}")

should have worked (not tasted). Am I missing something?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dict(
    raid1={
        "TAG": "raid",
        "device": "md127",
        "CONTENTS": [
            {"TAG": "disk", "CONTENTS": diskname} for diskname in system_disks_names
        ],
    },
    disk=None,
)[system_disk_config]

still seems much easier to read to me :)

Copy link
Contributor Author

@ydirson ydirson May 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately it seems we hit a mypy limitation, with

        .top_append(
            {"iso": {"TAG": "source", "type": "local"},
             "net": {"TAG": "source", "type": "url",
                     "CONTENTS": ISO_IMAGES[iso_version]['net-url']},
             }[package_source],
...

gets it complaining:

tests/install/test.py:84: error: Argument 1 to "top_append" of "AnswerFile" has incompatible type "dict[str, str]"; expected "SimpleAnswerfileDict | ValueError | None"  [arg-type]

... where the values of that dict are obviously valid SimpleAnswerfileDict's

and the following does not look near as neat:

            cast(SimpleAnswerfileDict, {"iso": {"TAG": "source", "type": "local"},
                                        "net": {"TAG": "source", "type": "url",
                                                "CONTENTS": ISO_IMAGES[iso_version]['net-url']},
                                        }[package_source]),

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants