improve argument evaluation #104

mjaspers2mtu · 2025-06-08T13:59:43Z

Allow variables to be initialized with hex and binary values, not just decimal integers.

For example:
magic: .long 48879

could also be:
magic: .long 0b1011111011101111

or what I frequently use it for:
magic: .long 0xBEEF

int(x, 0) in Python is a very useful feature of the int() constructor: with a base of 0, it infers the base (radix) of the number from its prefix.
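As a quick illustration of the base-0 behaviour (shown for CPython 3; the variable names are only for this sketch), the three spellings from the examples above all parse to the same value:

```python
# Base 0 tells int() to infer the radix from the literal's prefix.
literals = ["48879", "0b1011111011101111", "0xBEEF"]
values = [int(text, 0) for text in literals]

# All three spellings denote the same number.
print(values)
```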


ThomasWaldmann · 2025-06-08T17:40:03Z

This will introduce a slight incompatibility in case someone has used integers with leading zeros. But I think it is worth it nevertheless; it should just be pointed out in the docs / changelog.

>>> int("010")
10

>>> int("010", 0)
Traceback (most recent call last):
  File "<python-input-1>", line 1, in <module>
    int("010", 0)
    ~~~^^^^^^^^^^
ValueError: invalid literal for int() with base 0: '010'

https://docs.python.org/3/library/functions.html#int

wnienhaus · 2025-06-09T04:57:00Z

Thanks for this PR.

I thought we already supported this (see the test fixture here for example), because we support it for opcode arguments, but I now see we don't evaluate what is passed to .long et al. (I will have a look later at why it works in that fixture, if it even does.)

According to the GNU assembler manual, the .long / .int / etc. directives take expressions as arguments, which should evaluate to integers. So we really should evaluate the argument, and technically, we should even evaluate multiple expressions if there are several (separated by commas).

But for now, perhaps the easiest approach is to stick to supporting a single value, but to pass that value through our eval_args function, which we use for evaluating expressions in opcode arguments. (See tests here for what it's capable of.) That way we have only one consistent way of evaluating expressions.

We already do this for the .set directive a few lines above (see here), so it should be easy to achieve.

Would you mind adapting the PR to use eval_args?

Using eval_args instead for better compatibility when using expressions

mjaspers2mtu · 2025-06-09T11:12:41Z

I agree @wnienhaus that the use of eval_args is a better idea, as it also allows expressions. There is still an issue with leading 0's (thanks @ThomasWaldmann), which is not present when using leading 0's in opcode arguments, since those also go through arg_qualify.

There is a caveat: when using leading 0's in opcode arguments, the argument cannot be an expression.
For example, this works:
move r0, 048879

but this does not:
move r0, 048878+1

This has to do with the try/except blocks in the arg_qualify function, where int("048879") evaluates to 48879 (first try block) but eval("048879") throws an exception (second try block).
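A minimal sketch of that two-step fallback, as it behaves on CPython 3. The function qualify_sketch is a hypothetical stand-in written for this illustration, not the project's actual arg_qualify:

```python
def qualify_sketch(arg):
    """Hypothetical stand-in for the two-step fallback described above."""
    try:
        # First attempt: plain decimal text. int() accepts leading zeros here.
        return int(arg)
    except ValueError:
        pass
    try:
        # Second attempt: parse as Python source, where a decimal literal
        # with a leading zero is a syntax error.
        return eval(arg)
    except SyntaxError:
        return None


print(qualify_sketch("048879"))     # handled by the first try block
print(qualify_sketch("048878+1"))   # rejected by both try blocks
print(qualify_sketch("48878+1"))    # handled by the second try block
```

The middle case fails because int() cannot parse an expression, and eval() cannot parse the leading-zero literal inside it.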

mjaspers2mtu · 2025-06-09T11:15:20Z

If you figure out why the test fixture passes, I am interested what the reason is :)

wnienhaus · 2025-06-12T20:03:22Z

So... it turns out the change in behaviour (from what I remember to how it's now) is because of a "regression" (or fully-understood and intentional change) in MicroPython.

When looking into this, I noticed our unit tests fail with the latest MicroPython (v1.25.0) - they fail here, exactly for the same reason you (@mjaspers2mtu) described in your last comment about the try..except block, but while trying to cast a hex value. Typing int("0x5") into the REPL fails with an exception, which I believe it didn't do when I last worked on this.

And sure enough, with MicroPython v1.24.1 all our unit tests still pass. And typing int("0x5") into the REPL results in 5.

I traced it down to this commit in MicroPython: micropython/micropython@13b13d1. Because that change does not relate directly to hex values, and the tests that were added don't test any hex values, it might be that the impact on hex values was unintended. However, since Python 3 also throws pretty much the same exception, it may also have been intended (and the change is just missing some test cases).

Btw, int("0x5", 0) works on both versions of MicroPython, but... int("010") returns 10 on the older MicroPython, while it raises an exception (as per @ThomasWaldmann's comment earlier) on the newer MicroPython, as per the description of the commit.

So I guess we have 2 options:

1. We check with the MicroPython authors whether this (potentially unintended) change is what they wanted. It is how Python 3 works, so it could be that they do.
2. We change our code to pass 0 as the base argument to int(). This will handle the hex value case, but as @ThomasWaldmann pointed out, it would not work for 010. While that behaviour would be in line with Python 3, it would not be what GNU as does: it would interpret that as octal. Since our goal (mine at least) has been to support anything that binutils (GNU as) from Espressif handles without modification, this would be sad. I wonder if we can detect this ourselves and handle it as octal? E.g. if the first char is "0" and the second is a digit 0-9, then pass it to int() with base set to 8. @ThomasWaldmann what do you think?

I think I'll raise an issue about 1. either way, just to find out what their intent was - and (potentially) make them aware of the issue with hex values.

dpgeorge · 2025-06-13T01:57:17Z

> So... it turns out the change in behaviour (from what I remember to how it's now) is because of a "regression" (or fully-understood and intentional change) in MicroPython.

Yes, we did make a breaking change with how int() works when not given a base argument. And the case of hex literals was considered, and a test patched so it now passes, see: micropython/micropython@7b3f189

So given that above commit, I would say this change is understood and intentional. Although unfortunate that it did break something here 😞

wnienhaus · 2025-06-13T11:22:42Z

Hehe :) That solves my option 1. Thanks @dpgeorge . (I had seen that commit you referenced, but didn't read the commit description properly, so given the lack of a negative test to show that the exception is indeed wanted, I wasn't sure what the intent was. Of course reading the commit description again now, it feels pretty clear).

I will look at putting "format detection" into our code, so it interprets number literals like GNU as does (by passing the correct base to int()). Based on as's documentation, octal should be 0[0-7]*, hex should be 0[xX][0-9a-fA-F]+, decimal should be [1-9][0-9]* and binary should be 0[bB][01]*. Each case should also allow a minus (-) prefix. int() already handles the minus prefix correctly, e.g. int("-0b11", 2) == -3, so the prefix only matters for the logic that determines the correct base.
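One possible shape for that format detection, as a rough sketch rather than the eventual implementation. The helper name parse_as_integer is hypothetical, and the behaviour shown is that of CPython 3's int():

```python
def parse_as_integer(text):
    """Hypothetical helper: pick the base the way GNU as does, then call int()."""
    digits = text.lstrip("+-")   # the sign plays no part in base detection
    prefix = digits[:2].lower()
    if prefix == "0x":
        base = 16
    elif prefix == "0b":
        base = 2
    elif len(digits) > 1 and digits[0] == "0":
        base = 8                 # a leading zero means octal in GNU as
    else:
        base = 10
    # int() accepts the sign and a prefix matching the explicit base,
    # and raises ValueError for digits that are invalid in that base.
    return int(text, base)


for literal in ("0xBEEF", "-0b11", "010", "48879"):
    print(literal, parse_as_integer(literal))
```

Passing an explicit base means an invalid literal such as 089 is still rejected by int() with a ValueError instead of being silently read as decimal.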

I won't manage to look into this for the next few days, so @mjaspers2mtu - if you feel up to it, give it a go. This was your PR after all, and I don't want to take over if you would like to solve this. But otherwise, I'm very happy to look into it next week.

mjaspers2mtu · 2025-06-15T12:22:41Z

@wnienhaus Great finds, I agree with the vision of having ULP code written for Espressif work without modification in micropython-esp32-ulp.

Would it be possible to support both zero-padding and the Python-style 0o prefix for octal values? Personally, I find the 0o notation clearer. Or maybe you think it should throw an error, just like it would in the Espressif implementation?

Also, feel free to take the lead on the implementation, since you're clearly more familiar with this part of the codebase than I am.

Update assemble.py

721f754

allow variables to be initialized with hex and binary values, not just integers

ThomasWaldmann changed the title ~~Update assemble.py~~ improve argument evaluation Jun 9, 2025

Update assemble.py

bf320af

Using eval_args instead for better compatibility when using expressions
