Skip to content

improve argument evaluation #104

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

mjaspers2mtu
Copy link

allow variables to be initialized with hex and binary values, not just integers

For example:
magic: .long 48879

could also be:
magic: .long 0b1011111011101111

or what I frequently use it for:
magic: .long 0xBEEF

int(x, 0) in Python is a very useful feature of the int() constructor that automatically interprets the base (radix) of the number based on its prefix

allow variables to be initialized with hex and binary values, not just integers
@ThomasWaldmann
Copy link
Collaborator

This will introduce a slight incompatibility in case someone has used integers with leading zeros. But I think it is worth it nevertheless, it should be just pointed out in docs / changelog.

>>> int("010")
10

>>> int("010", 0)
Traceback (most recent call last):
  File "<python-input-1>", line 1, in <module>
    int("010", 0)
    ~~~^^^^^^^^^^
ValueError: invalid literal for int() with base 0: '010'

https://docs.python.org/3/library/functions.html#int

@wnienhaus
Copy link
Collaborator

Thanks for this PR.

I thought we already supported this (see the test fixture here for example), because we support it for opcode arguments, but I now see we don't evaluate what is passed to .long et al. (I will have a look later, why it works in that fixture, if it even does)

According to the GNU assembler manual, the .long / .int / etc directives take expressions as argument, which should evaluate to integers. So we really should evaluate the argument, and technically, we should even evaluate multiple expressions if there are multiple (separated by comma).

But for now, perhaps the easiest is to stick to supporting a single value, but to pass that value through our eval_args function, which we use for evaluating expressions when used in opcodes arguments. (See tests here for what it's capable of). That way we have only one consistent way of evaluating expressions.

We already do this for the .set directive a few lines above (see here), so it should be easy to achieve.

Would you mind adapting the PR to use eval_args?

@ThomasWaldmann ThomasWaldmann changed the title Update assemble.py improve argument evaluation Jun 9, 2025
Using eval_args instead for better compatibility when using expressions
@mjaspers2mtu
Copy link
Author

mjaspers2mtu commented Jun 9, 2025

I agree @wnienhaus that the use of eval_args is a better idea, as it also allows expressions. There is still an issue with leading 0's (thanks @ThomasWaldmann), which is not present when using leading 0's in opcode arguments, since this also goes through arg_qualify.

There is a caveat, that when using leading 0's in opcode arguments, it can not be an expression.
For example, this works:
move r0, 048879

but this does not:
move r0, 048878+1

This has to do with the try except blocks in the arg_qualify function. where int("048879") evaluates to 48879 (first try block) but eval("048879") throws an exception (second try block).

@mjaspers2mtu
Copy link
Author

If you figure out why the test fixture passes, I am interested what the reason is :)

@wnienhaus
Copy link
Collaborator

So... it turns out the change in behaviour (from what I remember to how it's now) is because of a "regression" (or fully-understood and intentional change) in MicroPython.

When looking into this, I noticed our unit tests fail with the latest MicroPython (v1.25.0) - they fail here, exactly for the same reason you (@mjaspers2mtu) described in your last comment about the try..except block, but while trying to cast a hex value. Typing int("0x5") into the REPL fails with an exception, which I believe it didn't do when I last worked on this.

And sure enough, with MicroPython v1.24.1 all our unit tests still pass. And typing int("0x5") into the REPL results in 5.

I traced it down to this commit in MicroPython: micropython/micropython@13b13d1. Because that change does not relate directly to hex values, and the tests that were added don't test any hex values, it might be that the impact on hex values was unintended. However, since Python 3 also throws pretty much the same exception, it may also have been intended (and the change is just missing some test cases).

Btw, int("0x5", 0) works on both versions of MicroPython, but... int("010") returns 10 on the older MicroPython, while it returns an exception (as per @ThomasWaldmann's comment earlier) on the newer MicroPython - and as per description of the commit.

So I guess we have 2 options:

  1. We check with MicroPython authors, if this (potentially unintended) change is what they wanted - it is how Python 3 works, so it could be that they do.
  2. Change our code, to pass 0 as the base argument to int(). This will handle the hex value case, but as @ThomasWaldmann pointed out it would not work for 010. While that behaviour would be in line with Python 3, it would not be what GNU as does - it would interpret that as octal. Since our goal (mine at least) had been to support anything that binutils (gnu as) from Espressif handles without modification, this would be sad. I wonder if we can detect this ourselves and handle it as octal? e.g. if first char is "0" and second is a digit 0-9, then pass it to int() with base set to 8? @ThomasWaldmann what do you think?

I think I'll raise an issue about 1. either way, just to find out what their intent was - and (potentially) make them aware of the issue with hex values.

@dpgeorge
Copy link
Member

So... it turns out the change in behaviour (from what I remember to how it's now) is because of a "regression" (or fully-understood and intentional change) in MicroPython.

Yes, we did make a breaking change with how int() works when not given a base argument. And the case of hex literals was considered, and a test patched so it now passes, see: micropython/micropython@7b3f189

So given that above commit, I would say this change is understood and intentional. Although unfortunate that it did break something here 😞

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants