Skip to content

Change the way built-in functions are resolved #13632

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
javiereguiluz opened this issue Mar 8, 2024 · 9 comments
Closed

Change the way built-in functions are resolved #13632

javiereguiluz opened this issue Mar 8, 2024 · 9 comments

Comments

@javiereguiluz
Copy link
Contributor

Description

When some PHP app uses a function like file_exists(), the PHP compiler doesn't know if it's the PHP built-in function or a user-defined function added by that project.


This has some impact with changes like this: #12461

To unlock the potential performance improvements of that PR, you have two options:

(1) Prefix all calls to PHP built-in functions with \
(2) Import all PHP built-in functions with use function ...

Both solutions look cumbersome to me.


In other words:

  • All PHP projects in the World are a bit slower than they could be...
  • ... because maybe, some project, could include a user-defined function called substr() or file_exists() and PHP needs to take that into account

Could PHP fix this in some way? Here are some proposals:

(1) We could make this behavior opt-in instead of opt-out. If a project includes user-defined functions that match built-in function names, make them add a declares(overrides_php_functions=1); statement

(2) In order to change the default behavior in a backward-compatible way, we could add a transitory php.ini directive such as overrides_php_functions set to true by default (to match current PHP behavior) and then apps can change it to false so the performance improvement can be unlocked without changing their codebase

Thanks

@MorganLOCode
Copy link

See prior discussion on this subject
https://wiki.php.net/rfc/use_global_elements

@iluuu1994
Copy link
Member

I haven't followed this discussion, so I'm not sure why this RFC failed. I can think of two basic approaches:

  • Treat all unqualified function names as global, as the use_global_elements RFC did.
    • All functions that are actually local to the namespace now need to be prefixed.
    • This is inverse to classes, where global symbols need to be prefixed.
  • Treat all known (i.e. internal) functions as global, and treat the rest as before.
    • This avoids change all calls to all your own functions.
    • This adds a discrepancy between user and internal functions, which we worked hard to get rid of.

So, given these caveats, I'm not sure if there's a good way to proceed.

Copy link

github-actions bot commented Jun 7, 2024

There has not been any recent activity in this feature request. It will automatically be closed in 14 days if no further action is taken. Please see https://github.com/probot/stale#is-closing-stale-issues-really-a-good-idea to understand why we auto-close stale feature requests.

@github-actions github-actions bot added the Stale label Jun 7, 2024
@javiereguiluz
Copy link
Contributor Author

OK, let's close this proposal because it didn't have any recent activity. I'll try to keep sending other ideas in the future. Cheers!

@iluuu1994
Copy link
Member

iluuu1994 commented Jun 7, 2024

@javiereguiluz I agree that it would be great transitioning towards statically known namespaces for function calls. In practice, I'm not sure how to best achieve it without lots of code churn. A declare could be annoying without and error prone without a code fixer and analysis tool. Furthermore, if the declare is enabled by default at some point, all the declares can be removed again. https://wiki.php.net/rfc/namespace_scoped_declares could solve that.

Also, it's quite annoying that the chosen approach for functions would be the opposite of the one chosen for classes (global by default for functions, local by default for classes). Disambiguating local functions is also a bit ugly (e.g. namespace\preg_match()) without includes (although, then again, you can just use includes).

I suppose another approach could be to just reverse the lookup order: If the function exists globally, use that. Otherwise, use the local function. This should result in much fewer code changes, while allowing static knowledge of internal function calls, which are the majority of all non-instance function calls in OOP code bases. WDYT?

I will have a look soon to see if this makes a meaningful difference in a few code bases.

@iluuu1994 iluuu1994 reopened this Jun 7, 2024
@iluuu1994 iluuu1994 self-assigned this Jun 7, 2024
@iluuu1994
Copy link
Member

I created a simple PoC here: iluuu1994#132 It results in a 0.91% reduction in instructions for Symfony Demo. But it's unclear how much that correlates to performance. I will measure real time next.

@iluuu1994
Copy link
Member

iluuu1994 commented Jun 9, 2024

I wrote a script to analyze the impact by checking how many namespaced functions exist in the top 1000 composer packages with names equivalent to internal, global functions.

https://gist.github.com/iluuu1994/4b83481baac563f8f0d3204c697c5551

There are 484, with the vast majority coming from thecodingmachine/safe, which is expected, as it aims to provide safer wrappers of internal functions. Excluding these, there are only 24, which is surprisingly little. Because it seems the impact is acceptable, I will be proposing this for PHP 9. I will think about ways to provide a migration path. I will also ping internals, to see if this would break any use-cases.

@javiereguiluz
Copy link
Contributor Author

Let's close this issue because the related PR #14529 was closed as "won't fix". Thanks!

@iluuu1994
Copy link
Member

iluuu1994 commented Dec 5, 2024

@javiereguiluz There is some hope for this still. @arnaud-lb is thinking about the concept of modules, which is a collection of files compiled at once, reserving some namespace for itself. If during compilation it is evident that some local function (e.g. \App\strlen()) does not exist, it becomes impossible to register it afterwards, and thus we can assume that the symbol refers to a global function. This would also enable more optimizations in general. That said, it's also significantly more complex and requires an intervention from the user (converting their code-base to modules). But it might be a good alternative.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants