Skip to content

Improve performance of MemoryEfficientLongestCommonSubsequenceCalculator #118

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 1, 2023

Conversation

staabm
Copy link
Contributor

@staabm staabm commented May 1, 2023

when creating a diff of a file with 5000 lines we can see MemoryEfficientLongestCommonSubsequenceCalculator to bottleneck on max(). I think the main problem is the function call overhead, as its invoked ~43.000.000 times.

grafik

with this change we just use plain php, which improves our workload in rector by 1 minute:

grafik

refs rectorphp/rector#7899

@sebastianbergmann sebastianbergmann merged commit 792812d into sebastianbergmann:main May 1, 2023
@staabm staabm deleted the patch-1 branch May 1, 2023 07:16
@sebastianbergmann sebastianbergmann changed the title Perf optimize MemoryEfficientLongestCommonSubsequenceCalculator Improve performance of MemoryEfficientLongestCommonSubsequenceCalculator May 1, 2023
@staabm
Copy link
Contributor Author

staabm commented May 1, 2023

thank you. I will have another look into TimeEfficientLongestCommonSubsequenceCalculator to check whether there is the same problem

@BafS
Copy link

BafS commented May 1, 2023

Is this also the case for the v4? Many of use are stuck with PHPUnit 9 which requires diff ^4, can I do a PR for the v4?

@staabm
Copy link
Contributor Author

staabm commented May 1, 2023

I think v4 is also affected. Sebastian needs to decide whether its worth backporting.

@sebastianbergmann
Copy link
Owner

Sure, I would consider a PR that backports these performance improvements to ^4.

@bwoebi
Copy link

bwoebi commented May 2, 2023

While in this case the performance fix is justified:

~/php-src-8.2 # time ./sapi/cli/php -r '$b = 1; $c = 2; for ($i = 0; $i < 430000000; ++$i) { $a = max($b, $c); }'

real	0m6.924s
user	0m6.917s
sys	0m0.004s
~/php-src-8.2 # time ./sapi/cli/php -r '$b = 1; $c = 2; for ($i = 0; $i < 430000000; ++$i) { if ($b < $c) { $a = $b; } else { $a = $c; } }'

real	0m2.523s
user	0m2.514s
sys	0m0.008s

It does not nearly have the practical effect shown in the profiler. On a standard 8.2 it is approximatively 3x as expensive (the empty loop takes 1.15 seconds on that machine).

A profiler like blackfire traces every single function call and thus has a comparatively high overhead.

On the PHP side a simple patch for max() to compare integers inexpensively would reduce the overhead to 2x as expensive:

diff --git a/ext/standard/array.c b/ext/standard/array.c
index 8a74704967..7d42a6ade2 100644
--- a/ext/standard/array.c
+++ b/ext/standard/array.c
@@ -1275,11 +1275,28 @@ PHP_FUNCTION(max)
                int i;
 
                max = &args[0];
+               if (Z_TYPE_P(max) == IS_LONG) {
+                       zend_long max_lval = Z_LVAL_P(max);
 
-               for (i = 1; i < argc; i++) {
-                       is_smaller_or_equal_function(&result, &args[i], max);
-                       if (Z_TYPE(result) == IS_FALSE) {
-                               max = &args[i];
+                       for (i = 1; i < argc; i++) {
+                               if (EXPECTED(Z_TYPE(args[i]) == IS_LONG)) {
+                                       if (max_lval < Z_LVAL(args[i])) {
+                                               max_lval = Z_LVAL(args[i]);
+                                               max = &args[i];
+                                       }
+                               } else {
+                                       goto non_lval;
+                               }
+                       }
+
+                       RETURN_LONG(max_lval);
+               } else {
+                       for (i = 1; i < argc; i++) {
+non_lval:
+                               is_smaller_or_equal_function(&result, &args[i], max);
+                               if (Z_TYPE(result) == IS_FALSE) {
+                                       max = &args[i];
+                               }
                        }
                }

@staabm
Copy link
Contributor Author

staabm commented May 2, 2023

@bwoebi thank you for getting into the discussion. could you do the necessary changes at php-src level? this would certainly help for a lot of applications arround.

@staabm
Copy link
Contributor Author

staabm commented May 6, 2023

to make sure we don't loose bwoebis very valuable comment, I went ahead and created a php-src issue: php/php-src#11192

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants