-
Notifications
You must be signed in to change notification settings - Fork 100
Add -M/--find-renames option and blame.renames config to control rename detection #755
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: vfs-2.49.0
Are you sure you want to change the base?
Conversation
Co-authored-by: dscho <[email protected]>
git blame
in a repository cloned via VFS for GitThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we have to abandon the idea of -M
because that already exists (as you found out) but already means something else (yay consistency of Git's user interface).
-M[<num>]:: | ||
--find-renames[=<num>]:: | ||
Detect moved or copied lines within a file. When a commit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, you cannot do that. The existing -M
option is not about whole-file rename detection, therefore we cannot reuse that, and my instructions regarding -M
are moot.
You have to introduce a new --find-renames
option instead.
blame.c
Outdated
@@ -1423,10 +1423,17 @@ static struct blame_origin *find_rename(struct repository *r, | |||
struct blame_origin *porigin = NULL; | |||
struct diff_options diff_opts; | |||
int i; | |||
extern int rename_detection_mode; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, this won't work. You declared it as static
in builtin/blame.c
, which makes it file-local, and you cannot ever see it from this here file.
Instead, you have to introduce a new attribute in blame.h
, probably in struct blame_scoreboard
next to the xdl_opts
(which are also diff-related).
blame.c
Outdated
diff_opts.detect_rename = DIFF_DETECT_RENAME; | ||
/* | ||
* Use rename_detection_mode if specified, otherwise default to DIFF_DETECT_RENAME | ||
* For mode values > 0 and < 100, use it as similarity threshold | ||
*/ | ||
diff_opts.detect_rename = (rename_detection_mode == 0) ? 0 : | ||
(rename_detection_mode > 0) ? | ||
rename_detection_mode : DIFF_DETECT_RENAME; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the only place where you actually use rename_detection_mode
, which you described elsewhere as a percentage. However, it is used exclusively as a Boolean. You have to study diff*.c
harder to see how the --find-renames
value is used there and imitate it here.
t/t8015-blame-rename-detection.sh
Outdated
. ./test-lib.sh | ||
|
||
test_expect_success 'setup test file rename with content changes' ' | ||
git init && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do study other t/t-[0-9]*-*blame*.sh
scripts. git init
is not necessary.
t/t8015-blame-rename-detection.sh
Outdated
echo abc >1.txt && | ||
echo def >>1.txt && | ||
echo ghi >>1.txt && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use test_write_lines
instead.
t/t8015-blame-rename-detection.sh
Outdated
echo abc >1.txt && | ||
echo def >>1.txt && | ||
echo ghi >>1.txt && | ||
git add . && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.
is sloppy. Specify 1.txt
explicitly.
echo def >>1.txt && | ||
echo ghi >>1.txt && | ||
git add . && | ||
git commit -m "Initial commit" && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to prefix this with test_tick
to make the OIDs reproducible.
t/t8015-blame-rename-detection.sh
Outdated
|
||
# This test confirms that by default, git blame follows partial-file renames | ||
test_expect_success 'git blame follows inexact renames by default' ' | ||
FIXED_1=$(git rev-parse --short HEAD^) && | ||
FIXED_2=$(git rev-parse --short HEAD) && | ||
|
||
git blame 2.txt >output && | ||
grep "$FIXED_1" output | grep -q abc && | ||
grep "$FIXED_2" output | grep -q 123 && | ||
grep "$FIXED_1" output | grep -q ghi | ||
' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is too focused, and unnecessary. We really only need to verify that 1.txt
is mentioned only when the rename-detection score is high enough.
t/t8015-blame-rename-detection.sh
Outdated
# This test confirms that --no-find-renames or -M0 turns off rename detection | ||
test_expect_success 'git blame can disable rename detection' ' | ||
git blame --no-find-renames 2.txt >output && | ||
! grep -q 1.txt output | ||
' | ||
|
||
# This test confirms that -M100 only follows exact renames | ||
test_expect_success 'git blame can restrict to exact renames' ' | ||
git blame -M100 2.txt >output && | ||
! grep -q 1.txt output | ||
' | ||
|
||
# This test checks that blame.renames config works | ||
test_expect_success 'blame.renames=false disables rename detection' ' | ||
git -c blame.renames=false blame 2.txt >output && | ||
! grep -q 1.txt output | ||
' | ||
|
||
# This test checks that -M with a score works | ||
test_expect_success 'git blame with similarity score follows renames above threshold' ' | ||
# Must follow 1.txt->2.txt rename for abc which are identical | ||
git blame -M70 2.txt >output && | ||
grep "$FIXED_1" output | grep -q abc && | ||
# Should not follow for others below threshold | ||
grep "$FIXED_2" output | grep -q 123 && | ||
grep "$FIXED_2" output | grep -q ghi | ||
' | ||
|
||
# This test checks that -M overrides blame.renames | ||
test_expect_success '-M overrides blame.renames config' ' | ||
# Using blame.renames=false but -M60 | ||
git -c blame.renames=false blame -M60 2.txt >output && | ||
grep "$FIXED_1" output | grep -q abc && | ||
# The rest would be below 60% threshold | ||
grep "$FIXED_2" output | grep -q 123 && | ||
grep "$FIXED_2" output | grep -q ghi | ||
' | ||
|
||
# This test checks that blame.renames with a score works | ||
test_expect_success 'blame.renames with score controls rename threshold' ' | ||
# Set threshold at 70%, abc is identical so above threshold | ||
git -c blame.renames=70 blame 2.txt >output && | ||
grep "$FIXED_1" output | grep -q abc && | ||
# Other lines below threshold | ||
grep "$FIXED_2" output | grep -q 123 && | ||
grep "$FIXED_2" output | grep -q ghi | ||
' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those are waaaaay too many test cases for the job. All you need to do is to run git blame
with various ways to specify the rename score, and then verify that 1.txt
is either mentioned or not, depending on that score.
In fact, I highly suspect that this single test case that you need to add for this entire PR would find a much nicer home in one of the existing t/t*-blame*.sh
scripts.
Co-authored-by: dscho <[email protected]>
Co-authored-by: dscho <[email protected]>
Co-authored-by: dscho <[email protected]>
When running
git blame
on large repositories, the automatic rename detection can cause performance issues because it unnecessarily compares blob contents to follow inexact renames. This PR adds options to control this behavior:Added new features:
Command line options to control rename detection in
git blame
:-M[<n>]
/--find-renames[=<n>]
to specify the similarity threshold-M0
or--no-find-renames
turns off rename detection completely-M100
limits detection to only exact renames (identical blob content)New configuration option
blame.renames
that can be set to:true
or1
- follow renames with default similarity index (50%)false
or0
- disable rename detection entirelycopy
- detect copies as well as renamesDocumentation improvements:
git-blame.adoc
-M/--find-renames
option inblame-options.adoc
blame.renames
config inconfig/blame.adoc
Testing:
t/t8015-blame-rename-detection.sh
that verifies the new functionalityExample:
Fixes #753.
Warning
Firewall rules blocked me from connecting to one or more addresses
I tried to connect to the following addresses, but was blocked by firewall rules:
cdn.fwupd.org
/usr/bin/fwupdmgr refresh
(dns block)If you need me to access, download, or install something from one of these locations, you can either:
💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.