-
Notifications
You must be signed in to change notification settings - Fork 6.1k
8360023: Add an insertion sort implementation to Hotspot #25895
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
👋 Welcome back qamai! A progress list of the required criteria for merging this PR into |
❗ This change is not yet ready to be integrated. |
@merykitty The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
Webrevs
|
GrowableArrayIterator<E, false> ncbegin() { | ||
return GrowableArrayIterator<E, false>(this, 0); | ||
} | ||
|
||
GrowableArrayIterator<E, false> ncend() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think ncbegin
and ncend
are good names. Why not to use begin
and end
?
Also GrowableArrayIterator<E, false>
looks confusing because of the second parameter.
Why not to use something like GrowableArrayConstIterator
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because it would be an incompatible change, there are places where we do things like
GrowableArrayIterator<E> it = array.begin();
I think your latter concern can be addressed by using GrowableArrayNonConstIterator
.
Maybe instead of introducing a generalized version of the insertion sort, you can have a function implementing the insertion sort in the context of JDK-8357186? This specialized function will be much smaller than the PR changes. |
@eastig Thanks a lot for your reviews. Yes a specialized insertion function could work, but a generalized function would be more useful and easier to test. A large part of this change is to modernize |
This is my point. If amount of changes is so small and there is only one use case, why do we need the type independent implementation? |
For whom would it be more useful? Usually generalized versions come from a set of specialized versions. Not vice versa. There is no issue to test a specialized version. |
Firstly, since the generalized function is no more complex than the specialized function, so why not go for the generalized function and save us the troubles generalizing the specialized function if the need arises. Secondly, it is much much easier to test the generalized function. I can easily verify that sorting an array of |
Unfortunately experience teaches us, code written for purpose of future uses is never used as is. If you want the insertion sort, I'd recommend to have it in |
Here the simple implementation which does not require a lot of code: void GrowableArrayView<>::insertion_sort(int f(E*, E*)) {
if (_data == nullptr) return;
for (int i = 1; i < length(); i++) {
E key = _data[i];
int j = i - 1;
while (j >= 0 && f(_data[j], key)) {
_data[j + 1] = _data[j];
j--;
}
_data[j + 1] = key;
}
} |
Just do a
We are programming in C++, I think it would be better to follow the C++ convention. The practical reason is that it prevents users not wanting to sort from having to include the sort functionality.
Now we have |
Note that insertion sort is the most efficient sorting algorithm for small arrays, so we can use it for non-stable sort as well. |
Ok. Do we need to rewrite this code to use a stable sort?
Yes, we use C++ but we use subset of it: https://github.com/openjdk/jdk/blob/master/doc/hotspot-style.md
One issue is the pollution of the global namespace by the name which rarely be used. Another is that this is opposite to what C++ programmers are familiar with: iterator and const_iterator. It's already confusing: users of |
Hi @merykitty , Thank you for taking the effort to produce tooling for everyone when you found a need for it yourself. Often, we have useful datatypes hidden away into internals that we'd like to use, or we simply do other solutions because our preferred solution is missing. Unfortunately, I think that the ceremony required to get your insertion sort working for someone else's type will put other devs off from using it. None, AFAIK, of our datatypes are compatible with the STL's interfaces. If we take this definition:
And change it around a bit:
Then I think we have something general-ish.
This is going to be general enough, for most of our cases we have a contiguous array with a size of some fixed element type which we want to change in-place. This is sufficient for expressing that. This also fits well with the |
It has a potential problem with O(n^2). |
@jdksjolen Thanks for your suggestion. Actually, I can make it so that a I have reverted the |
Cheers! This looks good to me. Let's see what the rest of the community thinks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some drive-by comments.
class InsertionSort : AllStatic { | ||
public: | ||
template <class T, class Compare> | ||
static void sort(T* data, int size, Compare comp) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int size
to size_t size
? or at least unsigned int
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hotspot container usually uses signed int for size. So I think int
here is a sensible choice.
// which the element is not greater than the current element (note that we are traversing | ||
// backward) | ||
T* prev = pos - 1; | ||
if (comp(*prev, current_elem) <= 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: would be better to pass pointers here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A comp
usually receives references. Practically, it is almost the same as receiving pointers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies, I didn't notice it was a reference.
* | ||
*/ | ||
|
||
#include "runtime/os.hpp" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: sort the imports?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have cleaned up the unused import here. What do you mean by sorting the imports?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sort the "#include" lines alphabetically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume you want to have unittest.hpp
above the utilities
files. I have done that. I was confused because the convention in this area is pretty blurry as many files have the unittest.hpp
as their last include.
@merykitty, |
@eastig It is almost the same, isn't it? |
I'm happy to approve it with a few minor changes. |
Thanks for the reviews @theRealAph , I have addressed them. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Hi,
This PR adds an implementation of insertion sort to Hotspot. It is an algorithm that is inplace and stable, and it is the ideal algorithm for arrays with small numbers of elements. The motivation for this is JDK-8357186 in which a stable sort is desired and the number of elements is small. Additionally, since insertion sort is the most efficient sorting algorithm for small arrays, it can be used in non-stable sort as well.
In addition, I make some improvements to
GrowableArrayIterator
:JDK-8360032 is a follow-up work that will build a stable merge-insertion sort on top of this PR.
Please take a look and share your thoughts. Thanks very much.
Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/25895/head:pull/25895
$ git checkout pull/25895
Update a local copy of the PR:
$ git checkout pull/25895
$ git pull https://git.openjdk.org/jdk.git pull/25895/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 25895
View PR using the GUI difftool:
$ git pr show -t 25895
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/25895.diff
Using Webrev
Link to Webrev Comment