-
-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Word completion #13206
base: master
Are you sure you want to change the base?
Word completion #13206
Conversation
This would be super nice. And it is also very useful for people who do not have all their LSPs installed and do not have LSP completion available, as this will help get for example complex variable names right :) . Will this be case sensitive LikeClassNames / include snake_names? :) |
It's also useful in my workflow in multi-language repos: some files might be written in C and others in another language. LSP doesn't cover that case very well. The set of words |
b9e3ee5
to
b7186fe
Compare
`TinyBoxedStr` is a small-string optimized replacement for `Box<str>` styled after <https://cedardb.com/blog/german_strings/>. A nearly identical type is used in helix-editor/spellbook for space savings. This type is specialized for its use-case: * strings are immutable after creation * strings are very very small (less than 256 bytes long) Because of these attributes we can use nearly the full size of the type for the inline representation and also keep a small total size. Many other small string crates in the wild are 3 `usize`s long (same as a regular `String`) to support mutability. This type is more like a `Box<str>` (2 `usize`s long). Mostly I like this small string type though because it's very straightforward to implement. Other than a few functions that reach into `std::alloc` or `std::ptr`, the code is short and boring.
b7186fe
to
9720020
Compare
| `word-completion` | Enable completion of words from open buffers. | `true` | | ||
| `word-completion-trigger-length` | Minimum number of characters required to automatically trigger word completion. | `7` | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this could be encapsulated as
[editor.word-completion]
enable = true
trigger-length = 7
I think word completions shouldn't appear if same item is provided by other superior source. edit: cant seem to upload picture? |
The same problem happens with path completions so I think that should be a seperate pr so we can figure out the best way to deduplicate completions |
I wonder if this same mechanism could (later) be used for Unicode completion using Typst's Codex: https://github.com/typst/codex. In addition to looking for words from the open buffers, they would also be searched from the "dictionary" provided by Codex. Related to #1438. |
Of course a problem might occur, if one actually wants to write |
Just want to say that this is such a nice QoL improvement! Something you really didn't know how much you missed until you had it again after a long absence. In testing, I did feel like 5 was a much better target for trigger length. I even turn to 2-3 when I am writing documentation, it really keeps the flow going with quick action/responses. |
@@ -71,6 +71,7 @@ These configuration keys are available: | |||
| `text-width` | Maximum line length. Used for the `:reflow` command and soft-wrapping if `soft-wrap.wrap-at-text-width` is set, defaults to `editor.text-width` | | |||
| `rulers` | Overrides the `editor.rulers` config key for the language. | | |||
| `path-completion` | Overrides the `editor.path-completion` config key for the language. | | |||
| `word-completion` | Overrides the `editor.word-completion` config key for the language. | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing the override of word-completion-trigger-length
(and should probably include @RoloEdits suggestion of merging under [word-completion] enable; trigger-length;
pub struct TinyBoxedStr { | ||
len: u8, | ||
prefix: [u8; Self::PREFIX_LEN], | ||
trailing: TinyBoxedStrTrailing, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two usize
s is 16 bytes.
This works on average (and just barely) for the .md
files in rust-lang/rust-analyzer
and helix-editor/helix
, but it doesn't for starship/starship
and rust-lang/rust
.
Several good quality Rust crates offer 22-to-24 small strings that would fit a lot more cases and avoid helix reinventing the wheel where it feels unnecessary.
Notably, I don't see any kind of benchmark justifying reimplementing small strings and all their complexity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is already a direct dependency on smartstring
. Should try that one first if we are looking for small string optimized crates.
if let Some(word_completion_request) = | ||
word::completion(editor, trigger, handle.clone(), savepoint) | ||
{ | ||
requests.spawn_blocking(word_completion_request); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file chains the completion requests building for the various type. Each handler also starts by doing mostly the same stuff, getting the cursor, the current line, ...
Maybe not for this MR, but finding a way to properly distribute this in parallel would be good
let words: Vec<_> = words(text.slice(..)).collect(); | ||
let mut inner = self.inner.write(); | ||
for word in words { | ||
inner.insert(word); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need the collect if words()
already return an iterator ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is so that we hold the write lock for only a very short time. I'm not sure it's necessary though as parsing the words out should be quite fast
let words: Vec<_> = words(text.slice(..)).collect(); | ||
let mut inner = self.inner.write(); | ||
for word in words { | ||
inner.remove(word); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
We could consider having a lower trigger length for documentation languages like markdown. Mainly I find word completion noisy/annoying when it competes with LSP completions.
The completion code part of this PR is not really very novel - it should be possible to add any sort of "core" completions now since #2608 did a bunch of refactoring to make it possible. |
Completion of words from open buffers is a feature I miss from Kakoune. The spirit of this implementation is the similar: words are extracted from open buffers and stored in a database (
WordDB
in Kakoune). Then completion reads from the database to fuzzy-find candidates. All reading and writing to the index is done off the main thread and large changes to the index are debounced. Updates to the index examine theChangeSet
which caused the change and only consider windows of the text around eachOperation::Insert
andOperation::Delete
.Note that this doesn't follow Pascal's idea for using Aho-Corasick on the current buffer (comment) because I would like words from all documents. Word completion is especially useful when taking notes - for example one window has code I'm reading and the other window has a markdown file for notes - and being able to quickly complete words from the other buffer(s) is really useful for that workflow.
This PR also adds a custom small string type based on what I use in spellbook. I like it because it's simple and specialized for the use case (so, very compact). I'm not very attached to it though and we could slot in some other type. I think we already have a transitive dependency on KString, for one option.
Closes #1063