
Rich LLM logging #4866

Merged · 7 commits · Apr 8, 2025
Conversation

owtaylor (Contributor)

Description

Instead of treating LLM logging as simply writing lines of text sequentially, emit a stream of typed data objects that include rich information about the interaction, such as timestamps, message types, tool calls, etc.

These are then collected by a new custom webview in the vscode extension, allowing a user (developer) to browse through the different interactions and view them in detail. In-progress interactions are now streamed live, instead of being held back to avoid concurrency issues.

There is a file logging implementation on top of this that includes the rich information and tries to be prettier and easier to read than the old logging; this is used for the binary, and hence for JetBrains, for now, though exposing the rich webview in JetBrains could be done too.
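
To give a flavor of the approach, here is a minimal sketch of what a typed log stream can look like. The names and shapes below are illustrative only; the actual interface lives in core/index.d.ts and may differ.

```typescript
// Hypothetical item shapes for a typed LLM log stream; the real
// definitions live in core/index.d.ts and may differ.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

type LLMInteractionItem =
  | { kind: "start"; interactionId: string; timestamp: number; messages: ChatMessage[] }
  | { kind: "chunk"; interactionId: string; timestamp: number; text: string }
  | { kind: "toolCall"; interactionId: string; timestamp: number; name: string; arguments: string }
  | { kind: "end"; interactionId: string; timestamp: number; error?: string };
```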

Checklist

  • The relevant docs, if any, have been updated or created - I haven't done this yet - this could be done as a follow-up PR or I can update this PR.
  • The relevant tests, if any, have been updated or created - this shouldn't have any effect on tests.

Screenshots

Screencast.From.2025-03-27.16-03-32.webm


Testing instructions

  • Enable the "Enable Model Logging" preference
  • Switch to the "LLM Log" view in the bottom panel

Review guide and notes

This is a rather large set of changes; some hints:

  • The interface to the new logging can be found in core/index.d.ts
  • The main non-trivial change to existing code is to BaseLLM in core/llm/index.ts - this is where the actual logging is added. There is a lot of boilerplate that is replicated between streamChat, streamComplete, etc.; I have some ideas for how this could be cleaned up, but it would make this patch less straightforward to review. One subtle thing in the changes here is the addition of finally {} blocks to catch cancellation via a call to generator.return() (see the sketch after this list) - this should probably be documented, but that would be easier if it weren't duplicated in a bunch of places.
  • logFormatter.ts and logFormatter.test.ts are pretty big and somewhat complex, but don't have much relevance to the overall code-flow - they just adapt the new log interface into a stream.
  • Again the code in gui/src/components/llmLog/* is rather big, but it's "just" the react GUI. If you want to understand how it works, gui/src/hooks/useLLMLog.ts is a good place to start - it handles converting the linear stream of log items into state that can be efficiently rendered.
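
For reviewers unfamiliar with the generator.return() subtlety mentioned above, here is a minimal sketch of the pattern (names hypothetical, not the actual BaseLLM code): when a consumer cancels a stream by calling generator.return(), execution inside the generator resumes as if a return statement replaced the pending yield, so a finally block is the one place guaranteed to run and can emit the end-of-interaction log item.

```typescript
// Sketch of the finally-block cancellation pattern, assuming a
// hypothetical log() callback; not the actual BaseLLM implementation.
async function* streamWithLogging(
  upstream: AsyncGenerator<string>,
  log: (item: { kind: "success" | "cancel" }) => void,
): AsyncGenerator<string> {
  let completed = false;
  try {
    for await (const chunk of upstream) {
      yield chunk;
    }
    completed = true;
  } finally {
    // Runs on normal completion, on a thrown error, and - crucially -
    // when the consumer cancels via generator.return().
    log({ kind: completed ? "success" : "cancel" });
  }
}
```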

Other notes

  • If the preference is off, then there's pretty minimal overhead - we create the log items, but they are immediately discarded
  • If the preference is on, but the log view is not visible, then the only thing that happens is that the log items are collected into an array. If someone is editing with autocompletion on for a long time, the memory used by the array could be quite large, because we're saving all the context sent on every keystroke. I suspect that it will be hundreds of MB rather than hundreds of GB, but some measurement or exact calculation would be useful. If this ends up being a problem, saving only the last 100 LLM interactions or something like that is probably the right approach (see the sketch after this list). Converting to JSON and compressing would also be possible.
  • Performance when the log view is visible is going to be slower, but I don't think there's any gross inefficiency. One thing that could definitely be done is batching updates - only sending a bunch of items from the core to the log view GUI every 0.1s or something. If the history is very large, switching between the log view and other views might be slow since retainContextWhenHidden is not set (to reduce overhead when hidden). Again, limiting the amount of history saved would help.
  • This reverts "Citations made visible in the output for Perplexity Sonar research models" #4511 (cc: @ferenci84), because citations are not plumbed through the type system sufficiently to put them into the new log stream. Once citation support is more developed, it will be easy to add it here.
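
As a concrete illustration of the history-limiting idea above, here is one possible shape for a bounded interaction buffer (a later commit in this PR limits retention to the 50 most recent interactions; the class and names below are hypothetical, not the actual implementation):

```typescript
// Hypothetical bounded buffer for retained LLM interactions.
class InteractionHistory<T> {
  private interactions: T[] = [];

  constructor(private readonly limit = 50) {}

  push(interaction: T): void {
    this.interactions.push(interaction);
    // Drop the oldest entries once the cap is exceeded, so memory
    // stays bounded no matter how long the editing session runs.
    if (this.interactions.length > this.limit) {
      this.interactions.splice(0, this.interactions.length - this.limit);
    }
  }

  get all(): readonly T[] {
    return this.interactions;
  }
}
```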

Possible future extensions

  • Menu items to save an interaction in text form and/or as a raw JSON dump.
  • Some representation of timestamps within the output - tooltips, or annotations at the edge, for example
  • Include the role that the model is being used for - edit/chat/etc.
  • Optionally log and include embedding model calls used for indexing
  • Tricky: when streamFim is being used for autocompletion, visualize the effect that filtering has on the stream - show what gets filtered out.

@owtaylor owtaylor requested a review from a team as a code owner March 27, 2025 20:07
@owtaylor owtaylor requested review from RomneyDa and removed request for a team March 27, 2025 20:07

netlify bot commented Mar 27, 2025

Deploy Preview for continuedev ready!

🔨 Latest commit: 6e54e7f
🔍 Latest deploy log: https://app.netlify.com/sites/continuedev/deploys/67f42f69c7030a000718759f
😎 Deploy Preview: https://deploy-preview-4866--continuedev.netlify.app

@RomneyDa (Collaborator) left a comment

@owtaylor just pair-reviewed with @Patrick-Erichsen - this is a sweet addition. The code is nicely integrated, and we agreed with almost all decisions.

  • Regarding including the llmLogger react app code in gui: we might move some files around as nitpicks later, but overall we think you made the right decisions/compromises here - it's convenient to have the same build output location for build-script copying, etc.
    - As you mentioned, it's not too bad to expose the JetBrains log webview if we want, and it's not worth doing both for a large experimental feature immediately, especially with the opt-in behavior and the fact that this improves default logging for JetBrains anyways.

@Patrick-Erichsen and I will pull and smoke test a bit more; then this should be good to merge.

@RomneyDa (Collaborator) left a comment

@owtaylor some second thoughts: I would agree with your concerns about memory - could you limit the logs array to maybe tens of thousands of items to prevent memory from being a concern? Maybe start shifting at that limit.

Also, less important, but could you do a quick review of whether using HeroIcons is reasonable rather than @vscode/codicons? Not a dealbreaker, but they should be cross-platform, and perhaps the value of matching the sidebar icon set (and not using VS Code icons in JetBrains in the future) >= the value of matching native VS Code icons. I didn't look too much into it, so if the hero icons are super lacking, don't worry about it.

@Patrick-Erichsen (Collaborator) left a comment

This is an excellent implementation! Playing with this locally was really fun. I feel like this is quickly going to become something we wonder how we lived without for so long 😅

+1 to the feedback Dallin had above; I went ahead and approved anyway, since I think it's mergeable as-is.

A few minor UI nitpicks:

  • Chatted with @TyDunn on the name for the tab - can we call it "Continue Console" rather than "LLM Logs"?
  • Could we use the phrase "Autocomplete" and "Edit" rather than "Fim" and "Completion"? Mostly to align with terminology in our docs.
Screenshot 2025-03-31 at 6 32 15 PM

Lastly - regarding the failing tests, we still have some issues with flake, so whenever you're ready to merge this just let me know and I'll re-run them til they pass.

@owtaylor (Contributor, Author) commented Apr 1, 2025

Thanks @RomneyDa and @Patrick-Erichsen for the review!!

@owtaylor some second thoughts: I would agree with your concerns about memory - could you limit the logs array to maybe tens of thousands of items to prevent memory from being a concern? Maybe start shifting at that limit.

I'll come up with an additional commit to add a limit - probably limiting the number of interactions rather than the number of items, since that will be easier for communication with the view.

Also, less important, but could you do a quick review of whether using HeroIcons is reasonable rather than @vscode/codicons? Not a dealbreaker, but they should be cross-platform, and perhaps the value of matching the sidebar icon set (and not using VS Code icons in JetBrains in the future) >= the value of matching native VS Code icons. I didn't look too much into it, so if the hero icons are super lacking, don't worry about it.

Will do.

Chatted with @TyDunn on the name for the tab - can we call it "Continue Console" rather than "LLM Logs"?

Sure - "LLM Log" has some advantage of compactness and avoiding overflow of the panel tabs, but the leap from LLM => Continue is likely not obvious to all users.

I suppose then "continue.enableModelLogging" changes to "continue.enableConsole" and so forth. I'll take a stab at it!

Could we use the phrase "Autocomplete" and "Edit" rather than "Fim" and "Completion"? Mostly to align with terminology in our docs.

Hmm, @shermanhuman made a suggestion that what should appear in the list is the role rather than the "method" - I think that makes a lot of sense and would align things with the terminology you want, but it does require the role to be passed through to the ILLM methods (probably in the options) - OK if I look at that as a follow-up?

@Patrick-Erichsen (Collaborator)

Sure - "LLM Log" has some advantage of compactness and avoiding overflow of the panel tabs, but the leap from LLM => Continue is likely not obvious to all users.

That was essentially the concern, e.g. a user seeing the tab and being unsure where it came from. Maybe a minor concern since it's opt-in, but we're about the same length as DEBUG CONSOLE 😁

Hmm, @shermanhuman made a suggestion that what should appear in the list is the role rather than the "method" - I think that makes a lot of sense and would align things with the terminology you want, but it does require the role to be passed through to the ILLM methods (probably in the options) - OK if I look at that as a follow-up?

Definitely, I'm okay with that being a follow up 👍

@owtaylor (Contributor, Author) commented Apr 1, 2025

This is a good callout, totally okay to leave it as continueLogView.

Too late :-) - using Console in the code did fix some awkwardness of llmLog as an identifier/filename component.

New version:

  • View renamed to Console, corresponding changes to filenames/configuration settings/commands
  • Switched to heroicons
  • Limited to 50 most recent interactions
  • Rebased over a trivial conflict

I think that addresses everything noted.

Lastly - regarding the failing tests, we still have some issues with flake, so whenever you're ready to merge this just let me know and I'll re-run them til they pass.

Let's see how this does. The failure in the last run had an odd screenshot - the sidebar ended up in the panel. I don't think it had anything to do with this PR, because the setting for the Log (console) view will be off by default.

@Patrick-Erichsen (Collaborator)

Hmmm, re-running the tests hit what looks like the same issue:

Keyboard Shortcuts Fresh VS Code window → sidebar closed → cmd+L with no code highlighted → opens sidebar and focuses input → cmd+L closes sidebar

In the e2e tests we do manually move the extension to the right sidebar: https://github.com/continuedev/continue/blob/main/extensions/vscode/e2e/actions/GUI.actions.ts#L15

I'm wondering if some of the iframe logic there has an assumption that there will only be a single iframe: https://github.com/continuedev/continue/blob/main/extensions/vscode/e2e/actions/GUI.actions.ts#L32

If you're able to run the e2e tests locally, I would just bump the timeout for that test to some arbitrarily long number (100000000) and that will allow you to poke around the IDE during the test to see if the feature is enabled or not.

Feel free to timebox your troubleshooting to a reasonable length and then pass it over to us, the e2e tests are a bit tricky so I'm sure we'll be able to figure it out if it feels like you're going down a rabbit hole.

@owtaylor (Contributor, Author) commented Apr 7, 2025

In the e2e tests we do manually move the extension to the right sidebar: https://github.com/continuedev/continue/blob/main/extensions/vscode/e2e/actions/GUI.actions.ts#L15

The problem with moving the extension to the right sidebar is that the second quick pick for "View: Move View" lists all View Containers, including the empty "Continue Log" one; this upsets the hardcoded index of 14 (which is the Copilot Chat view container?).

Changing selectQuickPick(4) => selectQuickPick("Continue") and selectQuickPick(14) => selectQuickPick("New Secondary Side Bar Entry") fixes the sidebar move, and seems to make the "GUI" and "KeyboardShortcuts" tests work locally for me. I'll include that in my branch and fix the conflicts.
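
For context, here is a rough sketch of that fix, assuming the e2e helpers in GUI.actions.ts use vscode-extension-tester's InputBox, whose selectQuickPick accepts either an index or a label; selecting by label survives reordering when new view containers appear. The function name below is illustrative.

```typescript
import { InputBox, Workbench } from "vscode-extension-tester";

// Illustrative helper: move the Continue view to the secondary sidebar
// by selecting quick picks by label instead of by position.
async function moveContinueToSecondarySidebar(): Promise<void> {
  await new Workbench().executeCommand("View: Move View");
  const input = await InputBox.create();
  await input.selectQuickPick("Continue"); // was selectQuickPick(4)
  await input.selectQuickPick("New Secondary Side Bar Entry"); // was selectQuickPick(14)
}
```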

owtaylor added 6 commits April 7, 2025 14:04
When we catch/log/rethrow AbortError, throw it as itself, rather than
throw Error(e.message), so that outer callers can detect it by the
exception name.

We can't add citations to the rich log until we have them plumbed
through with the correct data types. That will happen when they
get added to the UI. Since the only appearance so far was in the
old text log, just revert the addition for now.

This reverts commit 6bc0bfc.
Instead of treating LLM logging as simply writing lines of text
sequentially, emit a stream of typed data objects that include
rich information about the interaction, such as timestamps,
message types, tool calls, etc.

These are then collected by a new custom webview in the vscode
extension (the "Continue Console"), allowing a user or developer
to browse through the different interactions and view them in detail.
In progress interactions are now streamed live, instead of being held back
to avoid concurrency issues.

This webview could be also exposed for IntelliJ or file logging
could be reimplemented on top of the new framework.

Add a streaming implementation of rich LLM logging to use
if we don't have the webview connected; it converts the
LLMInteractionItems into a human-readable stream with
timestamps, handling of overlapping interactions, and so
forth.

Use this for the binary.

Only retain and display the 50 most recent interactions; this prevents
the stored items from growing without bound if the user is using
the vscode instance to edit for a long time.

Selecting the 14th view container ran into problems when an
(empty) Continue Console view container was added. This also
changes it to move to new secondary sidebar container rather than
reusing one from Copilot.
@owtaylor (Contributor, Author) commented Apr 7, 2025

The failure on this run:

 Installing extensions...
Error while installing extensions: AggregateError [ETIMEDOUT]: 
AggregateError [ETIMEDOUT]: 
Error: Command failed: ELECTRON_RUN_AS_NODE=1 "/home/runner/work/continue/continue/extensions/vscode/e2e/storage/VSCode-linux-x64/code" "/home/runner/work/continue/continue/extensions/vscode/e2e/storage/VSCode-linux-x64/resources/app/out/cli.js" --force --install-extension "ms-vscode-remote.remote-containers" --extensions-dir=/home/runner/work/continue/continue/extensions/vscode/e2e/.test-extensions

is actually a flake.

@Patrick-Erichsen (Collaborator)

@tomasz-stefaniak will hopefully be getting to the bottom of the test flake this week

@Patrick-Erichsen (Collaborator)

All tests passing!

25ee72e#diff-a6d8ec1e9272f89aa61d2e14627eef192f6b5ed76d7a6f1647991e56d9ff5ac4R21

Thanks for cleaning this up - that was bound to break at some point, so apologies that you were the first one to have to deal with it.

--

I think we're good to merge here, if you're ready as well let's hit the button 👍

@Patrick-Erichsen (Collaborator)

Handled a quick merge conflict that just arose

@owtaylor (Contributor, Author) commented Apr 7, 2025

I think we're good to merge here, if you're ready as well let's hit the button 👍

Good to go from my perspective - I'll keep an eye out for any problems that are reported (or let me know) and will look at adding some enhancements like showing the role over the next few weeks.

@RomneyDa (Collaborator) commented Apr 8, 2025

@owtaylor thanks for cleaning things up and working in the requested changes! Looks great. Merging.

@RomneyDa merged commit 1ad3d7f into continuedev:main on Apr 8, 2025 - 29 checks passed