
Adding function information to the /render page #134

Open
wants to merge 45 commits into
base: main

AlonBilman

Hey Tamir!

We've added function info to the render page - it updates only when fetching from GitHub (via URL).
Currently, most of the code is in App.svelte.
Here are some example links to test locally (port 5173):
1, 2, 3, 4, 5, 6, 7, 8
We're looking forward to your feedback and hope it meets your standards.

@tmr232 tmr232 changed the title Issue#94 Adding function information to the /render page Mar 19, 2025
@tmr232 tmr232 self-requested a review March 19, 2025 18:36
Owner

@tmr232 tmr232 left a comment

Hi Alon,

Thanks for your contribution!

I went over it, and there are a few things that need more work.
There's a bit of code to change and move around, and tests that need to be written.

Additionally, I think there are visual-design changes required:

  • The same color is currently used for both the dark and light themes, and it only partly works with either. I think it is better to use different colors for each theme. This is especially important for the light theme, which currently has very low contrast between text and background.
  • As this is a programming tool, and we're presenting data, I think a monospaced font will be a better fit.
  • There is currently a lot of spacing in the display, both between the lines and between the labels and the info. I think it should be tightened.
  • I think the text should be selectable. There is no reason to prevent selection.
  • I think it'll be better for the data to be presented under the controls and not on the opposite side.
  • Maybe it's worth adding the project and file path as well as the function name?

image
image

If you have any disagreement regarding the visual side - please let me know!

Last but not least - please make sure you run the linters (bun lint) and they pass before submitting your code. Note that this is critical, and in most cases I would not review code that does not pass the automated checks.

@AlonBilman
Author

Hey Tamir!

First of all, thank you for reviewing the code!

Regarding linting:
When running bun lint, it modifies the entire codebase, resulting in 100+ changes.
Is there a way to apply the linter to a specific file? Running bun lint <file path> also seems to modify everything.

Secondly, in CONTRIBUTING.md, there is a mention of the command bun check but I couldn’t get it to work. It says there is no script called "check."
Could you please clarify how to run this command correctly? - I believe this is the reason the code didn't pass the automation.

Regarding the visual aspects: any preferences you have will be applied.

Of course, all your comments will be addressed, fixed, and modified accordingly.

Thanks again for the review! Changes are on the way :)

@tmr232
Owner

tmr232 commented Mar 20, 2025

Regarding linting: When running bun lint, it modifies the entire codebase, resulting in 100+ changes. Is there a way to apply the linter to a specific file? Running bun lint <file path> also seems to modify everything.

All linters must pass for all files before the code can be merged. You can see the failing checks at https://github.com/tmr232/function-graph-overview/actions/runs/13949594615/job/39061318568?pr=134
If you're getting a large number of changes, especially in files you did not edit yourself, it is likely that your tools are not up to date as I recently replaced prettier with biome. Try running bun install again and then bun lint.

Secondly, in CONTRIBUTING.md, there is a mention of the command bun check but I couldn’t get it to work. It says there is no script called "check." Could you please clarify how to run this command correctly? - I believe this is the reason the code didn't pass the automation.

That's my bad - I changed the command from bun check to bun lint and forgot to update the docs.

@AlonBilman AlonBilman closed this Mar 23, 2025
@AlonBilman AlonBilman deleted the issue#94 branch March 23, 2025 13:50
@AlonBilman AlonBilman restored the issue#94 branch March 23, 2025 14:04
@tmr232 tmr232 reopened this Mar 23, 2025
@Bennahmias

Hey Tamir,

I saw that you changed the test framework from bun:test to vitest, and updated only the src/parser-loader/bun.ts file — but not src/parser-loader/vite.ts.

I wanted to ask if you're planning to update vite.ts as well, because in src/render/src/App.svelte we’re importing initParsers and iterFunctions from file-parsing/vite.ts.

In our tests, we also import from file-parsing/vite.ts, and it’s currently failing because of the wasm path issue. When we import from file-parsing/bun.ts, it works.

Just wanted to check if you're planning to handle this too. Thanks!

@tmr232
Owner

tmr232 commented Apr 24, 2025

I guess the recent changes make the naming less clear...
vite.js is for the built files (the extension and the web version), bun.js is for scripts and tests.
I might merge those into a single import at some point, but don't know when that'll be.

Used a direct Tree-sitter query
Still need to figure out a way to do the same for Go; I couldn't find a generic approach yet.
# Conflicts:
#	src/test/functionDetails.test.ts
…pdate usages

test: add test for function expression with direct name in TypeScript
…ach (except for C++)

fix: Go name extraction now works correctly and without bugs

all name extraction logic is now working well
Updated `updateMetadata` to support new logic
We should note that I'm still fetching twice from GitHub..
@AlonBilman AlonBilman requested a review from tmr232 May 10, 2025 09:56
@AlonBilman
Author

Hi Tamir,

Here’s what we’ve done so far:

First, we added support for displaying both the CFG metadata and the function name in GitHub, as well as the CFG metadata in the Graph.

We used Tree-sitter queries to improve the extraction process (thanks for the suggestion).

The code for extracting function names is spread across src/control-flow/, where each cfg-<language>.ts file handles its own logic.

We also kept function-utils, which handles shared logic and dispatching based on language.

About conflicts in function names:
When Tree-sitter queries return multiple matches, we treat it as a conflict and return <unsupported>. This is handled by extractTaggedValueFromTreeSitterQuery.
(These cases are hard to predict and usually don’t result in a valid function name.)

We added 35 tests for all the scenarios we came up with, covering different languages and function types.

On the frontend side:
In App.svelte, we updated the updateMetadata function to support Graph as well, and now we're extracting the metadata based on the render type.

(Just a note: there’s currently nothing in the README that explains how to use Graph on the render page)

When the type is GitHub, we’re fetching the data twice...
Obviously, that’s not ideal. How would you handle it better?

A GIF that summarizes the changes:
render-vid_

@AlonBilman
Author

Also, when I commit my changes, should I also include the snapshot updates that were generated by the tests?

@tmr232
Owner

tmr232 commented May 10, 2025

Also, when I commit my changes, should I also include the snapshot updates that were generated by the tests?

Are you seeing any changes beyond line-ending changes in the snapshots?

@AlonBilman
Author

Also, when I commit my changes, should I also include the snapshot updates that were generated by the tests?

Are you seeing any changes beyond line-ending changes in the snapshots?

Oh, this is weird. Earlier this week, I saw a lot of changes in the snapshot files, but now, when I run the tests, it only changes the end-of-line characters, as you mentioned.
It wasn't like that - you can see it in this commit, where I deleted the changes manually.

But for now, I guess it's alright.

Owner

@tmr232 tmr232 left a comment

Only reviewed the name extraction logic, not the UI part.

There are some questions, but also some bugs. I'm pushing a couple of tests that demonstrate them.

@@ -166,3 +166,17 @@ function processSwitchlike(switchSyntax: SyntaxNode, ctx: Context): BasicBlock {

  return blockHandler.update({ entry: headNode, exit: mergeNode });
}

const nodeType = {
Owner

What is this constant for? Why not use the names directly?

Author

I'll modify it


export function extractCFunctionName(func: SyntaxNode): string | undefined {
  if (func.type === nodeType.functionDefinition) {
    return func.descendantsOfType(nodeType.identifier)[0]?.text;
Owner

Is there a case where this really is undefined, or is it due to the weird typing in the current web-tree-sitter library? If it's the weird types, we have treeSitterNoNullNodes to deal with that in an "easy to find later" way.

  tag: string,
): string | undefined {
  const language = func.tree.language;
  const queryObj = new Query(language, query);
Owner

I'm pretty sure we can accidentally get a nested function here.
Also, there's an entire query mechanism that we can use.

  func: SyntaxNode,
  language: Language,
): string | undefined {
  switch (language) {
Owner

Better to use a mapping to easily enforce all languages via typing.
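
A minimal sketch of what such a mapping could look like. The `Language` union, the `FakeNode` shape, and the per-language extractors below are hypothetical stand-ins for illustration, not the project's real definitions; the point is only that `Record<Language, ...>` forces an entry for every language.

```typescript
// Hypothetical stand-in: the real `Language` type is defined in the project.
type Language = "C" | "Go" | "Python";

// Simplified node shape for this sketch; the project uses web-tree-sitter nodes.
// `text` stands in for an already-extracted name.
interface FakeNode {
  type: string;
  text: string;
}

type NameExtractor = (func: FakeNode) => string | undefined;

// Record<Language, ...> requires a key for every member of the union, so adding
// a language without registering an extractor becomes a compile-time error.
const nameExtractors: Record<Language, NameExtractor> = {
  C: (func) => (func.type === "function_definition" ? func.text : undefined),
  Go: (func) => (func.type === "function_declaration" ? func.text : undefined),
  Python: (func) => (func.type === "function_definition" ? func.text : undefined),
};

// Dispatch by lookup instead of a switch statement.
function extractFunctionName(
  func: FakeNode,
  language: Language,
): string | undefined {
  return nameExtractors[language](func);
}
```

With a `switch`, a forgotten language silently falls through to a default; with the mapping, the type checker reports the missing key.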

@@ -0,0 +1,83 @@
import { Query, type Node as SyntaxNode } from "web-tree-sitter";
import type { Language } from "./cfg";
import { extractCFunctionName } from "./cfg-c.ts";
Owner

This leads to a cyclic dependency. Honestly - I'm surprised this works at all. In any case - it's better not to have cycles.

  tag: "var.name",
};

const assignmentQueryAndTag = {
Owner

All three queries can be unified with an alternation in the query.
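
For illustration, Tree-sitter queries express alternation with square brackets, so several patterns can share one capture name. The sketch below is an assumption about how the unified query might look: the node and field names are loosely modelled on the tree-sitter JavaScript/TypeScript grammars and have not been checked against the grammar version used here, and `@name` is a placeholder capture rather than the PR's actual tag.

```typescript
// Assumed node names; some grammar versions call the function-expression node
// something other than `function_expression`.
const functionValue = "[(arrow_function) (function_expression)]";

// One query with a top-level alternation instead of three separate queries.
// Every branch captures the bound identifier under the same @name capture.
const unifiedNameQuery = `
[
  (lexical_declaration
    (variable_declarator name: (identifier) @name value: ${functionValue}))
  (variable_declaration
    (variable_declarator name: (identifier) @name value: ${functionValue}))
  (assignment_expression left: (identifier) @name right: ${functionValue})
]
`;
```

A single query means a single capture to read, which also removes the need to try the three query objects one after another.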

    return extractNameByNodeType(func, nodeType.fieldIdentifier);
  case nodeType.funcLiteral:
    // Check if the func_literal is assigned to a variable or is a standalone function
    return (
Owner

It is important to limit the depth of a query, so that they properly ignore nesting.
Consider:

func main() {
	var x = func() {
		y := func() {}
		y()
	}
	x()
}

See maxStartDepth (used in block-matcher.ts)
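
To make the nesting problem concrete, here is a toy model of depth-limited matching. It is not the web-tree-sitter API: `ToyNode` and `declaredNames` are hypothetical and exist only to show why an unlimited search over the outer declaration also picks up the inner `y`.

```typescript
// Toy syntax tree; NOT the web-tree-sitter API.
interface ToyNode {
  type: string;
  name?: string; // set on declaration nodes only
  children: ToyNode[];
}

// Collect names of declarations found at depth <= maxStartDepth below `root`
// (depth 0 is `root` itself).
function declaredNames(root: ToyNode, maxStartDepth: number): string[] {
  const names: string[] = [];
  const visit = (node: ToyNode, depth: number): void => {
    if (depth > maxStartDepth) return;
    if (node.type === "declaration" && node.name !== undefined) {
      names.push(node.name);
    }
    for (const child of node.children) visit(child, depth + 1);
  };
  visit(root, 0);
  return names;
}

// Models: var x = func() { y := func() {}; y() }
const inner: ToyNode = {
  type: "declaration",
  name: "y",
  children: [{ type: "func_literal", children: [] }],
};
const outer: ToyNode = {
  type: "declaration",
  name: "x",
  children: [{ type: "func_literal", children: [inner] }],
};

// Without a limit both declarations are found; with depth 0 only the outer one.
const unlimited = declaredNames(outer, Number.POSITIVE_INFINITY);
const limited = declaredNames(outer, 0);
```

Here `unlimited` contains both `x` and `y`, which is the ambiguity described above, while `limited` contains only `x`.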

Author

You're right. I'll check it out and make the changes.


const variableDeclaratorQueryAndTag = {
  query: `
    (lexical_declaration
Owner

What of var definitions, and of definitions without either var, let or const?

*
* @param func - The syntax node to search within.
* @param type - The type of the child node to extract the name from.
* used among all languages (mostly the easy cases of extracting the name).
Owner

This line currently renders as part of the parameter definition, which seems odd to me.

  .map((c) => c.node.text);

if (names.length > 1) {
  return "<unsupported>";
Owner

What are we actually checking here? What is not supported, and why are we treating it differently than other cases where we failed to get a name?

Author

As we discussed before, when I have something like y, x := func() {}, func() {}, the query for extracting the variables returns two results:
(short_var_declaration left: (expression_list (identifier) @name))
It's unclear what the "name" of the function is, because there are two expressions. Since I get two matches for one query, I treat it as a conflict.

How can I handle this better?

Owner

First, adding a code comment that explains this would be helpful.

Second thing - please be consistent. There are more cases where we can't get the name of the function. Why are some just unnamed, and this is <unsupported>?

Author

You're right, I should have included an explanation in the code itself.
Regarding the second thing:
I understand what you're saying. I thought it was different because the extraction succeeded; it just returned more than one name.
In most other cases, I only got a single name, so I could tell if the name extraction had failed.
I could return undefined instead.

Author

Should I just return undefined in this case and treat it as an unnamed function?
I think that makes sense.

Owner

Yep, undefined if we can't get the name, the name if we can, anonymous if there isn't one.
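
One way to read this convention, as a hedged sketch: the `Extraction` type and the `"<anonymous>"` label are assumptions made for illustration, since the thread does not say what value represents a function that has no name.

```typescript
// Assumed label for functions that genuinely have no name; the project may
// use a different value.
const ANONYMOUS = "<anonymous>";

// Hypothetical summary of what a language-specific extractor found.
type Extraction =
  | { kind: "named"; name: string } // exactly one name was found
  | { kind: "anonymous" } // the function has no name at all
  | { kind: "failed" }; // e.g. ambiguous: `y, x := func() {}, func() {}`

// Map the three outcomes onto the convention from the review:
// the name if we can get it, the anonymous label if there is none,
// and undefined if extraction failed.
function toFunctionName(result: Extraction): string | undefined {
  switch (result.kind) {
    case "named":
      return result.name;
    case "anonymous":
      return ANONYMOUS;
    case "failed":
      return undefined;
  }
}
```

Under this reading the multi-match case is no longer special: it is just another failed extraction and returns `undefined`.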

@AlonBilman
Author

Hi,

So, you’ve added a couple of tests, and I’d like to talk about one of them.
Let’s look at this code you've given:

func main() {
	var x = func() {
		y := func() {}
		y()
	}
	x()
}

If you ask me what the name of this function is, I’d probably say "main".
Now, when a user gives the render page a GitHub link and a line number that points to a function, they might give the line for func main() { (let’s call that line 1).
If they want to look at the inner function, they would give the line for that nested function instead.
When we fetch the function from GitHub, we only get the function itself. So if we want main, we get everything inside it.
But if we ask for line 2, we only get:

var x = func() {
	y := func() {}
	y()
}

Since we always want the outer function name for any link and line number, everything works: we would get x.
If we were pointing at y := func() {}, we’d get y.

To be completely honest, I didn’t really understand the test.
Why do we want to iterate inside the node and get the names of nested functions? I think (maybe) it doesn't match the render page feature?
Am I missing something here?

(The test is in the Go section.)

@tmr232
Owner

tmr232 commented May 19, 2025

If you ask me what the name of this function is, I’d probably say "main"

That may be true, but what the test is asking is "what are the names of all the functions here".

If we have a function that extracts function names, there is no reason for it to not work on nested functions.

The test is there because in that code, I expect to be able to query the names of all the functions, and get correct results (or undefined if we can't get that).
But instead, we get main for main, y for x and y for y.

Additionally, the test does not have to match the /render page. It has to test the function, which should work in any relevant use-case.

@AlonBilman
Author

The test is there because in that code, I expect to be able to query the names of all the functions, and get correct results (or undefined if we can't get that).
But instead, we get main for main, y for x and y for y.

This is kind of a new feature: extracting function names from within a function.
OK, but it still seems a bit odd to me that it doesn't work when we're iterating over the subnodes (right?).
There’s probably a difference here.

Let’s say I have a solution that returns a list of names:
For the render page, the first one is fine; I'd always return the outer name.
The rest would be listed in the order they were declared inside the function?
I guess that makes sense.

I'll keep in touch

const language = func.tree.language;
const queryObj = new Query(language, query);

const rootNode = func.tree.rootNode;
Owner

I somehow missed this... You're getting the root of the file - this will never work.

Try the following:

If you print all the names you get with this, you'll see what I mean.

@tmr232
Owner

tmr232 commented May 20, 2025

Let’s say I have a solution that returns a list of names: For the render page - yeah, the first one is fine, I'd always return the outer name. The rest would be listed in the order they were declared inside the function? I guess that makes sense.

There's no need to get a list of names. You need to get the correct name for the function object.

Also, for the render page, it is not always the "outer name", it's the name of the function on the line we choose.
Like I said before, we analyze the entire file.
See the example in #134 (comment)
