Skip to content

chore(api): Changed the search api to return raw source (instead of base64 encoding) #356

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jun 17, 2025

Conversation

brendan-kellam
Copy link
Contributor

@brendan-kellam brendan-kellam commented Jun 17, 2025

This PR alters the behaviour of the search api (and all apis that depend on it) to return raw source code instead of a base64 encoding. Reasoning: we are decoding it on the client in multiple different places, so it would be beneficial to decode it in a single spot.

Note: This is a breaking change to the API surface. However, since the API surface is still unofficial/unsupported, I will roll this as a patch version change. See #101

Summary by CodeRabbit

  • Bug Fixes
    • Code previews and file sources now display correctly as plain text, fixing issues related to base64-encoded content.
  • Chores
    • Updated internal APIs and components to handle raw source code directly, removing base64 decoding steps.
    • Updated changelogs to document these changes and maintain alignment with recent API updates.

@brendan-kellam brendan-kellam requested a review from msukkari June 17, 2025 22:44
Copy link

coderabbitai bot commented Jun 17, 2025

Walkthrough

The changes remove the usage and implementation of the base64Decode utility across both backend and frontend codebases. Code that previously decoded base64-encoded content now directly uses plain text content, reflecting an upstream change in how data is provided. In one location, base64 decoding is newly introduced to handle encoded search results.

Changes

Files/Groups Change Summary
packages/mcp/src/index.ts Removed base64Decode import and decoding; now uses plain text content in code and file tools.
packages/mcp/src/utils.ts, packages/web/src/lib/utils.ts Removed the base64Decode function implementation and its export.
packages/web/src/app/[domain]/browse/[...path]/components/codePreviewPanel.tsx Removed base64Decode import; uses raw source string for code preview.
packages/web/src/app/[domain]/search/components/codePreviewPanel/index.tsx Removed base64Decode import; uses raw file source in query result.
packages/web/src/app/[domain]/search/components/searchResultsPanel/fileMatch.tsx Removed base64Decode and memoization; uses raw match content for rendering and logic.
packages/web/src/ee/features/codeNav/components/exploreMenu/referenceList.tsx Removed base64Decode and memoization; uses raw line content for highlighting.
packages/web/src/ee/features/codeNav/components/symbolHoverPopup/symbolDefinitionPreview.tsx Removed base64Decode and memoization; uses raw line content for highlighting.
packages/web/src/features/agents/review-agent/nodes/fetchFileContent.ts Removed base64Decode import; uses raw file content from response.
packages/web/src/features/search/searchApi.ts Added base64Decode import; now decodes base64-encoded content for search results transformation.
CHANGELOG.md, packages/mcp/CHANGELOG.md Added changelog entries documenting the removal of base64 encoding in APIs and client updates.

Sequence Diagram(s)

sequenceDiagram
    participant Backend
    participant Frontend

    Note over Backend,Frontend: Previous Flow (most locations)
    Backend->>Frontend: Sends base64-encoded content
    Frontend->>Frontend: Decodes content with base64Decode
    Frontend->>User: Renders decoded content

    Note over Backend,Frontend: New Flow (most locations)
    Backend->>Frontend: Sends plain text content
    Frontend->>User: Renders content directly

    Note over Backend,Frontend: Exception (searchApi.ts)
    Backend->>Frontend: Sends base64-encoded search results
    Frontend->>Frontend: Decodes content with base64Decode
    Frontend->>User: Renders decoded search result
Loading

Poem

A hop and a skip, the code is now clean,
No more base64 in sight to be seen!
We preview our files with text nice and clear,
Decoding’s retired—let’s all give a cheer!
With simpler flows, our tails wiggle free,
Here’s to plain content, as easy as can be!
🐇✨


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (6)
packages/web/src/app/[domain]/search/components/codePreviewPanel/index.tsx (1)

54-60: UX: spinner never centred in Safari

Minor: the flex container sets h-full but the parent of the panel isn’t guaranteed to provide an explicit height, causing the spinner to hug the top in Safari.
Add min-h-full (or ensure the parent is flex-1) to keep it vertically centred for all browsers.

packages/web/src/features/agents/review-agent/nodes/fetchFileContent.ts (1)

20-27: Large-file memory / log safety

fileContent can now be arbitrary plain text and may reach MBs.
Before passing it upstream:

  1. Guard against > N KB to avoid blowing out token limits of the agent caller.
  2. Never log the raw content – the current code only logs the request, but double-check that no upstream logger prints context.
-const fileContent = fileSourceResponse.source;
+const raw = fileSourceResponse.source;
+if (raw.length > 200_000) {
+    logger.warn(`Truncating oversized file (${raw.length} bytes): ${filename}`);
+}
+const fileContent = raw.slice(0, 200_000);

Adjust the threshold to whatever your agent can safely digest.

packages/web/src/ee/features/codeNav/components/exploreMenu/referenceList.tsx (1)

172-173: Preserve trailing newline for accurate highlighting

React will collapse a trailing \n at the end of {lineContent} which may shift highlight offsets by one line.
Wrap the string in <>{lineContent || '\u200B'}</> or append a ZERO-WIDTH-SPACE when lineContent.endsWith('\n') to keep the visual line count correct.

packages/web/src/app/[domain]/search/components/searchResultsPanel/fileMatch.tsx (1)

20-25: Use platform-agnostic newline splitting

split('\n') misses Windows \r\n endings and can off-by-one the computed endLineNumber.
Tiny tweak:

-        const endLineNumber = match.content.trimEnd().split('\n').length + startLineNumber - 1;
+        const endLineNumber =
+            match.content.trimEnd().split(/\r?\n/).length + startLineNumber - 1;
packages/mcp/src/index.ts (2)

116-119: Include language identifier in fenced code blocks

Markdown renderers highlight syntax when a language is provided. Inserting the repo’s language makes snippets easier to read:

-                    return `\`\`\`\n${chunk.content}\n\`\`\``;
+                    return `\`\`\`${file.language?.toLowerCase() || ''}\n${chunk.content}\n\`\`\``;

Also consider escaping embedded triple-backticks to avoid fence breaking.


201-205: Same improvement for full-file source output

-text: `file: ${fileName}\nrepository: ${repoId}\nlanguage: ${response.language}\nsource:\n${response.source}`,
+text: `file: ${fileName}\nrepository: ${repoId}\nlanguage: ${response.language}\nsource:\n\`\`\`${response.language?.toLowerCase() || ''}\n${response.source}\n\`\`\``,

Provides consistent formatting and syntax highlighting.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 22d548e and cd8d71a.

📒 Files selected for processing (10)
  • packages/mcp/src/index.ts (3 hunks)
  • packages/mcp/src/utils.ts (0 hunks)
  • packages/web/src/app/[domain]/browse/[...path]/components/codePreviewPanel.tsx (2 hunks)
  • packages/web/src/app/[domain]/search/components/codePreviewPanel/index.tsx (1 hunks)
  • packages/web/src/app/[domain]/search/components/searchResultsPanel/fileMatch.tsx (3 hunks)
  • packages/web/src/ee/features/codeNav/components/exploreMenu/referenceList.tsx (1 hunks)
  • packages/web/src/ee/features/codeNav/components/symbolHoverPopup/symbolDefinitionPreview.tsx (1 hunks)
  • packages/web/src/features/agents/review-agent/nodes/fetchFileContent.ts (1 hunks)
  • packages/web/src/features/search/searchApi.ts (3 hunks)
  • packages/web/src/lib/utils.ts (0 hunks)
💤 Files with no reviewable changes (2)
  • packages/mcp/src/utils.ts
  • packages/web/src/lib/utils.ts
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: build
  • GitHub Check: build
  • GitHub Check: build
🔇 Additional comments (2)
packages/web/src/ee/features/codeNav/components/symbolHoverPopup/symbolDefinitionPreview.tsx (1)

53-54: Potential HTML injection path – confirm LightweightCodeHighlighter escapes

Since raw lineContent is rendered directly, ensure the highlighter internally escapes < / & etc.
Previously we implicitly trusted base64Decode to give us raw bytes, so no new risk is introduced, but it’s worth confirming the component uses dangerouslySetInnerHTML only with escaped output.

packages/web/src/features/search/searchApi.ts (1)

268-270: To verify how base64Decode behaves on invalid input and examine its implementation alongside the exact call sites, let’s pull in the relevant snippets:

#!/bin/bash
set -e

echo "1) Implementation of base64Decode in shared utils:"
rg --color=never -n -A5 "export const base64Decode" packages/shared/src/utils.ts

echo
echo "2) Surrounding code in searchApi.ts around line 268:"
sed -n '260,300p' packages/web/src/features/search/searchApi.ts

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
CHANGELOG.md (1)

15-15: Refine changelog entry wording
Avoid repeating “Changed” (heading + verb), add the missing article, and hyphenate “base64-encoded.” For example:

- ### Changed
- Changed search api (and all apis that depend on it) to return raw source code instead of base64 encoded string. ([356](https://github.com/sourcebot-dev/sourcebot/pull/356))
+ ### Changed
+ [**Breaking Change**] Search API and all dependent APIs now return raw source code instead of a base64-encoded string. ([#356](https://github.com/sourcebot-dev/sourcebot/pull/356))
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cd8d71a and d8013cf.

📒 Files selected for processing (2)
  • CHANGELOG.md (1 hunks)
  • packages/mcp/CHANGELOG.md (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • packages/mcp/CHANGELOG.md
🧰 Additional context used
🪛 LanguageTool
CHANGELOG.md

[duplication] ~14-~14: Possible typo: you repeated a word.
Context: ...//pull/354) ### Changed - Changed search api (and all apis that depend on...

(ENGLISH_WORD_REPEAT_RULE)


[uncategorized] ~15-~15: Possible missing article found.
Context: ...t) to return raw source code instead of base64 encoded string. ([356](https://github.c...

(AI_HYDRA_LEO_MISSING_A)

⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: build
  • GitHub Check: build
  • GitHub Check: build

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
CHANGELOG.md (1)

16-17: Breaking Change label required
This update is a backward-incompatible change to the API surface. Please explicitly mark it as a Breaking Change under Unreleased > Changed to alert consumers.

### [Unreleased]
- ### Changed
- - Changed search api (and all apis that depend on it) to return raw source code instead of base64 encoded string. ([356](https://github.com/sourcebot-dev/sourcebot/pull/356)).
+ ### Changed
+ - **Breaking Change**: Search API (and all APIs that depend on it) now returns raw source code directly instead of a base64-encoded string. ([#356](https://github.com/sourcebot-dev/sourcebot/pull/356))
🧹 Nitpick comments (1)
CHANGELOG.md (1)

10-10: Grammar: use "an" before "API"
The note reads "a API change," but it should be "an API change" for correct English.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d8013cf and 1e3d586.

📒 Files selected for processing (1)
  • CHANGELOG.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
CHANGELOG.md

[style] ~13-~13: The noun “invitation” is usually used instead of ‘invite’ in formal writing.
Context: ... account join request when redeeming an invite. [#352](https://github.com/sourcebot-de...

(AN_INVITE)


[duplication] ~16-~16: Possible typo: you repeated a word.
Context: ...//pull/354) ### Changed - Changed search api (and all apis that depend on...

(ENGLISH_WORD_REPEAT_RULE)


[uncategorized] ~17-~17: Possible missing article found.
Context: ...t) to return raw source code instead of base64 encoded string. ([356](https://github.c...

(AI_HYDRA_LEO_MISSING_A)

⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: build
  • GitHub Check: build
  • GitHub Check: build

@brendan-kellam brendan-kellam merged commit 1d95e82 into main Jun 17, 2025
5 checks passed
@brendan-kellam brendan-kellam deleted the bkellam/decoded_base64 branch June 17, 2025 22:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant