
Add support for artifact in llama-index-server #580


Merged
merged 28 commits into main on Apr 28, 2025

Conversation

leehuwuj
Collaborator

@leehuwuj leehuwuj commented Apr 23, 2025

Summary by CodeRabbit

  • New Features

    • Added utility functions for improved handling and retrieval of artifacts from chat requests.
    • Enhanced server routing to prioritize API routes before UI static files for better route management.
    • Introduced new asynchronous workflows for generating and updating code and document artifacts via chat, including planning, generation, and user-facing explanation steps.
  • Chores

    • Updated workflows to use new tooling for dependency management, building, and publishing.
    • Adjusted workflow paths and logic to match the updated project directory structure.
    • Refined CI workflows to prevent unnecessary runs on specific workflow file changes.


changeset-bot bot commented Apr 23, 2025

⚠️ No Changeset found

Latest commit: 2f17cdc

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types



coderabbitai bot commented Apr 23, 2025

Warning

Rate limit exceeded

@leehuwuj has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 2 minutes and 58 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 6104c3d and 2f17cdc.

📒 Files selected for processing (3)
  • python/llama-index-server/examples/artifact/code_workflow.py (1 hunks)
  • python/llama-index-server/examples/artifact/document_workflow.py (1 hunks)
  • python/llama-index-server/llama_index/server/api/models.py (3 hunks)

"""

Walkthrough

The updates transition the llama-index-server project’s GitHub Actions workflows from using Poetry to the uv toolchain for Python dependency management, building, and publishing. Workflow steps now utilize uv commands for installing dependencies, running tests, building, and publishing, with corresponding changes to artifact paths and working directories to reflect a new project layout. Additionally, a new utility module is introduced in the server codebase, providing functions to extract and sort artifacts from chat requests. The server code also modifies static file mounting to accept an optional route name and introduces an add_api_route method to ensure the UI static files route is always ordered last in routing. Two new example workflows are added that implement asynchronous artifact generation workflows for code and document creation using a large language model, with detailed planning, generation, and synthesis steps emitting UI events.
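To make the walkthrough concrete, here is a rough sketch of the new chat-request utilities as described above (the list comprehension mirrors the snippet quoted later in this thread; the created_at sort key is an assumption based on the "sort by creation time" description, not shown in the diff):

from typing import List, Optional

from llama_index.server.api.models import Artifact, ChatRequest


def get_artifacts(chat_request: ChatRequest) -> List[Artifact]:
    # Keep only the messages that carry an artifact annotation.
    artifacts = [
        Artifact.from_message(message)
        for message in chat_request.messages
        if Artifact.from_message(message) is not None
    ]
    # The walkthrough says artifacts come back sorted by creation time;
    # the exact field name (created_at) is assumed here.
    return sorted(artifacts, key=lambda a: a.created_at)


def get_last_artifact(chat_request: ChatRequest) -> Optional[Artifact]:
    artifacts = get_artifacts(chat_request)
    return artifacts[-1] if artifacts else None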

Changes

  • .github/workflows/test_llama_index_server.yml, .github/workflows/release_llama_index_server.yml: Migrated workflow steps from Poetry to the uv toolchain, updated working directories and artifact paths, and adjusted release logic to use uv for versioning, building, and publishing.
  • .github/workflows/e2e.yml: Updated the E2E workflow to ignore changes to .github/workflows/*llama_index_server.yml files in addition to the existing path ignores.
  • python/llama-index-server/llama_index/server/api/utils/chat_request.py: Added a new utility module with get_artifacts and get_last_artifact functions for extracting and sorting artifacts from chat requests.
  • python/llama-index-server/llama_index/server/server.py: Modified static file mounting to accept an optional route name and introduced an add_api_route method to ensure the UI static files route is always last in the routing order.
  • python/llama-index-server/examples/artifact/code_workflow.py: Added an asynchronous artifact workflow for code generation and explanation using an LLM, with steps for planning, generating code artifacts, and synthesizing answers, emitting UI events.
  • python/llama-index-server/examples/artifact/document_workflow.py: Added an asynchronous artifact workflow for document generation (Markdown/HTML) using an LLM, with planning, generation, and synthesis steps, emitting UI events.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant ChatRequest
    participant Utils as Utils (chat_request.py)

    User->>ChatRequest: Submit chat request with messages
    ChatRequest->>Utils: Call get_artifacts(chat_request)
    Utils->>ChatRequest: Extract artifacts from messages
    Utils->>Utils: Sort artifacts by creation time
    Utils-->>ChatRequest: Return sorted artifacts list
    ChatRequest->>Utils: Call get_last_artifact(chat_request)
    Utils->>Utils: Retrieve artifacts, select last
    Utils-->>ChatRequest: Return last artifact or None
sequenceDiagram
    participant User
    participant ArtifactWorkflow
    participant LLM
    participant MemoryBuffer

    User->>ArtifactWorkflow: Start workflow with chat request
    ArtifactWorkflow->>MemoryBuffer: Initialize with chat history and last artifact
    ArtifactWorkflow->>LLM: Planning prompt for next step (coding or answering)
    LLM-->>ArtifactWorkflow: Return plan with next step and requirements
    ArtifactWorkflow->>MemoryBuffer: Save plan
    ArtifactWorkflow->>User: Emit UI event (plan state)
    alt next step is coding
        ArtifactWorkflow->>LLM: Generate or update code artifact
        LLM-->>ArtifactWorkflow: Return code block
        ArtifactWorkflow->>MemoryBuffer: Save generated artifact
        ArtifactWorkflow->>User: Emit artifact UI event
    end
    ArtifactWorkflow->>LLM: Synthesize user-facing answer
    LLM-->>ArtifactWorkflow: Streamed answer
    ArtifactWorkflow->>User: Emit UI event (completed state) and stop event

Possibly related PRs

Suggested reviewers

  • marcusschiesser

Poem

In the warren, tools have changed,
From Poetry to uv, workflows rearranged.
Artifacts sorted, the last one in line—
Rabbits code swiftly, their pipelines align.
With each hop and sync, the future looks bright,
🐇✨ Automation done right!
"""


…tor and DocumentGenerator classes. Update app_writer to utilize FunctionAgent for code and document generation workflows. Remove deprecated ArtifactGenerator class. Enhance artifact transformation logic in callbacks. Improve system prompts for clarity and instruction adherence.
- Introduced `code_workflow.py` for generating and updating code artifacts based on user requests.
- Introduced `document_workflow.py` for generating and updating document artifacts (Markdown/HTML).
- Created `main.py` to set up FastAPI server with artifact workflows.
- Added a README for setup instructions and usage.
- Implemented UI components for displaying artifact status and progress.
- Updated chat router to remove unused event callbacks.
- Renamed `ArtifactUIEvents` to `UIEventData` for clarity.
- Introduced `last_artifact` attribute in `ArtifactWorkflow` to streamline artifact retrieval.
- Updated chat history handling to utilize the new `last_artifact` attribute.
- Modified event streaming to use `UIEventData` for consistent event structure.
- Added a new UI component for displaying artifact workflow status and progress.
@leehuwuj leehuwuj changed the title [WIP] Add support for artifact in llama-index-server Add support for artifact in llama-index-server Apr 25, 2025
@leehuwuj leehuwuj marked this pull request as ready for review April 25, 2025 08:29

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

♻️ Duplicate comments (2)
.github/workflows/test_llama_index_server.yml (2)

50-54: Duplicate: reorder setup-python before setup-uv.

This is the same ordering concern as in the unit-test job.


77-81: Duplicate: reorder setup-python before setup-uv.

Same suggestion for the build job to ensure uv uses the correct Python interpreter.

🧹 Nitpick comments (13)
llama-index-server/pyproject.toml (1)

1-17: Validate PEP 621 metadata fields and update placeholders.

The new [project] table correctly replaces Poetry metadata with PEP 621 fields. However, please verify that:

  • The license field syntax (license = "MIT") is recognized by Hatchling per PEP 621 (some tools expect license = { text = "MIT" } or a license-file).
  • The placeholder author ("Your Name" <[email protected]>) is updated to the actual package maintainers.
.github/workflows/test_llama_index_server.yml (1)

23-31: Consider reordering setup-python and setup-uv steps.

Currently, the workflow installs uv before configuring the Python runtime. To ensure uv sync uses the intended Python version, you may want to:

-  - name: Install uv
-    uses: astral-sh/setup-uv@v5
-    with:
-      enable-cache: true
-  - name: Set up Python ${{ matrix.python-version }}
-    uses: actions/setup-python@v5
-    with:
-      python-version: ${{ matrix.python-version }}
+  - name: Set up Python ${{ matrix.python-version }}
+    uses: actions/setup-python@v5
+    with:
+      python-version: ${{ matrix.python-version }}
+  - name: Install uv
+    uses: astral-sh/setup-uv@v5
+    with:
+      enable-cache: true

This ensures that uv picks up the correct interpreter when creating virtual environments.

llama-index-server/examples/artifact/README.md (1)

34-36: Add language specifier to code fence.

The markdown code block should include a language specifier for better syntax highlighting and to follow markdown best practices.

-   ```
+   ```txt
    http://localhost:8000
    ```
🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

34-34: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)

llama-index-server/llama_index/server/api/utils/chat_request.py (1)

6-11: Consider removing unnecessary type ignore comment.

The type ignore comment appears unnecessary since the list comprehension already filters out None values with the condition check.

    return [
-        Artifact.from_message(message)  # type: ignore
+        Artifact.from_message(message)
        for message in chat_request.messages
        if Artifact.from_message(message) is not None
    ]
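Note that a type checker may still flag the plain form, since the yielded call and the filtering call are separate expressions. If that matters, one possible refinement (a sketch, not part of this PR) evaluates from_message once per message with an assignment expression:

    def get_artifacts(chat_request: ChatRequest) -> List[Artifact]:
        # Evaluate Artifact.from_message once per message and drop the Nones.
        return [
            artifact
            for message in chat_request.messages
            if (artifact := Artifact.from_message(message)) is not None
        ]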
llama-index-server/examples/artifact/components/ui_event.jsx (1)

32-41: fade state is never set to true – dead code

fade is initialised to false and only ever reset to false inside useEffect; therefore the conditional styles (opacity-0 pointer-events-none) can never be triggered.

If a fade-out animation is desired when the workflow completes, call setFade(true) before hiding the card, e.g.

-      setVisible(false);
+      setFade(true);
+      // hide after animation
+      setTimeout(() => setVisible(false), 400);

Otherwise remove the state and related classes to keep the component lean.

llama-index-server/examples/artifact/document_workflow.py (2)

83-90: Use MessageRole.USER constant for consistency & type-safety

ChatMessage.role is usually an enum; passing raw strings bypasses static checks and
will break if the enum values ever change.

-                ChatMessage(
-                    role="user",
+                ChatMessage(
+                    role=MessageRole.USER,

(also add from llama_index.core.types import MessageRole to imports).


244-249: Interpolating a pydantic.BaseModel directly into the prompt may leak Python representation

{event.requirement} formats to something like
DocumentRequirement(type='markdown', title='...', requirement='...'), which is not user-friendly and may confuse the LLM. Serialize to JSON instead:

-            requirement=event.requirement,
+            requirement=event.requirement.model_dump_json(indent=2),

This yields deterministic input and reduces prompt-token waste.

llama-index-server/tests/api/test_event_stream.py (4)

168-169: Prefer monkeypatch over direct attribute mutation for logger mocks

Directly overwriting logger.warning (and later logger.error) with a MagicMock changes the global logger instance for the whole runtime of the test session, which may leak into other tests executed in parallel.

Consider using pytest’s monkeypatch fixture to patch the attribute only for the test’s scope:

-logger.warning = MagicMock()  # type: ignore
+monkeypatch = pytest.MonkeyPatch()
+monkeypatch.setattr(logger, "warning", MagicMock())  # type: ignore
+# ... run assertions ...
+monkeypatch.undo()

This keeps the logger pristine for other tests.
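Equivalently, if the test is a plain function, requesting the monkeypatch fixture handles the undo automatically. A minimal self-contained sketch (the logger name and test body are illustrative, not the PR's code):

import logging
from unittest.mock import MagicMock

logger = logging.getLogger("llama_index.server")  # assumed logger name, for illustration


def test_warning_is_logged(monkeypatch):
    mock_warning = MagicMock()
    # monkeypatch reverts the patch after the test, so other tests see the real logger.
    monkeypatch.setattr(logger, "warning", mock_warning)

    logger.warning("unserializable event dropped")  # stand-in for the real code path

    mock_warning.assert_called_once()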


193-194: Apply the same scoped-patch approach for logger.error

For the same reasons discussed for logger.warning, patch logger.error via monkeypatch to avoid cross-test interference.


217-241: Unused helper – either remove or add coverage

_mock_agent_stream_with_empty_deltas is not referenced in any test. If filtering empty/whitespace‐only deltas is part of the contract, add a dedicated test that asserts they are ignored; otherwise, delete the dead code to keep the test suite lean.


257-259: Remove dead code

_mock_dict_event is unused and can be safely removed to reduce noise.

llama-index-server/examples/artifact/code_workflow.py (2)

271-280: Harden code-block extraction

Similar to the JSON regex, make the code extraction pattern non-greedy and tolerant of optional whitespace/newlines after the language specifier:

-language_pattern = r"```(\w+)([\s\S]*)```"
+language_pattern = r"```(\w+)\s*([\s\S]*?)\s*```"

Also consider validating that code_match.group(1) matches the allowed languages to avoid silent acceptance of unsupported blocks.
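Putting both suggestions together, a self-contained sketch of the extraction step (the allowed-language set is illustrative, not taken from the PR):

import re
from typing import Optional, Tuple

ALLOWED_LANGUAGES = {"python", "typescript", "javascript"}  # illustrative set

CODE_BLOCK_PATTERN = re.compile(r"```(\w+)\s*([\s\S]*?)\s*```")


def extract_code_block(text: str) -> Optional[Tuple[str, str]]:
    """Return (language, code) for the first fenced block, or None."""
    match = CODE_BLOCK_PATTERN.search(text)
    if match is None:
        return None
    language, code = match.group(1).lower(), match.group(2)
    if language not in ALLOWED_LANGUAGES:
        # Reject unsupported languages instead of silently accepting them.
        return None
    return language, code


# extract_code_block("```python\nprint('hi')\n```") -> ("python", "print('hi')")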


320-327: Align role values with the underlying enum

Throughout the workflow (prepare_chat_history, synthesize_answer) plain strings like "system"/"assistant" are used.
If ChatMessage.role is typed as MessageRole, creating messages with raw strings bypasses static type checking and may break when the enum changes. Prefer the enum (or helper constants) instead:

-from llama_index.core.types import MessageRole
-...
-ChatMessage(role="system", content="...")
+from llama_index.core.types import MessageRole
+...
+ChatMessage(role=MessageRole.SYSTEM, content="...")

This provides compile-time safety and guards against typos.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 53e1cd5 and 2c3ac6d.

⛔ Files ignored due to path filters (2)
  • llama-index-server/poetry.lock is excluded by !**/*.lock
  • llama-index-server/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (13)
  • .github/workflows/test_llama_index_server.yml (3 hunks)
  • llama-index-server/examples/artifact/README.md (1 hunks)
  • llama-index-server/examples/artifact/code_workflow.py (1 hunks)
  • llama-index-server/examples/artifact/components/ui_event.jsx (1 hunks)
  • llama-index-server/examples/artifact/document_workflow.py (1 hunks)
  • llama-index-server/examples/artifact/main.py (1 hunks)
  • llama-index-server/llama_index/server/api/models.py (3 hunks)
  • llama-index-server/llama_index/server/api/routers/chat.py (2 hunks)
  • llama-index-server/llama_index/server/api/utils/__init__.py (1 hunks)
  • llama-index-server/llama_index/server/api/utils/chat_request.py (1 hunks)
  • llama-index-server/llama_index/server/chat_ui.py (1 hunks)
  • llama-index-server/pyproject.toml (1 hunks)
  • llama-index-server/tests/api/test_event_stream.py (11 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (7)
llama-index-server/llama_index/server/api/utils/__init__.py (1)
llama-index-server/llama_index/server/api/utils/chat_request.py (2)
  • get_artifacts (6-11)
  • get_last_artifact (14-16)
llama-index-server/llama_index/server/api/utils/chat_request.py (1)
llama-index-server/llama_index/server/api/models.py (3)
  • Artifact (175-213)
  • ChatRequest (32-41)
  • from_message (181-213)
llama-index-server/examples/artifact/main.py (3)
llama-index-server/examples/artifact/code_workflow.py (1)
  • ArtifactWorkflow (54-340)
llama-index-server/llama_index/server/server.py (2)
  • LlamaIndexServer (52-230)
  • UIConfig (16-49)
llama-index-server/llama_index/server/api/models.py (1)
  • ChatRequest (32-41)
llama-index-server/examples/artifact/components/ui_event.jsx (3)
templates/types/streaming/nextjs/app/components/ui/card.tsx (4)
  • Card (76-76)
  • CardHeader (80-80)
  • CardTitle (81-81)
  • CardContent (77-77)
templates/types/streaming/nextjs/app/components/ui/lib/utils.ts (1)
  • cn (4-6)
templates/types/streaming/nextjs/app/components/ui/chat/custom/markdown.tsx (1)
  • Markdown (11-27)
llama-index-server/tests/api/test_event_stream.py (4)
llama-index-server/tests/api/test_chat_api.py (2)
  • logger (15-16)
  • chat_request (20-24)
llama-index-server/llama_index/server/api/models.py (2)
  • ChatRequest (32-41)
  • ChatAPIMessage (23-29)
llama-index-server/llama_index/server/api/utils/vercel_stream.py (3)
  • VercelStreamResponse (10-44)
  • convert_text (28-32)
  • convert_data (35-38)
llama-index-server/llama_index/server/api/callbacks/stream_handler.py (1)
  • stream_events (28-56)
llama-index-server/examples/artifact/code_workflow.py (3)
llama-index-server/llama_index/server/api/models.py (4)
  • ArtifactType (158-160)
  • ChatRequest (32-41)
  • CodeArtifactData (163-166)
  • UIEvent (147-155)
llama-index-server/llama_index/server/api/utils/chat_request.py (1)
  • get_last_artifact (14-16)
llama-index-server/examples/artifact/document_workflow.py (9)
  • PlanEvent (35-37)
  • GenerateArtifactEvent (40-41)
  • SynthesizeAnswerEvent (44-46)
  • UIEventData (49-51)
  • ArtifactWorkflow (54-324)
  • prepare_chat_history (77-98)
  • planning (101-193)
  • generate_artifact (196-291)
  • synthesize_answer (294-324)
llama-index-server/examples/artifact/document_workflow.py (4)
llama-index-server/llama_index/server/api/models.py (4)
  • ArtifactType (158-160)
  • ChatRequest (32-41)
  • DocumentArtifactData (169-172)
  • UIEvent (147-155)
llama-index-server/llama_index/server/api/utils/chat_request.py (1)
  • get_last_artifact (14-16)
llama-index-server/examples/artifact/code_workflow.py (9)
  • PlanEvent (35-37)
  • GenerateArtifactEvent (40-41)
  • SynthesizeAnswerEvent (44-46)
  • UIEventData (49-51)
  • ArtifactWorkflow (54-340)
  • prepare_chat_history (79-100)
  • planning (103-195)
  • generate_artifact (198-307)
  • synthesize_answer (310-340)
llama-index-server/examples/artifact/components/ui_event.jsx (2)
  • event (46-46)
  • event (137-137)
🪛 markdownlint-cli2 (0.17.2)
llama-index-server/examples/artifact/README.md

34-34: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)

🔇 Additional comments (17)
llama-index-server/pyproject.toml (3)

34-37: Build system configuration looks correct.

The [build-system] section specifies hatchling>=1.24 and the hatchling.build backend, aligning with the migration from Poetry to Hatchling. No issues detected here.


38-64: Dev dependency group appears comprehensive.

The new [dependency-groups] section consolidates development dependencies under dev, matching prior [tool.poetry.group.dev.dependencies]. Version constraints are explicit and consistent.


66-67: Verify wheel target includes correct package paths.

The [tool.hatch.build.targets.wheel] section lists packages = ["llama_index/"]. Please confirm that:

  1. The slash-terminated path is recognized by Hatchling (it may expect dotted package names, e.g. "llama_index"),
  2. The wheel actually contains your llama_index modules upon build (e.g., inspect with unzip -l dist/llama-index-server-0.1.14-py3-none-any.whl).
.github/workflows/test_llama_index_server.yml (6)

33-36: Dependencies installed with uv sync --all-extras --dev.

The unit-test job installs all extras (including dev) before running tests, matching the new Hatchling and uv-based workflow.


37-40: Unit test invocation is correct.

Using uv run pytest tests under the llama-index-server working directory will properly execute your test suite across all platforms.


60-63: Dev install for type-check is correct.

The type-check job installs the same extras, ensuring dependencies for mypy are available.


64-67: Mypy command alignment is good.

Running uv run mypy llama_index will type-check the codebase as expected.


87-90: Build installation uses correct sync flags.

The build job’s uv sync --all-extras installs the package and any optional extras without dev dependencies, suitable for wheel generation.


91-94: Verify import-test module path.

The import test uses:

uv run python -c "from llama_index.server import LlamaIndexServer"

Please confirm the module path matches your package structure so this fails fast if the wheel layout is incorrect.

llama-index-server/llama_index/server/chat_ui.py (1)

8-8: Version bump for Chat UI looks good.

The update from version "0.1.5" to "0.1.6" aligns with the new artifact workflow features being introduced in this PR.

llama-index-server/llama_index/server/api/utils/__init__.py (1)

1-3: Good API design for artifact utilities.

The explicit export of the new artifact utility functions makes them easily accessible to consumers of the API. The __all__ list properly defines the public interface of this module.
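For reference, the three lines in question presumably amount to the following (a sketch consistent with the description; the exact import form is not shown in the diff):

from llama_index.server.api.utils.chat_request import get_artifacts, get_last_artifact

__all__ = ["get_artifacts", "get_last_artifact"]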

llama-index-server/examples/artifact/README.md (1)

1-53: Comprehensive README with clear instructions.

The documentation provides a well-structured guide for setting up and using the artifact workflow features. It covers prerequisites, setup steps, and usage notes effectively.

🧰 Tools
🪛 markdownlint-cli2 (0.17.2)

34-34: Fenced code blocks should have a language specified
null

(MD040, fenced-code-language)

llama-index-server/llama_index/server/api/utils/chat_request.py (1)

14-16: Effective utility for retrieving the last artifact.

The get_last_artifact function is well-implemented, handling the empty case correctly and providing a clean interface for retrieving the most recent artifact.

llama-index-server/llama_index/server/api/routers/chat.py (3)

10-14: Import organization aligns with workflow enhancements.

The updated imports properly include the necessary event classes from the agent workflow module to support the new artifact workflows.


133-135: Improved event filtering enhances streaming efficiency.

Explicitly ignoring AgentInput and AgentSetup events is a good optimization, preventing unnecessary event streaming for these internal workflow events.


141-141: Enhanced error logging with full exception info.

Adding exc_info=True to the error logging is an excellent improvement that will provide more detailed stack traces for debugging purposes.

llama-index-server/examples/artifact/main.py (1)

13-19: Double-check the OpenAI model identifier and expose it via config

Hard-coding "gpt-4.1" couples the example to a specific model name that may not exist in every OpenAI account or may be renamed in the future. A single typo here will break the entire demo at runtime.

-        llm=OpenAI(model="gpt-4.1"),
+        # Prefer reading from env/CLI to avoid hard-coding a potentially
+        # unavailable model name.
+        llm=OpenAI(model=os.getenv("OPENAI_MODEL", "gpt-4o")),

(remember to add import os at the top).

Would you mind running a quick check against your account to confirm which GPT-4 models are currently available?


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
.github/workflows/release_llama_index_server.yml (1)

54-58: Simplify version bump
The current awk snippet works but is brittle (pre-releases, metadata). If uvx supports a native patch bump, e.g.:

uvx version patch

it would be more robust and self-documenting.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2c3ac6d and f975b79.

📒 Files selected for processing (1)
  • .github/workflows/release_llama_index_server.yml (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (28)
  • GitHub Check: build
  • GitHub Check: typescript (22, 3.11, ubuntu-22.04, nextjs, --llamacloud)
  • GitHub Check: typescript (22, 3.11, ubuntu-22.04, nextjs, --example-file)
  • GitHub Check: typescript (22, 3.11, ubuntu-22.04, nextjs, --no-files)
  • GitHub Check: typescript (22, 3.11, windows-latest, nextjs, --llamacloud)
  • GitHub Check: typescript (22, 3.11, windows-latest, nextjs, --example-file)
  • GitHub Check: typescript (22, 3.11, windows-latest, nextjs, --no-files)
  • GitHub Check: typescript (22, 3.11, macos-latest, nextjs, --llamacloud)
  • GitHub Check: typescript (22, 3.11, macos-latest, nextjs, --example-file)
  • GitHub Check: typescript (22, 3.11, macos-latest, nextjs, --no-files)
  • GitHub Check: typescript (20, 3.11, ubuntu-22.04, nextjs, --llamacloud)
  • GitHub Check: python (20, 3.11, ubuntu-22.04, fastapi, --llamacloud)
  • GitHub Check: typescript (20, 3.11, ubuntu-22.04, nextjs, --example-file)
  • GitHub Check: python (20, 3.11, ubuntu-22.04, fastapi, --example-file)
  • GitHub Check: typescript (20, 3.11, ubuntu-22.04, nextjs, --no-files)
  • GitHub Check: python (20, 3.11, ubuntu-22.04, fastapi, --no-files)
  • GitHub Check: typescript (20, 3.11, windows-latest, nextjs, --llamacloud)
  • GitHub Check: python (20, 3.11, windows-latest, fastapi, --llamacloud)
  • GitHub Check: typescript (20, 3.11, windows-latest, nextjs, --example-file)
  • GitHub Check: python (20, 3.11, windows-latest, fastapi, --example-file)
  • GitHub Check: typescript (20, 3.11, windows-latest, nextjs, --no-files)
  • GitHub Check: python (20, 3.11, windows-latest, fastapi, --no-files)
  • GitHub Check: typescript (20, 3.11, macos-latest, nextjs, --llamacloud)
  • GitHub Check: python (20, 3.11, macos-latest, fastapi, --llamacloud)
  • GitHub Check: typescript (20, 3.11, macos-latest, nextjs, --example-file)
  • GitHub Check: python (20, 3.11, macos-latest, fastapi, --example-file)
  • GitHub Check: typescript (20, 3.11, macos-latest, nextjs, --no-files)
  • GitHub Check: python (20, 3.11, macos-latest, fastapi, --no-files)
🔇 Additional comments (8)
.github/workflows/release_llama_index_server.yml (8)

34-38: Approve UV installation step
Using astral-sh/setup-uv@v5 with caching is correct and aligns with the new uv-based workflow.


45-46: Dependency sync with UV
Replacing poetry install with:

uv sync --all-extras --dev

ensures all extras and dev dependencies are installed under Hatchling/PEP 621.


62-65: Retrieve current version
Reading project.version via uvx --from=toml-cli toml get and exporting it to GITHUB_OUTPUT is precise for naming the PR.


99-103: Approve UV installation in publish job
Mirroring the release job’s astral-sh/setup-uv@v5 step ensures consistency and caching benefits.


110-111: Install runtime dependencies
uv sync --all-extras installs only the necessary extras for building/publishing, which is appropriate here.


115-118: Get version in publish job
Consistently fetching project.version again in the publish workflow guarantees the release tag matches the bumped version.


120-123: Verify build artifacts
uv build --no-sources omits the source distribution. Confirm if wheel-only uploads are intended, or remove --no-sources to include a source tarball.


124-127: Publish to PyPI
Using uv publish --token ${{ secrets.PYPI_TOKEN }} correctly uploads the distribution to PyPI.

- Updated `code_workflow.py` and `document_workflow.py` to improve chat history handling and user message storage.
- Enhanced `ArtifactWorkflow` to utilize optional fields in the `Requirement` model.
- Revised prompt instructions for clarity and conciseness in generating requirements.
- Modified UI event components to reflect changes in workflow stages and improve user feedback.
- Improved error handling for JSON parsing in artifact annotations.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (3)
llama-index-server/examples/artifact/document_workflow.py (2)

176-181: Use non-greedy JSON extraction (duplicate feedback)

This regex is greedy and may swallow multiple fenced blocks. A safer, non-greedy pattern was suggested earlier:

-json_block = re.search(r"```json([\s\S]*)```", response.text)
+json_block = re.search(r"```json\s*([\s\S]*?)\s*```", response.text)

You can then json.loads() the capture (or keep model_validate_json).


264-266: Regex misses language-less code blocks (duplicate feedback)

The current pattern only matches ```markdown and ```html fences, so it misses plain ``` fences that omit the language specifier.
Adopt the more permissive variant:

-language_pattern = r"```(markdown|html)([\s\S]*)```"
+language_pattern = r"```(?:\s*(markdown|html))?([\s\S]*?)```"

Fallback to event.requirement.type when the language group is None.

llama-index-server/examples/artifact/code_workflow.py (1)

92-96: Mark previous artifact context with a non-user role

Same rationale as in the document workflow: present historical context as "system" to avoid confusing the model.

-ChatMessage(
-    role="user",
+ChatMessage(
+    role="system",
     content=f"The previous {self.last_artifact.type.value} is: \n{self.last_artifact.model_dump_json()}",
)
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f975b79 and 4fdfca5.

📒 Files selected for processing (4)
  • llama-index-server/examples/artifact/code_workflow.py (1 hunks)
  • llama-index-server/examples/artifact/components/ui_event.jsx (1 hunks)
  • llama-index-server/examples/artifact/document_workflow.py (1 hunks)
  • llama-index-server/llama_index/server/api/models.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • llama-index-server/llama_index/server/api/models.py
  • llama-index-server/examples/artifact/components/ui_event.jsx
🧰 Additional context used
🧬 Code Graph Analysis (1)
llama-index-server/examples/artifact/code_workflow.py (3)
llama-index-server/llama_index/server/api/models.py (5)
  • Artifact (175-195)
  • ArtifactType (158-160)
  • ChatRequest (32-41)
  • CodeArtifactData (163-166)
  • UIEvent (147-155)
llama-index-server/llama_index/server/api/utils/chat_request.py (1)
  • get_last_artifact (14-16)
llama-index-server/examples/artifact/document_workflow.py (7)
  • PlanEvent (35-37)
  • GenerateArtifactEvent (40-41)
  • SynthesizeAnswerEvent (44-46)
  • UIEventData (49-51)
  • ArtifactWorkflow (54-333)
  • planning (110-202)
  • synthesize_answer (303-333)


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (1)
.github/workflows/release_llama_index_server.yml (1)

23-26: ⚠️ Potential issue

This condition should include protection against bump commit loops

The current condition doesn't protect against infinite bumping loops. Per previous review comments, you should exclude bump commits by adding an additional condition that filters out commits containing "chore(release): bump llama-index-server version to".

if: |
  github.event_name == 'push' && 
  !startsWith(github.ref, 'refs/heads/release/llama-index-server-v') &&
- !contains(github.event.head_commit.message, 'Release: llama-index-server v')
+ !contains(github.event.head_commit.message, 'Release: llama-index-server v') &&
+ !contains(github.event.head_commit.message, 'chore(release): bump llama-index-server version to')
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 24-24: trailing spaces

(trailing-spaces)

🧹 Nitpick comments (1)
.github/workflows/release_llama_index_server.yml (1)

24-24: Fix trailing whitespace

There is a trailing space at the end of this line that should be removed.

- github.event_name == 'push' && 
+ github.event_name == 'push' &&
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 24-24: trailing spaces

(trailing-spaces)

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4fdfca5 and fbed55a.

⛔ Files ignored due to path filters (3)
  • packages/create-llama/templates/components/agents/python/form_filling/sec_10k_template.csv is excluded by !**/*.csv
  • packages/create-llama/templates/components/agents/typescript/form_filling/sec_10k_template.csv is excluded by !**/*.csv
  • packages/create-llama/templates/components/data/101.pdf is excluded by !**/*.pdf
📒 Files selected for processing (11)
  • .changeset/plenty-spies-tickle.md (1 hunks)
  • .github/workflows/e2e.yml (5 hunks)
  • .github/workflows/lint_on_push_or_pull.yml (1 hunks)
  • .github/workflows/release_llama_index_server.yml (4 hunks)
  • .github/workflows/test_llama_index_server.yml (1 hunks)
  • .gitignore (0 hunks)
  • .husky/pre-commit (1 hunks)
  • package.json (1 hunks)
  • packages/create-llama/.gitignore (1 hunks)
  • packages/create-llama/e2e/typescript/resolve_dependencies.spec.ts (1 hunks)
  • packages/create-llama/package.json (1 hunks)
💤 Files with no reviewable changes (1)
  • .gitignore
✅ Files skipped from review due to trivial changes (3)
  • .changeset/plenty-spies-tickle.md
  • packages/create-llama/.gitignore
  • packages/create-llama/package.json
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/test_llama_index_server.yml
🧰 Additional context used
🪛 YAMLlint (1.35.1)
.github/workflows/release_llama_index_server.yml

[error] 24-24: trailing spaces

(trailing-spaces)

🔇 Additional comments (13)
.github/workflows/release_llama_index_server.yml (4)

34-37: Using latest uv setup looks good

Using astral-sh/setup-uv@v5 with caching enabled is a good practice for optimizing the workflow.


44-46: Effective migration to uv for dependency management

The migration from Poetry to uv for dependency installation is well-implemented. Using uv sync --all-extras --dev ensures all dependencies including development dependencies are installed.


53-58: Version bumping logic correctly updated for uv toolchain

The version bumping logic has been properly updated to use uvx with toml-cli, replacing the Poetry version bumping mechanism.


120-126: Build and publish steps properly migrated to uv

The package build and publish steps have been correctly migrated from Poetry to uv with appropriate commands.

package.json (1)

1-39: Well-structured monorepo configuration

The package.json has been properly configured for a monorepo structure with:

  • Appropriate renaming to "create-llama-monorepo"
  • Private flag set to true (preventing accidental publishing of the root package)
  • Workspaces correctly defined to include all packages
  • Scripts updated to use pnpm workspace commands with proper filtering
  • Only monorepo-level dependencies retained

The setup with @changesets/cli is a good approach for managing versions across packages.

.husky/pre-commit (1)

3-3: Path correctly updated for the new directory structure

The path has been properly updated to reflect the new monorepo structure, ensuring that the Python formatting check targets the correct location.

.github/workflows/lint_on_push_or_pull.yml (2)

38-38: Source path correctly specified for Python format check

The src parameter has been appropriately added to target the new location of the llama-index-server code.


44-44: Source path correctly specified for Python lint check

The src parameter has been appropriately added to target the new location of the llama-index-server code.

packages/create-llama/e2e/typescript/resolve_dependencies.spec.ts (1)

77-77: Good addition of the --ignore-workspace flag

The addition of --ignore-workspace to the pnpm install command is appropriate for the migration to a monorepo structure. This flag ensures that dependencies are installed directly in the test project directory without being affected by the parent workspace configuration, maintaining isolation for the test environment.

.github/workflows/e2e.yml (4)

1-1: Appropriate workflow name update

The workflow name has been updated to explicitly indicate it's for the create-llama package, which provides better clarity in the monorepo context.


6-6: Correctly updated path-ignore patterns

The paths-ignore patterns have been updated to reflect the new monorepo structure, ensuring the workflow is triggered appropriately based on the new directory organization.

Also applies to: 10-10


54-54: Properly added working directory paths

The working-directory parameter has been correctly added to all relevant steps, ensuring commands run in the packages/create-llama directory in accordance with the new monorepo structure.

Also applies to: 58-58, 62-62, 73-73, 124-124, 128-128, 132-132, 141-141


79-79: Correctly updated artifact paths

The artifact paths have been updated to include the packages/create-llama prefix, which properly aligns with the new directory structure in the monorepo.

Also applies to: 147-147


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🔭 Outside diff range comments (1)
python/llama-index-server/llama_index/server/server.py (1)

240-254: ⚠️ Potential issue

add_api_route raises when UI is disabled and mutates list during iteration

Issues

  1. ui_route may remain None (e.g., UI disabled). Calling self.mount(ui_route.path …) will raise AttributeError.
  2. Removing an element from self.routes while iterating over the same list can skip subsequent routes unpredictably.
  3. Nested if can be flattened (Ruff SIM102).

Proposed fix:

-        ui_route = None
-        for route in self.routes:
-            if isinstance(route, Mount):
-                if route.name == self.ui_config.ui_path:
-                    ui_route = route
-                    self.routes.remove(route)
+        ui_route: Optional[Mount] = None
+        for route in list(self.routes):  # iterate over a snapshot
+            if isinstance(route, Mount) and route.name == self.ui_config.ui_path:
+                ui_route = route
+                self.routes.remove(route)
+                break  # only one UI route expected
@@
-        super().add_api_route(*args, **kwargs)
-        self.mount(ui_route.path, ui_route.app, name=ui_route.name)
+        super().add_api_route(*args, **kwargs)
+
+        # Re-attach UI route at the end, if it exists
+        if ui_route is not None:
+            self.mount(ui_route.path, ui_route.app, name=ui_route.name)

This prevents runtime errors and satisfies the static-analysis hint.

🧰 Tools
🪛 Ruff (0.8.2)

248-249: Use a single if statement instead of nested if statements

(SIM102)

🧹 Nitpick comments (2)
python/llama-index-server/llama_index/server/server.py (2)

166-170: Potential name collision for mounted UI route
You pass name=self.ui_config.ui_path (default “.ui”).

  1. “.ui” is not very descriptive in route listings.
  2. If the same directory is mounted again elsewhere this will raise StarletteError: Route name [...] already exists.

Consider normalising to something stable and unique such as "ui-static" or "ui-static-<hash>".


210-226: _mount_static_files gains a name parameter – add uniqueness guard
Starlette will raise if two routes share the same name. Because callers can now pass an arbitrary name, _mount_static_files should defensively check for duplicates before mounting, e.g.:

 if os.path.exists(directory):
+    # Prevent accidental duplicate route names
+    if name and any(r.name == name for r in self.routes):
+        self.logger.warning("Static route %s already mounted, skipping", name)
+        return

This avoids hard-to-trace startup errors when multiple components attempt to mount the same folder.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e09de60 and 11ae75d.

📒 Files selected for processing (1)
  • python/llama-index-server/llama_index/server/server.py (5 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
python/llama-index-server/llama_index/server/server.py

248-249: Use a single if statement instead of nested if statements

(SIM102)


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🔭 Outside diff range comments (1)
python/llama-index-server/llama_index/server/server.py (1)

241-256: 🛠️ Refactor suggestion

⚠️ Potential issue

Avoid mutating self.routes while iterating & flatten nested if

Removing items from a list you are iterating over can silently skip elements and is generally fragile.
Ruff also flags the nested if (SIM102). A safer, cleaner approach:

-        ui_route = None
-        for route in self.routes:
-            if isinstance(route, Mount):
-                if route.name == self.ui_config.ui_path:
-                    ui_route = route
-                    self.routes.remove(route)
-        super().add_api_route(*args, **kwargs)
-        if ui_route:
-            self.mount(ui_route.path, ui_route.app, name=ui_route.name)
+        ui_route = next(
+            (r for r in self.routes
+             if isinstance(r, Mount) and r.name == self.ui_config.ui_path),
+            None,
+        )
+        if ui_route:
+            # Remove after iteration to avoid in-loop mutation pitfalls
+            self.routes.remove(ui_route)
+
+        super().add_api_route(*args, **kwargs)
+
+        if ui_route:
+            # Re-append UI route so it remains the last match
+            self.mount(ui_route.path, ui_route.app, name=ui_route.name)

Benefits
• No list mutation during iteration
• Single if branch complies with Ruff SIM102
• Clearer intent and easier maintenance
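For readers skimming the thread, the complete override this suggestion implies would look roughly like the sketch below. The surrounding class and the ui_config/ui_path attributes follow the PR's description rather than the exact source, and the stub _UIConfig exists only to keep the sketch self-contained:

from typing import Any, Optional

from fastapi import FastAPI
from starlette.routing import Mount


class _UIConfig:
    # Default route name used for the mounted UI static files (per this thread).
    ui_path: str = ".ui"


class LlamaIndexServer(FastAPI):
    def __init__(self, **kwargs: Any) -> None:
        super().__init__(**kwargs)
        self.ui_config = _UIConfig()

    def add_api_route(self, *args: Any, **kwargs: Any) -> None:
        # Find the mounted UI static-files route, if the UI has been mounted.
        ui_route: Optional[Mount] = next(
            (
                route
                for route in self.routes
                if isinstance(route, Mount) and route.name == self.ui_config.ui_path
            ),
            None,
        )
        if ui_route is not None:
            # Detach it so the new API route is registered ahead of the catch-all UI mount.
            self.routes.remove(ui_route)

        super().add_api_route(*args, **kwargs)

        if ui_route is not None:
            # Re-attach the UI mount last so it stays the lowest-priority match.
            self.mount(ui_route.path, ui_route.app, name=ui_route.name)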

🧰 Tools
🪛 Ruff (0.8.2)

249-250: Use a single if statement instead of nested if statements

(SIM102)

♻️ Duplicate comments (1)
python/llama-index-server/llama_index/server/server.py (1)

8-8: 🛠️ Refactor suggestion

Import Mount from Starlette for forward-compatibility

FastAPI merely re-exports Mount from Starlette; relying on that re-export is not part of FastAPI’s public-API guarantee and has broken in the past for some users. Importing directly from starlette.routing keeps you insulated from FastAPI version changes.

-from fastapi.routing import Mount
+from starlette.routing import Mount
🧹 Nitpick comments (1)
python/llama-index-server/llama_index/server/server.py (1)

211-227: Potential name collision & missing uniqueness check when mounting static files

FastAPI.mount() requires the name parameter to be unique across mounted apps.
Passing name=self.ui_config.ui_path (typically ".ui") risks collisions if:

  1. The same UI is mounted twice (e.g., during hot-reload in dev).
  2. Another piece of code elsewhere reuses the same name accidentally.

Consider guarding against duplicates:

-            self.mount(
-                path,
-                StaticFiles(directory=directory, check_dir=False, html=html),
-                name=name or f"{directory}-static",
-            )
+            mount_name = name or f"{directory}-static"
+            if any(rt.name == mount_name for rt in self.routes):
+                self.logger.warning("Static route '%s' already mounted, skipping", mount_name)
+                return
+            self.mount(
+                path,
+                StaticFiles(directory=directory, check_dir=False, html=html),
+                name=mount_name,
+            )
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 11ae75d and 51f70a9.

📒 Files selected for processing (1)
  • python/llama-index-server/llama_index/server/server.py (5 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
python/llama-index-server/llama_index/server/server.py

249-250: Use a single if statement instead of nested if statements

(SIM102)

⏰ Context from checks skipped due to timeout of 90000ms (28)
  • GitHub Check: build
  • GitHub Check: typescript (22, 3.11, ubuntu-22.04, nextjs, --llamacloud)
  • GitHub Check: typescript (22, 3.11, ubuntu-22.04, nextjs, --example-file)
  • GitHub Check: typescript (22, 3.11, ubuntu-22.04, nextjs, --no-files)
  • GitHub Check: typescript (22, 3.11, windows-latest, nextjs, --llamacloud)
  • GitHub Check: typescript (22, 3.11, windows-latest, nextjs, --example-file)
  • GitHub Check: typescript (22, 3.11, windows-latest, nextjs, --no-files)
  • GitHub Check: typescript (22, 3.11, macos-latest, nextjs, --llamacloud)
  • GitHub Check: typescript (22, 3.11, macos-latest, nextjs, --example-file)
  • GitHub Check: typescript (22, 3.11, macos-latest, nextjs, --no-files)
  • GitHub Check: typescript (20, 3.11, ubuntu-22.04, nextjs, --llamacloud)
  • GitHub Check: typescript (20, 3.11, ubuntu-22.04, nextjs, --example-file)
  • GitHub Check: typescript (20, 3.11, ubuntu-22.04, nextjs, --no-files)
  • GitHub Check: typescript (20, 3.11, windows-latest, nextjs, --llamacloud)
  • GitHub Check: python (20, 3.11, ubuntu-22.04, fastapi, --llamacloud)
  • GitHub Check: typescript (20, 3.11, windows-latest, nextjs, --example-file)
  • GitHub Check: python (20, 3.11, ubuntu-22.04, fastapi, --example-file)
  • GitHub Check: typescript (20, 3.11, windows-latest, nextjs, --no-files)
  • GitHub Check: python (20, 3.11, ubuntu-22.04, fastapi, --no-files)
  • GitHub Check: typescript (20, 3.11, macos-latest, nextjs, --llamacloud)
  • GitHub Check: python (20, 3.11, windows-latest, fastapi, --llamacloud)
  • GitHub Check: typescript (20, 3.11, macos-latest, nextjs, --example-file)
  • GitHub Check: python (20, 3.11, windows-latest, fastapi, --example-file)
  • GitHub Check: typescript (20, 3.11, macos-latest, nextjs, --no-files)
  • GitHub Check: python (20, 3.11, windows-latest, fastapi, --no-files)
  • GitHub Check: python (20, 3.11, macos-latest, fastapi, --llamacloud)
  • GitHub Check: python (20, 3.11, macos-latest, fastapi, --example-file)
  • GitHub Check: python (20, 3.11, macos-latest, fastapi, --no-files)


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

♻️ Duplicate comments (1)
python/llama-index-server/examples/artifact/document_workflow.py (1)

88-95: Token-efficiency: don’t dump full artifact JSON into the prompt

As with the code workflow, inserting the full JSON of the previous artifact can quickly bloat the prompt. Prefer a concise summary or an external reference.

🧹 Nitpick comments (2)
python/llama-index-server/examples/artifact/code_workflow.py (1)

318-325: Use detected language when the requirement omitted it

event.requirement.language can legitimately be None (the prompt allows it).
In that case the UI receives an empty string and loses syntax highlighting. You already have the language captured by the regex; prefer it as a fallback:

-    language=event.requirement.language or "",
+    language=event.requirement.language or code_match.group(1),
python/llama-index-server/examples/artifact/document_workflow.py (1)

28-32: Rename field type to avoid masking the built-in

Pydantic models allow it, but a field literally named type shadows Python’s built-in type, which can be confusing in stack-traces and IDE auto-completion. A clearer alternative is doc_type.
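One way to rename the field without changing the JSON wire format is a serialization alias. A pydantic v2 sketch (the model shape is inferred from the DocumentRequirement repr quoted earlier in this thread):

from typing import Literal

from pydantic import BaseModel, Field


class DocumentRequirement(BaseModel):
    # Internally doc_type, but still serialized and parsed as "type".
    doc_type: Literal["markdown", "html"] = Field(alias="type")
    title: str
    requirement: str

    model_config = {"populate_by_name": True}


req = DocumentRequirement(doc_type="markdown", title="Notes", requirement="Summarize the plan")
print(req.model_dump(by_alias=True))  # {'type': 'markdown', 'title': 'Notes', 'requirement': '...'}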

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 484cb9c and 88af75c.

📒 Files selected for processing (2)
  • python/llama-index-server/examples/artifact/code_workflow.py (1 hunks)
  • python/llama-index-server/examples/artifact/document_workflow.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
python/llama-index-server/examples/artifact/document_workflow.py (3)
python/llama-index-server/llama_index/server/api/models.py (5)
  • Artifact (175-195)
  • ArtifactType (158-160)
  • ChatRequest (32-41)
  • DocumentArtifactData (169-172)
  • UIEvent (147-155)
python/llama-index-server/llama_index/server/api/utils/chat_request.py (1)
  • get_last_artifact (21-23)
python/llama-index-server/examples/artifact/code_workflow.py (9)
  • PlanEvent (35-37)
  • GenerateArtifactEvent (40-41)
  • SynthesizeAnswerEvent (44-45)
  • UIEventData (48-50)
  • ArtifactWorkflow (53-360)
  • prepare_chat_history (78-107)
  • planning (110-223)
  • generate_artifact (226-328)
  • synthesize_answer (331-360)
⏰ Context from checks skipped due to timeout of 90000ms (27)
  • GitHub Check: typescript (22, 3.11, ubuntu-22.04, nextjs, --llamacloud)
  • GitHub Check: typescript (22, 3.11, ubuntu-22.04, nextjs, --example-file)
  • GitHub Check: typescript (22, 3.11, ubuntu-22.04, nextjs, --no-files)
  • GitHub Check: typescript (22, 3.11, windows-latest, nextjs, --llamacloud)
  • GitHub Check: typescript (22, 3.11, windows-latest, nextjs, --example-file)
  • GitHub Check: typescript (22, 3.11, windows-latest, nextjs, --no-files)
  • GitHub Check: typescript (22, 3.11, macos-latest, nextjs, --llamacloud)
  • GitHub Check: typescript (22, 3.11, macos-latest, nextjs, --example-file)
  • GitHub Check: typescript (22, 3.11, macos-latest, nextjs, --no-files)
  • GitHub Check: python (20, 3.11, ubuntu-22.04, fastapi, --llamacloud)
  • GitHub Check: typescript (20, 3.11, ubuntu-22.04, nextjs, --llamacloud)
  • GitHub Check: python (20, 3.11, ubuntu-22.04, fastapi, --example-file)
  • GitHub Check: typescript (20, 3.11, ubuntu-22.04, nextjs, --example-file)
  • GitHub Check: typescript (20, 3.11, ubuntu-22.04, nextjs, --no-files)
  • GitHub Check: python (20, 3.11, ubuntu-22.04, fastapi, --no-files)
  • GitHub Check: typescript (20, 3.11, windows-latest, nextjs, --llamacloud)
  • GitHub Check: python (20, 3.11, windows-latest, fastapi, --llamacloud)
  • GitHub Check: typescript (20, 3.11, windows-latest, nextjs, --example-file)
  • GitHub Check: python (20, 3.11, windows-latest, fastapi, --example-file)
  • GitHub Check: typescript (20, 3.11, windows-latest, nextjs, --no-files)
  • GitHub Check: python (20, 3.11, windows-latest, fastapi, --no-files)
  • GitHub Check: typescript (20, 3.11, macos-latest, nextjs, --llamacloud)
  • GitHub Check: python (20, 3.11, macos-latest, fastapi, --llamacloud)
  • GitHub Check: typescript (20, 3.11, macos-latest, nextjs, --example-file)
  • GitHub Check: python (20, 3.11, macos-latest, fastapi, --example-file)
  • GitHub Check: typescript (20, 3.11, macos-latest, nextjs, --no-files)
  • GitHub Check: python (20, 3.11, macos-latest, fastapi, --no-files)


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (6)
python/llama-index-server/examples/artifact/document_workflow.py (3)

168-173: ⚠️ Potential issue

Fix the JSON extraction regex to make it non-greedy

The current regex pattern for extracting JSON blocks is greedy, which could lead to incorrect matches if the response contains multiple code blocks.

Apply this change to make the regex non-greedy and case-insensitive:

-json_block = re.search(r"```json([\s\S]*)```", response.text)
+json_block = re.search(r"```json\s*([\s\S]+?)\s*```", response.text, re.IGNORECASE)

This will ensure it correctly extracts the first JSON block and handles whitespace properly. The re.IGNORECASE flag allows matching both json and JSON.
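As a self-contained illustration of that pattern (the helper name and the None fallback are assumptions for the sketch, not the PR's exact code):

import json
import re
from typing import Any, Optional

JSON_BLOCK_PATTERN = re.compile(r"```json\s*([\s\S]+?)\s*```", re.IGNORECASE)


def extract_json_block(text: str) -> Optional[Any]:
    """Parse the first ```json fenced block in an LLM response, if any."""
    match = JSON_BLOCK_PATTERN.search(text)
    if match is None:
        return None
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError:
        # Malformed JSON inside the fence; let the caller decide how to recover.
        return None


# extract_json_block('Here is the plan:\n```json\n{"next_step": "coding"}\n```')
# -> {'next_step': 'coding'}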


257-266: ⚠️ Potential issue

Fix the document block extraction regex

Similar to the JSON extraction issue, the regex pattern for extracting document content is greedy and will fail if the response contains nested code blocks.

Apply these changes to improve the regex and remove the type ignore:

-language_pattern = r"```(markdown|html)([\s\S]*)```"
+language_pattern = r"```(markdown|html)\s*([\s\S]+?)\s*```"
 doc_match = re.search(language_pattern, response.text)
 if doc_match is None:
     return SynthesizeAnswerEvent(
         requirement=event.requirement,
         generated_artifact="There is no change to the document. "
         + response.text.strip(),
     )
 content = doc_match.group(2).strip()
 doc_type = doc_match.group(1)
-    type=doc_type,  # type: ignore
+    type=doc_type,

Using the captured doc_type directly avoids the need for the # type: ignore comment.


96-99: 🛠️ Refactor suggestion

Avoid including the full artifact JSON in the prompt

Passing the complete serialized artifact as JSON directly into the prompt can be inefficient:

  1. It increases token usage unnecessarily
  2. May exceed context limits for large artifacts
  3. Reduces the effective context available for the LLM

Consider either:

  1. Extracting only the essential information from the artifact
  2. Summarizing the content for the context
  3. Using retrieval-augmented generation to reference the artifact by ID
-    context=str(self.last_artifact.model_dump_json())
-    if self.last_artifact
-    else "",
+    context=self._extract_artifact_summary()
+    if self.last_artifact
+    else "",

Add a helper method that includes only the necessary parts:

def _extract_artifact_summary(self) -> str:
    """Extract a concise summary of the artifact for context."""
    if not self.last_artifact:
        return ""
    
    data = self.last_artifact.data
    return f"Document type: {data.type}\nTitle: {data.title}\nContent preview: {data.content[:500]}..."
python/llama-index-server/examples/artifact/code_workflow.py (3)

293-299: ⚠️ Potential issue

Fix the code block extraction regex to make it non-greedy

The current regex pattern for extracting code blocks is greedy. This can cause issues if the response contains nested code blocks or multiple blocks.

Apply this change to make the regex non-greedy and handle whitespace properly:

-language_pattern = r"```(\w+)([\s\S]*)```"
+language_pattern = r"```(\w+)\s*([\s\S]+?)\s*```"
 code_match = re.search(language_pattern, response.text)
 if code_match is None:
     return SynthesizeAnswerEvent()
 else:
     code = code_match.group(2).strip()

This will ensure it correctly extracts each code block, preventing over-capture when multiple fenced blocks are present.


98-100: 🛠️ Refactor suggestion

Avoid including the full artifact JSON in the prompt

Serializing the entire artifact as JSON and including it in the prompt context is inefficient and can lead to excessive token usage.

Consider implementing a summarization method similar to the one suggested for the document workflow:

-    context=str(self.last_artifact.model_dump_json())
-    if self.last_artifact
-    else "",
+    context=self._extract_artifact_summary()
+    if self.last_artifact
+    else "",

And add a helper method:

def _extract_artifact_summary(self) -> str:
    """Extract a concise summary of the artifact for context."""
    if not self.last_artifact:
        return ""
    
    data = self.last_artifact.data
    # Provide a short preview instead of the full code
    code_preview = data.code[:300] + "..." if len(data.code) > 300 else data.code
    return f"File: {data.file_name}\nLanguage: {data.language}\nCode preview:\n{code_preview}"

283-287: 🛠️ Refactor suggestion

Avoid including the full artifact JSON in the prompt

Similar to the planning step, including the complete artifact JSON in the prompt can waste tokens and limit context.

Apply the same approach recommended earlier:

-    previous_artifact=self.last_artifact.model_dump_json()
-    if self.last_artifact
-    else "",
+    previous_artifact=self._extract_artifact_summary(include_full_code=True)
+    if self.last_artifact
+    else "",

Update the helper method to optionally include full code when needed:

def _extract_artifact_summary(self, include_full_code: bool = False) -> str:
    """Extract a summary of the artifact for context."""
    if not self.last_artifact:
        return ""
    
    data = self.last_artifact.data
    if include_full_code:
        return f"```{data.language}\n{data.code}\n```"
    else:
        code_preview = data.code[:300] + "..." if len(data.code) > 300 else data.code
        return f"File: {data.file_name}\nLanguage: {data.language}\nCode preview:\n{code_preview}"
🧹 Nitpick comments (3)
python/llama-index-server/examples/artifact/document_workflow.py (1)

277-278: Use consistent event type for artifacts

The event type "artifact" is used here, which works but seems inconsistent with our naming patterns.

For consistency and clarity, consider renaming this to match other UI event types or class names:

 UIEvent(
-    type="artifact",
+    type="artifact_event",
     data=Artifact(
python/llama-index-server/examples/artifact/code_workflow.py (2)

315-318: Handle empty language and file_name properly

The code is directly using event.requirement.language or "" for potential None values, which is good. However, consider validating or providing default values that are more descriptive.

 data=CodeArtifactData(
-    language=event.requirement.language or "",
-    file_name=event.requirement.file_name or "",
+    language=event.requirement.language or "text",
+    file_name=event.requirement.file_name or "untitled.txt",
     code=code,
 ),

This makes the UI experience better by providing meaningful defaults rather than empty strings.


352-353: Add requirement to UI event data for consistency

The completed state UI event is missing the requirement field that's used in the document workflow.

 data=UIEventData(
     state="completed",
+    requirement=None,
 ),

This maintains consistency with the document workflow implementation.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 88af75c and 6104c3d.

📒 Files selected for processing (2)
  • python/llama-index-server/examples/artifact/code_workflow.py (1 hunks)
  • python/llama-index-server/examples/artifact/document_workflow.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
python/llama-index-server/examples/artifact/code_workflow.py (3)
python/llama-index-server/llama_index/server/api/models.py (5)
  • Artifact (175-195)
  • ArtifactType (158-160)
  • ChatRequest (32-41)
  • CodeArtifactData (163-166)
  • UIEvent (147-155)
python/llama-index-server/llama_index/server/api/utils/chat_request.py (1)
  • get_last_artifact (21-23)
python/llama-index-server/examples/artifact/document_workflow.py (6)
  • PlanEvent (35-37)
  • GenerateArtifactEvent (40-41)
  • SynthesizeAnswerEvent (44-46)
  • UIEventData (49-51)
  • ArtifactWorkflow (54-326)
  • planning (102-195)

@leehuwuj leehuwuj merged commit 3130cdf into main Apr 28, 2025
33 checks passed