fix: flatten before detect #1361

nikhilsinhaparseable · 2025-06-28T11:38:04Z

flatten the input JSON
convert it to a JSON array
use the same key separator as the ingestion flow
then infer the schema

Summary by CodeRabbit

Bug Fixes
- Improved error handling for deeply nested JSON inputs during schema detection.
- Enhanced feedback when JSON flattening fails, providing clearer error messages.
New Features
- Added automatic flattening of JSON data before schema inference for better compatibility with various JSON structures.

coderabbitai · 2025-06-28T11:38:22Z

Walkthrough

The detect_schema function in src/handlers/http/logstream.rs was updated to introduce multiple JSON flattening steps before schema inference. The function now checks for excessive nesting, applies flattening, and handles errors for deeply nested or unflattenable JSON, returning appropriate error responses when necessary.

Changes

| File(s) | Change Summary |
| --- | --- |
| src/handlers/http/logstream.rs | Modified `detect_schema` to check for deep nesting, perform JSON flattening, handle flattening errors, and reject overly nested JSON before schema inference. No signature changes. |
| build.rs | Updated `println!` macro to use inline variable interpolation syntax `{ui_version}` instead of positional `{}` for setting `UI_VERSION` environment variable during build. |
| src/alerts/alerts_utils.rs, src/alerts/mod.rs, src/catalog/mod.rs, src/cli.rs, src/connectors/common/mod.rs, src/connectors/kafka/config.rs, src/handlers/airplane.rs, src/handlers/http/cluster/mod.rs, src/handlers/http/cluster/utils.rs, src/handlers/http/llm.rs, src/handlers/http/mod.rs, src/handlers/http/modal/ingest_server.rs, src/handlers/http/oidc.rs, src/handlers/http/query.rs, src/handlers/http/users/dashboards.rs, src/handlers/http/users/filters.rs, src/hottier.rs, src/option.rs, src/otel/metrics.rs, src/parseable/mod.rs, src/rbac/utils.rs, src/storage/azure_blob.rs, src/storage/localfs.rs, src/storage/object_storage.rs, src/storage/s3.rs, src/storage/store_metadata.rs, src/users/filters.rs, src/utils/arrow/flight.rs, src/utils/human_size.rs, src/utils/json/mod.rs, src/utils/mod.rs | Modernized string formatting across multiple files by replacing positional `{}` placeholders with Rust's inline named variable interpolation `{var}` syntax. No changes to logic, control flow, or error handling. |
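
For illustration, the formatting change summarized above can be sketched as follows. The directive string and the version value are assumptions made for this example, not lines copied from `build.rs`; the sketch only shows that the positional and inline forms produce identical output.

```rust
// Positional placeholder: the argument is passed separately after the format string.
fn positional(ui_version: &str) -> String {
    format!("cargo:rustc-env=UI_VERSION={}", ui_version)
}

// Inline capture: the identifier is named inside the braces.
fn inline(ui_version: &str) -> String {
    format!("cargo:rustc-env=UI_VERSION={ui_version}")
}

fn main() {
    let ui_version = "v0.0.0"; // made-up value for the example
    assert_eq!(positional(ui_version), inline(ui_version));
    println!("{}", inline(ui_version));
}
```

Inline capture only accepts plain identifiers. An expression such as a field access or method call still has to be passed as a positional or named argument.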

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Handler as Handler (detect_schema)
    participant Flatten Utils

    Client->>Handler: Send JSON input
    Handler->>Handler: Check nesting level
    alt Too deeply nested
        Handler-->>Client: Return BAD_REQUEST (cannot flatten)
    else Not deeply nested
        Handler->>Flatten Utils: generic_flattening
        alt Flattening fails
            Handler-->>Client: Return BAD_REQUEST (flattening failed)
        else Flattening succeeds
            Handler->>Flatten Utils: flatten::flatten
            alt Detailed flattening fails
                Handler-->>Client: Return BAD_REQUEST (flattening failed)
            else Flattened JSON ready
                Handler->>Handler: Convert to array & infer schema
                Handler-->>Client: Return inferred schema
            end
        end
    end

Possibly related PRs

perf: save on clone cost during ingestion #1178: Modifies convert_to_array in flatten.rs, which is likely used by the flattening utilities now invoked by detect_schema, establishing a direct code-level relationship.

Suggested labels

for next release

Poem

A warren of JSON, nested and deep,
Now flattened before schema we keep.
If levels are many, a warning we send—
"Too nested!" we say, and politely defend.
But if all goes well, the schema appears,
Flattened and ready—let's give three cheers!
🐇✨

📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 228428e and 95f4782.

📒 Files selected for processing (32)

src/alerts/alerts_utils.rs (3 hunks)
src/alerts/mod.rs (1 hunks)
src/catalog/mod.rs (1 hunks)
src/cli.rs (4 hunks)
src/connectors/common/mod.rs (1 hunks)
src/connectors/kafka/config.rs (3 hunks)
src/handlers/airplane.rs (1 hunks)
src/handlers/http/cluster/mod.rs (1 hunks)
src/handlers/http/cluster/utils.rs (1 hunks)
src/handlers/http/llm.rs (1 hunks)
src/handlers/http/logstream.rs (2 hunks)
src/handlers/http/mod.rs (1 hunks)
src/handlers/http/modal/ingest_server.rs (1 hunks)
src/handlers/http/oidc.rs (1 hunks)
src/handlers/http/query.rs (2 hunks)
src/handlers/http/users/dashboards.rs (3 hunks)
src/handlers/http/users/filters.rs (3 hunks)
src/hottier.rs (2 hunks)
src/option.rs (1 hunks)
src/otel/metrics.rs (1 hunks)
src/parseable/mod.rs (1 hunks)
src/rbac/utils.rs (1 hunks)
src/storage/azure_blob.rs (2 hunks)
src/storage/localfs.rs (1 hunks)
src/storage/object_storage.rs (1 hunks)
src/storage/s3.rs (1 hunks)
src/storage/store_metadata.rs (2 hunks)
src/users/filters.rs (1 hunks)
src/utils/arrow/flight.rs (1 hunks)
src/utils/human_size.rs (1 hunks)
src/utils/json/mod.rs (1 hunks)
src/utils/mod.rs (2 hunks)

✅ Files skipped from review due to trivial changes (24)

src/option.rs
src/connectors/common/mod.rs
src/handlers/http/llm.rs
src/rbac/utils.rs
src/handlers/http/users/filters.rs
src/parseable/mod.rs
src/handlers/http/modal/ingest_server.rs
src/storage/localfs.rs
src/storage/s3.rs
src/cli.rs
src/catalog/mod.rs
src/handlers/airplane.rs
src/handlers/http/cluster/mod.rs
src/storage/store_metadata.rs
src/connectors/kafka/config.rs
src/utils/arrow/flight.rs
src/handlers/http/oidc.rs
src/otel/metrics.rs
src/storage/azure_blob.rs
src/handlers/http/query.rs
src/alerts/mod.rs
src/storage/object_storage.rs
src/alerts/alerts_utils.rs
src/hottier.rs

🧰 Additional context used

🧠 Learnings (6)

📓 Common learnings

Learnt from: nikhilsinhaparseable
PR: parseablehq/parseable#1329
File: src/handlers/http/kinesis.rs:89-98
Timestamp: 2025-05-28T04:45:40.879Z
Learning: In Kinesis log processing, when handling deeply nested JSON that exceeds flattening limits, the original JSON structure must be preserved as-is. If the JSON is an array at the root level, attempting to add metadata fields like `requestId` and `timestamp` would fail because arrays don't support direct key-value insertion. The flattening process handles arrays by converting elements to individual objects, but when flattening is skipped, the original array structure must remain intact to prevent data loss.

Learnt from: de-sh
PR: parseablehq/parseable#1185
File: src/handlers/http/logstream.rs:255-261
Timestamp: 2025-02-14T09:49:25.818Z
Learning: In Parseable's logstream handlers, stream existence checks must be performed for both query and standalone modes. The pattern `!PARSEABLE.streams.contains(&stream_name) && (PARSEABLE.options.mode != Mode::Query || !PARSEABLE.create_stream_and_schema_from_storage(&stream_name).await?)` ensures proper error handling in both modes.

Learnt from: nikhilsinhaparseable
PR: parseablehq/parseable#1271
File: src/prism/home/mod.rs:207-224
Timestamp: 2025-03-26T06:13:48.898Z
Learning: In the Parseable codebase, if a stream is found, the stream_jsons array will always have at least one element. Additionally, for any valid stream_json object, the log_source array will always have at least one element. This is a design invariant that makes additional null checks unnecessary.

Learnt from: nikhilsinhaparseable
PR: parseablehq/parseable#1346
File: src/handlers/http/health_check.rs:81-90
Timestamp: 2025-06-16T02:04:58.990Z
Learning: In the shutdown function in src/handlers/http/health_check.rs, the design approach is to log errors from sync operations rather than propagate them. This is intentional because the shutdown function is called on SIGTERM/SIGINT signals, and the goal is to perform best-effort cleanup (syncing pending files to object storage) while allowing the shutdown to proceed regardless of sync failures. Logging provides debugging information without blocking the shutdown process.

src/handlers/http/users/dashboards.rs (4)

Learnt from: nikhilsinhaparseable
PR: parseablehq/parseable#1305
File: src/handlers/http/users/dashboards.rs:0-0
Timestamp: 2025-05-01T10:27:56.858Z
Learning: The `add_tile()` function in `src/handlers/http/users/dashboards.rs` should use `get_dashboard_by_user(dashboard_id, &user_id)` instead of `get_dashboard(dashboard_id)` to ensure proper authorization checks when modifying a dashboard.

Learnt from: nikhilsinhaparseable
PR: parseablehq/parseable#1305
File: src/users/dashboards.rs:154-165
Timestamp: 2025-05-01T12:22:42.363Z
Learning: Title validation for dashboards is performed in the `create_dashboard` HTTP handler function rather than in the `DASHBOARDS.create` method, avoiding redundant validation.

Learnt from: nikhilsinhaparseable
PR: parseablehq/parseable#1305
File: src/handlers/http/users/dashboards.rs:63-71
Timestamp: 2025-05-01T10:22:45.695Z
Learning: Dashboard design in Parseable follows a sharing model where any user can read (list/get) any dashboard, enabling dashboard sharing functionality, but only the dashboard author can modify (update/delete) their own dashboards.

Learnt from: nikhilsinhaparseable
PR: parseablehq/parseable#1348
File: src/prism/home/mod.rs:366-368
Timestamp: 2025-06-15T18:18:14.590Z
Learning: In the Parseable dashboard system, dashboard_id is guaranteed to always be present (never None) because IDs are generated at dashboard creation time, and there are no pre-v1 dashboards that could lack IDs.

src/handlers/http/mod.rs (1)

Learnt from: de-sh
PR: parseablehq/parseable#1185
File: src/handlers/http/logstream.rs:255-261
Timestamp: 2025-02-14T09:49:25.818Z
Learning: In Parseable's logstream handlers, stream existence checks must be performed for both query and standalone modes. The pattern `!PARSEABLE.streams.contains(&stream_name) && (PARSEABLE.options.mode != Mode::Query || !PARSEABLE.create_stream_and_schema_from_storage(&stream_name).await?)` ensures proper error handling in both modes.

src/utils/human_size.rs (1)

Learnt from: vkhinvasara
PR: parseablehq/parseable#1352
File: src/handlers/http/resource_check.rs:70-74
Timestamp: 2025-06-18T08:43:30.112Z
Learning: The sysinfo crate (version 0.30+) returns memory values from `used_memory()` and `total_memory()` in bytes, not kilobytes. Converting bytes to GiB requires dividing by 1024³ (1,073,741,824). In older versions (pre-0.30), the methods returned kilobytes.
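
As a quick check of the arithmetic in this learning, here is a minimal sketch. The helper name `bytes_to_gib` and the sample reading are hypothetical, and the `sysinfo` calls are left out so the snippet has no dependencies.

```rust
// 1 GiB = 1024^3 bytes = 1,073,741,824 bytes.
const BYTES_PER_GIB: u64 = 1024 * 1024 * 1024;

// Hypothetical helper: convert a byte count to GiB.
fn bytes_to_gib(bytes: u64) -> f64 {
    bytes as f64 / BYTES_PER_GIB as f64
}

fn main() {
    let total_bytes: u64 = 16 * BYTES_PER_GIB; // made-up reading, in bytes
    println!("{:.1} GiB", bytes_to_gib(total_bytes));
}
```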

src/utils/mod.rs (1)

Learnt from: nikhilsinhaparseable
PR: parseablehq/parseable#1340
File: src/query/mod.rs:64-66
Timestamp: 2025-06-18T06:39:04.775Z
Learning: In src/query/mod.rs, QUERY_SESSION_STATE and QUERY_SESSION serve different architectural purposes: QUERY_SESSION_STATE is used for stats calculation and allows dynamic registration of individual parquet files from the staging path (files created every minute), while QUERY_SESSION is used for object store queries with the global schema provider. Session contexts with schema providers don't support registering individual tables/parquets, so both session objects are necessary for their respective use cases.

src/handlers/http/logstream.rs (6)

Learnt from: nikhilsinhaparseable
PR: parseablehq/parseable#1329
File: src/handlers/http/kinesis.rs:89-98
Timestamp: 2025-05-28T04:45:40.879Z
Learning: In Kinesis log processing, when handling deeply nested JSON that exceeds flattening limits, the original JSON structure must be preserved as-is. If the JSON is an array at the root level, attempting to add metadata fields like `requestId` and `timestamp` would fail because arrays don't support direct key-value insertion. The flattening process handles arrays by converting elements to individual objects, but when flattening is skipped, the original array structure must remain intact to prevent data loss.

Learnt from: nikhilsinhaparseable
PR: parseablehq/parseable#1271
File: src/prism/home/mod.rs:207-224
Timestamp: 2025-03-26T06:13:48.898Z
Learning: In the Parseable codebase, if a stream is found, the stream_jsons array will always have at least one element. Additionally, for any valid stream_json object, the log_source array will always have at least one element. This is a design invariant that makes additional null checks unnecessary.

Learnt from: de-sh
PR: parseablehq/parseable#1185
File: src/handlers/http/logstream.rs:255-261
Timestamp: 2025-02-14T09:49:25.818Z
Learning: In Parseable's logstream handlers, stream existence checks must be performed for both query and standalone modes. The pattern `!PARSEABLE.streams.contains(&stream_name) && (PARSEABLE.options.mode != Mode::Query || !PARSEABLE.create_stream_and_schema_from_storage(&stream_name).await?)` ensures proper error handling in both modes.

Learnt from: nikhilsinhaparseable
PR: parseablehq/parseable#1346
File: src/handlers/http/health_check.rs:81-90
Timestamp: 2025-06-16T02:04:58.990Z
Learning: In the shutdown function in src/handlers/http/health_check.rs, the design approach is to log errors from sync operations rather than propagate them. This is intentional because the shutdown function is called on SIGTERM/SIGINT signals, and the goal is to perform best-effort cleanup (syncing pending files to object storage) while allowing the shutdown to proceed regardless of sync failures. Logging provides debugging information without blocking the shutdown process.

Learnt from: nikhilsinhaparseable
PR: parseablehq/parseable#1288
File: src/handlers/http/modal/mod.rs:279-301
Timestamp: 2025-04-07T13:23:10.092Z
Learning: For critical operations like writing metadata to disk in NodeMetadata::put_on_disk(), it's preferred to let exceptions propagate (using expect/unwrap) rather than trying to recover with fallback mechanisms, as the failure indicates a fundamental system issue that needs immediate attention.

Learnt from: de-sh
PR: parseablehq/parseable#0
File: :0-0
Timestamp: 2025-03-20T15:50:45.435Z
Learning: Pay close attention to code comments for typos and semantic clarity during reviews for the Parseable project.

🧬 Code Graph Analysis (2)

src/handlers/http/users/dashboards.rs (1)

src/storage/object_storage.rs (1)

dashboard_path (1312-1314)

src/handlers/http/cluster/utils.rs (1)

src/cli.rs (1)

format (616-617)

⏰ Context from checks skipped due to timeout of 90000ms (10)

GitHub Check: Build Default x86_64-pc-windows-msvc
GitHub Check: Build Default x86_64-unknown-linux-gnu
GitHub Check: Build Default aarch64-apple-darwin
GitHub Check: Build Default x86_64-apple-darwin
GitHub Check: Build Default aarch64-unknown-linux-gnu
GitHub Check: Build Kafka aarch64-apple-darwin
GitHub Check: Build Kafka x86_64-unknown-linux-gnu
GitHub Check: Quest Smoke and Load Tests for Standalone deployments
GitHub Check: Quest Smoke and Load Tests for Distributed deployments
GitHub Check: coverage

🔇 Additional comments (15)

src/utils/json/mod.rs (1)

114-114: LGTM - String formatting modernization.

Good update to use Rust's inline variable interpolation syntax, improving code readability while maintaining identical functionality.

src/handlers/http/mod.rs (1)

92-92: LGTM - Path formatting improvement.

The inline variable interpolation makes the path construction more readable while maintaining the same functionality.

src/handlers/http/cluster/utils.rs (1)

206-206: LGTM - URL formatting modernization.

The inline variable interpolation improves readability while preserving the URL construction logic.

src/utils/human_size.rs (1)

42-42: LGTM - Byte formatting improvement.

Good modernization of the string formatting syntax while maintaining identical output.

src/utils/mod.rs (2)

56-56: LGTM - Time formatting improvement.

The inline variable interpolation makes it clearer which variables are being formatted into the time string.

85-85: LGTM - Error message formatting improvement.

The inline variable interpolation improves readability of the error message construction.

src/handlers/http/logstream.rs (7)

33-35: LGTM!

The new imports for JSON flattening utilities are properly added to support the enhanced detect_schema function.

108-127: Approve the nesting check and generic flattening logic.

The implementation correctly:
- Checks nesting levels before attempting flattening
- Uses proper error handling instead of unwrap()
- Provides descriptive error messages for debugging

The error handling aligns with the retrieved learning about preserving JSON structure when flattening exceeds limits.

128-133: LGTM!

The detailed flattening step is properly implemented with appropriate error handling and follows the established pattern for error conversion.

134-138: LGTM!

The array conversion logic correctly handles both array and object cases that can result from the flattening process.
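
A minimal sketch of that normalization step, assuming a stand-in `Json` enum in place of the project's real JSON type. The name `to_records` is hypothetical, and returning an error for other value types is a choice made for this example rather than a description of the handler.

```rust
use std::collections::BTreeMap;

// Stand-in JSON type so the sketch has no external dependencies.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Num(f64),
    Str(String),
    Array(Vec<Json>),
    Object(BTreeMap<String, Json>),
}

// Small helper for building objects in the example.
fn obj(pairs: Vec<(&str, Json)>) -> Json {
    Json::Object(pairs.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}

// Hypothetical helper: an array is already a list of records,
// a single object becomes a one-element list.
fn to_records(flattened: Json) -> Result<Vec<Json>, String> {
    match flattened {
        Json::Array(items) => Ok(items),
        object @ Json::Object(_) => Ok(vec![object]),
        other => Err(format!("expected object or array, got {other:?}")),
    }
}

fn main() {
    let single = obj(vec![("a", Json::Num(1.0))]);
    println!("{:?}", to_records(single));

    let many = Json::Array(vec![
        obj(vec![("a", Json::Num(1.0))]),
        obj(vec![("a", Json::Num(2.0))]),
    ]);
    println!("{:?}", to_records(many));
}
```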

139-147: LGTM!

The schema inference properly handles errors with descriptive messages. The previous unwrap() issue has been correctly addressed with proper error handling.

148-150: LGTM!

The data type override logic correctly processes each flattened JSON object to refine the inferred schema.

152-161: LGTM!

The error handling for deeply nested JSON provides clear feedback to users and uses the appropriate HTTP status code for client errors.

src/users/filters.rs (1)

137-137: LGTM!

The string formatting has been modernized to use Rust's inline variable interpolation syntax, improving readability.

src/handlers/http/users/dashboards.rs (1)

86-86: LGTM!

The string formatting has been consistently modernized across all dashboard operations (post, update, delete) to use Rust's inline variable interpolation syntax.

Also applies to: 123-123, 148-148

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

src/handlers/http/logstream.rs (1)

137-143: Consider providing more detailed error message.

The error message for heavily nested JSON could be more helpful by indicating the actual nesting level or the configured limit.

-        Err(StreamError::Custom {
-            msg: "heavily nested, cannot flatten this JSON".to_string(),
-            status: StatusCode::BAD_REQUEST,
-        })
+        Err(StreamError::Custom {
+            msg: format!("JSON is too deeply nested (exceeds level {}), cannot flatten", 
+                        PARSEABLE.options.event_flatten_level),
+            status: StatusCode::BAD_REQUEST,
+        })

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 04078cf and a843a5f.

📒 Files selected for processing (1)

src/handlers/http/logstream.rs (2 hunks)

🧰 Additional context used

🧠 Learnings (2)

📓 Common learnings

Learnt from: nikhilsinhaparseable
PR: parseablehq/parseable#1329
File: src/handlers/http/kinesis.rs:89-98
Timestamp: 2025-05-28T04:45:40.879Z
Learning: In Kinesis log processing, when handling deeply nested JSON that exceeds flattening limits, the original JSON structure must be preserved as-is. If the JSON is an array at the root level, attempting to add metadata fields like `requestId` and `timestamp` would fail because arrays don't support direct key-value insertion. The flattening process handles arrays by converting elements to individual objects, but when flattening is skipped, the original array structure must remain intact to prevent data loss.

Learnt from: de-sh
PR: parseablehq/parseable#1185
File: src/handlers/http/logstream.rs:255-261
Timestamp: 2025-02-14T09:49:25.818Z
Learning: In Parseable's logstream handlers, stream existence checks must be performed for both query and standalone modes. The pattern `!PARSEABLE.streams.contains(&stream_name) && (PARSEABLE.options.mode != Mode::Query || !PARSEABLE.create_stream_and_schema_from_storage(&stream_name).await?)` ensures proper error handling in both modes.

Learnt from: nikhilsinhaparseable
PR: parseablehq/parseable#1271
File: src/prism/home/mod.rs:207-224
Timestamp: 2025-03-26T06:13:48.898Z
Learning: In the Parseable codebase, if a stream is found, the stream_jsons array will always have at least one element. Additionally, for any valid stream_json object, the log_source array will always have at least one element. This is a design invariant that makes additional null checks unnecessary.

Learnt from: nikhilsinhaparseable
PR: parseablehq/parseable#1346
File: src/handlers/http/health_check.rs:81-90
Timestamp: 2025-06-16T02:04:58.990Z
Learning: In the shutdown function in src/handlers/http/health_check.rs, the design approach is to log errors from sync operations rather than propagate them. This is intentional because the shutdown function is called on SIGTERM/SIGINT signals, and the goal is to perform best-effort cleanup (syncing pending files to object storage) while allowing the shutdown to proceed regardless of sync failures. Logging provides debugging information without blocking the shutdown process.

src/handlers/http/logstream.rs (1)

Learnt from: nikhilsinhaparseable
PR: parseablehq/parseable#1329
File: src/handlers/http/kinesis.rs:89-98
Timestamp: 2025-05-28T04:45:40.879Z
Learning: In Kinesis log processing, when handling deeply nested JSON that exceeds flattening limits, the original JSON structure must be preserved as-is. If the JSON is an array at the root level, attempting to add metadata fields like `requestId` and `timestamp` would fail because arrays don't support direct key-value insertion. The flattening process handles arrays by converting elements to individual objects, but when flattening is skipped, the original array structure must remain intact to prevent data loss.

🧬 Code Graph Analysis (1)

src/handlers/http/logstream.rs (2)

src/utils/json/flatten.rs (4)

flatten (58-93)

convert_to_array (351-357)

generic_flattening (269-328)

has_more_than_max_allowed_levels (335-348)

src/event/format/mod.rs (2)

new (126-131)

override_data_type (324-368)

⏰ Context from checks skipped due to timeout of 90000ms (10)

GitHub Check: Build Default x86_64-pc-windows-msvc
GitHub Check: Build Default x86_64-apple-darwin
GitHub Check: Build Default x86_64-unknown-linux-gnu
GitHub Check: Build Kafka x86_64-unknown-linux-gnu
GitHub Check: Build Default aarch64-apple-darwin
GitHub Check: Build Default aarch64-unknown-linux-gnu
GitHub Check: Build Kafka aarch64-apple-darwin
GitHub Check: coverage
GitHub Check: Quest Smoke and Load Tests for Distributed deployments
GitHub Check: Quest Smoke and Load Tests for Standalone deployments

🔇 Additional comments (5)

src/handlers/http/logstream.rs (5)

33-35: LGTM: Import additions are appropriate.

The imported flatten utilities are necessary for the new flattening functionality in the detect_schema function.

120-125: Approve the detailed flattening logic.

The detailed flattening step with separator "_" and proper error handling is correctly implemented. The error message could be more descriptive, but the logic is sound.

126-130: Good pattern matching for array conversion.

The logic correctly handles both Array and Object cases from the flattened JSON, and the unreachable case is appropriate since the flatten function would have failed earlier for invalid types.

109-109: No changes required for nesting level check

Starting at current_level = 1 correctly treats the top-level JSON as level 1, aligns with the default event_flatten_level (10), and matches its use elsewhere in the codebase.
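
To make the level counting concrete, here is a minimal sketch with the top-level value at level 1. It uses a stand-in `Json` enum, and `exceeds_max_level` is a hypothetical name, not the project's `has_more_than_max_allowed_levels`. Counting both arrays and objects as a level, and ignoring scalars, are assumptions of this example.

```rust
use std::collections::BTreeMap;

// Stand-in JSON type so the sketch has no external dependencies.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Num(f64),
    Str(String),
    Array(Vec<Json>),
    Object(BTreeMap<String, Json>),
}

// Small helper for building objects in the example.
fn obj(pairs: Vec<(&str, Json)>) -> Json {
    Json::Object(pairs.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}

// Returns true when some container sits deeper than `max_level`.
// Callers pass `current_level = 1` for the top-level value.
fn exceeds_max_level(value: &Json, current_level: usize, max_level: usize) -> bool {
    match value {
        Json::Array(items) => {
            current_level > max_level
                || items.iter().any(|v| exceeds_max_level(v, current_level + 1, max_level))
        }
        Json::Object(map) => {
            current_level > max_level
                || map.values().any(|v| exceeds_max_level(v, current_level + 1, max_level))
        }
        _ => false, // scalars do not add a level in this sketch
    }
}

fn main() {
    // Three nested objects: {"a": {"b": {"c": 1}}}
    let nested = obj(vec![("a", obj(vec![("b", obj(vec![("c", Json::Num(1.0))]))]))]);
    println!("limit 10: {}", exceeds_max_level(&nested, 1, 10));
    println!("limit 2:  {}", exceeds_max_level(&nested, 1, 2));
}
```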

107-144: Two-stage flattening is intentional and follows established pattern

I’ve confirmed that this same sequence—calling generic_flattening to expand arrays followed by flatten::flatten to flatten nested objects—is used in src/utils/json/mod.rs. The two functions are complementary (one handles array branching, the other handles object nesting) and do not conflict. No changes needed.
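
A minimal sketch of the two complementary stages, using a stand-in `Json` enum rather than the project's real types. The names `expand_arrays`, `flatten_object`, and `flatten_records` are hypothetical, and the treatment of empty arrays and arrays of scalars is a simplification. This is not the code in src/utils/json/flatten.rs.

```rust
use std::collections::BTreeMap;

// Stand-in JSON type so the sketch has no external dependencies.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Num(f64),
    Str(String),
    Array(Vec<Json>),
    Object(BTreeMap<String, Json>),
}

type Record = BTreeMap<String, Json>;

// Small helper for building objects in the example.
fn obj(pairs: Vec<(&str, Json)>) -> Json {
    Json::Object(pairs.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}

// Stage 1 (hypothetical): every array element becomes its own branch,
// so an object holding an N-element array yields N records.
fn expand_arrays(value: &Json) -> Vec<Json> {
    match value {
        Json::Array(items) => items.iter().flat_map(expand_arrays).collect(),
        Json::Object(map) => {
            let mut records: Vec<Record> = vec![Record::new()];
            for (key, val) in map {
                let variants = expand_arrays(val);
                if variants.is_empty() {
                    continue; // simplification: an empty array drops its key
                }
                let mut next = Vec::with_capacity(records.len() * variants.len());
                for record in &records {
                    for variant in &variants {
                        let mut branch = record.clone();
                        branch.insert(key.clone(), variant.clone());
                        next.push(branch);
                    }
                }
                records = next;
            }
            records.into_iter().map(Json::Object).collect()
        }
        scalar => vec![scalar.clone()],
    }
}

// Stage 2 (hypothetical): join nested object keys with `sep`.
fn flatten_object(prefix: Option<&str>, map: &Record, sep: &str, out: &mut Record) {
    for (key, val) in map {
        let full = match prefix {
            Some(p) => format!("{p}{sep}{key}"),
            None => key.clone(),
        };
        match val {
            Json::Object(inner) => flatten_object(Some(full.as_str()), inner, sep, out),
            other => {
                out.insert(full, other.clone());
            }
        }
    }
}

// Run both stages in order; non-object records are skipped in this sketch.
fn flatten_records(value: &Json, sep: &str) -> Vec<Record> {
    expand_arrays(value)
        .into_iter()
        .filter_map(|record| match record {
            Json::Object(map) => {
                let mut out = Record::new();
                flatten_object(None, &map, sep, &mut out);
                Some(out)
            }
            _ => None,
        })
        .collect()
}

fn main() {
    // {"a": [{"b": 1}, {"c": 2}], "d": {"e": 4}}
    let input = obj(vec![
        ("a", Json::Array(vec![
            obj(vec![("b", Json::Num(1.0))]),
            obj(vec![("c", Json::Num(2.0))]),
        ])),
        ("d", obj(vec![("e", Json::Num(4.0))])),
    ]);
    for record in flatten_records(&input, "_") {
        println!("{record:?}");
    }
}
```

With this input the array under `a` branches into two records, and each record then has its nested keys joined with `_` (for example `a_b` and `d_e`).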

src/handlers/http/logstream.rs

fix: flatten before detect

a843a5f

coderabbitai bot requested changes Jun 28, 2025

refactor

228428e

coderabbitai bot previously approved these changes Jun 28, 2025

clippy fix

95f4782

nikhilsinhaparseable dismissed coderabbitai[bot]’s stale review via 95f4782 June 28, 2025 13:12

coderabbitai bot approved these changes Jun 28, 2025

nitisht merged commit 32d105e into parseablehq:main Jun 28, 2025
13 of 14 checks passed
