-
Notifications
You must be signed in to change notification settings - Fork 134
feat: add schema indexes #1138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: add schema indexes #1138
Conversation
WalkthroughSeveral schema files were updated to add or modify database indexes. These changes include adding single-field indexes to properties such as Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant MongoDB
Note over Client,MongoDB: Example: Query with indexed fields
Client->>MongoDB: Find documents by indexed field (e.g., title, name, text)
MongoDB-->>Client: Return results using index for faster lookup
sequenceDiagram
participant MongoDB
Note over MongoDB: On insert or update
MongoDB->>MongoDB: Enforce unique compound index (e.g., value+entity)
alt Duplicate found
MongoDB-->>Client: Error (duplicate key)
else No duplicate
MongoDB-->>Client: Insert/Update successful
end
Poem
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🔭 Outside diff range comments (1)
api/src/nlp/schemas/nlp-value.schema.ts (1)
35-36
: Conflicting uniqueness constraints
value
is still markedunique: true
, yet you just added a compound unique index(value, entity)
.
The single-field unique index will forbid the samevalue
across different entities, negating the purpose of the new compound index and potentially blocking existing data.Diff to remove the obsolete unique flag:
- @Prop({ type: String, required: true, unique: true }) + @Prop({ type: String, required: true })
🧹 Nitpick comments (1)
api/src/nlp/schemas/nlp-sample.schema.ts (1)
29-30
: Large text field index – monitor index sizeIndexing
text
strings can explode index size quickly. If you need equality look-ups it’s fine; if you mostly do$text
search consider a text index instead. At minimum add a.index({ text: 1 }, { collation, partialFilterExpression… })
directly on the schema to control options.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
api/src/attachment/schemas/attachment.schema.ts
(1 hunks)api/src/chat/schemas/subscriber.schema.ts
(4 hunks)api/src/cms/schemas/content.schema.ts
(1 hunks)api/src/nlp/schemas/nlp-sample-entity.schema.ts
(2 hunks)api/src/nlp/schemas/nlp-sample.schema.ts
(2 hunks)api/src/nlp/schemas/nlp-value.schema.ts
(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (4)
- GitHub Check: API-Tests
- GitHub Check: Frontend Tests
- GitHub Check: Analyze (actions)
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (8)
api/src/attachment/schemas/attachment.schema.ts (1)
38-42
: Index looks fine – consider build-time impactAdding a single-field index on
name
is straightforward. Just make sure the collection size isn’t huge when this rolls out;@Prop({ …, index: true })
defaults to a foreground build duringsyncIndexes()
. For large datasets you may prefer a background build or manual creation in the deployment script.api/src/cms/schemas/content.schema.ts (1)
37-38
: Minor: keep text and single-field index coherent
ContentSchema
already defines a text index ontitle
(lines 94-106). The extra single-field B-tree index ontitle
may be redundant; MongoDB can’t use the text index to satisfy ordinary equality/ordering queries, but the reverse is also true. If the workload is search-only you can drop the B-tree index to save space; otherwise keep both intentionally.api/src/nlp/schemas/nlp-value.schema.ts (1)
122-125
: 👍 Compound unique index adds proper guardrailsThe
(value, entity)
unique index is the right level of granularity for NLP values. Once the single-field index is removed (see above) this will enforce the intended constraint.api/src/nlp/schemas/nlp-sample-entity.schema.ts (1)
110-113
: Unique triple-key index – verify null safetyThe
(sample, entity, value)
unique index is correct. Ensure no historical rows have any of these fields null/undefined; otherwise the index build will fail. If that’s a risk, addpartialFilterExpression
to exclude docs with nulls.api/src/chat/schemas/subscriber.schema.ts (4)
28-34
: Consider collation or text index for name look-upsPlain B-Tree indexes on
first_name
speed up equality matches but not case-insensitive or partial-text searches—both are very common on names.
If you routinely search with$regex
, provide autocomplete, or need locale-aware case-folding, specify:-@Prop({ type: String, required: true, index: true }) +@Prop({ + type: String, + required: true, + index: { collation: { locale: 'en', strength: 2 } }, // or use `text: true` +})Otherwise you’ll still fall back to collection scans.
[ suggest_optional_refactor ]
36-41
: Same remark applies tolast_name
Align the indexing strategy of
last_name
with whatever decision you take forfirst_name
to avoid mixed behaviour.
[ duplicate_comment ]
69-73
:foreign_id
should be declaredunique
foreign_id
is typically an external system identifier and should be unique to prevent data duplication.
Lacking a uniqueness constraint the application relies on code to guarantee integrity, which is fragile.-@Prop({ type: String, index: true }) +@Prop({ type: String, index: { unique: true } })[ raise_critical_issue ]
160-164
: Redundant single-field + compound indexes = write overheadYou now have:
• single–field indexes onfirst_name
andlast_name
• a compound{ first_name: 1, last_name: 1 }
indexMongoDB can use the compound index to satisfy single-field predicates that are a prefix of the compound key, so the individual ones are redundant and double write-amplification (~ 2× index maintenance on every insert/update).
Two options:
@@ -schema: SchemaFactory.createForClass(SubscriberStub).index({ - first_name: 1, - last_name: 1, -}), +schema: SchemaFactory.createForClass(SubscriberStub) + // Drop single-field indexes above, keep only the compound one + .index({ first_name: 1, last_name: 1 }),or delete
index: true
on the two props and keep the compound index only.
[ suggest_essential_refactor ][ request_verification ]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Motivation
The following PR introduces new indexes to the database schemas to optimize query performance and improve data retrieval efficiency. The added indexes target frequently queried fields identified through analysis of query patterns and slow queries, resulting in faster lookup times and reduced database load.
It is worth noting that no data migrations are necessary since indexes can be created on existing collections without modifying the data when applying indexes on an existing dump.
Fixes # (1099)[https://github.com//issues/1099]
Type of change:
Please delete options that are not relevant.
Checklist:
Summary by CodeRabbit
New Features
Chores