feat: pgvector - make error messages more informative #1684
Ok. I see the issue but I would not overcrowd the error message if possible.

With Haystack

import os
from haystack_integrations.document_stores.pgvector import PgvectorDocumentStore
from haystack import Document

os.environ["PG_CONN_STR"] = "postgresql://postgres:postgres@localhost:5432/postgres"

document_store = PgvectorDocumentStore(
    embedding_dimension=5,
    vector_function="cosine_similarity",
    recreate_table=True,
    search_strategy="hnsw",
)

document_store.write_documents([
    Document(content="This is first", embedding=[0.1]*2),
    Document(content="This is second", embedding=[0.3]*2)
])
print(document_store.count_documents())

Error:
In this case, the cause of the error is easily understandable, by inspecting the stacktrace.

In Hayhooks

from haystack import Pipeline
from hayhooks import BasePipelineWrapper
import os
from haystack_integrations.document_stores.pgvector import PgvectorDocumentStore
from haystack import Document
from haystack.components.writers import DocumentWriter

os.environ["PG_CONN_STR"] = "postgresql://postgres:postgres@localhost:5432/postgres"

class PipelineWrapper(BasePipelineWrapper):
    def setup(self) -> None:
        document_store = PgvectorDocumentStore(
            embedding_dimension=5,
            vector_function="cosine_similarity",
            recreate_table=True,
            search_strategy="hnsw",
        )
        pipe = Pipeline()
        document_writer = DocumentWriter(document_store)
        pipe.add_component("document_writer", document_writer)
        self.pipeline = pipe

    def run_api(self, text: str) -> str:
        document = Document(content=text, embedding=[0.1]*2)
        result = self.pipeline.run({"document_writer": {"documents": [document]}})
        return result

Error
@mpangrazzi WDYT? Would it be possible/appropriate to show the entire stacktrace in Hayhooks?
@anakin87 @superkelvint yes it's definitely possible to show full stacktraces on Hayhooks! You simply need to set
Thanks for the feedback! I understand the concern about overcrowding error messages. That said, I'd like to emphasize how including the SQL exception (even in a truncated form) could actually improve the developer experience. In environments like Hayhooks, where the full stack trace is hidden by default unless
Even outside of Hayhooks, just using plain Haystack, the raised exception adds an extra layer that can get in the way. It hides the real error behind a generic message, which makes debugging slower and less streamlined.

Would you be open to conditionally including the SQL error? For example:

    raise DocumentStoreError(f"{error_msg}: {type(e).__name__}: {e}") from e

or even

    raise DocumentStoreError(f"{error_msg}: {str(e)[:200]}") from e

This would keep the message compact while still providing valuable context that can immediately point to the issue. It helps streamline the debugging process by providing developers with actionable information right at the point of failure.

Let me know if you'd prefer this be gated behind a flag or if there's another compromise you'd suggest; happy to adapt!
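To make the two variants concrete, here is a minimal, self-contained sketch of how each would shape the final message. It is an illustration under assumptions, not the code in this patch: `DocumentStoreError` is a local stand-in for the Haystack class, the helper names and the `"Could not write documents"` text are hypothetical, and `ValueError` stands in for whatever exception the database driver actually raises.

```python
class DocumentStoreError(Exception):
    """Local stand-in for Haystack's DocumentStoreError so the sketch runs on its own."""


def wrap_with_type(error_msg: str, exc: Exception) -> DocumentStoreError:
    # Variant 1: append the underlying exception's type name and message.
    return DocumentStoreError(f"{error_msg}: {type(exc).__name__}: {exc}")


def wrap_truncated(error_msg: str, exc: Exception, limit: int = 200) -> DocumentStoreError:
    # Variant 2: append at most `limit` characters of the underlying message.
    return DocumentStoreError(f"{error_msg}: {str(exc)[:limit]}")


def write_document() -> None:
    # Hypothetical failing write; ValueError stands in for the driver's SQL exception.
    try:
        raise ValueError("expected 5 dimensions, not 2")
    except ValueError as e:
        # `from e` keeps the original exception reachable as __cause__.
        raise wrap_with_type("Could not write documents", e) from e


if __name__ == "__main__":
    try:
        write_document()
    except DocumentStoreError as err:
        print(err)
```

Either way the original exception stays attached through `from e`, so a full traceback is still available wherever the environment shows it; the only difference is how much of the cause is copied into the top-level message.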
Hey @superkelvint, I understand your point. Could you please sign the CLA?
Thanks @anakin87! I have just signed the CLA.
The improvement is available in the new package release: https://pypi.org/project/pgvector-haystack/3.3.0/
The pgvector document store does not currently report SQL exceptions, which makes debugging difficult.
Using hayhooks, for example, with this patch, the error goes from this:
to this:
Proposed Changes:
Include the SQL exception in the reporting message.
Checklist
fix:, feat:, build:, chore:, ci:, docs:, style:, refactor:, perf:, test:.