-
Notifications
You must be signed in to change notification settings - Fork 71
Add support for Arrow string view type #1252
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hey @ivant 👋 Thanks for your contribution.
We're always curious how people are using our SDKs, if there are areas to improve, or dedicate more focus to. Is this aimed at SDK built plugins that target Arrow-based backends, or is this for a separate use case? |
Hi @wbrowne! Happy to give more context! The problem I am currently trying to solve is the lack of a well supported Arrow Flight plugin that can handle connections to datasources that use up-to-date versions of Arrow Flight libraries (see, for example, grafana/grafana#99936). Some recent changes to Arrow Flight made the old plugin all but incompatible (particularly due to introduction of One example where this can come up is using something like Roapi as a datasource. Old versions of Roapi were working OK (not great, but workable) with the old plugin, but an attempt to upgrade Roapi past the point where something like There are other Arrow data types that are currently missing:
Obviously, not all of these need to be implemented at once, and it might be OK to not support some of the types that do not make sense to represent in the Grafana UI (leaving it up to the user queries to perform the conversions to something that is representable in the UI). From my point of view, I'd probably need to tackle the |
@ivant Thanks so much for the extra context - that's really helpful. Was the (now deprecated) Influx FlightSQL datasource plugin able to solve your use case apart from the recent Arrow updates? Is your goal now to build a new Arrow Flight plugin in that case? Apologies as I lack a lot of context in the area of Influx specifically, but considering the note on the repo, do you think the Influx plugin be the most appropriate place to have Arrow API support? |
It was able to solve my (company's) use case, but just barely, at a moment in time. Our needs grow and there are a bunch of things that we would really like to have, both in terms of bugfixes and functionality.
That is my current endeavor and I already have a prototype of a plugin that already works with the new Arrow datasources, so we can finally start testing out the new functionality of these datasources that was blocked because of version support. I am starting to test this new plugin with existing queries and dashboards to see what is the gap until we can fully migrate, and it looks surprisingly small.
InfluxDB is a specific product and at this point I have a very limited amount of understanding of what it is in terms of what they plan to support, though it is very likely that they will stick with Arrow for a while (due to their developer being deeply involved in Arrow creation). The code under https://github.com/grafana/grafana/tree/main/pkg/tsdb/influxdb/fsql seems to be implementing Arrow/FlightSQL datasource as part of 3 different types of InfluxDB backends they support. From my very brief look at this code:
My opinion is that Arrow/FlightSQL datasource support should not be bundled into InfluxDB specific plugin, because it serves to establish compatibility with InfluxDB specifically, not a more general variety of data sources that use Arrow/FlightSQL. The unasked question here is probably whether my intention is to open source the plugin. The answer to that is "probably yes", though I might need to figure out whether I'd need to extract my company-specific features into a separate plugin to avoid adding "bloat" for external users who do not care about these features, and what is the best way to do that. |
Thanks again @ivant for the extra context. I'm just working to get the right folk's eyes on this and will revert ASAP! |
Awesome 👍 Feel free to consult the Plugins policy in that case. We would be happy to take a look if/when you decide the plugin belongs part of the Grafana Catalog 😄 From here I think this is good to pass onto the codeowners @grafana/grafana-datasources-core-services. Thanks again for the information and your patience, @ivant! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ivant thanks for the contribution!
PR looks fine, going to merge it.
What this PR does / why we need it:
This PR adds support for Arrow's
StringView
data type, which is needed for correct handling of string data received from datasources using the newer Arrow libraries and which often send string-typed fields asStringView
.Which issue(s) this PR fixes: N/A
Special notes for your reviewer:
This change is needed for implementing a data source plugin for modern arrow-based data sources if one tries to rely on Grafana Plugin SDK's own Arrow-to-Frame translation. Without this the implementors of data sources would need to reimplement the Arrow-to-Frame translation on their own in its entirety.
There might be other new Arrow data types that are not being handled in the Grafana Plugin SDK. This PR is intentionally narrowly focused on
StringView
(since this is one of the most common types), but once this is merged it should be easy to add more types and tests for them.