We’re delighted to announce the release of Kuzu 0.9.0, whose most notable feature is a new vector extension that allows you to perform similarity search over vector data fully within Kuzu.
Other features include:
- Arbitrary SQL scans from Postgres databases
- WASM with bundled extensions
- Async Python API and Sync Node.js API
- Unity Catalog integration
- MCP server implementation
- G.V() integration
Besides new features, we've significantly improved the performance of aggregation and of FTS index creation.
Please check our release post for more details.
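For readers new to the idea, here is a minimal sketch of what a similarity search over vector data computes. It is plain Python and does not use Kuzu's API: the function names, the sample embeddings, and the choice of cosine similarity are all assumptions made for illustration. It scans every vector by brute force, whereas the changelog entries below indicate that Kuzu's vector index is HNSW-based.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, embeddings, k):
    """Return the k (name, score) pairs most similar to the query vector."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in embeddings.items()]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings, purely for illustration.
embeddings = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}

results = top_k([1.0, 0.0, 0.0], embeddings, k=2)
for name, score in results:
    print(name, round(score, 4))
```

A brute-force scan like this costs time proportional to the number of stored vectors per query, which is the cost an approximate index is meant to avoid.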
What's Changed
- Add tests with different node group sizes + fix bugs by @royi-luo in #4928
- Temporarily disable daily build until 0.8.2 release by @mewim in #4950
- Fix list-predicate functions by @acquamarin in #4947
- Implement hint for unnested subquery by @andyfengHKU in #4955
- Refactor scalar_func_exec_t to take in separate selection vectors by @royi-luo in #4948
- Revert "Temporarily disable daily build until 0.8.2 release" by @mewim in #4956
- Refactor semi mask by @andyfengHKU in #4940
- Skip inserting null keys into distinct hash tables by @benjaminwinger in #4949
- Fix compile warnings in function executors by @royi-luo in #4960
- Optimize embedding caching in vector index construction by @ray6080 in #4920
- Improve semi mask planning by @andyfengHKU in #4957
- Optimize InMemHNSWGraph::getNeighbors by @ray6080 in #4951
- Fix export database with official extension. by @acquamarin in #4961
- Fix label predicate in recursive pattern by @andyfengHKU in #4966
- Support EXISTS subquery in recursive pattern predicate by @andyfengHKU in #4969
- Fix json output format in shell by @acquamarin in #4968
- gds: init scc computation using kosaraju's algorithm by @sdht0 in #4893
- Fix extend cardinality by @royi-luo in #4843
- Parallel Distinct SimpleAggregate by @benjaminwinger in #4934
- Remove old recursive extend by @andyfengHKU in #4976
- Reuse scan state for multiple table scans by @ray6080 in #4975
- Support customized extension repo by @acquamarin in #4973
- Support lambdas on lists with size > DEFAULT_VECTOR_CAPACITY by @royi-luo in #4979
- Refactor parsed expr visitor by @andyfengHKU in #4977
- Refactor DDL operator by @andyfengHKU in #4984
- Fix Python build issue caused by shell_printer by @mewim in #4985
- Support ignore_errors option for copy from subquery by @royi-luo in #4988
- Update Kùzu to Kuzu for consistency with SEO discoverability by @prrao87 in #4965
- Only merge distinct aggregate hash tables into the global queues when full by @benjaminwinger in #4972
- Fix hash 256 function by @acquamarin in #4989
- Statistics update optimization by @benjaminwinger in #4980
- Fix deserialization of empty node groups by @royi-luo in #4987
- Update CI workflow to use Debian 12 for code coverage job by @mewim in #4996
- Refactor table scan state interfaces by @ray6080 in #4981
- Fix json output mode with meta-data by @acquamarin in #4993
- Fix json casting issue by @acquamarin in #4992
- Selection vector slicing with lightweight SelectionView by @benjaminwinger in #4998
- Support gds optional args by @acquamarin in #4999
- Add projected graph node filter by @andyfengHKU in #4990
- Split recursive join and gds at logical operator level by @andyfengHKU in #5003
- Enable dynamic dispatch for simsimd by @royi-luo in #5000
- Add is read only field to function by @andyfengHKU in #5002
- Add squared distance function for arrays by @royi-luo in #5008
- Refactor gds frontier by @andyfengHKU in #5006
- refactor: Kùzu -> Kuzu by @sdht0 in #5005
- Tie rust QueryResult lifetime to that of the Database by @benjaminwinger in #5009
- Implement SQL_QUERY function by @acquamarin in #5010
- Fix nested decimal type casting by @acquamarin in #5018
- Split recursive extend and gds at binding level by @andyfengHKU in #5020
- Added rustdoc example for Connection::execute by @benjaminwinger in #5022
- Try to fix CI workflow for forked repo by @mewim in #5027
- Implement copy from table function by @acquamarin in #5023
- Split rec join and gds at physical level by @andyfengHKU in #5021
- Refactor gds output writer by @andyfengHKU in #5028
- gds: init parallel scc by @sdht0 in #5011
- Hide columns in table func bind data by @andyfengHKU in #5029
- Disable compression on floating point values in array/list by @ray6080 in #5035
- Separate SemiMask interface and implementation by @ray6080 in #5036
- Implement parameter casting for table functions by @acquamarin in #5034
- Fix incorrect src dst for undirected path by @andyfengHKU in #5041
- Revert "Try to fix CI workflow for forked repo" by @mewim in #5043
- Expose semi mask sub-plan in the logical plan tree by @ray6080 in #5037
- Swap tableName and indexName in hnsw functions by @ray6080 in #5038
- Allow loading from multiple files by @ray6080 in #5045
- Remove cardinality from tableFuncBindData by @acquamarin in #5039
- Refactor table function by @acquamarin in #5046
- Support handling null/deleted nodes during vector index creation by @royi-luo in #5014
- Improve SelectionVector::fromValueVectors by @ray6080 in #5052
- Make QueryResult::toString const by @benjaminwinger in #5013
- Make GDS table function by @andyfengHKU in #5048
- Change hnsw input parameter types by @acquamarin in #5042
- Use copyNullMask instead of looping during copies of nulls by @benjaminwinger in #5015
- Implement multi-labeled wcc by @andyfengHKU in #5057
- Implement synchronous APIs for Node.js bindings by @mewim in #5058
- Multi label page rank by @andyfengHKU in #5060
- Use correct offset to access vector index embeddings during creation by @royi-luo in #5063
- Add AsyncConnection for asynchronous query execution on Python API by @mewim in #5061
- Clear table function signatures by @andyfengHKU in #5059
- Filtered HNSW search by @ray6080 in #5019
- Unify vector index, gds & table function planning by @andyfengHKU in #5067
- Implement internal_id function to create internal_id literal by @acquamarin in #5071
- Throw exception when extension rewrite functions called in a multi-statement query by @acquamarin in #5072
- Add job for testing simsimd dynamic dispatch to nightly build-and-deploy workflow by @royi-luo in #5007
- Vector extension by @ray6080 in #5047
- Add option blind/directed upper sel threshold; rename distFunc to metric by @ray6080 in #5069
- Move OutputNodeMask output GDSComputeState by @andyfengHKU in #5077
- Optimize regex match execution by @acquamarin in #5079
- Add projected graph with table dropping tests by @andyfengHKU in #5073
- Rename hnsw functions by @ray6080 in #5078
- Support yield for QUERY_VECTOR_INDEX by @ray6080 in #5084
- Update nightly simd dispatch test by @royi-luo in #5083
- Add support for transforming arrays and objects parameters for WASM by @mewim in #5088
- fixed cli highlighting utf8 issue by @MSebanc in #5087
- Fix missing chrono include that prevents windows builds by @Robert-W-Ward in #5089
- Add windows 2025 to multiplatform test by @mewim in #5090
- Zero new memory after resizing ColumnChunkData by @benjaminwinger in #5082
- Fix setDstNode in shrink by @ray6080 in #5092
- Distinct Aggregate vectorization by @benjaminwinger in #5031
- Make Python API queries interruptable by @royi-luo in #5094
- Fix incorrectly resized chunkStates in local scan state by @ray6080 in #5095
- Shell table autocomplete catalog version by @MSebanc in #5091
- Make EXTENSION keyword optional in LOAD EXTENSION statement by @ray6080 in #5098
- Avoid project constant expression in WITH by @andyfengHKU in #5099
- Add function alias for compatibility by @andyfengHKU in #5103
- Fix contention in re2 expression by @acquamarin in #5104
- Fix unused member compile warnings by @royi-luo in #5110
- Implement extension static linking by @acquamarin in #5093
- Fix list lambda collector by @acquamarin in #5112
- Add function alias by @acquamarin in #5115
- Fix hang in JSON parsing when error is hit with ignore errors false by @royi-luo in #5108
- Fix bug in reloadDB test command by @andyfengHKU in #5113
- Fix function names in show_function() by @acquamarin in #5116
- Fix c api parameter binding by @andyfengHKU in #5117
- Fix rel group replay by @andyfengHKU in #5114
- Allow WAL replaying on some corrupted WAL files (e.g. after DB was killed) by @royi-luo in #5111
- Disable parallel copy of vectors in vector extension tests by @ray6080 in #5096
- Track hash index memory usage by @benjaminwinger in #5074
- Fix size of chunks in RelBatchInsert by @benjaminwinger in #5119
- Fix projection push down for sql_query function by @acquamarin in #5122
- Rename the default return variable _distance to distance in QueryVectorIndex by @ray6080 in #5109
- Fix concurrency issue in building hnsw by @ray6080 in #5121
- Make yield as non-reserved keyword by @acquamarin in #5125
- Validate projected graph argument type by @andyfengHKU in #5124
- Fix dropping table with fts and vector indexes by @ray6080 in #5131
- Fix resetting index entry's aux info when it was loaded already by @ray6080 in #5129
- Add value checks on mu, ml and k by @ray6080 in #5130
- Fix array size by @acquamarin in #5135
- Implement show_projected_graph function by @acquamarin in #5134
- Remove duplicated YIELD definition by @mewim in #5143
- Migrate code coverage job to self-hosted runner by @mewim in #5144
- Error when default expression type doesn't match by @acquamarin in #5137
- Add mnist dataset by @ray6080 in #5148
- Allow same index name on different tables by @ray6080 in #5146
- Add static link extensions for WASM and Android NDK builds by @mewim in #5147
- Add demo example in docs to codebase by @ray6080 in #5149
- Rename return variable of QueryVectorIndex from nn to node by @ray6080 in #5145
- Rename show_projected_graph to show_projected_graphs by @ray6080 in #5140
- Improve regexp_replace performance by @acquamarin in #5151
- Skip WASM UDTInvalidCast test case by @mewim in #5153
- Add click benchmarks by @benjaminwinger in #5154
- Fix copy dataframes to multi-nodes pair rel table bug by @acquamarin in #5165
- Fix empty list check by @andyfengHKU in #5166
- Fix rust extensions in testing and document how other binaries can use them by @benjaminwinger in #5138
- Change max line length to 65536 for shell and fix Android static link by @mewim in #5168
- Bump version to 0.9.0 by @mewim in #5174
New Contributors
- @Robert-W-Ward made their first contribution in #5089
Full Changelog: v0.8.1...v0.9.0