Releases: kuzudb/kuzu
v0.9.0
We’re delighted to announce the release of Kuzu 0.9.0, whose most notable feature is a new vector extension that allows you to perform similarity search over vector data fully within Kuzu.
Other features include:
- Arbitrary SQL scans from Postgres databases
- WASM with bundled extensions
- Async Python API and Sync Node.js API
- Unity Catalog integration
- MCP server implementation
- G.V() integration
Besides new features, we've also substantially improved the performance of aggregation and of fts index creation!
Please check our release post for more details.
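To make "similarity search over vector data" concrete, here is a minimal sketch of exhaustive (brute-force) cosine-similarity search in plain Python. It is not Kuzu's implementation: the changelog entries in this release refer to an HNSW index, which approximates this kind of ranking without scanning every vector. The function names and the toy embeddings are hypothetical and exist only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query, embeddings, k):
    """Return the k (key, score) pairs most similar to query, best first."""
    scored = [(key, cosine_similarity(query, vec)) for key, vec in embeddings.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy data: three 3-dimensional embeddings keyed by document name.
embeddings = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.9, 0.1, 0.0],
    "doc_c": [0.0, 1.0, 0.0],
}
results = top_k([1.0, 0.0, 0.0], embeddings, k=2)
```

Exhaustive search costs one distance computation per stored vector, which is the cost an approximate index is designed to avoid on large tables.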
What's Changed
- Add tests with different node group sizes + fix bugs by @royi-luo in #4928
- Temporarily disable daily build until 0.8.2 release by @mewim in #4950
- Fix list-predicate functions by @acquamarin in #4947
- Implement hint for unnested subquery by @andyfengHKU in #4955
- Refactor scalar_func_exec_t to take in separate selection vectors by @royi-luo in #4948
- Revert "Temporarily disable daily build until 0.8.2 release" by @mewim in #4956
- Refactor semi mask by @andyfengHKU in #4940
- Skip inserting null keys into distinct hash tables by @benjaminwinger in #4949
- Fix compile warnings in function executors by @royi-luo in #4960
- Optimize embedding caching in vector index construction by @ray6080 in #4920
- Improve semi mask planning by @andyfengHKU in #4957
- Optimize InMemHNSWGraph::getNeighbors by @ray6080 in #4951
- Fix export database with official extension. by @acquamarin in #4961
- Fix label predicate in recursive pattern by @andyfengHKU in #4966
- Support EXISTS subquery in recursive pattern predicate by @andyfengHKU in #4969
- Fix json output format in shell by @acquamarin in #4968
- gds: init scc computation using kosaraju's algorithm by @sdht0 in #4893
- Fix extend cardinality by @royi-luo in #4843
- Parallel Distinct SimpleAggregate by @benjaminwinger in #4934
- Remove old recursive extend by @andyfengHKU in #4976
- Reuse scan state for multiple table scans by @ray6080 in #4975
- Support customized extension repo by @acquamarin in #4973
- Support lambdas on lists with size > DEFAULT_VECTOR_CAPACITY by @royi-luo in #4979
- Refactor parsed expr visitor by @andyfengHKU in #4977
- Refactor DDL operator by @andyfengHKU in #4984
- Fix Python build issue caused by shell_printer by @mewim in #4985
- Support ignore_errors option for copy from subquery by @royi-luo in #4988
- Update Kùzu to Kuzu for consistency with SEO discoverability by @prrao87 in #4965
- Only merge distinct aggregate hash tables into the global queues when full by @benjaminwinger in #4972
- Fix hash 256 function by @acquamarin in #4989
- Statistics update optimization by @benjaminwinger in #4980
- Fix deserialization of empty node groups by @royi-luo in #4987
- Update CI workflow to use Debian 12 for code coverage job by @mewim in #4996
- Refactor table scan state interfaces by @ray6080 in #4981
- Fix json output mode with meta-data by @acquamarin in #4993
- Fix json casting issue by @acquamarin in #4992
- Selection vector slicing with lightweight SelectionView by @benjaminwinger in #4998
- Support gds optional args by @acquamarin in #4999
- Add projected graph node filter by @andyfengHKU in #4990
- Split recursive join and gds at logical operator level by @andyfengHKU in #5003
- Enable dynamic dispatch for simsimd by @royi-luo in #5000
- Add is ready only field to function by @andyfengHKU in #5002
- Add squared distance function for arrays by @royi-luo in #5008
- Refactor gds frontier by @andyfengHKU in #5006
- refactor: Kùzu -> Kuzu by @sdht0 in #5005
- Tie rust QueryResult lifetime to that of the Database by @benjaminwinger in #5009
- Implement SQL_QUERY function by @acquamarin in #5010
- Fix nested decimal type casting by @acquamarin in #5018
- Split recursive extend and gds at binding level by @andyfengHKU in #5020
- Added rustdoc example for Connection::execute by @benjaminwinger in #5022
- Try to fix CI workflow for forked repo by @mewim in #5027
- Implement copy from table function by @acquamarin in #5023
- Split rec join and gds at physical level by @andyfengHKU in #5021
- Refactor gds output writer by @andyfengHKU in #5028
- gds: init parallel scc by @sdht0 in #5011
- Hide columns in table func bind data by @andyfengHKU in #5029
- Disable compression on floating point values in array/list by @ray6080 in #5035
- Separate SemiMask interface and implementation by @ray6080 in #5036
- Implement parameter casting for table functions by @acquamarin in #5034
- Fix incorrect src dst for undirected path by @andyfengHKU in #5041
- Revert "Try to fix CI workflow for forked repo" by @mewim in #5043
- Expose semi mask sub-plan in the logical plan tree by @ray6080 in #5037
- Swap tableName and indexName in hnsw functions by @ray6080 in #5038
- Allow loading from multiple files by @ray6080 in #5045
- Remove cardinality from tableFuncBindData by @acquamarin in #5039
- Refactor table function by @acquamarin in #5046
- Support handling null/deleted nodes during vector index creation by @royi-luo in #5014
- Improve SelectionVector::fromValueVectors by @ray6080 in #5052
- Make QueryResult::toString const by @benjaminwinger in #5013
- Make GDS table function by @andyfengHKU in #5048
- Change hnsw input parameter types by @acquamarin in #5042
- Use copyNullMask instead of looping during copies of nulls by @benjaminwinger in #5015
- Implement multi-labeled wcc by @andyfengHKU in #5057
- Implement synchronous APIs for Node.js bindings by @mewim in #5058
- Multi label page rank by @andyfengHKU in #5060
- Use correct offset to access vector index embeddings during creation by @royi-luo in #5063
- Add AsyncConnection for asynchronous query execution on Python API by @mewim in #5061
- Clear table function signatures by @andyfengHKU in #5059
- Filtered HNSW search by @ray6080 in #5019
- Unify vector index, gds & table function planning by @andyfengHKU in #5067
- Implement internal_id function to create internal_id literal by @acquamarin in #5071
- Throw exception when extension rewrite functions called in a multi-statement query by @acquamarin in #5072
- Add job for testing simsimd dynamic dispatch to nightly build-and-deploy workflow by @royi-luo in #5007
- Vector extension by @ray6080 in #5047
- Add option blind/directed upper sel threshold; rename distFunc to metric by @ray6080 in #5069
- Move OutputNodeMask output GDSComputeState by @andyfengHKU in #5077
- Optimize regex match execution by @acquamarin in #5079
- Add projected graph with table dropping tests by @andyfengHKU in #5073
- Rename hnsw functions by @ray6080 in #5078
- Support yield for QUERY_VECTOR_INDEX by @ray6080 in https://github.com/ku...
v0.8.2
v0.8.2 is a minor release to fix the distinct hash aggregate with NULL bug
We're just a couple months into 2025, and we are happy to announce a new minor release: v0.8.2. This release is feature-packed, warranting its own blog post. One of the highlights is the introduction of the unity_catalog extension, which allows you to scan/copy from Delta Lake tables managed by Unity Catalog.
We've also improved our existing extensions. For those of you on Google Cloud, we have some exciting news! We now support scanning/copying from/writing to files hosted on Google Cloud Storage (GCS) filesystem. This update leverages our existing httpfs extension. Another useful new feature is that our CLI now explicitly excludes confidential information such as S3 access keys from being stored in the command history file. This helps prevent accidental leakage of sensitive data into your command line history and ensures your credentials remain secure.
Our full-text search extension now supports customizing the stopwords table used in full-text search, which can be helpful in your custom domains where specific words not in the default list need to be excluded from the index.
From a performance perspective, we’ve significantly improved our execution of distinct aggregation queries via a new parallel distinct hash aggregation mechanism.
Please check our release post for more details. Hope you enjoy this release!
Full Changelog: v0.8.1...v0.8.2
v0.8.1
We're just a couple months into 2025, and we are happy to announce a new minor release: v0.8.1. This release is feature-packed, warranting its own blog post. One of the highlights is the introduction of the unity_catalog extension, which allows you to scan/copy from Delta Lake tables managed by Unity Catalog.
We've also improved our existing extensions. For those of you on Google Cloud, we have some exciting news! We now support scanning/copying from/writing to files hosted on Google Cloud Storage (GCS) filesystem. This update leverages our existing httpfs extension. Another useful new feature is that our CLI now explicitly excludes confidential information such as S3 access keys from being stored in the command history file. This helps prevent accidental leakage of sensitive data into your command line history and ensures your credentials remain secure.
Our full-text search extension now supports customizing the stopwords table used in full-text search, which can be helpful in your custom domains where specific words not in the default list need to be excluded from the index.
From a performance perspective, we’ve significantly improved our execution of distinct aggregation queries via a new parallel distinct hash aggregation mechanism.
What's Changed
- Include delta/iceberg loader in extension build cmake by @acquamarin in #4855
- Fix rust build threads in CI by @benjaminwinger in #4853
- Improve label pruning by @andyfengHKU in #4842
- Remove SimpleTableFunction and make TableFunction non-extendable by @ray6080 in #4840
- adjusted cli autocomplete ordering by @MSebanc in #4845
- Table function logical plan by @ray6080 in #4848
- Forward declare ExecutionContext in PhysicalOperator by @ray6080 in #4859
- Add more test cases on vector size by @ray6080 in #4240
- Update README.md by @semihsalihoglu-uw in #4866
- Support customization of stopwords in full text search by @acquamarin in #4864
- Table function physical plan by @ray6080 in #4862
- build: separate out test build and run by @sdht0 in #4844
- Speed up recompilation when changing compile-time config by @royi-luo in #4850
- Optimize recompile times by @royi-luo in #4863
- Add Catalog version by @ray6080 in #4869
- Rename clone to copy by @andyfengHKU in #4871
- Allow COPY FROM in manual transactions by @ray6080 in #4872
- Refactor gds framework by @andyfengHKU in #4860
- Rework rewriteFunc of table functions by @ray6080 in #4873
- Migrate API docs to standalone repo by @mewim in #4875
- Fix extension-only build by @royi-luo in #4876
- Fix lambda function list-size limitation by @acquamarin in #4879
- Fix non exist pk error message by @acquamarin in #4883
- Fix nested agg binding exception by @andyfengHKU in #4885
- Remove count from evaluator local state by @andyfengHKU in #4878
- Refactor path backtrack by @andyfengHKU in #4886
- Make weight shortest path cost as double by @andyfengHKU in #4887
- Fixes the alias issue of struct_pack function by @acquamarin in #4894
- Remove recursive extend binding by @andyfengHKU in #4896
- Parallel distinct hash aggregate by @benjaminwinger in #4881
- Fix export/import database by @acquamarin in #4900
- Fix skipWhiteSpace() function by @acquamarin in #4907
- Clean update info during node group checkpoint by @ray6080 in #4895
- Try to mitigate data race in the HashAggregate by @benjaminwinger in #4906
- Add empty columns to chunked node group if needed during COPY by @royi-luo in #4882
- Implement unity catalog extension by @acquamarin in #4890
- Disable test NodeUpdateTest.UpdateSameRowRedundtanly for in mem mode tests by @royi-luo in #4911
- Fix projection profiler by @andyfengHKU in #4908
- Separate hash aggregate finalization into its own operator by @benjaminwinger in #4913
- Break when error in the middle of multi statements by @ray6080 in #4914
- Allow create/drop hnsw index to run in manual transactions by @ray6080 in #4877
- Fix timing in the final QueryResult for create fts/hnsw index by @ray6080 in #4915
- Wsp track path by @andyfengHKU in #4898
- Fixes export database with stopwords by @acquamarin in #4917
- Split shortest path implementation to different files by @andyfengHKU in #4919
- Implement confidential statement by @acquamarin in #4910
- Update Kuzu logo image by @mewim in #4923
- Add predicate information per table to graph entry by @andyfengHKU in #4924
- Enable spilling during finalization of CreateHNSWIndex by @ray6080 in #4921
- Add GCS support by @royi-luo in #4892
Full Changelog: v0.8.0...v0.8.1
v0.8.0
We're kicking off the year 2025 with the exciting release of Kùzu 0.8.0, which brings two new features:
- Kùzu-WASM for in-browser graph analytics. You can now run your graph database while keeping all data and compute within your browser session!
- fts extension for full-text search. You can now run keyword-based search queries using BM25 in Kùzu.
In addition to these new features, we’ve streamlined the developer workflow during relationship table creation by unifying CREATE REL TABLE GROUP into a single, flexible CREATE REL TABLE syntax.
Finally, we’ve significantly improved our execution of aggregation queries via a new parallel hash aggregation mechanism.
Please check our release post for more details. Hope you enjoy this release!
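For readers unfamiliar with BM25, the ranking function named above for the fts extension, the following is a minimal sketch of one common textbook formulation in plain Python. It does not reproduce how the fts extension tokenizes, stems, or indexes text. The function name, the whitespace tokenization, the toy documents, and the parameter defaults `k1=1.2` and `b=0.75` are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query terms with BM25."""
    if not docs:
        return []
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for d in docs:
        for term in set(d):
            df[term] += 1

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Rare terms get a higher inverse document frequency weight.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term frequency saturates with k1 and is normalized by length via b.
            length_norm = 1.0 - b + b * len(d) / avg_len
            score += idf * tf[term] * (k1 + 1.0) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores


# Hypothetical toy corpus, tokenized by whitespace.
docs = [
    "graph database for analytics".split(),
    "full text search in a graph database".split(),
    "cooking recipes".split(),
]
scores = bm25_scores(["text", "search"], docs)
```

Documents that contain none of the query terms score zero, so only the second document is ranked for this query.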
What's Changed
- Add check of test file name on -, fix doc_example.test of JSON extension by @SterlingT3485 in #4537
- Add GDS support for vertex property scanning by @benjaminwinger in #4453
- Implement list_has_all by @acquamarin in #4546
- Added Unicode \u and \U parsing to the cli by @MSebanc in #4492
- Migrate macOS CI to use both x86-64 and ARM64 runner by @mewim in #4542
- Avoid importing polars in arrow scan by @acquamarin in #4551
- GDS interface cleanup by @andyfengHKU in #4524
- Use a flag to determine if the SelectionVector uses INCREMENTAL_SELECTED_POS by @benjaminwinger in #4552
- Fix VersionInfo SelectionVector creation by @benjaminwinger in #4556
- Remove some redundant compiler flags on Windows rust builds by @benjaminwinger in #4555
- Using shared_mutex instead of mutex in CatalogSet by @ted-wq-x in #4533
- Implement full text search by @acquamarin in #4416
- Rel scan selection optimizations by @benjaminwinger in #4558
- Fix wrong sniff type when sample_size = 1 by @SterlingT3485 in #4565
- Alter table with index by @acquamarin in #4563
- Fix the type cast in nested struct by @SterlingT3485 in #4560
- Wasm buffer manager support by @benjaminwinger in #4523
- Implement optional args in full text search by @acquamarin in #4569
- Fix config parameters binding for C API by @mewim in #4574
- Deprecated Ubuntu 23 runners from multi-platform test by @mewim in #4576
- Add more explicit comparisons when checking test result output by @benjaminwinger in #4280
- Minor path writer refactor by @andyfengHKU in #4580
- Fix rollback during Node Table COPY by @royi-luo in #4467
- Rework build pipeline by @mewim in #4581
- Gds node predicate push down by @andyfengHKU in #4461
- Migrate macOS build docs to internal repo by @mewim in #4585
- Handle failed queries in test runner if expected output is tuples by @royi-luo in #4584
- Add missing include; fix clangd by @mewim in #4586
- Fix flat select bug by @andyfengHKU in #4590
- Fixed cli handling of escape sequences after autocompletion by @MSebanc in #4589
- Revert "Fixed cli handling of escape sequences after autocompletion" by @andyfengHKU in #4594
- Increase buffer pool size for LargeListTest by @royi-luo in #4596
- Delta extension by @acquamarin in #4587
- Add test case name check & Share build between test and extension-test by @SterlingT3485 in #4572
- Add a homeDir field in LocalFileSystem for removefile to check by @SterlingT3485 in #4543
- Avoid double updating linesPerBlock in SharedFileErrorHandler if an exception is thrown by @royi-luo in #4601
- Implement drop/add property with if exists by @acquamarin in #4598
- Refactors the compilation of extensions by @acquamarin in #4602
- Remove unnecessary logic in ProcessorTask::finalizeIfNecessary by @royi-luo in #4573
- Cardinality estimation on top of HLL by @ray6080 in #4433
- Add recursive join benchmark by @andyfengHKU in #4603
- Parallel init dense gds array by @andyfengHKU in #4588
- Add interrupt to path writer by @andyfengHKU in #4609
- Add sparse frontier implementation by @andyfengHKU in #4557
- Implement FORMAT option in LOAD FROM clause. by @acquamarin in #4613
- Add e-notation double by @andyfengHKU in #4616
- Implement conjunctive full text search by @acquamarin in #4605
- Remove unnecessary lock in isVisibleNoLock check by @ray6080 in #4623
- Fix cost model for Extend by @ray6080 in #4429
- Add call function for bm info by @ray6080 in #4622
- Add exception when trying to load from a directory (or a file name with extension) by @SterlingT3485 in #4614
- Add setting enable_plan_optimizer by @royi-luo in #4619
- Implementing Parallel WCC (#4604) by @andyfengHKU in #4621
- Rename call function to simple table function by @andyfengHKU in #4626
- Refactor table bind func input by @andyfengHKU in #4627
- Add optimizer pass to repopulate cardinalities + combine cardinalities in Logical Plan/Operator by @royi-luo in #4606
- handle newline in front of profile/explain by @acquamarin in #4618
- Allow prepared statement parameter in CALL function by @acquamarin in #4628
- Add graph projection by @andyfengHKU in #4630
- refactor table bind func by @andyfengHKU in #4635
- Fix incorrect set of sequence val after exporting database by @ray6080 in #4636
- Add Ice Berg Extension by @SterlingT3485 in #4600
- Remove unnecessary join in fts by @andyfengHKU in #4637
- Support attach relational database with schema by @acquamarin in #4639
- Fix list-contains binding by @acquamarin in #4644
- Add iceberg metadata alter name test by @SterlingT3485 in #4645
- Trim Unnecessary Quote for CLI JSON output by @SterlingT3485 in #4643
- Implement get_keys function in fts by @acquamarin in #4647
- Remove offset from table func by @andyfengHKU in #4648
- Make iceberg test output to be stable by @SterlingT3485 in #4654
- Add include for std::optional in extension by @acquamarin in #4653
- Remove unnecessary sink in copy from subquery by @andyfengHKU in #4650
- Update PRODUCTION_RELEASES for extension by @mewim in #4659
- Add null checks to zone map by @royi-luo in #4642
- remove graph entry from fts input by @andyfengHKU in #4660
- Rollback hash index checkpoint by @royi-luo in #4559
- Fixed cli handling of escape sequences after autocompletion by @andyfengHKU in #4595
- Added type names to cli auto complete by @MSebanc in #4591
- Improve subquery planning by @andyfengHKU in #4651
- refactor table function constructor by @andyfengHKU in #4664
- Refactor scalar function constructor by @andyfengHKU in #4666
- Hash join flatten fix by @andyfengHKU in #4668
- Remove encoded join and enumeration flags from end-to-end testing framework by @ray6080 in #4669
- Skip query result from rewritten queries by @ray6080 in #4633
- Add more complete order by key types check by @ray6080 in #4671
- Refactor table func interface with const params by @ray6080 in #4670
- Add more clang tidy checks by @ray6080 in #4672
- Support IGNORE_ERRORS when scanning from pyarrow/pandas by @royi-luo in #4646
- Rename reader conf...
v0.7.1
We are excited to announce the release of two new extensions: Delta Lake and Iceberg. The Delta extension allows seamless scanning and copying from Delta Lake tables, while the Iceberg extension provides the same functionality for Apache Iceberg tables.
In addition to these new extensions, this release introduces several bug fixes and new features, including:
- The ability to attach to a specific schema in a relational database.
- Support for ADD/DROP PROPERTY IF [NOT] EXISTS commands.
- A new list_has_all function for enhanced list operations.
- Experimental support for Android armv8a platform.
Hope you enjoy the new release!
What's Changed
- Trim Unnecessary Quote for CLI JSON output #4643
- Fix list-contains binding #4644
- Support attach relational database with schema #4639
- Add Ice Berg Extension #4600
- Fix incorrect set of sequence val after exporting database #4636
- Add e-notation double #4616
- Implement FORMAT option in LOAD FROM clause. #4613
- Add interrupt to path writer #4609
- Implement drop/add property with if exists #4598
- Delta extension #4587
- Fix flat select bug #4590
- Gds node predicate push down #4461
- Fix rollback during Node Table COPY #4467
- Fix the type cast in nested struct #4560
- Rel scan selection optimizations #4558
- Using shared_mutex instead of mutex in CatalogSet #4533
- Fix VersionInfo SelectionVector creation #4556
- Avoid importing polars in arrow scan #4551
- Added Unicode \u and \U parsing to the cli #4492
- Implement list_has_all #4546
Full Changelog: v0.7.0...0.7.1
v0.7.0
Key highlights of this release
There have been some key performance improvements in this release:
- New and much faster recursive path finding algorithms that implement relationship patterns with the Kleene star [*].
- Data spilling to disk during copy which enables copying very large graphs on machines with limited RAM.
- Zone maps which enable much faster scans of node/rel properties when there is a filter on numeric properties.
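The zone map idea in the last item above can be sketched in a few lines of Python: keep the minimum and maximum of each chunk of a numeric column, and skip any chunk whose range cannot satisfy the filter. This is a conceptual illustration only and says nothing about Kuzu's storage layout. The function names, the chunk size, and the sample column are hypothetical.

```python
def build_zone_map(values, chunk_size):
    """Record (start, end, min, max) for each fixed-size chunk of a column."""
    zones = []
    for start in range(0, len(values), chunk_size):
        chunk = values[start:start + chunk_size]
        zones.append((start, start + len(chunk), min(chunk), max(chunk)))
    return zones


def scan_greater_than(values, zones, threshold):
    """Return values > threshold plus the number of chunks actually read."""
    matches, scanned = [], 0
    for start, end, lo, hi in zones:
        if hi <= threshold:
            continue  # no value in this chunk can satisfy v > threshold
        scanned += 1
        matches.extend(v for v in values[start:end] if v > threshold)
    return matches, scanned


# Hypothetical numeric property column, split into chunks of four values.
ages = [21, 25, 30, 33, 41, 45, 52, 58, 63, 70, 72, 80]
zones = build_zone_map(ages, chunk_size=4)
matches, scanned = scan_greater_than(ages, zones, threshold=60)
```

In this example only one of the three chunks is read, because the first two have maxima at or below the threshold.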
From the usability perspective, we have the following improvements:
- CSV auto detection to automatically detect several CSV configurations during data ingest.
- Improved UX during CSV import that can report to users about skipping erroneous CSV lines.
- New JSON data type that you can use to store JSON blobs as node/relationship properties.
- New official Golang API so that you can build applications on top of Kùzu using Go!
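As an illustration of what CSV auto detection (the first usability item above) involves, the sketch below uses `csv.Sniffer` from Python's standard library to guess the delimiter and the presence of a header row from a sample. This is not Kuzu's sniffer and may detect a different set of options; the sample data is hypothetical.

```python
import csv
import io

# Hypothetical sample: semicolon-delimited with a header row.
sample = "name;age;city\nAlice;30;Waterloo\nBob;25;Toronto\n"

sniffer = csv.Sniffer()
# Restrict the candidate delimiters to a few common ones.
dialect = sniffer.sniff(sample, delimiters=",;|\t")
# Heuristic: a header is likely if the first row's types differ from later rows.
has_header = sniffer.has_header(sample)

# Parse the sample with the detected dialect.
rows = list(csv.reader(io.StringIO(sample), dialect))
```

Detection of this kind is heuristic, so an ingest path normally still lets the user override the guessed options explicitly.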
What's Changed
- Make C API header available in the kuzu target's include directories by @benjaminwinger in #4034
- Add option force_checkpoint_on_close by @ray6080 in #4032
- Fix delete on empty database by @andyfengHKU in #4038
- Fix list_contains implicit casting by @andyfengHKU in #4044
- Force cast when scanning by @mxwli in #4041
- In memory mode by @ray6080 in #4012
- Implement attach remote duckdb database by @acquamarin in #4040
- Only commit memory on windows when initially claiming the frame by @benjaminwinger in #4047
- Fix list contains casting on empty list by @andyfengHKU in #4049
- Fix consecutive merge by @andyfengHKU in #4046
- Fix semi mask on scan node table by @ray6080 in #4050
- Add mscv in mem CI workflow; remove build steps in CI workflow by @ray6080 in #4051
- Fix resize and append of chunkedNodeGroup by @ray6080 in #4054
- Refactor list auxiliary buffer by @andyfengHKU in #4052
- Rework checkpoint memory usage estimation by @ray6080 in #4055
- Optimize catalog access by @andyfengHKU in #4058
- Fix updates to same row leading to incorrect write-write conflict errors by @ray6080 in #4063
- Added new output modes for shell by @MSebanc in #4053
- Vacuum dropped columns during checkpoint by @ray6080 in #4074
- Refactor generation of grammar files. by @mxwli in #4073
- Auto checkpoint when closing db by @ray6080 in #4075
- Csv sniffing by @mxwli in #3932
- Fix platform issues with md5sum utility by @mxwli in #4077
- Make property case insensitive by @andyfengHKU in #4039
- Add More Json tests by @mxwli in #4061
- Fixed cli modes to escape characters by @MSebanc in #4085
- Added some missing algorithm includes by @benjaminwinger in #4086
- Implemented cli start up commands ability by @MSebanc in #4078
- Fix bug in CorrelatedSubqueryUnnestSolver by @andyfengHKU in #4088
- Fix a bug when intersecting an empty build side table by @andyfengHKU in #4089
- Enabled progress bar by default for shell by @MSebanc in #4065
- Add default empty str for database path for Python and Node.js API by @mewim in #4090
- Rework iterateCatalogEntries and catalog entry oid by @ray6080 in #4057
- Add in_memory constructor for rust database and API docs for in-memory mode by @benjaminwinger in #4094
- Update api doc for java, nodejs, python and add pytest for in-mem mode by @ray6080 in #4095
- Fix detection of when rust builds don't need to re-build the bundled C++ library by @benjaminwinger in #4093
- Update number of threads in 32-bit build by @mewim in #4096
- bump version to 0.6.0 by @ray6080 in #4076
- Fix constant lambda expression evaluation by @andyfengHKU in #4098
- Refactor function&function expression by @andyfengHKU in #4102
- Csv reader progress fix by @MSebanc in #4099
- Fix user defined types in nested type by @acquamarin in #4081
- change occurrences of std::regex uses to RE2 library calls by @mxwli in #4100
- Csv reading improvement by @mxwli in #4079
- Fix failed tests when compression is disabled by @ray6080 in #4104
- Automatically upload binary dataset to s3 by @mewim in #4114
- Make Connection::query in rust call the C++ query function instead of using prepare+execute by @benjaminwinger in #4117
- Disabled progress bar by default due to performance issues by @MSebanc in #4115
- Add Pyarrow Skip and Limit by @mxwli in #4112
- Add projection push down to csv and parquet by @andyfengHKU in #4113
- Schedule lsqb to run every day by @ray6080 in #4122
- Unify getColumnID of node and rel TableCatalogEntry by @ray6080 in #4123
- Keys function by @acquamarin in #4125
- Fix serialize tinysnb dataset on windows by @acquamarin in #4124
- Fix JSON null handling by @mxwli in #4118
- Implement projection push down in scan relation tables. by @acquamarin in #4126
- Add option to ignore errors in CSV parsing + improve error messages by @royi-luo in #4067
- Random Split for Multi Copy Testing by @yiyun-sj in #3989
- Extension installation rework by @acquamarin in #4066
- Enable scheduled run for interactive v1 by @ray6080 in #4133
- Implement Predicate functions by @acquamarin in #4109
- Add filter push down to relational table by @andyfengHKU in #4120
- Added cli support for single line comments and added multiline mode by @MSebanc in #4134
- add projection to pyarrow, numpy, and json by @mxwli in #4119
- Fix progress bar threading issue by @royi-luo in #4140
- Schedule auto run of SNB BI by @ray6080 in #4139
- ALP implementation by @royi-luo in #3994
- Added shell development guide by @MSebanc in #4144
- Schedule auto run of fintech bench by @ray6080 in #4143
- Fix export-import rel group by @andyfengHKU in #4147
- Fix undirected edge projection by @andyfengHKU in #4151
- Merge BMFileHandle and FileHandle by @ray6080 in #4045
- Fix casting assertion errors in getNodeTableEntries and getRelTableEntries by @ray6080 in #4154
- Replace Linux open file flag with kuzu open file flag by @acquamarin in #2931
- Fix import legacy exported database by @acquamarin in #4157
- Fix user defined type casting by @acquamarin in #4156
- Http file cache improve by @acquamarin in #4159
- Ignore node batch insert failures by @royi-luo in #4158
- Deprecate ARM64 macOS CI job by @mewim in #4164
- Make page size be flexible configurable by @ray6080 in #4162
- Fix Windows extension test PostgreSQL host by @mewim in #4169
- Fix Windows build tool path by @mewim in #4170
- Fix indentation in build script by @mewim in #4171
- Fix multiplatform build and test workflow by @mewim in #4174
- Add compile flag to reduce size of static library on windows by @benjaminwinger in #3644
- Fix attach kuzu in in-mem mode by @acquamarin in #4177
- Generate LDBC-01 from CI by @mewim in #4180
- Minor fix for multiplatform build and test by @mewim in #4179
- Fix dataset paths for generated datasets on S3 by @mewim in #4182
- Restructure extension build pipeline (test) by @mewim in...
v0.6.1
v0.6.1 is a minor release with the following bug fixes:
- Fix attaching PostgreSQL database due to extension version mismatch
- Fix constant lambda expression evaluation (#4098)
- Csv reader progress fix (#4099)
- Fix failed tests when compression is disabled (#4104)
- Make Connection::query in rust call the C++ query function instead of using prepare+execute (#4117)
- Disabled progress bar by default due to performance issues (#4115)
- Fix JSON null handling (#4118)
- Fix undirected edge projection (#4151)
- Fix import legacy exported database (#4157)
- Fix attach kuzu in in-mem mode (#4177)
- Fix race condition causing an infinite loop in the eviction queue (#4187)
- Fix double-initialization of the NullChunkData buffer (#4186)
- Fix buffer manager failure false positive (#4221)
- Fix windows open file flag (#4238)
- Fix undefined behaviour in the Buffer Manager after failure (#4246)
- Fix subquery planning (#4255)
- Fix nested aggregate (#4259)
- Fix rel checkpoint due to incorrect set null and misaligned gaps due to empty src node (#4274)
- Fix memory leak in JSON parsing (#4302)
- Fix OPTIONAL MATCH null value handling for NetworkX conversion (#4282)
- Fix Multiple COPY FROM parquet leads to data corruption (#4368)
Full Changelog: v0.6.0...v0.6.1
v0.6.0
This release comes with several bug fixes, CLI updates and a much awaited feature: in-memory mode for Kùzu to quickly create temporary databases in memory.
Please check our release post for more details!
What's Changed
- Make C API header available in the kuzu target's include directories by @benjaminwinger in #4034
- Add option force_checkpoint_on_close by @ray6080 in #4032
- Fix delete on empty database by @andyfengHKU in #4038
- Fix list_contains implicit casting by @andyfengHKU in #4044
- Force cast when scanning by @mxwli in #4041
- In memory mode by @ray6080 in #4012
- Implement attach remote duckdb database by @acquamarin in #4040
- Only commit memory on windows when initially claiming the frame by @benjaminwinger in #4047
- Fix list contains casting on empty list by @andyfengHKU in #4049
- Fix consecutive merge by @andyfengHKU in #4046
- Fix semi mask on scan node table by @ray6080 in #4050
- Add mscv in mem CI workflow; remove build steps in CI workflow by @ray6080 in #4051
- Fix resize and append of chunkedNodeGroup by @ray6080 in #4054
- Refactor list auxiliary buffer by @andyfengHKU in #4052
- Rework checkpoint memory usage estimation by @ray6080 in #4055
- Optimize catalog access by @andyfengHKU in #4058
- Fix updates to same row leading to incorrect write-write conflict errors by @ray6080 in #4063
- Added new output modes for shell by @MSebanc in #4053
- Vacuum dropped columns during checkpoint by @ray6080 in #4074
- Refactor generation of grammar files. by @mxwli in #4073
- Auto checkpoint when closing db by @ray6080 in #4075
- fix decimal comparison by @mxwli in #4087
- Fixed cli modes to escape characters by @MSebanc in #4085
- Added some missing algorithm includes by @benjaminwinger in #4086
- Fix bug in CorrelatedSubqueryUnnestSolver by @andyfengHKU in #4088
- Fix a bug when intersecting an empty build side table by @andyfengHKU in #4089
- Add default empty str for database path for Python and Node.js API by @mewim in #4090
- Add in_memory constructor for rust database and API docs for in-memory mode by @benjaminwinger in #4094
- Update api doc for java, nodejs, python and add pytest for in-mem mode by @ray6080 in #4095
Full Changelog: v0.5.0...v0.6.0
v0.5.0
Version 0.5.0 introduces several major changes:
Performance improvements
- MVCC-based transaction manager.
- Remote file system cache in httpfs extension.
New features
- Attach remote Kùzu databases.
- Python UDFs.
- List lambda functions.
- Scan and copy from DataFrames.
- New DDL statements: create table if not exists; drop table if exists.
- Progress bar in CLI and Explorer.
- Join order hints. Specify join order in Cypher.
New extensions and API improvements
- SQLite scanner.
- Support copying from and to JSON files.
- Decimal data type.
- Numerous improvements on C API.
Please see our release post for more details!
What's Changed
- Allow fuzzy matching on test result by @yiyun-sj in #3432
- Support asserting RETURN result column names in testing framework by @yiyun-sj in #3417
- Remove unique_ptr of value in literal expression by @andyfengHKU in #3440
- Replace const pointer with const reference in type functions by @manh9203 in #3430
- Infer test group name directly from test file path by @yiyun-sj in #3418
- Use a lockfree data structure to store page states by @benjaminwinger in #3425
- Move length function as a rewrite function by @andyfengHKU in #3442
- Remove shared_ptr of value in parameter expression by @andyfengHKU in #3443
- Optimize InMemoryHashIndex lookups by @benjaminwinger in #3378
- Add Python UDF for Primitive Types by @mxwli in #3390
- Upgrade runner to Ubuntu 24.04 by @mewim in #3445
- Support multiple query statements in e2e test framework by @yiyun-sj in #3437
- Issue 2385 by @andyfengHKU in #3444
- Rework the public interface of SelectionVector by @ray6080 in #3447
- Fix python empty dict parameter bug by @andyfengHKU in #3452
- Update project version to 0.4.1 by @mewim in #3455
- Allow numeric value comparison with precision by @yiyun-sj in #3453
- Fix CSV file answers tuple count bug by @yiyun-sj in #3454
- Add mvcc support for catalog by @ray6080 in #3301
- Fix calculation of hash slots on 32bit env by @acquamarin in #3460
- Implement Polars Scanning by @mxwli in #3451
- Pass transaction pointer to function by @hououou in #3239
- Support initialize test case on existing binary db directory by @yiyun-sj in #3428
- Reclaim empty overflow slots in memory hash index by @benjaminwinger in #3438
- Fix pandas UTF-8 scan by @acquamarin in #3468
- Remove logger from Database by @ray6080 in #3270
- Add Nested Types List and Map for Python UDF support by @mxwli in #3450
- Fix #2888: error when commit/rollback on invalid transaction; fix error msg on nested transaction by @ray6080 in #3469
- Fix some minor hash index issues by @benjaminwinger in #3471
- Backtraces by @benjaminwinger in #3456
- Fixed issue with commit error not showing in shell by @MSebanc in #3472
- Add generic utility functions by @manh9203 in #3282
- Attach remote kuzu database by @acquamarin in #3467
- Refactor ftable schema by @andyfengHKU in #3479
- Merge the HashIndex bulkstorage with the local storage for inserts by @benjaminwinger in #3482
- Fix string hash and add more hash tests by @benjaminwinger in #3473
- Migrate benchmark to new server by @mewim in #3487
- Added missing algorithm includes needed for gcc 14 by @benjaminwinger in #3490
- Port changes from 0.4.2 to master by @mewim in #3493
- Enable -Werror for GCC Build & Test Job by @mxwli in #3494
- Implement attach options by @acquamarin in #3485
- Added current_timestamp and current_date functions by @MSebanc in #3497
- Fix serial csv reader by @acquamarin in #3505
- Make serializer tool able to be run standalone by @benjaminwinger in #3501
- Fix issue-3488 by @andyfengHKU in #3506
- Always build rust integration with release runtime library on Windows by @zaddach in #3226
- Support CREATE SEQUENCE functionality by @yiyun-sj in #3474
- Read primary key for delete by @andyfengHKU in #3512
- Disk array builder cleanup by @benjaminwinger in #3498
- Add issue and PR templates by @prrao87 in #3515
- Remote file system cache by @acquamarin in #3516
- Add storage version info to the single file header by @benjaminwinger in #3519
- Propagate chunk state to commit out of place in column by @ray6080 in #3522
- Implement file cache for s3 filesystem by @acquamarin in #3526
- Graph function framework by @andyfengHKU in #3486
- Support contains function by @acquamarin in #3531
- rename InQueryCall to TableFunctionCall by @andyfengHKU in #3533
- File cache optimization by @acquamarin in #3530
- Add issue template for performance optimization category by @ray6080 in #3535
- Preliminary Decimal Datatype by @mxwli in #3521
- Remove property stats by @ray6080 in #3534
- Rework scan node by @ray6080 in #3524
- Automatically merge pull request upon extension build by @mewim in #3539
- Implement string_split and split_part functions by @acquamarin in #3537
- Scan primary key column before updating by @andyfengHKU in #3542
- Report LSQB results to benchmark server by @mewim in #3545
- C Api Enhancements by @MSebanc in #3457
- Add default value to CREATE by @yiyun-sj in #3523
- Fix shell printing issue by @MSebanc in #3547
- Python UDF and C++ UDF improvements by @mxwli in #3483
- Fix create rel table group parser exception by @acquamarin in #3549
- Fix disabled test for 3524 by @andyfengHKU in #3546
- Refactor FinBench CI pipeline and report results to server by @mewim in #3551
- Refactor InteractiveV1 CI pipeline and report results to server by @mewim in #3552
- Fix join order for 3524 by @andyfengHKU in #3553
- Fix issue 3166 by @andyfengHKU in #3404
- Fix filesearchpath in localFileSystem glob by @acquamarin in #3550
- ColumnChunk statistics for zone mapping by @benjaminwinger in #2611
- Refactor BI pipeline and add report to server by @mewim in #3555
- Turn on primary key scan by @andyfengHKU in #3556
- Support populating DEFAULT values in COPY FROM statements by @yiyun-sj in #3554
- Apply zone map to scan by @andyfengHKU in #3561
- fix sequence batch insert test by @yiyun-sj in #3563
- Auto report internal benchmark results to PR runs by @mewim in #3568
- Fix return type by @sapalli2989 in #3567
- Separate larger benchmark machines for LDBC benchmarks by @mewim in #3571
- Fix rel multiplicity parsing by @acquamarin in #3574
- Reworked progress bar to keep display handling separate by @MSebanc in #3566
- Disk array packed headers by @benjaminwinger in #3557
- Fix stats updates by @benjaminwinger in #3582
- Fix issue-3570 by @andyfengHKU in #3584
- Refactor hash function execution framework by @acquamarin in #3583
- Apply zone map to rel scan by @andyfengHKU in #3573
- Track variable sized memory manager al...