|
| 1 | ++++ |
| 2 | +title = "OrientDB work in progress update 2025 Q1" |
| 3 | +description = "OrientDB work in progress update 2025 Q1" |
| 4 | +insert_anchor_links = "none" |
| 5 | +date="2025-04-22" |
| 6 | +[extra] |
| 7 | +menu = false |
| 8 | ++++ |
| 9 | +# |
| 10 | +A few months have passed since the last update, so here is a summary of what has happened on the develop branch in the meantime. |
| 11 | + |
| 12 | +As mentioned in the original work post, most of the effort for 4.0 goes into removing long-deprecated things, |
| 13 | +and in the last few months two main legacy APIs have been removed. If you have been using OrientDB for a while you may be a bit attached to them, as they were the main APIs before 3.x, |
| 14 | +but `ODatabaseDocumentTx` and `OServerAdmin` are no longer part of the codebase. Some related helpers and the underlying implementations they required have been removed with them. |
| 15 | +This was not a large amount of code, but these APIs were used in a lot of tests, so converting all the legacy tests to the new APIs took a long time. |
| 16 | + |
| 17 | +Some other refactoring work that was not mentioned in earlier posts was also done, mostly around the client implementation. |
| 18 | +Traditionally OrientDB used the storage interface as the interface for the network layer. Time has shown that this was not a good design choice: it made it much harder to evolve the remote network layer and disk persistence, |
| 19 | +and it let details of the actual storage implementation leak around the codebase, bypassing the interface. The network implementation no longer implements the storage interface, |
| 20 | +so the client implementation now actually looks like a client implementation, without low-level storage concepts in it, and the storage interface is once again the separation layer between raw data persistence and database logic. |
| 21 | +More work is still needed to cleanly split the storage implementation from the higher-level implementations and make the persistence engine pluggable again, as in OrientDB 1.x; this will probably be done during the development of 4.x. |
| 22 | + |
| 23 | +The last and biggest set of work addressed one really important issue: the memory usage of the query engine. |
| 24 | +In 3.x you could drain all the memory of the server with a suitably crafted, not too complex query. This was due to the algorithm used to find the indexes to use in the query. |
| 25 | +This algorithm has been redesigned from scratch, resulting in much lower memory usage for complex queries with multiple conditions. With this redesign, though, some of the optimizations |
| 26 | +that existed in the previous implementation and improved performance for simple queries are gone and need to be redone; one example is skipping the check of a condition already handled by the index lookup. |
| 27 | +These optimizations are not critical for a 4.0.0 release, but it would be good to include them before the final release. The new implementation also makes additional optimizations possible. |
| 28 | +So the current implementation has slightly different performance characteristics compared to 3.x: some simpler queries may be a bit slower and some more complex queries will be much faster. Later |
| 29 | +in the development, some profiling and optimization will be done to make sure there are only performance improvements and no regressions. |
| 30 | + |
| 31 | + |
| 32 | +Going back to the original list of things to do, here is the updated status: |
| 33 | + |
| 34 | + |
| 35 | +- [x] Remove legacy query engine APIs |
| 36 | +- [ ] Remove legacy document store APIs |
| 37 | +- [x] Remove legacy database APIs |
| 38 | +- [x] Fix major issues in query engine and query performance improvement |
| 39 | +- [ ] Transactional DDLs |
| 40 | +- [ ] In memory database backup & restore |
| 41 | +- [ ] Support for distributed "in memory" databases |
| 42 | +- [ ] Split distributed transaction log from WAL |
| 43 | +- [x] Removal of TP2 (aka Blueprints) APIs from the code |
| 44 | +- [x] Inclusion of TP3 (aka Gremlin) APIs in the main repository |
| 45 | +- [ ] Rewrite distributed node discovery |
| 46 | +- [ ] Introduce a proper consensus-based algorithm for distributed topology management |
| 47 | +- [ ] New minimum Java version of 17, with migrations with the next major |
| 48 | +- [ ] Lucene integration improvements |
| 49 | +- [ ] Review of data types, trying to remove redundant ones and potentially adding new types |
| 50 | +- [ ] Add & Complete server commands |
| 51 | +- [ ] Review the console to integrate server commands |
| 52 | +- [ ] Introduce a third party http server |
| 53 | +- [ ] Review and improve non-Java clients |
| 54 | +- [ ] Review and upgrade Studio dependencies |
| 55 | +- [ ] Review metrics, maybe integrating OpenTelemetry |
| 56 | + |
| 57 | + |
| 58 | +Again, this list does not need to be fully completed to release 4.0.0, but a few more critical points need to be done before the release. |