I am currently using a query looking similar to this one:
MATCH (b:A)-[]->(:Attribute)-[]->(:C)<-[* SHORTEST 1..6]-(:C{name:'Kùzu'})
RETURN count(b);
The above query returns ~1k rows immediately.
To make the statement a bit more concise, I changed it to:
MATCH (b:A)-[* 2..2]->(:C)<-[* SHORTEST 1..6]-(:C{name:'Kùzu'})
RETURN count(b);
This query runs for a long time; I aborted it at "Current Pipeline Progress: 10%" after a minute or so.
Conceptually, a node of type A or C (there are more types, but let's limit the discussion to these) is always followed by an Attribute node, which in turn is followed by an A or C node again, and this cycle may repeat multiple times. There is a single relationship type for (A|C) --> (Attribute) and another for (Attribute) --> (A|C). Attribute is a reified node, hence the pairwise hopping sequence.
In summary, given the same output rows and an otherwise identical query structure, there is a big performance difference between recursive and non-recursive matching. Here is the row count from splitting the query into parts:
Is this expected behavior?
For now I have marked the issue as a performance optimization, since I am not sure whether it is a bug. Hopefully the description gives you a hint; otherwise I can try to re-create the graph with dummy data.
(The original example was a more complex query pattern, also with 33 output rows; let's keep things simple.)
Comment:
Performance decrease by a factor of ~2300 with the recursive variant
This example might show that the query processor could make better use of join optimization. Given that (:B)<-(:C) yields just 33 rows and returns immediately, it would be more performant to resolve the recursive left side (:A)-[* 2]->(:B) only for these 33 joined nodes instead of over the full graph.
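As a possible workaround until the planner does this on its own, the join order suggested above can be hinted at manually by splitting the query with a WITH clause. This is only a sketch: it uses the labels from the queries in the description (A, C) rather than the (:B) shorthand of this comment, and it assumes that the planner evaluates the part before WITH first. That assumption has not been verified against Kùzu's optimizer, so the plan should be checked before relying on it.

```cypher
// Step 1: resolve the selective shortest-path side first
// (reported above to return immediately with few rows).
MATCH (c:C)<-[* SHORTEST 1..6]-(:C {name: 'Kùzu'})
WITH c
// Step 2: run the fixed two-hop recursive expansion
// only against the nodes bound to c in step 1.
MATCH (b:A)-[* 2..2]->(c)
RETURN count(b);
```

If this form runs as fast as the explicit (:A)-[]->(:Attribute)-[]->(:C) variant, that would support the idea that the slowdown comes from join ordering rather than from the recursive expansion itself.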