Improve tutorial documentation #353
Comments
I think that we can use an artificial dataset that represents a community of Twitter users to demonstrate these benefits.
Dataset: Twitter community
Nodes: User, Posts
Relations: Follows (FROM User TO User)
Queries
Other suggested queries:
I also attached the CSVs to be used in the tutorial. cc @prrao87, please take a look 👍
Here is a rough draft of what the Rust tutorial's queries will look like. I think we should:
Any other suggestions would also be helpful!
use kuzu::{Connection, Database, Error, SystemConfig};

fn main() -> Result<(), Error> {
    // Create an empty on-disk database and connect to it
    let db = Database::new("./demo_db", SystemConfig::default())?;
    let conn = Connection::new(&db)?;

    // Create the node and relationship tables
    conn.query("CREATE NODE TABLE User(userId INT64 PRIMARY KEY, username STRING, account_creation_date DATE)")?;
    conn.query("CREATE NODE TABLE User_Post(postId INT64 PRIMARY KEY, post_date DATE, like_count INT64, retweet_count INT64)")?;
    conn.query("CREATE REL TABLE FOLLOWS(FROM User TO User)")?;
    conn.query("CREATE REL TABLE POSTS(FROM User TO User_Post)")?;
    conn.query("CREATE REL TABLE LIKES(FROM User TO User_Post)")?;

    // Load the data from the CSV files
    conn.query("COPY User FROM './data/tutorial_user.csv'")?;
    conn.query("COPY User_Post FROM './data/tutorial_user_post.csv'")?;
    conn.query("COPY FOLLOWS FROM './data/TUTORIAL_FOLLOWS.csv'")?;
    conn.query("COPY POSTS FROM './data/TUTORIAL_POSTS.csv'")?;
    conn.query("COPY LIKES FROM './data/TUTORIAL_LIKES.csv'")?;

    // Two-hop query: recommending users to follow.
    // First, we want the users followed by the users we follow. We start with a query like this:
    conn.query(r#"
        MATCH (u1:User)-[f1:FOLLOWS]->(u2:User)-[f2:FOLLOWS]->(u3:User)
        RETURN u3
    "#)?;

    // Next, we restrict u1 to the user we want recommendations for, using a WHERE clause:
    conn.query(r#"
        MATCH (u1:User)-[f1:FOLLOWS]->(u2:User)-[f2:FOLLOWS]->(u3:User)
        WHERE u1.username = 'epicking81'
        RETURN u3
    "#)?;

    // This is still not entirely correct, since u3 can include users that u1 already follows.
    // As a last step, we expand the WHERE clause:
    conn.query(r#"
        MATCH (u1:User)-[f1:FOLLOWS]->(u2:User)-[f2:FOLLOWS]->(u3:User)
        WHERE u1.username = 'epicking81'
        AND NOT (u1)-[:FOLLOWS]->(u3)
        RETURN u3
    "#)?;

    // Aggregation: counting the number of people a user follows.
    // As above, we first specify the relationship. This returns the list of users our user follows:
    conn.query(r#"
        MATCH (u1:User)-[f:FOLLOWS]->(u2:User)
        WHERE u1.username = 'epicking81'
        RETURN u2
    "#)?;

    // We can alter the query to use an aggregation function and return the count instead:
    conn.query(r#"
        MATCH (u1:User)-[f:FOLLOWS]->(u2:User)
        WHERE u1.username = 'epicking81'
        RETURN count(u2)
    "#)?;

    // Aggregation is useful in many scenarios. Some more examples:
    // 1. Average like count across a user's posts:
    conn.query(r#"
        MATCH (u1:User)-[p:POSTS]->(p2:User_Post)
        WHERE u1.username = 'epicking81'
        RETURN avg(p2.like_count)
    "#)?;

    // 2. Maximum like count across a user's posts:
    conn.query(r#"
        MATCH (u1:User)-[p:POSTS]->(p2:User_Post)
        WHERE u1.username = 'epicking81'
        RETURN max(p2.like_count)
    "#)?;

    // Shortest path.
    // We can use recursive matching to find paths between nodes. This example returns
    // the length of the shortest path (1 to 4 hops) between two users:
    conn.query(r#"
        MATCH (u1:User)-[f:FOLLOWS* SHORTEST 1..4]->(u2:User)
        WHERE u1.username = 'silentguy245' AND u2.username = 'epicwolf202'
        RETURN length(f) AS length;
    "#)?;

    // Recommendation page for a user: the 10 most-liked recent posts connected to followed users.
    // Note that -[]-> matches any relationship to a post, i.e. both POSTS and LIKES.
    conn.query(r#"
        MATCH (u1:User)-[f:FOLLOWS]->(u2:User)-[]->(p:User_Post)
        WHERE p.post_date > date('2022-01-01') AND u1.username = 'fastgirl798'
        RETURN p.*
        ORDER BY p.like_count DESC LIMIT 10;
    "#)?;

    Ok(())
}
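For anyone trying to follow the recommendation logic in the draft above, here is a minimal sketch of the same two-hop idea in plain Rust, with no Kùzu involved. The `FollowGraph` type, the `recommend` function, and all usernames other than `epicking81` are made up for illustration. It also assumes we want to exclude the user themselves from the results, which the draft query does not do explicitly.

```rust
use std::collections::{BTreeSet, HashMap, HashSet};

/// Hypothetical in-memory stand-in for the FOLLOWS table: follower -> followees.
type FollowGraph = HashMap<&'static str, HashSet<&'static str>>;

/// Two-hop recommendation: users followed by the people `user` follows,
/// minus anyone `user` already follows, and minus `user` itself (an
/// assumption added here; the draft Cypher query does not filter this).
fn recommend(graph: &FollowGraph, user: &str) -> BTreeSet<&'static str> {
    let empty = HashSet::new();
    // First hop: everyone `user` follows directly.
    let direct = graph.get(user).unwrap_or(&empty);
    let mut out = BTreeSet::new();
    for u2 in direct {
        // Second hop: everyone followed by a direct followee.
        for u3 in graph.get(u2).unwrap_or(&empty) {
            if *u3 != user && !direct.contains(u3) {
                out.insert(*u3);
            }
        }
    }
    out
}

fn main() {
    // Made-up follow edges for illustration only.
    let mut graph: FollowGraph = HashMap::new();
    graph.insert("epicking81", HashSet::from(["alice", "bob"]));
    graph.insert("alice", HashSet::from(["bob", "carol", "epicking81"]));
    graph.insert("bob", HashSet::from(["dave"]));

    // "bob" is dropped because he is already followed directly.
    let recs = recommend(&graph, "epicking81");
    println!("{:?}", recs);
}
```

The inner `if` plays the role of the `AND NOT (u1)-[:FOLLOWS]->(u3)` condition, and collecting into a set removes duplicates that arise when several followees follow the same person.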
This is a good starting point! Some thoughts:
Along those lines. I'm not fully sure I follow the recommendation logic, but maybe flesh out those queries more. Also, we need to think about how the output results are formatted and displayed so that we can explain them. Maybe having the
Relatedly, maybe we can show how to use the output in further processing, such as feeding it into other queries or exporting it to other formats? We can also show how to run parameterized queries.
OMG, yes, @sdht0, thanks for that callout - we totally should show parameterized queries ("prepared statements"). Please find a way to work that in, @WWW0030. Place a new markdown section under the "Tutorials" section in the docs. When you make the PR, target the dev branch so that I can work on the organization of the page after the 0.8.0 release.
Users have been asking for more tutorials and examples in languages other than Python.
I propose that we update the Tutorials section in our docs to demonstrate that Kùzu can be used from a variety of client languages. We need to showcase the same workflow, on the same dataset, highlighting that Kùzu caters to users coming from almost any language.
Subtask 1
First, we need to create an artificial dataset that clearly demonstrates the benefits of using a graph to answer the following kinds of queries.
GROUP BY clause, so we need to show how you can aggregate on a particular property while grouping on another)
SHORTEST keyword in Cypher
Subtask 2
Write tutorials that showcase the end-to-end workflow in each client language that we officially support. We would read in data from CSV/Parquet files and create individual sub-issues linked to this issue that various team members can take on.
cc @aracardan @WWW0030