 **gitbase** is a SQL database interface to Git repositories.
@@ -8,196 +8,27 @@ about the [Universal AST](https://doc.bblf.sh/) of the code itself. gitbase is b
 gitbase implements the *MySQL* wire protocol, so it can be accessed using any MySQL
 client or library from any language.
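Because the server speaks the MySQL wire protocol, the stock `mysql` command-line client should be able to reach it. The following is a minimal sketch, assuming a server on the same machine with the documented defaults (port 3306, user `root`). The `gitbase_query` helper and the `SQL_HOST`, `SQL_PORT` and `SQL_USER` variables are hypothetical names introduced for illustration; they are not part of gitbase.

```bash
#!/bin/sh
# Run one SQL statement against a gitbase server through the mysql CLI.
# SQL_HOST, SQL_PORT and SQL_USER are hypothetical overrides for this
# sketch; gitbase itself does not read them.
# 127.0.0.1 is used instead of "localhost" because the mysql client treats
# "localhost" as a request for a Unix socket rather than a TCP connection.
gitbase_query() {
    mysql --host="${SQL_HOST:-127.0.0.1}" \
          --port="${SQL_PORT:-3306}" \
          --user="${SQL_USER:-root}" \
          --execute="$1"
}
```

With a server running, `gitbase_query 'SHOW TABLES'` would then list the available tables.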

+[src-d/go-mysql-server](https://github.com/src-d/go-mysql-server) is the SQL engine implementation used by `gitbase`.
+
 ## Status

 The project is currently in **alpha** stage, meaning it's still lacking performance in a number of cases, but we are working hard on getting a performant system able to process thousands of repositories in a single node. Stay tuned!

 ## Examples

-To see the SQL subset currently supported take a look at [this list](https://github.com/src-d/go-mysql-server/blob/0084abf48137b4d72c6f948abfde91a00f3f77f0/SUPPORTED.md) from [src-d/go-mysql-server](https://github.com/src-d/go-mysql-server).
-
-[src-d/go-mysql-server](https://github.com/src-d/go-mysql-server) is the project where the SQL engine used by ***gitbase*** is implemented.
-
-#### Get all the HEAD references from all the repositories
-
-```sql
-SELECT * FROM refs WHERE ref_name = 'HEAD'
-```
-
-#### Commits that appear in more than one reference
-
-```sql
-SELECT * FROM (
-    SELECT COUNT(c.commit_hash) AS num, c.commit_hash
-    FROM ref_commits r
-    INNER JOIN commits c
-        ON r.commit_hash = c.commit_hash
-    GROUP BY c.commit_hash
-) t WHERE num > 1
-```
-
-#### Get the number of blobs per HEAD commit
-
-```sql
-SELECT COUNT(c.commit_hash), c.commit_hash
-FROM ref_commits as r
-INNER JOIN commits c
-    ON r.ref_name = 'HEAD' AND r.commit_hash = c.commit_hash
-INNER JOIN commit_blobs cb
-    ON cb.commit_hash = c.commit_hash
-GROUP BY c.commit_hash
-```
-
-#### Get commits per committer, per month in 2015
-
-```sql
-SELECT COUNT(*) as num_commits, month, repo_id, committer_email
-FROM (
-    SELECT
-        MONTH(committer_when) as month,
-        r.repository_id as repo_id,
-        committer_email
-    FROM ref_commits r
-    INNER JOIN commits c
-        ON YEAR(c.committer_when) = 2015 AND r.commit_hash = c.commit_hash
-    WHERE r.ref_name = 'HEAD'
-) as t
-GROUP BY committer_email, month, repo_id
-```
-
-## Installation
-
-### Prerequisites
+You can see some [query examples](/docs/using-gitbase/examples.md) in the [gitbase documentation](/docs).

-**gitbase** has two optional dependencies that should be running on your system if you're planning on using certain functionality.
+## Motivation and scope

-- [bblfsh](https://github.com/bblfsh/bblfshd) >= 2.5.0 (only if you're planning to use the `UAST` functionality provided in gitbase).
-- [pilosa](https://github.com/pilosa/pilosa) 0.9.0 (only if you're planning on using indexes).
-
-### Installing from binaries
-
-Check the [Release](https://github.com/src-d/gitbase/releases) page to download the gitbase binary.
-
-### Installing from source
-
-Because gitbase uses [bblfsh's client-go](https://github.com/bblfsh/client-go), which uses cgo, you need to install some dependencies by hand instead of just using `go get`.
-
-_Note_: we use `go get -d` so the code is not compiled yet, as it would
-fail before `make dependencies` is executed successfully.
-
-```
-go get -d github.com/src-d/gitbase
-cd $GOPATH/src/github.com/src-d/gitbase
-make dependencies
-```
-
-## Usage
-
-### Local
-```bash
-Usage:
-  gitbase [OPTIONS] <server | version>
-
-Help Options:
-  -h, --help  Show this help message
-
-Available commands:
-  server   Start SQL server.
-  version  Show the version information.
-```
-
-You can start a server providing a path which contains multiple git repositories `/path/to/repositories` with this command:
-
-```
-$ gitbase server -v -g /path/to/repositories -u gitbase
-```
-
-### Docker
-
-You can use the official image from [docker hub](https://hub.docker.com/r/srcd/gitbase/tags/) to quickly run gitbase:
-```
-docker run --rm --name gitbase -p 3306:3306 -v /my/git/repos:/opt/repos srcd/gitbase:latest
-```
-
-If you want to speed up gitbase using indexes, you must run a pilosa container:
-```
-docker run -it --rm --name pilosa -p 10101:10101 pilosa/pilosa:v0.9.0
-```
-
-Then link the gitbase container to the pilosa one:
-| `GITBASE_SKIP_GIT_ERRORS` | do not stop queries on git errors, default disabled |
-| `GITBASE_INDEX_DIR` | directory where indexes will be persisted |
+gitbase was born to ease the analysis of git repositories and their source code.

-## Tables
+By making it MySQL compatible, we also provide maximum compatibility with languages and existing tools.

-You can execute the `SHOW TABLES` statement to get a list of the available tables.
-To get all the columns and types of a specific table, you can write `DESCRIBE TABLE [tablename]`.
+Shipping as a single binary allows gitbase to be used as a standalone service. The service can process local repositories or integrate with existing tools and frameworks (e.g. Spark) to perform source code analysis at large scale.
-|is_remote(reference_name)bool| check if the given reference name is from a remote one |
-|is_tag(reference_name)bool| check if the given reference name is a tag |
-|language(path, [blob])text| gets the language of a file given its path and the optional content of the file |
-|uast(blob, [lang, [xpath]])json_blob| returns an array of UAST nodes as blobs |
-|uast_xpath(json_blob, xpath)| performs an XPath query over the given UAST nodes |
-
-## Unstable features
-
-- **Table squashing:** there is an optimization that collects inner joins between tables with a set of supported conditions and converts them into a single node that retrieves the data in chained steps (getting first the commits and then the blobs of every commit instead of joining all commits and all blobs, for example). It can be enabled with the environment variable `GITBASE_UNSTABLE_SQUASH_ENABLE`.
+From here, you can directly go to [getting started](/docs/using-gitbase/getting-started.md).
If you need support, want to contribute or just want to say hi, join us at the [source{d} community Slack](https://join.slack.com/t/sourced-community/shared_invite/enQtMjc4Njk5MzEyNzM2LTFjNzY4NjEwZGEwMzRiNTM4MzRlMzQ4MmIzZjkwZmZlM2NjODUxZmJjNDI1OTcxNDAyMmZlNmFjODZlNTg0YWM). We hang out in the #general channel.
+| `GITBASE_BLOBS_MAX_SIZE` | maximum blob size to return in MiB, default 5 MiB |
+| `GITBASE_BLOBS_ALLOW_BINARY` | enable retrieval of binary blobs, default `false` |
+| `GITBASE_UNSTABLE_SQUASH_ENABLE` | enable the join squash rule to improve query performance (**experimental**). This optimization collects inner joins between tables with a set of supported conditions and converts them into a single node that retrieves the data in chained steps (getting first the commits and then the blobs of every commit instead of joining all commits and all blobs, for example). |
+| `GITBASE_SKIP_GIT_ERRORS` | do not stop queries on git errors, default disabled |
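The variables in the table above are read from the server's environment, so they can be set on the same command line that starts it. The following is a minimal sketch: the `start_gitbase` wrapper is a hypothetical name, and using `1` as the value for `GITBASE_SKIP_GIT_ERRORS` is an assumption (the documentation only states that `GITBASE_UNSTABLE_SQUASH_ENABLE` needs a non-empty value).

```bash
#!/bin/sh
# Start gitbase with the experimental join squash rule enabled and with
# git errors skipped instead of aborting the query.
# Assumption: any non-empty value turns GITBASE_SKIP_GIT_ERRORS on, as is
# documented for GITBASE_UNSTABLE_SQUASH_ENABLE.
# $1 is the directory that contains the git repositories.
start_gitbase() {
    GITBASE_UNSTABLE_SQUASH_ENABLE=1 \
    GITBASE_SKIP_GIT_ERRORS=1 \
    gitbase server -v -g "$1"
}
```

For example, `start_gitbase /path/to/repositories` would start a verbose server over that directory with both settings applied only to that process.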
+
+## Command line arguments
+
+```bash
+Please specify one command of: server or version
+Usage:
+  gitbase [OPTIONS] <server | version>
+
+Help Options:
+  -h, --help  Show this help message
+
+Available commands:
+  server   Starts a gitbase server instance
+  version  Show the version information
+```
+
+The `server` command accepts the following options:
+
+```bash
+Usage:
+  gitbase [OPTIONS] server [server-OPTIONS]
+
+Starts a gitbase server instance
+
+The squashing tables and pushing down join conditions is still a
+work in progress and unstable, disable by default. It can be enabled
+using a not empty value at GITBASE_UNSTABLE_SQUASH_ENABLE env variable.
+
+By default when gitbase encounters an error in a repository it
+stops the query. With GITBASE_SKIP_GIT_ERRORS variable it won't
+complain and just skip those rows or repositories.
+
+Help Options:
+  -h, --help           Show this help message
+
+[server command options]
+      -v               Activates the verbose mode
+      -g, --git=       Path where the git repositories are located, multiple directories can be defined. Accepts globs.
+          --siva=      Path where the siva repositories are located, multiple directories can be defined. Accepts globs.
+      -h, --host=      Host where the server is going to listen (default: localhost)
+      -p, --port=      Port where the server is going to listen (default: 3306)
+      -u, --user=      User name used for connection (default: root)
+      -P, --password=  Password used for connection
+          --pilosa=    URL to your pilosa server (default: http://localhost:10101)
+      -i, --index=     Directory where the gitbase indexes information will be persisted. (default: /var/lib/gitbase/index)