Commit 982fd4d

Sync readme with other tableschema plugins (#10)
* Synced readme with other plugins
* Disable Python3.4 on CI because of Travis bug
1 parent 38cf7ba commit 982fd4d

File tree

5 files changed

+118
-71
lines changed

.travis.yml

+1-1
@@ -19,7 +19,7 @@ language:
1919
python:
2020
- 2.7
2121
- 3.3
22-
- 3.4
22+
# - 3.4
2323
- 3.5
2424
- 3.6
2525

CONTRIBUTING.md

-43
This file was deleted.

README.md

+116-26
@@ -2,25 +2,90 @@
22

33
[![Travis](https://img.shields.io/travis/frictionlessdata/tableschema-elasticsearch-py/master.svg)](https://travis-ci.org/frictionlessdata/tableschema-elasticsearch-py)
44
[![Coveralls](http://img.shields.io/coveralls/frictionlessdata/tableschema-elasticsearch-py/master.svg)](https://coveralls.io/r/frictionlessdata/tableschema-elasticsearch-py?branch=master)
5-
[![PyPi](https://img.shields.io/pypi/v/tableschema-elasticsearch-py.svg)](https://pypi.python.org/pypi/tableschema-elasticsearch-py)
6-
[![SemVer](https://img.shields.io/badge/versions-SemVer-brightgreen.svg)](http://semver.org/)
5+
[![PyPi](https://img.shields.io/pypi/v/tableschema-elasticsearch.svg)](https://pypi.python.org/pypi/tableschema-elasticsearch)
76
[![Gitter](https://img.shields.io/gitter/room/frictionlessdata/chat.svg)](https://gitter.im/frictionlessdata/chat)
87

9-
Generate and load ElasticSearch indexes based on JSON Table Schema descriptors.
8+
Generate and load ElasticSearch indexes based on [Table Schema](http://specs.frictionlessdata.io/table-schema/) descriptors.
9+
10+
## Features
11+
12+
- implements `tableschema.Storage` interface
1013

1114
## Getting Started
1215

1316
### Installation
1417

18+
The package uses semantic versioning, which means that major versions could include breaking changes. It's highly recommended to specify a `package` version range in your `setup/requirements` file, e.g. `package>=1.0,<2.0`.
19+
1520
```bash
1621
pip install tableschema-elasticsearch
1722
```
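To make the version range recommended above concrete, here is a minimal sketch of what `package>=1.0,<2.0` accepts. The helpers `parse_version` and `in_range` are hypothetical names written for illustration; they only handle plain dotted numeric versions, whereas `pip` itself understands much richer specifiers.

```python
def parse_version(text):
    """Turn '1.4.2' into (1, 4, 2) so versions compare numerically."""
    return tuple(int(part) for part in text.split('.'))


def in_range(version, lower='1.0', upper='2.0'):
    """Return True when lower <= version < upper (hypothetical helper)."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


# Any 1.x release is accepted; the next major release is not.
print(in_range('1.4.2'))  # True
print(in_range('2.0.0'))  # False
```

Pinning to a single major version this way lets minor and patch releases through while blocking releases that may contain breaking changes.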
1823

24+
### Examples
25+
26+
Code examples in this readme require a Python 3.3+ interpreter. You can find more examples in the [examples](https://github.com/frictionlessdata/tableschema-spss-py/tree/master/examples) directory.
27+
28+
```python
29+
import elasticsearch
30+
import jsontableschema_es
31+
32+
INDEX_NAME = 'testing_index'
33+
34+
# Connect to Elasticsearch instance running on localhost
35+
es=elasticsearch.Elasticsearch()
36+
storage=jsontableschema_es.Storage(es)
37+
38+
# List all indexes
39+
print(list(storage.buckets))
40+
41+
# Create a new index
42+
storage.create(INDEX_NAME, [
43+
('numbers',
44+
{
45+
'fields': [
46+
{
47+
'name': 'num',
48+
'type': 'number'
49+
}
50+
]
51+
})
52+
])
53+
54+
# Write data to index
55+
l=list(storage.write(INDEX_NAME, 'numbers', ({'num':i} for i in range(1000)), ['num']))
56+
print(len(l))
57+
print(l[:10], '...')
58+
59+
l=list(storage.write(INDEX_NAME, 'numbers', ({'num':i} for i in range(500,1500)), ['num']))
60+
print(len(l))
61+
print(l[:10], '...')
62+
63+
# Read all data from index
64+
storage=jsontableschema_es.Storage(es)
65+
print(list(storage.buckets))
66+
l=list(storage.read(INDEX_NAME))
67+
print(len(l))
68+
print(l[:10])
69+
70+
```
71+
72+
## Documentation
73+
74+
The whole public API of this package is described here and follows semantic versioning rules. Everything outside of this readme is private API and could be changed without notice in any new version.
75+
1976
### Storage
2077

21-
Package implements [Tabular Storage](https://github.com/frictionlessdata/jsontableschema-py#storage) interface.
78+
The package implements the [Tabular Storage](https://github.com/frictionlessdata/tableschema-py#storage) interface (see the full documentation at the link):
79+
80+
![Storage](https://i.imgur.com/RQgrxqp.png)
2281

23-
`elasticsearch` is used as the db wrapper. We can get storage this way:
82+
This driver provides an additional API:
83+
84+
#### `Storage(es=None)`
85+
86+
- `es (object)` - `elasticsearch.Elasticsearch` instance. If not provided, a new one will be created.
87+
88+
In this driver `elasticsearch` is used as the db wrapper. We can get storage this way:
2489

2590
```python
2691
from elasticsearch import Elasticsearch
@@ -34,25 +99,24 @@ Then we could interact with storage ('buckets' are ElasticSearch indexes in this
3499

35100
```python
36101
storage.buckets # iterator over bucket names
37-
storage.create('bucket', [(doc_type, descriptor)],
102+
storage.create('bucket', [(doc_type, descriptor)],
38103
reindex=False,
39104
always_recreate=False,
40105
mapping_generator_cls=None)
41106
# reindex will copy existing documents from an existing index with the same name (not implemented yet)
42107
# always_recreate will always recreate an index, even if it already exists. default is to update mappings only.
43-
# mapping_generator_cls allows customization of the generated mapping
108+
# mapping_generator_cls allows customization of the generated mapping
44109
storage.delete('bucket')
45110
storage.describe('bucket') # return descriptor, not implemented yet
46111
storage.iter('bucket', doc_type=optional) # yield rows
47112
storage.read('bucket', doc_type=optional) # return rows
48113
storage.write('bucket', doc_type, rows, primary_key,
49114
as_generator=False)
50-
# primary_key is a list of field names which will be used to generate document ids
115+
# primary_key is a list of field names which will be used to generate document ids
51116
```
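The `primary_key` comment above says document ids are generated from the listed fields. The sketch below illustrates why that matters when the same rows are written twice. It does not use the driver: `make_doc_id` is a hypothetical helper, the `/`-joined id format is an assumption, and a plain dict stands in for the index.

```python
def make_doc_id(row, primary_key):
    """Build an id from the primary key fields (assumed format, not the driver's)."""
    return '/'.join(str(row[name]) for name in primary_key)


# A dict keyed by document id stands in for an Elasticsearch index.
index = {}

# Two overlapping writes, mirroring the ranges used in the readme example.
for row in ({'num': i} for i in range(1000)):
    index[make_doc_id(row, ['num'])] = row
for row in ({'num': i} for i in range(500, 1500)):
    index[make_doc_id(row, ['num'])] = row

# Rows 500..999 were written twice but share an id, so they are stored once.
print(len(index))  # 1500
```

With ids derived from the key fields, rewriting a row updates the existing document instead of adding a duplicate.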
52117

53118
When creating indexes, we always create an index with a semi-random name and a matching alias that points to it. This allows us to decide whether to re-index documents whenever we're re-creating an index, or to discard the existing records.
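As a rough illustration of the naming scheme described above, the following sketch plans a physical index name and a matching alias without contacting a server. The function name `plan_index` and the `alias_<8 hex chars>` naming format are assumptions; the driver's real naming scheme is not shown in this readme.

```python
import uuid


def plan_index(alias):
    """Return a semi-random physical index name and a creation body with the alias."""
    # Assumed naming format: the alias followed by a short random suffix.
    physical_name = '{}_{}'.format(alias, uuid.uuid4().hex[:8])
    # Request body shape for creating an index together with an alias.
    body = {'aliases': {alias: {}}}
    return physical_name, body


name, body = plan_index('testing_index')
print(name)
print(body)
```

Because clients only ever address the alias, a fresh physical index can be built alongside the old one and the alias switched over once it is ready.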
54119

55-
56120
### Mappings
57121

58122
When creating indexes, the tableschema types are converted to ES types and a mapping is generated for the index.
@@ -66,16 +130,16 @@ Example:
66130
{
67131
"fields": [
68132
{
69-
"name": "my-number",
133+
"name": "my-number",
70134
"type": "number"
71135
},
72136
{
73-
"name": "my-array-of-dates",
137+
"name": "my-array-of-dates",
74138
"type": "array",
75139
"es:itemType": "date"
76140
},
77141
{
78-
"name": "my-person-object",
142+
"name": "my-person-object",
79143
"type": "object",
80144
"es:schema": {
81145
"fields": [
@@ -87,7 +151,7 @@ Example:
87151
}
88152
},
89153
{
90-
"name": "my-library",
154+
"name": "my-library",
91155
"type": "array",
92156
"es:itemType": "object",
93157
"es:schema": {
@@ -99,36 +163,62 @@ Example:
99163
}
100164
},
101165
{
102-
"name": "my-user-provded-object",
166+
"name": "my-user-provded-object",
103167
"type": "object",
104168
"es:enabled": false
105-
}
169+
}
106170
]
107171
}
108172
```
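To show how a descriptor like the one above could turn into an Elasticsearch mapping, here is a hedged sketch. It is not the package's `MappingGenerator`: the function names, the `SIMPLE_TYPES` table and the `keyword` fallback are assumptions, and the `title` field in the usage example is made up because the diff elides the nested schemas.

```python
# Assumed type table; the driver's real conversion rules may differ.
SIMPLE_TYPES = {'number': 'double', 'integer': 'long',
                'boolean': 'boolean', 'date': 'date'}


def field_to_mapping(field):
    """Convert one descriptor field into a mapping fragment (illustrative only)."""
    kind = field['type']
    if kind == 'array':
        # Elasticsearch has no array type, so map the field by its item type.
        item = dict(field, type=field.get('es:itemType', 'string'))
        return field_to_mapping(item)
    if kind == 'object':
        if field.get('es:enabled') is False:
            # Store the object without indexing its contents.
            return {'type': 'object', 'enabled': False}
        sub_schema = field.get('es:schema', {'fields': []})
        return {'type': 'object', 'properties': schema_to_properties(sub_schema)}
    # Fallback for unlisted types is an assumption.
    return {'type': SIMPLE_TYPES.get(kind, 'keyword')}


def schema_to_properties(schema):
    """Map every field name in a descriptor to its mapping fragment."""
    return {f['name']: field_to_mapping(f) for f in schema['fields']}


schema = {'fields': [
    {'name': 'my-number', 'type': 'number'},
    {'name': 'my-array-of-dates', 'type': 'array', 'es:itemType': 'date'},
    {'name': 'my-library', 'type': 'array', 'es:itemType': 'object',
     'es:schema': {'fields': [{'name': 'title', 'type': 'string'}]}},  # hypothetical field
    {'name': 'my-user-provded-object', 'type': 'object', 'es:enabled': False},
]}
properties = schema_to_properties(schema)
print(properties['my-array-of-dates'])  # {'type': 'date'}
```

The `es:`-prefixed keys carry the information Table Schema itself lacks: the item type of an array, the nested schema of an object, and whether an object should be indexed at all.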
109173

110174
#### Custom mappings
175+
111176
By providing a custom mapping generator class (via `mapping_generator_cls`), inheriting from the MappingGenerator class you should be able
112177

178+
## Contributing
113179

114-
### Drivers
180+
The project follows the [Open Knowledge International coding standards](https://github.com/okfn/coding-standards).
115181

116-
`elasticsearch-py` is used to access the ElasticSearch interface - [docs](https://elasticsearch-py.readthedocs.io/en/master/).
182+
The recommended way to get started is to create and activate a project virtual environment.
183+
To install the package and development dependencies into the active environment:
117184

118-
## API Reference
185+
```
186+
$ make install
187+
```
119188

120-
### Snapshot
189+
To run tests with linting and coverage:
121190

122-
https://github.com/frictionlessdata/tableschema-elasticsearch-py#snapshot
191+
```bash
192+
$ make test
193+
```
123194

124-
### Detailed
195+
For linting, `pylama` configured in `pylama.ini` is used. At this stage it's already
196+
installed into your environment and could be used separately with more fine-grained control
197+
as described in documentation - https://pylama.readthedocs.io/en/latest/.
125198

126-
- [Changelog](https://github.com/frictionlessdata/tableschema-elasticsearch-py/commits/master)
199+
For example to sort results by error type:
127200

128-
## Contributing
201+
```bash
202+
$ pylama --sort <path>
203+
```
204+
205+
For testing `tox` configured in `tox.ini` is used.
206+
It's already installed into your environment and could be used separately with more fine-grained control as described in documentation - https://testrun.org/tox/latest/.
207+
208+
For example, to check a subset of tests against a Python 2 environment with increased verbosity.
209+
All positional arguments and options after `--` will be passed to `py.test`:
210+
211+
```bash
212+
tox -e py27 -- -v tests/<path>
213+
```
214+
215+
Under the hood `tox` uses `pytest` configured in `pytest.ini`, `coverage`
216+
and `mock` packages. These packages are available only in tox environments.
217+
218+
## Changelog
129219

130-
Please read the contribution guideline:
220+
Only breaking and the most important changes are described here. The full changelog and documentation for all released versions can be found in the nicely formatted [commit history](https://github.com/frictionlessdata/tableschema-elasticsearch-py/commits/master).
131221

132-
[How to Contribute](CONTRIBUTING.md)
222+
### v0.x
133223

134-
Thanks!
224+
Initial driver implementation.
File renamed without changes.

pylama.ini

+1-1
@@ -3,7 +3,7 @@ linters = pyflakes,mccabe,pep8
33
ignore = E731
44

55
[pylama:pep8]
6-
max_line_length = 90
6+
max_line_length = 100
77

88
[pylama:mccabe]
99
complexity = 24
