Commit 3a91b5c

chore: update README
1 parent 07a57d0 commit 3a91b5c

1 file changed

+76
-70
lines changed

README.md

Lines changed: 76 additions & 70 deletions
@@ -5,76 +5,37 @@
55

66
Exabyte Property Extractor, Sourcer, Serializer (ExPreSS) is a Python package to extract material- and simulation-related properties and serialize them according to the Exabyte Data Convention (EDC) outlined in [Exabyte Source of Schemas and Examples (ESSE)](https://github.com/Exabyte-io/exabyte-esse).
77

8-
## Functionality
8+
## 1. Overview
99

10-
As below:
10+
The following functionality is supported:
1111

1212
- Extract structural information and material properties from simulation data
13-
- Serialize extracted information according to ESSE/EDC
13+
- Serialize extracted information according to [ESSE](#links) data standard
1414
- Support for multiple simulation engines, including:
1515
- [VASP](#links)
1616
- [Quantum ESPRESSO](#links)
17+
- [JARVIS](#links)
1718
- others, to be added
1819

1920
The package is written in a modular way that makes it easy to extend to additional applications and properties of interest. Contributions can take the form of additional [functionality](#todo-list) and [bug/issue reports](https://help.github.com/articles/creating-an-issue/).
2021

21-
## Architecture
22-
23-
The following diagram presents the package architecture. The package provides an [interface](express/__init__.py) to extract properties in EDC format. Inside the interface `Property` classes are initialized with a `Parser` (Vasp, Espresso, or Structure) depending on the given parameters through the parser factory. Each `Property` class implements required calls to `Parser` functions listed in these [Mixins Classes](express/parsers/mixins) to extract raw data either from the textual files, XML files or input files in string format and implements a serializer to form the final property according to the EDC format.
24-
25-
![ExPreSS](https://user-images.githubusercontent.com/10528238/53124591-9958e700-3510-11e9-9222-3aedacfd7943.png)
26-
27-
### Parsers
28-
29-
As explained above, ExPreSS parsers are responsible for extracting raw data from different sources, such as data on disk, and providing the raw data to the property classes. To make sure that all parsers implement the same interfaces, and to decouple the property classes from the parser implementations, a set of [Mixin Classes](express/parsers/mixins) is provided, which should be mixed into the parsers. The parsers must implement the Mixins' abstract methods at the time of inheritance.
30-
31-
### Properties
32-
33-
ExPreSS property classes are responsible for forming the properties from the raw data provided by the parsers and for serializing each property according to EDC. A list of supported properties is available [here](express/settings.py).
34-
35-
### Extractors
36-
37-
Extractors are classes that are composed with the parsers to extract raw data from the corresponding sources such as text or XML.
38-
39-
## Installation
22+
## 2. Installation
4023

4124
ExPreSS can be installed as a Python package either via PyPI or from the repository, as described below.
4225

43-
#### PyPi
26+
### 2.1. From PyPI
4427

4528
```bash
46-
pip install express
29+
pip install express-py
4730
```
4831

49-
#### Repository
32+
### 2.2. From GitHub repository
5033

51-
0. Install [git-lfs](https://help.github.com/articles/installing-git-large-file-storage/) in order to pull the files stored on Git LFS.
34+
See the "Development" section below.
5235

53-
1. Clone repository:
54-
55-
```bash
56-
git clone git@github.com:Exabyte-io/exabyte-express.git
57-
```
36+
## 3. Usage
5837

59-
2. Install [virtualenv](https://virtualenv.pypa.io/en/stable/) using [pip](https://pip.pypa.io/en/stable/) if not already present:
60-
61-
```bash
62-
pip install virtualenv
63-
```
64-
65-
3. Create virtual environment and install required packages:
66-
67-
```bash
68-
cd exabyte-express
69-
virtualenv venv
70-
source venv/bin/activate
71-
export GIT_LFS_SKIP_SMUDGE=1
72-
pip install -e PATH_TO_EXPRESS_REPOSITORY
73-
```
74-
75-
## Usage
76-
77-
### Extract Total Energy
38+
### 3.1. Extract Total Energy
7839

7940
The following example demonstrates how to initialize an ExPreSS class instance to extract and serialize total energy produced in a Quantum ESPRESSO calculation. The full path to the calculation directory (`work_dir`) and the file containing standard output (`stdout_file`) are required to be passed as arguments to the underlying Espresso parser.
8041

@@ -89,12 +50,13 @@ kwargs = {
8950

9051
}
9152

92-
express_ = ExPrESS("espresso", **kwargs)
93-
print json.dumps(express_.property("total_energy"), indent=4)
53+
handler = ExPrESS("espresso", **kwargs)
54+
data = handler.property("total_energy", **kwargs)
55+
print(json.dumps(data, indent=4))
9456

9557
```
9658

97-
### Extract Relaxed Structure
59+
### 3.2. Extract Relaxed Structure
9860

9961
In this example the final structure of a VASP calculation is extracted and is serialized to a material. The final structure is extracted from the `CONTCAR` file located in the calculation directory (`work_dir`). `is_final_structure=True` argument should be passed to the [Material Property](express/properties/material.py) class to let it know to extract final structure.
10062

@@ -109,12 +71,13 @@ kwargs = {
10971

11072
}
11173

112-
express_ = ExPrESS("vasp", **kwargs)
113-
print json.dumps(express_.property("material", is_final_structure=True), indent=4)
74+
handler = ExPrESS("vasp", **kwargs)
75+
data = handler.property("material", is_final_structure=True, **kwargs)
76+
print(json.dumps(data, indent=4))
11477

11578
```
11679

117-
### Extract Structure from Input
80+
### 3.3. Extract Structure from input file
11881

11982
One can use [StructureParser](express/parsers/structure.py) to extract materials from POSCAR or PW input files. Please note that `StructureParser` class only works with strings and not files and therefore the input files should be read first and then passed to the parser.
12083

@@ -131,8 +94,9 @@ kwargs = {
13194
"structure_format": "poscar"
13295
}
13396

134-
express_ = ExPrESS("structure", **kwargs)
135-
print json.dumps(express_.property("material"), indent=4)
97+
handler = ExPrESS("structure", **kwargs)
98+
data = handler.property("material", **kwargs)
99+
print(json.dumps(data, indent=4))
136100

137101
with open("./tests/fixtures/espresso/test-001/pw-scf.in") as f:
138102
pwscf_input = f.read()
@@ -142,39 +106,80 @@ kwargs = {
142106
"structure_format": "espresso-in"
143107
}
144108

145-
express_ = ExPrESS("structure", **kwargs)
146-
print json.dumps(express_.property("material"), indent=4)
147-
109+
handler = ExPrESS("structure", **kwargs)
110+
data = handler.property("material", **kwargs)
111+
print(json.dumps(data, indent=4))
148112
```
149113

150-
## Tests
114+
## 4. Development
115+
116+
### 4.1. Install From GitHub
117+
118+
1. Install [git-lfs](https://help.github.com/articles/installing-git-large-file-storage/) in order to pull the files stored on Git LFS.
119+
2. Clone repository:
120+
```bash
121+
git clone git@github.com:Exabyte-io/express.git
122+
```
123+
3. Install [virtualenv](https://virtualenv.pypa.io/en/stable/) using [pip](https://pip.pypa.io/en/stable/) if not already present:
124+
```bash
125+
pip install virtualenv
126+
```
127+
4. Create virtual environment and install required packages:
128+
```bash
129+
cd express
130+
virtualenv venv
131+
source venv/bin/activate
132+
export GIT_LFS_SKIP_SMUDGE=1
133+
pip install -e PATH_TO_EXPRESS_REPOSITORY
134+
```
135+
136+
### 4.2. Tests
151137

152138
There are two types of tests in ExPreSS: unit and integration, implemented in [Python Unit Testing Framework](https://docs.python.org/2/library/unittest.html).
153139

154-
### Unit Tests
140+
#### 4.2.1. Unit Tests
155141

156142
Unit tests are used to assert properties are serialized according to EDC. Properties classes are initialized with mocked parser data and then are serialized to assert functionality.
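
To make the "mocked parser data" idea concrete, here is a minimal sketch of such a unit test. It is an illustration under stated assumptions, not ExPreSS code: `serialize_total_energy` and the dictionary layout it returns are hypothetical and do not reproduce the actual EDC schema or any ExPreSS class.

```python
import unittest
from unittest import mock


def serialize_total_energy(parser):
    """Hypothetical serializer: wraps the parser's value in a plain dict.

    The keys used here are illustrative only, not the real EDC layout.
    """
    return {"name": "total_energy", "units": "eV", "value": parser.total_energy()}


class TotalEnergyTest(unittest.TestCase):
    def test_serialization(self):
        # The parser is mocked, so no simulation output files are needed.
        parser = mock.Mock()
        parser.total_energy.return_value = -19.5

        self.assertEqual(
            serialize_total_energy(parser),
            {"name": "total_energy", "units": "eV", "value": -19.5},
        )
        # The serializer should query the parser exactly once, without arguments.
        parser.total_energy.assert_called_once_with()


# Run the test case explicitly so the sketch works when executed as a script.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TotalEnergyTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the parser is replaced by a mock, a test of this kind exercises only the serialization step, which matches the split between unit and integration tests described in this section.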
157143

158-
### Integration Tests
144+
#### 4.2.2. Integration Tests
159145

160146
Parsers functionality is tested through integration tests. The parsers are initialized with the configuration specified in the [Tests Manifest](./tests/manifest.yaml) and then the functionality is asserted.
161147

162-
### Run Tests
148+
#### 4.2.3. Running Tests
163149

164150
> Note that the CI tests are run using a github action in `.github`, and not using the script below, so there could be discrepancies.
165151

166-
Run the following commands to run the tests.
152+
Run the following command to run the tests (unit tests only in this case).
167153

168154
```bash
169-
sh run-tests.sh -t=unit
170-
sh run-tests.sh -t=integration
155+
python -m unittest discover --verbose --catch --start-directory tests/unit
171156
```
172157

173-
## Contribution
158+
## 5. Architecture
159+
160+
The following diagram presents the package architecture. The package provides an [interface](express/__init__.py) to extract properties in EDC format. Inside the interface `Property` classes are initialized with a `Parser` (Vasp, Espresso, or Structure) depending on the given parameters through the parser factory. Each `Property` class implements required calls to `Parser` functions listed in these [Mixins Classes](express/parsers/mixins) to extract raw data either from the textual files, XML files or input files in string format and implements a serializer to form the final property according to the EDC format.
161+
162+
![ExPreSS](https://user-images.githubusercontent.com/10528238/53124591-9958e700-3510-11e9-9222-3aedacfd7943.png)
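
As a rough illustration of the parser-factory idea described above, the sketch below maps a parser name to a class and instantiates it with the caller's keyword arguments. The class names, the `PARSER_REGISTRY` dictionary and `make_parser` are hypothetical stand-ins, not taken from the ExPreSS source; the real factory may be organized differently.

```python
class VaspParser:
    """Hypothetical stand-in for a VASP parser."""

    def __init__(self, **kwargs):
        self.work_dir = kwargs.get("work_dir")


class EspressoParser:
    """Hypothetical stand-in for a Quantum ESPRESSO parser."""

    def __init__(self, **kwargs):
        self.work_dir = kwargs.get("work_dir")
        self.stdout_file = kwargs.get("stdout_file")


# Assumed registry: lowercase parser name -> parser class.
PARSER_REGISTRY = {
    "vasp": VaspParser,
    "espresso": EspressoParser,
}


def make_parser(name, **kwargs):
    """Return a parser instance chosen by name (case-insensitive)."""
    try:
        parser_cls = PARSER_REGISTRY[name.lower()]
    except KeyError:
        raise ValueError(f"unknown parser: {name!r}") from None
    return parser_cls(**kwargs)


parser = make_parser("espresso", work_dir="./calc", stdout_file="./calc/pw.out")
print(type(parser).__name__)
```

Keeping the name-to-class mapping in one dictionary means that supporting another engine only requires registering one more class, which is the kind of extensibility the paragraph above describes.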
163+
164+
### 5.1. Parsers
165+
166+
As explained above, ExPreSS parsers are responsible for extracting raw data from different sources, such as data on disk, and providing the raw data to the property classes. To make sure that all parsers implement the same interfaces, and to decouple the property classes from the parser implementations, a set of [Mixin Classes](express/parsers/mixins) is provided, which should be mixed into the parsers. The parsers must implement the Mixins' abstract methods at the time of inheritance.
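
The mixin contract can be sketched with Python's standard `abc` module. This is a minimal illustration, assuming an abstract-base-class approach: `TotalEnergyMixin`, `ToyParser`, `IncompleteParser` and the "energy =" text format are all hypothetical, and the actual ExPreSS mixins may declare and enforce their methods differently.

```python
from abc import ABC, abstractmethod


class TotalEnergyMixin(ABC):
    """Hypothetical mixin: declares what every parser must provide."""

    @abstractmethod
    def total_energy(self) -> float:
        """Return the total energy found in the raw data."""


class ToyParser(TotalEnergyMixin):
    """Hypothetical parser that reads the value from an in-memory string."""

    def __init__(self, stdout_text: str):
        self.stdout_text = stdout_text

    def total_energy(self) -> float:
        # Illustrative format only: use the number after the last "energy =".
        for line in reversed(self.stdout_text.splitlines()):
            if "energy =" in line:
                return float(line.split("=")[1].split()[0])
        raise ValueError("no total energy found")


class IncompleteParser(TotalEnergyMixin):
    """Does not implement total_energy, so it cannot be instantiated."""


parser = ToyParser("iteration 1\n total energy = -19.5 Ry\n")
print(parser.total_energy())
```

With `abc`, the contract is checked when a class is instantiated: `IncompleteParser()` raises `TypeError`, while `ToyParser` can be used by any property class that only relies on `total_energy()`.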
167+
168+
### 5.2. Properties
169+
170+
ExPreSS property classes are responsible for forming the properties from the raw data provided by the parsers and for serializing each property according to EDC. A list of supported properties is available [here](express/settings.py).
171+
172+
### 5.3. Extractors
173+
174+
Extractors are classes that are composed with the parsers to extract raw data from the corresponding sources such as text or XML.
175+
176+
177+
## 6. Contribution
174178
175179
This repository is an [open-source](LICENSE.md) work-in-progress and we welcome contributions. We suggest forking this repository and introducing the adjustments there. The changes in the fork can further be considered for merging into this repository as explained in [GitHub Standard Fork and Pull Request Workflow](https://gist.github.com/Chaser324/ce0505fbed06b947d962).
176180
177-
## TODO list
181+
182+
## 7. TODO list
178183
179184
Desirable features for implementation:
180185
@@ -187,3 +192,4 @@ Desirable features for implementation:
187192
1. [Excellent Source of Schemas and Examples (ESSE), Github Repository](https://github.com/exabyte-io/esse)
188193
1. [Vienna Ab-initio Simulation Package (VASP), official website](https://cms.mpi.univie.ac.at/vasp/)
189194
1. [Quantum ESPRESSO, Official Website](https://www.quantum-espresso.org/)
195+
1. [JARVIS NIST](https://pages.nist.gov/jarvis/)
