
Commit a7d2e28

[Doc] Update dataset_prepare & inference (#2798)
1 parent 871e7ac commit a7d2e28

File tree

4 files changed, +907 -64 lines changed


docs/en/user_guides/2_dataset_prepare.md

Lines changed: 34 additions & 39 deletions
@@ -1,4 +1,4 @@
-## Prepare datasets
+# Tutorial 2: Prepare datasets

It is recommended to symlink the dataset root to `$MMSEGMENTATION/data`.
If your folder structure is different, you may need to change the corresponding paths in config files.
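
If you prefer to set up that symlink from Python rather than the shell, a minimal sketch (the source path is a placeholder for wherever your datasets actually live):

```python
import os

# Hypothetical path: point $MMSEGMENTATION/data at the existing dataset store.
# Run this from the repository root.
os.symlink('/mnt/datasets', 'data')
```
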
@@ -179,20 +179,19 @@ mmsegmentation
| │   │   │ └── polygons
```

-### Cityscapes
+## Cityscapes

The data could be found [here](https://www.cityscapes-dataset.com/downloads/) after registration.

By convention, `**labelTrainIds.png` are used for cityscapes training.
-We provided a [scripts](https://github.com/open-mmlab/mmsegmentation/blob/1.x/tools/dataset_converters/cityscapes.py) based on [cityscapesscripts](https://github.com/mcordts/cityscapesScripts)
-to generate `**labelTrainIds.png`.
+We provide a [script](https://github.com/open-mmlab/mmsegmentation/blob/1.x/tools/dataset_converters/cityscapes.py) based on [cityscapesscripts](https://github.com/mcordts/cityscapesScripts) to generate `**labelTrainIds.png`.

```shell
# --nproc means 8 process for conversion, which could be omitted as well.
python tools/dataset_converters/cityscapes.py data/cityscapes --nproc 8
```

-### Pascal VOC
+## Pascal VOC

Pascal VOC 2012 could be downloaded from [here](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar).
Beside, most recent works on Pascal VOC dataset usually exploit extra augmentation data, which could be found [here](http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/semantic_contours/benchmark.tgz).
@@ -204,14 +203,14 @@ If you would like to use augmented VOC dataset, please run following command to
python tools/dataset_converters/voc_aug.py data/VOCdevkit data/VOCdevkit/VOCaug --nproc 8
```

-Please refer to [concat dataset](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/advanced_guides/datasets.md) for details about how to concatenate them and train them together.
+Please refer to [concat dataset](../advanced_guides/add_datasets.md#concatenate-dataset) and [voc_aug config example](../../../configs/_base_/datasets/pascal_voc12_aug.py) for details about how to concatenate them and train them together.

-### ADE20K
+## ADE20K

The training and validation set of ADE20K could be download from this [link](http://data.csail.mit.edu/places/ADEchallenge/ADEChallengeData2016.zip).
We may also download test set from [here](http://data.csail.mit.edu/places/ADEchallenge/release_test.zip).

-### Pascal Context
+## Pascal Context

The training and validation set of Pascal Context could be download from [here](http://host.robots.ox.ac.uk/pascal/VOC/voc2010/VOCtrainval_03-May-2010.tar). You may also download test set from [here](http://host.robots.ox.ac.uk:8080/eval/downloads/VOC2010test.tar) after registration.
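
As an aside on the concat-dataset pointer in the hunk above: conceptually, the linked voc_aug config builds a `ConcatDataset` from the original VOC split and the augmented split. A rough, non-authoritative sketch of that idea in an MMEngine-style config (dataset paths, split files and the pipeline below are assumptions; the linked config is the real reference):

```python
# Sketch only: folder and split-file names are placeholders, not copied from the repo.
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='PackSegInputs')
]

dataset_voc = dict(
    type='PascalVOCDataset',
    data_root='data/VOCdevkit/VOC2012',
    data_prefix=dict(img_path='JPEGImages', seg_map_path='SegmentationClass'),
    ann_file='ImageSets/Segmentation/train.txt',
    pipeline=train_pipeline)

dataset_aug = dict(
    type='PascalVOCDataset',
    data_root='data/VOCdevkit/VOC2012',
    data_prefix=dict(img_path='JPEGImages', seg_map_path='SegmentationClassAug'),
    ann_file='ImageSets/Segmentation/aug.txt',
    pipeline=train_pipeline)

# Training on both splits at once is just a ConcatDataset over the two dataset dicts.
train_dataloader = dict(
    batch_size=4,
    dataset=dict(type='ConcatDataset', datasets=[dataset_voc, dataset_aug]))
```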

@@ -223,7 +222,7 @@ If you would like to use Pascal Context dataset, please install [Detail](https:/
python tools/dataset_converters/pascal_context.py data/VOCdevkit data/VOCdevkit/VOC2010/trainval_merged.json
```

-### COCO Stuff 10k
+## COCO Stuff 10k

The data could be downloaded [here](http://calvin.inf.ed.ac.uk/wp-content/uploads/data/cocostuffdataset/cocostuff-10k-v1.1.zip) by wget.

@@ -243,7 +242,7 @@ python tools/dataset_converters/coco_stuff10k.py /path/to/coco_stuff10k --nproc

By convention, mask labels in `/path/to/coco_stuff164k/annotations/*2014/*_labelTrainIds.png` are used for COCO Stuff 10k training and testing.

-### COCO Stuff 164k
+## COCO Stuff 164k

For COCO Stuff 164k dataset, please run the following commands to download and convert the augmented dataset.

@@ -267,7 +266,7 @@ By convention, mask labels in `/path/to/coco_stuff164k/annotations/*2017/*_label

The details of this dataset could be found at [here](https://github.com/nightrome/cocostuff#downloads).

-### CHASE DB1
+## CHASE DB1

The training and validation set of CHASE DB1 could be download from [here](https://staffnet.kingston.ac.uk/~ku15565/CHASE_DB1/assets/CHASEDB1.zip).

@@ -279,7 +278,7 @@ python tools/dataset_converters/chase_db1.py /path/to/CHASEDB1.zip

The script will make directory structure automatically.

-### DRIVE
+## DRIVE

The training and validation set of DRIVE could be download from [here](https://drive.grand-challenge.org/). Before that, you should register an account. Currently '1st_manual' is not provided officially.

@@ -291,7 +290,7 @@ python tools/dataset_converters/drive.py /path/to/training.zip /path/to/test.zip

The script will make directory structure automatically.

-### HRF
+## HRF

First, download [healthy.zip](https://www5.cs.fau.de/fileadmin/research/datasets/fundus-images/healthy.zip), [glaucoma.zip](https://www5.cs.fau.de/fileadmin/research/datasets/fundus-images/glaucoma.zip), [diabetic_retinopathy.zip](https://www5.cs.fau.de/fileadmin/research/datasets/fundus-images/diabetic_retinopathy.zip), [healthy_manualsegm.zip](https://www5.cs.fau.de/fileadmin/research/datasets/fundus-images/healthy_manualsegm.zip), [glaucoma_manualsegm.zip](https://www5.cs.fau.de/fileadmin/research/datasets/fundus-images/glaucoma_manualsegm.zip) and [diabetic_retinopathy_manualsegm.zip](https://www5.cs.fau.de/fileadmin/research/datasets/fundus-images/diabetic_retinopathy_manualsegm.zip).

@@ -303,7 +302,7 @@ python tools/dataset_converters/hrf.py /path/to/healthy.zip /path/to/healthy_man

The script will make directory structure automatically.

-### STARE
+## STARE

First, download [stare-images.tar](http://cecas.clemson.edu/~ahoover/stare/probing/stare-images.tar), [labels-ah.tar](http://cecas.clemson.edu/~ahoover/stare/probing/labels-ah.tar) and [labels-vk.tar](http://cecas.clemson.edu/~ahoover/stare/probing/labels-vk.tar).

@@ -315,15 +314,15 @@ python tools/dataset_converters/stare.py /path/to/stare-images.tar /path/to/labe

The script will make directory structure automatically.

-### Dark Zurich
+## Dark Zurich

Since we only support test models on this dataset, you may only download [the validation set](https://data.vision.ee.ethz.ch/csakarid/shared/GCMA_UIoU/Dark_Zurich_val_anon.zip).

-### Nighttime Driving
+## Nighttime Driving

Since we only support test models on this dataset, you may only download [the test set](http://data.vision.ee.ethz.ch/daid/NighttimeDriving/NighttimeDrivingTest.zip).

-### LoveDA
+## LoveDA

The data could be downloaded from Google Drive [here](https://drive.google.com/drive/folders/1ibYV0qwn4yuuh068Rnc-w4tPi0U0c-ti?usp=sharing).

@@ -338,55 +337,53 @@ wget https://zenodo.org/record/5706578/files/Val.zip
wget https://zenodo.org/record/5706578/files/Test.zip
```

-For LoveDA dataset, please run the following command to download and re-organize the dataset.
+For LoveDA dataset, please run the following command to re-organize the dataset.

```shell
python tools/dataset_converters/loveda.py /path/to/loveDA
```

-Using trained model to predict test set of LoveDA and submit it to server can be found [here](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/docs/en/user_guides/3_inference.md).
+How to use a trained model to predict the LoveDA test set and submit the results to the server can be found [here](https://codalab.lisn.upsaclay.fr/competitions/421).

More details about LoveDA can be found [here](https://github.com/Junjue-Wang/LoveDA).

-### ISPRS Potsdam
+## ISPRS Potsdam

-The [Potsdam](https://www2.isprs.org/commissions/comm2/wg4/benchmark/2d-sem-label-potsdam/)
-dataset is for urban semantic segmentation used in the 2D Semantic Labeling Contest - Potsdam.
+The [Potsdam](https://www2.isprs.org/commissions/comm2/wg4/benchmark/2d-sem-label-potsdam/) dataset is for urban semantic segmentation used in the 2D Semantic Labeling Contest - Potsdam.

The dataset can be requested at the challenge [homepage](https://www2.isprs.org/commissions/comm2/wg4/benchmark/data-request-form/).
The '2_Ortho_RGB.zip' and '5_Labels_all_noBoundary.zip' are required.

-For Potsdam dataset, please run the following command to download and re-organize the dataset.
+For Potsdam dataset, please run the following command to re-organize the dataset.

```shell
python tools/dataset_converters/potsdam.py /path/to/potsdam
```

In our default setting, it will generate 3456 images for training and 2016 images for validation.

-### ISPRS Vaihingen
+## ISPRS Vaihingen

-The [Vaihingen](https://www2.isprs.org/commissions/comm2/wg4/benchmark/2d-sem-label-vaihingen/)
-dataset is for urban semantic segmentation used in the 2D Semantic Labeling Contest - Vaihingen.
+The [Vaihingen](https://www2.isprs.org/commissions/comm2/wg4/benchmark/2d-sem-label-vaihingen/) dataset is for urban semantic segmentation used in the 2D Semantic Labeling Contest - Vaihingen.

The dataset can be requested at the challenge [homepage](https://www2.isprs.org/commissions/comm2/wg4/benchmark/data-request-form/).
The 'ISPRS_semantic_labeling_Vaihingen.zip' and 'ISPRS_semantic_labeling_Vaihingen_ground_truth_eroded_COMPLETE.zip' are required.

-For Vaihingen dataset, please run the following command to download and re-organize the dataset.
+For Vaihingen dataset, please run the following command to re-organize the dataset.

```shell
python tools/dataset_converters/vaihingen.py /path/to/vaihingen
```

-In our default setting (`clip_size` =512, `stride_size`=256), it will generate 344 images for training and 398 images for validation.
+In our default setting (`clip_size`=512, `stride_size`=256), it will generate 344 images for training and 398 images for validation.

-### iSAID
+## iSAID

The data images could be download from [DOTA-v1.0](https://captain-whu.github.io/DOTA/dataset.html) (train/val/test)

The data annotations could be download from [iSAID](https://captain-whu.github.io/iSAID/dataset.html) (train/val)

-The dataset is a Large-scale Dataset for Instance Segmentation (also have segmantic segmentation) in Aerial Images.
+The dataset is a Large-scale Dataset for Instance Segmentation (which also has semantic segmentation) in Aerial Images.

You may need to follow the following structure for dataset preparation after downloading iSAID dataset.

@@ -415,7 +412,7 @@ You may need to follow the following structure for dataset preparation after dow
python tools/dataset_converters/isaid.py /path/to/iSAID
```

-In our default setting (`patch_width`=896, `patch_height`=896, `overlap_area`=384), it will generate 33978 images for training and 11644 images for validation.
+In our default setting (`patch_width`=896, `patch_height`=896, `overlap_area`=384), it will generate 33978 images for training and 11644 images for validation.

## LIP(Look Into Person) dataset

@@ -435,7 +432,7 @@ mv val_segmentations ../
cd ..
```

-The contents of LIP datasets include:
+The contents of LIP datasets include:

```none
├── data
@@ -456,10 +453,9 @@ The contents of LIP datasets include:

## Synapse dataset

-This dataset could be download from [this page](https://www.synapse.org/#!Synapse:syn3193805/wiki/)
+This dataset could be downloaded from [this page](https://www.synapse.org/#!Synapse:syn3193805/wiki/).

-To follow the data preparation setting of [TransUNet](https://arxiv.org/abs/2102.04306), which splits original training set (30 scans)
-into new training (18 scans) and validation set (12 scans). Please run the following command to prepare the dataset.
+To follow the data preparation setting of [TransUNet](https://arxiv.org/abs/2102.04306), which splits the original training set (30 scans) into a new training set (18 scans) and a validation set (12 scans), please run the following command to prepare the dataset.

```shell
unzip RawData.zip
@@ -532,10 +528,9 @@ Then, use this command to convert synapse dataset.
python tools/dataset_converters/synapse.py --dataset-path /path/to/synapse
```

-Noted that MMSegmentation default evaluation metric (such as mean dice value) is calculated on 2D slice image,
-which is not comparable to results of 3D scan in some paper such as [TransUNet](https://arxiv.org/abs/2102.04306).
+Note that the MMSegmentation default evaluation metric (such as the mean Dice value) is calculated on 2D slice images, which is not comparable to the 3D-scan results reported in some papers such as [TransUNet](https://arxiv.org/abs/2102.04306).

-### REFUGE
+## REFUGE

Register in [REFUGE Challenge](https://refuge.grand-challenge.org) and download [REFUGE dataset](https://refuge.grand-challenge.org/REFUGE2Download).

@@ -624,4 +619,4 @@ It includes 400 images for training, 400 images for validation and 400 images fo
```

- You could set Datasets version with `MapillaryDataset_v1` and `MapillaryDataset_v2` in your configs.
-View the Mapillary Vistas Datasets config file here [V1.2](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/configs/_base_/datasets/mapillary_v1.py) and [V2.0](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/configs/_base_/datasets/mapillary_v2.py)
+View the Mapillary Vistas Datasets config file here [V1.2](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/configs/_base_/datasets/mapillary_v1.py) and [V2.0](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/configs/_base_/datasets/mapillary_v2.py)
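
For illustration only, a minimal sketch of how the dataset version might be selected in a training config; the data root, folder names and pipeline below are placeholders, and the linked V1.2/V2.0 files are the authoritative references:

```python
# Hypothetical fragment: switching label versions is mainly a matter of the dataset type
# (plus the matching label directory).
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='PackSegInputs')
]

train_dataloader = dict(
    batch_size=2,
    dataset=dict(
        type='MapillaryDataset_v1',   # or 'MapillaryDataset_v2' for the V2.0 labels
        data_root='data/mapillary',   # assumed local layout
        data_prefix=dict(
            img_path='training/images',
            seg_map_path='training/v1.2/labels'),  # assumed label folder
        pipeline=train_pipeline))
```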

docs/en/user_guides/3_inference.md

Lines changed: 13 additions & 21 deletions
@@ -19,7 +19,7 @@ MMSegmentation provides several interfaces for users to easily use pre-trained m

## Inferencer

-We provides the most **convenient** way to use the model in MMSegmentation `MMSegInferencer`. You can get segmentation mask for an image with only 3 lines of code.
+We provide the most **convenient** way to use a model in MMSegmentation, `MMSegInferencer`. You can get the segmentation mask for an image with only 3 lines of code.

### Basic Usage
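
The basic-usage example itself falls outside this hunk, but the three-line flow it refers to looks roughly like the following sketch (the model name is one example from the metafiles; any listed name should work, and its weights are downloaded automatically):

```python
>>> from mmseg.apis import MMSegInferencer
>>> inferencer = MMSegInferencer(model='deeplabv3plus_r18-d8_4xb2-80k_cityscapes-512x1024')
>>> inferencer('demo/demo.png', show=True)
```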

@@ -36,15 +36,15 @@ The following example shows how to use `MMSegInferencer` to perform inference on
The visualization result should look like:

<div align="center">
-https://user-images.githubusercontent.com/76149310/221507927-ae01e3a7-016f-4425-b966-7b19cbbe494e.png
+<img src='https://user-images.githubusercontent.com/76149310/221507927-ae01e3a7-016f-4425-b966-7b19cbbe494e.png' />
</div>

Moreover, you can use `MMSegInferencer` to process a list of images:

```
# Input a list of images
>>> images = [image1, image2, ...] # image1 can be a file path or a np.ndarray
->>> inferencer(images, show=True, wait_time=0.5) # wait_time is delay time, and 0 means forever.
+>>> inferencer(images, show=True, wait_time=0.5) # wait_time is delay time, and 0 means forever

# Or input image directory
>>> images = $IMAGESDIR
@@ -56,13 +56,12 @@ Moreover, you can use `MMSegInferencer` to process a list of images:
>>> inferencer(images, out_dir='outputs', img_out_dir='vis', pred_out_dir='pred')
```

-There is a optional parameter of inferencer, `return_datasamples`, whose default value is False, and
-return value of inferencer is a `dict` type by default, including 2 keys 'visualization' and 'predictions'.
+There is an optional parameter of the inferencer, `return_datasamples`, whose default value is False; the return value of the inferencer is a `dict` by default, including the 2 keys 'visualization' and 'predictions'.
If `return_datasamples=True` inferencer will return [`SegDataSample`](../advanced_guides/structures.md), or list of it.

```
result = inferencer('demo/demo.png')
-# result is a `dict` including 2 keys 'visualization' and 'predictions'.
+# result is a `dict` including 2 keys 'visualization' and 'predictions'
# 'visualization' includes color segmentation map
print(result['visualization'].shape)
# (512, 683, 3)
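
# A rough sketch of the return_datasamples=True path; the printed type and shape
# below are illustrative, and pred_sem_seg is assumed to hold the label-index mask.
result = inferencer('demo/demo.png', return_datasamples=True)
print(type(result))
# e.g. <class 'mmseg.structures.SegDataSample'>
print(result.pred_sem_seg.data.shape)
# e.g. torch.Size([1, 512, 683])
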
@@ -92,18 +91,12 @@ print(type(results[0]))
### Initialization

`MMSegInferencer` must be initialized from a `model`, which can be a model name or a `Config` even a path of config file.
-The model names can be found in models' metafile, like one model name of maskformer is `maskformer_r50-d32_8xb2-160k_ade20k-512x512`, and if input model name and the weights of the model will be download automatically. Below are other input parameters:
-
-- weights (str, optional) - Path to the checkpoint. If it is not specified and model is a model name of metafile, the weights will be loaded
-from metafile. Defaults to None.
-- classes (list, optional) - Input classes for result rendering, as the prediction of segmentation
-model is a segment map with label indices, `classes` is a list which includes
-items responding to the label indices. If classes is not defined, visualizer will take `cityscapes` classes by default. Defaults to None.
-- palette (list, optional) - Input palette for result rendering, which is a list of color palette
-responding to the classes. If palette is not defined, visualizer will take `cityscapes` palette by default. Defaults to None.
-- dataset_name (str, optional)[Dataset name or alias](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/mmseg/utils/class_names.py#L302-L317)
-visulizer will use the meta information of the dataset i.e. classes and palette,
-but the `classes` and `palette` have higher priority. Defaults to None.
+The model names can be found in models' metafile (configs/xxx/metafile.yaml); for example, one model name of maskformer is `maskformer_r50-d32_8xb2-160k_ade20k-512x512`, and if a model name is given, the weights of the model will be downloaded automatically. Below are other input parameters:
+
+- weights (str, optional) - Path to the checkpoint. If it is not specified and model is a model name of metafile, the weights will be loaded from metafile. Defaults to None.
+- classes (list, optional) - Input classes for result rendering, as the prediction of segmentation model is a segment map with label indices, `classes` is a list which includes items corresponding to the label indices. If classes is not defined, visualizer will take `cityscapes` classes by default. Defaults to None.
+- palette (list, optional) - Input palette for result rendering, which is a list of colors corresponding to the classes. If the palette is not defined, the visualizer will take the palette of `cityscapes` by default. Defaults to None.
+- dataset_name (str, optional) - [Dataset name or alias](https://github.com/open-mmlab/mmsegmentation/blob/dev-1.x/mmseg/utils/class_names.py#L302-L317), the visualizer will use the meta information of the dataset, i.e. classes and palette, but the `classes` and `palette` have higher priority. Defaults to None.
- device (str, optional) - Device to run inference. If None, the available device will be automatically used. Defaults to None.
- scope (str, optional) - The scope of the model. Defaults to 'mmseg'.
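
A short sketch tying these parameters together; the local config path, checkpoint and class/palette values are placeholders, not files shipped with MMSegmentation:

```python
from mmseg.apis import MMSegInferencer

# Initialize from a metafile model name; the matching weights are fetched automatically.
inferencer = MMSegInferencer(model='maskformer_r50-d32_8xb2-160k_ade20k-512x512')

# Or initialize from a local config plus checkpoint and override the rendering metadata.
inferencer = MMSegInferencer(
    model='configs/my_model/my_config.py',        # hypothetical config file
    weights='work_dirs/my_model/iter_80000.pth',  # hypothetical checkpoint
    classes=['background', 'lesion'],             # names matching the label indices
    palette=[[0, 0, 0], [255, 0, 0]],             # one color per class
    device='cuda:0')
```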
109102

@@ -113,8 +106,7 @@ The model names can be found in models' metafile, like one model name of maskfor
113106

114107
- show (bool) - Whether to display the image in a popup window. Defaults to False.
115108
- wait_time (float) - The interval of show (s). Defaults to 0.
116-
- img_out_dir (str) - Subdirectory of `out_dir`, used to save rendering color segmentation mask, so `out_dir` must be defined
117-
if you would like to save predicted mask. Defaults to 'vis'.
109+
- img_out_dir (str) - Subdirectory of `out_dir`, used to save rendering color segmentation mask, so `out_dir` must be defined if you would like to save predicted mask. Defaults to 'vis'.
118110
- opacity (int, float) - The transparency of segmentation mask. Defaults to 0.8.
119111

120112
The examples of these parameters is in [Basic Usage](#basic-usage)
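
For convenience, a small sketch of how these visualization arguments combine in one call (paths are placeholders; `pred_out_dir` comes from the earlier usage example):

```python
# Saves rendered masks under outputs/vis and raw predictions under outputs/pred,
# while popping up each result for two seconds at half opacity.
inferencer('demo/demo.png', show=True, wait_time=2, out_dir='outputs',
           img_out_dir='vis', pred_out_dir='pred', opacity=0.5)
```
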
@@ -245,7 +237,7 @@ vis_image = show_result_pyplot(model, img_path, result)
# save the visualization result, the output image would be found at the path `work_dirs/result.png`
vis_iamge = show_result_pyplot(model, img_path, result, out_file='work_dirs/result.png')

-# Modify the time of displaying images, note that 0 is the special value that means "forever".
+# Modify the time of displaying images, note that 0 is the special value that means "forever"
vis_image = show_result_pyplot(model, img_path, result, wait_time=5)
```
