Commit cf1735c

Author: zhangming8
Commit message: change ema; change train pipeline
1 parent c465c14 commit cf1735c

File tree: 9 files changed (+168, -104 lines)


README.md

Lines changed: 35 additions & 20 deletions
@@ -1,10 +1,12 @@
 ## A pytorch easy re-implement of "YOLOX: Exceeding YOLO Series in 2021"

 ## 1. Notes
+
 This is a pytorch easy re-implement of "YOLOX: Exceeding YOLO Series in 2021" [https://arxiv.org/abs/2107.08430]
 The repo is still under development

 ## 2. Environment
+
 pytorch>=1.7.0, python>=3.6, Ubuntu/Windows, see more in 'requirements.txt'

 cd /path/to/your/work
@@ -16,7 +18,9 @@

 #### Model Zoo

-All weights can be downloaded from [GoogleDrive](https://drive.google.com/drive/folders/1qEMLzikH5JwRNRoHpeCa6BJBeSQ6xXCH?usp=sharing) or [BaiduDrive](https://pan.baidu.com/s/1UsbdnyVwRJhr9Vy1tmJLeQ) (code:bc72)
+All weights can be downloaded
+from [GoogleDrive](https://drive.google.com/drive/folders/1qEMLzikH5JwRNRoHpeCa6BJBeSQ6xXCH?usp=sharing)
+or [BaiduDrive](https://pan.baidu.com/s/1UsbdnyVwRJhr9Vy1tmJLeQ) (code:bc72)

 |Model |test size |mAP<sup>val<br>0.5:0.95 |mAP<sup>test<br>0.5:0.95 | Params<br>(M) |
 | ------ |:---: |:---: | :---: |:---: |
@@ -28,9 +32,11 @@ All weights can be downloaded from [GoogleDrive](https://drive.google.com/drive/
 |yolox-x |640 |50.5 |51.1 |99.1 |
 |yolox-x |800 |51.2 |51.9 |99.1 |

-mAP was reevaluated on COCO val2017 and test2017, and some results are slightly better than the official implement [YOLOX](https://github.com/Megvii-BaseDetection/YOLOX). You can reproduce them by scripts in 'evaluate.sh'
+mAP was reevaluated on COCO val2017 and test2017, and some results are slightly better than the official
+implement [YOLOX](https://github.com/Megvii-BaseDetection/YOLOX). You can reproduce them by scripts in 'evaluate.sh'

 #### Dataset
+
 download COCO:
 http://images.cocodataset.org/zips/train2017.zip
 http://images.cocodataset.org/zips/val2017.zip
@@ -45,34 +51,37 @@ mAP was reevaluated on COCO val2017 and test2017, and some results are slightly
 change opt.dataset_path = "/path/to/dataset" in 'config.py'

 #### Train
+
 See more example in 'train.sh'
-a. Train from scratch:(backbone="CSPDarknet-s" means using yolox-s, and you can change it to any other backbone, eg: CSPDarknet-nano, tiny, s, m, l, x)
-python train.py gpus='0' backbone="CSPDarknet-s" num_epochs=300 exp_id="coco_CSPDarknet-s_640x640" use_amp=True val_intervals=2 data_num_workers=6 metric="ap" batch_size=48
+a. Train from scratch:(backbone="CSPDarknet-s" means using yolox-s, and you can change it, eg: CSPDarknet-nano, tiny, s, m, l, x)
+python train.py gpus='0' backbone="CSPDarknet-s" num_epochs=300 exp_id="coco_CSPDarknet-s_640x640" use_amp=True val_intervals=2 data_num_workers=6 batch_size=48

 b. Finetune, download pre-trained weight on COCO and finetune on customer dataset:
-python train.py gpus='0' backbone="CSPDarknet-s" num_epochs=300 exp_id="coco_CSPDarknet-s_640x640" use_amp=True val_intervals=2 data_num_workers=6 metric="ap" batch_size=48 load_model="../weights/yolox-s.pth" resume=False
+python train.py gpus='0' backbone="CSPDarknet-s" num_epochs=300 exp_id="coco_CSPDarknet-s_640x640" use_amp=True val_intervals=2 data_num_workers=6 batch_size=48 load_model="../weights/yolox-s.pth"

 c. Resume, you can use 'resume=True' when your training is accidentally stopped:
-python train.py gpus='0' backbone="CSPDarknet-s" num_epochs=300 exp_id="coco_CSPDarknet-s_640x640" use_amp=True val_intervals=2 data_num_workers=6 metric="ap" batch_size=48 load_model="exp/coco_CSPDarknet-s_640x640/model_last.pth" resume=True
-
-d. Some tips:
-Ⅰ You can also change params in 'train.sh'(these params will replace opt.xxx in config.py) and use 'nohup sh train.sh &' to train
-Ⅱ If you want to close mulit-size training, change opt.random_size = None in 'config.py' or set random_size=None in 'train.sh'
-Ⅲ Mulit-gpu train: change opt.gpus = "3,5,6,7"
-Ⅳ Visualized log by tensorboard: tensorboard --logdir exp/your_exp_id/logs_2021-08-xx-xx-xx and visit http://localhost:6006
+python train.py gpus='0' backbone="CSPDarknet-s" num_epochs=300 exp_id="coco_CSPDarknet-s_640x640" use_amp=True val_intervals=2 data_num_workers=6 batch_size=48 load_model="exp/coco_CSPDarknet-s_640x640/model_last.pth" resume=True
+
+#### Some Tips:
+
+a. You can also change params in 'train.sh'(these params will replace opt.xxx in config.py) and use 'nohup sh train.sh &' to train
+b. Multi-gpu train: set opt.gpus = "3,5,6,7" in 'config.py' or set gpus="3,5,6,7" in 'train.sh'
+c. If you want to close multi-size training, change opt.random_size = None in 'config.py' or set random_size=None in 'train.sh'
+d. random_size = (14, 26) means: Randomly select an integer from interval (14,26) and multiply by 32 as the input size
+e. Visualized log by tensorboard:
+tensorboard --logdir exp/your_exp_id/logs_2021-08-xx-xx-xx and visit http://localhost:6006
 Your can also use the following shell scripts:
-grep 'train epoch' exp/your_exp_id/logs_2021-08-xx-xx-xx/log.txt
-grep 'val epoch' exp/your_exp_id/logs_2021-08-xx-xx-xx/log.txt
-grep 'AP' exp/your_exp_id/logs_2021-08-xx-xx-xx/log.txt |grep 0.95
-
+(1) grep 'train epoch' exp/your_exp_id/logs_2021-08-xx-xx-xx/log.txt
+(2) grep 'val epoch' exp/your_exp_id/logs_2021-08-xx-xx-xx/log.txt
+
 #### Evaluate

-The weights will be saved in './exp/your_exp_id/model_xx.pth'
+Module weights will be saved in './exp/your_exp_id/model_xx.pth'
 change 'load_model'='weight/path/to/evaluate.pth' and backbone='backbone-type' in 'evaluate.sh'
 sh evaluate.sh
-
+
 #### Predict/Inference/Demo
-
+
 a. Predict images, change img_dir and load_model
 python predict.py gpus='0' backbone="CSPDarknet-s" vis_thresh=0.3 load_model="exp/coco_CSPDarknet-s_640x640/model_best.pth" img_dir='/path/to/dataset/images/val2017'

@@ -82,7 +91,7 @@ mAP was reevaluated on COCO val2017 and test2017, and some results are slightly
 You can also change params in 'predict.sh', and use 'sh predict.sh'

 #### Train Customer Dataset(VOC format)
-
+
 1. put your annotations(.xml) and images(.jpg) into:
 /path/to/voc_data/images/train2017/*.jpg # train images
 /path/to/voc_data/images/train2017/*.xml # train xml annotations
@@ -106,21 +115,27 @@ mAP was reevaluated on COCO val2017 and test2017, and some results are slightly
 ## 4. Multi/One-class Multi-object Tracking(MOT)

 #### one-class/single-class MOT Dataset
+
 DOING

 #### Multi-class MOT Dataset
+
 DOING

 #### Train
+
 DOING

 #### Evaluate
+
 DOING

 #### Predict/Inference/Demo
+
 DOING

 ## 5. Acknowledgement
+
 https://github.com/Megvii-BaseDetection/YOLOX
 https://github.com/PaddlePaddle/PaddleDetection
 https://github.com/open-mmlab/mmdetection
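Tip a in the Train section above says that parameters given on the command line or in 'train.sh' replace the matching opt.xxx defaults from 'config.py'. The repo's actual parsing code is not part of this commit, so the following is only an illustrative sketch of that kind of key=value override mechanism; the class and function names here are made up:

```python
import ast


class Opt:
    """Stand-in for the opt object from config.py (illustrative only)."""
    gpus = "0"
    backbone = "CSPDarknet-s"
    num_epochs = 300
    batch_size = 48
    use_amp = False


def apply_overrides(opt, argv):
    """Overwrite opt attributes from 'key=value' style arguments."""
    for arg in argv:
        if "=" not in arg:
            continue
        key, value = arg.split("=", 1)
        assert hasattr(opt, key), f"unknown option: {key}"
        try:
            value = ast.literal_eval(value)  # "48" -> 48, "True" -> True
        except (ValueError, SyntaxError):
            pass  # keep plain strings such as CSPDarknet-m unchanged
        setattr(opt, key, value)
    return opt


if __name__ == "__main__":
    opt = apply_overrides(Opt(), ["backbone=CSPDarknet-m", "batch_size=24", "use_amp=True"])
    print(opt.backbone, opt.batch_size, opt.use_amp)  # CSPDarknet-m 24 True
```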

config.py

Lines changed: 6 additions & 6 deletions
@@ -64,7 +64,6 @@ def update_nano_tiny(cfg, inp_params):
 opt.basic_lr_per_img = 0.01 / 64.0
 opt.scheduler = "yoloxwarmcos"
 opt.no_aug_epochs = 15 # close mixup and mosaic augments in the last 15 epochs
-opt.accumulate = 1 # real batch size = accumulate * batch_size
 opt.min_lr_ratio = 0.05
 opt.weight_decay = 5e-4
 opt.warmup_epochs = 5
@@ -87,13 +86,12 @@ def update_nano_tiny(cfg, inp_params):
 opt.ema = True # False, Exponential Moving Average
 opt.grad_clip = dict(max_norm=35, norm_type=2) # None, clip gradient makes training more stable
 opt.print_iter = 1 # print loss every 1 iteration
-opt.metric = "loss" # 'Ap' 'loss', used to save 'model_best.pth'
-opt.val_intervals = 1 # evaluate(when metric='Ap') and save best ckpt every 1 epoch
+opt.val_intervals = 2 # evaluate val dataset and save best ckpt every 2 epoch
 opt.save_epoch = 1 # save check point every 1 epoch
 opt.resume = False # resume from 'model_last.pth' when set True
-opt.use_amp = False # True
+opt.use_amp = False # True, Automatic mixed precision
 opt.cuda_benchmark = True
-opt.nms_thresh = 0.65
+opt.nms_thresh = 0.65 # nms IOU threshold in post process
 opt.occupy_mem = False # pre-allocate gpu memory for training to avoid memory Fragmentation.

 opt.rgb_means = [0.485, 0.456, 0.406]
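opt.ema = True in the hunk above enables an exponential moving average of the model weights, which is the feature the commit message says was changed. The repo's actual EMA class is not shown in this diff; below is only a generic sketch of how such a weight EMA is commonly maintained in YOLOX-style training loops, with illustrative names:

```python
import copy
import math

import torch
import torch.nn as nn


class WeightEMA:
    """Keep an exponential moving average of a model's weights.
    Generic sketch only; not the repo's actual EMA implementation."""

    def __init__(self, model, decay=0.9998):
        self.ema = copy.deepcopy(model).eval()  # shadow model used for eval / best checkpoint
        self.decay = decay
        self.updates = 0
        for p in self.ema.parameters():
            p.requires_grad_(False)

    @torch.no_grad()
    def update(self, model):
        self.updates += 1
        # ramp the decay up so the first updates track the live weights closely
        d = self.decay * (1 - math.exp(-self.updates / 2000))
        ema_state = self.ema.state_dict()
        for k, v in model.state_dict().items():
            if v.dtype.is_floating_point:
                ema_state[k].mul_(d).add_(v.detach(), alpha=1 - d)


if __name__ == "__main__":
    net = nn.Linear(4, 2)
    ema = WeightEMA(net)
    optim = torch.optim.SGD(net.parameters(), lr=0.1)
    for _ in range(3):
        loss = net(torch.randn(8, 4)).pow(2).mean()
        optim.zero_grad()
        loss.backward()
        optim.step()
        ema.update(net)  # call once after every optimizer step
    print(ema.ema.weight.mean().item(), net.weight.mean().item())
```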
@@ -113,7 +111,6 @@ def update_nano_tiny(cfg, inp_params):
 opt.label_name = new_label
 opt.num_classes = len(opt.label_name)
 opt.gpus_str = opt.gpus
-opt.metric = opt.metric.lower()
 opt.gpus = [int(i) for i in opt.gpus.split(',')]
 opt.gpus = [i for i in range(len(opt.gpus))] if opt.gpus[0] >= 0 else [-1]
 if opt.master_batch_size == -1:
@@ -131,6 +128,9 @@ def update_nano_tiny(cfg, inp_params):
 opt.load_model = os.path.join(opt.save_dir, 'model_last.pth')
 if opt.random_size is not None and (opt.random_size[1] - opt.random_size[0] > 1):
     opt.cuda_benchmark = False
+    # TODO, will stuck after evaluating when multi-size training
+    opt.val_intervals = 10000
+    print("[Warning] disable evaluate when multi-size training")
 if opt.reid_dim > 0:
     assert opt.tracking_id_nums is not None
 if opt.random_size is None:
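The hunk above special-cases multi-size training (opt.random_size is not None), and tip d in the README explains the convention: random_size = (14, 26) means an integer factor is drawn from that interval and multiplied by 32 to get the input resolution. A tiny illustration of that size selection follows; it is a sketch, not the repo's dataloader code, and whether both endpoints are included depends on the actual implementation:

```python
import random


def pick_input_size(random_size=(14, 26), stride=32):
    """Draw a factor from random_size and scale by the network stride (32)."""
    factor = random.randint(random_size[0], random_size[1])  # here: endpoints included
    return factor * stride  # a multiple of 32, e.g. 448..832 for (14, 26)


if __name__ == "__main__":
    random.seed(0)
    print(sorted({pick_input_size() for _ in range(20)}))  # all values are multiples of 32
```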

data/coco_dataset.py

Lines changed: 1 addition & 1 deletion
@@ -56,7 +56,7 @@ def get_dataloader(opt, no_aug=False):
 augment=False))
 val_sampler = torch.utils.data.SequentialSampler(val_dataset)
 val_kwargs = {"num_workers": opt.data_num_workers, "pin_memory": True, "sampler": val_sampler,
-"batch_size": opt.batch_size, "drop_last": True}
+"batch_size": opt.batch_size, "drop_last": False}
 val_loader = torch.utils.data.DataLoader(val_dataset, **val_kwargs)

 return train_loader, val_loader
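The only change here flips drop_last for the validation loader from True to False, so the final partial batch is no longer thrown away and every validation image gets evaluated. A self-contained illustration of the difference, using plain torch.utils.data rather than the repo's dataset classes:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.arange(10).float())  # pretend these are 10 val samples

for drop_last in (True, False):
    loader = DataLoader(dataset, batch_size=4, drop_last=drop_last)
    seen = sum(batch[0].numel() for batch in loader)
    print(f"drop_last={drop_last}: {len(loader)} batches, {seen} of {len(dataset)} samples seen")

# drop_last=True  -> 2 batches, 8 of 10 samples seen (2 images silently skipped)
# drop_last=False -> 3 batches, 10 of 10 samples seen
```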

data/datasets/coco.py

Lines changed: 10 additions & 3 deletions
@@ -2,9 +2,11 @@
 # -*- coding:utf-8 -*-
 # Copyright (c) Megvii, Inc. and its affiliates.

+import io
 import os
 import cv2
 import json
+import contextlib
 import numpy as np
 from pycocotools.coco import COCO
 from pycocotools.cocoeval import COCOeval
@@ -81,9 +83,14 @@ def run_coco_eval(self, results, save_dir):
 coco_eval = COCOeval(self.coco, coco_det, "bbox")
 coco_eval.evaluate()
 coco_eval.accumulate()
-coco_eval.summarize()
-ap, ap_0_5 = coco_eval.stats[0], coco_eval.stats[1]
-return ap, ap_0_5
+
+redirect_string = io.StringIO()
+with contextlib.redirect_stdout(redirect_string):
+    coco_eval.summarize()
+str_result = redirect_string.getvalue()
+ap, ap_0_5, ap_7_5, ap_small, ap_medium, ap_large = coco_eval.stats[:6]
+print(str_result)
+return ap, ap_0_5, ap_7_5, ap_small, ap_medium, ap_large, str_result

 def _load_coco_annotations(self):
     return [self.load_anno_from_ids(_ids) for _ids in self.ids]
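The rewritten run_coco_eval captures what COCOeval.summarize() prints so the summary can be returned as str_result (and logged later) alongside the first six entries of coco_eval.stats. The capture itself is plain standard library; a stripped-down demonstration with an ordinary print standing in for the pycocotools call:

```python
import contextlib
import io


def summarize():
    # stands in for coco_eval.summarize(), which prints its AP/AR table to stdout
    print(" Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all ] = 0.404")


redirect_string = io.StringIO()
with contextlib.redirect_stdout(redirect_string):
    summarize()  # output goes into the buffer instead of the console
str_result = redirect_string.getvalue()

print("captured %d characters" % len(str_result))
print(str_result)  # the summary text is now reusable as a plain string
```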

models/yolox.py

Lines changed: 2 additions & 2 deletions
@@ -64,7 +64,7 @@ def forward(self, inputs, targets=None, show_time=False):
 body_feats = self.backbone(inputs)
 neck_feats = self.neck(body_feats)
 yolo_outputs = self.head(neck_feats)
-# print('yolo_outputs:', [[i.shape, i.dtype] for i in yolo_outputs]) # float16 when use_amp=True
+# print('yolo_outputs:', [[i.shape, i.dtype, i.device] for i in yolo_outputs]) # float16 when use_amp=True

 if show_time:
     s2 = sync_time(inputs)
@@ -73,7 +73,7 @@ def forward(self, inputs, targets=None, show_time=False):
 if targets is not None:
     loss = self.loss(yolo_outputs, targets)
     # for k, v in loss.items():
-    # print(k, v, v.dtype) # always float32
+    # print(k, v, v.dtype, v.device) # always float32

 if targets is not None:
     return yolo_outputs, loss
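The updated debug comments note that the head outputs become float16 when use_amp=True while the loss values stay float32. Below is a minimal sketch of that mixed-precision pattern combined with the gradient clipping configured by opt.grad_clip (max_norm=35, norm_type=2) in config.py; it is an illustrative training step, not the repo's actual loop, and it simply falls back to full precision on machines without CUDA:

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
use_amp = device == "cuda"

model = nn.Linear(16, 4).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

x = torch.randn(8, 16, device=device)
with torch.cuda.amp.autocast(enabled=use_amp):
    out = model(x)            # float16 under CUDA autocast, like the yolo head outputs
    loss = out.pow(2).mean()  # loss-style reductions come back as float32
print(out.dtype, loss.dtype, out.device)

scaler.scale(loss).backward()
scaler.unscale_(optimizer)    # unscale first so clipping sees real gradient magnitudes
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=35, norm_type=2)
scaler.step(optimizer)
scaler.update()
```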
