Official implementation of the following work:
P. Zhang, Z. Mai, Q.-H. Nguyen, and W.-L. Chao. Revisiting semi-supervised learning in the era of foundation models. arXiv preprint arXiv:2503.09707, 2025.
This repository contains the official implementation of V-PET (VFM-PEFT Ensemble Training). The code is modified from USB. Original copyright notice:
Copyright (c) 2021 Othneil Drew
We recommend using Conda to create a Python 3.9 environment:

```bash
conda create -n vpet python=3.9
conda activate vpet
```
Then install the required packages:
```bash
pip install -r requirements.txt
```
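To quickly confirm the environment works, a minimal sanity check (assuming PyTorch is among the packages listed in `requirements.txt`):

```bash
# Optional check: verify that PyTorch imports and reports its version.
python -c "import torch; print(torch.__version__)"
```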
Please download our models and data, then unzip the files and place them in the `data` and `pretrain_weight` directories with the following structure:
```
├── data
│   └── vtab
│       ├── clevr_count
│       │   ├── images
│       │   │   ├── 000
│       │   │   ├── 001
│       │   │   └── ...
│       │   ├── labeled_idx
│       │   ├── test.list
│       │   ├── train.list
│       │   ├── trainval.list
│       │   └── val.list
│       ├── diabetic_retinopathy
│       │   └── ...
│       ├── dtd
│       │   └── ...
│       ├── kitti
│       │   └── ...
│       ├── resisc45
│       │   └── ...
│       └── sun397
│           └── ...
├── pretrain_weight
│   ├── vit_base_patch14_reg4_dinov2_lvd142m.bin
│   └── vit_base_patch16_clip_224_openai.bin
└── [other files]
```
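For example, assuming the downloads arrive as `data.zip` and `pretrain_weight.zip` (hypothetical archive names; substitute the actual filenames), you can unpack them from the repository root:

```bash
# Hypothetical archive names -- replace with the actual download filenames.
unzip data.zip -d .
unzip pretrain_weight.zip -d .
```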
Because V-PET is an ensemble method that requires hyperparameter tuning before training, our workflow is implemented as a three-step process:
- Train: Train the model on labeled data.
- Tune: Based on the trained model, tune the hyperparameters on the validation set.
- V-PET: Run V-PET on the pseudo-labels generated by the tuned model.
All the training commands are in the `scripts/` folder. To run all the scripts, we can simply run `run_train.sh` in the root directory, or we can run the intended commands independently. Note that to run V-PET without the other SSL baselines, we only need to run the commands in `scripts/clip/run_supervised.sh` and `scripts/dinov2/run_supervised.sh` to train the labeled-only models.
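For instance (a sketch that assumes the scripts are run from the repository root):

```bash
# Run everything in scripts/ ...
bash run_train.sh
# ... or only the labeled-only models needed for V-PET:
bash scripts/clip/run_supervised.sh
bash scripts/dinov2/run_supervised.sh
```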
After training the models, we can generate the hyperparameter-tuning metrics by running the following commands:
```bash
# Generate the list of models for which we need hyperparameter-tuning metrics.
# The list will be saved in `eval_list.pkl`.
python eval_gen_list.py

# Read `eval_list.pkl` and generate the metrics for each model in the list.
# Results will be saved in each log folder.
python eval.py
```
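If something looks off, a quick way to inspect the generated model list (a minimal sketch, assuming `eval_list.pkl` holds a standard Python list):

```bash
# Hypothetical sanity check: count the entries queued for evaluation.
python -c "import pickle; print(len(pickle.load(open('eval_list.pkl', 'rb'))))"
```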
Then, we can collect the metrics with SQLite by running the following command:
```bash
python tune_collect.py
```
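To peek at the collected metrics directly, you can open the database with the SQLite CLI (the filename below is a hypothetical placeholder; check `tune_collect.py` for the actual path):

```bash
# `results.db` is a hypothetical name -- use the path tune_collect.py actually writes.
sqlite3 results.db ".tables"
```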
After generating the hyperparameter tuning metrics, we can run the following command to generate config files for V-PET:
```bash
python gen_config_pet.py
```
Then we can run V-PET with the generated config files, for example:
```bash
python train.py --c "config/lora/pet-ensemble/dtd/3-shot/clip/config.yaml"
```
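To launch V-PET for every generated config at once, a sketch that assumes all configs follow the directory layout of the example above:

```bash
# Assumes gen_config_pet.py writes configs under
# config/lora/pet-ensemble/<dataset>/<shots>/<backbone>/config.yaml.
for cfg in config/lora/pet-ensemble/*/*/*/config.yaml; do
    python train.py --c "$cfg"
done
```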
Similarly, we can collect the results using SQLite:
```bash
python tune_collect.py
```
To read the tuned results from SQLite, we can run the following command:
```bash
python tune_print.py
```
If you find our work useful, please cite:

```bibtex
@misc{zhang2025revisitingsemisupervisedlearningera,
  title={Revisiting semi-supervised learning in the era of foundation models},
  author={Ping Zhang and Zheda Mai and Quang-Huy Nguyen and Wei-Lun Chao},
  year={2025},
  eprint={2503.09707},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2503.09707},
}
```