
Commit 4f12d74

Merge pull request #9 from Jian-Zhang-3DV/main
Add Description of Eval Process and Test Data Download Method
2 parents 7568b83 + 5c779fc commit 4f12d74

File tree

9 files changed: +597 -31 lines changed


.gitignore

Lines changed: 2 additions & 1 deletion
```diff
@@ -9,7 +9,7 @@
 checkpoints/*
 
 # Datasets
-data/*
+/data
 
 # Python bytecode
 *.pyc
@@ -24,3 +24,4 @@ build/
 
 # Outputs
 outputs/
+
```

.vscode/launch.json

Lines changed: 17 additions & 0 deletions
```diff
@@ -74,6 +74,23 @@
                 "--output_dir",
                 "checkpoints/Debug"
             ]
+        },
+        {
+            "name": "test.py",
+            "type": "debugpy",
+            "request": "launch",
+            "program": "${workspaceFolder}/test.py",
+            "console": "integratedTerminal",
+            "args": [
+                "--pretrained",
+                "checkpoints/pretrained_models/checkpoint-final.pth",
+                "--test_dataset",
+                "TestDataset(split='test', is_training=False, ROOT='data/scannet_test', resolution=256, seed=777)",
+                "--test_criterion",
+                "TestLoss()",
+                "--test_results_dir",
+                "outputs/test"
+            ]
         }
     ]
 }
```
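
For reference, the command-line equivalent of the `test.py` debug configuration above, run from the repository root:

```bash
python test.py \
    --pretrained checkpoints/pretrained_models/checkpoint-final.pth \
    --test_dataset "TestDataset(split='test', is_training=False, ROOT='data/scannet_test', resolution=256, seed=777)" \
    --test_criterion "TestLoss()" \
    --test_results_dir outputs/test
```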

README.md

Lines changed: 25 additions & 26 deletions
````diff
@@ -16,17 +16,25 @@ LSM reconstructs explicit radiance fields from two unposed images in real-time,
 ## Table of Contents
 
 - [Table of Contents](#table-of-contents)
+- [Updates](#updates)
 - [Feature and RGB Rendering](#feature-and-rgb-rendering)
   - [Feature Visualization](#feature-visualization)
   - [RGB Color Rendering](#rgb-color-rendering)
 - [Get Started](#get-started)
   - [Installation](#installation)
   - [Data Preparation](#data-preparation)
+  - [Training](#training)
   - [Inference](#inference)
-- [Updates](#updates)
 - [Acknowledgement](#acknowledgement)
 - [Citation](#citation)
 
+## Updates
+
+**[2025-04-12]** Added test dataset download instructions and testing process description. See [data_process/data.md](data_process/data.md) for details.
+
+**[2025-03-09]** Added ScanNet++ data preprocessing pipeline. For detailed instructions, please refer to [data_process/data.md](data_process/data.md).
+
+**[2025-03-06]** Added ScanNet data preprocessing pipeline improvements. For detailed instructions, please refer to [data_process/data.md](data_process/data.md).
 
 ## Feature and RGB Rendering
 
@@ -106,28 +114,25 @@ LSM reconstructs explicit radiance fields from two unposed images in real-time,
 ```
 
 ### Data Preparation
-1. **For training**: The model can be trained on ScanNet and ScanNet++ datasets.
+1. **For training**: The model can be trained on ScanNet and ScanNet++ datasets.
    - Both datasets require signing agreements to access
    - Detailed data preparation instructions are available in [data_process/data.md](data_process/data.md)
 
-   Quick overview of data structure after processing:
-   ```bash
-   # For ScanNet
-   data/scannet_processed/
-   └── {scene_id}/
-       ├── color/   # RGB images
-       ├── depth/   # Depth maps
-       └── pose/    # Camera parameters
-
-   # For ScanNet++
-   data/scannetpp_render/
-   └── {scene_id}/
-       └── dslr/
-           ├── camera/                    # Camera parameters
-           ├── render_depth/              # Depth maps
-           ├── rgb_resized_undistorted/   # RGB images
-           └── mask_resized_undistorted/  # Masks
-   ```
+2. **For testing**: Refer to [data_process/data.md](data_process/data.md) for details on the test dataset.
+
+### Training
+After preparing the datasets, you can train the model using the following command:
+```bash
+bash scripts/train.sh
+```
+
+The training results will be saved to `SAVE_DIR`. By default, it is set to `checkpoints/output`.
+
+Optional parameters in `scripts/train.sh`:
+```bash
+# Directory to save training outputs
+--output_dir "checkpoints/output"
+```
 
 ### Inference
 1. Data preparation
@@ -164,12 +169,6 @@ LSM reconstructs explicit radiance fields from two unposed images in real-time,
    --resolution "256"
    ```
 
-## Updates
-
-**[2024-03-09]** Added ScanNet++ data preprocessing pipeline. For detailed instructions, please refer to [data_process/data.md](data_process/data.md).
-
-**[2024-03-06]** Added ScanNet data preprocessing pipeline improvements. For detailed instructions, please refer to [data_process/data.md](data_process/data.md).
-
 ## Acknowledgement
 
 This work is built on many amazing research works and open-source projects, thanks a lot to all the authors for sharing!
````

data_process/data.md

Lines changed: 55 additions & 1 deletion
```diff
@@ -14,6 +14,12 @@
   - [1. Download ScanNet++ Data](#1-download-scannet-data-1)
   - [2. Data Processing](#2-data-processing)
   - [3. Data Structure](#3-data-structure)
+- [Test Dataset](#test-dataset)
+  - [Download Test Dataset](#download-test-dataset)
+  - [Data Structure](#data-structure-1)
+  - [Test Set Selection Criteria](#test-set-selection-criteria)
+  - [Test Category Label Selection](#test-category-label-selection)
+  - [Test Data Loading and Evaluation Workflow](#test-data-loading-and-evaluation-workflow)
 
 ## Overview
 This document provides instructions for preparing ScanNet and ScanNet++ datasets for training and evaluation.
```
````diff
@@ -222,4 +228,52 @@
 - `extrinsic`: 4x4 camera-to-world transformation matrix
 - `render_depth/`: Rendered depth maps stored as 16-bit PNG files (depth values * 1000)
 - `rgb_resized_undistorted/`: Undistorted and resized RGB images
-- `mask_resized_undistorted/`: Undistorted and resized binary mask images (255 for valid pixels, 0 for invalid)
+- `mask_resized_undistorted/`: Undistorted and resized binary mask images (255 for valid pixels, 0 for invalid)
+
+## Test Dataset
+
+### Download Test Dataset
+```bash
+# Download and extract test dataset
+wget https://huggingface.co/datasets/Journey9ni/LSM/resolve/main/scannet_test.tar
+tar -xf scannet_test.tar -C ./data/  # Extract to the data directory
+```
+
+### Data Structure
+The test dataset is expected to have the following structure:
+```bash
+data/scannet_test/
+└── {scene_id}/
+    ├── depth/                    # Depth maps
+    ├── images/                   # RGB images
+    ├── labels/                   # Semantic labels
+    ├── selected_seqs_test.json   # Test sequence parameters
+    └── selected_seqs_train.json  # Train sequence parameters
+```
+
````
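As a quick sanity check after extraction, a short script along these lines can confirm the layout. This is a sketch; nothing is assumed about the schema of `selected_seqs_test.json` beyond it being valid JSON:

```python
import json
from pathlib import Path

# Verify each scene directory matches the structure shown in the hunk above.
root = Path("data/scannet_test")
for scene_dir in sorted(p for p in root.iterdir() if p.is_dir()):
    for sub in ("depth", "images", "labels"):
        assert (scene_dir / sub).is_dir(), f"missing {sub}/ in {scene_dir.name}"
    with open(scene_dir / "selected_seqs_test.json") as f:
        json.load(f)  # per-scene test sequence parameters
print("all scene directories look complete")
```

The data.md diff continues below.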
```diff
+### Test Set Selection Criteria
+The test set was curated using the following process:
+1. **Initial Selection**: The last 50 scenes from the alphabetically sorted list of original ScanNet scans were initially selected.
+2. **Frame Sampling**: 30 frames were sampled at regular intervals from each selected scene.
+3. **Pose Validation**: Each frame's pose data was checked for NaN values (due to errors in the original ScanNet dataset). Scenes containing frames with invalid poses were excluded (7 scenes removed).
+4. **Compatibility Check**: Scenes that caused errors during testing with NeRF-DFF and Feature-3DGS were further filtered out.
+5. **Final Set**: This process resulted in a final test set of 40 scenes.
+
```
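A minimal sketch of steps 2 and 3 (frame sampling and pose validation), assuming poses are available as 4x4 NumPy arrays; the helper names here are hypothetical:

```python
import numpy as np

def sample_frame_indices(num_frames: int, samples: int = 30) -> list[int]:
    # Step 2: 30 frames at regular intervals across the scene
    return np.linspace(0, num_frames - 1, num=samples).astype(int).tolist()

def scene_has_valid_poses(poses: np.ndarray) -> bool:
    # Step 3: exclude the scene if any sampled 4x4 pose contains NaN
    # (such errors occur in the original ScanNet dataset)
    return bool(np.isfinite(poses).all())
```

The data.md diff continues below.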
```diff
+### Test Category Label Selection
+We use a predefined set of common indoor categories: ['wall', 'floor', 'ceiling', 'chair', 'table', 'sofa', 'bed', 'other'], instead of the 20 categories used by ScanNetV2.
+
```
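For illustration, the label remapping could look like the sketch below. The raw-id correspondence here assumes NYU40-style ids and is hypothetical; the authoritative mapping is `map_func` in the dataset code:

```python
import numpy as np

# The 8 evaluation categories listed above; index = mapped label id.
CATEGORIES = ['wall', 'floor', 'ceiling', 'chair', 'table', 'sofa', 'bed', 'other']

# Hypothetical raw-label -> category-index table (NYU40-style ids assumed).
RAW_TO_CATEGORY = {1: 0, 2: 1, 22: 2, 5: 3, 7: 4, 6: 5, 4: 6}

def map_labels(raw: np.ndarray) -> np.ndarray:
    """Collapse a raw ScanNet label map onto the 8 evaluation categories."""
    out = np.full_like(raw, CATEGORIES.index('other'))
    for raw_id, cat_idx in RAW_TO_CATEGORY.items():
        out[raw == raw_id] = cat_idx
    return out
```

The data.md diff continues below.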
```diff
+### Test Data Loading and Evaluation Workflow
+
+The testing process relies on the `TestDataset` class in `large_spatial_model/datasets/testdata.py`, initialized with `split='test'` and `is_training=False`.
+
+1. **View Selection**: The dataset selects test views based on the `llff_hold` and `test_ids` parameters for each scene. Typically, frames whose index modulo `llff_hold` falls within `test_ids` are chosen as the core test frames (`target_view`).
+2. **View Grouping**: For each selected `target_view`, the dataset groups it with its immediate predecessor (`source_view1`) and successor (`source_view2`), forming a tuple of view indices: `(source_view2, target_view, source_view1)`. The test set comprises a series of these `(Scene ID, View Indices Tuple)` pairs.
+3. **Data Loading**: When iterating through the dataset during testing:
+    * The script loads RGB images (`.jpg`), depth maps (`.png`), semantic label maps (`.png`), and camera parameters (intrinsics and extrinsics from `.npz`) for each view index in the tuple.
+    * Preprocessing steps include validity checks (e.g., for NaN in camera poses) and image cropping/resizing.
+    * The `map_func` maps original ScanNet semantic labels to the simplified category set defined above.
+    * This yields a dictionary for each view containing image, depth, pose, intrinsics, processed label map, etc.
+4. **Model Inference and Evaluation**:
+    * The model takes the `source_view1` and `source_view2` data as input to infer the scene representation (e.g., Gaussian parameters for 3D Gaussian Splatting).
+    * Using these inferred parameters and the `target_view`'s camera pose/intrinsics, the model renders a semantic label map for the `target_view`.
+    * This rendered semantic map is then compared against the ground truth semantic label map for the `target_view` from the original ScanNet dataset to evaluate the model's performance.
```
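
To make the workflow concrete, here is a sketch of the view selection and grouping rule (steps 1 and 2) plus a per-class IoU comparison for step 4. This mirrors the description above rather than the actual `TestDataset` code; the `llff_hold=8` and `test_ids=(0,)` defaults are assumptions, and IoU is shown only as a representative metric (the repository's actual criterion is `TestLoss()`):

```python
import numpy as np

def build_test_tuples(num_frames: int, llff_hold: int = 8, test_ids=(0,)):
    """Steps 1-2: pick target views, then group each with its neighbors."""
    tuples = []
    for target in range(num_frames):
        if target % llff_hold not in test_ids:
            continue  # not a core test frame
        if target == 0 or target == num_frames - 1:
            continue  # need both an immediate predecessor and successor
        source_view1, source_view2 = target - 1, target + 1
        tuples.append((source_view2, target, source_view1))
    return tuples

def mean_iou(pred: np.ndarray, gt: np.ndarray, num_classes: int = 8) -> float:
    """Step 4: compare a rendered label map against the ground truth map."""
    ious = []
    for c in range(num_classes):
        union = np.logical_or(pred == c, gt == c).sum()
        if union:
            inter = np.logical_and(pred == c, gt == c).sum()
            ious.append(inter / union)
    return float(np.mean(ious)) if ious else 0.0
```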
