- [1. Download ScanNet++ Data](#1-download-scannet-data-1)
- [2. Data Processing](#2-data-processing)
- [3. Data Structure](#3-data-structure)
- [Test Dataset](#test-dataset)
  - [Download Test Dataset](#download-test-dataset)
  - [Data Structure](#data-structure-1)
  - [Test Set Selection Criteria](#test-set-selection-criteria)
  - [Test Category Label Selection](#test-category-label-selection)
  - [Test Data Loading and Evaluation Workflow](#test-data-loading-and-evaluation-workflow)

## Overview
This document provides instructions for preparing the ScanNet and ScanNet++ datasets for training and evaluation.
Each directory contains:
- `extrinsic`: 4x4 camera-to-world transformation matrix
- `render_depth/`: Rendered depth maps stored as 16-bit PNG files (depth values * 1000)
- `rgb_resized_undistorted/`: Undistorted and resized RGB images
- `mask_resized_undistorted/`: Undistorted and resized binary mask images (255 for valid pixels, 0 for invalid)

## Test Dataset

### Download Test Dataset
```bash
# Download and extract the test dataset
wget https://huggingface.co/datasets/Journey9ni/LSM/resolve/main/scannet_test.tar
mkdir -p ./data                       # create the target directory if it does not already exist
tar -xf scannet_test.tar -C ./data/   # extract into the data directory
```
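
After extraction, a quick sanity check can confirm the download matches the curation described below (the final test set contains 40 scenes, each with the layout shown in the next section). A minimal sketch in Python; the path and expected entries are taken from this document, not from the repository's code:

```python
from pathlib import Path

# Sanity check, assuming the archive extracted to data/scannet_test/
# with the per-scene layout documented in the "Data Structure" section below.
root = Path("data/scannet_test")
assert root.is_dir(), "run the download/extract commands above first"

scenes = sorted(p for p in root.iterdir() if p.is_dir())
print(f"Found {len(scenes)} scenes")  # the curated test set contains 40 scenes

for scene in scenes:
    # Each scene should provide these entries per the documented layout.
    for entry in ("depth", "images", "labels", "selected_seqs_test.json"):
        assert (scene / entry).exists(), f"{scene.name} is missing {entry}"
```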

### Data Structure
The test dataset is expected to have the following structure:
```bash
data/scannet_test/
└── {scene_id}/
    ├── depth/                    # Depth maps
    ├── images/                   # RGB images
    ├── labels/                   # Semantic labels
    ├── selected_seqs_test.json   # Test sequence parameters
    └── selected_seqs_train.json  # Train sequence parameters
```

### Test Set Selection Criteria
The test set was curated as follows:
1. **Initial Selection**: The last 50 scenes from the alphabetically sorted list of original ScanNet scans were selected.
2. **Frame Sampling**: 30 frames were sampled at regular intervals from each selected scene (see the sketch after this list).
3. **Pose Validation**: Each frame's pose was checked for NaN values, which occur in some scans of the original ScanNet release; scenes containing frames with invalid poses were excluded (7 scenes removed).
4. **Compatibility Check**: Scenes that caused errors during testing with NeRF-DFF and Feature-3DGS were filtered out as well.
5. **Final Set**: This process resulted in a final test set of 40 scenes.
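
A minimal sketch of steps 2 and 3, assuming ScanNet's standard per-frame pose files (`pose/{frame_id}.txt`, each a 4x4 camera-to-world matrix); the helper names are illustrative, not the repository's actual code:

```python
import numpy as np
from pathlib import Path

NUM_FRAMES = 30  # frames sampled per scene, per the criteria above

def sample_frames(scene_dir: Path, num_frames: int = NUM_FRAMES) -> list[Path]:
    """Sample pose files at regular intervals across the scene (step 2)."""
    poses = sorted((scene_dir / "pose").glob("*.txt"))
    idx = np.linspace(0, len(poses) - 1, num_frames).astype(int)
    return [poses[i] for i in idx]

def scene_has_valid_poses(scene_dir: Path) -> bool:
    """Reject scenes whose sampled frames contain NaN poses (step 3)."""
    for pose_file in sample_frames(scene_dir):
        pose = np.loadtxt(pose_file)  # 4x4 camera-to-world matrix
        if np.isnan(pose).any():
            return False
    return True
```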

### Test Category Label Selection
Instead of the 20 categories of the ScanNetV2 benchmark, we use a predefined set of common indoor categories: `['wall', 'floor', 'ceiling', 'chair', 'table', 'sofa', 'bed', 'other']`.
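
The exact correspondence lives in the dataset's `map_func` (see the workflow below); the dictionary here only sketches the idea, with the source label names and merges assumed rather than taken from the repository:

```python
# Hypothetical mapping from ScanNetV2 category names to the 8 test categories.
# The real correspondence is defined by `map_func` in the repository; the
# entries below are assumptions for illustration only.
TEST_CATEGORIES = ['wall', 'floor', 'ceiling', 'chair', 'table', 'sofa', 'bed', 'other']

SCANNET_TO_TEST = {
    'wall': 'wall',
    'floor': 'floor',
    'ceiling': 'ceiling',
    'chair': 'chair',
    'table': 'table',
    'desk': 'table',  # merged into a broader category (assumed)
    'sofa': 'sofa',
    'bed': 'bed',
    # every remaining ScanNet category falls back to 'other'
}

def map_label(name: str) -> int:
    """Map a ScanNet category name to an index into TEST_CATEGORIES."""
    return TEST_CATEGORIES.index(SCANNET_TO_TEST.get(name, 'other'))
```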

### Test Data Loading and Evaluation Workflow

The testing process relies on the `TestDataset` class in `large_spatial_model/datasets/testdata.py`, initialized with `split='test'` and `is_training=False`.

1. **View Selection**: The dataset selects test views for each scene based on the `llff_hold` and `test_ids` parameters. Typically, frames whose index modulo `llff_hold` falls within `test_ids` are chosen as the core test frames (`target_view`).
2. **View Grouping**: Each selected `target_view` is grouped with its immediate predecessor (`source_view1`) and successor (`source_view2`), forming a tuple of view indices: `(source_view2, target_view, source_view1)`. The test set comprises a series of these `(scene ID, view indices tuple)` pairs (see the sketch after this list).
3. **Data Loading**: When iterating through the dataset during testing:
   * The script loads RGB images (`.jpg`), depth maps (`.png`), semantic label maps (`.png`), and camera parameters (intrinsics and extrinsics from `.npz`) for each view index in the tuple.
   * Preprocessing includes validity checks (e.g., for NaN values in camera poses) and image cropping/resizing.
   * The `map_func` maps the original ScanNet semantic labels to the simplified category set defined above.
   * This yields a dictionary for each view containing the image, depth, pose, intrinsics, processed label map, etc.
4. **Model Inference and Evaluation**:
   * The model takes the `source_view1` and `source_view2` data as input and infers scene parameters (e.g., Gaussian parameters for 3D Gaussian Splatting).
   * Using these inferred parameters and the `target_view`'s camera pose and intrinsics, the model renders a semantic label map for the `target_view`.
   * The rendered semantic map is then compared against the `target_view`'s ground-truth semantic label map from the original ScanNet dataset to evaluate performance (a metric sketch follows the view-grouping sketch below).
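
A minimal sketch of steps 1 and 2, assuming `llff_hold` and `test_ids` behave exactly as described above; the function is illustrative, not the repository's actual implementation:

```python
def build_test_tuples(num_frames: int, llff_hold: int, test_ids: list[int]):
    """Pair each target view with its neighbours, per steps 1-2 above."""
    tuples = []
    for target in range(num_frames):
        # Step 1: a frame is a target view when its index modulo
        # `llff_hold` falls within `test_ids`.
        if target % llff_hold not in test_ids:
            continue
        # Step 2: group with the immediate predecessor and successor,
        # skipping targets at the sequence boundaries.
        if target == 0 or target == num_frames - 1:
            continue
        source_view1, source_view2 = target - 1, target + 1
        tuples.append((source_view2, target, source_view1))
    return tuples

# e.g. with 30 frames, llff_hold=8 and test_ids=[0],
# frames 8, 16, and 24 become target views:
print(build_test_tuples(30, 8, [0]))  # [(9, 8, 7), (17, 16, 15), (25, 24, 23)]
```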
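
This document does not name the exact comparison metric; mean intersection-over-union (mIoU) over the eight categories is a common choice for semantic maps, sketched here under that assumption:

```python
import numpy as np

def miou(pred: np.ndarray, gt: np.ndarray, num_classes: int = 8) -> float:
    """Mean IoU between a rendered label map and the ground truth.

    `pred` and `gt` are integer label maps of equal shape with values in
    [0, num_classes). The metric itself is an assumption; the repository
    may evaluate differently.
    """
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:  # skip categories absent from both maps
            ious.append(inter / union)
    return float(np.mean(ious))
```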