viamrobotics · nathan-contino · May 27, 2025
diff --git a/docs/data-ai/_index.md b/docs/data-ai/_index.md
@@ -25,32 +25,47 @@ You can also monitor your machines through teleop, power your application logic,
 
 <div class="hoveraction">
 
-{{< how-to-expand "Capture data" "3" "BEGINNER-FRIENDLY" "" "data-platform-capture" >}}
+{{< how-to-expand "Quickstart" "" "" "" "" >}}
 {{< cards >}}
+{{% card link="/data-ai/quickstart/set-up-viam/" noimage="true" %}}
+{{% card link="/data-ai/quickstart/capture/" noimage="true" %}}
+{{% card link="/data-ai/quickstart/infer/" noimage="true" %}}
+{{< /cards >}}
+{{< /how-to-expand >}}
+
+{{< how-to-expand "Capture data" "" "" "" "data-platform-capture" >}}
+{{< cards >}}
+{{% card link="/data-ai/capture-data/" noimage="true" %}}
 {{% card link="/data-ai/capture-data/capture-sync/" noimage="true" %}}
-{{% card link="/data-ai/capture-data/filter-before-sync/" noimage="true" %}}
 {{% card link="/data-ai/capture-data/conditional-sync/" noimage="true" %}}
+{{% card link="/data-ai/capture-data/filter-before-sync/" noimage="true" %}}
 {{< /cards >}}
 {{< /how-to-expand >}}
 
-{{< how-to-expand "Work with data" "4" "BEGINNER-FRIENDLY" "" "data-platform-work" >}}
+{{< how-to-expand "Work with data" "" "" "" "data-platform-work" >}}
 {{< cards >}}
 {{% card link="/data-ai/data/query/" noimage="true" %}}
 {{% card link="/data-ai/data/visualize/" noimage="true" %}}
-{{% card link="/data-ai/data/advanced/alert-data/" noimage="true" %}}
+{{% card link="/data-ai/data/alert-data/" noimage="true" %}}
 {{% card link="/data-ai/data/export/" noimage="true" %}}
 {{< /cards >}}
 {{< /how-to-expand >}}
 
-{{< how-to-expand "Leverage AI" "8" "INTERMEDIATE" "" "data-platform-ai" >}}
+{{< how-to-expand "Run inference" "" "" "" "data-platform-ai" >}}
+{{< cards >}}
+{{% card link="/data-ai/inference/" noimage="true" %}}
+{{% card link="/data-ai/inference/alert/" noimage="true" %}}
+{{% card link="/data-ai/inference/act/" noimage="true" %}}
+{{< /cards >}}
+{{< /how-to-expand >}}
+
+{{< how-to-expand "Train an ML model" "" "" "" "data-platform-ai" >}}
 {{< cards >}}
 {{% card link="/data-ai/ai/create-dataset/" noimage="true" %}}
 {{% card link="/data-ai/ai/train-tflite/" noimage="true" %}}
 {{% card link="/data-ai/ai/train/" noimage="true" %}}
 {{% card link="/data-ai/ai/deploy/" noimage="true" %}}
-{{% card link="/data-ai/ai/run-inference/" noimage="true" %}}
-{{% card link="/data-ai/ai/alert/" noimage="true" %}}
-{{% card link="/data-ai/ai/act/" noimage="true" %}}
+{{% card link="/data-ai/ai/upload-external-data/" noimage="true" %}}
 {{< /cards >}}
 {{< /how-to-expand >}}
 

diff --git a/docs/data-ai/ai/_index.md b/docs/data-ai/ai/_index.md
@@ -1,6 +1,6 @@
 ---
-linkTitle: "Leverage AI"
-title: "Leverage AI"
+linkTitle: "Train an ML model"
+title: "Train an ML model"
 weight: 300
 layout: "empty"
 type: "docs"

diff --git a/docs/data-ai/ai/create-dataset.md b/docs/data-ai/ai/create-dataset.md
@@ -1,6 +1,6 @@
 ---
-linkTitle: "Create a dataset"
-title: "Create a dataset"
+linkTitle: "Create a training dataset"
+title: "Create a training dataset"
 weight: 10
 layout: "docs"
 type: "docs"

diff --git a/docs/data-ai/ai/deploy.md b/docs/data-ai/ai/deploy.md
@@ -83,7 +83,7 @@ The service works with models trained inside and outside the Viam app:
 On its own the ML model service only runs the model.
 After deploying your model, you need to configure an additional service to use the deployed model.
 For example, you can configure an [`mlmodel` vision service](/operate/reference/services/vision/) to visualize the inferences your model makes.
-Follow our docs to [run inference](/data-ai/ai/run-inference/) to add an `mlmodel` vision service and see inferences.
+Follow our docs to [run inference](/data-ai/inference/) to add an `mlmodel` vision service and see inferences.
 
 For other use cases, consider [creating custom functionality with a module](/operate/get-started/other-hardware/).
 

diff --git a/docs/data-ai/ai/train-tflite.md b/docs/data-ai/ai/train-tflite.md
@@ -154,7 +154,7 @@ To capture images of edge cases and re-train your model using those images, comp
 ## Next steps
 
 Now your machine can make inferences about its environment.
-The next step is to [deploy](/data-ai/ai/deploy/) the ML model and then [act](/data-ai/ai/act/) or [alert](/data-ai/ai/alert/) based on these inferences.
+The next step is to [deploy](/data-ai/ai/deploy/) the ML model and then [act](/data-ai/inference/act/) or [alert](/data-ai/inference/alert/) based on these inferences.
 
 See the following tutorials for examples of using machine learning models to make your machine do things based on its inferences about its environment:
 

diff --git a/docs/data-ai/ai/train.md b/docs/data-ai/ai/train.md
@@ -846,4 +846,4 @@ You can also view your training jobs' logs with the [`viam train logs`](/dev/too
 {{< /table >}}
 
 To use your new model with machines, you must [deploy it](/data-ai/ai/deploy/) with the appropriate ML model service.
-Then you can use another service, such as the vision service, to [run inference](/data-ai/ai/run-inference/).
+Then you can use another service, such as the vision service, to [run inference](/data-ai/inference/).
diff --git a/...ta-ai/ai/advanced/upload-external-data.md → docs/data-ai/ai/upload-external-data.md b/...ta-ai/ai/advanced/upload-external-data.md → docs/data-ai/ai/upload-external-data.md
@@ -11,9 +11,11 @@ aliases:
   - /data/upload/
   - /services/data/upload/
   - /how-tos/upload-data/
+  - /data-ai/ai/advanced/upload-external-data/
+  - /data-ai/ai/advanced/
 date: "2024-12-04"
 description: "Upload data to the Viam app from your local computer or mobile device using the data client API, Viam CLI, or Viam mobile app."
-prev: "/data-ai/ai/act/"
+prev: "/data-ai/inference/act/"
 ---
 
 When you configure the data management service, Viam automatically uploads data from the default directory `~/.viam/capture` and any directory you configured.

diff --git a/docs/data-ai/capture-data/_index.md b/docs/data-ai/capture-data/_index.md
@@ -2,10 +2,117 @@
 linkTitle: "Capture data"
 title: "Capture data"
 weight: 100
-layout: "empty"
+layout: "docs"
 type: "docs"
-empty_node: true
 open_on_desktop: true
 header_only: true
-noedit: true
+aliases:
+  - /data-ai/capture-data/advanced/how-sync-works/
+  - /data-ai/capture-data/advanced/
+  - /data-ai/capture-data/how-sync-works/
 ---
+
+`viam-server` and `viam-micro-server` handle data sync in distinct ways:
+
+{{< tabs >}}
+{{% tab name="viam-server" %}}
+
+The data is captured locally on the machine's storage and, by default, stored in the `~/.viam/capture` directory.
+For Linux root or sudo users, the `~/.viam/capture` directory resolves to `/root/.viam/capture`.
+
+{{% expand "Can't find the directory data is stored in? Click here." %}}
+
+The relative path for the data capture directory depends on where `viam-server` is run from, as well as the operating system of the machine.
+
+To find the `$HOME` value, check your machine's startup logs, which list it among the environment variables:
+
+```sh
+2025-01-15T14:27:26.073Z    INFO    rdk    server/entrypoint.go:77    Starting viam-server with following environment variables    {"HOME":"/home/johnsmith"}
+```
+
+{{% /expand %}}
+
+If a machine restarts for any reason, data capture automatically resumes and any data already stored but not yet synced is synced.
+
+The service can capture data from multiple resources at the same or different frequencies.
+The service does not impose a lower or upper limit on the frequency of data collection.
+However, in practice, your hardware may impose limits on the frequency of data collection.
+Avoid configuring data capture at rates higher than your hardware can handle, as this can degrade performance.
+
+Data capture is frequently used with cloud sync.
+You can start and stop capture and sync independently.
+You can also enable cloud sync without data capture; the service then syncs data in the capture directory and in any additional sync paths configured in the `viam-server` config.
+If you place data like images or files in the `~/.viam/capture` directory or in another directory set up for sync with the data manager, for example with the `"additional_sync_paths"` config attribute, the service syncs this data to the cloud.
+
+{{% /tab %}}
+{{% tab name="viam-micro-server" %}}
+
+The data is captured in the ESP32's flash memory until it is uploaded to the Viam Cloud.
+
+If the machine restarts before all data is synced, all unsynced data captured since the last sync point is lost.
+
+The service can capture data from multiple resources at the same or different frequencies.
+The service does not impose a lower or upper limit on the frequency of data collection.
+However, in practice, high-frequency data collection (> 100 Hz) requires special considerations on the ESP32.
+
+{{% /tab %}}
+{{< /tabs >}}
+
+## Security
+
+The data management service uses {{< glossary_tooltip term_id="grpc" text="gRPC" >}} calls to send and receive data, so your data is encrypted while in flight.
+When data is stored in the cloud, it is encrypted at rest by the cloud storage provider.
+
+## Data integrity
+
+Viam's data management service is designed to safeguard against data loss, data duplication and otherwise compromised data.
+
+If the internet becomes unavailable or the machine needs to restart during the sync process, the sync is interrupted.
+If the sync process is interrupted, the service retries the upload at exponentially increasing intervals until the interval between attempts reaches one hour, after which it retries every hour.
+When the connection is restored and sync resumes, the service continues sync where it left off without duplicating data.
+If the interruption happens mid-file, sync resumes from the beginning of that file.
+
+To avoid syncing files that are still being written to, the data management service only syncs arbitrary files that haven't been modified in the previous 10 seconds.
+This default can be changed with the [`file_last_modified_millis` config attribute](/data-ai/capture-data/capture-sync/).
+
+## Automatic data deletion
+
+If cloud sync is enabled, the data management service deletes captured data from the disk once it has successfully synced to the cloud.
+
+{{< alert title="Warning" color="warning" >}}
+
+If your machine is offline and can't sync, and its disk fills up beyond a certain threshold, the data management service deletes captured data to free up space and keep the machine working.
+
+{{< /alert >}}
+
+The data management service will also automatically delete local data in the event your machine's local storage fills up.
+Local data is automatically deleted when _all_ of the following conditions are met:
+
+- Data capture is enabled on the data management service
+- Local disk usage percentage is greater than or equal to 90%
+- The Viam capture directory is at least 50% of the current local disk usage
+
+If local disk usage is greater than or equal to 90%, but the Viam capture directory is not at least 50% of that usage, the service emits a warning log message and deletes nothing.
+
+Automatic file deletion only applies to files in the specified Viam capture directory, which is set to `~/.viam/capture` by default.
+Data outside of this directory is not touched by automatic data deletion.
+
+If your machine captures a large amount of data, or frequently goes offline for long periods of time while capturing data, consider moving the Viam capture directory to a larger, dedicated storage device on your machine if available.
+You can change the capture directory using the `capture_dir` attribute.
+
+You can also control how local data is deleted if your machine's local storage becomes full, using the `delete_every_nth_when_disk_full` attribute.
+
+## Storage
+
+Data that is successfully synced to the cloud is automatically deleted from local storage.
+
+When a machine loses its internet connection, it cannot resume cloud sync until it can reach the Viam Cloud again.
+
+{{<imgproc src="/services/data/data_management.png" resize="x1100" declaredimensions=true alt="Data is captured on the machine, uploaded to the cloud, and then deleted off local storage." class="imgzoom" >}}
+
+To ensure that the machine can store all data captured while it has no connection, you need to provide enough local data storage.
+
+If your machine stays offline and its disk fills up, the data management service deletes captured data to free up space, as described in [Automatic data deletion](#automatic-data-deletion).
+
+Data capture supports capturing tabular data directly to MongoDB in addition to capturing to disk.
+For more information, see [Capture directly to MongoDB](/data-ai/reference/advanced-data-capture-sync/#capture-directly-to-your-own-mongodb-cluster).
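
The sync and deletion rules added in this file can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not the data management service's actual implementation. The function names are hypothetical, and the initial retry interval and growth factor are assumptions, since the page states only the one-hour cap. The 10-second, 90%, and 50% values come from the text above.

```python
def retry_intervals(initial_s: float = 1.0, factor: float = 2.0,
                    cap_s: float = 3600.0, attempts: int = 15) -> list[float]:
    """Wait times that grow exponentially until they reach the one-hour cap.

    initial_s and factor are assumed values; only cap_s is documented.
    """
    waits = []
    wait = initial_s
    for _ in range(attempts):
        waits.append(min(wait, cap_s))
        wait = min(wait * factor, cap_s)
    return waits


def is_stable_for_sync(last_modified_ms: int, now_ms: int,
                       file_last_modified_millis: int = 10_000) -> bool:
    """True if an arbitrary file has been unmodified for the configured window."""
    return now_ms - last_modified_ms >= file_last_modified_millis


def should_delete_local_data(capture_enabled: bool, disk_used_bytes: int,
                             disk_total_bytes: int, capture_dir_bytes: int) -> bool:
    """True only when all three documented deletion conditions hold."""
    if disk_total_bytes <= 0 or disk_used_bytes <= 0:
        return False
    disk_usage = disk_used_bytes / disk_total_bytes      # fraction of disk in use
    capture_share = capture_dir_bytes / disk_used_bytes  # capture dir's share of that usage
    return capture_enabled and disk_usage >= 0.90 and capture_share >= 0.50
```

For example, on a disk that is 95% full, a capture directory holding 40% of the used space would trigger only the warning case, while one holding 55% would satisfy all three conditions.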
diff --git a/docs/data-ai/capture-data/advanced/_index.md b/docs/data-ai/capture-data/advanced/_index.md
diff --git a/docs/data-ai/capture-data/advanced/how-sync-works.md b/docs/data-ai/capture-data/advanced/how-sync-works.md
diff --git a/docs/data-ai/capture-data/capture-other-sources.md b/docs/data-ai/capture-data/capture-other-sources.md