Create Quickstart section and reduce restructuring to the minimum needed to fit new content #4341

Open
wants to merge 9 commits into
base: main
31 changes: 23 additions & 8 deletions docs/data-ai/_index.md
Original file line number Diff line number Diff line change
@@ -25,32 +25,47 @@ You can also monitor your machines through teleop, power your application logic,

<div class="hoveraction">

{{< how-to-expand "Capture data" "3" "BEGINNER-FRIENDLY" "" "data-platform-capture" >}}
{{< how-to-expand "Quickstart" "" "" "" "" >}}
{{< cards >}}
{{% card link="/data-ai/quickstart/set-up-viam/" noimage="true" %}}
{{% card link="/data-ai/quickstart/capture" noimage="true" %}}
{{% card link="/data-ai/quickstart/infer/" noimage="true" %}}
{{< /cards >}}
{{< /how-to-expand >}}

{{< how-to-expand "Capture data" "" "" "" "data-platform-capture" >}}
{{< cards >}}
{{% card link="/data-ai/capture-data/" noimage="true" %}}
{{% card link="/data-ai/capture-data/capture-sync/" noimage="true" %}}
{{% card link="/data-ai/capture-data/filter-before-sync/" noimage="true" %}}
{{% card link="/data-ai/capture-data/conditional-sync/" noimage="true" %}}
{{% card link="/data-ai/capture-data/filter-before-sync/" noimage="true" %}}
{{< /cards >}}
{{< /how-to-expand >}}

{{< how-to-expand "Work with data" "4" "BEGINNER-FRIENDLY" "" "data-platform-work" >}}
{{< how-to-expand "Work with data" "" "" "" "data-platform-work" >}}
{{< cards >}}
{{% card link="/data-ai/data/query/" noimage="true" %}}
{{% card link="/data-ai/data/visualize/" noimage="true" %}}
{{% card link="/data-ai/data/advanced/alert-data/" noimage="true" %}}
{{% card link="/data-ai/data/alert-data/" noimage="true" %}}
{{% card link="/data-ai/data/export/" noimage="true" %}}
{{< /cards >}}
{{< /how-to-expand >}}

{{< how-to-expand "Leverage AI" "8" "INTERMEDIATE" "" "data-platform-ai" >}}
{{< how-to-expand "Run inference" "" "" "" "data-platform-ai" >}}
{{< cards >}}
{{% card link="/data-ai/inference/" noimage="true" %}}
{{% card link="/data-ai/inference/alert/" noimage="true" %}}
{{% card link="/data-ai/inference/act/" noimage="true" %}}
{{< /cards >}}
{{< /how-to-expand >}}

{{< how-to-expand "Train an ML model" "" "" "" "data-platform-ai" >}}
{{< cards >}}
{{% card link="/data-ai/ai/create-dataset/" noimage="true" %}}
{{% card link="/data-ai/ai/train-tflite/" noimage="true" %}}
{{% card link="/data-ai/ai/train/" noimage="true" %}}
{{% card link="/data-ai/ai/deploy/" noimage="true" %}}
{{% card link="/data-ai/ai/run-inference/" noimage="true" %}}
{{% card link="/data-ai/ai/alert/" noimage="true" %}}
{{% card link="/data-ai/ai/act/" noimage="true" %}}
{{% card link="/data-ai/ai/upload-external-data/" noimage="true" %}}
{{< /cards >}}
{{< /how-to-expand >}}

4 changes: 2 additions & 2 deletions docs/data-ai/ai/_index.md
@@ -1,6 +1,6 @@
---
linkTitle: "Leverage AI"
title: "Leverage AI"
linkTitle: "Train an ML model"
title: "Train an ML model"
weight: 300
layout: "empty"
type: "docs"
4 changes: 2 additions & 2 deletions docs/data-ai/ai/create-dataset.md
@@ -1,6 +1,6 @@
---
linkTitle: "Create a dataset"
title: "Create a dataset"
linkTitle: "Create a training dataset"
title: "Create a training dataset"
weight: 10
layout: "docs"
type: "docs"
2 changes: 1 addition & 1 deletion docs/data-ai/ai/deploy.md
@@ -83,7 +83,7 @@ The service works with models trained inside and outside the Viam app:
On its own the ML model service only runs the model.
After deploying your model, you need to configure an additional service to use the deployed model.
For example, you can configure an [`mlmodel` vision service](/operate/reference/services/vision/) to visualize the inferences your model makes.
Follow our docs to [run inference](/data-ai/ai/run-inference/) to add an `mlmodel` vision service and see inferences.
Follow our docs to [run inference](/data-ai/inference/) to add an `mlmodel` vision service and see inferences.

For other use cases, consider [creating custom functionality with a module](/operate/get-started/other-hardware/).

2 changes: 1 addition & 1 deletion docs/data-ai/ai/train-tflite.md
@@ -154,7 +154,7 @@ To capture images of edge cases and re-train your model using those images, comp
## Next steps

Now your machine can make inferences about its environment.
The next step is to [deploy](/data-ai/ai/deploy/) the ML model and then [act](/data-ai/ai/act/) or [alert](/data-ai/ai/alert/) based on these inferences.
The next step is to [deploy](/data-ai/ai/deploy/) the ML model and then [act](/data-ai/inference/act/) or [alert](/data-ai/inference/alert/) based on these inferences.

See the following tutorials for examples of using machine learning models to make your machine do things based on its inferences about its environment:

2 changes: 1 addition & 1 deletion docs/data-ai/ai/train.md
@@ -846,4 +846,4 @@ You can also view your training jobs' logs with the [`viam train logs`](/dev/too
{{< /table >}}

To use your new model with machines, you must [deploy it](/data-ai/ai/deploy/) with the appropriate ML model service.
Then you can use another service, such as the vision service, to [run inference](/data-ai/ai/run-inference/).
Then you can use another service, such as the vision service, to [run inference](/data-ai/inference/).
Original file line number Diff line number Diff line change
@@ -11,9 +11,11 @@ aliases:
- /data/upload/
- /services/data/upload/
- /how-tos/upload-data/
- /data-ai/ai/advanced/upload-external-data/
- /data-ai/ai/advanced/
date: "2024-12-04"
description: "Upload data to the Viam app from your local computer or mobile device using the data client API, Viam CLI, or Viam mobile app."
prev: "/data-ai/ai/act/"
prev: "/data-ai/inference/act/"
---

When you configure the data management service, Viam automatically uploads data from the default directory `~/.viam/capture` and any directory you configured.
113 changes: 110 additions & 3 deletions docs/data-ai/capture-data/_index.md
@@ -2,10 +2,117 @@
linkTitle: "Capture data"
title: "Capture data"
weight: 100
layout: "empty"
layout: "docs"
type: "docs"
empty_node: true
Collaborator

I think we actually do want to keep these top-level pages empty. There's been a lot of uncertainty about whether people get confused when some of these pages have content and others don't. In this case: (1) we have a child page with almost the same name, so the content should either be merged into that page or moved into reference; (2) with the current design, the top-level pages don't feel like they should have content themselves, and no other top-level page currently does; (3) this isn't the first thing you'd want to tell someone. It's information for people who want to know more, so it should come after they get the basics.

open_on_desktop: true
header_only: true
noedit: true
aliases:
- /data-ai/capture-data/advanced/how-sync-works/
- /data-ai/capture-data/advanced/
- /data-ai/capture-data/how-sync-works/
---

`viam-server` and `viam-micro-server` handle data sync in distinct ways:

{{< tabs >}}
{{% tab name="viam-server" %}}

The data is captured locally on the machine's storage and, by default, stored in the `~/.viam/capture` directory.
For Linux root or sudo users, the `~/.viam/capture` directory resolves to `/root/.viam/capture`.

{{% expand "Can't find the directory data is stored in? Click here." %}}

The relative path for the data capture directory depends on where `viam-server` is run from, as well as the operating system of the machine.

To find the `$HOME` value, check your machine's startup logs, which list it among the environment variables:

```sh
2025-01-15T14:27:26.073Z INFO rdk server/entrypoint.go:77 Starting viam-server with following environment variables {"HOME":"/home/johnsmith"}
```

{{% /expand%}}
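
As a rough illustration of reading that startup log entry, the following Python sketch pulls the `HOME` value out of a log line and prints the resulting default capture directory. It assumes the environment variables appear as a JSON object at the end of the line, as in the sample above; the helper name `home_from_log_line` is hypothetical and not part of any Viam tooling.

```python
import json
from typing import Optional


def home_from_log_line(line: str) -> Optional[str]:
    """Return the HOME value from a startup log line, or None if absent."""
    # Assumption: the environment variables are logged as a trailing JSON object.
    start = line.find("{")
    if start == -1:
        return None
    try:
        env = json.loads(line[start:])
    except json.JSONDecodeError:
        return None
    if not isinstance(env, dict):
        return None
    return env.get("HOME")


# Sample line copied from the log excerpt above.
line = (
    "2025-01-15T14:27:26.073Z INFO rdk server/entrypoint.go:77 "
    'Starting viam-server with following environment variables {"HOME":"/home/johnsmith"}'
)
home = home_from_log_line(line)
if home is not None:
    # The default capture directory is ~/.viam/capture.
    print(f"{home}/.viam/capture")
```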

If a machine restarts for any reason, data capture automatically resumes and any data already stored but not yet synced is synced.

The service can capture data from multiple resources at the same or different frequencies.
The service does not impose a lower or upper limit on the frequency of data collection.
However, in practice, your hardware may impose limits on the frequency of data collection.
Avoid configuring data capture at higher rates than your hardware can handle, as this could lead to performance degradation.

Data capture is frequently used with cloud sync.
You can start and stop capture and sync independently.
You can also enable cloud sync without data capture; the service then syncs data in the capture directory and in any additional sync paths configured in the `viam-server` config.
If you place data such as images or files in the `~/.viam/capture` directory or in another directory set up for sync with the data manager, for example with the `"additional_sync_paths"` config attribute, the data manager syncs that data to the cloud.

{{% /tab %}}
{{% tab name="viam-micro-server" %}}

The data is captured in the ESP32's flash memory until it is uploaded to the Viam Cloud.

If the machine restarts before all data is synced, all unsynced data captured since the last sync point is lost.

The service can capture data from multiple resources at the same or different frequencies.
The service does not impose a lower or upper limit on the frequency of data collection.
However, in practice, high frequency data collection (> 100Hz) requires special considerations on the ESP32.

{{% /tab %}}
{{< /tabs >}}

## Security

The data management service uses {{< glossary_tooltip term_id="grpc" text="gRPC" >}} calls to send and receive data, so your data is encrypted while in flight.
When data is stored in the cloud, it is encrypted at rest by the cloud storage provider.

## Data integrity

Viam's data management service is designed to safeguard against data loss, data duplication and otherwise compromised data.

If the internet becomes unavailable or the machine needs to restart during the sync process, the sync is interrupted.
If the sync process is interrupted, the service retries uploading the data at exponentially increasing intervals until the interval between attempts reaches one hour, after which the service retries the sync every hour.
When the connection is restored and sync resumes, the service continues sync where it left off without duplicating data.
If the interruption happens mid-file, sync resumes from the beginning of that file.
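
The retry schedule described above can be sketched in Python as a capped exponential backoff. This is only an illustration: the one-hour cap comes from the text, but the starting interval (`initial_s`) and growth factor (`factor`) are assumed values, since the text does not state them.

```python
from itertools import islice
from typing import Iterator

MAX_INTERVAL_S = 60 * 60  # from the text: retries settle at one per hour


def retry_intervals(initial_s: float = 1.0, factor: float = 2.0) -> Iterator[float]:
    """Yield wait times that grow exponentially until capped at one hour.

    initial_s and factor are hypothetical defaults chosen for illustration.
    """
    interval = initial_s
    while True:
        yield min(interval, MAX_INTERVAL_S)
        interval = min(interval * factor, MAX_INTERVAL_S)


# Show the first few waits; once the cap is reached every wait is 3600 seconds.
print(list(islice(retry_intervals(), 14)))
```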

To avoid syncing files that are still being written to, the data management service only syncs arbitrary files that haven't been modified in the previous 10 seconds.
This default can be changed with the [`file_last_modified_millis` config attribute](/data-ai/capture-data/capture-sync/).
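
A minimal Python sketch of that "not modified recently" check follows. It is not the service's implementation; it only assumes that the threshold is compared against the file's last-modified time, and the function name `is_ready_to_sync` is hypothetical.

```python
import os
import tempfile
import time
from typing import Optional

# 10 seconds in milliseconds, matching the default described in the text.
DEFAULT_FILE_LAST_MODIFIED_MILLIS = 10_000


def is_ready_to_sync(
    path: str,
    threshold_millis: int = DEFAULT_FILE_LAST_MODIFIED_MILLIS,
    now: Optional[float] = None,
) -> bool:
    """Return True if the file has not been modified within the threshold."""
    now = time.time() if now is None else now
    age_millis = (now - os.path.getmtime(path)) * 1000
    return age_millis >= threshold_millis


# A file that was just created is still inside the waiting window.
with tempfile.NamedTemporaryFile() as f:
    print(is_ready_to_sync(f.name))
```

Passing `now` explicitly makes the check easy to test without sleeping.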

## Automatic data deletion

If cloud sync is enabled, the data management service deletes captured data from the disk once it has successfully synced to the cloud.

{{< alert title="Warning" color="warning" >}}

If your machine is offline and can't sync, and its disk fills up beyond a certain threshold, the data management service will delete captured data to free up additional space and maintain a working machine.

{{< /alert >}}

The data management service will also automatically delete local data in the event your machine's local storage fills up.
Local data is automatically deleted when _all_ of the following conditions are met:

- Data capture is enabled on the data management service
- Local disk usage percentage is greater than or equal to 90%
- The Viam capture directory is at least 50% of the current local disk usage

If local disk usage is greater than or equal to 90%, but the Viam capture directory is not at least 50% of that usage, a warning log message will be emitted instead and no action will be taken.
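
The conditions above can be expressed as a small decision function. The Python sketch below uses the 90% and 50% thresholds from the text; the function name, its parameters, and the choice to return `"none"` when data capture is disabled are assumptions made for illustration.

```python
DISK_USAGE_THRESHOLD = 0.90         # local disk usage must be >= 90%
CAPTURE_DIR_SHARE_THRESHOLD = 0.50  # capture dir must be >= 50% of used space


def deletion_decision(
    capture_enabled: bool,
    disk_total_bytes: int,
    disk_used_bytes: int,
    capture_dir_bytes: int,
) -> str:
    """Return "delete", "warn", or "none" for the given disk state."""
    # Assumption: nothing happens when capture is disabled or sizes are invalid.
    if not capture_enabled or disk_total_bytes <= 0:
        return "none"
    if disk_used_bytes / disk_total_bytes < DISK_USAGE_THRESHOLD:
        return "none"
    if capture_dir_bytes >= CAPTURE_DIR_SHARE_THRESHOLD * disk_used_bytes:
        return "delete"
    # Disk is nearly full, but the capture directory is not the main cause.
    return "warn"


print(deletion_decision(True, 100, 95, 60))
print(deletion_decision(True, 100, 95, 10))
```

The first call meets all three conditions; the second has a nearly full disk with a small capture directory, which corresponds to the warning-only case.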

Automatic file deletion only applies to files in the specified Viam capture directory, which is set to `~/.viam/capture` by default.
Data outside of this directory is not touched by automatic data deletion.

If your machine captures a large amount of data, or frequently goes offline for long periods of time while capturing data, consider moving the Viam capture directory to a larger, dedicated storage device on your machine if available.
You can change the capture directory using the `capture_dir` attribute.

You can also control how local data is deleted if your machine's local storage becomes full, using the `delete_every_nth_when_disk_full` attribute.
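
To give a feel for what an "every Nth" deletion policy might look like, here is a hedged Python sketch. It assumes the attribute means that every Nth captured file is removed when the disk is full; the text above does not spell out the exact semantics, and the file ordering and the helper name `files_to_delete` are hypothetical.

```python
from typing import List, Sequence


def files_to_delete(files: Sequence[str], n: int) -> List[str]:
    """Pick every nth file from an ordered list of captured files.

    Assumption: files are ordered by capture time and the nth, 2nth, ...
    entries are the ones selected for deletion.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return list(files[n - 1 :: n])


captured = [f"capture_{i:02d}.dat" for i in range(1, 11)]
print(files_to_delete(captured, 5))
```

Under this assumption, a smaller `n` frees space faster at the cost of losing more captured data.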

## Storage

Data that is successfully synced to the cloud is automatically deleted from local storage.

When a machine loses its internet connection, it cannot resume cloud sync until it can reach the Viam Cloud again.

{{<imgproc src="/services/data/data_management.png" resize="x1100" declaredimensions=true alt="Data is captured on the machine, uploaded to the cloud, and then deleted off local storage." class="imgzoom" >}}

To ensure that the machine can store all data captured while it has no connection, you need to provide enough local data storage.

If your machine is offline and can't sync, and its disk fills up beyond a certain threshold, the data management service will delete captured data to free up additional space and maintain a working machine.

Data capture supports capturing tabular data directly to MongoDB in addition to capturing to disk.
For more information, see [Capture directly to MongoDB](/data-ai/reference/advanced-data-capture-sync/#capture-directly-to-your-own-mongodb-cluster).
8 changes: 0 additions & 8 deletions docs/data-ai/capture-data/advanced/_index.md

This file was deleted.

118 changes: 0 additions & 118 deletions docs/data-ai/capture-data/advanced/how-sync-works.md

This file was deleted.

12 changes: 0 additions & 12 deletions docs/data-ai/capture-data/capture-other-sources.md

This file was deleted.
