Skip to content

Improve Mlflow Tasks and Add Dockerfile for Local Development #52

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Apr 24, 2025

Conversation

diamonwiggins
Copy link
Member

  • Update tasks to use the status property to prevent task re-running
  • Update tasks to use the silent property to prevent printing task commands to stdout
  • Update task names for better grouping
    • Moved away from verb:noun as the verbs weren't really consistent across different tasks
  • Add Dockerfile that can be built and used for local development workflows to help consumers more quickly onboard without having to manage dependencies

@diamonwiggins diamonwiggins requested a review from a team April 23, 2025 23:03
fi

# Start container with host networking for kubectl port-forward compatibility
CONTAINER_ID=$(docker run --name mlflow-dev-container --network host -d \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be better to --rm to allow cleaning after self?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the command be abstracted, while docker is common so is podman it would be nice to allow the container system to be either one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be better to --rm to allow cleaning after self?

I ended up getting rid of --rm because i preferred to keep the container around and just shell back into it.

Copy link
Member Author

@diamonwiggins diamonwiggins Apr 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the command be abstracted, while docker is common so is podman it would be nice to allow the container system to be either one.

Looks like there is pretty good compatibility in podman with the docker CLI already. Would have to spend some additional time with podman myself to see if there's parity with what I have today or if I needed to actually abstract both docker and podman commands in this CLI .

https://podman-desktop.io/docs/migrating-from-docker/managing-docker-compatibility#:~:text=Use%20all%20Docker%20tools%2C%20including,engine%20to%20start%20a%20container.

Copy link
Member

@banjoh banjoh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At a future PR, I think we should move the containerisation tasks to a shared file and allow passing variables if need be. This will allow DRYing changes such as #51

# Custom prompt
export PS1="\[\033[01;32m\]mlflow-dev\[\033[00m\]:\[\033[01;34m\]\w\[\033[00m\]$ "

# Useful aliases
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 we should have these aliases in all our <tools> shell shells such as EC

@diamonwiggins diamonwiggins merged commit 987e228 into main Apr 24, 2025
7 checks passed
@diamonwiggins diamonwiggins deleted the diamonwiggins/sc-122577/address-mlflow-feedback branch April 24, 2025 13:52
Copy link
Member

@chris-sanders chris-sanders left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments and questions


## Prerequisites

- Docker
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's decouple from requiring Docker itself and make it any compatible container orchestrator. Podman is quickly becoming the standard in linux distributions with Docker often being shipped as an external repo.


### Common Development Tasks

Once you're inside the development container (or on your local machine with all prerequisites installed), you can run these common tasks:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to avoid forcing people to operate in the development container, can we just execute the steps in the container instead, or is the shell a convenience thing?

The reason I say this is people are going to be editing and developing files and they will want their normal tools to do that. The container should allow them to run helm commands and go through the workflow but I want to avoid people 'staying' in the container and having to setup their development environment of preference.

```bash
# Installs with Replicated SDK disabled
task install:helm:local
task helm:install-local
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does 'local' mean here? I'm guessing you mean the local charts as in what's in this directory? I could see confusion with CMX not being local but as long as we're consistent and explan what we mean by local maybe it's fine.


# Optionally specify a custom values file
MLFLOW_VALUES=./my-values.yaml task install:helm:local
MLFLOW_VALUES=./my-values.yaml task helm:install-local
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this setup so that you can provide different values to different charts? As a generic workflow I'd like to see inputs and configs like this not include the application name. To make this a process people can use in their own applications we want to minimize the number of variables where the variable name is specific to an application. Ideally the workflow has all generic variable names and works for any application suite, and the contents of the variables is the only custom parts.

@@ -105,57 +176,56 @@ This workflow tests the full Replicated release and distribution process:
export REPLICATED_CHANNEL=channel_name

# Login to the registry
task login:registry
task registry:login
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this specifically for the SDK?
I'm not a fan of the workflow needing a registry login, although I could see if you have private charts in a private registry already you could need it. Maybe this needs to explicitly be 'replicated:registry:login' or something? Given you have to log-out as part of the current workflow (before packaging) do we even want the user to have a login option or will we just do that as part of the testing process?

Comment on lines +37 to +39
# Install Kind
&& curl -Lo /usr/local/bin/kind https://kind.sigs.k8s.io/dl/v0.20.0/kind-linux-amd64 \
&& chmod +x /usr/local/bin/kind \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we providing Kind rather than using CMX?

&& rm replicated_*_linux_amd64.tar.gz

# Install Python dependencies directly
RUN pip3 install --upgrade pip wheel setuptools \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

uv pip install is the new standard I'd use uv for everything these days with python.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants