(local) "File or directory not found", when input_data points to a single S3 file #10

amazon-q-developer · 2025-06-09T18:55:02Z

This pull request fixes how ModelTrainer's local mode handles a single file as input data. Previously, when input_data pointed to a single S3 file, local mode did not create the local directory or download the data correctly. The changes ensure that:

- The local directory is created with os.makedirs() before any files are downloaded
- Single-file inputs from S3 are handled correctly in local mode
- The single-file input scenario is covered by tests

This improves the robustness of local mode training, particularly when working with single file inputs rather than directories of data.
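
The directory-creation step described above can be illustrated with a minimal sketch. It is not the SDK's implementation: `download_single_file`, `fetch`, and `fake_fetch` are hypothetical names, and the fake fetcher writes a placeholder file instead of calling S3.

```python
import os
import tempfile


def download_single_file(s3_key, local_dir, fetch):
    """Create local_dir if needed, then fetch one object into it.

    `fetch` is a hypothetical callable (s3_key, destination_path) standing in
    for a real S3 download call.
    """
    # Writing into a directory that does not exist yet fails with a
    # "file or directory not found" style error, so create it first.
    os.makedirs(local_dir, exist_ok=True)
    destination = os.path.join(local_dir, os.path.basename(s3_key))
    fetch(s3_key, destination)
    return destination


def fake_fetch(s3_key, destination):
    # Stand-in for an S3 download: writes a small placeholder CSV file.
    with open(destination, "w") as handle:
        handle.write("col_a,col_b\n1,2\n")


# The nested target directory does not exist until download_single_file runs.
base_dir = tempfile.mkdtemp()
local_dir = os.path.join(base_dir, "input", "train")
result_path = download_single_file("data/single_file.csv", local_dir, fake_fetch)
```

Passing `exist_ok=True` keeps the call safe when the directory is already present, which matters if the same channel is prepared more than once.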

Most of the meaningful changes are in the test file tests/integ/sagemaker/modules/train/test_local_model_trainer.py, which adds test coverage for this scenario. The implementation change itself is small but necessary for local mode to work with single-file inputs.

- Adds test case for ModelTrainer with single S3 file input
- Ensures proper directory creation and cleanup in local mode
- Improves S3 file handling in ModelTrainer

amazon-q-developer · 2025-06-09T18:55:04Z

Resolves #9

amazon-q-developer · 2025-06-09T18:55:05Z

To provide feedback, navigate to the Files changed tab and leave comments on the proposed code changes. Choose Start review for each comment, and then choose Request changes, and I'll propose revised changes.

amazon-q-developer · 2025-06-09T18:55:06Z

⏳ I'm reviewing this pull request for security vulnerabilities and code quality issues. I'll provide an update when I'm done

amazon-q-developer · 2025-06-09T18:56:27Z

tests/integ/sagemaker/modules/train/test_local_model_trainer.py

+
+            # Upload the file to S3
+            s3_key = "data/single_file.csv"
+            session.upload_data(


Description: No error handling for potential S3 upload failures or file operations. Add try-except blocks for S3 operations and file handling to catch and handle potential errors.

Severity: Medium

The fix addresses the lack of error handling for S3 upload and file operations. It adds a try-except block around the S3 upload operation to catch and handle potential errors, and wraps the temporary file removal in a try-except block within a finally clause. This ensures that errors during S3 upload or file removal are caught and logged, improving the robustness of the code.

Suggested change

-            session.upload_data(
+            # Upload the file to S3
+            s3_key = "data/single_file.csv"
+            try:
+                session.upload_data(
+                    path=tmp_file_path,
+                    bucket=bucket,
+                    key_prefix=s3_key,
+                )
+            except Exception as e:
+                print(f"Error uploading file to S3: {str(e)}")
+                raise
+            finally:
+                try:
+                    os.unlink(tmp_file_path)  # Remove the temporary file
+                except OSError as e:
+                    print(f"Error removing temporary file: {str(e)}")
+            source_code = SourceCode(
+                source_dir=SOURCE_DIR,
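
The upload-then-cleanup pattern in this suggestion can be tried in isolation with the sketch below. It rests on assumptions: `upload_then_cleanup` and `failing_upload` are hypothetical helpers, the `upload` callable stands in for a call such as session.upload_data(...), and logging is used here in place of print.

```python
import logging
import os
import tempfile

logger = logging.getLogger(__name__)


def upload_then_cleanup(tmp_file_path, upload):
    """Upload a temporary file, then always try to delete it.

    `upload` is a hypothetical callable taking the file path; it stands in
    for the real S3 upload call.
    """
    try:
        return upload(tmp_file_path)
    except Exception:
        # Log with traceback, then re-raise so the caller still sees the failure.
        logger.exception("Error uploading file to S3")
        raise
    finally:
        # Runs whether or not the upload succeeded.
        try:
            os.unlink(tmp_file_path)
        except OSError:
            logger.warning("Error removing temporary file %s", tmp_file_path)


def failing_upload(path):
    # Simulates an upload error so the cleanup path can be observed.
    raise RuntimeError("simulated upload failure")


with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as tmp:
    tmp.write(b"a,b\n1,2\n")
    tmp_path = tmp.name

try:
    upload_then_cleanup(tmp_path, failing_upload)
    upload_failed = False
except RuntimeError:
    upload_failed = True
```

Because the cleanup sits in `finally`, the temporary file is removed even when the upload raises, and the original exception still propagates to the test.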

amazon-q-developer · 2025-06-09T18:56:28Z

tests/unit/sagemaker/modules/local_core/test_local_container.py

+    return session
+
+
+def test_get_data_source_local_path_s3_single_file(local_container, mock_s3_data_source, mock_session, monkeypatch):


Description: The test function is long and could be split into smaller, more focused test cases. Consider breaking down the test into smaller, more specific test cases for better readability and maintainability.

Severity: Low

The fix addresses the comment by splitting the original test function into two smaller, more focused test cases. The first test case, test_get_data_source_local_path_s3_single_file_creation, verifies that a local directory is created. The second test case, test_get_data_source_local_path_s3_single_file_download, checks that the download_folder function is called. This improves readability and maintainability by separating concerns and making each test more specific.

Suggested change

-def test_get_data_source_local_path_s3_single_file(local_container, mock_s3_data_source, mock_session, monkeypatch):
+    return session
+
+
+def test_get_data_source_local_path_s3_single_file_creation(local_container, mock_s3_data_source, mock_session):
+    """Test that _get_data_source_local_path creates a local directory for S3 single file."""
+    local_container.sagemaker_session = mock_session
+    local_dir = local_container._get_data_source_local_path(mock_s3_data_source)
+    assert os.path.exists(local_dir)
+    shutil.rmtree(local_dir, ignore_errors=True)
+
+
+def test_get_data_source_local_path_s3_single_file_download(local_container, mock_s3_data_source, mock_session):
+    """Test that _get_data_source_local_path calls download_folder for S3 single file."""
+    from unittest.mock import patch
+    local_container.sagemaker_session = mock_session
+    with patch('sagemaker.modules.local_core.local_container.download_folder') as mock_download:
+        local_container._get_data_source_local_path(mock_s3_data_source)
+        mock_download.assert_called_once()
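
The idea of one test per concern can be sketched without the SageMaker SDK. Everything here is an assumption made for illustration: `FakeLocalContainer` is a hypothetical stand-in for the real container class, its injected `downloader` replaces the patched download_folder function, and the bucket name is a placeholder.

```python
import os
import shutil
import tempfile
from unittest.mock import MagicMock


class FakeLocalContainer:
    """Hypothetical stand-in: creates a local directory, then downloads into it."""

    def __init__(self, downloader):
        self.downloader = downloader

    def get_data_source_local_path(self, s3_uri):
        local_dir = tempfile.mkdtemp()
        self.downloader(s3_uri, local_dir)
        return local_dir


def test_local_dir_is_created():
    # Concern 1: the returned path exists on disk.
    container = FakeLocalContainer(downloader=MagicMock())
    local_dir = container.get_data_source_local_path(
        "s3://example-bucket/data/single_file.csv"
    )
    try:
        assert os.path.isdir(local_dir)
    finally:
        shutil.rmtree(local_dir, ignore_errors=True)


def test_download_is_called_once():
    # Concern 2: the downloader is invoked exactly once with the expected arguments.
    downloader = MagicMock()
    container = FakeLocalContainer(downloader=downloader)
    s3_uri = "s3://example-bucket/data/single_file.csv"
    local_dir = container.get_data_source_local_path(s3_uri)
    try:
        downloader.assert_called_once_with(s3_uri, local_dir)
    finally:
        shutil.rmtree(local_dir, ignore_errors=True)
```

Wrapping the assertion in `try`/`finally` removes the temporary directory even when the assertion fails, which the suggested tests above do not guarantee.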

amazon-q-developer · 2025-06-09T18:56:29Z

✅ I finished the code review, and left comments with the issues I found. I will now generate code fix suggestions.

feat: Add support for single S3 file input in ModelTrainer local mode

8f42e3f

- Adds test case for ModelTrainer with single S3 file input
- Ensures proper directory creation and cleanup in local mode
- Improves S3 file handling in ModelTrainer

amazon-q-developer bot mentioned this pull request Jun 9, 2025

(local) "File or directory not found", when input_data points to a single S3 file #9

Open
