torchtune training fails to validate dataset #1849

Open
2 tasks
booxter opened this issue Mar 31, 2025 · 1 comment
Labels
bug Something isn't working

Comments

@booxter
Contributor

booxter commented Mar 31, 2025

System Info

.

Information

  • The official example scripts
  • My own modified scripts

🐛 Describe the bug

When I try to use torchtune for post-training, it no longer works and fails with:

AttributeError: 'DatasetWithACL' object has no attribute 'dataset_schema'

This happens during dataset validation.

Error executing endpoint route='/v1/post-training/supervised-fine-tune' method='post'
Traceback (most recent call last):
  File "/home/ec2-user/src/llama-stack/schedule/llama_stack/distribution/server/server.py", line 201, in endpoint
    return await maybe_await(value)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/src/llama-stack/schedule/llama_stack/distribution/server/server.py", line 161, in maybe_await
    return await value
           ^^^^^^^^^^^
  File "/home/ec2-user/src/llama-stack/schedule/llama_stack/providers/inline/post_training/torchtune/post_training.py", line 89, in supervised_fine_tune
    await recipe.setup()
  File "/home/ec2-user/src/llama-stack/schedule/llama_stack/providers/inline/post_training/torchtune/recipes/lora_finetuning_single_device.py", line 198, in setup
    self._training_sampler, self._training_dataloader = await self._setup_data(
                                                        ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/src/llama-stack/schedule/llama_stack/providers/inline/post_training/torchtune/recipes/lora_finetuning_single_device.py", line 342, in _setup_data
    await validate_input_dataset_schema(
  File "/home/ec2-user/src/llama-stack/schedule/llama_stack/providers/inline/post_training/common/validator.py", line 50, in validate_input_dataset_schema
    if not dataset_def.dataset_schema or len(dataset_def.dataset_schema) == 0:
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/src/llama-stack/schedule/venv/lib64/python3.11/site-packages/pydantic/main.py", line 984, in __getattr__
    raise AttributeError(f'{type(self).__name__!r} object has no attribute {item!r}')
AttributeError: 'DatasetWithACL' object has no attribute 'dataset_schema'
severity=<LogSeverity.ERROR: 'error'>

This happens because the dataset API was changed and the dataset_schema field was removed (see #1573), while the post-training validator still reads that field.
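To illustrate the failure mode, here is a minimal sketch that uses only the standard library. The two dataclasses and the get_schema_or_fail helper are hypothetical stand-ins made up for this example; they are not the real DatasetWithACL model or the actual validator code, and this is not a proposed fix. It only shows that reading a removed field directly raises AttributeError, while a getattr lookup with a default can turn that into a clearer validation error.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class DatasetWithoutSchema:
    """Hypothetical stand-in for a dataset definition after the API change."""
    identifier: str
    metadata: dict[str, Any] = field(default_factory=dict)


@dataclass
class DatasetWithSchema:
    """Hypothetical stand-in for the older shape that still carried a schema."""
    identifier: str
    dataset_schema: dict[str, str] = field(default_factory=dict)


def get_schema_or_fail(dataset_def: Any) -> dict[str, str]:
    """Return the schema if present and non-empty, else raise a clear error.

    Hypothetical helper: getattr with a default avoids the raw AttributeError
    that direct attribute access produces when the field no longer exists.
    """
    schema = getattr(dataset_def, "dataset_schema", None)
    if not schema:
        raise ValueError(
            f"Dataset {dataset_def.identifier!r} has no usable dataset_schema"
        )
    return schema


if __name__ == "__main__":
    new_style = DatasetWithoutSchema(identifier="my-dataset")

    # Direct access mirrors the check in the traceback and fails the same way.
    try:
        new_style.dataset_schema
    except AttributeError as exc:
        print(f"direct access: {exc}")

    # The defensive lookup reports a validation problem instead.
    try:
        get_schema_or_fail(new_style)
    except ValueError as exc:
        print(f"defensive lookup: {exc}")
```

Whether the validator should keep requiring a schema at all after the API change is a separate question that this sketch does not answer.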

Error logs

.

Expected behavior

.

booxter added the bug label Mar 31, 2025
@booxter
Contributor Author

booxter commented Mar 31, 2025

@yanxi0830 FYI, post-training is broken after the datasets API change.
