Model parallelism of pretrained models. #4452
ryuryu18yaki asked this question in Q&A (unanswered)
Hi, I'm trying to fine-tune the Whisper large model with the AWS SageMaker model parallelism library, but I can't figure out how to write the training script. Could you help me with the two questions below?
1. How do I wrap a pretrained model with `smp.DistributedModel`?
I want to load the model from OpenAI's GitHub repository like this:
```python
import whisper
model = whisper.load_model("large")
```
However, if I need to use a different library to make this work, I'm happy to do that.
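Here is a rough sketch of what I understand the script should look like from the smp documentation: call `smp.init()` first, wrap the model and optimizer, and run the forward/backward pass inside an `@smp.step` function. The Whisper forward call and the loss are my own assumptions, and I have not run this:

```python
# Rough sketch based on my reading of the smp docs; the loss and the
# Whisper forward signature are my assumptions, not tested code.
import torch
import torch.nn.functional as F
import smdistributed.modelparallel.torch as smp
import whisper

smp.init()  # must run before the model is wrapped

model = whisper.load_model("large")
model = smp.DistributedModel(model)  # partitions the module across GPUs
optimizer = smp.DistributedOptimizer(
    torch.optim.AdamW(model.parameters(), lr=1e-5)
)

@smp.step
def train_step(model, mel, tokens):
    # Assumption: Whisper.forward(mel, tokens) returns decoder logits.
    logits = model(mel, tokens)
    # Placeholder next-token prediction loss, shifted for teacher forcing.
    loss = F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),
        tokens[:, 1:].reshape(-1),
    )
    model.backward(loss)  # smp requires this instead of loss.backward()
    return loss
```

In the training loop I would then call `loss = train_step(model, mel, tokens)` followed by `optimizer.step()`, and read the value with `loss.reduce_mean()`, since `smp.step` returns per-microbatch outputs. Is that the right structure for a model loaded from the whisper package?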
2. The model parallelism library offers three approaches: sharded data parallelism, pipeline parallelism, and tensor parallelism. Which one should I use?
Right now I'm leaning toward sharded data parallelism, roughly as sketched below.
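For reference, this is how I currently plan to enable sharded data parallelism on the estimator side. The degrees, instance type, and script name are placeholders; only the overall shape follows the SageMaker SDK's `distribution` argument for the model parallel library:

```python
# Placeholder values throughout; my training script name is hypothetical.
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="train.py",          # placeholder script name
    role="<your-sagemaker-role>",
    instance_type="ml.p4d.24xlarge",
    instance_count=1,
    framework_version="1.13",
    py_version="py39",
    distribution={
        "smdistributed": {
            "modelparallel": {
                "enabled": True,
                "parameters": {
                    "sharded_data_parallel_degree": 8,  # shard optimizer/gradient/parameter state
                    "ddp": True,                        # required for sharded data parallelism
                },
            }
        },
        "mpi": {"enabled": True, "processes_per_host": 8},
    },
)
```

My thinking is that Whisper large (about 1.5B parameters) is tight for plain data parallelism on smaller GPUs, so sharding the optimizer and gradient state seemed like the least invasive option, but I'd appreciate advice on whether pipeline or tensor parallelism fits better here.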
My English is poor, but I hope you can answer me.