p145
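In terms of training, instruction tuning is fairly similar to pre-training, and many settings, including the way data is organized, can reuse the techniques adopted in the pre-training stage (see Chapters 4 and 6). This section mainly introduces training strategies that are specific to instruction tuning.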
p146
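In instruction tuning, the optimizer settings (AdamW or Adafactor), the training stabilization tricks (weight decay and gradient clipping), and the training techniques (3D parallelism, ZeRO, and mixed-precision training) are all consistent with the pre-training stage and can be reused as is.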
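To make the quoted passage concrete, here is a minimal pure-Python sketch of two of the settings it names: gradient clipping by global norm, followed by one AdamW step with decoupled weight decay. The function names `clip_by_global_norm` and `adamw_step`, and all hyperparameter values (learning rate, betas, weight decay, clipping threshold), are illustrative assumptions and are not taken from the book. A real training run would normally rely on a framework's optimizer instead.

```python
import math

def clip_by_global_norm(grads, max_norm):
    """Rescale gradients so their global L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm or total_norm == 0.0:
        return list(grads), total_norm
    scale = max_norm / total_norm
    return [g * scale for g in grads], total_norm

def adamw_step(params, grads, state, lr=1e-5, betas=(0.9, 0.95),
               eps=1e-8, weight_decay=0.1):
    """Apply one AdamW update to a plain list of floats.

    Hyperparameter defaults are assumed values for illustration only.
    """
    b1, b2 = betas
    state["t"] += 1
    t = state["t"]
    new_params = []
    for i, (p, g) in enumerate(zip(params, grads)):
        # Exponential moving averages of the gradient and its square.
        state["m"][i] = b1 * state["m"][i] + (1 - b1) * g
        state["v"][i] = b2 * state["v"][i] + (1 - b2) * g * g
        # Bias correction for the zero-initialized averages.
        m_hat = state["m"][i] / (1 - b1 ** t)
        v_hat = state["v"][i] / (1 - b2 ** t)
        # Decoupled weight decay: shrink the weight directly, not via the gradient.
        p = p * (1 - lr * weight_decay)
        p = p - lr * m_hat / (math.sqrt(v_hat) + eps)
        new_params.append(p)
    return new_params

# Toy example: two parameters with gradients whose global norm is 5.0.
params = [1.0, -2.0]
grads = [3.0, 4.0]
state = {"t": 0, "m": [0.0, 0.0], "v": [0.0, 0.0]}

clipped, norm = clip_by_global_norm(grads, max_norm=1.0)
params = adamw_step(params, clipped, state)
```

Clipping is applied before the optimizer step, so the moment estimates only ever see the rescaled gradients. The same ordering would apply in instruction tuning as in pre-training.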