Commit 85c945c

Add tutorial use mirror (#1412)
1 parent b914827 commit 85c945c

4 files changed

+269
-33
lines changed

docs/en/tutorials/use_mirror.md

+86
@@ -0,0 +1,86 @@
1+
# Use mirror to download models and datasets
2+
3+
While the official Hugging Face repository offers numerous high-quality models and datasets, it may not always be accessible due to network issues. To make access easier, MindNLP lets you download models and datasets from a variety of Hugging Face mirrors or other model repositories.
4+
5+
Here we show you how to set your desired mirror.
6+
7+
You can either set the Hugging Face mirror globally through an environment variable or, more locally, specify the mirror in the `from_pretrained` method when downloading models.
8+
9+
## Set Hugging Face mirror through the environment variable
10+
11+
The Hugging Face mirror used in MindNLP is controlled through the `HF_ENDPOINT` environment variable.
12+
13+
You can either set this variable in the terminal before executing your Python script:
14+
```bash
15+
export HF_ENDPOINT="https://hf-mirror.com"
16+
```
17+
or set it within the Python script using the `os` module:
18+
19+
20+
```python
21+
import os
22+
os.environ['HF_ENDPOINT'] = 'https://hf-mirror.com'
23+
```
24+
25+
If the `HF_ENDPOINT` variable is not set explicitly by the user, MindNLP uses 'https://hf-mirror.com' by default. You can change this to the official Hugging Face repository, 'https://huggingface.co'.
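
The fallback described above can be sketched in a few lines. This is only an illustration of the documented behavior, not MindNLP's internal code; the names `DEFAULT_ENDPOINT`, `OFFICIAL_ENDPOINT` and `resolve_endpoint` are hypothetical.

```python
import os

# Values taken from the paragraph above; the constant names are hypothetical.
DEFAULT_ENDPOINT = 'https://hf-mirror.com'
OFFICIAL_ENDPOINT = 'https://huggingface.co'


def resolve_endpoint(environ=os.environ):
    """Return HF_ENDPOINT if it is set, otherwise the documented default."""
    return environ.get('HF_ENDPOINT', DEFAULT_ENDPOINT)


# With the variable unset, the default mirror applies.
print(resolve_endpoint({}))
# Setting the variable to the official URL selects the official repository.
print(resolve_endpoint({'HF_ENDPOINT': OFFICIAL_ENDPOINT}))
```

Passing a plain dictionary instead of `os.environ` lets you check the result without touching the real environment.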
26+
27+
**Important:**
28+
29+
The URL must not end with a trailing '/'. Setting the variable to 'https://hf-mirror.com' will work, while setting it to 'https://hf-mirror.com/' will result in an error.
30+
31+
**Important:**
32+
33+
Because the `HF_ENDPOINT` variable is read during the initial import of MindNLP, you must set it before importing MindNLP. If you are in a Jupyter Notebook and the MindNLP package has already been imported, you may need to restart the notebook for the change to take effect.
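
Both notes can be folded into one small helper. The sketch below assumes nothing about MindNLP beyond what is stated above; `set_hf_endpoint` is a hypothetical name, and the `sys.modules` check only detects that the package was imported earlier in the same interpreter.

```python
import os
import sys
import warnings


def set_hf_endpoint(url):
    """Export HF_ENDPOINT without a trailing '/' and return the value set."""
    endpoint = url.rstrip('/')
    if 'mindnlp' in sys.modules:
        # The variable is read at first import, so setting it now is too late.
        warnings.warn('mindnlp is already imported; restart the interpreter '
                      'or notebook for HF_ENDPOINT to take effect.')
    os.environ['HF_ENDPOINT'] = endpoint
    return endpoint


set_hf_endpoint('https://hf-mirror.com/')  # stored as 'https://hf-mirror.com'
# Import MindNLP only after this point, for example:
# from mindnlp.transformers import AutoModelForSequenceClassification
```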
34+
35+
Now you can download the model you want, for example:
36+
37+
38+
```python
39+
from mindnlp.transformers import AutoModelForSequenceClassification
40+
model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased')
41+
```
42+
43+
## Specify Hugging Face mirror in the `from_pretrained` method
44+
45+
Instead of setting the Hugging Face mirror globally through the environment variable, you can also specify the mirror for a single download operation in the `from_pretrained` method.
46+
47+
For example:
48+
49+
50+
```python
51+
from mindnlp.transformers import AutoModelForSequenceClassification
52+
model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', mirror='modelscope', revision='master')
53+
```
54+
55+
MindNLP accepts the following options for the `mirror` argument:
56+
57+
* 'huggingface'
58+
59+
Download from the Hugging Face mirror specified through the `HF_ENDPOINT` environment variable. By default, it points to [HF-Mirror](https://hf-mirror.com).
60+
61+
* 'modelscope'
62+
63+
Download from [ModelScope](https://www.modelscope.cn).
64+
65+
* 'wisemodel'
66+
67+
Download from [始智AI](https://www.wisemodel.cn).
68+
69+
* 'gitee'
70+
71+
Download from the [Gitee AI Hugging Face repository](https://ai.gitee.com/huggingface).
72+
73+
* 'aifast'
74+
75+
Download from [AI快站](https://aifasthub.com).
76+
77+
Note that no single mirror hosts every model, so check that the model you want to download is actually provided by the mirror you choose.
78+
79+
In addition to specifying the mirror, you also need to specify the `revision` argument. The `revision` argument can either be 'master' or 'main' depending on the mirror you choose. By default, `revision='main'`.
80+
81+
* If the `mirror` is 'huggingface', 'wisemodel' or 'gitee', set `revision='main'`.
82+
83+
* If the `mirror` is 'modelscope', set `revision='master'`.
84+
85+
* If the `mirror` is 'aifast', `revision` does not need to be specified.
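
The mirror/revision pairing above can be captured in a lookup table, which also makes it easy to try several mirrors in turn when a model is missing from one of them. This is a hedged sketch: `MIRROR_REVISIONS`, `mirror_kwargs` and `load_with_fallback` are hypothetical helpers, and catching a generic `Exception` assumes that a failed download raises some exception, which the tutorial does not specify.

```python
# Revision expected by each mirror, as listed above; None means "leave it unset".
MIRROR_REVISIONS = {
    'huggingface': 'main',
    'wisemodel': 'main',
    'gitee': 'main',
    'modelscope': 'master',
    'aifast': None,
}


def mirror_kwargs(mirror):
    """Build the keyword arguments to pass to `from_pretrained` for one mirror."""
    if mirror not in MIRROR_REVISIONS:
        raise ValueError(f'unknown mirror: {mirror!r}')
    kwargs = {'mirror': mirror}
    revision = MIRROR_REVISIONS[mirror]
    if revision is not None:
        kwargs['revision'] = revision
    return kwargs


def load_with_fallback(loader, name, mirrors):
    """Call `loader` with each mirror in order; return the first success."""
    errors = {}
    for mirror in mirrors:
        kwargs = mirror_kwargs(mirror)  # a misspelled mirror name fails loudly here
        try:
            return loader(name, **kwargs)
        except Exception as exc:  # assumption: a failed download raises
            errors[mirror] = exc
    raise RuntimeError(f'{name!r} could not be loaded from any mirror: {errors}')
```

With MindNLP installed, a call might look like `load_with_fallback(AutoModelForSequenceClassification.from_pretrained, 'bert-base-uncased', ['modelscope', 'huggingface'])`.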
86+

mkdocs.yml

+1
@@ -10,6 +10,7 @@ nav:
10 10
- Data Preprocess: tutorials/data_preprocess.md
11 11
- Use Trainer: tutorials/use_trainer.md
12 12
- PEFT: tutorials/peft.md
13+
- Use Mirror: tutorials/use_mirror.md
13 14
- Supported Models: supported_models.md
14 15
- How-To Contribute: contribute.md
15 16
- API Reference:

tutorials/4.pretrained_model.ipynb

-33
This file was deleted.

tutorials/4.use_mirror.ipynb

+182
@@ -0,0 +1,182 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"metadata": {},
6+
"source": [
7+
"# Use mirror to download models and datasets\n",
8+
"\n",
9+
"While the official Hugging Face repository offers numerous high-quality models and datasets, it may not always be accessible due to network issues. To make access easier, MindNLP lets you download models and datasets from a variety of Hugging Face mirrors or other model repositories.\n",
10+
"\n",
11+
"Here we show you how to set your desired mirror.\n",
12+
"\n",
13+
"You can either set the Hugging Face mirror globally through an environment variable or, more locally, specify the mirror in the `from_pretrained` method when downloading models."
14+
]
15+
},
16+
{
17+
"cell_type": "markdown",
18+
"metadata": {},
19+
"source": [
20+
"## Set Hugging Face mirror through the environment variable\n",
21+
"\n",
22+
"The Hugging Face mirror used in MindNLP is controlled through the `HF_ENDPOINT` environment variable.\n",
23+
"\n",
24+
"You can either set this variable in the terminal before executing your Python script:\n",
25+
"```bash\n",
26+
"export HF_ENDPOINT=\"https://hf-mirror.com\"\n",
27+
"```\n",
28+
"or set it within the Python script using the `os` module:"
29+
]
30+
},
31+
{
32+
"cell_type": "code",
33+
"execution_count": 1,
34+
"metadata": {},
35+
"outputs": [],
36+
"source": [
37+
"import os\n",
38+
"os.environ['HF_ENDPOINT'] = 'https://hf-mirror.com'"
39+
]
40+
},
41+
{
42+
"cell_type": "markdown",
43+
"metadata": {},
44+
"source": [
45+
"If the `HF_ENDPOINT` variable is not set explicitly by the user, MindNLP uses 'https://hf-mirror.com' by default. You can change this to the official Hugging Face repository, 'https://huggingface.co'.\n",
46+
"\n",
47+
"**Important:**\n",
48+
"\n",
49+
"The URL must not end with a trailing '/'. Setting the variable to 'https://hf-mirror.com' will work, while setting it to 'https://hf-mirror.com/' will result in an error.\n",
50+
"\n",
51+
"**Important:**\n",
52+
"\n",
53+
"Because the `HF_ENDPOINT` variable is read during the initial import of MindNLP, you must set it before importing MindNLP. If you are in a Jupyter Notebook and the MindNLP package has already been imported, you may need to restart the notebook for the change to take effect."
54+
]
55+
},
56+
{
57+
"cell_type": "markdown",
58+
"metadata": {},
59+
"source": [
60+
"Now you can download the model you want, for example:"
61+
]
62+
},
63+
{
64+
"cell_type": "code",
65+
"execution_count": 2,
66+
"metadata": {},
67+
"outputs": [
68+
{
69+
"name": "stderr",
70+
"output_type": "stream",
71+
"text": [
72+
"[WARNING] ME(54773:130029102232640,MainProcess):2024-07-17-21:23:42.507.077 [mindspore/run_check/_check_version.py:102] MindSpore version 2.2.14 and cuda version 11.4.148 does not match, CUDA version [['10.1', '11.1', '11.6']] are supported by MindSpore officially. Please refer to the installation guide for version matching information: https://www.mindspore.cn/install.\n",
73+
"/home/hubo/Software/miniconda3/envs/mindspore/lib/python3.9/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
74+
" from .autonotebook import tqdm as notebook_tqdm\n",
75+
"Building prefix dict from the default dictionary ...\n",
76+
"Dumping model to file cache /tmp/jieba.cache\n",
77+
"Loading model cost 0.762 seconds.\n",
78+
"Prefix dict has been built successfully.\n",
79+
"The following parameters in checkpoint files are not loaded:\n",
80+
"['cls.predictions.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.LayerNorm.weight']\n",
81+
"The following parameters in models are missing parameter:\n",
82+
"['classifier.weight', 'classifier.bias']\n"
83+
]
84+
}
85+
],
86+
"source": [
87+
"from mindnlp.transformers import AutoModelForSequenceClassification\n",
88+
"model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased')"
89+
]
90+
},
91+
{
92+
"cell_type": "markdown",
93+
"metadata": {},
94+
"source": [
95+
"## Specify Hugging Face mirror in the `from_pretrained` method\n",
96+
"\n",
97+
"Instead of setting the Hugging Face mirror globally through the environment variable, you can also specify the mirror for a single download operation in the `from_pretrained` method.\n",
98+
"\n",
99+
"For example:"
100+
]
101+
},
102+
{
103+
"cell_type": "code",
104+
"execution_count": 4,
105+
"metadata": {},
106+
"outputs": [
107+
{
108+
"name": "stderr",
109+
"output_type": "stream",
110+
"text": [
111+
"The following parameters in checkpoint files are not loaded:\n",
112+
"['cls.predictions.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.LayerNorm.weight']\n",
113+
"The following parameters in models are missing parameter:\n",
114+
"['classifier.weight', 'classifier.bias']\n"
115+
]
116+
}
117+
],
118+
"source": [
119+
"from mindnlp.transformers import AutoModelForSequenceClassification\n",
120+
"model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', mirror='modelscope', revision='master')"
121+
]
122+
},
123+
{
124+
"cell_type": "markdown",
125+
"metadata": {},
126+
"source": [
127+
"MindNLP accepts the following options for the `mirror` argument:\n",
128+
"\n",
129+
"* 'huggingface'\n",
130+
"\n",
131+
" Download from the Hugging Face mirror specified through the `HF_ENDPOINT` environment variable. By default, it points to [HF-Mirror](https://hf-mirror.com).\n",
132+
"\n",
133+
"* 'modelscope'\n",
134+
"\n",
135+
" Download from [ModelScope](https://www.modelscope.cn).\n",
136+
"\n",
137+
"* 'wisemodel'\n",
138+
"\n",
139+
" Download from [始智AI](https://www.wisemodel.cn).\n",
140+
"\n",
141+
"* 'gitee'\n",
142+
"\n",
143+
" Dowload from the [Gitee AI Hugging Face repository](https://ai.gitee.com/huggingface).\n",
144+
"\n",
145+
"* 'aifast'\n",
146+
"\n",
147+
" Download from [AI快站](https://aifasthub.com).\n",
148+
"\n",
149+
"Note that no single mirror hosts every model, so check that the model you want to download is actually provided by the mirror you choose.\n",
150+
"\n",
151+
"In addition to specifying the mirror, you also need to specify the `revision` argument. The `revision` argument can either be 'master' or 'main' depending on the mirror you choose. By default, `revision='main'`.\n",
152+
"\n",
153+
"* If the `mirror` is 'huggingface', 'wisemodel' or 'gitee', set `revision='main'`.\n",
154+
"\n",
155+
"* If the `mirror` is 'modelscope', set `revision='master'`.\n",
156+
"\n",
157+
"* If the `mirror` is 'aifast', `revision` does not need to be specified.\n"
158+
]
159+
}
160+
],
161+
"metadata": {
162+
"kernelspec": {
163+
"display_name": "mindspore",
164+
"language": "python",
165+
"name": "python3"
166+
},
167+
"language_info": {
168+
"codemirror_mode": {
169+
"name": "ipython",
170+
"version": 3
171+
},
172+
"file_extension": ".py",
173+
"mimetype": "text/x-python",
174+
"name": "python",
175+
"nbconvert_exporter": "python",
176+
"pygments_lexer": "ipython3",
177+
"version": "3.9.18"
178+
}
179+
},
180+
"nbformat": 4,
181+
"nbformat_minor": 2
182+
}
