Summary:
Pull Request resolved: #2889
Enable random weights for the unit test. When testing the DMP interface for dynamic sharding, I'm noticing discrepancies in predictions. I'm still debugging this case, but will enable random weights by default for the initial dynamic sharding interface and keep the debug values behind an optional flag.
Main changes:
1. Added a comment to `copy_state_dict` in `test_sharding` to make clear that the global state_dict is being copied into the local one.
2. Removed the redundant `copy_state_dict` call in the dynamic sharding unit test setup, since it already uses `load_state_dict`.
3. Added a `use_debug_state_dict` flag, defaulted to `False`; when turned on, it forces the test models to use dummy int values in the embedding weights.
4. With `use_debug_state_dict` turned off, the weights are randomly generated upon initialization of the EBCs (see the sketch after this list).
   1. Note: `torch.manual_seed(0)` is needed to force the EBCs to be initialized with the same float values across ranks in the distributed environment.
   2. An alternate approach would be to initialize the global EBCs outside of the distributed test process, but since this is just a unit test, I'll keep it as is.
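For reference, a minimal sketch of how the two initialization modes could look. The flag name `use_debug_state_dict` comes from this change, but `build_test_ebc`, the table configuration, and the integer fill pattern are illustrative assumptions, not the actual test code:

```python
import torch
from torchrec.modules.embedding_configs import EmbeddingBagConfig
from torchrec.modules.embedding_modules import EmbeddingBagCollection


def build_test_ebc(use_debug_state_dict: bool = False) -> EmbeddingBagCollection:
    """Build an EBC whose weights are either random or deterministic dummy ints.

    Hypothetical helper for illustration only.
    """
    if not use_debug_state_dict:
        # Seed before construction so every rank draws identical random
        # weights; otherwise each rank's EBC would start from different
        # floats and sharded/unsharded predictions could not be compared.
        torch.manual_seed(0)

    tables = [
        EmbeddingBagConfig(
            num_embeddings=4,
            embedding_dim=8,
            name="table_0",
            feature_names=["feature_0"],
        )
    ]
    ebc = EmbeddingBagCollection(tables=tables)

    if use_debug_state_dict:
        # Debug mode: overwrite every table with easy-to-read integer
        # values (0, 1, 2, ...) so mismatches are obvious when printed.
        with torch.no_grad():
            for weight in ebc.state_dict().values():
                weight.copy_(
                    torch.arange(weight.numel(), dtype=torch.float32).reshape(
                        weight.shape
                    )
                )
    return ebc
```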
Reviewed By: TroyGarden
Differential Revision: D73077322
fbshipit-source-id: 093f23c10b73a90b61429c4109e484627270bd46