[MoE] Add MoE calibration options #1593

kylesayrs · 2025-06-25T15:19:57Z

Purpose

Add more options for configuring how an MoE model is calibrated via prepare_for_calibration

Changes

Add MoE calibration arguments to prepare_for_calibration:

:param moe_calibrate_all_experts: send all tokens to all experts for calibration
:param moe_calibrate_gated_acts: use moe gating mechanism when computing
    expert input and output activations. If this is True, the model computes
    activations similar to those found during inference. If this is False, the
    model computes activations similar to those found during training.
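
One way to picture how the two flags interact is the toy layer below. It is only a sketch of one possible reading of the docstring, not the llmcompressor implementation: `ToyCalibrationFlags`, `toy_moe_forward`, the scalar "experts", and the equal-weight averaging used when gating is disabled are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ToyCalibrationFlags:
    """Hypothetical stand-in for the two options described above."""

    moe_calibrate_all_experts: bool
    moe_calibrate_gated_acts: bool


def toy_moe_forward(
    tokens: List[float],
    experts: List[Callable[[float], float]],
    routes: List[List[Tuple[int, float]]],
    flags: ToyCalibrationFlags,
) -> Tuple[List[float], List[int]]:
    """Return (layer outputs, number of tokens each expert processed).

    routes[t] lists the (expert_index, gate_weight) pairs chosen for token t.
    """
    if not flags.moe_calibrate_all_experts and not flags.moe_calibrate_gated_acts:
        # Ungated outputs need every expert to have processed every token.
        raise NotImplementedError("unsupported flag combination")

    outputs = [0.0] * len(tokens)
    tokens_seen = [0] * len(experts)
    for e, expert in enumerate(experts):
        for t, token in enumerate(tokens):
            # None means the gate did not route token t to expert e.
            gate_weight = dict(routes[t]).get(e)
            if gate_weight is None and not flags.moe_calibrate_all_experts:
                continue  # ordinary routing: this expert never sees the token
            expert_out = expert(token)  # an observer would record this call
            tokens_seen[e] += 1
            if flags.moe_calibrate_gated_acts:
                # Only routed tokens contribute, scaled by their gate weight.
                if gate_weight is not None:
                    outputs[t] += gate_weight * expert_out
            else:
                # Assumption: without gating, all experts are averaged equally.
                outputs[t] += expert_out / len(experts)
    return outputs, tokens_seen
```

In this sketch, enabling `moe_calibrate_all_experts` while keeping `moe_calibrate_gated_acts` enabled leaves the layer output unchanged and only increases how many tokens each expert processes, which is the part that matters for calibration statistics.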

Signed-off-by: Kyle Sayers <[email protected]>

gemini-code-assist

Summary of Changes

Hello @kylesayrs, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the calibration capabilities for Mixture-of-Experts (MoE) models by introducing fine-grained control over how experts are handled during the calibration process. It also refines the observer configuration mechanism for quantization and adds detailed documentation for observers, improving the overall flexibility and clarity of the quantization pipeline.

Highlights

MoE Calibration Options: Introduced new parameters (moe_calibrate_all_experts, moe_calibrate_gated_acts) to the prepare_for_calibration function, providing fine-grained control over how Mixture-of-Experts (MoE) models are calibrated.
DeepseekV3 MoE Patching: Updated the DeepseekV3 MoE patching logic to dynamically configure expert activation collection during calibration based on the new CalibrationConfig settings.
Observer Configuration Enhancement: Enhanced the observer initialization process to allow direct passing of configuration parameters (e.g., maxshrink, patience, averaging_constant) to observers, improving their configurability.
New Documentation: Added comprehensive documentation for the Observer classes, detailing their purpose, types, quantization strategies, and configuration parameters.
Dependency Update: Updated the compressed-tensors dependency to a newer version.
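
The observer configuration highlight can be illustrated with a small sketch. Everything here is hypothetical: `ToyMovingAverageMinMaxObserver` and `build_observer` are made-up names, and the update rule is a generic exponential moving average rather than the library's own code. The point is only that keyword arguments such as `averaging_constant` are forwarded straight to the observer's constructor.

```python
from typing import Any, Dict, Iterable, Optional, Tuple


class ToyMovingAverageMinMaxObserver:
    """Toy observer that tracks a moving average of batch min and max."""

    def __init__(self, averaging_constant: float = 0.01) -> None:
        self.averaging_constant = averaging_constant
        self.running_min: Optional[float] = None
        self.running_max: Optional[float] = None

    def observe(self, values: Iterable[float]) -> Tuple[float, float]:
        values = list(values)
        batch_min, batch_max = min(values), max(values)
        if self.running_min is None or self.running_max is None:
            # First batch initializes the running statistics directly.
            self.running_min, self.running_max = batch_min, batch_max
        else:
            # Later batches move the statistics toward the new values.
            c = self.averaging_constant
            self.running_min += c * (batch_min - self.running_min)
            self.running_max += c * (batch_max - self.running_max)
        return self.running_min, self.running_max


def build_observer(observer_kwargs: Dict[str, Any]) -> ToyMovingAverageMinMaxObserver:
    """Forward user-supplied keyword arguments to the observer constructor."""
    return ToyMovingAverageMinMaxObserver(**observer_kwargs)
```

With `build_observer({"averaging_constant": 0.5})`, each new batch moves the running range halfway toward that batch's min and max.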



gemini-code-assist

Code Review

The pull request introduces new options for calibrating MoE models, specifically adding moe_calibrate_all_experts and moe_calibrate_gated_acts to the prepare_for_calibration function. The changes also include updates to the setup.py file and modifications to observer initialization and parameter handling. The feedback focuses on improving code clarity through additional comments and error message enhancements.

gemini-code-assist · 2025-06-25T15:22:18Z

src/llmcompressor/modeling/deepseek_v3.py

+                raise NotImplementedError(
+                    "Using all experts for activations without "
+                    "calibrating all experts is not supported"
+                )


This NotImplementedError could be more informative by including the specific configurations that are not supported. This will help users quickly understand the limitations and adjust their settings accordingly.

raise NotImplementedError(
    "Using all experts for activations without calibrating all experts "
    "is not supported. Please set moe_calibrate_gated_acts=True or "
    "moe_calibrate_all_experts=True."
)
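
A minimal sketch of the suggestion above, assuming the check lives in a small config object. `ToyCalibrationConfig` is a hypothetical name for illustration, not the CalibrationConfig added in this PR, and doing the validation in `__post_init__` is likewise an assumption. The message reports the offending values so the user can see which combination was rejected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyCalibrationConfig:
    """Hypothetical config holding the two MoE calibration flags."""

    moe_calibrate_all_experts: bool
    moe_calibrate_gated_acts: bool

    def __post_init__(self) -> None:
        # The only rejected combination: ungated activations while
        # tokens are still routed to a subset of experts.
        if not self.moe_calibrate_all_experts and not self.moe_calibrate_gated_acts:
            raise NotImplementedError(
                "Using all experts for activations without calibrating all "
                "experts is not supported (got "
                f"moe_calibrate_all_experts={self.moe_calibrate_all_experts}, "
                f"moe_calibrate_gated_acts={self.moe_calibrate_gated_acts}). "
                "Please set moe_calibrate_gated_acts=True or "
                "moe_calibrate_all_experts=True."
            )
```

Constructing `ToyCalibrationConfig(False, False)` raises immediately, while the other three combinations are accepted.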


dsikka · 2025-06-25T16:28:07Z

Putting this as draft as I make some changes

kylesayrs and others added 19 commits June 19, 2025 10:55

deepseekv3

b30eade

Signed-off-by: Kyle Sayers <[email protected]>

remove dreg

a957f2f

Signed-off-by: Kyle Sayers <[email protected]>

reformat example

2fd2a25

Signed-off-by: Kyle Sayers <[email protected]>

wip: clean up moe examples

b8b217c

Signed-off-by: Kyle Sayers <[email protected]>

remove deepseek2.5 for now

43bc91d

Signed-off-by: Kyle Sayers <[email protected]>

update readme

7d8ed36

Signed-off-by: Kyle Sayers <[email protected]>

rename files, update examples tests

e9e30c3

Signed-off-by: Kyle Sayers <[email protected]>

revert examples changes

2db2789

Signed-off-by: Kyle Sayers <[email protected]>

remove extra examples

0dc2381

Signed-off-by: Kyle Sayers <[email protected]>

Merge remote-tracking branch 'origin' into kylesayrs/deepseek-v3

941deac

skip generation

ad506fa

Signed-off-by: Kyle Sayers <[email protected]>

update readme, swap to r1, add docstrings

2b84051

Signed-off-by: Kyle Sayers <[email protected]>

remove qconfig, fix typo

d8e8213

Signed-off-by: Kyle Sayers <[email protected]>

remove dfs, replace with replace_module

6a8ed57

Signed-off-by: Kyle Sayers <[email protected]>

Merge remote-tracking branch 'origin' into kylesayrs/deepseek-v3

0e154cf

Merge branch 'main' into kylesayrs/deepseek-v3

f7b4c1b

implement and use CalibrationConfig

803842b

Signed-off-by: Kyle Sayers <[email protected]>

add docstring

4a5d1dd

Signed-off-by: Kyle Sayers <[email protected]>

update docstring

c2780cb

Signed-off-by: Kyle Sayers <[email protected]>

gemini-code-assist bot reviewed Jun 25, 2025


Merge branch 'kylesayrs/deepseek-v3', remote-tracking branch 'origin'…

cee1de7

… into kylesayrs/moe_calibration_config

gemini-code-assist bot reviewed Jun 25, 2025


kylesayrs added 3 commits June 25, 2025 11:22

Merge remote-tracking branch 'origin' into kylesayrs/deepseek-v3

919285a

Merge branch 'kylesayrs/deepseek-v3' into kylesayrs/moe_calibration_c…

3364830

…onfig

reduce diff

306eaa0

Signed-off-by: Kyle Sayers <[email protected]>

kylesayrs mentioned this pull request Jun 25, 2025

[MoE] DeepSeek-V3/R1 #1535

Merged

dsikka marked this pull request as draft June 25, 2025 16:28

Base automatically changed from kylesayrs/deepseek-v3 to main June 25, 2025 16:54
