
Qonnx binary quant #1292


Open
wants to merge 15 commits into main

Conversation

jurevreca12

Description

📝 Added a transformation from the QONNX operator BipolarQuant to the hls4ml operator BinaryQuantizer.

  • Added an ONNX BipolarQuant operator (its semantics are sketched below).
  • Added three transformations (optimizations) that convert the QONNX operator to an hls4ml operator.
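
For reference, a minimal numpy sketch of the BipolarQuant semantics as I read them; the function name and the treatment of x exactly equal to 0 are my assumptions, not taken verbatim from the QONNX spec:

import numpy as np

def bipolar_quant(x, scale=1.0):
    # Assumed BipolarQuant semantics: output is +scale where x >= 0,
    # -scale elsewhere (the behavior at exactly 0 is an assumption).
    return np.where(x >= 0, 1.0, -1.0) * scale

x = np.array([-0.7, 0.0, 0.3, 2.5])
print(bipolar_quant(x))             # [-1.  1.  1.  1.]
print(bipolar_quant(x, scale=2.0))  # [-2.  2.  2.  2.]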

Type of change

  • Bug fix (non-breaking change that fixes an issue)
  • Documentation update
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • A new research paper code implementation
  • Other (Specify)

Tests

📝 Added a basic test that uses a binarized model which exercises BipolarQuant in two different contexts: on a constant weight parameter and on input activations (a sketch follows the list below).

  • To reproduce, run pytest test/pytest/test_qonnx.py::test_bnn
  • I added the test model file directly to git, which you may not approve of, but it is a very small artificial model. I generate these types of models in the chisel4ml testing infrastructure; they are single-layer models with randomly generated weights. I didn't want to add Brevitas to the dev requirements, so I just added a single such model.
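
A hedged sketch of the two contexts the test exercises; the function and test names and the shapes are illustrative, not the actual test code:

import numpy as np

def bipolar_quant(x, scale=1.0):
    return np.where(x >= 0, 1.0, -1.0) * scale  # assumed BipolarQuant semantics

def test_bnn_sketch():
    rng = np.random.default_rng(0)
    # Context 1: BipolarQuant applied to a constant weight parameter
    w = bipolar_quant(rng.standard_normal((4, 3)))
    # Context 2: BipolarQuant applied to the input activations
    x = bipolar_quant(rng.standard_normal((1, 4)))
    y = x @ w  # a single binarized layer
    assert y.shape == (1, 3)
    assert set(np.unique(w)) <= {-1.0, 1.0}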

Test Configuration:

Checklist

  • I have read the guidelines for contributing.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have made corresponding changes to the documentation.
  • My changes generate no new warnings.
  • I have installed and run pre-commit on the files I edited or added.
  • I have added tests that prove my fix is effective or that my feature works.

Contributor

I am not sure if this is a good place for an ONNX model. I wonder if some of the models from the model zoo would do. If not, we should put this in https://github.com/fastmachinelearning/example-models, where we keep other test inputs. (Note that that project is included as a submodule.)

Contributor

By the way, this is the qonnx model zoo that I am referring to: https://github.com/fastmachinelearning/qonnx_model_zoo. It does have some binary models, though I am not sure if they are suitable for the test.

Author

I agree that it should not be in this repository. However, it is kind of an artificial model, so I am not sure it belongs in the model zoo either. I prefer using these kinds of artificial models because they can be generated quickly and they are very simple; if something goes wrong, the model is small and the issue is easier to find.
Would you be open to adding Brevitas to the testing dependencies? That way we could auto-generate various quantized models, as in chisel4ml.
Another option would be to use one of the models from the zoo.

Contributor

The example-models repository is a good place for artificial (not real) models that are used for testing.


class BipolarQuantToActivation(OptimizerPass):
    """
    This is for the case when scale is a (positive) power of 2 and zeropt is 0. It is a 1:1 transformation of
Contributor

Does this really work if the scale is a power of 2? It's fine if it doesn't, but in that case we should change the matching criteria.

Author

I am not sure. I'll add a test case to check.

Author

It does not work for scale != 1, since BinaryQuantizer does not have a scaling factor. One option would be to lower a BipolarQuant with scaling factor != 1 to an activation node followed by a Mul node. However, I am not sure there is much sense in this, since these Mul nodes are, I presume, not implementable? On the other hand, perhaps they could be absorbed into other nodes.
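
For what it's worth, that decomposition works algebraically because the sign is invariant under a positive scale; a small numpy check (bipolar_quant and binary_sign are illustrative names, and the semantics are assumed):

import numpy as np

def bipolar_quant(x, scale):
    return np.where(x >= 0, 1.0, -1.0) * scale  # assumed BipolarQuant semantics

def binary_sign(x):
    return np.where(x >= 0, 1.0, -1.0)          # unit-scale binary quantizer

x = np.linspace(-2.0, 2.0, 9)
s = 0.25  # any positive scale
# BipolarQuant(x, s) == Mul(s, BinarySign(x)), so the Mul can be split off
assert np.allclose(bipolar_quant(x, s), s * binary_sign(x))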

Contributor

The multiplier nodes are the ApplyAlpha nodes I mention in another comment. ApplyAlpha is just a shift-and-scale layer, and you can use it as just a scale. (The name comes from its original application, and it is not very good. We have talked about changing the name.) Fundamentally it's implemented the same way as a batchnorm (since we don't actually update the scaling in a batchnorm).
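
A minimal sketch of what that means, assuming ApplyAlpha computes the affine y = scale * x + bias described above (the function name is illustrative, not the hls4ml API):

import numpy as np

def apply_alpha(x, scale, bias):
    # Batchnorm-style affine with frozen parameters; using it "as just a
    # scale" means bias = 0.
    return scale * x + bias

x = np.array([-2.0, 0.5, 3.0])
print(apply_alpha(x, scale=0.5, bias=0.0))  # pure rescaling: [-1.    0.25  1.5 ]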

Author

I added some tests with non-unit power-of-two scale factors, and it appears that the transformations work as is. I tried setting the scaling factors both below 1 and above 1, and both cases worked. Perhaps some downstream transformation is handling this case?

is_match = (
    isinstance(node, BipolarQuant)
    and len(node.inputs) == 1
    and isinstance(node.get_input_node(node.inputs[0]), Constant)
Contributor

Same here, does this really work if scale != 1? If it doesn't, the matching criteria should change.

if node.get_input_node(node.inputs[1]):
    scale_node = node.get_input_node(node.inputs[1])
    if isinstance(scale_node, Constant):
        node.set_attr('scale', scale_node.get_attr('value'))
Contributor

We don't seem to handle the case when scale != 1. Ideally we should be able to extract ApplyAlpha scales in such a case and propagate them up and down the graph. I think basic support can be fairly straightforwardly added, in the style of the Quant support. (If we don't support scale != 1, we should catch those cases and exit gracefully with an error message.)
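
A hypothetical guard in that spirit; the function name and message are mine, not existing hls4ml code:

import numpy as np

def check_unit_scale(scale, node_name):
    # Exit gracefully instead of silently mis-quantizing when scale != 1.
    if not np.all(np.asarray(scale) == 1):
        raise RuntimeError(
            f'BipolarQuant node "{node_name}": scale != 1 is not supported; '
            'expected the scale to be extracted into an ApplyAlpha node.'
        )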

Author

I checked the BinaryQuantizer code, and it does not define a scaling factor, meaning this can only work for a scale factor of 1. Furthermore, this whole optimizer pass becomes irrelevant, so I will delete it.
What does ApplyAlpha do? I am not familiar with it.

Contributor

So a quant layer with a scale and/or zero offset really means scale/shift, then quantize, then unscale/unshift. The ApplyAlpha layers are scale-and-shift layers in the hls4ml IR. When a quant node is applied to a weight, the initial scaling/shifting can actually be applied directly to the weights (assuming they are constant and not updatable). Otherwise, the hope is that the scaling and unscaling can be moved around the graph to where the implementation is easiest. There are optimizers that already exist for that.
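
A worked numpy example of that pattern, specialized to BipolarQuant (zero offset 0, so only the scale matters; the step semantics are my reading, not quoted from the spec):

import numpy as np

x = np.array([-0.3, 0.0, 1.7])
scale = 0.5
q_in = x / scale                    # 1. scale down into the quantized domain
q = np.where(q_in >= 0, 1.0, -1.0)  # 2. quantize (bipolar step)
y = q * scale                       # 3. unscale back to the original domain
print(y)  # [-0.5  0.5  0.5]

# Applied to a constant weight tensor, steps 1-3 can simply be folded into
# the stored weights; applied to activations, the scale/unscale become
# ApplyAlpha nodes that optimizers try to move to where they are cheapest.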
