$$ \begin{align*}
re&al\_value_{Single} \\
&= roundToNearestFloat((affine\_value_{uint8 \, or \, uint16} - zero\_point_{uint8 \, or \, uint16})_{sint32})_{Single} * scale_{Single}
\end{align*} $$
In the above, we assume that the result of the subtraction is in 32-bit signed integer format, and that $$roundToNearestFloat$$ returns a Single.
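As a concrete illustration, here is a minimal Python sketch of this dequantization formula. The function and parameter names (`dequantize`, `affine_value`, and so on) are chosen for this example only and are not part of any MLIR or TensorFlow API:

```python
import numpy as np

def dequantize(affine_value, zero_point, scale):
    """Recover approximate real values from uint8/uint16 affine values.

    Mirrors the formula above: the subtraction is performed on 32-bit
    signed integers, the result is rounded to the nearest float32
    (roundToNearestFloat), then multiplied by the float32 scale.
    """
    # Subtract in sint32 so that e.g. 0 - 128 does not wrap around.
    diff_sint32 = affine_value.astype(np.int32) - np.int32(zero_point)
    # Conversion to float32 rounds to the nearest representable float.
    return diff_sint32.astype(np.float32) * np.float32(scale)

# Example: uint8 storage with zero_point=128, scale=0.1
q = np.array([0, 128, 255], dtype=np.uint8)
print(dequantize(q, zero_point=128, scale=0.1))  # -> [-12.8  0.  12.7]
```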
#### Affine to fixed point
When the affine and fixed point scales are the same, subtracting the zero point from the affine value produces the equivalent scaled value.
$$ scaled\_value = affine\_value_{non\mbox{-}negative} - zero\_point_{non\mbox{-}negative} $$
#### Fixed point to affine
When the affine and fixed point scales are the same, adding the zero point to the scaled value produces the equivalent affine value.
$$ affine\_value_{non\mbox{-}negative} = scaled\_value + zero\_point_{non\mbox{-}negative} $$
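Under the stated assumption that both representations share the same scale, the two conversions are simple integer shifts by the zero point. A minimal sketch (function names are illustrative only):

```python
import numpy as np

def affine_to_fixed_point(affine_value, zero_point):
    # Valid only when the affine and fixed point scales are identical.
    return affine_value.astype(np.int32) - np.int32(zero_point)

def fixed_point_to_affine(scaled_value, zero_point):
    # The inverse operation, again assuming matching scales.
    return scaled_value + np.int32(zero_point)

q = np.array([0, 128, 255], dtype=np.uint8)
s = affine_to_fixed_point(q, zero_point=128)   # [-128, 0, 127]
assert np.array_equal(fixed_point_to_affine(s, zero_point=128).astype(np.uint8), q)
```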
## Usage within MLIR
There are several components to the quantization system being developed within MLIR:
*   Quantization dialect containing:

    *   A family of QuantizedTypes which represent the mapping between expressed values (typically of a floating point computer type) and storage values (typically of an integral computer type).
    *   Type conversion operations for converting between types based on a QuantizedType and its expressed and storage sub-types.
    *   Instrumentation operations for assigning instrumentation points within the computation where runtime statistics may help guide the quantization process.

*   Integration with simulated quantization at training time.

*   TFLite native quantization:

    *   The TFLite op-set natively supports uniform-quantized variants.
    *   Passes and tools exist to convert directly from the TensorFlow dialect to the TFLite quantized operation set.
Not every application of quantization will use all of these facilities. Specifically, the TensorFlow to TensorFlow Lite conversion uses QuantizedTypes but has its own operations for type conversion and expression of the supporting math.
### Quantization Dialect
#### Quantized type
TODO: Flesh this section out.
*   QuantizedType base class
*   UniformQuantizedType
#### Quantized type conversion operations
*   `qcast` : Convert from an expressed type to QuantizedType
*   `dcast` : Convert from a QuantizedType to its expressed type
*   `scast` : Convert between a QuantizedType and its storage type
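To make the intended semantics of these conversions concrete, here is a hedged Python sketch modeling a uniform quantized type and the three casts. The class and function names are invented for this example and do not correspond to any MLIR API:

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class UniformQuantizedType:
    """Toy stand-in for a uniform QuantizedType: uint8 storage, float32 expressed."""
    scale: float
    zero_point: int
    storage_min: int = 0
    storage_max: int = 255

def qcast(real, t):
    """Expressed type -> QuantizedType: scale, round, clamp to the storage range."""
    q = np.round(real / t.scale) + t.zero_point
    return np.clip(q, t.storage_min, t.storage_max).astype(np.uint8)

def dcast(q, t):
    """QuantizedType -> expressed type: the affine-to-real formula from above."""
    return (q.astype(np.int32) - t.zero_point).astype(np.float32) * np.float32(t.scale)

def scast(q):
    """QuantizedType <-> storage type: a pure reinterpretation, no arithmetic."""
    return q  # same bits; only the static type changes in the IR

t = UniformQuantizedType(scale=0.1, zero_point=128)
x = np.array([-1.0, 0.0, 1.0], dtype=np.float32)
print(dcast(qcast(x, t), t))  # -> [-1.  0.  1.] (up to quantization error)
```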
#### Instrumentation and constraint operations
*   `const_fake_quant` : Emulates the logic of the historic TensorFlow fake_quant_with_min_max_args operation.
*   `stats_ref` : Declares that statistics should be gathered at this point with a unique key and made available to future passes of the solver.
*   `stats` : Declares inline statistics (per layer and per axis) for the point in the computation. stats_ref ops are generally converted to statistical operations once trial runs have been performed.
*   `coupled_ref` : Declares points in the computation to be coupled from a type inference perspective based on a unique key.
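As an illustration of the kind of data these instrumentation points are meant to capture, the following sketch computes per-layer and per-axis min/max ranges. This is only a plausible model of such statistics, not the actual representation used by the dialect:

```python
import numpy as np

def layer_stats(activations, axis=None):
    """Collect min/max statistics for one instrumentation point.

    axis=None models per-layer statistics; an integer axis models per-axis
    statistics along that dimension (e.g. per output channel).
    """
    if axis is None:
        return float(activations.min()), float(activations.max())
    reduce_dims = tuple(d for d in range(activations.ndim) if d != axis)
    return activations.min(axis=reduce_dims), activations.max(axis=reduce_dims)

acts = np.random.randn(8, 4).astype(np.float32)  # e.g. a batch of activations
print(layer_stats(acts))          # per-layer (min, max)
print(layer_stats(acts, axis=1))  # per-axis min/max arrays, one entry per channel
```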
### Integration with simulated quantization at training time
TensorFlow has historically used the tf.quantization.fake_quant_* family of operations to simulate the effect of quantization at training time.
As originally implemented, TensorFlow Lite was the primary consumer of such operations at inference time. When quantized inference was enabled, if every eligible tensor passed through an appropriate fake_quant node (the rules of which tensors can have fake_quant applied are somewhat involved), then TensorFlow Lite would use the attributes of the fake_quant operations to make a judgment about how to convert to use kernels from its quantized operations subset.
In MLIR-based quantization, fake_quant_* operations are handled by converting them to a sequence of *qcast* (quantize) followed by *dcast* (dequantize) with an appropriate *UniformQuantizedType* as the target of the *qcast* operation.
This allows subsequent compiler passes to preserve the knowledge that quantization was simulated in a certain way, while giving the compiler flexibility to move the casts as it simplifies the computation and converts it to a form based on integral arithmetic. This scheme also naturally allows computations that are partially quantized, where the parts that could not be reduced to integral operations are still carried out in floating point with appropriate conversions at the boundaries.
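The numeric equivalence being exploited here is that a fake_quant node computes the same value as a quantize immediately followed by a dequantize. A minimal sketch, reusing the illustrative `qcast`/`dcast` helpers and `UniformQuantizedType` from the conversion-operations section above (none of these names are real APIs):

```python
import numpy as np
# Assumes qcast, dcast and UniformQuantizedType from the earlier sketch.

def fake_quant(real, t):
    """Models fake_quant as dcast(qcast(x)): quantize then immediately
    dequantize, so values stay floating point but carry quantization error."""
    return dcast(qcast(real, t), t)

t = UniformQuantizedType(scale=0.1, zero_point=128)
x = np.array([0.03, 0.07, 1.0], dtype=np.float32)
print(fake_quant(x, t))  # -> [0.  0.1 1. ] : inputs snapped to the quantized grid
```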
### TFLite native quantization
TODO: Flesh this out
#### General algorithm
1.  Take input min/max information and set the ArrayInfo (which really is InputOrOutputArrayInfo).
2.  In LegalizeTF, convert ArrayInfo min/max to tf.Quantize and tf.Dequantize nodes (or tf.FakeQuant). Convert all constant FakeQuants to (tf.FQ -> tfl.Q -> tfl.DQ).
3.  Hardcode logic/propagation needs to happen here.
4.  Run TF constant folding.
5.  In PrepareTFL, convert all tf.FQ to (tfl.Q -> tfl.DQ).
6.  Run the quantization pass that takes (tfl.DQ (for both input and weights) -> op -> tfl.Q) and replaces it with (op). Also replace (constant_float -> tfl.Q) with (constant_quant).
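A rough sketch of the core rewrite in the final step, over a toy graph of nested (op, operands) dicts. This is purely illustrative pseudocode made runnable, not the actual TFLite pass, and the op name `conv2d` is a hypothetical placeholder:

```python
def fuse_dq_op_q(node):
    """Toy rewrite of (tfl.DQ -> op -> tfl.Q) into a single quantized op.

    Nodes are dicts such as {"op": "tfl.Q", "input": {...}}. When a tfl.Q
    consumes an op whose operands all come through tfl.DQ, drop the casts
    and mark the op itself as operating on quantized tensors.
    """
    if node.get("op") != "tfl.Q":
        return node
    inner = node["input"]
    if inner.get("args") and all(a["op"] == "tfl.DQ" for a in inner["args"]):
        return {"op": inner["op"],
                "args": [a["input"] for a in inner["args"]],
                "quantized": True}
    return node

# Hypothetical graph: quantized input and weights are dequantized, fed to a
# float op, and the result re-quantized; the pass collapses this to one op.
graph = {"op": "tfl.Q",
         "input": {"op": "conv2d",
                   "args": [{"op": "tfl.DQ", "input": {"op": "input_q"}},
                            {"op": "tfl.DQ", "input": {"op": "weights_q"}}]}}
print(fuse_dq_op_q(graph))
```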