Step 3. Main Transformations
Main transformations make up the majority of low precision transformations. They operate on dequantization operations. Main transformations include:
Let’s explore some main transformations on the example model. Original model:
Result model after main transformations:
Changes in the example model after main transformations:
All FakeQuantize operations (fakeQuantize1, fakeQuantize2 and fakeQuantize3) were decomposed: each original FakeQuantize operation was replaced with a new FakeQuantize operation that has a different output interval and output port precision, followed by dequantization operations.
Dequantization operations were moved through precision-preserved (concat1 and concat2) and quantized (convolution2) operations.
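The decomposition described above can be illustrated with a minimal scalar sketch. It assumes the usual FakeQuantize formula and an unsigned 8-bit target interval of [0, 255]; the helper names (`fake_quantize`, `decompose_fake_quantize`, `dequantize`) are hypothetical and are not part of the OpenVINO API. The actual transformation chooses the output precision and interval from the restrictions of the target operations, so treat this only as the arithmetic idea behind it.

```python
def fake_quantize(x, in_low, in_high, out_low, out_high, levels=256):
    """Scalar FakeQuantize: clamp, snap to one of `levels` steps, rescale."""
    if x <= min(in_low, in_high):
        return out_low
    if x > max(in_low, in_high):
        return out_high
    step = round((x - in_low) / (in_high - in_low) * (levels - 1))
    return step / (levels - 1) * (out_high - out_low) + out_low


def decompose_fake_quantize(in_low, in_high, out_low, out_high, levels=256):
    """Split one FakeQuantize into a new FakeQuantize plus dequantization.

    Assumption: the new operation outputs the integer grid [0, levels - 1]
    (u8 for 256 levels). Returns the new parameters and (shift, scale).
    """
    scale = (out_high - out_low) / (levels - 1)
    shift = -out_low / scale  # value subtracted before multiplying
    new_params = (in_low, in_high, 0.0, float(levels - 1), levels)
    return new_params, shift, scale


def dequantize(q, shift, scale):
    """Dequantization modeled as Subtract followed by Multiply."""
    return (q - shift) * scale


# Hypothetical original operation: input and output interval [-1, 1].
original = (-1.0, 1.0, -1.0, 1.0, 256)
new_params, shift, scale = decompose_fake_quantize(*original)

x = 0.3
quantized = fake_quantize(x, *new_params)        # value on the [0, 255] grid
restored = dequantize(quantized, shift, scale)   # matches the original output
reference = fake_quantize(x, *original)
```

In this sketch `restored` and `reference` agree up to floating-point rounding, which is the property that lets the dequantization operations be moved further down the graph without changing the result.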
Note
The left branch (branch #1) does not require per-tensor quantization. As a result, the fakeQuantize1 output interval is [0, 255]. However, the quantized convolution2 requires per-tensor quantization on the right branch (branch #2). Therefore, the intervals of all connected FakeQuantize operations (fakeQuantize1 and fakeQuantize2) are aligned so that quantization is per-tensor after the concatenation (concat2) operation.
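To make the alignment in the note more concrete, here is a small sketch of one way to give several FakeQuantize operations a single shared dequantization. It assumes the shared interval is the union of the individual output intervals and that the target grid is [0, 255]; the function names and the sample intervals are hypothetical, and this is not a description of how OpenVINO implements the alignment.

```python
def shared_dequantization(intervals, levels=256):
    """Compute one (scale, shift) pair covering every interval.

    Assumption: the shared interval is the union [min low, max high].
    """
    low = min(lo for lo, _ in intervals)
    high = max(hi for _, hi in intervals)
    scale = (high - low) / (levels - 1)
    shift = -low / scale
    return scale, shift


def aligned_output_interval(interval, scale, shift):
    """Map a real-valued interval onto the shared quantized grid."""
    lo, hi = interval
    return lo / scale + shift, hi / scale + shift


# Hypothetical output intervals of two FakeQuantize operations
# that feed the same concatenation.
fq1_interval = (-1.0, 1.0)
fq2_interval = (0.0, 0.5)

scale, shift = shared_dequantization([fq1_interval, fq2_interval])
fq1_aligned = aligned_output_interval(fq1_interval, scale, shift)
fq2_aligned = aligned_output_interval(fq2_interval, scale, shift)
```

With a single scale and shift, the concatenated tensor can be dequantized per tensor. The trade-off in this sketch is that the operation with the narrower interval uses only part of the [0, 255] grid, so it is represented with fewer distinct levels.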