ncnn: why doesn't the quantization tool decrease the model size by 75%?

1bqhqjot · posted 2022-12-31

First, thank you so much for this awesome project and for releasing it to the public.

I am trying model quantization following your quantization guidelines
https://github.com/Tencent/ncnn/tree/master/tools/quantize

but after finishing all the steps without a single error, the model size doesn't decrease at all. Checking ncnn::Mat::elemsize, the weights did change from float32 to int8, so the quantization seems to work to some degree.

What's the problem? Did I do something wrong?

xfb7svmp1#

I ran into the same problem.

example.param and example.bin are the initial ncnn model files.

./ncnnoptimize example.param example.bin example-nobn-fp32.param example-nobn-fp32.bin 85535
./ncnn2table --param example-nobn-fp32.param --bin example-nobn-fp32.bin --images images/ --output example-nobn.table --mean 104,117,123 --norm 0.017,0.017,0.017 --size 224,224 --thread 2
./ncnn2int8 example-nobn-fp32.param example-nobn-fp32.bin example-int8.param mobilenet-int8.bin

After ncnnoptimize, the model size drops to 883 KB, but after ncnn2int8 quantization it grows back to 1.8 MB.

nlejzf6q2#

https://github.com/Tencent/ncnn/wiki/quantized-int8-inference
According to the description there, this method may be using "the runtime way, no model binary reduction", which would explain why the model size does not go down.

l7wslrjt3#

(quoting the commands from xfb7svmp1# above)

I don't see you specifying the table file in ./ncnn2int8 example-nobn-fp32.param example-nobn-fp32.bin example-int8.param mobilenet-int8.bin, so the weights were probably not saved as 8-bit. Did you see printouts like "quantize_convolution layer_name" during the conversion?
Also, I noticed your output file is mobilenet-int8.bin, but I don't see it in the screenshot.

vx6bjr1n4#

@lexuszhi1990
This explanation is correct. @MambaWong

qfe3c7zg5#

@MambaWong
Could you please check my case?
I used a torchvision model and converted it to .param/.bin files via ONNX
( https://github.com/Tencent/ncnn/wiki/use-ncnn-with-pytorch-or-onnx )

./ncnnoptimize squeezenet.param squeezenet.bin squeezenet-nobn.param squeezenet-nobn.bin 0
./ncnn2table --param squeezenet-nobn.param --bin squeezenet-nobn.bin --images ../../samples/ --output squeezenet.table --mean 123,117,104 --norm 0.017,0.017,0.017 --size 224,224 --thread 2
./ncnn2int8 squeezenet-nobn.param squeezenet-nobn.bin squeezenet-quant.param squeezenet-quant.bin squeezenet.table

mlnl4t2r6#

Hello, have you solved this problem? I am running into it as well. @bocharm
