I'm very new to LLM serving and quantization, so any pointers would be greatly appreciated. I'm trying to use AutoAWQ to quantize my model. I have the following packages installed:
Package Version
------------------ ------------
absl-py 2.0.0
accelerate 0.24.1
aiohttp 3.9.0
aiosignal 1.3.1
annotated-types 0.6.0
anyio 3.7.1
async-timeout 4.0.3
attributedict 0.3.0
attrs 23.1.0
autoawq 0.1.7
blessings 1.7
cachetools 5.3.2
certifi 2022.12.7
chardet 5.2.0
charset-normalizer 2.1.1
click 8.1.7
codecov 2.1.13
colorama 0.4.6
coloredlogs 15.0.1
colour-runner 0.1.1
coverage 7.3.2
DataProperty 1.0.1
datasets 2.15.0
deepdiff 6.7.1
dill 0.3.7
distlib 0.3.7
distro 1.8.0
exceptiongroup 1.1.3
filelock 3.9.0
frozenlist 1.4.0
fsspec 2023.4.0
h11 0.14.0
httpcore 1.0.2
httpx 0.25.1
huggingface-hub 0.19.4
humanfriendly 10.0
idna 3.4
inspecta 0.1.3
Jinja2 3.1.2
joblib 1.3.2
jsonlines 4.0.0
lm-eval 0.3.0
MarkupSafe 2.1.3
mbstrdecoder 1.1.3
mpmath 1.3.0
multidict 6.0.4
multiprocess 0.70.15
networkx 3.0
nltk 3.8.1
numexpr 2.8.6
numpy 1.24.1
openai 1.3.3
ordered-set 4.1.0
packaging 23.2
pandas 2.0.3
pathvalidate 3.2.0
Pillow 9.3.0
pip 19.3.1
platformdirs 4.0.0
pluggy 1.3.0
portalocker 2.8.2
protobuf 4.25.1
psutil 5.9.6
pyarrow 14.0.1
pyarrow-hotfix 0.5
pybind11 2.11.1
pycountry 22.3.5
pydantic 2.5.1
pydantic-core 2.14.3
pygments 2.17.1
pyproject-api 1.6.1
pytablewriter 1.2.0
python-dateutil 2.8.2
pytz 2023.3.post1
PyYAML 6.0.1
regex 2023.10.3
requests 2.28.1
rootpath 0.1.1
rouge-score 0.1.2
sacrebleu 1.5.0
safetensors 0.4.0
scikit-learn 1.3.2
scipy 1.10.1
sentencepiece 0.1.99
setuptools 41.6.0
six 1.16.0
sniffio 1.3.0
sqlitedict 2.1.0
sympy 1.12
tabledata 1.3.3
tabulate 0.9.0
tcolorpy 0.1.4
termcolor 2.3.0
texttable 1.7.0
threadpoolctl 3.2.0
tokenizers 0.15.0
toml 0.10.2
tomli 2.0.1
torch 2.1.1+cu118
torchaudio 2.1.1+cu118
torchvision 0.16.1+cu118
tox 4.11.3
tqdm 4.66.1
tqdm-multiprocess 0.0.11
transformers 4.35.2
triton 2.1.0
typepy 1.3.2
typing-extensions 4.4.0
tzdata 2023.3
urllib3 1.26.13
virtualenv 20.24.6
xxhash 3.4.1
yarl 1.9.2
zstandard 0.22.0
I tried running the example code from https://github.com/casper-hansen/AutoAWQ:
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer
model_path = 'lmsys/vicuna-7b-v1.5'
quant_path = 'vicuna-7b-v1.5-awq'
quant_config = { "zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM" }
# Load model
model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
# Quantize
model.quantize(tokenizer, quant_config=quant_config)
# Save quantized model
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
But I get the following error:
/usr/test3/lib64/python3.8/site-packages/huggingface_hub/utils/_runtime.py:184: UserWarning: Pydantic is installed but cannot be imported. Please check your installation. `huggingface_hub` will default to not using Pydantic. Error message: '{e}'
warnings.warn(
Traceback (most recent call last):
File "quant.py", line 1, in <module>
from awq import AutoAWQForCausalLM
File "/usr/test3/lib64/python3.8/site-packages/awq/__init__.py", line 2, in <module>
from awq.models.auto import AutoAWQForCausalLM
File "/usr/test3/lib64/python3.8/site-packages/awq/models/__init__.py", line 1, in <module>
from .mpt import MptAWQForCausalLM
File "/usr/test3/lib64/python3.8/site-packages/awq/models/mpt.py", line 1, in <module>
from .base import BaseAWQForCausalLM
File "/usr/test3/lib64/python3.8/site-packages/awq/models/base.py", line 12, in <module>
from awq.quantize.quantizer import AwqQuantizer
File "/usr/test3/lib64/python3.8/site-packages/awq/quantize/quantizer.py", line 11, in <module>
from awq.modules.linear import WQLinear_GEMM, WQLinear_GEMV
File "/usr/test3/lib64/python3.8/site-packages/awq/modules/linear.py", line 4, in <module>
import awq_inference_engine # with CUDA kernels
ImportError: libcudart.so.12: cannot open shared object file: No such file or directory
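For completeness, here is a quick way to check which CUDA runtime the installed PyTorch build expects (a minimal sketch; the values in the comments are inferred from the package list above, not captured output):

# torch 2.1.1+cu118 should report a CUDA 11.8 build, while the AWQ
# extension above is asking for libcudart.so.12, i.e. a CUDA 12.x runtime.
import torch

print(torch.__version__)         # expected: 2.1.1+cu118
print(torch.version.cuda)        # expected: 11.8
print(torch.cuda.is_available())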
Here is my NVIDIA configuration (nvidia-smi output):
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.61.05 Driver Version: 520.61.05 CUDA Version: 11.8 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA A10 Off | 00000000:17:00.0 Off | 0 |
| 0% 40C P0 59W / 150W | 18106MiB / 23028MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 NVIDIA A10 Off | 00000000:31:00.0 Off | 0 |
| 0% 28C P8 21W / 150W | 2MiB / 23028MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 2 NVIDIA A10 Off | 00000000:B1:00.0 Off | 0 |
| 0% 26C P8 20W / 150W | 2MiB / 23028MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 3 NVIDIA A10 Off | 00000000:CA:00.0 Off | 0 |
| 0% 26C P8 20W / 150W | 2MiB / 23028MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
+-----------------------------------------------------------------------------+
And here is the nvcc --version output:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2021 NVIDIA Corporation
Built on Sun_Feb_14_21:12:58_PST_2021
Cuda compilation tools, release 11.2, V11.2.152
Build cuda_11.2.r11.2/compiler.29618528_0
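So nvcc reports the locally installed toolkit (11.2), while nvidia-smi reports the highest CUDA version the driver supports (11.8); neither provides the CUDA 12 runtime the wheel is asking for. A hedged way to check directly which libcudart sonames the process can resolve (assuming the usual soname conventions: libcudart.so.11.0 for CUDA 11.x, libcudart.so.12 for CUDA 12.x):

import ctypes
import torch  # importing torch first loads its bundled CUDA 11.8 runtime libraries

for soname in ("libcudart.so.12", "libcudart.so.11.0"):
    try:
        ctypes.CDLL(soname)
        print(soname, "-> loadable")
    except OSError as err:
        print(soname, "->", err)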
1 Answer
I recently used AWQ on Runpod and hit the same problem. There, nvidia-smi reported CUDA version 12.3 by default. The prebuilt kernels have to match the CUDA runtime that PyTorch was built against, so I solved it by installing the libraries with these commands:
!pip -q install --upgrade fschat accelerate autoawq vllm
!pip install torch==2.1.0+cu121 torchvision==0.16.0+cu121 torchaudio==2.1.0 torchtext==0.16.0+cpu torchdata==0.7.0 --index-url https://download.pytorch.org/whl/cu121
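After reinstalling, a quick sanity check (a minimal sketch; awq_inference_engine is the extension module named in the traceback above):

# The CUDA version torch was built with should match the +cuXXX suffix
# of the wheels installed above.
import torch

print(torch.__version__)   # e.g. 2.1.0+cu121
print(torch.version.cuda)  # e.g. 12.1

# The kernel extension from the traceback should now import cleanly.
import awq_inference_engine
print("awq_inference_engine imported OK")

Note that the machine in the question reports a driver that supports at most CUDA 11.8 (NVIDIA-SMI 520.61.05), so the +cu121 route also requires a driver upgrade; the alternative is a torch/AutoAWQ combination built for CUDA 11.8.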
Link to the solution: https://github.com/vllm-project/vllm/issues/1718