Bug 7065 - pytorch unusable
Summary: pytorch unusable
Status: RESOLVED FIXED
Alias: None
Product: Anolis OS 23
Classification: Anolis OS
Component: BaseOS Packages
Version: 23.0
Hardware: All Linux
Importance: P3-Medium S3-normal
Target Milestone: ---
Assignee: xuchunmei
QA Contact: bolong_tbl
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-10-27 16:55 UTC by zhongling
Modified: 2023-11-15 17:13 UTC

Description zhongling 2023-10-27 16:55:07 UTC
Description of problem:

PyTorch is unusable on aarch64 machines: import torch fails because the CUDA shared libraries cannot be found.

Version-Release number of selected component (if applicable):

pytorch-2.0.1-3.an23.aarch64

How reproducible:

```
yum install -y pytorch
python3 -c 'import torch'
```

Actual results:

```
Traceback (most recent call last):
  File "/usr/lib64/python3.10/site-packages/torch/__init__.py", line 168, in _load_global_deps
    ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
  File "/usr/lib64/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcurand.so.10: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib64/python3.10/site-packages/torch/__init__.py", line 228, in <module>
    _load_global_deps()
  File "/usr/lib64/python3.10/site-packages/torch/__init__.py", line 189, in _load_global_deps
    _preload_cuda_deps(lib_folder, lib_name)
  File "/usr/lib64/python3.10/site-packages/torch/__init__.py", line 154, in _preload_cuda_deps
    raise ValueError(f"{lib_name} not found in the system path {sys.path}")
ValueError: libcublas.so.*[0-9] not found in the system path ['', '/usr/lib64/python310.zip', '/usr/lib64/python3.10', '/usr/lib64/python3.10/lib-dynload', '/usr/lib64/python3.10/site-packages', '/usr/lib/python3.10/site-packages']
```

Expected results:


Additional info:

The CUDA library path is set to the x86_64 target directory, even on aarch64:
```
cat /etc/ld.so.conf.d/000_cuda.conf
/usr/local/cuda/targets/x86_64-linux/lib
```
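
A minimal sketch of how such a mismatch could be detected automatically, assuming the CUDA entries follow the targets/<arch>-linux layout seen in the path above. The function name `mismatched_dirs`, the `KNOWN_ARCHES` set, and the `*cuda*.conf` glob are illustrative choices, not part of any package:

```python
import glob
import platform
import re

# Assumption: CUDA library directories look like .../targets/<arch>-linux/...
TARGET_RE = re.compile(r"/targets/([A-Za-z0-9_]+)-linux(?:/|$)")

# Hypothetical list of tokens treated as machine architectures. Tokens not
# listed here are ignored rather than flagged.
KNOWN_ARCHES = {"x86_64", "aarch64", "ppc64le"}


def mismatched_dirs(conf_text, machine=None):
    """Return directories whose targets/<arch>-linux part names another architecture."""
    machine = machine or platform.machine()
    bad = []
    for line in conf_text.splitlines():
        path = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not path:
            continue
        match = TARGET_RE.search(path)
        if match and match.group(1) in KNOWN_ARCHES and match.group(1) != machine:
            bad.append(path)
    return bad


if __name__ == "__main__":
    # Report suspicious entries in the CUDA-related ld.so configuration files.
    for conf in sorted(glob.glob("/etc/ld.so.conf.d/*cuda*.conf")):
        with open(conf) as fh:
            for path in mismatched_dirs(fh.read()):
                print(f"{conf}: {path}")
```

On an affected aarch64 machine this would be expected to list the x86_64 path shown above. It only inspects the configuration text and does not check whether the directories exist.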
Comment 1 xuchunmei alibaba_cloud_group 2023-11-15 17:13:42 UTC
On aarch64, the directories listed in these two configuration files are wrong, which causes the problem:
/etc/ld.so.conf.d/988_cuda-12.conf
/etc/ld.so.conf.d/000_cuda.conf
This has been fixed in cuda-12.1.1-5.an23.
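
A small sketch for confirming the fix after updating the cuda package, assuming the dynamic loader cache has been refreshed (for example by ldconfig). It attempts the same kind of dlopen that torch performs in the traceback. The helper name `can_dlopen` is hypothetical, and the library name is taken from the error message in the description:

```python
import ctypes


def can_dlopen(soname):
    """Return True if the dynamic loader can resolve and load soname."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # libcurand.so.10 is the library reported missing in the description.
    name = "libcurand.so.10"
    print(f"{name}: {'found' if can_dlopen(name) else 'NOT found'}")
```

If the library is still reported as not found, the entries under /etc/ld.so.conf.d/ are the first thing to recheck.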