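On second thought, I'm keeping the previous post as notes and starting a new one for PyTorch 2.9.1.
These notes record the combination I got working in my own experiments: Ubuntu 24.04.3, ROCm 6.4.4, PyTorch 2.9.1, targeting gfx803 and gfx900. Everything below can be compiled inside a virtual machine; no actual AMD GPU is required. I use ROCm 6.4.4 rather than 7.1.1 or 7.2.0 because it is the last ROCm release that supports gfx803 (RX 460/470/480/560/570/580) and also the last one that supports WSL, so this post sticks to ROCm 6.4.4 (the final 6.x release) and PyTorch 2.9.1. As in the previous post, whichever PyTorch version you build, check which ROCm version it pairs with.
ps. ROCm 7.1.1 can support APUs from the Ryzen 3000 series onward (codename gfx902), but I have not found a way to get it to run code.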
# Add entries to the environment variables; they take effect at the next login. ps. Putting them under /etc/environment.d/ had no effect, for reasons unknown.
# Add GFX803 related variables
echo "ROC_ENABLE_PRE_VEGA=1" | sudo tee -a /etc/environment
echo "HSA_OVERRIDE_GFX_VERSION=8.0.3" | sudo tee -a /etc/environment
Install Miniconda 3. Run conda init as your login user ($LOGNAME), and take careful note of the name rocm-build; it is used a lot later on.
Miniconda 3
# Download and Install Miniconda 3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -P /tmp/
sudo bash /tmp/Miniconda3-latest-Linux-x86_64.sh -b -p /opt/conda
sudo sed -i 's|^PATH="|PATH="/opt/conda/bin:|' /etc/environment
# Logout to apply environment
exit
# Logged in
conda init
source ~/.bashrc
conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/main
conda tos accept --override-channels --channel https://repo.anaconda.com/pkgs/r
conda create -n rocm-build -y python=3.12
conda activate rocm-build
Install Prerequisites
sudo apt install -y build-essential ccache git libjpeg-dev \
libjpeg-turbo8-dev libpng-dev libmsgpack-dev libssl-dev \
python3-virtualenv libboost-dev libboost1.83-dev libmsgpack-cxx-dev \
ninja-build
ROCm 6.4.4
wget https://repo.radeon.com/rocm/rocm.gpg.key -O - | gpg --dearmor \
| sudo tee /etc/apt/keyrings/rocm.gpg > /dev/null
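echo "deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] \
https://repo.radeon.com/rocm/apt/6.4.4 noble main" | sudo tee \
--append /etc/apt/sources.list.d/rocm.list
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' \
| sudo tee /etc/apt/preferences.d/rocm-pin-600
sudo apt update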
sudo apt install rocm rocm-developer-tools rocm-ml-sdk \
rocm-ml-libraries rocm-hip-sdk rocm-hip-libraries
exit
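Since everything later depends on the ROCm version matching, it may be worth confirming what actually got installed after logging back in. This is only a minimal sketch: it assumes the packages record their version in `<root>/.info/version`, and `rocm_version` is a helper name I made up for illustration.

```shell
# Sketch: print the ROCm version recorded under an install root.
# Assumption: the version is stored in <root>/.info/version,
# possibly with a build suffix after a dash (e.g. "6.4.4-NNN").
rocm_version() {
    root="${1:-/opt/rocm}"
    file="${root}/.info/version"
    if [ -r "${file}" ]; then
        # Keep only the part before the first dash.
        cut -d- -f1 "${file}"
    else
        echo "unknown"
        return 1
    fi
}
```

Running `rocm_version` with no argument looks under /opt/rocm; for this post the expected answer would be 6.4.4.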
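If you want to change which ROCm version gets installed at this point, edit the version number and the Ubuntu release codename (noble here) in /etc/apt/sources.list.d/rocm.list:
#sudo nano /etc/apt/sources.list.d/rocm.list
ps. As of writing, I have not yet seen support for Ubuntu 26.04.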
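The same edit can be scripted instead of done by hand in nano. The following is a rough sketch, not something taken from the AMD docs: `set_rocm_repo` is a hypothetical helper, and it assumes the source line has the form `.../rocm/apt/<version> <codename> main` and that GNU sed is available (as on Ubuntu).

```shell
# Sketch: rewrite the ROCm version and Ubuntu codename in an apt source file.
# Usage: set_rocm_repo <rocm-version> <ubuntu-codename> [file]
set_rocm_repo() {
    ver="$1"
    codename="$2"
    file="${3:-/etc/apt/sources.list.d/rocm.list}"
    # Replace whatever version/codename currently follows .../rocm/apt/
    sed -E -i \
        "s#(repo\.radeon\.com/rocm/apt/)[^ ]+ [^ ]+ main#\1${ver} ${codename} main#" \
        "${file}"
}
```

Because the real file is owned by root, one way to use this is to run it on a copy (for example `set_rocm_repo 6.4.3 jammy /tmp/rocm.list`, with the version and codename purely illustrative), copy the result back with sudo, and then run `sudo apt update`.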
rocBLAS
conda activate rocm-build
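# Download rocBLAS
sudo mkdir /opt/rocBLAS
sudo chown $LOGNAME: /opt/rocBLAS
git clone --recursive https://github.com/ROCm/rocBLAS.git -b \
rocm-6.4.4 /opt/rocBLAS
cd /opt/rocBLAS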
pip install "cmake<4.0" joblib pyyaml virtualenv \
typing-extensions
time ./install.sh -a "gfx803;gfx900;gfx90a;gfx942;gfx1100" \
-b rocm-6.4.4
sudo rsync -vrh /opt/rocBLAS/build/release/rocblas-install\
/lib/rocblas/library/ /opt/rocm/lib/rocblas/library/
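After the rsync, a quick sanity check is to see whether the library directory now holds files for the architectures you care about. This sketch assumes the per-architecture Tensile files carry the architecture name (such as gfx803) somewhere in their file names; `count_arch_files` is a made-up helper name.

```shell
# Sketch: count files in a rocBLAS library directory whose names
# mention a given GPU architecture.
# Usage: count_arch_files <dir> <arch>
count_arch_files() {
    dir="$1"
    arch="$2"
    # Only look at regular files directly inside the directory.
    find "${dir}" -maxdepth 1 -type f -name "*${arch}*" | wc -l | tr -d ' '
}
```

For example, `count_arch_files /opt/rocm/lib/rocblas/library/ gfx803` printing 0 would suggest the gfx803 build output did not get copied.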
rocSOLVER
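ps. Only from ROCm 7.0 onward can rocSOLVER be given a Tensile version; for 7.0 and later I recommend passing an extra -b to install.sh to specify the Tensile version, the same as for rocBLAS.
(update: specific to ROCm 7.0 and above) rocSOLVER builds in an odd way: it is best to connect to an SSH console specifically and run install.sh from there, otherwise it does nothing.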
# Download rocSOLVER
sudo mkdir /opt/rocSOLVER
sudo chown $LOGNAME: /opt/rocSOLVER
git clone --recursive https://github.com/ROCm/rocSOLVER.git -b \
rocm-6.4.4 /opt/rocSOLVER
cd /opt/rocSOLVER
# Build rocSOLVER
time ./install.sh -a "gfx803;gfx900;gfx90a;gfx942;gfx1100"
# Copy compiled library
SRC=$(sudo find \
/opt/rocSOLVER/build/release/rocsolver-install/lib/ \
-type f -name "librocsolver.so.*")
TGT=$(sudo find /opt/rocm/ -type f -name "librocsolver.so*")
LIST=$(find /opt/rocm/ -type l -name "librocsolver.so*")
sudo cp -f ${SRC} ${TGT}
for d in ${LIST}; do sudo ln -sf ${TGT} ${d}; done
# Reboot only if you built on the target machine.
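Because the copy step rewrites the librocsolver symlinks, it can help to confirm afterwards that none of them dangle. A minimal sketch, with `broken_solver_links` being a hypothetical helper name of my own:

```shell
# Sketch: list librocsolver symlinks under a root directory that do
# not resolve to a regular file. Prints nothing when all links are fine.
# Usage: broken_solver_links [root]
broken_solver_links() {
    root="${1:-/opt/rocm/}"
    find "${root}" -type l -name "librocsolver.so*" | while read -r link; do
        # [ -f ] follows the symlink, so it fails for dangling links.
        [ -f "${link}" ] || echo "${link}"
    done
}
```

An empty result from `broken_solver_links` suggests every link points at a real library file.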
PyTorch 2.9.1
# Re-enable conda environment
conda activate rocm-build
# Download PyTorch
sudo mkdir /opt/pytorch
sudo chown $LOGNAME: /opt/pytorch
git clone --recursive https://github.com/pytorch/pytorch.git -b v2.9.1 /opt/pytorch
cd /opt/pytorch
# Install required packages
pip install mkl-static mkl-include -r requirements.txt
# Build PyTorch
export PYTORCH_ROCM_ARCH="gfx803;gfx900"
export PYTORCH_BUILD_VERSION=2.9.1 PYTORCH_BUILD_NUMBER=1
python tools/amd_build/build_amd.py
time python setup.py bdist_wheel
# Install PyTorch
pip install /opt/pytorch/dist/torch-2.9.1-cp312-cp312-linux_x86_64.whl
ps1. For torchVision, build version 0.24.1; its version numbers are paired with PyTorch's as well. https://github.com/pytorch/vision
ps2. The matching torchAudio version is 2.9.1.
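The wheel file name in the install step depends on both the PyTorch version and the Python version of the conda environment, so it changes if either differs from this post. The sketch below derives the expected name and prints some version information from the installed build; `expected_wheel` and `verify_torch` are helper names I made up, and the sketch assumes an x86_64 Linux build.

```shell
# Sketch: derive the wheel file name the build is expected to produce.
# Usage: expected_wheel <torch-version> <python-major.minor>
expected_wheel() {
    # "3.12" -> "cp312"
    tag="cp$(echo "$2" | tr -d '.')"
    echo "torch-$1-${tag}-${tag}-linux_x86_64.whl"
}

# Print the torch version, the HIP version it was built against, and
# whether a GPU is visible. torch.version.hip is None on builds
# without ROCm support.
verify_torch() {
    python -c 'import torch; print(torch.__version__, torch.version.hip, torch.cuda.is_available())'
}
```

Run `verify_torch` from a directory other than /opt/pytorch so Python does not pick up the source tree instead of the installed wheel. In a virtual machine without an AMD GPU the last value is expected to be False even when the build itself is fine.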
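After that, follow the notes at https://github.com/NULL0xFF/rocm-gfx803?tab=readme-ov-file.
I will post any further versions I get working.
update: Damn it, after three weeks of trying all sorts of combinations, it finally works....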
MNIST PyTorch example
Clone the PyTorch examples repository.
git clone https://github.com/pytorch/examples.git
Go to the MNIST example folder.
cd examples/mnist
The above is quoted from https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/3rd-party/pytorch-install.html
update2: ROCm 7.0 onward blocks support for gfx803.






