I want to install a GPU in my LX2160A for AI training, but I don't know how to install an NVIDIA GPU on the LX2160A.
Could someone tell me how to do it?
Currently we do not support Nvidia cards on the LX2160A, at least with Nvidia's proprietary binary driver. I have had some slight success with the new open kernel driver and Nvidia's latest binary driver release, but it is still very unstable and requires a Turing generation card or newer. We are hoping that Nvidia will soon provide better support for their hardware on AArch64 systems.
OK, thanks!
What about installing an AMD GPU card on the LX2160A? Could you give me some advice on that?
Honestly I haven't done much regarding training on AMD GPUs. In general you are going to need a Navi based card or newer so you can use ROCm on the LX2160A. The amdgpu kernel module currently does not support AArch64 for Navi and newer cards, as they require some floating point instructions in the kernel; the code needs to be reorganized so these functions are compiled with different flags and can run safely in the kernel without corrupting the userspace stack.
Most likely I will publish my patches for the nvidia binary driver first just because I already have a card that is working. The amdgpu kernel work is on my radar but GPU availability and pricing hasn’t made it a priority over the past year or so.
Have you already purchased a card, or are you looking to use hardware you already have?
Yes, I have some Nvidia GPUs, such as an RTX 3080 Ti, a P400, and an A100, but no AMD GPU.
When will you push your patch?
Not really sure. Most likely I will get back around to testing nvidia’s drivers more this weekend.
Nvidia released a driver update. I will test it and then post a "use at your own risk" how-to guide.
Hi Jon, any update on this?
I will try an RTX A2000 on a HoneyComb LX2 running Ubuntu Server 22.04.3.
Thanks.
Hi, I’ve recently replaced my main PC GPU and now have a spare NVidia GTX 1660Ti which I just installed into my Honeycomb LX2K.
Here are the steps I used on Ubuntu 22.04 LTS:
# Install server driver since this is a headless workstation,
# if using as workstation, install nvidia-driver-535-open (haven't tested this yet)
sudo apt install nvidia-driver-535-server-open
# Add configs
cat <<'EOF' | sudo tee /etc/modprobe.d/nvreg_fix.conf
options nvidia NVreg_OpenRmEnableUnsupportedGpus=1
EOF
cat <<'EOF' | sudo tee /etc/modprobe.d/nvidia.conf
options nvidia-drm modeset=1
EOF
# Blacklist the nouveau driver
cat <<'EOF' | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
blacklist nouveau
options nouveau modeset=0
EOF
# Update initramfs
sudo update-initramfs -u
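Optionally, before rebooting, you can sanity-check that the new modprobe configs actually made it into the initramfs (a quick sketch, assuming Ubuntu's default initramfs-tools):
# Both config files should show up in the listing
lsinitramfs /boot/initrd.img-$(uname -r) | grep -E 'nvreg_fix|blacklist-nouveau'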
Reboot so the new modules are loaded correctly. After rebooting, run nvidia-smi to confirm your system detected the GPU.
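If nvidia-smi comes up empty, a few standard checks can help narrow down where things stopped (nothing here is specific to this guide):
# Is the card enumerated on the PCIe bus?
lspci | grep -i nvidia
# Did the nvidia modules load, and is nouveau really gone?
lsmod | grep -iE 'nvidia|nouveau'
# Which driver build is actually in use?
cat /proc/driver/nvidia/version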
# Remove any conflicting or unofficial Docker packages first
for pkg in docker.io docker-doc docker-compose docker-compose-v2 podman-docker containerd runc; do sudo apt-get remove $pkg; done
# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
# Add the repository to Apt sources:
echo "deb [arch="$(dpkg --print-architecture)" signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
"$(. /etc/os-release && echo "$VERSION_CODENAME")" stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
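Before adding the GPU runtime, it's worth a quick smoke test that plain Docker works:
sudo docker run --rm hello-world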
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
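For reference, nvidia-ctk edits /etc/docker/daemon.json; after the configure step it should contain something along these lines (your file may have additional keys):
cat /etc/docker/daemon.json
# {
#     "runtimes": {
#         "nvidia": {
#             "args": [],
#             "path": "nvidia-container-runtime"
#         }
#     }
# }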
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
# Run MNIST training in Docker using the GPU
docker run --runtime=nvidia --gpus all -it --rm -v $(pwd):/work -w /work --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 nvcr.io/nvidia/pytorch:23.10-py3
# Note the capital -O: lowercase -o would send wget's log there instead of the file
wget https://github.com/pytorch/examples/raw/main/mnist/main.py -O mnist_pytorch.py
time python3 mnist_pytorch.py --epochs=1
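Before waiting on the full run, a one-liner inside the container confirms PyTorch actually sees the GPU (standard PyTorch API, nothing specific to this image):
# Expect "True" followed by the card name
python3 -c "import torch; print(torch.cuda.is_available()); print(torch.cuda.get_device_name(0))"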
Apologies if there is a newer thread on this topic; I didn't come across one. I'm wondering if there has been any progress on getting NVIDIA GPUs to play nice with this board. For my own use case I'm not concerned with pixels; I'd just like to be able to develop and debug CUDA/C++ code on the device.
If you are using the EFI firmware then there should be no issues using an Nvidia card with the newer open source kernel module and driver. This should also work with u-boot and device-tree but I have not specifically tested this use case.
Hi Jon, thanks for the confirmation. I finally got around to transferring my RTX 3060 Ti over to the Honeycomb. After multiple attempts I keep coming back to the same place, which is essentially:
> nvidia-smi
No devices were found
Previously, I had the recommended AMD Navi based GPU, so I removed the AMD drivers with sudo apt remove --purge "*amdgpu*" and dropped the AMD GPU related arguments from the kernel args. I've done the same with "*nvidia*" to ensure that I haven't got conflicting configurations hanging around, and I am very confident that I do not.
Currently I have installed 570 using the offline installer NVIDIA-Linux-aarch64-570.153.02.run, choosing the GPL/MIT option when asked which kernel modules to build. The installer completes and cheerfully reports success.
# nvidia-detector indicates this is the correct driver version, as does manually selecting the driver on nvidia.com
> /usr/bin/nvidia-detector
nvidia-driver-570
> nvidia-smi --version
NVIDIA-SMI version : 570.153.02
NVML version : 570.153
DRIVER version : 570.153.02
CUDA Version : 12.8
> lspci
0002:01:00.0 Non-Volatile memory controller: Phison Electronics Corporation E18 PCIe4 NVMe Controller (rev 01)
0004:01:00.0 VGA compatible controller: NVIDIA Corporation Device 2803 (rev a1)
0004:01:00.1 Audio device: NVIDIA Corporation Device 22bd (rev a1)
> lsmod | grep -i nvidia
nvidia_uvm 1658880 0
nvidia_drm 122880 0
nvidia_modeset 1847296 1 nvidia_drm
nvidia 11816960 2 nvidia_uvm,nvidia_modeset
drm_kms_helper 344064 1 nvidia_drm
drm 659456 4 drm_kms_helper,nvidia,nvidia_drm
> sudo dmesg | grep -i nvidia
[ 755.047626] nvidia: loading out-of-tree module taints kernel.
[ 755.095012] nvidia-nvlink: Nvlink Core is being initialized, major device number 505
[ 755.098892] nvidia 0004:01:00.0: Adding to iommu group 9
[ 755.100471] nvidia 0004:01:00.0: enabling device (0000 -> 0003)
[ 755.100498] nvidia 0004:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
[ 755.142981] NVRM: loading NVIDIA UNIX Open Kernel Module for aarch64 570.153.02 Release Build (dvs-builder@U22-I3-AF02-06-3) Tue May 13 16:16:49 UTC 2025
[ 755.235154] input: HDA NVidia HDMI/DP,pcm=3 as /devices/pci0004:01/0004:01:00.1/sound/card0/input1
[ 755.236439] input: HDA NVidia HDMI/DP,pcm=7 as /devices/pci0004:01/0004:01:00.1/sound/card0/input2
[ 755.237139] input: HDA NVidia HDMI/DP,pcm=8 as /devices/pci0004:01/0004:01:00.1/sound/card0/input3
[ 755.237581] input: HDA NVidia HDMI/DP,pcm=9 as /devices/pci0004:01/0004:01:00.1/sound/card0/input4
[ 755.237919] input: HDA NVidia HDMI/DP,pcm=10 as /devices/pci0004:01/0004:01:00.1/sound/card0/input5
[ 755.238154] input: HDA NVidia HDMI/DP,pcm=11 as /devices/pci0004:01/0004:01:00.1/sound/card0/input6
[ 755.238502] input: HDA NVidia HDMI/DP,pcm=12 as /devices/pci0004:01/0004:01:00.1/sound/card0/input7
[ 755.262075] nvidia-modeset: Loading NVIDIA UNIX Open Kernel Mode Setting Driver for aarch64 570.153.02 Release Build (dvs-builder@U22-I3-AF02-06-3) Tue May 13 16:07:25 UTC 2025
[ 755.279644] [drm] [nvidia-drm] [GPU ID 0x00040100] Loading driver
[ 755.279653] [drm] Initialized nvidia-drm 0.0.0 20160202 for 0004:01:00.0 on minor 0
[ 763.547183] audit: type=1400 audit(1749046674.715:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=1951 comm="apparmor_parser"
[ 763.547196] audit: type=1400 audit(1749046674.715:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=1951 comm="apparmor_parser"
[ 1124.959851] The NVIDIA GPU driver for AArch64 has not been qualified on this platform
So, I’m getting a bit stuck. I’ve followed most of the tips I was able to mine from other forums, in particular the NVIDIA forums and, frankly, I think my config is pretty good and consistent … apart from the bit where it doesn’t work.
Are there any special kernel parameters I need, or need to get rid of, for instance? I have arm-smmu.disable_bypass=0 iommu.passthrough=1, but IIRC these are needed for something networking related.
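For what it's worth, this is how I'm checking which arguments the kernel actually booted with (just standard procfs, nothing board specific):
cat /proc/cmdline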
And, yes, nouveau is blacklisted. I also added the nvreg_fix.conf file as suggested by carlosedp, above.
Looking forward to any Honeycomb specific tips, or any other tips and pointers for that matter!
Have you added the module parameter for your nvidia module that allows it to initialize on unsupported platforms? If you run modinfo nvidia you should see it there.
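A quick way to check, as a rough sketch (the parameter name comes from the nvreg_fix.conf posted earlier in this thread):
# Confirm the module actually exposes the parameter
modinfo nvidia | grep -i OpenRmEnableUnsupportedGpus
# If the module is loaded, confirm the running driver picked it up (1 = enabled)
grep -i OpenRmEnableUnsupportedGpus /proc/driver/nvidia/params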