[PyCUDA] import pycuda.driver fails

Discussion:

Harshit Suri

2018-07-16 17:27:00 UTC

Hello Everyone,

I had a working installation of pycuda. However, after running updates on
my Ubuntu machine, "import pycuda.driver as cuda" fails.
(I had also updated my anaconda install and all packages that
anaconda reported as needing updates.)

When I try running "import pycuda.driver" in a jupyter notebook,
I get the following error message, followed by a kernel crash:
--------------------------------------------------------------------------
[I 10:02:37.765 NotebookApp] Adapting to protocol v5.1 for kernel 5fee93d7-62cf-468a-bcb0-b48e6f1d0987
terminate called after throwing an instance of 'std::runtime_error'
what(): numpy failed to initialize
[I 10:02:43.117 NotebookApp] KernelRestarter: restarting kernel (1/5), keep random ports
kernel 5fee93d7-62cf-468a-bcb0-b48e6f1d0987 restarted
--------------------------------------------------------------------------
It seems numpy fails to initialize.
However, if I import numpy on its own, the import succeeds.
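
Because the failing import aborts the whole interpreter (and with it the notebook kernel), it can help to run each import in a separate process and inspect the exit code and stderr. The following is a minimal sketch, not part of the original report; the helper name `try_import` is hypothetical.

```python
import subprocess
import sys


def try_import(module_name):
    """Import module_name in a fresh interpreter; return (exit_code, stderr)."""
    result = subprocess.run(
        [sys.executable, "-c", "import " + module_name],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text (works on Python 3.6)
    )
    return result.returncode, result.stderr


if __name__ == "__main__":
    # Compare a module that imports cleanly with the one that crashes.
    for name in ("numpy", "pycuda.driver"):
        code, err = try_import(name)
        print(name, "-> exit code", code)
        if err:
            print(err.strip())
```

On POSIX systems a negative exit code means the child was terminated by a signal, which is what one would expect when the C++ runtime calls terminate.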

My guess is that the updates changed something that caused this.
pycuda was originally installed with "pip install pycuda".

I even tried "pip uninstall pycuda" followed by "pip install pycuda",
thinking this might force pycuda to rebuild against the newer versions of
python/numpy, but I still get the same error.

Current installation versions:
python is 3.6
numpy is 1.13.3
pycuda is 2017.1.1
gcc 4.8.4
ubuntu 14.04.4
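
A version list like the one above can be collected without importing the packages themselves, which avoids triggering the crash. This is a sketch under the assumption that either `importlib.metadata` (Python 3.8+) or setuptools' `pkg_resources` is available; `installed_version` is a hypothetical helper name.

```python
import platform

try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:
    metadata = None


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if unknown."""
    if metadata is not None:
        try:
            return metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            return None
    try:
        import pkg_resources  # setuptools fallback for older Pythons
        return pkg_resources.get_distribution(dist_name).version
    except Exception:
        return None


if __name__ == "__main__":
    print("python", platform.python_version())
    for name in ("numpy", "pycuda"):
        print(name, installed_version(name))
```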

I have searched the archives for a solution without success, so I
hope someone will be able to help.

Thanks
-Suri

Andreas Kloeckner

2018-07-16 17:59:02 UTC

Harshit,

My guess would be an Nvidia kernel/driver mismatch. Check the end of dmesg;
there might be a message there.

Andreas
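
The dmesg check suggested above can be narrowed to driver-related lines. The sketch below filters log text for the tags `nvidia` and `NVRM`; the pattern and the helper name `nvidia_lines` are assumptions chosen for illustration, and the log text itself would come from running dmesg (which may need elevated privileges on some systems).

```python
import re

# Case-insensitive match for driver-related tags in kernel log lines.
NVIDIA_PATTERN = re.compile(r"nvidia|NVRM", re.IGNORECASE)


def nvidia_lines(log_text, last=20):
    """Return the last `last` lines of log_text that mention the NVIDIA driver."""
    matches = [line for line in log_text.splitlines() if NVIDIA_PATTERN.search(line)]
    return matches[-last:]


# Small demonstration on two kernel log lines of the kind dmesg prints.
sample = (
    "[ 2.590360] clocksource: Switched to clocksource tsc\n"
    "[ 2.635743] nvidia-uvm: Loaded the UVM driver in 8 mode, major device number 241"
)
print(nvidia_lines(sample))
```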

Harshit Suri

2018-07-16 18:50:37 UTC

Thank you for the reply Andreas,

The NVIDIA driver was definitely updated (it was on the list of items in
the Ubuntu "software updater" prior to running the updates).
I have included the lines of dmesg corresponding to NVIDIA below, captured
after trying to rerun "import pycuda.driver" in jupyter notebook.

- I am able to run my regular CUDA C/C++ binaries successfully after
the update. Based on this, am I correct in assuming that this is not
necessarily an Nvidia kernel/driver mismatch issue?

- Is there a way I can start from scratch and force pycuda to rebuild the
installation?

- Should I try going back to an older NVIDIA driver binary?

Thanks

output of dmesg
======================================================================================
[ 2.523799] nvidia: loading out-of-tree module taints kernel.
[ 2.523807] nvidia: module license 'NVIDIA' taints kernel.
[ 2.523808] Disabling lock debugging due to kernel taint
[ 2.534215] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[ 2.541075] systemd-udevd[682]: failed to execute '/bin/systemctl' '/bin/systemctl start --no-block nvidia-persistenced.service': No such file or directory
[ 2.541164] nvidia-nvlink: Nvlink Core is being initialized, major device number 242
[ 2.541358] nvidia 0000:05:00.0: enabling device (0100 -> 0103)
[ 2.541415] vgaarb: device changed decodes: PCI:0000:05:00.0,olddecodes=io+mem,decodes=none:owns=none
[ 2.541514] vgaarb: device changed decodes: PCI:0000:04:00.0,olddecodes=io+mem,decodes=none:owns=io+mem
[ 2.541584] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 384.130 Wed Mar 21 03:37:26 PDT 2018 (using threaded interrupts)
[ 2.575938] nvidia-modeset: Loading NVIDIA Kernel Mode Setting Driver for UNIX platforms 384.130 Wed Mar 21 02:59:49 PDT 2018
[ 2.576962] [drm] [nvidia-drm] [GPU ID 0x00000500] Loading driver
[ 2.577030] [drm] [nvidia-drm] [GPU ID 0x00000400] Loading driver
[ 2.590360] clocksource: Switched to clocksource tsc
[ 2.606251] init: failsafe main process (693) killed by TERM signal
[ 2.616069] random: dbus-daemon: uninitialized urandom read (12 bytes read, 124 bits of entropy available)
[ 2.618449] random: dbus-daemon: uninitialized urandom read (12 bytes read, 124 bits of entropy available)
[ 2.635743] nvidia-uvm: Loaded the UVM driver in 8 mode, major device number 241
[ 2.674524] random: nonblocking pool is initialized
[ 2.678916] systemd-udevd[776]: failed to execute '/bin/systemctl' '/bin/systemctl start --no-block nvidia-persistenced.service': No such file or directory

[ 7.507516] nvidia-modeset: Allocated GPU:1 (GPU-debe2b79-7d7c-b48e-e93e-92f5881136ab) @ PCI:0000:05:00.0
[ 12.010315] tty_warn_deprecated_flags: 'ModemManager' is using deprecated serial flags (with no effect): 00008000
[ 32.726283] audit_printk_skb: 51 callbacks suppressed
[ 32.726285] audit: type=1400 audit(1531765181.557:28): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/lib/cups/backend/cups-pdf" pid=2416 comm="apparmor_parser"
[ 32.726290] audit: type=1400 audit(1531765181.557:29): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/sbin/cupsd" pid=2416 comm="apparmor_parser"
======================================================================================
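
The kernel module version reported in the NVRM line of the log above can be extracted programmatically, for example to compare it against the version that the user-space tools report. This is a sketch only; the regular expression and the name `kernel_module_version` are assumptions based on the format of the line shown in this log.

```python
import re

# Matches the dotted version number that follows "Kernel Module".
VERSION_RE = re.compile(r"Kernel Module\s+(\d+(?:\.\d+)+)")


def kernel_module_version(text):
    """Return the NVIDIA kernel module version found in text, or None."""
    match = VERSION_RE.search(text)
    return match.group(1) if match else None


line = "[ 2.541584] NVRM: loading NVIDIA UNIX x86_64 Kernel Module 384.130 Wed"
print(kernel_module_version(line))  # prints 384.130
```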

Harshit Suri

2018-07-16 21:02:18 UTC

I was able to resolve this. I am writing down the steps in case they are
helpful to someone in the future.

First, I uninstalled pycuda:
$ pip uninstall pycuda

Next, I downloaded the sources (the pycuda .tar file) from
https://pypi.org/project/pycuda/#files

Then I ran steps 1 and 3 from
https://wiki.tiker.net/PyCuda/Installation/Linux (I skipped step 2, which
installs numpy, because numpy was already installed).

I guess pycuda needed to be rebuilt from source after all the updates to
my Ubuntu machine were done.
Thanks for your time, Andreas.
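
One possible reason the plain "pip uninstall" / "pip install" cycle did not help is that pip may have reused a wheel it had cached from the earlier build. This is an assumption and is not confirmed in the thread. Under that assumption, a rebuild from source can also be requested through pip itself with its `--no-cache-dir` and `--no-binary` options. The helper names below are hypothetical, and the sketch is an alternative to the manual steps above, not the procedure that was actually used.

```python
import subprocess
import sys


def rebuild_command(package="pycuda"):
    """Build a pip command that compiles `package` from source, bypassing the cache."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-cache-dir",        # ignore previously built wheels in pip's cache
        "--no-binary", package,  # do not accept a prebuilt wheel for this package
        package,
    ]


def rebuild(package="pycuda"):
    """Run the command; raises CalledProcessError if pip fails."""
    subprocess.check_call(rebuild_command(package))


# Show the command without running it.
print(" ".join(rebuild_command()))
```

Calling rebuild() after uninstalling the package would run the build with the interpreter that executes the script, so the extension is compiled against the numpy visible to that same environment.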
