[Mageia-dev] NVIDIA CUDA 5 has landed

Fri Dec 14 23:50:10 CET 2012

On 12/14/2012 10:08 PM, Dimitri wrote:
> Hi,
>
> Finally, NVIDIA CUDA Toolkit 5.0.35 has landed in Cauldron. I apologize
> for the delay, and ask everyone interested to test the packages
> (especially on x86_64).
>

Many thanks !!!

> * GCC 4.7 is not supported. However, you can fool nvcc and compile a
> program with -D__GNUC_MINOR__=6;

I usually disable the check in CUDA itself; it is a simple patch, attached.
That way you don't have to tweak every compile line in your project.
BTW, CUDA works fine for me with gcc 4.7. The only problem I know of is
that cuda-gdb won't work with 4.7, apparently because of a different
format in the binaries (COFF/ELF related ???).
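
If patching CUDA is not an option, the flag can at least be computed in
one place instead of being typed on every compile line. A minimal
sketch in Python, assuming the check only looks at __GNUC_MINOR__ and
rejects 4.7 and later (which is what the -D trick above implies). The
function name is made up for illustration; feed it the output of
`gcc -dumpversion`:

```python
import re


def nvcc_gcc_workaround(gcc_version, max_minor=6):
    """Return extra nvcc flags for a `gcc -dumpversion` string.

    Assumption: nvcc accepts GCC 4.x only up to 4.<max_minor>, and the
    check reads __GNUC_MINOR__, as the -D trick implies.
    """
    match = re.match(r"(\d+)\.(\d+)", gcc_version)
    if match is None:
        raise ValueError("unrecognised GCC version: %r" % gcc_version)
    major, minor = int(match.group(1)), int(match.group(2))
    if major == 4 and minor > max_minor:
        # Pretend to be the newest minor version the check still accepts.
        return ["-D__GNUC_MINOR__=%d" % max_minor]
    # Accepted compilers need nothing; other major versions are not handled.
    return []


if __name__ == "__main__":
    print(nvcc_gcc_workaround("4.7.2"))  # ['-D__GNUC_MINOR__=6']
```

The returned list can be appended to whatever variable the build system
uses for nvcc flags.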

>
> The work on the package is still not finished. The package contains an
> init script (nvidia) that creates device nodes and loads kernel module.
> This "service" is intended to be started on GUI-less compute nodes. I
> want to ask for assistance in migrating this script to systemd unit. As
> far as I know, with latest kernels and NVIDIA drivers the device nodes
> always get created automatically (never noticed them not being
> created). So probably the only thing left is to load kernel module.
> Seems like putting it into modules-load.d/* is a bad idea, because if
> the module is absent, systemd-modules-load.service will fail. Should we
> simply make a unit with ExecStart=/usr/sbin/modprobe nvidia-current instead?

I also used the script to work around a problem with running CUDA on
a system without X. If there is no X running on the nVidia card, CUDA
programs take a very long time to launch. The solution is to launch
something like nvidia-smi at boot and leave it running.

Now that you seem to be looking into packaging, I will describe a problem
I have. It is rather specific, but may also be very common.
I have a box with Intel graphics, which work fine for 2D and the desktop,
and an nvidia card (a cheap but powerful GT640) to crunch numbers.
The same applies to anyone who has a box with Intel graphics and a
Tesla accelerator card: they want nVidia CUDA but not nVidia GL.

So, I want the GL from Mesa for Intel graphics, but libcuda from the
nVidia libraries. Problem: if /usr/lib64/nvidia-current is in the
library path (via alternatives), libGL also resolves to the nvidia one,
not Mesa's, and that breaks the desktop. I have to rm the nvidia GL libs.
Also, the cuda libraries are independent of the GL ones and should not go
into nvidia-current; they could go just fine in /usr/lib[64].
They should not even suggest x11-driver-nvidia, because for example
a server with a Tesla card needs the kernel module but not the x11
driver.
Then there is the problem that some utilities you need on a headless
CUDA server (nvidia-smi) are currently packaged in the x11 driver.
In short, I would like a CUDA environment without the nVidia
x11 driver and GL libraries. It can work fine.
And as a side effect, now that the drivers for 8xxx and 9xxx cards
are different from those for later generations, and both support CUDA,
there should not be any cuda part in the driver package...
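
To make the libGL conflict concrete, here is a toy model in Python. It
assumes the simplest possible rule, "the first directory in the search
order that provides the soname wins", which is a simplification of what
the dynamic linker really does. The directory contents below are
invented for illustration and are not the real package file lists:

```python
def resolve(soname, search_dirs, contents):
    """Return the first directory in search order that provides soname."""
    for directory in search_dirs:
        if soname in contents.get(directory, ()):
            return directory
    return None


# Hypothetical current layout: everything from nVidia lives in one directory.
current_contents = {
    "/usr/lib64/nvidia-current": {"libGL.so.1", "libcuda.so.1"},
    "/usr/lib64": {"libGL.so.1"},  # Mesa
}
current_path = ["/usr/lib64/nvidia-current", "/usr/lib64"]

# Hypothetical proposed layout: libcuda moved out, nvidia-current not in the path.
proposed_contents = {
    "/usr/lib64/nvidia-current": {"libGL.so.1"},
    "/usr/lib64": {"libGL.so.1", "libcuda.so.1"},
}
proposed_path = ["/usr/lib64"]

if __name__ == "__main__":
    for name in ("libGL.so.1", "libcuda.so.1"):
        print(name,
              resolve(name, current_path, current_contents),
              resolve(name, proposed_path, proposed_contents))
```

In the first layout, getting libcuda drags in nVidia's libGL as well; in
the second, both libGL (Mesa) and libcuda come from /usr/lib64.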

So I would propose the following; blame me if it is too complicated. All
of this can be done and checked in separate steps or package releases:

- move the libs in nvidia-current-cuda-opencl from
   /usr/lib[64]/nvidia-current to /usr/lib[64]. None of them link
   against libGL, and they work in an environment with Mesa's libGL
   and with the alternatives for gl_conf set to standard. They do not
   conflict with anything else, nor does any other package provide an
   alternative. And they can even be used without an nVidia card,
   with pthreads emulation.
   (Perhaps in the future AMD or Intel will ship their own libOpenCL,
   but then the OpenCL libraries would probably need to be split out.)
- move the libcuda.so symlink (and everything in nvidia-current-devel
   related to the cuda libraries) to nvidia-cuda-toolkit-devel, also in
   /usr/lib. Then you don't need the GL devel package and a bunch of
   requires to build CUDA apps...
- move nvidia-smi/nvidia-xconfig to their own package (nvidia-cli-utils?);
   they require neither GL nor X.
- perhaps rename nvidia-cuda-toolkit-***** to nvidia-cuda-*****  ;)
   (I really think the -toolkit part is redundant...)

So the goal is to be able to have a CUDA compute environment
without the X or GL parts of nvidia.
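
A quick way to check such a GL-less setup is to load the driver library
directly and initialise it, without touching libGL or X at all. A
minimal sketch in Python with ctypes, assuming the library is installed
under the soname libcuda.so.1 (the function name is made up; cuInit is
the CUDA driver API entry point, and 0 means CUDA_SUCCESS):

```python
import ctypes


def probe_cuda(libname="libcuda.so.1"):
    """Load the CUDA driver library and call cuInit(0).

    Returns None if the library cannot be loaded, otherwise the integer
    CUresult returned by cuInit (0 means CUDA_SUCCESS).
    """
    try:
        lib = ctypes.CDLL(libname)
    except OSError:
        # Library not found in the linker search path.
        return None
    lib.cuInit.argtypes = [ctypes.c_uint]
    lib.cuInit.restype = ctypes.c_int
    return lib.cuInit(0)


if __name__ == "__main__":
    result = probe_cuda()
    if result is None:
        print("libcuda not found")
    else:
        print("cuInit returned", result)
```

If this returns 0 on a box where gl_conf points to Mesa, the compute
side works independently of the nVidia GL libraries.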

Sorry for the long and boring mail, but I have had this in mind for a
long time.

TIA

-- 
J.A. Magallon <jamagallon()ono!com>        \               Winter is coming...
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cuda-gcc-47.diff
Type: text/x-patch
Size: 613 bytes
Desc: not available
URL: </pipermail/mageia-dev/attachments/20121214/9be4482d/attachment.bin>