steelbrain
Ah, this is quite interesting! I had a use case where I needed GPU-over-IP, but only for transcoding videos. My homelab server had a not-so-powerful AMD GPU that somehow kept crashing the kernel any time I tried to encode videos with it, and I also had an NVIDIA RTX 3080 in a gaming machine.

So I wrote https://github.com/steelbrain/ffmpeg-over-ip and ran the server on the Windows machine and the client on the media server (could be Plex, Emby, Jellyfin, etc.), and it worked flawlessly.
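
Conceptually, the client half is just a fake "ffmpeg" binary that forwards its arguments to the machine that actually has the GPU and relays the output back, so the media server never knows the difference. A rough sketch in C of that idea (the wire format, address, and port here are invented for illustration; the real project's protocol and config live in the repo):

    /* fake-ffmpeg.c — illustrative client half of the ffmpeg-over-IP idea.
     * The wire format, address, and port are made up for this sketch. */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(int argc, char **argv) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in srv = { .sin_family = AF_INET,
                                   .sin_port = htons(9999) };  /* made-up port */
        inet_pton(AF_INET, "192.168.1.50", &srv.sin_addr);     /* GPU machine  */
        if (fd < 0 || connect(fd, (struct sockaddr *)&srv, sizeof srv) < 0)
            return 1;

        /* Ship each argument NUL-terminated; the server reassembles the
         * command line and execs the real ffmpeg with its NVENC flags. */
        for (int i = 1; i < argc; i++)
            write(fd, argv[i], strlen(argv[i]) + 1);
        shutdown(fd, SHUT_WR);

        /* Relay the remote ffmpeg's output so Plex/Jellyfin/etc. can parse
         * progress exactly as if ffmpeg were running locally. */
        char buf[4096];
        ssize_t n;
        while ((n = read(fd, buf, sizeof buf)) > 0)
            write(STDOUT_FILENO, buf, (size_t)n);
        close(fd);
        return 0;
    }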

radarsat1
I'm confused: if this operates at the CPU/GPU boundary, doesn't it create a massive I/O bottleneck for any dataset that doesn't fit into VRAM? I'm probably misunderstanding how it works, but if it intercepts GPU I/O, then it must stream your entire dataset to the remote machine on every epoch, which sounds wasteful. Probably I'm not getting this right.
dishsoap
For anyone curious about how this actually works, it looks like a library is injected into your process to hook these functions [1] in order to forward them to the service.

[1] https://pastebin.com/raw/kCYmXr5A
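
If you haven't seen the trick before, here's roughly what that kind of interposition looks like with LD_PRELOAD. The hooked symbol below, cuMemAlloc, is a real CUDA driver API entry point (one representative of the functions in [1]), but the body just logs and falls through to the real driver; an actual GPU-over-IP shim would serialize the call and ship it to the remote service instead. A sketch of the general technique, not this product's code:

    /* shim.c — build: gcc -shared -fPIC shim.c -o shim.so -ldl
     * run:            LD_PRELOAD=./shim.so ./your_cuda_app    */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <stdio.h>

    typedef int CUresult;                    /* stand-ins so this sketch */
    typedef unsigned long long CUdeviceptr;  /* compiles without cuda.h  */

    /* Because this library is preloaded, the dynamic linker resolves
     * cuMemAlloc here instead of in libcuda.so. */
    CUresult cuMemAlloc(CUdeviceptr *dptr, size_t bytesize) {
        /* A GPU-over-IP shim would serialize this call, send it to the
         * remote GPU service, and return the remote result. We only log,
         * then fall through to the real driver. */
        fprintf(stderr, "[shim] cuMemAlloc(%zu bytes)\n", bytesize);
        CUresult (*real)(CUdeviceptr *, size_t) =
            (CUresult (*)(CUdeviceptr *, size_t))dlsym(RTLD_NEXT, "cuMemAlloc");
        return real ? real(dptr, bytesize) : 1; /* 1 = CUDA_ERROR_INVALID_VALUE */
    }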

Cieric
This is interesting, but I'm more interested in self-hosting. I already have a lot of GPUs (some running, some not). Does this have a self-hosting option so I can use the GPUs I already have?
doctorpangloss
I don't get it. Why would I start an instance on EC2 to use your GPUs on EC2, when I could just start an EC2 instance with the GPUs I want? Separately, why would I want half of Nitro instead of real Nitro?
cpeterson42
Given the interest here, we decided to open up T4 instances for free. We'd love for y'all to try it and let us know your thoughts!
tptacek
This is neat. Were you able to get MIG or vGPUs working with it?
mmsc
What's it like to actually use this for any meaningful throughput? Can this be used for hash cracking? Every time I think about virtual GPUs over a network, I think about botnets. Specifically from https://www.hpcwire.com/2012/12/06/gpu_monster_shreds_passwo... "Gosney first had to convince Mosix co-creator Professor Amnon Barak that he was not going to “turn the world into a giant botnet.”"
somat
What makes me sad is that the original SGI engineers who developed GLX were very careful to use X11 mechanisms for the GPU transport, so it was fairly trivial to send the GL stream over the network and render on your local graphics card: "run on the supercomputer down the hall, render on your workstation". More recent driver development has not shown such care, and this is usually no longer possible.

I am not sure how useful it was in reality (usually, if you had a nice graphics card you also had a nice CPU), but I had fun playing around with it. There was something fascinating about getting accelerated graphics from a program running in the machine room. I was able to get GLQuake running like this once.
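
(For anyone who never saw it in action: you ran the GL program on the remote machine with DISPLAY pointed back at your workstation's X server, e.g. DISPLAY=yourdesk:0 glquake, and the GL calls traveled as GLX protocol inside the X11 stream. With Mesa you can still force that path via LIBGL_ALWAYS_INDIRECT=1, but only if the X server still permits indirect GLX, which most have disabled by default these days.)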

userbinator
It's impressive that this is even possible, but I wonder what happens if the network connection goes down or is anything but 100% stable? In my experience drivers react badly to even a local GPU that isn't behaving.
orsorna
So what exactly is the pricing model? Do I need a quote? Otherwise, I don't see how to determine it without creating an account, which is needless gatekeeping.
delijati
Even a directly attached eGPU via Thunderbolt 4 turned out, after some time, to be too slow for machine learning (i.e., training). Since I now work fully remote, I just have a beefy midi tower. Some context about eGPUs: [1].

But hey, I'm happy to be proven wrong ;)

[1] https://news.ycombinator.com/item?id=38890182#38905888

ellis0n
In 2008, I had a powerful server with a Xeon CPU, but the motherboard had no slot for a graphics card. I also had a computer with a powerful graphics card but a weak Core 2 Duo. I had the idea of passing the graphics card over the network using Linux drivers. This concept has now been realized in this project. Good job!
kawsper
Cool idea, nice product page!

Does anyone know if this is possible with USB?

I have a DaVinci Resolve license USB dongle I'd like to avoid plugging into my laptop.

the_reader
Would it be possible to mix this with Blender?
teaearlgraycold
This could be perfect for us. We need very limited bandwidth but have high compute needs.
xyst
Exciting. But I would definitely like to see a self-hosted option.
talldayo
> Access serverless GPUs through a simple CLI to run your existing code on the cloud while being billed precisely for usage

Hmm... well, I just watched you run nvidia-smi in a Mac terminal, a platform it's explicitly not supported on. My instant assumption is that your tool copies my code into a private server instance and communicates back and forth to run the commands.

Does this platform expose eGPU capabilities if my host machine supports it? Can I run raster workloads, or network it with my own CUDA hardware? The actual way your tool and service connect isn't very clear to me, and I assume other developers will be confused too.

rubatuga
What ML packages do you support? In the comments below it says you do not support Vulkan or OpenGL. Does this support AMD GPUs as well?
winecamera
I saw that in the tnr CLI, there are hints of an option to self-host a GPU. Is this going to be a released feature?
tamimio
I’m more interested in using tools like hashcat; any benchmarks on those? (Asking here, since the docs link returns an error.)
m3kw9
So won’t that make the network the prohibitive bottleneck? Over the wire, your effective memory bandwidth is at best ~1 Gbps.
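
Back-of-the-envelope: 1 Gbps is ~0.125 GB/s, so shipping a 10 GB working set over the link takes ~80 s, while a card like a 3080 reads those same 10 GB from its ~760 GB/s VRAM in roughly 13 ms. The approach only pays off when compute time dwarfs transfer time.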
billconan
Is this a remote NVAPI?

This is awesome. Can it do 3D rendering (Vulkan/OpenGL)?

throwaway888abc
Does it work for gaming on Windows? Or even Linux?
test20240809
pocl (Portable Computing Language) [1] provides a remote backend [2] that allows for serialization and forwarding of OpenCL commands over a network.

Another solution is qCUDA [3], which is more specialized towards CUDA.

In addition to these, various virtualization stacks today provide some sort of serialization mechanism for GPU commands so they can be transferred to another host (or process) [4].

One example is the QEMU-based Android Emulator, which uses special translator libraries and a "QEMU Pipe" to efficiently communicate GPU commands from the virtualized Android OS to the host OS [5].

The new Cuttlefish Android emulator [6] uses Gallium3D for transport and the virglrenderer library [7].

I'd expect that the current virtio-gpu implementation in QEMU [8] makes this job even easier, because it includes Android's gfxstream [9] (formerly called "Vulkan Cereal"), which should already support communication over network sockets out of the box.

[1] https://github.com/pocl/pocl

[2] https://portablecl.org/docs/html/remote.html

[3] https://github.com/coldfunction/qCUDA

[4] https://www.linaro.org/blog/a-closer-look-at-virtio-and-gpu-...

[5] https://android.googlesource.com/platform/external/qemu/+/em...

[6] https://source.android.com/docs/devices/cuttlefish/gpu

[7] https://cs.android.com/android/platform/superproject/main/+/...

[8] https://www.qemu.org/docs/master/system/devices/virtio-gpu.h...

[9] https://android.googlesource.com/platform/hardware/google/gf...
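
To give a feel for how transparent pocl's remote backend [2] is to applications: the OpenCL host code doesn't change at all; you point pocl at the remote server through environment variables. A minimal sketch (the env var names follow pocl's remote docs but may vary between versions; the host and port are placeholders):

    /* list_devices.c — plain OpenCL enumeration, nothing remote-specific.
     * With pocl's remote backend, remote GPUs simply show up in the list:
     *
     *   POCL_DEVICES=remote POCL_REMOTE0_PARAMETERS=gpuhost:port/0 ./list_devices
     *
     * build: gcc list_devices.c -o list_devices -lOpenCL */
    #include <CL/cl.h>
    #include <stdio.h>

    int main(void) {
        cl_platform_id platform;
        cl_uint n = 0;
        if (clGetPlatformIDs(1, &platform, NULL) != CL_SUCCESS)
            return 1;

        clGetDeviceIDs(platform, CL_DEVICE_TYPE_ALL, 0, NULL, &n);
        if (n > 16) n = 16;
        cl_device_id devs[16];
        clGetDeviceIDs(platform, CL_DEVICE_TYPE_ALL, n, devs, NULL);

        for (cl_uint i = 0; i < n; i++) {
            char name[256];
            clGetDeviceInfo(devs[i], CL_DEVICE_NAME, sizeof name, name, NULL);
            printf("device %u: %s\n", i, name); /* remote devices look local */
        }
        return 0;
    }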

Zambyte
Reminds me of Plan 9 :)
bkitano19
this is nuts
cpeterson42
We created a Discord server for the latest updates, bug reports, feature suggestions, and memes. We'll try to respond to any issues and suggestions as quickly as we can! Feel free to join here: https://discord.gg/nwuETS9jJK