-
Notifications
You must be signed in to change notification settings - Fork 271
Update Docker images to latest Ubuntu version #610
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
What's the rationale here ? Upgrading deps is indeed nice, but Cuda 12.8 is rather new (Jan 2025) so it would make TEI fail to run on any older deployments/noeds. Unless it unlocks things, I don't think we should upgrade at the moment. Ubuntu 24 should be ok. |
I was hoping to get us upgraded to the latest CUDA 12.x versions since within minor releases, CUDA is mostly backwards compatible. Let me know if I misunderstood something here @Narsil |
I think doesn't hold for nvidia container: NVIDIA/nvidia-container-toolkit#940 It's been a while I haven't personally see this arise since we're trying to keep up a lot with newer versions of everything, but the cuda version of the node has caused issues in the past in clusters I manage. Is there any particular reason wanting to upgrade ? (The stance here is that if it's not broken, no need to fix it, and we can take advantage of a later minor upgrade do to such potentially breaking version upgrades) |
@Narsil Thanks for pointing out that issue. Forward compatibility is not something that I considered.
I think the rationale is to just ensure that we don't fall too behind on dependency upgrades. CUDA 12.2 was released in June 2023, and the driver version shipped for 12.2 is not really compatible with some of the newer GPUs coming out (see the AWS EC2 instance/nvidia-driver compatibility matrix). I am fine with reverting the PR to just the Ubuntu update and we can maybe update the CUDA version in a later major TEI release (1.8.0 or 2.0?), but the current change is still technically only a minor version update of the CUDA drivers themselves. So I'm a little ambivalent/curious on how this would fit into a TEI release lifecycle? |
1.8 is fine for those kind of upgrades. If you just update ubuntu I will definitely merge as-is, otherwise we can leave as-is and I'll merge when 1.8 hits (there are no plans just yet, usually it happens when there's something significant happening, not necessarily a breaking change). Again, I think it's welcome in general to update regularly, but having been bitten in the past, and seeing no obvious reason right now, I tend to delay those including them by default. Thanks a lot for the PR regardless. |
@Narsil Thanks for the update! I'll revert the CUDA changes then, so it can make the 1.7.1 release |
Oops. Looks like I was too late! Either way I can keep track of this and raise another PR when the time is right to update to the latest CUDA version. Thanks for the feedback and the discussion! |
For such changes regarding CUDA, it's probably better to measure any actual gain from bumping min version. Likewise for the concern about lacking support, by having an actual case of breaking compatibility. If someone does have one, I'd appreciate that but my understanding with cudarc usage is the following:
That said there is no actual build of those common CUDA libs regardless of choice. I had assumed then that the only actual concern for compatibility was if you needed an API call that wouldn't build due to not being available for a lower version of CUDA, which is more obvious need for a version bump... or as this project already does with it's Docker image builds, As such bumping the version of CUDA for the build be that in Have I understood that all correctly? There probably isn't much benefit to In both cases |
@Narsil is this good to merge? |
Sorry for the late reply but this was a really eye-opening comment for me @polarathene. So in this case, updating the CUDA version doesn't really provide any benefits because the underlying |
I am still trying to grok it myself, but I do know the version of The build is also tied to the major version of CUDA, although from what I've read it's only been major version compatible since CUDA 12? Once there is CUDA 13, it might differ again 😅 Mainly though, virtual or real architecture with compute capability is targeted for build, where virtual can provide forward compatible PTX to compile CUDA kernels at runtime via JIT, and you can use that to set the baseline/minimum compute capability required to be compatible. Too low and you may miss some performance benefits on newer hardware though (more than one PTX can be bundled too). This separate cuda crate is something I wasn't familiar with in my previous comment, but I am aware of TEI building it's own kernels for the subset of targets, so there is some relevance there. |
UPDATE: PR: #635 FWIW, I recently came across this advice:
Yet this project uses text-embeddings-inference/Dockerfile-cuda Lines 52 to 54 in 53eae1b
The command should apparently keep the major by including nvprune --generate-code code=sm_70 --generate-code code=sm_${CUDA_COMPUTE_CAP} /usr/local/cuda/lib64/libcublas_static.a -o /usr/local/cuda/lib64/libcublas_static.a Not sure how compatible that is, as I haven't tested it (the project README specifically states that CC 7.5 is the minimum (without FA feature), so that may fail but this min CC version requirement isn't clarified). I don't know if that was discussed in the past, or if there's been any reports about that target failing from that image build, but I thought I'd bring it to your attention.
As the README doesn't clarify why CC 7.5 is the minimum, perhaps the above is fine? It's only pruning a static lib, not building for a lower CC version. |
What does this PR do?
The docker base images haven't been updated in a while so I was wondering if we could port them over to the more newer base images and Ubuntu LTS version. Let me know if there are any concerns!
Before submitting
Pull Request section?
to it if that's the case.
documentation guidelines, and
here are tips on formatting docstrings.
Who can review?
cc @Narsil @alvarobartt