Update the dependency version to fix the CI #437

Closed
wants to merge 36 commits into from
36 commits
5c6151c
fix: use malloc_trim to cleanup pages (#307)
OlivierDehaene Jun 27, 2024
7c9b7cb
feat(candle): add FlashMistral (#308)
OlivierDehaene Jun 27, 2024
35aefeb
feat(candle): add flash gte (#310)
OlivierDehaene Jun 28, 2024
0361479
feat: add default prompts (#312)
OlivierDehaene Jun 28, 2024
970922e
feat: Add optional CORS allow any option value in http server cli (#260)
kir-gadjello Jun 28, 2024
3c2f9ba
docs: Update `HUGGING_FACE_HUB_TOKEN` to `HF_API_TOKEN` in README (#…
kevinhu Jun 28, 2024
6c6cd93
v1.3.0 (#313)
OlivierDehaene Jun 28, 2024
4b2ab61
feat(candle): support Qwen2 on Cuda (#316)
OlivierDehaene Jul 2, 2024
a0549e6
v1.4.0
OlivierDehaene Jul 2, 2024
acbbb92
tokenizer max limit on input size (#324)
ErikKaum Jul 3, 2024
1e076c7
docs: air-gapped deployments (#326)
OlivierDehaene Jul 4, 2024
052037c
docs: remove revisions
OlivierDehaene Jul 4, 2024
7b9245d
feat(onnx): add onnx runtime for better CPU perf (#328)
OlivierDehaene Jul 5, 2024
e496fe7
feat: add `/similarity` route (#331)
OlivierDehaene Jul 8, 2024
4cc38bd
fix(ort): fix mean pooling (#332)
OlivierDehaene Jul 8, 2024
dcbea38
chore(candle): update flash attn (#335)
OlivierDehaene Jul 9, 2024
661a77f
v1.5.0 (#336)
OlivierDehaene Jul 10, 2024
ce1edf4
fix: Download `model.onnx_data` (#343)
kozistr Jul 15, 2024
af89c27
docs: Rename 'Sentence Transformers' to 'sentence-transformers' in do…
Wauplin Jul 15, 2024
a68893d
Ci new cluster (#345)
XciD Jul 18, 2024
a1f493f
fix(ci): fix incorrect Dockerfile
XciD Jul 18, 2024
630ddc7
chore(ci): improve ci (#351)
XciD Jul 22, 2024
b563f80
fix: use a more generic name (#352)
XciD Jul 22, 2024
9ec8f34
chore(ci): update build.yaml for buildkitd-config
XciD Jul 29, 2024
29f728e
fix: add serde default for truncation direction (#399)
drbh Sep 5, 2024
dbeb3ab
fix: metrics unbounded memory (#409)
OlivierDehaene Sep 17, 2024
ebe078a
fix: allow health check w/o auth (#360)
kozistr Sep 17, 2024
df03195
Update `ort` crate version to `2.0.0-rc.4` to support onnx IR version…
kozistr Sep 17, 2024
fef77b0
adds curl to fix healthcheck (#376)
WissamAntoun Sep 17, 2024
416efe1
fix: use num_cpus::get to check as get_physical does not check cgroup…
OlivierDehaene Sep 18, 2024
205f96c
fix: use status code 400 when batch is empty (#413)
OlivierDehaene Oct 17, 2024
750898d
fix: add cls pooling as default for BERT variants (#426)
OlivierDehaene Oct 17, 2024
cb1e594
feat: auto limit string if truncate is set (#428)
OlivierDehaene Oct 17, 2024
bbde203
v1.5.1
OlivierDehaene Nov 5, 2024
76b29f1
hotfix lockfile
OlivierDehaene Nov 5, 2024
015029b
update dependency
kozistr Nov 17, 2024
164 changes: 164 additions & 0 deletions .github/workflows/build.yaml
@@ -0,0 +1,164 @@
name: Build and push docker image to registry

on:
  workflow_dispatch:
  push:
    branches:
      - 'main'
    tags:
      - 'v*'
  pull_request:
    paths:
      - ".github/workflows/build.yaml"
      - ".github/workflows/matrix.json"
      - "integration-tests/**"
      - "backends/**"
      - "core/**"
      - "router/**"
      - "Cargo.lock"
      - "rust-toolchain.toml"
      - "Dockerfile"
    branches:
      - 'main'

jobs:
  matrix:
    runs-on: ubuntu-latest
    outputs:
      matrix: ${{ steps.set-matrix.outputs.matrix }}
    steps:
      - name: Checkout repository
        uses: actions/checkout@v3

      - id: set-matrix
        run: |
          branchName=$(echo '${{ github.ref }}' | sed 's,refs/heads/,,g')
          matrix=$(jq --arg branchName "$branchName" 'map(. | select((.runOn==$branchName) or (.runOn=="always")) )' .github/workflows/matrix.json)
          echo "{\"include\":$(echo $matrix)}"
          echo ::set-output name=matrix::{\"include\":$(echo $matrix)}\"

  build-and-push-image:
    needs: matrix
    strategy:
      matrix: ${{fromJson(needs.matrix.outputs.matrix)}}
    concurrency:
      group: ${{ github.workflow }}-${{ github.job }}-${{matrix.name}}-${{ github.head_ref || github.run_id }}
      cancel-in-progress: true
    runs-on:
      group: aws-highmemory-32-plus-priv
    permissions:
      contents: write
      packages: write
      # This is used to complete the identity challenge
      # with sigstore/fulcio when running outside of PRs.
      id-token: write
      security-events: write
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Initialize Docker Buildx
        uses: docker/setup-buildx-action@v3
        with:
          install: true
          buildkitd-config: /tmp/buildkitd.toml

      - name: Configure sccache
        uses: actions/github-script@v6
        with:
          script: |
            core.exportVariable('ACTIONS_CACHE_URL', process.env.ACTIONS_CACHE_URL || '');
            core.exportVariable('ACTIONS_RUNTIME_TOKEN', process.env.ACTIONS_RUNTIME_TOKEN || '');

      - name: Inject slug/short variables
        uses: rlespinasse/github-slug-action@v4

      - name: Login to internal Container Registry
        if: github.event_name != 'pull_request'
        uses: docker/login-action@v3
        with:
          username: ${{ secrets.REGISTRY_USERNAME }}
          password: ${{ secrets.REGISTRY_PASSWORD }}
          registry: registry.internal.huggingface.tech

      - name: Login to GitHub Container Registry
        if: github.event_name != 'pull_request'
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Extract metadata (tags, labels) for Docker
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: |
            registry.internal.huggingface.tech/api-inference/text-embeddings-inference
            ghcr.io/huggingface/text-embeddings-inference
          flavor: |
            latest=false
          tags: |
            type=semver,pattern=${{ matrix.imageNamePrefix }}{{version}}
            type=semver,pattern=${{ matrix.imageNamePrefix }}{{major}}.{{minor}}
            type=raw,value=${{ matrix.imageNamePrefix }}latest
            type=raw,value=${{ matrix.imageNamePrefix }}sha-${{ env.GITHUB_SHA_SHORT }}

      - name: Build and push Docker image
        id: build-and-push
        uses: docker/build-push-action@v6
        with:
          context: .
          file: ${{ matrix.dockerfile }}
          push: ${{ github.event_name != 'pull_request' }}
          platforms: 'linux/amd64'
          build-args: |
            SCCACHE_GHA_ENABLED=${{ matrix.sccache }}
            ACTIONS_CACHE_URL=${{ env.ACTIONS_CACHE_URL }}
            ACTIONS_RUNTIME_TOKEN=${{ env.ACTIONS_RUNTIME_TOKEN }}
            CUDA_COMPUTE_CAP=${{ matrix.cudaComputeCap }}
            GIT_SHA=${{ env.GITHUB_SHA }}
            DOCKER_LABEL=sha-${{ env.GITHUB_SHA_SHORT }}
            ${{matrix.extraBuildArgs}}
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=s3,region=us-east-1,bucket=ci-docker-buildx-cache,name=text-embeddings-inference-cache-${{matrix.name}},access_key_id=${{ secrets.S3_CI_DOCKER_BUILDX_CACHE_ACCESS_KEY_ID }},secret_access_key=${{ secrets.S3_CI_DOCKER_BUILDX_CACHE_SECRET_ACCESS_KEY }},mode=max
          cache-to: type=s3,region=us-east-1,bucket=ci-docker-buildx-cache,name=text-embeddings-inference-cache-${{matrix.name}},access_key_id=${{ secrets.S3_CI_DOCKER_BUILDX_CACHE_ACCESS_KEY_ID }},secret_access_key=${{ secrets.S3_CI_DOCKER_BUILDX_CACHE_SECRET_ACCESS_KEY }},mode=max

      - name: Extract metadata (tags, labels) for Docker
        id: meta-grpc
        if: ${{ matrix.grpc }}
        uses: docker/metadata-action@v5
        with:
          images: |
            registry.internal.huggingface.tech/api-inference/text-embeddings-inference
            ghcr.io/huggingface/text-embeddings-inference
          flavor: |
            latest=false
          tags: |
            type=semver,pattern=${{ matrix.imageNamePrefix }}{{version}}-grpc
            type=semver,pattern=${{ matrix.imageNamePrefix }}{{major}}.{{minor}}-grpc
            type=raw,value=${{ matrix.imageNamePrefix }}latest-grpc
            type=raw,value=${{ matrix.imageNamePrefix }}sha-${{ env.GITHUB_SHA_SHORT }}-grpc

      - name: Build and push Docker image
        id: build-and-push-grpc
        if: ${{ matrix.grpc }}
        uses: docker/build-push-action@v6
        with:
          context: .
          target: grpc
          file: ${{ matrix.dockerfile }}
          push: ${{ github.event_name != 'pull_request' }}
          platforms: 'linux/amd64'
          build-args: |
            SCCACHE_GHA_ENABLED=${{ matrix.sccache }}
            ACTIONS_CACHE_URL=${{ env.ACTIONS_CACHE_URL }}
            ACTIONS_RUNTIME_TOKEN=${{ env.ACTIONS_RUNTIME_TOKEN }}
            CUDA_COMPUTE_CAP=${{ matrix.cudaComputeCap }}
            GIT_SHA=${{ env.GITHUB_SHA }}
            DOCKER_LABEL=sha-${{ env.GITHUB_SHA_SHORT }}
            ${{matrix.extraBuildArgs}}
          tags: ${{ steps.meta-grpc.outputs.tags }}
          labels: ${{ steps.meta-grpc.outputs.labels }}
          cache-from: type=s3,region=us-east-1,bucket=ci-docker-buildx-cache,name=text-embeddings-inference-cache-${{matrix.name}},access_key_id=${{ secrets.S3_CI_DOCKER_BUILDX_CACHE_ACCESS_KEY_ID }},secret_access_key=${{ secrets.S3_CI_DOCKER_BUILDX_CACHE_SECRET_ACCESS_KEY }},mode=max
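
Reviewer note on the `set-matrix` step in `build.yaml`: the `jq` filter keeps only the entries of `.github/workflows/matrix.json` whose `runOn` field equals the current branch name or the literal `"always"`, and wraps them under an `include` key for `fromJson`. The Python below is a minimal sketch of that selection logic only, not the code the workflow runs. The sample entries (`cpu`, `turing`) and their `runOn` values are hypothetical, since `matrix.json` is not part of this diff.

```python
import json


def filter_matrix(entries, ref):
    """Sketch of the jq filter: keep entries whose runOn matches the branch or "always"."""
    # Same effect as: sed 's,refs/heads/,,g'
    branch_name = ref.replace("refs/heads/", "")
    # Same effect as: select((.runOn==$branchName) or (.runOn=="always"))
    selected = [e for e in entries if e.get("runOn") in (branch_name, "always")]
    # The workflow wraps the filtered list as {"include": [...]}
    return {"include": selected}


# Hypothetical stand-in for matrix.json (real field values are not shown in this PR).
entries = [
    {"name": "cpu", "runOn": "always"},
    {"name": "turing", "runOn": "main"},
]

# Push to main: github.ref is refs/heads/main, so both entries are kept.
print(json.dumps(filter_matrix(entries, "refs/heads/main")))

# Pull request: github.ref looks like refs/pull/<number>/merge, so the sed
# substitution does nothing and only the "always" entries are kept.
print(json.dumps(filter_matrix(entries, "refs/pull/437/merge")))
```

Under these assumptions, a pull request builds only the `"always"` entries, while a push to `main` builds every entry whose `runOn` is `main` or `always`.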
136 changes: 0 additions & 136 deletions .github/workflows/build_75.yaml

This file was deleted.