Spaces: encryptd/ocr_vlm_nuextract3

Commit History

Optimize speculative decoding performance by increasing max_num_batched_tokens to 4096

a0613f9

encryptd committed 1 day ago

Fix entrypoint override in Dockerfile

8c76968

encryptd committed 1 day ago

Fix HF space build error by using official vllm-openai:v0.21.0 base image

0a0b8ea

encryptd committed 1 day ago

Remove torchvision pin to allow vllm 0.12.0 to resolve its own torchvision dependency

a35ce3f

encryptd committed 1 day ago

Upgrade to vllm 0.12.0 and restore original performance arguments for Qwen 3.5 support

1cc1c9e

encryptd committed 1 day ago

Pin transformers to 4.48.2 to satisfy vllm>=4.48.2 dependency and resolve TokenizersBackend AttributeError

ee926f5

encryptd committed 1 day ago

Pin transformers to 4.48.0 to resolve TokenizersBackend AttributeError in vLLM 0.8.0

ab83d82

encryptd committed 1 day ago

Remove limit-mm-per-prompt argument to bypass vLLM 0.8.0 multimodal registration check

c2fa736

encryptd committed 1 day ago

Remove speculative-config argument for vLLM 0.8.0 CLI parser

20df6b8

encryptd committed 1 day ago

Fix limit-mm-per-prompt syntax for vLLM 0.8.0 CLI parser

7f9201a

encryptd committed 1 day ago

Update torch to 2.6.0+cu124 and torchvision to 0.21.0+cu124 to satisfy vllm 0.8.0 dependency requirements

860dc84

encryptd committed 1 day ago

Downgrade torch to 2.5.1+cu124 and vllm to 0.8.0 to match available CUDA 12.4 pre-compiled wheels

bdb058c

encryptd committed 1 day ago

Fix NVIDIA driver mismatch on HF Space by forcing +cu124 torch wheels

ba7160f

encryptd committed 1 day ago

Fix: Upgrade to vLLM 0.22.0 and PyTorch 2.11.0 on CUDA 12.4 for native Qwen 3.5 support and host compatibility

359492c

encryptd committed 1 day ago

Fix: Remove speculative-config argument for vLLM 0.7.2 CLI compliance

e776eb6

encryptd committed 1 day ago

Fix: Update --limit-mm-per-prompt format to KEY=VALUE format image=99 for vLLM 0.7.2

a5a317d

encryptd committed 1 day ago

Fix: Align vLLM arguments with model card recommendations (chat format, MTP speculative config, limit-mm-per-prompt)

244de46

encryptd committed 1 day ago

Fix: Change wheel index to cu124 and pin torch to 2.5.1+cu124 for native CUDA 12.4 host driver compatibility

c6032e7

encryptd committed 1 day ago

Fix: Remove manual transformers source install to resolve tokenizer AttributeError

4222b36

encryptd committed 1 day ago

Fix: Remove limit-mm-per-prompt and mm-processor-kwargs for vLLM 0.7.2 compatibility

9700b91

encryptd committed 1 day ago

Fix: Update --limit-mm-per-prompt format to KEY=VALUE for vLLM 0.7.2 compatibility

b7c3dd4

encryptd committed 1 day ago

Fix: Pin torch to 2.5.1+cu121 and vllm to 0.7.2 to guarantee CUDA 12.4 host driver compatibility

d1fc8a9

encryptd committed 1 day ago

Fix: Add LD_LIBRARY_PATH forward compatibility for older host GPU drivers

d16bf2a

encryptd committed 1 day ago

Fix: Switch base to CUDA devel image to provide nvcc for flashinfer JIT, and add .gitignore

2ea1c55

encryptd committed 1 day ago

Fix: Remove show_copy_button parameter from gr.Textbox for Gradio 6 compatibility

5aa37ab

encryptd committed 2 days ago

Fix: Remove audioop-lts as Python 3.10 natively provides audioop

eeebc77

encryptd committed 2 days ago

Migration: Convert Hugging Face Space to custom Docker Space using CUDA 12.4

f1ba762

encryptd committed 2 days ago

Fix: Install transformers from source to support Qwen 3.5 architecture

7eb1ffe

encryptd committed 2 days ago

Initial commit: NuExtract3 Gradio space setup powered by vLLM on A100 GPU

88dbb61

encryptd committed 3 days ago
