Pull requests: NVIDIA/TensorRT-LLM
[https://nvbugs/6341070][fix] Pass kv_cache_free_gpu_memory_fraction=0.5 to TRTLLMWorker.init_with_new_llm…
#15495
opened Jun 19, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[https://nvbugs/6329165][fix] In TestNemotronNanoV3.test_accuracy, mocker.patch.dict GSM8K.EVALUATE_KWARGS…
#15494
opened Jun 19, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[https://nvbugs/6341072][fix] Change the model_name fixture to "llama-models-v2/TinyLlama-1.1B-Chat-v1.0"…
#15492
opened Jun 19, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[https://nvbugs/6337233][fix] In _build_fake_self, bind PyExecutor._is_stats_dummy_request onto the fake…
#15491
opened Jun 19, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[https://nvbugs/6337238][fix] Test-only fix — use the existing get_hf_rope_theta helper, adapt the…
#15490
opened Jun 19, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[https://nvbugs/6337231][fix] Replace all 9 self._is_stats_dummy_request(req) calls with…
#15489
opened Jun 19, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[https://nvbugs/6335726][fix] In test_qwen_moe_routed_expert_multi_lora_varying_ranks, drop ranks…
#15488
opened Jun 19, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[https://nvbugs/6337226][fix] Switch max(positive_hits) → min(positive_hits) in both the range and…
#15487
opened Jun 19, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[https://nvbugs/6317600][fix] Add an early return at the head of _run_attention_warmup when…
#15486
opened Jun 19, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[None][feat] support per-layer mixed-precision MoE serving (GLM, Qwen3-MoE)
#15485
opened Jun 19, 2026 by
joshua-hill
[https://nvbugs/6337224][fix] Update PERF_SANITY_DIR to include aggregated/; in recipe_to_server_config…
#15484
opened Jun 19, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[https://nvbugs/6322076][fix] Added _init_nccl_x86_64_init_workaround() that sets NCCL_MNNVL_ENABLE=0 and…
#15483
opened Jun 18, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[https://nvbugs/6336801][fix] Add the two skip_softmax_threshold_scale_factor_decode/prefill aliases to…
#15482
opened Jun 18, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[https://nvbugs/6239637][fix] Unwaive Qwen3.5 cases on A100 platform
#15481
opened Jun 18, 2026 by
nv-guomingz
Collaborator
1 task done
[https://nvbugs/6322073][fix] Add _needs_x86_nccl_pp_workaround() and init_nccl_pp_workaround() helpers…
#15480
opened Jun 18, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[https://nvbugs/6322045][fix] In triton_context, when max_q_len == max_kv_len (so cache_lens=0) and…
#15479
opened Jun 18, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[https://nvbugs/6317074][fix] One-line bump of free_gpu_memory_fraction from 0.6 to 0.8 in…
#15477
opened Jun 18, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[#14679][fix] Fix fused-QKV TP sharding for Phi-3/Phi-4
#15475
opened Jun 18, 2026 by
guan404ming
Contributor
1 task done
[https://nvbugs/6276981][fix] Force the q-split + allgather code path whenever q_split_eligible=True (drop…
#15474
opened Jun 18, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[https://nvbugs/6315845][fix] Dual-location revert of #14851's protective-code removal: (1) pin…
#15472
opened Jun 18, 2026 by
chenfeiz0326
Collaborator
2 tasks done
[https://nvbugs/6337235][test] Fix MX/GMS model loader fixtures
#15471
opened Jun 18, 2026 by
chienchunhung
Collaborator
Draft
[None][feat] Qwen-Image: load pre-quantized ModelOpt NVFP4/FP8 checkpoints
#15470
opened Jun 18, 2026 by
jingyu-ml
[https://nvbugs/6327147][fix] Lower kv_cache_config.free_gpu_memory_fraction from 0.7 to 0.5 in the…
#15468
opened Jun 18, 2026 by
tensorrt-cicd
Collaborator
2 tasks done
[#15463][fix] Sync llm_args.enable_chunked_prefill with chunked-prefill fallback gates
#15467
opened Jun 18, 2026 by
DhineshPonnarasan
Contributor
8 tasks done