vllm.model_executor.layers.fused_moe.experts

Modules:

| Name | Description |
| --- | --- |
| aiter_mxfp4_w4a8_moe | |
| batched_deep_gemm_moe | |
| cpu_moe | CPU FP8 W8A16 and MXFP4 W4A16 fused MoE experts. |
| cutlass_moe | CUTLASS-based fused MoE kernels. |
| deep_gemm_moe | |
| fallback | |
| flashinfer_cutedsl_batched_moe | |
| flashinfer_cutedsl_moe | |
| flashinfer_cutlass_moe | |
| fused_batched_moe | Fused batched MoE kernel. |
| fused_humming_moe | Fused MoE utilities for Humming. |
| gpt_oss_triton_kernels_moe | |
| lora_context | |
| lora_experts_mixin | |
| marlin_moe | Fused MoE utilities for GPTQ. |
| nvfp4_emulation_moe | NVFP4 quantization emulation for MoE. |
| ocp_mx_emulation_moe | OCP MX quantization emulation for MoE. |
| rocm_aiter_moe | |
| triton_cutlass_moe | |
| triton_deep_gemm_moe | |
| triton_moe | Triton-based MoE expert implementations. |
| trtllm_bf16_moe | |
| trtllm_fp8_moe | |
| trtllm_mxfp4_moe | |
| trtllm_nvfp4_moe | |