vllm.entrypoints.serve.disagg.protocol ¶
GenerateRequest ¶
Bases: BaseModel
Source code in vllm/entrypoints/serve/disagg/protocol.py
62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 | |
features class-attribute instance-attribute ¶
features: MultiModalFeatures | None = None
Multimodal hashes and placeholder positions (populated for MM inputs).
sampling_params instance-attribute ¶
sampling_params: SamplingParams
The sampling parameters for the model.
is_sampling_param_provided ¶
Whether the caller explicitly set sampling_params.<name>.
For requests parsed from a JSON body, this reflects the raw input dict. For requests constructed with a pre-built SamplingParams instance, all fields are considered provided so server-side defaults do not clobber values already resolved upstream.
Source code in vllm/entrypoints/serve/disagg/protocol.py
MultiModalFeatures ¶
Bases: BaseModel
Lightweight multimodal metadata produced by the render step.
Carries hashes (for cache lookup / identification) and placeholder positions so the downstream /generate service knows where in the token sequence each multimodal item lives.
Source code in vllm/entrypoints/serve/disagg/protocol.py
kwargs_data class-attribute instance-attribute ¶
Per-modality serialized tensor data.
Each value is a list parallel to mm_hashes[modality]. A str entry is a base64-encoded MultiModalKwargsItem; None means the item should be resolved from cache. The entire field is None for metadata-only (cache-hit) responses.
mm_hashes instance-attribute ¶
Per-modality item hashes, e.g. {"image": ["abc", "def"]}.
mm_placeholders instance-attribute ¶
mm_placeholders: dict[str, list[PlaceholderRangeInfo]]
Per-modality placeholder ranges in the token sequence.
PlaceholderRangeInfo ¶
Bases: BaseModel
Serializable placeholder location for a single multi-modal item.