### Your current environment
In the lines referenced below, `current_platform.is_cuda_alike()` evaluates to False on HPU (the repro below sets the `PT_HPU_*` variables), so `maybe_collect_rejsample_metrics()` returns None early. As a result, speculative decoding metrics are silently disabled.
vllm-fork/vllm/spec_decode/metrics.py, lines 99 to 119 at bef3660:

```python
def maybe_collect_rejsample_metrics(
        self, k: int) -> Optional[SpecDecodeWorkerMetrics]:
    # Skip for any platform that doesn't have device Event
    if current_platform.Event is None:
        return None

    if not current_platform.is_cuda_alike():
        return None

    # If a copy was initiated in the previous call, collect and return.
    if self._in_flight_copy is not None:
        ready_event = self._in_flight_copy
        self._in_flight_copy = None
        return self._collect_rejsample_metrics(k, ready_event)

    # Otherwise, check if we should start a new copy.
    if self._should_collect_rejsample_metrics(self._timer()):
        assert self._in_flight_copy is None
        self._in_flight_copy = self._copy_rejsample_metrics_async()

    return None
```
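Since the first guard already skips any platform that lacks a device Event, one possible direction (a sketch only, not a proposed patch) would be to let HPU past the second check as well. The `is_hpu()` call below is an assumption about the platform interface and should be verified against the fork:

```python
# Sketch: relax the CUDA/ROCm-only guard in maybe_collect_rejsample_metrics().
# Assumption: current_platform exposes is_hpu(); verify against the fork.
if current_platform.Event is None:
    return None  # no device Event: metrics genuinely unsupported
if not (current_platform.is_cuda_alike() or current_platform.is_hpu()):
    return None  # previously, only CUDA/ROCm platforms passed this check
```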
### 🐛 Describe the bug
```bash
export VLLM_CONTIGUOUS_PA=false
export VLLM_SKIP_WARMUP=true
export PT_HPU_LAZY_MODE=1
export PT_HPU_ENABLE_LAZY_COLLECTIVES=true

python -m vllm.entrypoints.openai.api_server \
    --host 0.0.0.0 --port 8000 \
    --model meta-llama/Llama-3.1-70B-Instruct \
    --seed 42 -tp 4 \
    --speculative_config '{"model": "meta-llama/Llama-3.1-8B-Instruct", "num_speculative_tokens": 5, "target_parallel_config": 1}' \
    --gpu_memory_utilization 0.95 --max-model-len 16384
```
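To confirm that the early return is the culprit, the two guard conditions can be inspected directly in the serving environment (assuming `vllm` is importable there):

```python
# Run inside the same environment that serves the model.
from vllm.platforms import current_platform

# On HPU this is expected to print False, which triggers the early
# `return None` and silently disables speculative decoding metrics.
print(current_platform.is_cuda_alike())

# The first guard passes only when the platform provides a device Event.
print(current_platform.Event)
```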
### Before submitting a new issue...
- [x] Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.