Open
Labels: bug (Something isn't working)
Description
Describe the bug
While running the pipeline, it fails with the following output:
You cannot select the number of dataset splits for a generative evaluation at the moment. Automatically inferring.
Adding requests: 100%|██████████| 10/10 [00:00<00:00, 1842.11it/s]
Processed prompts: 100%|██████████| 10/10 [00:13<00:00, 1.39s/it, est. speed input: 1451.31 toks/s, output: 396.93 toks/s]
Splits: 100%|██████████| 1/1 [00:13<00:00, 13.97s/it]
Creating parquet from Arrow format: 100%|██████████| 1/1 [00:00<00:00, 135.89ba/s]
Generating train split: 10 examples [00:00, 1094.00 examples/s]
[rank0]:[W908 22:32:07.824350850 ProcessGroupNCCL.cpp:1479] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
Traceback (most recent call last):
File "/pkg/modal/_runtime/container_io_manager.py", line 778, in handle_input_exception
yield
File "/pkg/modal/_container_entrypoint.py", line 243, in run_input_sync
res = io_context.call_finalized_function()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/pkg/modal/_runtime/container_io_manager.py", line 197, in call_finalized_function
res = self.finalized_function.callable(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/root/main2.py", line 21, in main
evaluate()
File "/root/eval/main.py", line 50, in evaluate
pipeline.evaluate()
File "/usr/local/lib/python3.12/site-packages/lighteval/pipeline.py", line 317, in evaluate
self._compute_metrics(outputs)
File "/usr/local/lib/python3.12/site-packages/lighteval/pipeline.py", line 417, in _compute_metrics
outputs = apply_metric(
^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/lighteval/metrics/__init__.py", line 50, in apply_metric
metric.compute_sample(
File "/usr/local/lib/python3.12/site-packages/lighteval/metrics/utils/metric_utils.py", line 59, in compute_sample
return {self.metric_name: sample_level_fn(**kwargs)}
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/lighteval/metrics/metrics_sample.py", line 752, in compute
return self.summac.score_one(inp, prediction)["score"]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/lighteval/metrics/imports/summac.py", line 288, in score_one
image = self.imager.build_image(original, generated)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/lighteval/metrics/imports/summac.py", line 218, in build_image
batch_tokens = self.tokenizer.batch_encode_plus(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/transformers/tokenization_utils_base.py", line 3200, in batch_encode_plus
return self._batch_encode_plus(
^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: transformers.tokenization_utils_fast.PreTrainedTokenizerFast._batch_encode_plus() got multiple values for keyword argument 'truncation_strategy'
Stopping app - uncaught exception raised in remote container: TypeError("transformers.tokenization_utils_fast.PreTrainedTokenizerFast._batch_encode_plus() got multiple values for keyword argument 'truncation_strategy'").
╭─ Error ────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ transformers.tokenization_utils_fast.PreTrainedTokenizerFast._batch_encode_plus() got multiple values for │
│ keyword argument 'truncation_strategy' │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
task: Failed to run task "modal": exit status 1
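The collision does not look specific to this task or to vLLM: the traceback ends in SummaCImager.build_image (lighteval/metrics/imports/summac.py), which appears to call batch_encode_plus with the legacy truncation_strategy kwarg. On the transformers version this install pulls in, batch_encode_plus already forwards its own truncation_strategy to _batch_encode_plus and no longer swallows the legacy kwarg, so the two collide. A minimal sketch of the same failure against a plain fast tokenizer (model name and inputs below are illustrative, not taken from the pipeline):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")  # any fast tokenizer
pairs = [("The bill reduces taxes.", "Taxes are reduced by the bill.")]

# Mirrors the suspected call shape in summac.py: the modern `truncation` argument
# plus the legacy `truncation_strategy` kwarg. On recent transformers the legacy
# kwarg travels through **kwargs on top of the truncation_strategy that
# batch_encode_plus computes itself, hence "got multiple values".
tokens = tokenizer.batch_encode_plus(
    pairs,
    padding=True,
    truncation=True,
    max_length=512,
    return_tensors="pt",
    truncation_strategy="only_first",  # legacy kwarg, the suspected culprit
)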
To Reproduce
import sys
import os

sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))

from lighteval.logging.evaluation_tracker import EvaluationTracker
from lighteval.models.vllm.vllm_model import VLLMModelConfig
from lighteval.pipeline import ParallelismManager, Pipeline, PipelineParameters
from lighteval.utils.imports import is_accelerate_available

if is_accelerate_available():
    from datetime import timedelta
    from accelerate import Accelerator, InitProcessGroupKwargs

    accelerator = Accelerator(kwargs_handlers=[InitProcessGroupKwargs(timeout=timedelta(seconds=3000))])
else:
    accelerator = None


def evaluate():
    evaluation_tracker = EvaluationTracker(
        output_dir="./results",
        save_details=True,
        push_to_hub=True,
        hub_results_org="yujonglee",
    )
    pipeline_params = PipelineParameters(
        launcher_type=ParallelismManager.ACCELERATE,
        custom_tasks_directory="tasks",
        max_samples=10,
    )
    model_config = VLLMModelConfig(
        model_name="Qwen/Qwen3-0.6B",
        dtype="float16",
    )

    # https://huggingface.co/docs/lighteval/en/available-tasks
    tasks = ["helm|legal_summarization:billsum|0|0"]
    task = ",".join(tasks)

    pipeline = Pipeline(
        tasks=task,
        pipeline_parameters=pipeline_params,
        evaluation_tracker=evaluation_tracker,
        model_config=model_config,
    )

    pipeline.evaluate()
    pipeline.save_and_push_results()
    pipeline.show_results()


if __name__ == "__main__":
    evaluate()
Expected behavior
The evaluation runs without error.
Version info
git+https://github.com/huggingface/lighteval.git@7ed2636#egg=lighteval[vllm]
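Possible fix (untested)
If the diagnosis above is right, dropping the legacy kwarg from the batch_encode_plus call in lighteval/metrics/imports/summac.py and expressing the same intent with the modern truncation argument should avoid the collision. With the tokenizer and pairs from the sketch above, the corrected call shape would be:

tokens = tokenizer.batch_encode_plus(
    pairs,
    padding=True,
    truncation="only_first",  # modern spelling of truncation_strategy="only_first"
    max_length=512,
    return_tensors="pt",
)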