Read task from ModelInfo in tt-xla benchmarks #768
Conversation
- Update llm_benchmark.py to read task from model_info.task
- Update vision_benchmark.py to read task from model_info.task
- Modify setup_model() to return full ModelInfo object instead of just name

This standardizes task categorization using the forge Task enum instead of hardcoded strings, enabling better querying in Superset.

Co-authored-by: Ognjen Djuricic <[email protected]>
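A runnable sketch of the standardized pattern, using stub types in place of the forge classes (all names here are stand-ins, not the real forge API):

```python
from dataclasses import dataclass
from enum import Enum

# Stubs standing in for the forge types; illustrative only.
class Task(Enum):
    TEXT_GENERATION = "text-generation"

@dataclass
class ModelInfo:
    name: str
    task: Task

class ModelLoader:
    def get_model_info(self, variant=None) -> ModelInfo:
        return ModelInfo(name="example-model", task=Task.TEXT_GENERATION)

model_loader = ModelLoader()
model_info = model_loader.get_model_info()
full_model_name = model_info.name   # was: f"{model_info}"
model_type = str(model_info.task)   # was: a hardcoded task string
print(full_model_name, model_type)
```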
@claude Can you review this PR?
Claude finished @odjuricicTT's task in 1m 39s

Code Review
Review Summary

I've reviewed the changes in this PR and they look good overall. The implementation correctly reads the task field from ModelInfo.

✅ What's Working Well
🔍 Observations & Potential Issues

1. Minor inconsistency in llm_benchmark.py (llm_benchmark.py:293-296)

The existing validation:

```python
if task != "text-generation":
    raise ValueError(
        f"Only 'text-generation' task is supported for llm benchmark. Got: {task}. "
        "Please use -t text-generation"
    )
```

This validation happens before the task is actually read from model_info.task.
Suggestion: Consider whether this validation is still needed, or if it should validate against the forge Task enum values instead of hardcoded strings (a sketch appears after this list).

2. Type hint removed in vision_benchmark.py (vision_benchmark.py:35)

The return type hint was changed from:

```python
def setup_model(...) -> tuple[torch.nn.Module, str]:
```

to:

```python
def setup_model(...):
```

This is correct since the function now returns a ModelInfo rather than a string, but consider adding the updated annotation:

```python
def setup_model(...) -> tuple[torch.nn.Module, ModelInfo]:
```

This would improve type safety and documentation.

3. Verify ModelInfo.task enum values

The PR assumes that str(model_info.task) yields the task string expected in benchmark results; it is worth verifying that the forge Task enum stringifies to values like "text-generation" rather than "Task.TEXT_GENERATION".
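For observation 1, a minimal sketch of enum-based validation (the member name is an assumption, not the actual forge API):

```python
from enum import Enum

# Stand-in for the forge Task enum; illustrative only.
class Task(Enum):
    TEXT_GENERATION = "text-generation"

task = Task.TEXT_GENERATION
if task != Task.TEXT_GENERATION:
    raise ValueError(f"Only text-generation is supported for llm benchmark; got {task}.")
```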
📝 Recommendations
✨ Overall Assessment

The changes are solid and achieve the stated goal of reading task from ModelInfo. The refactoring in setup_model() to return the full ModelInfo object is a clean approach, and the implementation is correct. Nice work! 🎯
Pull request overview
This PR refactors tt-xla benchmark scripts to read task information from the ModelInfo object instead of hardcoding task strings, aiming to standardize task categorization across benchmarks.
Key Changes
- Modified setup_model() in vision_benchmark.py to return the full ModelInfo object instead of just the model name string
- Updated both llm_benchmark.py and vision_benchmark.py to extract task values from model_info.task using string conversion
- Changed how the model name is accessed from model_info (now using the .name attribute)
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| benchmark/tt-xla/vision_benchmark.py | Modified setup_model() to return ModelInfo object; updated task and name extraction to use ModelInfo attributes |
| benchmark/tt-xla/llm_benchmark.py | Added model_info retrieval and updated task extraction to use model_info.task attribute |
```diff
     if model_variant:
-        print(f"Loading model {model_loader.get_model_info(variant=model_variant).name}...")
+        model_info = model_loader.get_model_info(variant=model_variant)
+        print(f"Loading model {model_info.name}...")
         model = model_loader.load_model()
-        model_info = model_loader.get_model_info(model_variant).name
     else:
-        print(f"Loading model {model_loader.get_model_info().name}...")
+        model_info = model_loader.get_model_info()
+        print(f"Loading model {model_info.name}...")
         model = model_loader.load_model()
```
Copilot AI · Dec 25, 2025
There is duplicated code between the if and else branches. The only difference is whether variant=model_variant is passed to get_model_info(). This duplication can be eliminated by conditionally building the arguments or restructuring the code to reduce redundancy and improve maintainability.
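A minimal sketch of one such restructuring (fragment; assumes model_loader and model_variant from the surrounding function, as in the diff above):

```python
# Build the optional keyword arguments once, then share a single code path.
info_kwargs = {"variant": model_variant} if model_variant else {}
model_info = model_loader.get_model_info(**info_kwargs)
print(f"Loading model {model_info.name}...")
model = model_loader.load_model()
```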
```diff
-    full_model_name = f"{model_info}"
-    model_type = "Vision, Random Input Data"
+    full_model_name = model_info.name
+    model_type = str(model_info.task)
```
Copilot AI · Dec 25, 2025
Converting the task enum to string using str() may not produce the desired output format. Depending on how the Task enum is defined, str() might return something like "Task.TEXT_GENERATION" instead of just "text-generation". Consider using .value or .name attribute of the enum if available, or ensure the enum's __str__ method is properly defined to return the expected format. This affects consistency with how task values are stored in benchmark results.
Suggested change:

```diff
-    model_type = str(model_info.task)
+    model_type = getattr(model_info.task, "value", str(model_info.task))
```
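For illustration, a plain Python Enum reproduces the concern (the actual forge Task enum may override __str__):

```python
from enum import Enum

# Stand-in for the forge Task enum; illustrative only.
class Task(Enum):
    TEXT_GENERATION = "text-generation"

print(str(Task.TEXT_GENERATION))   # -> "Task.TEXT_GENERATION"
print(Task.TEXT_GENERATION.value)  # -> "text-generation"
```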
| model_type = "text-generation" | ||
| model_info = model_loader.get_model_info(variant=model_variant) | ||
| full_model_name = model_info.name | ||
| model_type = str(model_info.task) |
Copilot AI · Dec 25, 2025
Same concern as in vision_benchmark.py: str() on the task enum may produce "Task.TEXT_GENERATION" rather than "text-generation"; prefer .value if available, or ensure the enum's __str__ returns the expected format.
Suggested change:

```diff
-    model_type = str(model_info.task)
+    model_type = getattr(model_info.task, "value", str(model_info.task))
```
```diff
-def setup_model(model_loader, model_variant=None, data_format="bfloat16") -> tuple[torch.nn.Module, str]:
+def setup_model(model_loader, model_variant=None, data_format="bfloat16"):
```
Copilot AI · Dec 25, 2025
The function signature no longer includes a type hint for the return value. The docstring states the function returns "Tuple of (model, model_info)" but the signature should include the return type annotation for better type checking and code documentation. Consider adding -> tuple[torch.nn.Module, ModelInfo] or the appropriate type for model_info.
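A sketch of the suggested annotation (the ModelInfo import path is an assumption; point it at wherever the forge ModelInfo class actually lives):

```python
from typing import TYPE_CHECKING

import torch

if TYPE_CHECKING:
    from tt_forge_models.config import ModelInfo  # hypothetical import path

def setup_model(
    model_loader, model_variant=None, data_format="bfloat16"
) -> "tuple[torch.nn.Module, ModelInfo]":
    ...
```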
Summary
Implements issue #759 to read the task field from the model's ModelInfo object instead of hardcoding task strings in tt-xla benchmarks.
Changes
- llm_benchmark.py: read model_info.task instead of hardcoding "text-generation"
- vision_benchmark.py: read model_info.task instead of hardcoding "Vision, Random Input Data"
- setup_model() now returns the full ModelInfo object instead of just the name

Benefits

- Standardizes task categorization using the forge Task enum instead of hardcoded strings
- Enables better querying of benchmark results in Superset
Testing
Performance benchmark workflow will be triggered to verify the task field is correctly populated in the benchmark result JSON.
Fixes #759
🤖 Generated with Claude Code