[MCP Integration] Phase 2: Data models and constants#2235
[MCP Integration] Phase 2: Data models and constants#2235aniruddh-alt merged 12 commits intoani/mcp-integration-01-scaffoldfrom
Conversation
|
Important Upgrade your plan to unlock code review, CI analysis, custom rules, and more. |
f38a2e4 to
8e7a6b1
Compare
70e68d6 to
b35dc2d
Compare
cbe787c to
4d3461e
Compare
b35dc2d to
79de029
Compare
Add Pydantic models (JobRecord, PreFlightSummary, CloudValidation, etc.) and constants (cloud provider configs, resource specs, validation rules) that form the shared foundation for all MCP services. Part of the MCP integration PR chain (Phase 2 of 10). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add missing "eval" to DATA_SPLITS constant - Convert mutable list constants to tuples - Import TypedDict from typing (Python 3.12+)
4d3461e to
89c8856
Compare
Pydantic requires typing_extensions.TypedDict (not typing.TypedDict) on Python < 3.12 when NotRequired fields are present. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
src/oumi/mcp/constants.py
Outdated
| "pretraining", | ||
| "synthesis", | ||
| "quantization", | ||
| "", |
There was a problem hiding this comment.
do we need this empty string here?
There was a problem hiding this comment.
We can remove them, they were meant to be a fallback but it is not needed.
src/oumi/mcp/constants.py
Outdated
|
|
||
| # The oumi release version that the bundled configs/ directory corresponds to. | ||
| # Update this constant whenever the bundled configs are refreshed from a tag. | ||
| BUNDLED_OUMI_VERSION = "0.7" |
There was a problem hiding this comment.
Can we take a dependency on where this is defined in the library or have an automatic sync check to run whenever one of these changes to ensure the other is also changed?
src/oumi/mcp/models.py
Outdated
| torch_dtype: PyTorch data type (e.g., bfloat16, float32). | ||
| """ | ||
|
|
||
| learning_rate: NotRequired[float] |
There was a problem hiding this comment.
Our standard is typically to use Optional, is there a difference with this type?
There was a problem hiding this comment.
NotRequired means that the field doesn't need to be present as opposed to Optional which needs the field to be present but it can be None. It adds convenience but can be updated to Optional to maintain uniformity.
src/oumi/mcp/models.py
Outdated
| e.g. {"torch": "2.3.0"}. | ||
| """ | ||
|
|
||
| accelerator_type: str |
Co-authored-by: Aniruddhan Ramesh <aniruddhanramesh@Aniruddhans-MacBook-Pro.local> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
| ), | ||
| timeout=30.0, | ||
| ) | ||
| except TimeoutError: |
There was a problem hiding this comment.
Bug: The code catches the built-in TimeoutError, but asyncio.wait_for raises asyncio.TimeoutError on Python < 3.11, which is a different, uncaught exception type.
Severity: HIGH
Suggested Fix
In src/oumi/mcp/job_service.py, change the exception handling from except TimeoutError: to except asyncio.TimeoutError: to correctly catch the timeout exception raised by asyncio.wait_for across all supported Python versions.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: src/oumi/mcp/job_service.py#L519
Potential issue: The `cancel_job_impl` function uses `try...except TimeoutError` to
handle timeouts from `asyncio.wait_for`. In Python versions before 3.11,
`asyncio.wait_for` raises `asyncio.TimeoutError`, which is a distinct exception and does
not inherit from the built-in `TimeoutError`. Consequently, if the 30-second timeout is
reached during a job cancellation on an environment running Python < 3.11, the exception
will not be caught. This will propagate as an unhandled exception, causing the async
handler to crash and potentially leading to a server failure. The codebase shows
awareness of this distinction elsewhere, using `asyncio.TimeoutError` correctly in other
modules.
Description
Part of the MCP Integration PR chain (Phase 2 of 10) - Stage: foundation
What changed: Added Pydantic data models (
JobRecord,PreFlightSummary,CloudValidation, etc.) and constants (cloud provider configs, resource specs, validation rules) used by all MCP services.Why: These shared types and constants form the foundation that all subsequent service modules depend on.
Related issues
Before submitting