[python] fix TypedDict rendering for template-instantiated models#11161
[python] fix TypedDict rendering for template-instantiated models#11161l0lawrence wants to merge 8 commits into
Conversation
commit: |
|
All changed packages have been documented.
Show changes
|
|
You can try these changes here
|
…models sharing a cross-language id Template-instantiated models such as ResourceUpdateModel<Foo, FooProperties> all carry the template's crossLanguageDefinitionId, so the TypedDict dedup that keyed on it wrongly collapsed distinct models (e.g. CacheUpdate and VolumeUpdate) into one and dropped the rest from types.py, producing dangling _types.X refs. Key the dpg-vs-copy pairing on the model name instead, and harden the preprocess typeddict lookup to also match on name. Co-authored-by: Copilot App <223556219+Copilot@users.noreply.github.com>
5436bbe to
316d3a2
Compare
|
I dont think this is a complete solution because the other types of models -> typeddict conversions could fall into the same trap |
| """Models that should be rendered as TypedDicts (excluding discriminated bases which become unions). | ||
|
|
||
| When both a dpg model and its typeddict copy exist (same crossLanguageDefinitionId), | ||
| When both a dpg model and its typeddict copy exist for the same model, |
There was a problem hiding this comment.
bc crossLangDefId can apply to multiple models if they have the same base type but diff names? I feel like thats a weird case but it does seem like something that is existing in typespec today
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Spread (json) bodies keep the template clid but may be renamed to <Method>Request, so a clid-only lookup could reference the wrong sibling model's TypedDict. Add _find_spread_original which only reuses an original when the choice is unambiguous (name match, or a single clid candidate) and otherwise falls back to the json model itself. Add tests covering non-spread dpg reuse, list-element dpg copies, and spread single/name-match/ambiguous-renamed scenarios. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Response-only models (e.g. GetResponse) already render as classes in models/ and are referenced via _models.*, so emitting a TypedDict copy for them in types.py was dead code never referenced via _types.*. typeddict_models now keeps a model only when it is a typeddict copy (base=typeddict, always an input body overload), is_typed_dict_only (covers full typeddict mode incl. responses), or used as input. Spread request bodies are kept via the base check since tcgc marks their usage Json|Spread without the Input flag. Add is_usage_input helper and tests covering output-only exclusion, input/output inclusion, typeddict-mode inclusion, and the spread-copy (no Input flag) case. Verified across all azure + unbranded generated packages: no dangling _types.* references. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
An output-only discriminated base (e.g. Dinosaur = Union[TRex]) was emitted as a union alias in types.py while its subtype TRex was correctly excluded, causing NameError: name 'TRex' is not defined at import time. discriminated_base_models now applies the same input-only predicate as typeddict_models via a shared _renders_as_input_typeddict helper. typeddict_models additionally force-includes any subtype referenced by a rendered (input) discriminated base, so an input base's union alias never references an undefined name even if the subtype's own usage flags would exclude it. Add unit tests: output-only base excluded, input base included with subtypes, and input base force-including an output-only subtype. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Compute the models rendered in types.py as the reachability closure of the request-body input surface over base classes, discriminated subtypes and property types. This keeps output-only models that are referenced by an input model (e.g. an ARM `SystemData` property on `Resource`), fixing a pyright `reportUndefinedVariable` error, while still dropping response-only dead code. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
Fixes Python TypedDict rendering/selection in models-mode=dpg when template-instantiated models share a crossLanguageDefinitionId, preventing missing _types.X definitions and import-time crashes in generated SDKs.
Changes:
- Adjust TypedDict copy deduplication/matching to key on model
name(notcrossLanguageDefinitionId) for both preprocessing overload insertion andtypes.pyrendering. - Restrict
types.pyTypedDict/union-alias emission to the transitive closure of request-body “input surface” models, preventing dead/invalid output-only definitions. - Add targeted unit tests covering shared-CLID template instantiations, spread bodies, and reachability (incl. container element types).
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| packages/http-client-python/tests/unit/test_typeddict.py | Adds regression tests for shared-CLID models and input-surface reachability rules for types.py. |
| packages/http-client-python/tests/unit/test_typeddict_overloads.py | Adds tests ensuring overload preprocessing creates distinct TypedDict copies per model name (including list bodies and spread-body edge cases). |
| packages/http-client-python/generator/pygen/preprocess/init.py | Updates _find_existing_typeddict to also match by name and adds _find_spread_original to avoid ambiguous shared-CLID cross-references. |
| packages/http-client-python/generator/pygen/codegen/serializers/types_serializer.py | Switches TypedDict pairing to name-based dedup and introduces input-surface closure filtering for types.py emission. |
| packages/http-client-python/generator/pygen/codegen/models/model_type.py | Adds is_usage_input helper used by types_serializer filtering logic. |
| .chronus/changes/python-fix-typeddict-shared-clid-2026-6-2-0-0-0.md | Changelog entry documenting the fix and the new reachability-based emission behavior. |
Memoize _types_file_model_names via functools.cached_property so the model-graph reachability walk runs once per TypesSerializer instead of on every access from typeddict_models and discriminated_base_models. self._models is set at construction and never mutated, so per-instance caching is safe and the generated output is unchanged. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
fix #11149
Why
PR #10439 enabled TypedDict body-overload generation in the default
dpgmode. In real Azure mgmt libraries (e.g.azure-mgmt-netapp), this produced dangling_types.CacheUpdatereferences that were absent fromtypes.py, causing anAttributeErrorat import time.Root cause
Template-instantiated models all share the template's
crossLanguageDefinitionId. For example, everyAzure.ResourceManager.Foundations.ResourceUpdateModel<T, P>instantiation (CacheUpdate,VolumeUpdate,ActiveDirectoryConfigUpdate, ...) carriesclid = Azure.ResourceManager.Foundations.ResourceUpdateModel.The
typeddict_modelsdedup keyed the dpg-vs-copy pairing on thatclid, so it treated genuinely distinct models as duplicates and skipped all but the first one. In netapp,ActiveDirectoryConfigUpdatewas seen first, soCacheUpdatewas dropped fromtypes.pywhile still being referenced via_types.CacheUpdate.Fix
nameinstead ofcrossLanguageDefinitionId. The typeddict copy is a shallow copy of the source, so it shares the source's name; distinct template instantiations have distinct names, and model names are unique within atypes.pymodule._find_existing_typeddictin preprocess to also match onname(same root cause, tightly coupled).This keeps TypedDict generation on in
dpgmode (intentional per #10439) and just corrects the rendering so referenced TypedDicts actually land intypes.py.