Skip to content

Conversation

@martinwellman
Copy link
Collaborator

This PR adds support for mapping slots that have a range consisting of multiple enumerations.

Some important things to review:

  1. In object_transformer.py line 186, source_type (of type ElementName) is converted from a string with yaml.safe_load. The expected loaded value is a string (eg. 'myenum') or list of strings (eg. ['myenum1', 'myenum2', ...]). Is this the preferred way to do this?
  2. In object_transformer.py in function transform_enum: while iterating over the enum derivations, the first time we encounter a derivation with mirror_source=True we return str(source_value) (if no mapping has been successful yet). Should we instead iterate over all enum derivations, try to map the source_value (ignoring mirror_source), and only after failing all enum derivations should we return str(source_value) if any of the enum derivations have mirror_source=True?

Closes linkml/linkml#2128

@martinwellman
Copy link
Collaborator Author

Should add unit tests.

@martinwellman martinwellman marked this pull request as ready for review July 9, 2024 17:38
@amc-corey-cox
Copy link
Contributor

@martinwellman Thanks for working on this feature - supporting multiple enum ranges is a real need (linkml/linkml#2128).

I'm sorry to have taken so long to get back to you on this. After reviewing, I have some concerns about the approach:

Syntax Issue

The PR uses range: [enum1, enum2] which isn't standard LinkML syntax. LinkML coerces this to a string "['enum1',
'enum2']", which is why yaml.safe_load is needed to parse it back.

The proper LinkML way to specify multiple enum ranges is any_of:

slots:
my_slot:
range: Any
any_of:
- range: Enum1
- range: Enum2

With this syntax, you'd access the enums via slot.any_of[*].range directly - no string parsing needed.

Answers to your questions:

  1. Is yaml.safe_load the right way? - It works around the non-standard syntax, but with proper any_of support,
    you'd iterate slot.any_of directly.
  2. Should mirror_source=True return immediately? - Best practice would be to try all enum derivations first,
    then fall back to mirroring only if none match. The current implementation could skip valid mappings in later
    enums.

Recommendation

I'd like to see a reworking of this to support the standard any_of syntax instead. The feature is definitely needed, but
building on non-standard syntax creates a shaky foundation.

Would you be interested in updating the PR to use any_of, or would you prefer we take it from here?

@martinwellman
Copy link
Collaborator Author

@amc-corey-cox Thanks for the response. I realized eventually that I should be using "any_of" instead of having "range" equal to an array. I have a newer working version locally that uses "any_of" that I've been using for our project. The problem is that there are other parts of LinkML-Map that would need to be updated as well to handle multiple enumerations for ranges. For example, the code for determining the reverse transformation as well as for making a graph of the transformations (there might have been others as well, I can't remember). Would you want these other parts to be updated to support "any_of" with this PR as well?

There was also another problem I had to deal with related to "any_of", but I'll have to look back at my notes to recall what it was, since it's been a while.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Mapping slots with multiple enumerations as a range

3 participants