Skip to content

Feat 239 Update Smartspim Imaging ETL #281

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 25 commits into from
May 27, 2025
Merged

Conversation

mekhlakapoor
Copy link
Collaborator

@mekhlakapoor mekhlakapoor commented Apr 24, 2025

closes #239

This PR:

  • updates the SmartSPIM Imaging ETL to match Mike's script
  • removes mapping data from processing manifest in the ETL (left it in AcquisitionJobSettings for backwards compatibility but it will not be used!)
  • integrates data from SLIMS into Acquisition model (in _transform method)
  • Adds example_acquisition.json and tests. If the example is too large, we can remove it

Example usage:

from aind_metadata_mapper.smartspim.models import JobSettings
from aind_metadata_mapper.smartspim.acquisition import SmartspimETL

settings = JobSettings(
subject_id="786846",
 metadata_service_domain="http://aind-metadata-service", 
input_source="input_dir", output_directory="."
)
etl = SmartspimETL(settings)
etl.run_job() # Writes etl to output dir

x_res = y_res = session_config.get("µm/pix")
z_res = session_config.get("z_step_um")
x_res = y_res = session_config.get("um/pix")
z_res = session_config.get("Z step (um)")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The key in session_config for "Z step" was different in Mike's script so I updated it here. Let me know if this shouldnt be changed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should change it. @miketaormina correct me if I am wrong, but these keys depend on the version of the microscope software dataset we are going to upload. I think we can stay with the latest since we have already uploaded old datasets.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this is really confusing. It looks like

  1. the original JSON/TXT files it is Z step (um), which depends on the version like @camilolaiton says.
  2. The function called on a JSON version at this line renames this key to z_step_um. This then works without this proposed change.
  3. In the event that the JSON file doesn't exist, the script reads the .txt file and I think would fail unless the proposed change were implemented.

In short, the code should keep the z_res = session_config.get("z_step_um") and we need to better handle the reading of a TXT file. BUT I'm not sure how this is working with the $\mu$ symbol. When I look at a session_config object created with the current script, I see an entry 'm/pix': '1.800'

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function that reads the JSON is here, which isn't what I expected. . . I know we've had a bout 4 different ways of dealing with this weird encoding examples

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry this is so confusing @mekhlakapoor , but I think that this current PR should keep z_step_um but change µm/pix to um/pix

This gets us to a presumbly working state and the issue with reading a txt instead of JSON can be a. separate issue on the repo

@mekhlakapoor mekhlakapoor marked this pull request as draft April 25, 2025 20:26
@mekhlakapoor mekhlakapoor marked this pull request as ready for review April 25, 2025 21:18
default="derivatives/processing_manifest.json",
description=("Deprecated, use metadata_service_path instead. "),
)
metadata_service_domain: str
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

separated metadata_service_domain and metadata_service_path to match pattern of GatherMetadata JobSettings. However, that might be unnecessary so let me know and I can just make it one with no default, or default to "http://aind-metadata-service/slims/smartspim_imaging

@mekhlakapoor mekhlakapoor requested a review from miketaormina May 5, 2025 22:42
class SlimsImmersionMedium(Enum):
"""Enum for the immersion medium used in SLIMS."""

DIH2O = "diH2O"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way in how we can do something like aind-data-schema-models where we can simply have a CSV? @miketaormina do you think we will upgrade/change the mediums in the future?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The enum was to validate the values coming out of SLIMS and map it, but we can do a str check instead

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enumerating the Cargille oil like this is weird for two reasons:

  1. The catalogue number from the manufacturer is the same for both of these "1.53" and "1.52," they just have different values of refractive index.
  2. The numerical value of refractive index is additionally measure in another field (or at least used to be).
  3. Most acquisitions these days use an oil with RI=1.5170.

I think this is fine if it is what matches SLIMS, but that might mean that SLIMS needs updated. I thought that schema was built like that already (separating the immersion type from the RI value: schema

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This enum is what is in SLIMS right now. (It puts them together). The response from SLIMS is then mapped to the schema (with the separation of immersion type and RI value). We can ask for SLIMS to be updated

Copy link
Contributor

@camilolaiton camilolaiton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost everything looks good to me, except the z_step_um where I mentioned @miketaormina for his knowledge on the microscope side.

x_res = y_res = session_config.get("µm/pix")
z_res = session_config.get("z_step_um")
x_res = y_res = session_config.get("um/pix")
z_res = session_config.get("Z step (um)")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should change it. @miketaormina correct me if I am wrong, but these keys depend on the version of the microscope software dataset we are going to upload. I think we can stay with the latest since we have already uploaded old datasets.

@mekhlakapoor mekhlakapoor marked this pull request as draft May 8, 2025 20:20
@mekhlakapoor mekhlakapoor marked this pull request as ready for review May 8, 2025 21:06
@mekhlakapoor mekhlakapoor requested a review from camilolaiton May 8, 2025 21:07
@@ -153,11 +160,13 @@ def make_acq_tiles(metadata_dict: dict, filter_mapping: dict):
)

if z_res is None:
z_res = session_config.get("Z step (m)")
z_res = session_config.get("Z step (um)")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@miketaormina @camilolaiton changed the unit here. Let me know if this was a bug or if it needs to be able to check for Z step (m) too

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the z_step_um was failing because the other repository changed the key here. So the file should read Z step (um), but was read into a dictionary as z_step_um.

But I think that the if statement here is meant to catch a different problem, namely that the text file encoding for the micron character has always been slippery, so there were cases where the character just gets dropped instead of replaced with a "u."

So I think we mislead you on the correct value of z_res. We need to either change that key in the JSON file, or change this repo to z_res = session_config.get("Z step (um)"), as you had before. In that scenario, line 163 would read z_res = session_config.get("Z step (m)") as a catch on the character encoding of µ

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about I check all 3? Z step (um), z_step_um, and then lastly Z step(m) ?

Copy link
Collaborator

@miketaormina miketaormina left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the JSON key issue was solved (Z step (um) etc.). I am okay with the rest of this, if it has been tested as creating a valid acquisition object

class SlimsImmersionMedium(Enum):
"""Enum for the immersion medium used in SLIMS."""

DIH2O = "diH2O"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enumerating the Cargille oil like this is weird for two reasons:

  1. The catalogue number from the manufacturer is the same for both of these "1.53" and "1.52," they just have different values of refractive index.
  2. The numerical value of refractive index is additionally measure in another field (or at least used to be).
  3. Most acquisitions these days use an oil with RI=1.5170.

I think this is fine if it is what matches SLIMS, but that might mean that SLIMS needs updated. I thought that schema was built like that already (separating the immersion type from the RI value: schema

@mekhlakapoor
Copy link
Collaborator Author

Add an optional field in JobSettings for a user to define a "slims_date". As default, SLIMS should check against the microscope datetimes. If SLIMS date is supplied, check gte and lte slims_date

@mekhlakapoor mekhlakapoor marked this pull request as draft May 20, 2025 23:03
@mekhlakapoor mekhlakapoor marked this pull request as ready for review May 21, 2025 20:54
@mekhlakapoor mekhlakapoor merged commit b8f4807 into dev May 27, 2025
3 checks passed
@mekhlakapoor mekhlakapoor deleted the feat-239-smartspim-imaging branch May 27, 2025 16:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Imaging from SLIMS to Smartspim ETL
3 participants