Add regression test for Unicode preservation in IA subject imports #11639

pranitaurlam · 2025-12-30T15:56:34Z

This PR adds a regression test to assert that Unicode characters in subjects
are preserved during IA imports.

While investigating #11637, I traced Unicode handling through MARC parsing,
edition building, and import validation, and did not find any ASCII
normalization or character stripping on the Open Library side. Adding this
test helps make the issue reproducible and prevents future regressions
without guessing at a fix.

Related: #11637
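
For illustration only, here is a minimal sketch of the kind of assertion such a regression test could make. The `normalize_subjects` helper is a hypothetical stand-in for the import path, not Open Library's actual API, and the sample subjects are made up for the example; the test added in this PR may look different.

```python
import unicodedata


def normalize_subjects(subjects):
    """Hypothetical stand-in for the import path: trims whitespace and
    drops empty entries, but must leave non-ASCII characters alone."""
    return [s.strip() for s in subjects if s and s.strip()]


def is_unicode_preserved(before, after):
    """True if `after` equals `before` under NFC, i.e. no characters were
    stripped or transliterated to ASCII."""
    return unicodedata.normalize("NFC", before) == unicodedata.normalize("NFC", after)


def test_unicode_subjects_preserved():
    # Illustrative subjects only; not taken from a real IA record.
    subjects = ["Français (langue)", "Ελληνική γλώσσα", " 日本語 "]
    result = normalize_subjects(subjects)
    assert len(result) == len(subjects)
    for before, after in zip(subjects, result):
        assert is_unicode_preserved(before.strip(), after)
        # Guard against ASCII folding: non-ASCII input must stay non-ASCII.
        assert not after.isascii()
```

Comparing under NFC keeps the check from failing on harmless differences between composed and decomposed forms, while still catching dropped or ASCII-folded characters.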

pranitaurlam · 2025-12-30T15:58:22Z

I’ve opened a pull request related to this issue #11637.
Please let me know if you’d like me to make any changes or investigate further.

tfmorris · 2025-12-30T19:07:31Z

You might want to double-check that your branch contains all the commits you intended, because it appears to be empty.

pranitaurlam and others added 2 commits December 30, 2025 21:24

Add regression test for Unicode preservation in IA subject imports

b499edf

[pre-commit.ci] auto fixes from pre-commit.com hooks

c2400ba

for more information, see https://pre-commit.ci

pranitaurlam mentioned this pull request Dec 30, 2025

Bad MARC data is being imported #11613

Open

github-actions bot added the "Needs: Response" label (Issues which require feedback from lead) Dec 31, 2025
