Llmo 3555 readability exclusions #2396

Open
noruiz wants to merge 11 commits into code-freeze-apr-2026 from LLMO-3555-readability-exclusions

Conversation


@noruiz noruiz commented Apr 16, 2026

Please ensure your pull request adheres to the following guidelines:

  • make sure to link the related issues in this description
  • when merging / squashing, make sure the fixed issue references are visible in the commits, for easy compilation of release notes
  • If data sources for any opportunity have been updated/added, please update the wiki for the same opportunity.

Related Issues

https://jira.corp.adobe.com/browse/LLMO-3555

Thanks for contributing!

noelruiz34 and others added 9 commits April 16, 2026 08:55
Please ensure your pull request adheres to the following guidelines:
- [ ] make sure to link the related issues in this description
- [ ] when merging / squashing, make sure the fixed issue references are
visible in the commits, for easy compilation of release notes
- [ ] If data sources for any opportunity has been updated/added, please
update the
[wiki](https://wiki.corp.adobe.com/display/AEMSites/Data+Sources+for+Opportunities)
for same opportunity.

## Related Issues


Thanks for contributing!

---------

Co-authored-by: Rares Cheseli <rcheseli@adobe.com>
# [1.416.0](v1.415.3...v1.416.0) (2026-04-16)

### Features

* Add source in URL when fetching brand presence files ([#2394](#2394)) ([30ac0e4](30ac0e4))
@noruiz noruiz self-assigned this Apr 16, 2026
codecov bot commented Apr 16, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


Comment thread src/readability/shared/analysis-utils.js Outdated
*/
const CITATION_EXCLUSION_PATTERNS = [
/doi:\s*10\.\d{4,}/i,
/\d{4};\s*(doi:|https?:\/\/)/i,

This pattern matches `year; URL` anywhere in the text. A sentence like:

"The platform launched in 2022; https://results.example.com/ published the first results."

would be incorrectly excluded. The negative test on line 575 only covers `year;` followed by text with no URL. The risky case, `year; https://` in normal prose, is untested and unguarded.
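
To make the concern concrete, here is a minimal sketch that reproduces the false positive with the pattern quoted above and compares it against one possible tighter variant. The name `YEAR_THEN_LINK_TAIL`, the end-of-text anchor, and the sample citation string are assumptions for illustration only, not something proposed in this PR.

```javascript
// Pattern as quoted in this review thread.
const YEAR_THEN_LINK = /\d{4};\s*(doi:|https?:\/\/)/i;

// Hypothetical tighter variant (an assumption, not the PR's fix): the DOI or
// URL must be the last token of the text, i.e. a citation tail.
const YEAR_THEN_LINK_TAIL = /\d{4};\s*(?:doi:\s*\S+|https?:\/\/\S+)\s*$/i;

// Sentence from the review comment: ordinary prose, should not be excluded.
const PROSE_SAMPLE =
  'The platform launched in 2022; https://results.example.com/ published the first results.';

// Made-up citation tail, used only to show what should still match.
const CITATION_SAMPLE = 'Smith J, et al. J Example Med. 2021; doi: 10.1000/example123';

console.log(YEAR_THEN_LINK.test(PROSE_SAMPLE));         // current pattern on prose
console.log(YEAR_THEN_LINK_TAIL.test(PROSE_SAMPLE));    // anchored variant on prose
console.log(YEAR_THEN_LINK_TAIL.test(CITATION_SAMPLE)); // anchored variant on citation
```

Anchoring to the end of the text is only one option; whichever guard is chosen, a negative test using the prose sentence above would cover the gap described here.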

Comment thread src/readability/shared/analysis-utils.js Outdated
// include a citation tail (e.g. DOI) that would wrongly drop the whole element.
if ($el.html().includes('<br')) {
return true;
}

The same situation applies to the `<br>`-split paragraph filter (`handler.js`). If a new exclusion signal is added in the future, both sites will need updating. Consider extracting the check into a shared helper.
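
A shared helper along these lines could look roughly like the sketch below. All function names (`isExcludedText`, `keepElement`, `keepParagraphs`) and the string-based signatures are hypothetical; the real code works on cheerio elements, and the pattern list here is reduced to a single entry from the diff for brevity.

```javascript
// Single source of truth for exclusion signals (shortened, illustrative list).
const EXCLUSION_PATTERNS = [/doi:\s*10\.\d{4,}/i];

// Hypothetical shared helper: true when the text carries any exclusion signal.
function isExcludedText(text) {
  return EXCLUSION_PATTERNS.some((pattern) => pattern.test(text));
}

// Call site 1 (element-level filter): elements containing <br> are kept so
// their paragraphs can be judged individually later.
function keepElement(html, text) {
  if (html.includes('<br')) {
    return true;
  }
  return !isExcludedText(text);
}

// Call site 2 (<br>-split paragraph filter): reuses the same helper, so a new
// exclusion signal only has to be added to EXCLUSION_PATTERNS.
function keepParagraphs(html) {
  return html
    .split(/<br\s*\/?>/i)
    .map((part) => part.trim())
    .filter((part) => part.length > 0 && !isExcludedText(part));
}

console.log(keepParagraphs('Intro sentence.<br>Ref 2021; doi: 10.1000/example123'));
```

With both filters delegating to one predicate, adding a signal becomes a one-line change and the two sites cannot drift apart.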

Comment thread src/readability/shared/analysis-utils.js Outdated
@github-actions
This PR will trigger a minor release when merged.

6 participants