Add Pseudonymization to CLI #13158

Merged
merged 14 commits on Jun 2, 2025

Conversation

paudelritij
Contributor

@paudelritij paudelritij commented May 24, 2025

Closes #13109

This PR adds a new jabkit CLI command 'pseudonymize'.

jabkit pseudonymize  
--input /path/to/library.bib 
[--output /path/to/pseudonymized-library.bib] 
[--key /path/to/pseudonymized-keys.csv] 
[-f | --force]

Parameters:

--input <file> : (required) Path to the input BibTeX file to pseudonymize.
--output <file> : Path to save the pseudonymized BibTeX file. (Default: <input-filename>.pseudo.bib)
--key <file> : Path to save the entries key mapping CSV file. (Default: <input-filename>.pseudo.csv)
-f, --force : Overwrite output file(s) if they already exist.

Example

jabkit pseudonymize --input myLib.bib --output myLib.pseudo.bib --key myLib.pseudo.csv --force

Mandatory checks

  • I own the copyright of the code submitted and I license it under the MIT license
  • Change in CHANGELOG.md described in a way that is understandable for the average user (if change is visible to the user)
  • Tests created for changes (if applicable)
  • Manually tested changed features in running JabRef (always required)
  • [/] Screenshots added in PR description (if change is visible to the user)
  • Checked developer's documentation: Is the information available and up to date? If not, I outlined it in this pull request.
  • Checked documentation: Is the information available and up to date? If not, I created an issue at https://github.com/JabRef/user-documentation/issues or, even better, I submitted a pull request to the documentation repository.

- A new pseudonymize command has been created.
- Test cases have been added.
- The display of commands has been sorted for ease of use.
- An entry has been added to CHANGELOG.md.
Member

@koppor koppor left a comment

Quick check while on the road. Maybe you can address this before a full review?

@paudelritij
Contributor Author

@calixtus For the output file, should I check whether it already exists (when the output option is not passed) before parsing the result, or at the end where the new database is saved? And should I ask the user whether to overwrite, or generate numbered file names automatically (origin_pseudo_1.bib, origin_pseudo_2.bib, ...)?

@calixtus
Member

Test the logic you implement, not the logic you import through dependencies. If you feel that this is a non-trivial test, then write it. If you are just testing already-tested logic, it's superfluous.

@calixtus
Member

This might not be real TDD, but real TDD is too much noise anyway imho.

@koppor koppor mentioned this pull request May 26, 2025
1 task
@koppor
Member

koppor commented May 27, 2025

@paudelritij It would be nice if you wrote the new command in the PR description; then it is easier for non-programmers to follow...

@calixtus For the output file, should I check whether it already exists (when the output option is not passed) before parsing the result, or at the end where the new database is saved? And should I ask the user whether to overwrite, or generate numbered file names automatically (origin_pseudo_1.bib, origin_pseudo_2.bib, ...)?

I answered in-line, but I repeat:

  1. Error on file existence.
  2. Offer flag --force (short: -f) for overwrite.
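
The two rules above (fail if an output file exists, unless --force is given) can be sketched in plain Java as follows. This is only an illustration under assumptions, not the code merged in this PR: the class name OverwriteCheck and the method blockingFiles are hypothetical, and only standard java.nio.file calls are used.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;

public class OverwriteCheck {

    /**
     * Returns the output files that already exist and would therefore
     * block the run. With force set, nothing blocks the run.
     * (Hypothetical helper, not part of JabRef.)
     */
    static List<Path> blockingFiles(List<Path> outputs, boolean force) {
        if (force) {
            return List.of();
        }
        return outputs.stream()
                      .filter(Files::exists)
                      .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("pseudo-demo");
        Path bib = Files.createFile(dir.resolve("myLib.pseudo.bib")); // already exists
        Path csv = dir.resolve("myLib.pseudo.csv");                   // does not exist yet
        List<Path> outputs = List.of(bib, csv);

        for (boolean force : new boolean[] {false, true}) {
            List<Path> blocking = blockingFiles(outputs, force);
            if (blocking.isEmpty()) {
                System.out.println("force=" + force + ": ok to write");
            } else {
                System.out.println("force=" + force + ": refusing, already exists: " + blocking);
            }
        }
    }
}
```

Checking all output files in one place, before anything is written, avoids the case where the .bib file is saved and the run then aborts on an existing .csv file.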

@paudelritij
Contributor Author

Thank you for the comprehensive review. I already implemented some of the suggestions yesterday and will make the remaining changes as soon as possible.

@calixtus
Member

Don't forget to push after committing, so we can follow.

- Implement ADR 0045
- Add -f / --force flag to overwrite if file exists
- Reformat saving of database to save meta-data as well
- Add methods from FileUtil
- Reformat CHANGELOG.md entry
- Add JabRef_en.properties entry
- Add a comment
- Remove duplicate file Chocolate.bib
- Improve logger and localization lang
- Remove trivial test case
- Rename all occurrences of the word anon. to pseudo
@paudelritij paudelritij marked this pull request as ready for review May 29, 2025 21:51
@paudelritij paudelritij marked this pull request as draft May 29, 2025 22:46
@koppor
Member

koppor commented May 29, 2025

For the ADR, one needs to add this as a dependency to build.gradle.kts - just copy and paste from another build.gradle.kts with this dependency.

@paudelritij paudelritij marked this pull request as ready for review May 30, 2025 00:26
@paudelritij paudelritij requested a review from koppor June 1, 2025 15:27
Member

@koppor koppor left a comment

"Overwrite output file(s) if it exist" reads strange. Replace it with:

Overwrite output file(s) if any exist(s)

The path is not resolved correctly.

If no output file names are given, the output should be stored next to the input file.

./gradlew :jabkit:run --args="pseudonymize --input=C:\git-repositories\jabref-all\jabref-demo-libraries\chocolate\Chocolate.bib"

Results in output in jabkit folder - very strange
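
A likely explanation, stated here as an assumption: a bare file name such as Chocolate.pseudo.bib is a relative path, and relative paths are resolved against the working directory of the process, which for the Gradle run above was presumably the jabkit folder. A minimal Java sketch of deriving the default output path next to the input file instead is shown below. The class name DefaultOutputPath and the method siblingWithSuffix are hypothetical and are not the code from this PR.

```java
import java.nio.file.Path;

public class DefaultOutputPath {

    /**
     * Replaces the last extension of the input file name with newSuffix
     * (for example ".pseudo.bib") and keeps the input file's directory.
     * (Hypothetical helper, not part of JabRef.)
     */
    static Path siblingWithSuffix(Path input, String newSuffix) {
        String name = input.getFileName().toString();
        int dot = name.lastIndexOf('.');
        // A name without an extension (or a dot file) is used as the base unchanged.
        String base = dot > 0 ? name.substring(0, dot) : name;
        // resolveSibling resolves against the parent directory of the input,
        // not against the current working directory.
        return input.resolveSibling(base + newSuffix);
    }

    public static void main(String[] args) {
        Path input = Path.of("libraries", "chocolate", "Chocolate.bib");
        System.out.println(siblingWithSuffix(input, ".pseudo.bib"));
        System.out.println(siblingWithSuffix(input, ".pseudo.csv"));
    }
}
```

With this approach, both default files land in the same directory as the input library, regardless of where the command is started from.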

- Default output file path is now the same as input file location if not specified.
- Created a single method to check for file existence.
Member

@koppor koppor left a comment

On CLI output: a trailing "..." indicates that something is in progress.

Instead of

Pseudonymizing library 'Chocolate'.
Saving: C:\git-repositories\jabref-all\jabref-demo-libraries\chocolate\Chocolate.pseudo.bib.
Saving: C:\git-repositories\jabref-all\jabref-demo-libraries\chocolate\Chocolate.pseudo.csv.

please use

Pseudonymizing library 'Chocolate'...
Saved C:\git-repositories\jabref-all\jabref-demo-libraries\chocolate\Chocolate.pseudo.bib.
Saved C:\git-repositories\jabref-all\jabref-demo-libraries\chocolate\Chocolate.pseudo.csv.

Saving does not take long, so the message can be printed after a successful save.

@paudelritij paudelritij force-pushed the fix-for-issue-13109 branch from f36591c to 39a904e Compare June 2, 2025 00:34

trag-bot bot commented Jun 2, 2025

@trag-bot didn't find any issues in the code! ✅✨

@koppor koppor enabled auto-merge June 2, 2025 12:37

trag-bot bot commented Jun 2, 2025

@trag-bot didn't find any issues in the code! ✅✨

@koppor koppor added this pull request to the merge queue Jun 2, 2025
Merged via the queue into JabRef:main with commit 7d9815c Jun 2, 2025
47 checks passed
Siedlerchr added a commit to brandon-lau0/jabref that referenced this pull request Jun 2, 2025