Investigate RSyntaxTextArea for RUPS by iText-CI · Pull Request #155 · itext/rups

iText-CI · 2025-01-22T23:08:11Z

Programmatically created Pull Request to automatically keep merge branch to develop up-to-date

Initially the idea was to just use PdfTokenizer from iText, but it had some problems: it does not preserve token positions in the input data, skips whitespace outright, raises hard errors on invalid input, etc. For more examples, see the PdfContentStreamParser docs. This parser will be used as a reference model for the editor. Ideally we would have the same model in the editor to limit memory consumption, but having a separate one allows us some flexibility, which might be useful when implementing static analysis.
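To illustrate what "preserving token positions and whitespace" means in practice, here is a minimal sketch. It is not the PdfContentStreamParser from this PR: `PositionTokenizer`, `SimpleToken` and `Kind` are hypothetical names, and the tokenizer only splits the input into whitespace runs and non-whitespace runs, assuming the six PDF whitespace bytes (0, 9, 10, 12, 13, 32).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: every input byte ends up in exactly one token,
// and each token remembers where it came from.
final class PositionTokenizer {
    enum Kind { WHITESPACE, OTHER }

    static final class SimpleToken {
        final Kind kind;
        final int offset; // position of the first byte in the input
        final int length; // number of bytes covered by the token

        SimpleToken(Kind kind, int offset, int length) {
            this.kind = kind;
            this.offset = offset;
            this.length = length;
        }
    }

    // PDF whitespace: NUL, TAB, LF, FF, CR, SPACE.
    static boolean isPdfWhitespace(byte b) {
        return b == 0 || b == 9 || b == 10 || b == 12 || b == 13 || b == 32;
    }

    static List<SimpleToken> tokenize(byte[] data) {
        List<SimpleToken> tokens = new ArrayList<>();
        int i = 0;
        while (i < data.length) {
            int start = i;
            boolean ws = isPdfWhitespace(data[i]);
            // Extend the run while the whitespace/non-whitespace class stays the same.
            while (i < data.length && isPdfWhitespace(data[i]) == ws) {
                i++;
            }
            tokens.add(new SimpleToken(ws ? Kind.WHITESPACE : Kind.OTHER, start, i - start));
        }
        return tokens;
    }
}
```

Because whitespace is kept as tokens and nothing is rejected, concatenating the token ranges always reproduces the original bytes, which is the property an editor model needs.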

This is one of the first steps of integrating RSyntaxTextArea for PDF content stream editing. Tokenization is based on the parsing logic, which was added in the previous commit.

Since PDF content streams are, in the general case, not text but binary data, some workarounds had to be made, as RSyntaxTextArea was designed to work with text. A custom token type and painter were created, so that we could render arbitrary characters not as text, but as hexadecimal representations of the binary data. Additionally, Latin1Filter was added so that we could somewhat trick RSyntaxTextArea into working with binary data. It replaces any characters that are not representable in Latin-1 (i.e. U+0100 and beyond) with their UTF-8 representation, but with each byte stored in its own char. As a result, every char in the document fits in one byte, and the backing character array for the document acts as an inflated byte array. This way we can just encode the text with Latin-1 to get the expected output, which can be put directly into the PDF.

With how RSyntaxTextArea is structured, quite a lot of code from there has to be copied, as inheritance is not "granular" enough to do what we want.

Since the input stream is no longer processed by iText, what you see in the editor pane is the raw, unmodified data of the stream itself. This was one of the goals, as previously, opening a stream for editing made it pretty much impossible to save it without altering at least the whitespace in some way.

As of now, there are some regressions. For example:
1. Images are no longer rendered in the stream pane. For now, they are displayed as text.
2. Since the stream is presented as-is, there is no indentation at the moment. This will be added later as an explicit prettifier option.

Code folding, static syntax analysis and RSTAUI dialog integrations will be added later.
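The Latin1Filter transformation described above can be sketched roughly as follows. This is an illustration of the idea only, not the code from this PR: `Latin1Inflater`, `inflate` and `toBytes` are hypothetical names, and the real filter is presumably wired into the document as a filter rather than called on whole strings.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the "bytes stored in chars" trick.
final class Latin1Inflater {
    // Returns a string in which every char is in the range U+0000..U+00FF.
    static String inflate(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            int cp = text.codePointAt(i);
            int n = Character.charCount(cp);
            if (cp < 0x100) {
                // Representable in Latin-1: keep as a single char/byte.
                sb.append((char) cp);
            } else {
                // Not representable: emit the UTF-8 bytes, one byte per char.
                // (An unpaired surrogate would be encoded as '?' by getBytes.)
                byte[] utf8 = text.substring(i, i + n).getBytes(StandardCharsets.UTF_8);
                for (byte b : utf8) {
                    sb.append((char) (b & 0xFF));
                }
            }
            i += n;
        }
        return sb.toString();
    }

    // After inflation, Latin-1 encoding maps each char to exactly one byte.
    static byte[] toBytes(String inflated) {
        return inflated.getBytes(StandardCharsets.ISO_8859_1);
    }
}
```

For example, `inflate("A\u00E9\u20AC")` keeps `A` and `é` as single chars, while the euro sign (U+20AC) becomes the three chars 0xE2, 0x82, 0xAC, so `toBytes` yields five bytes in total.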

Now in the editor you should be able to freely fold BT->ET blocks and BMC/BDC->EMC sequences.
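As a rough illustration of how such fold regions can be found, here is a stack-based sketch. It does not use the RSyntaxTextArea fold parser API and is not the code from this commit: `FoldSketch` and `FoldRegion` are hypothetical names, and operators are detected by naive whitespace splitting, so an operator-like word inside a string literal would be misread.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch: pair BT with ET and BMC/BDC with EMC using a stack.
final class FoldSketch {
    static final class FoldRegion {
        final String opener; // "BT", "BMC" or "BDC"
        final int startLine; // zero-based line of the opening operator
        final int endLine;   // zero-based line of the closing operator

        FoldRegion(String opener, int startLine, int endLine) {
            this.opener = opener;
            this.startLine = startLine;
            this.endLine = endLine;
        }
    }

    static List<FoldRegion> findFolds(String content) {
        List<FoldRegion> folds = new ArrayList<>();
        Deque<String> openOps = new ArrayDeque<>();
        Deque<Integer> openLines = new ArrayDeque<>();
        String[] lines = content.split("\r\n|\r|\n", -1);
        for (int line = 0; line < lines.length; line++) {
            for (String word : lines[line].trim().split("\\s+")) {
                if (word.equals("BT") || word.equals("BMC") || word.equals("BDC")) {
                    openOps.push(word);
                    openLines.push(line);
                } else if (word.equals("ET") || word.equals("EMC")) {
                    if (openOps.isEmpty()) {
                        continue; // stray closer, ignored in this sketch
                    }
                    boolean topIsText = openOps.peek().equals("BT");
                    boolean matches = word.equals("ET") ? topIsText : !topIsText;
                    if (!matches) {
                        continue; // mismatched closer, ignored in this sketch
                    }
                    String opener = openOps.pop();
                    int start = openLines.pop();
                    if (line > start) { // single-line blocks are not foldable
                        folds.add(new FoldRegion(opener, start, line));
                    }
                }
            }
        }
        return folds;
    }
}
```

Regions are reported in the order their closing operator is seen, so nested blocks come out before the block that contains them.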

This is pretty basic as a proof of concept. It can currently show the following issues:
* Array/Dictionary/String object was not closed.
* Unnecessary whitespace at the end of lines.
* Unexpected tokens.
* Operand count and type for path construction operators.
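The last kind of check, operand count and type for path construction operators, could look roughly like the sketch below. This is an assumption-laden illustration, not the analysis code from this commit: `PathOperandCheck` is a hypothetical name, tokens are obtained by naive whitespace splitting, and a word counts as an operand only if it is a number or starts with `/`, `(`, `[` or `<`. The expected counts follow the PDF specification (`m` and `l` take 2 numbers, `c` takes 6, `v`, `y` and `re` take 4, `h` takes none).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of an operand check for path construction operators.
final class PathOperandCheck {
    private static final Map<String, Integer> EXPECTED = new HashMap<>();
    static {
        EXPECTED.put("m", 2);
        EXPECTED.put("l", 2);
        EXPECTED.put("c", 6);
        EXPECTED.put("v", 4);
        EXPECTED.put("y", 4);
        EXPECTED.put("h", 0);
        EXPECTED.put("re", 4);
    }

    // PDF integers and reals, e.g. 12, -3.5, +4., .002
    static boolean isNumber(String word) {
        return word.matches("[+-]?(\\d+\\.?\\d*|\\.\\d+)");
    }

    static boolean isOperand(String word) {
        char first = word.charAt(0);
        return isNumber(word) || first == '/' || first == '(' || first == '[' || first == '<';
    }

    static List<String> check(String content) {
        List<String> issues = new ArrayList<>();
        List<String> operands = new ArrayList<>();
        for (String word : content.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            Integer expected = EXPECTED.get(word);
            if (expected != null) {
                if (operands.size() != expected) {
                    issues.add(word + ": expected " + expected + " operands, got " + operands.size());
                } else {
                    for (String operand : operands) {
                        if (!isNumber(operand)) {
                            issues.add(word + ": non-numeric operand " + operand);
                        }
                    }
                }
                operands.clear();
            } else if (isOperand(word)) {
                operands.add(word);
            } else {
                operands.clear(); // some other operator consumed the operands
            }
        }
        return issues;
    }
}
```

With this sketch, `check("10 20 m 30 l 0 0 100 50 re h")` reports a single issue, for `l`, which is given one operand instead of two.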

Default caret never becomes visible, if the text area is not editable, which is very odd...
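The behavior comes from Swing's `DefaultCaret`, whose `focusGained` only makes the caret visible when the component is editable. One possible workaround, shown here as a sketch and not necessarily what this commit does, is a caret subclass that also shows itself for enabled read-only components. `ReadOnlyVisibleCaret` is a hypothetical name.

```java
import java.awt.event.FocusEvent;
import javax.swing.text.DefaultCaret;
import javax.swing.text.JTextComponent;

// Hypothetical sketch: a caret that stays visible in non-editable text components.
class ReadOnlyVisibleCaret extends DefaultCaret {
    @Override
    public void focusGained(FocusEvent e) {
        // Default behavior: shows the caret only if the component is editable.
        super.focusGained(e);
        JTextComponent component = getComponent();
        if (component != null && component.isEnabled() && !component.isEditable()) {
            // Read-only but enabled: show the caret anyway.
            setVisible(true);
        }
    }
}
```

It would be installed with `textArea.setCaret(new ReadOnlyVisibleCaret())`. Note that RSyntaxTextArea ships its own caret implementation, so a real fix would likely need to extend or configure that one instead.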

Our code uses the locale explicitly for i18n. But RSyntaxTextArea uses the default locale everywhere for its controls and dialogs. To combat that we will just change the default locale at the start of the app.
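A minimal sketch of that approach, assuming the application already knows its own locale at startup. `LocaleSetup` and `applyAppLocale` are hypothetical names, and the actual entry point in RUPS may differ.

```java
import java.util.Locale;

// Hypothetical sketch: align the JVM default locale with the app's locale
// before any RSyntaxTextArea controls or dialogs are created.
final class LocaleSetup {
    static void applyAppLocale(Locale appLocale) {
        // Libraries that call Locale.getDefault() will now see the app's locale.
        Locale.setDefault(appLocale);
    }
}
```

The call has to happen early, for example as the first statement of `main`, since components that have already looked up their localized resources will not pick up a later change.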

* Everything related to the stream pane was moved to a separate package.
* Added SYNTAX_STYLE_BINARY handling for the text editor. In this mode it uses the same painter (so that non-ASCII characters are displayed in hex), but there is no fancy processing or coloration. Ideally we would have a separate hex editor pane for that...
* Temporarily removed our custom menu dialog in the text editor. It was replacing the existing one in RSyntaxTextArea. Instead, we should add additional entries to the existing menu. TBD.
* Removed our custom undo manager. RSyntaxManager was using its own, so it caused more issues than it solved.
* Added the image stream view back in a basic form. For now, it just shows the image instead of the text editor, but there are no manipulation controls yet.
* Additional refactoring.

iText-CI assigned Eswcvlad Jan 22, 2025

iText-CI assigned iText-CI and unassigned iText-CI Mar 13, 2025

Eswcvlad force-pushed the RES-911 branch 2 times, most recently from 2d917fa to d888565 on March 21, 2025 17:23

iText-CI assigned iText-CI and unassigned iText-CI Mar 24, 2025

Eswcvlad force-pushed the RES-911 branch from 9b0f525 to bd90a76 on March 26, 2025 14:53

iText-CI assigned iText-CI and unassigned iText-CI Apr 3, 2025

Eswcvlad added 7 commits April 9, 2025 13:16

Add basic fold parser for PDF content streams

0ec2a40


Fix stream editor caret not appearing in read-only mode

9f659be


Fix locale usage in RSyntaxTextArea

6a19b3c


Eswcvlad force-pushed the RES-911 branch from bd90a76 to 0e77dc0 on April 9, 2025 10:19

iText-CI assigned iText-CI and unassigned iText-CI May 8, 2025

iText-CI assigned iText-CI and unassigned iText-CI May 16, 2025

iText-CI assigned iText-CI and unassigned iText-CI Aug 11, 2025

iText-CI assigned iText-CI and unassigned iText-CI Sep 21, 2025

iText-CI assigned iText-CI and unassigned iText-CI Oct 6, 2025

iText-CI assigned iText-CI and unassigned iText-CI Nov 3, 2025

iText-CI assigned iText-CI and unassigned iText-CI Nov 13, 2025

iText-CI assigned iText-CI and unassigned iText-CI Nov 24, 2025

iText-CI wants to merge 7 commits into develop from RES-911
