pdfHTML is an iText Core add-on for creating PDF from HTML/XML (and associated CSS).
This release introduces a new high-level API that makes generating PDF/UA documents simpler and more intuitive, along with further CSS and SVG improvements.
More Convenient PDF/UA Creation
Our main goal with this latest release of pdfHTML was to improve the conversion of HTML into documents that conform to the PDF/UA standards for universal accessibility. As mentioned in an article over on itextpdf.com, the improvements in iText Core and pdfHTML’s APIs make creating archivable PDF/A documents from HTML much easier than with iText 5. In a similar way, we’ve now fine-tuned pdfHTML’s API to allow for smooth HTML conversion into both PDF/UA-1 and PDF/UA-2 compliant documents in just a single step.
To make things easy, the ConverterProperties object comes with a handy toggle for PDF/UA support. Just choose the version of the PDF/UA standard you want to target, and you can count on iText Core and pdfHTML to handle the intricate details for you.
CSS and SVG Improvements
Following on from the CSS additions in the previous iText release, we’ve now added support for the align-content property to pdfHTML’s existing Flexbox support. Supported values are normal, plus the content-distribution keywords (such as space-between) and the content-position keywords (such as center). You can expect further support for advanced flex properties down the line.
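To illustrate what align-content controls, here is a minimal Java sketch of how free cross-axis space is distributed between the lines of a multi-line flex container, following the general rules of the CSS Box Alignment specification. It is not pdfHTML's implementation: the class name AlignContentSketch and the method lineOffsets are hypothetical, the sketch assumes the lines fit inside the container, and it leaves out normal and stretch because those resize the lines rather than just positioning them.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of pdfHTML or iText Core.
class AlignContentSketch {

    /**
     * Returns the cross-axis start offset of each flex line for a given
     * align-content keyword, assuming the lines do not overflow the container.
     */
    static double[] lineOffsets(String alignContent, double containerSize, double[] lineSizes) {
        int n = lineSizes.length;
        double free = containerSize - Arrays.stream(lineSizes).sum();
        if (n == 0 || free < 0) {
            throw new IllegalArgumentException("sketch covers non-empty, non-overflowing containers only");
        }

        double start; // offset of the first line
        double gap;   // extra space inserted between adjacent lines
        switch (alignContent) {
            case "flex-start":    start = 0;        gap = 0; break;
            case "flex-end":      start = free;     gap = 0; break;
            case "center":        start = free / 2; gap = 0; break;
            case "space-between": start = 0;        gap = n > 1 ? free / (n - 1) : 0; break;
            case "space-around":  gap = free / n;       start = gap / 2; break;
            case "space-evenly":  gap = free / (n + 1); start = gap;     break;
            default:
                throw new IllegalArgumentException("keyword not covered by this sketch: " + alignContent);
        }

        double[] offsets = new double[n];
        double cursor = start;
        for (int i = 0; i < n; i++) {
            offsets[i] = cursor;
            cursor += lineSizes[i] + gap;
        }
        return offsets;
    }

    public static void main(String[] args) {
        double[] lines = {20, 20}; // two flex lines, each 20 units tall
        // Container is 100 units tall, so 60 units of free space are distributed.
        System.out.println(Arrays.toString(lineOffsets("space-between", 100, lines))); // [0.0, 80.0]
        System.out.println(Arrays.toString(lineOffsets("space-around", 100, lines)));  // [15.0, 65.0]
        System.out.println(Arrays.toString(lineOffsets("center", 100, lines)));        // [30.0, 50.0]
    }
}
```

The content-position keywords only shift the block of lines as a whole, while the content-distribution keywords also insert space between the lines, which is why the sketch tracks a start offset and a gap separately.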
Another important change for this release is in the conversion of <svg> tags for PDF/UA. Conformance with the PDF/UA standard requires an alternative description to be set for figures. While SVG does not support an alt attribute as such, it does have the corresponding <desc> element to provide accessible, long-text descriptions of SVG containers or graphics elements. pdfHTML now handles this translation for you automatically, taking the <desc> content into account when an SVG is tagged as a figure.
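As a rough illustration of the idea, the following Java sketch reads the text of a top-level <desc> element from SVG markup using only the JDK's XML parser, which is the kind of value that could serve as a figure's alternative description. This is not how pdfHTML implements the feature internally: the class SvgDescSketch and the method altTextFromDesc are hypothetical names, and the sketch only looks at direct children of the root <svg> element.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical illustration only; not part of pdfHTML or iText Core.
class SvgDescSketch {

    /**
     * Returns the trimmed text of the first <desc> child of the root element,
     * or null when the SVG has no such child.
     */
    static String altTextFromDesc(String svgMarkup) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // so getLocalName() ignores any prefix
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(svgMarkup.getBytes(StandardCharsets.UTF_8)));
            Element root = doc.getDocumentElement();
            for (Node child = root.getFirstChild(); child != null; child = child.getNextSibling()) {
                if (child instanceof Element && "desc".equals(child.getLocalName())) {
                    return child.getTextContent().trim();
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse SVG markup", e);
        }
    }

    public static void main(String[] args) {
        String svg = "<svg xmlns=\"http://www.w3.org/2000/svg\" width=\"100\" height=\"50\">"
                + "<desc>Bar chart of quarterly sales</desc>"
                + "<rect width=\"10\" height=\"40\"/>"
                + "</svg>";
        System.out.println(altTextFromDesc(svg)); // Bar chart of quarterly sales
    }
}
```

In practice this means that adding a <desc> element to the SVGs in your HTML is the step to take on the authoring side; the conversion itself takes care of carrying the description into the tagged PDF.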
Miscellaneous and Bug Fixes
We’ve changed how alternative descriptions are handled for form fields in PDF/UA-2 documents. This aims to ensure alternative descriptions are applied properly to widget annotations when accessibility properties are set.
The copyright headers for all jsoup-related files have been updated so that they 1) reflect the proper attribution, and 2) note that the files may contain modifications not present in the original project.
We’ve resolved a NullPointerException that could occur when processing HTML and using relayout(). This was reported by a customer who encountered the issue when laying out a bulleted list whose children did not have a parent during the execution of the parent#drawChildren method. To fix this, we’ve enhanced the applyListSymbolPosition method to ensure proper handling of list symbols and prevent NPEs. In addition, an injectSymbolRendererIntoParagraphRenderer helper method has been introduced, and the logic for rendering list symbols has been refactored.
Finally, a couple of CSS bugs have been fixed. The first occurred when parsing multiple CSS selectors which were separated in an unconventional way, and so we added new parsing logic to account for this.
The second issue was in the calculation of the height for inline elements in fixed-size containers. To fix this, we’ve modified the layout module to add the AnonymousBox class to represent anonymous layout boxes, and AnonymousBoxRenderer to handle rendering of layout boxes with custom behaviors for margins and height resolution.
New Features
- New high-level HTML to PDF/UA API
- Support for the CSS align-content property in Flexbox
Improvements
- SVG tags should take into account alternate description when tagged as figure
- Change how alternative descriptions are applied for widget annotation in PDF/UA-2
- Added copyright attribution clarification for jsoup-related files
Bug Fixes
- Percent height for inline element not resolved correctly in fixed size containers
- CSS selector with two elements failed
- NullPointerException processing HTML and using relayout()