Comparing changes

base repository: itext/itext-pdfocr-java
base: 1.0.2
...
head repository: itext/itext-pdfocr-java
compare: develop
Showing 156 changed files with 6,388 additions and 2,928 deletions.
  1. +183 −0 CONTRIBUTING.md
  2. +40 −0 SECURITY.md
  3. +50 −3 pdfocr-api/pom.xml
  4. +55 −0 pdfocr-api/src/main/java/com/itextpdf/pdfocr/AbstractPdfOcrEventHelper.java
  5. +2 −2 pdfocr-api/src/main/java/com/itextpdf/pdfocr/IImageRotationHandler.java
  6. +36 −2 pdfocr-api/src/main/java/com/itextpdf/pdfocr/IOcrEngine.java
  7. +7 −7 pdfocr-api/src/main/java/com/itextpdf/pdfocr/{PdfOcrMetaInfo.java → IOcrProcessProperties.java}
  8. +45 −0 pdfocr-api/src/main/java/com/itextpdf/pdfocr/IProductAware.java
  9. +5 −2 pdfocr-api/src/main/java/com/itextpdf/pdfocr/OcrEngineProperties.java
  10. +458 −191 pdfocr-api/src/main/java/com/itextpdf/pdfocr/OcrPdfCreator.java
  11. +62 −0 pdfocr-api/src/main/java/com/itextpdf/pdfocr/OcrPdfCreatorEventHelper.java
  12. +0 −83 pdfocr-api/src/main/java/com/itextpdf/pdfocr/OcrPdfCreatorMetaInfo.java
  13. +50 −2 pdfocr-api/src/main/java/com/itextpdf/pdfocr/OcrPdfCreatorProperties.java
  14. +77 −0 pdfocr-api/src/main/java/com/itextpdf/pdfocr/OcrProcessContext.java
  15. +32 −32 pdfocr-api/src/main/java/com/itextpdf/pdfocr/PdfCreatorUtil.java
  16. +12 −4 pdfocr-api/src/main/java/com/itextpdf/pdfocr/PdfOcrFontProvider.java
  17. +21 −9 ...itextpdf/metainfo/TestMetaInfo.java → main/java/com/itextpdf/pdfocr/PdfOcrMetaInfoContainer.java}
  18. +2 −2 pdfocr-api/src/main/java/com/itextpdf/pdfocr/ScaleMode.java
  19. +18 −56 pdfocr-api/src/main/java/com/itextpdf/pdfocr/TextInfo.java
  20. +26 −18 pdfocr-api/src/main/java/com/itextpdf/pdfocr/{OcrException.java → exceptions/PdfOcrException.java}
  21. +41 −0 pdfocr-api/src/main/java/com/itextpdf/pdfocr/exceptions/PdfOcrExceptionMessageConstant.java
  22. +62 −0 pdfocr-api/src/main/java/com/itextpdf/pdfocr/exceptions/PdfOcrInputException.java
  23. +31 −19 pdfocr-api/src/main/java/com/itextpdf/pdfocr/{ → logs}/PdfOcrLogMessageConstant.java
  24. +0 −1 pdfocr-api/src/main/java/com/itextpdf/pdfocr/package-info.java
  25. +15 −11 ...-api/src/main/java/com/itextpdf/pdfocr/{IMetaInfoWrapper.java → statistics/PdfOcrOutputType.java}
  26. +119 −0 pdfocr-api/src/main/java/com/itextpdf/pdfocr/statistics/PdfOcrOutputTypeStatisticsAggregator.java
  27. +88 −0 pdfocr-api/src/main/java/com/itextpdf/pdfocr/statistics/PdfOcrOutputTypeStatisticsEvent.java
  28. +44 −0 pdfocr-api/src/main/java/com/itextpdf/pdfocr/structuretree/ArtifactItem.java
  29. +124 −0 pdfocr-api/src/main/java/com/itextpdf/pdfocr/structuretree/LogicalStructureTreeItem.java
  30. +39 −0 pdfocr-api/src/main/java/com/itextpdf/pdfocr/structuretree/ParagraphTreeItem.java
  31. +14 −9 .../TestMetaInfo.java → pdfocr-api/src/main/java/com/itextpdf/pdfocr/structuretree/SpanTreeItem.java
  32. +39 −0 pdfocr-api/src/main/java/com/itextpdf/pdfocr/structuretree/TableCellTreeItem.java
  33. +50 −0 pdfocr-api/src/main/java/com/itextpdf/pdfocr/structuretree/TableRowTreeItem.java
  34. +19 −14 ...va/com/itextpdf/pdfocr/{events/IThreadLocalMetaInfoAware.java → structuretree/TableTreeItem.java}
  35. +94 −0 pdfocr-api/src/sharpenconfig/java/com/itextpdf/pdfocr/SharpenConfigMapping.java
  36. +1 −0 pdfocr-api/src/sharpenconfig/resources/META-INF/services/sharpen.config.MappingConfiguration
  37. +192 −72 pdfocr-api/src/test/java/com/itextpdf/pdfocr/ApiTest.java
  38. +152 −0 pdfocr-api/src/test/java/com/itextpdf/pdfocr/OcrPdfCreatorEventHelperTest.java
  39. +60 −0 pdfocr-api/src/test/java/com/itextpdf/pdfocr/OcrProcessContextTest.java
  40. +80 −83 pdfocr-api/src/test/java/com/itextpdf/pdfocr/PdfA3uTest.java
  41. +108 −0 pdfocr-api/src/test/java/com/itextpdf/pdfocr/PdfCreatorUtilTest.java
  42. +50 −53 pdfocr-api/src/test/java/com/itextpdf/pdfocr/PdfFontTest.java
  43. +30 −25 pdfocr-api/src/test/java/com/itextpdf/pdfocr/PdfInputImageTest.java
  44. +37 −38 pdfocr-api/src/test/java/com/itextpdf/pdfocr/PdfLayersTest.java
  45. +44 −0 pdfocr-api/src/test/java/com/itextpdf/pdfocr/PdfOcrMetaInfoContainerTest.java
  46. +13 −14 pdfocr-api/src/test/java/com/itextpdf/pdfocr/ScaleModeTest.java
  47. +0 −117 pdfocr-api/src/test/java/com/itextpdf/pdfocr/events/EventCountingTest.java
  48. +65 −0 pdfocr-api/src/test/java/com/itextpdf/pdfocr/exceptions/PdfOcrExceptionTest.java
  49. +15 −23 pdfocr-api/src/test/java/com/itextpdf/pdfocr/helpers/CustomOcrEngine.java
  50. +91 −0 pdfocr-api/src/test/java/com/itextpdf/pdfocr/helpers/CustomProductAwareOcrEngine.java
  51. +2 −2 pdfocr-api/src/test/java/com/itextpdf/pdfocr/helpers/ExtractionStrategy.java
  52. +17 −28 pdfocr-api/src/test/java/com/itextpdf/pdfocr/helpers/PdfHelper.java
  53. +69 −0 pdfocr-api/src/test/java/com/itextpdf/pdfocr/helpers/TestProcessProperties.java
  54. +108 −0 pdfocr-api/src/test/java/com/itextpdf/pdfocr/helpers/TestStructureDetectionOcrEngine.java
  55. +134 −0 ...cr-api/src/test/java/com/itextpdf/pdfocr/statistics/PdfOcrOutputTypeStatisticsAggregatorTest.java
  56. +57 −0 pdfocr-api/src/test/java/com/itextpdf/pdfocr/statistics/PdfOcrOutputTypeStatisticsEventTest.java
  57. +71 −0 pdfocr-api/src/test/java/com/itextpdf/pdfocr/structuretree/LogicalStructureTreeItemTest.java
  58. +58 −0 pdfocr-api/src/test/java/com/itextpdf/pdfocr/structuretree/TableTreeStructureTest.java
  59. BIN pdfocr-api/src/test/resources/com/itextpdf/pdfocr/cmp_tableStructureTree.pdf
  60. BIN pdfocr-api/src/test/resources/com/itextpdf/pdfocr/images/single7x5cm.tif
  61. +70 −4 pdfocr-tesseract4/pom.xml
  62. +176 −73 pdfocr-tesseract4/src/main/java/com/itextpdf/pdfocr/tesseract4/AbstractTesseract4OcrEngine.java
  63. +12 −2 pdfocr-tesseract4/src/main/java/com/itextpdf/pdfocr/tesseract4/ImagePreprocessingOptions.java
  64. +34 −26 pdfocr-tesseract4/src/main/java/com/itextpdf/pdfocr/tesseract4/ImagePreprocessingUtil.java
  65. +2 −2 pdfocr-tesseract4/src/main/java/com/itextpdf/pdfocr/tesseract4/LeptonicaImageRotationHandler.java
  66. +161 −0 pdfocr-tesseract4/src/main/java/com/itextpdf/pdfocr/tesseract4/LeptonicaWrapper.java
  67. +2 −2 pdfocr-tesseract4/src/main/java/com/itextpdf/pdfocr/tesseract4/OutputFormat.java
  68. +0 −89 pdfocr-tesseract4/src/main/java/com/itextpdf/pdfocr/tesseract4/ReflectionUtils.java
  69. +58 −0 pdfocr-tesseract4/src/main/java/com/itextpdf/pdfocr/tesseract4/Tesseract4EventHelper.java
  70. +49 −29 pdfocr-tesseract4/src/main/java/com/itextpdf/pdfocr/tesseract4/Tesseract4ExecutableOcrEngine.java
  71. +77 −0 pdfocr-tesseract4/src/main/java/com/itextpdf/pdfocr/tesseract4/Tesseract4FileResultEventHelper.java
  72. +52 −30 pdfocr-tesseract4/src/main/java/com/itextpdf/pdfocr/tesseract4/Tesseract4LibOcrEngine.java
  73. +0 −82 pdfocr-tesseract4/src/main/java/com/itextpdf/pdfocr/tesseract4/Tesseract4LogMessageConstant.java
  74. +28 −0 pdfocr-tesseract4/src/main/java/com/itextpdf/pdfocr/tesseract4/Tesseract4MetaInfo.java
  75. +27 −14 pdfocr-tesseract4/src/main/java/com/itextpdf/pdfocr/tesseract4/Tesseract4OcrEngineProperties.java
  76. +0 −76 pdfocr-tesseract4/src/main/java/com/itextpdf/pdfocr/tesseract4/Tesseract4OcrException.java
  77. +31 −60 pdfocr-tesseract4/src/main/java/com/itextpdf/pdfocr/tesseract4/TesseractHelper.java
  78. +104 −74 pdfocr-tesseract4/src/main/java/com/itextpdf/pdfocr/tesseract4/TesseractOcrUtil.java
  79. +3 −3 pdfocr-tesseract4/src/main/java/com/itextpdf/pdfocr/tesseract4/TextPositioning.java
  80. +48 −0 ...eract4/src/main/java/com/itextpdf/pdfocr/tesseract4/actions/data/PdfOcrTesseract4ProductData.java
  81. +74 −0 ...ct4/src/main/java/com/itextpdf/pdfocr/tesseract4/actions/events/PdfOcrTesseract4ProductEvent.java
  82. +0 −61 pdfocr-tesseract4/src/main/java/com/itextpdf/pdfocr/tesseract4/events/PdfOcrTesseract4Event.java
  83. +61 −0 ...ract4/src/main/java/com/itextpdf/pdfocr/tesseract4/exceptions/PdfOcrInputTesseract4Exception.java
  84. +63 −0 ...tesseract4/src/main/java/com/itextpdf/pdfocr/tesseract4/exceptions/PdfOcrTesseract4Exception.java
  85. +54 −0 ...main/java/com/itextpdf/pdfocr/tesseract4/exceptions/PdfOcrTesseract4ExceptionMessageConstant.java
  86. +74 −0 ...cr-tesseract4/src/main/java/com/itextpdf/pdfocr/tesseract4/logs/Tesseract4LogMessageConstant.java
  87. +0 −1 pdfocr-tesseract4/src/main/java/com/itextpdf/pdfocr/tesseract4/package-info.java
  88. +214 −0 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/IntegrationEventHandlingTestHelper.java
  89. +16 −19 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/IntegrationTestHelper.java
  90. +30 −29 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/TesseractExecutableIntegrationTest.java
  91. +7 −8 ...threading/MultiThreadingExecutableTest.java → actions/Tesseract4EventHandlingExecutableTest.java}
  92. +33 −0 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/actions/Tesseract4EventHandlingLibTest.java
  93. +492 −0 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/actions/Tesseract4EventHandlingTest.java
  94. +48 −0 ...tesseract4/src/test/java/com/itextpdf/pdfocr/actions/events/PdfOcrTesseract4ProductEventTest.java
  95. +0 −53 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/events/EventCountingExecutableTest.java
  96. +0 −55 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/events/EventCountingLibTest.java
  97. +0 −314 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/events/EventCountingTest.java
  98. +0 −59 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/events/PdfOcrTesseract4EventTest.java
  99. +0 −63 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/events/multithreading/DoImageOcrRunnable.java
  100. +0 −144 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/events/multithreading/MultiThreadingTest.java
  101. +50 −0 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/exceptions/PdfOcrTesseract4ExceptionTest.java
  102. +4 −5 ...tesseract4/src/test/java/com/itextpdf/pdfocr/general/BasicTesseractIntegrationExecutableTest.java
  103. +4 −5 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/general/BasicTesseractIntegrationLibTest.java
  104. +115 −115 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/general/BasicTesseractIntegrationTest.java
  105. +4 −5 ...sseract4/src/test/java/com/itextpdf/pdfocr/imageformats/ImageFormatIntegrationExecutableTest.java
  106. +4 −5 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/imageformats/ImageFormatIntegrationLibTest.java
  107. +37 −40 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/imageformats/ImageFormatIntegrationTest.java
  108. +4 −5 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/pdfa3u/PdfA3UIntegrationExecutableTest.java
  109. +4 −5 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/pdfa3u/PdfA3UIntegrationLibTest.java
  110. +10 −15 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/pdfa3u/PdfA3UIntegrationTest.java
  111. +4 −5 ...cr-tesseract4/src/test/java/com/itextpdf/pdfocr/pdflayers/PdfLayersIntegrationExecutableTest.java
  112. +4 −5 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/pdflayers/PdfLayersIntegrationLibTest.java
  113. +22 −22 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/pdflayers/PdfLayersIntegrationTest.java
  114. +7 −8 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/tessdata/TessDataIntegrationExecutableTest.java
  115. +29 −35 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/tessdata/TessDataIntegrationLibTest.java
  116. +81 −87 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/tessdata/TessDataIntegrationTest.java
  117. +81 −57 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/tesseract4/ApiTest.java
  118. +4 −5 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/tesseract4/ImageIntegrationExecutableTest.java
  119. +5 −6 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/tesseract4/ImageIntegrationLibTest.java
  120. +21 −26 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/tesseract4/ImageIntegrationTest.java
  121. +33 −35 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/tesseract4/ImagePreprocessingUtilTest.java
  122. +84 −0 ...-tesseract4/src/test/java/com/itextpdf/pdfocr/tesseract4/Tesseract4FileResultEventHelperTest.java
  123. +8 −13 ...va → test/java/com/itextpdf/pdfocr/tesseract4/Tesseract4MetaInfoEventHandlingExecutableTest.java}
  124. +33 −0 ...sseract4/src/test/java/com/itextpdf/pdfocr/tesseract4/Tesseract4MetaInfoEventHandlingLibTest.java
  125. +89 −0 ...-tesseract4/src/test/java/com/itextpdf/pdfocr/tesseract4/Tesseract4MetaInfoEventHandlingTest.java
  126. +4 −5 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/tesseract4/TesseractHelperExecutableTest.java
  127. +21 −18 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/tesseract4/TesseractHelperLibTest.java
  128. +7 −12 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/tesseract4/TesseractHelperTest.java
  129. +35 −32 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/tesseract4/TesseractOcrUtilTest.java
  130. +4 −5 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/tesseract4/UserWordsExecutableTest.java
  131. +4 −5 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/tesseract4/UserWordsLibTest.java
  132. +37 −35 pdfocr-tesseract4/src/test/java/com/itextpdf/pdfocr/tesseract4/UserWordsTest.java
  133. BIN pdfocr-tesseract4/src/test/resources/com/itextpdf/pdfocr/documents/englishText_executable.pdf
  134. BIN pdfocr-tesseract4/src/test/resources/com/itextpdf/pdfocr/documents/englishText_lib.pdf
  135. BIN pdfocr-tesseract4/src/test/resources/com/itextpdf/pdfocr/documents/example_01_executable.pdf
  136. BIN pdfocr-tesseract4/src/test/resources/com/itextpdf/pdfocr/documents/example_01_lib.pdf
  137. BIN pdfocr-tesseract4/src/test/resources/com/itextpdf/pdfocr/documents/example_02.pdf
  138. BIN pdfocr-tesseract4/src/test/resources/com/itextpdf/pdfocr/documents/invoice_front_thai_lib_dotnet.pdf
  139. BIN pdfocr-tesseract4/src/test/resources/com/itextpdf/pdfocr/documents/invoice_front_thai_lib_java.pdf
  140. BIN pdfocr-tesseract4/src/test/resources/com/itextpdf/pdfocr/documents/multilang_executable.pdf
  141. BIN pdfocr-tesseract4/src/test/resources/com/itextpdf/pdfocr/documents/multilang_lib.pdf
  142. BIN pdfocr-tesseract4/src/test/resources/com/itextpdf/pdfocr/documents/multipage_executable.pdf
  143. BIN pdfocr-tesseract4/src/test/resources/com/itextpdf/pdfocr/documents/multipage_lib.pdf
  144. BIN pdfocr-tesseract4/src/test/resources/com/itextpdf/pdfocr/documents/numbers_01.pdf
  145. BIN pdfocr-tesseract4/src/test/resources/com/itextpdf/pdfocr/documents/numbers_01_a3u.pdf
  146. BIN ...-tesseract4/src/test/resources/com/itextpdf/pdfocr/documents/numbers_01_compareJpe_executable.pdf
  147. BIN pdfocr-tesseract4/src/test/resources/com/itextpdf/pdfocr/documents/numbers_01_compareJpe_lib.pdf
  148. BIN ...-tesseract4/src/test/resources/com/itextpdf/pdfocr/documents/numbers_01_compareTif_executable.pdf
  149. BIN pdfocr-tesseract4/src/test/resources/com/itextpdf/pdfocr/documents/numbers_01_compareTif_lib.pdf
  150. BIN ...-tesseract4/src/test/resources/com/itextpdf/pdfocr/documents/numbers_02_compareJpg_executable.pdf
  151. BIN pdfocr-tesseract4/src/test/resources/com/itextpdf/pdfocr/documents/numbers_02_compareJpg_lib.pdf
  152. BIN pdfocr-tesseract4/src/test/resources/com/itextpdf/pdfocr/documents/spanish_01_a3u.pdf
  153. BIN pdfocr-tesseract4/src/test/resources/com/itextpdf/pdfocr/documents/thai_01_lib_dotnet.pdf
  154. BIN pdfocr-tesseract4/src/test/resources/com/itextpdf/pdfocr/documents/thai_01_lib_java.pdf
  155. +10 −18 pom.xml
  156. +11 −0 sharpenConfiguration.xml
183 changes: 183 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,183 @@
# Contributing to iText Community

We'd love for you to contribute to our source code and to make **iText Community** even better than it is
today! Here are the guidelines we'd like you to follow:

- [Question or Problem?](#question)
- [Issues and Bugs](#issue)
- [New Features](#feature)
- [Submission Guidelines](#submit)
- [Coding Rules](#rules)
- [Commit Message Guidelines](#commit)
- [Signing the iCLA](#cla)
- [Contributor Code of Conduct](#coc)


## <a name="question">Got a Question or Problem?</a>

If you have questions about how to use **iText Community**, please direct these to [Stack Overflow][stackoverflow].

If you are a customer with a [support agreement][support], you also have direct access to our JIRA and our developers.


## <a name="issue">Found an Issue?</a>
If you find a bug in the source code or a mistake in the documentation, you can help us by
submitting a [Pull Request][pull] with a fix.

**Please see the [Submission Guidelines](#submit) below**.


## <a name="feature">Want to implement a Feature?</a>
If you would like to implement a new feature then consider what kind of change it is:

* **Major Changes** that you wish to contribute to the project should be discussed first so that we can better
coordinate our efforts, prevent duplication of work, and help you to craft the change so that it is successfully
accepted into the project. Contact us at [community@itextpdf.com](mailto:community@itextpdf.com).
* **Small Changes** can be crafted and submitted to the [GitHub Repository][github] as a [Pull Request][pull].


## <a name="submit">Submission Guidelines</a>

### Submitting a Question or an Issue
Before you submit your question or issue, search [Stack Overflow][stackoverflow]; your question may already have been answered.

If your issue appears to be a bug, and hasn't been reported, ask a question on [Stack Overflow][stackoverflow] to verify that it is indeed a bug and not a mistake in your own code.
Help us to maximize the effort we can spend fixing issues and adding new
features, by not reporting duplicate issues. Providing the following information will increase the
chances of your issue being dealt with quickly:

* **[How to ask good questions][good-questions]**
* **Overview of the Issue** - if an error is being thrown, a non-minified stack trace helps
* **Motivation for or Use Case** - explain why this is a bug for you
* **iText Version(s)** - is it a regression?
* **Operating System** - is this a problem on Windows or Linux, maybe on Mac?
* **Reproduce the Error** - provide a [Short, Self Contained, Correct (Compilable), Example][sscce], also known as a [Minimal, Complete, and Verifiable example][mcve].
* **Related Issues** - has a similar issue been reported before?
* **Suggest a Fix** - if you can't fix the bug yourself, perhaps you can point to what might be
causing the problem (line of code or commit)
* **Tag the question** - add the tag `itext` to your question so we can find it.

**If you get help, help others. Good karma rulez!**


### Submitting a Pull Request
Before you submit your pull request consider the following guidelines:

* Search [GitHub][pull] for an open or closed Pull Request
that relates to your submission. You don't want to duplicate effort.
* Verify that your proposed change hasn't already been addressed in the develop branch.
* Don't send a separate pull request for every single file you change.
* Please sign the [iText Contributor License Agreement (iCLA)](#cla) before sending pull
requests. We cannot accept code without this agreement.
* Fork the iText repository on GitHub.
* Clone your iText fork to your local machine.
* Make your changes, **including appropriate test cases**.
* Follow our [Coding Rules](#rules).
* Commit your changes using a descriptive commit message that follows our
[commit message conventions](#commit-message-format).
* Now would be a good time to fix up your commits (if you want or need to) with `git rebase --interactive`.
* Build your changes locally to ensure all the tests pass.
* Push your changes to your GitHub account.
* Create a pull request in GitHub.
"Head fork" should be your repository, and the "base fork" should be the iText official repository.
* If we suggest changes then:
  * Make the required updates.
  * Fix up your commits if needed, with an interactive rebase.
  * Re-run the tests and make sure that they are still passing.
  * Force push to your GitHub repository. This will update your Pull Request.

That's it! Thank you for your contribution!

#### After your pull request is merged

After your pull request is merged, you can safely delete your fork and pull the changes
from the main (upstream) repository.


## <a name="rules">Coding Rules</a>
To ensure consistency throughout the source code, keep these rules in mind as you are working:

* We develop in Java first, and then port to .NET, so code submissions in Java are preferred.
Nevertheless, this shouldn't stop you from making a good pull request to the .NET port.
* All features or bug fixes **must be tested** by one or more unit tests.
* All public API methods **must be documented** with JavaDoc. To see how we document our APIs, please check
out the existing [javadocs][javadocs].
* We follow the rules contained in
[Oracle's Code Conventions for the Java Programming Language][java-style-guide], with these additions:
  * Wrap all code at **100 characters**.


## <a name="commit">Git Commit Guidelines</a>

We have guidelines on how our git commit messages should be formatted. This leads to **more
readable messages** that are easy to follow when looking through the **project history**. We also
use the git commit messages to **generate the iText Community change log**.

These guidelines were taken from Chris Beams' blog post [How to Write a Git Commit Message][git-commit].

### Commit Message Format
Each commit message consists of a **subject**, a **body** and a **footer**:

```
<subject>
<BLANK LINE>
<body>
<BLANK LINE>
<footer>
```

No line of the commit message should be longer than 72 characters! This makes the message easier
to read on GitHub as well as in various git tools.

### Subject
The subject contains a succinct description of the change:

* [Separate subject from body with a blank line][git-commit-separate]
* [Limit the subject line to 50 characters][git-commit-limit-50]
* [Capitalize the subject line][git-commit-capitalize]
* [Do not end the subject line with a period][git-commit-end]
* [Use the imperative mood in the subject line][git-commit-imperative]

### Body
* [Wrap the body at 72 characters][git-commit-wrap-72]
* [Use the body to explain _what_ and _why_ vs. _how_][git-commit-why-not-how]

### Footer
The footer contains any information about **Breaking Changes** and is also the place to
reference JIRA or GitHub issues that this commit **Closes**.
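
The format rules above are mechanical enough to check automatically. The following is a minimal sketch of such a check, not a tool that iText ships: the class name `CommitMessageCheck`, its method and the wording of the reported problems are all hypothetical. It assumes `\n` line endings and covers only the length, blank-line, capitalization and trailing-period rules; the imperative mood and the what/why guidance still need a human reviewer.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of iText): checks a commit message against the rules above. */
final class CommitMessageCheck {
    static final int SUBJECT_LIMIT = 50; // "Limit the subject line to 50 characters"
    static final int LINE_LIMIT = 72;    // no line longer than 72 characters

    /** Returns a list of violations; an empty list means the message passes these checks. */
    static List<String> check(String message) {
        List<String> problems = new ArrayList<>();
        // Keep trailing empty strings so that blank lines are preserved.
        String[] lines = message.split("\n", -1);
        String subject = lines[0];
        if (subject.isEmpty()) {
            problems.add("subject is empty");
            return problems;
        }
        if (subject.length() > SUBJECT_LIMIT) {
            problems.add("subject is longer than " + SUBJECT_LIMIT + " characters");
        }
        if (!Character.isUpperCase(subject.charAt(0))) {
            problems.add("subject is not capitalized");
        }
        if (subject.endsWith(".")) {
            problems.add("subject ends with a period");
        }
        // The second line, if present, must be the blank separator before the body.
        if (lines.length > 1 && !lines[1].isEmpty()) {
            problems.add("no blank line between subject and body");
        }
        // Body and footer lines share the 72-character limit.
        for (int i = 1; i < lines.length; i++) {
            if (lines[i].length() > LINE_LIMIT) {
                problems.add("line " + (i + 1) + " is longer than " + LINE_LIMIT + " characters");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        String good = "Add tagging support for OCR output\n\nExplain what changed and why.";
        String bad = "fixed a bug.\nno separator line here";
        System.out.println("good: " + check(good));
        System.out.println("bad:  " + check(bad));
    }
}
```

A check like this could run from a local `commit-msg` hook, but that wiring is left out here because it depends on your setup.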


## <a name="cla">Signing the iCLA</a>

Please sign the [**iText Contributor License Agreement (iCLA)**][cla] before sending pull requests. For any code changes to be accepted, the iCLA must be signed. It's a quick process, we promise!

We'll need you to [(digitally) sign and then email, fax or mail the form][cla].


## <a name="coc">Contributor Code of Conduct</a>
Please note that this project is released with a [Contributor Code of Conduct][coc]. By participating in this project you agree to abide by its terms.

We use the [Stack Exchange][stackoverflow] network for free support and [GitHub][github] for code hosting. By using these services, you agree to abide by their terms:

* Stack Exchange: [http://stackexchange.com/legal](http://stackexchange.com/legal)
* GitHub: [https://help.github.com/articles/github-terms-of-service/](https://help.github.com/articles/github-terms-of-service/)

[cla]: https://itextpdf.com/en/how-buy/legal/itext-contributor-license-agreement
[coc]: CODE_OF_CONDUCT.md
[github]: https://github.com/itext/i7j-pdfocr
[java-style-guide]: https://www.oracle.com/technetwork/java/codeconvtoc-136057.html
[javadocs]: https://itextpdf.com/api
[pull]: https://github.com/itext/i7j-pdfocr/pulls
[sscce]: http://sscce.org/
[stackoverflow]: https://stackoverflow.com/questions/tagged/itext
[good-questions]: https://stackoverflow.com/help/how-to-ask
[mcve]: https://stackoverflow.com/help/mcve
[support]: https://itextpdf.com/support
[git-commit]: https://chris.beams.io/posts/git-commit/
[git-commit-separate]: https://chris.beams.io/posts/git-commit/#separate
[git-commit-limit-50]: https://chris.beams.io/posts/git-commit/#limit-50
[git-commit-capitalize]: https://chris.beams.io/posts/git-commit/#capitalize
[git-commit-end]: https://chris.beams.io/posts/git-commit/#end
[git-commit-imperative]: https://chris.beams.io/posts/git-commit/#imperative
[git-commit-wrap-72]: https://chris.beams.io/posts/git-commit/#wrap-72
[git-commit-why-not-how]: https://chris.beams.io/posts/git-commit/#why-not-how
40 changes: 40 additions & 0 deletions SECURITY.md
@@ -0,0 +1,40 @@
# iText Security Policy

## Reporting a Vulnerability

We are committed to maintaining the security of our software. If you discover a security vulnerability, we encourage you to report it to us as soon as possible.

To report a vulnerability, please visit our [Vulnerability Reporting Page](https://itextpdf.com/report-vulnerability), or email [vulnerability@apryse.com](mailto:vulnerability@apryse.com). If you do not receive a response within 2 business days, please follow up, as we may not have received your message.

We follow the procedure of Coordinated Vulnerability Disclosure (CVD) and, to protect the ecosystem, we request that those reporting do the same. Please visit the above page for more information, and follow the steps below to ensure that your report is handled promptly and appropriately:

1. **Do not disclose the vulnerability publicly** until we have had a chance to address it.
2. **Provide a detailed description** of the vulnerability, including steps to reproduce it, if possible.
3. **Include any relevant information** such as the version of pdfOCR you are using, your operating system, and any other pertinent details.

## Security Updates and Patches

When a vulnerability is reported, we will:

1. **Investigate and verify** the vulnerability.
2. **Develop and test** a fix for the vulnerability.
3. **Release a patch** as soon as possible.


## Known Vulnerabilities

The iText Knowledge Base has a page for known [Common Vulnerabilities and Exposures](https://kb.itextpdf.com/itext/cves) (CVEs). Please check it to ensure your vulnerability has not already been disclosed or addressed.

## Supported product lines

See [Compatibility Matrix](https://kb.itextpdf.com/itext/compatibility-matrix)

## Security Best Practices

To help ensure the security of your applications using pdfOCR, we recommend the following best practices:

1. **Keep pdfOCR up to date** by regularly checking for and applying updates.
2. **Review and follow** our security guidelines for secure usage.
3. **Monitor your applications** for any unusual activity and investigate any anomalies promptly.

Thank you for helping us keep iText secure!
53 changes: 50 additions & 3 deletions pdfocr-api/pom.xml
@@ -5,13 +5,16 @@
<parent>
<groupId>com.itextpdf</groupId>
<artifactId>pdfocr-root</artifactId>
-<version>1.0.2</version>
+<version>4.0.3-SNAPSHOT</version>
</parent>

<artifactId>pdfocr-api</artifactId>

<name>pdfOCR API</name>
-<description>pdfOCR is an iText 7 add-on for Java to recognize and extract text in scanned documents and images. It can also convert them into fully ISO-compliant PDF or PDF/A-3u files that are accessible, searchable, and suitable for archiving</description>
+<description>pdfOCR is an iText add-on for Java to recognize and extract text in scanned documents and images. It can
+also convert them into fully ISO-compliant PDF or PDF/A-3u files that are accessible, searchable, and suitable for
+archiving
+</description>

<dependencies>
<dependency>
@@ -47,4 +50,48 @@
</resource>
</resources>
</build>
-</project>
+
+<profiles>
+<profile>
+<id>with-sharpen</id>
+<build>
+<plugins>
+<plugin>
+<groupId>sharpen</groupId>
+<artifactId>sharpen-maven-plugin</artifactId>
+<version>1.0-SNAPSHOT</version>
+<executions>
+<execution>
+<phase>install</phase>
+<goals>
+<goal>sharpen</goal>
+</goals>
+</execution>
+</executions>
+<dependencies>
+<dependency>
+<groupId>sharpen</groupId>
+<artifactId>standard-framework-mapping</artifactId>
+<version>1.0-SNAPSHOT</version>
+</dependency>
+</dependencies>
+<configuration>
+<projectName>pdfocr</projectName>
+<cSharpTargetFolder>./../../../sharp/pdfocr</cSharpTargetFolder>
+<cSharpSourceCodeDestination>itext/itext.pdfocr.api</cSharpSourceCodeDestination>
+<cSharpTestCodeDestination>itext.tests/itext.pdfocr.api.tests</cSharpTestCodeDestination>
+<buildDotnet>${sharpen.builddotnet}</buildDotnet>
+<showDiff>${sharpen.showdiff}</showDiff>
+<sourceCodeFiles>
+<file>**/src/main/java/**/*.java</file>
+</sourceCodeFiles>
+<testCodeFiles>
+<file>**/src/test/java/**/*.java</file>
+</testCodeFiles>
+</configuration>
+</plugin>
+</plugins>
+</build>
+</profile>
+</profiles>
+</project>
55 changes: 55 additions & 0 deletions pdfocr-api/src/main/java/com/itextpdf/pdfocr/AbstractPdfOcrEventHelper.java
@@ -0,0 +1,55 @@
/*
This file is part of the iText (R) project.
Copyright (c) 1998-2025 Apryse Group NV
Authors: Apryse Software.
This program is offered under a commercial and under the AGPL license.
For commercial licensing, contact us at https://itextpdf.com/sales. For AGPL licensing, see below.
AGPL licensing:
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
*/
package com.itextpdf.pdfocr;

import com.itextpdf.commons.actions.AbstractITextEvent;
import com.itextpdf.commons.actions.AbstractProductITextEvent;
import com.itextpdf.commons.actions.confirmations.EventConfirmationType;
import com.itextpdf.commons.actions.sequence.SequenceId;

/**
* Helper class for working with events. This class is for internal usage.
*/
public abstract class AbstractPdfOcrEventHelper extends AbstractITextEvent {

/**
* Handles the event.
*
* @param event event
*/
public abstract void onEvent(AbstractProductITextEvent event);

/**
* Returns the sequence id
*
* @return sequence id
*/
public abstract SequenceId getSequenceId();

/**
* Returns the confirmation type of event.
*
* @return event confirmation type
*/
public abstract EventConfirmationType getConfirmationType();
}
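
To illustrate the contract that `AbstractPdfOcrEventHelper` declares, here is a self-contained sketch of the same three-method shape with one possible implementation. It makes no claim about how the real helpers in this change set behave. Every type in it is a hypothetical stand-in so that the example compiles without iText on the classpath: `SketchProductEvent` stands in for `AbstractProductITextEvent`, `SketchSequenceId` for `SequenceId`, and `SketchConfirmationType` for `EventConfirmationType` (its values are illustrative only).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Sketch only: stand-in types that mimic the shape of AbstractPdfOcrEventHelper. */
final class EventHelperSketch {

    // Hypothetical stand-in for AbstractProductITextEvent: carries only a name.
    static final class SketchProductEvent {
        final String name;

        SketchProductEvent(String name) {
            this.name = name;
        }
    }

    // Hypothetical stand-in for SequenceId: only object identity matters here.
    static final class SketchSequenceId {
    }

    // Hypothetical stand-in for EventConfirmationType; values are illustrative.
    enum SketchConfirmationType { ON_DEMAND, ON_CLOSE }

    // Mirrors the three abstract methods declared in the diff above.
    abstract static class SketchEventHelper {
        abstract void onEvent(SketchProductEvent event);

        abstract SketchSequenceId getSequenceId();

        abstract SketchConfirmationType getConfirmationType();
    }

    // One possible implementation: records event names under a fixed sequence id.
    static final class RecordingEventHelper extends SketchEventHelper {
        private final SketchSequenceId sequenceId = new SketchSequenceId();
        private final List<String> seenEventNames = new ArrayList<>();

        @Override
        void onEvent(SketchProductEvent event) {
            seenEventNames.add(event.name);
        }

        @Override
        SketchSequenceId getSequenceId() {
            // The same id is returned on every call, so events can be grouped by it.
            return sequenceId;
        }

        @Override
        SketchConfirmationType getConfirmationType() {
            return SketchConfirmationType.ON_CLOSE;
        }

        List<String> getSeenEventNames() {
            return Collections.unmodifiableList(seenEventNames);
        }
    }

    public static void main(String[] args) {
        RecordingEventHelper helper = new RecordingEventHelper();
        helper.onEvent(new SketchProductEvent("process-image"));
        helper.onEvent(new SketchProductEvent("create-pdf"));
        System.out.println(helper.getSeenEventNames() + " " + helper.getConfirmationType());
    }
}
```

Because the Javadoc marks the real class as internal, treat this as a reading aid for the diff, not as a suggested extension point.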
4 changes: 2 additions & 2 deletions pdfocr-api/src/main/java/com/itextpdf/pdfocr/IImageRotationHandler.java
@@ -1,7 +1,7 @@
/*
This file is part of the iText (R) project.
-Copyright (c) 1998-2020 iText Group NV
-Authors: iText Software.
+Copyright (c) 1998-2025 Apryse Group NV
+Authors: Apryse Software.
This program is offered under a commercial and under the AGPL license.
For commercial licensing, contact us at https://itextpdf.com/sales. For AGPL licensing, see below.