Query |
Details |
System.ArgumentOutOfRangeException thrown when using ExtractText(out TextLines textLines) method |
We have confirmed that this is a defect and have logged a defect report. The fix will be included in our weekly NuGet package, which is expected to be available on 17th March 2020.
|
ExtractText method returns two paragraphs as a single paragraph |
On further analysis, the page.ExtractText() method does not extract text based on the layout. It extracts text based on the text rendering operators and inserts a new line character between text runs at each occurrence of a text rendering operator, which can make the extracted content less readable.
However, you can extract the text based on the layout by using the ExtractText(bool) overload. Please refer to the UG documentation for details.
We did notice a spacing issue in this layout-based overload. We will fix it, and the fix will be included in our weekly NuGet package, which is expected to be available on 17th March 2020.
|
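using System.Diagnostics;
using Syncfusion.XlsIO;
// Run Tabula to export the tables in the PDF to CSV, and wait until it has finished.
ProcessStartInfo startInfo = new ProcessStartInfo(@"C:\Program Files\Java\jre1.8.0_121\bin\java.exe");
startInfo.Arguments = "-jar tabula-0.8.0-jar-with-dependencies.jar -p all -o ExportSales.csv ExportSales.pdf";
Process.Start(startInfo).WaitForExit();
// Open the generated CSV file with XlsIO.
ExcelEngine excelEngine = new ExcelEngine();
IApplication application = excelEngine.Excel;
IWorkbook workbook = application.Workbooks.Open("ExportSales.csv");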
|
A
határozat ellen fellebbezésnek, újrafelvételi
elj
árásnak nincs helye.
A
határozat bírósági felülvizsgálatát
annak kézbesíté sétől számított tizenöt napon
belül
keresettel a felperes
belföldi székhelye (lak óhelye) szerint
illetékes
közigazgatási és munkaügyi bíróságtól
lehet kérni. A
keresetlevelet az illetékes
bírósághoz címezve, kizárólag a
Döntőbizottsághoz lehet benyújtani. Tárgyalás
tartását a
felperes a keresetlevélben kérheti.
A ke resetlevél benyújtásának
a
határozat végrehajtására nincs
halasztó hatálya.
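As a stopgap for output like the sample above, the extracted string can be post-processed before it is used. The following is only a minimal sketch and not part of the Syncfusion API: the class and method names (ExtractedTextCleanup, NormalizeWhitespace) are hypothetical, and it assumes that only running text is needed, because it discards line breaks and any layout spacing. It cannot repair words that were split in the middle.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any Syncfusion package.
public static class ExtractedTextCleanup
{
    // Collapses every run of whitespace (spaces, tabs, line breaks)
    // into a single space and trims both ends of the result.
    public static string NormalizeWhitespace(string extracted)
    {
        if (string.IsNullOrEmpty(extracted))
        {
            return string.Empty;
        }
        return Regex.Replace(extracted, @"\s+", " ").Trim();
    }
}
```

For example, ExtractedTextCleanup.NormalizeWhitespace("A\nhatározat   ellen") returns "A határozat ellen". Do not apply it when column alignment from the layout-based extraction matters, since that alignment is expressed through the very spaces this helper removes.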
Query |
Details |
Uthandaraja, I am testing 18.1.0.36-beta, and judging from the release notes you have done a lot of work. Still, formatted text extraction inserts unexpected runs of spaces into the text. I have attached the PDF. |
The fix for the issue “Extra spaces added between words using ExtractText(bool) method” was not included in the 18.1.0.36-beta release, which is why the extra spaces still appear for the shared PDF document. The fix will be included in our 2020 Vol 1 main release, which is expected to be available at the end of March 2020. |
I am unable to decide whether to use formatted or unformatted extraction. |
If you want to extract the text based on the layout, we suggest using the ExtractText(bool) overload. Otherwise, you can use the ExtractText() method. |
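To compare the two modes on your own document, something along the following lines may help. It is a sketch under assumptions: the class and method names (ExtractionModes, Compare) and the path parameter are placeholders, only the first page is read, and the document is loaded from a stream. ExtractText() and ExtractText(bool) are the two methods discussed in this thread.

```csharp
using System;
using System.IO;
using Syncfusion.Pdf;
using Syncfusion.Pdf.Parsing;

// Hypothetical helper for comparing both extraction modes on one page.
public static class ExtractionModes
{
    // "path" is a placeholder for the PDF file you want to inspect.
    public static void Compare(string path)
    {
        using (FileStream stream = File.OpenRead(path))
        {
            PdfLoadedDocument loadedDocument = new PdfLoadedDocument(stream);
            try
            {
                // Assumption: the document has at least one page.
                PdfPageBase page = loadedDocument.Pages[0];

                // Unformatted: text follows the text rendering operators.
                string unformatted = page.ExtractText();

                // Formatted: text is extracted based on the page layout.
                string formatted = page.ExtractText(true);

                Console.WriteLine("--- ExtractText() ---");
                Console.WriteLine(unformatted);
                Console.WriteLine("--- ExtractText(true) ---");
                Console.WriteLine(formatted);
            }
            finally
            {
                // Release the resources held by the loaded document.
                loadedDocument.Close(true);
            }
        }
    }
}
```

Printing both results side by side for a representative page is usually the quickest way to see which mode suits the document.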