Hi Joe,
Thank you for your patience.
We have tried to reproduce the issue “Text is not extracted properly while performing Extract text”, but it is working as expected on our side. Kindly refer to the sample in the below link, which we created to reproduce the reported issue.
Kindly share the following details so that we can analyze the issue further and provide a solution at the earliest.
1. A modified or simple sample with which the issue can be reproduced.
2. The replication procedure, or a screenshot illustrating the issue.
3. The PDF document with which the issue can be reproduced.
4. The Syncfusion.Xamarin.Pdf version.
With Regards,
Gayathri R
Hi Joe,
Based on the provided details, we suspect that your requirement is to add a text box, prefilled with text, for each searched word. We have modified the provided code snippet to add a text box with prefilled text and shared it in the following location,
Please try this and get back to us with more details about your requirement if you still have any concerns. Otherwise, we can set up a web meeting to look into it and provide a solution. Please let us know your availability; we will make every effort to schedule it on a date and time of your convenience.
Regards,
Akshaya
Hi Joe,
Based on the provided details, we suspect that your requirement is to add a text box for the text (John__Smith) next to the searched text ($T$:), and to replace each ‘_’ with a space.
But in the provided code snippet, you have searched only for the text ($T$:), split it at the colon (‘:’), and then added the text box for that. There is no code that adds the text box for (John__Smith).
If you search for the text ($T$:), it will return only the text available in that array item, not the whole word ($T$:John__Smith). If you already know the text for which you want to add the text box, you can search for that word itself.
Note: The text is extracted based on the format preserved in the PDF document structure, so a run of text without spaces ($T$:John__Smith) is not necessarily preserved as a single word. In the provided document, “$T$:” and “John__Smith” are rendered separately (as per the document structure), so they will not be in the same array item.
Please get back to us with more details about your requirement if you still have any concerns on this.
Regards,
Akshaya
Query |
Details |
When you say: "For the provided document “$T$:” and “John__Smith” will be rendered separately (as per document structure) so it will not be in same array." How do I get it to render together? For example, in Crystal Reports it is just one text block, so there is no reason it should be separate. |
The format of the text content in a PDF document depends on the PDF creator that produced the document, not on the Crystal Reports format.
The text is rendered and extracted from the PDF document based on the TJ operators. The PDF creator decides how much text goes into a single TJ operator while writing the document stream, so one operator may contain a single character, a single word, or multiple words. |
"If you have searched for the text($T$:) then it will return the text available in that array only, not the whole word ($T$:John__Smith) . If you are aware of the text which you want to add text box then you can search for that word itself."
This is incorrect. You can see from the following code:
//Define the key that will show where to put a text box.
String key = "$T$:";
//Search for the text given in the key.
List<TextData> found_texts = page_text.Value.Where(w => w.Text.Contains(key)).ToList();
It will bring back any array items that CONTAIN the string "$T$:". Therefore, if the array contained an entry "$T$:John__Smith", it would bring back the whole word.
I then set the textboxfield value using:
//Define the prefilled text.
editableField.Text = word.Text?.Split(':')?.LastOrDefault()?.Replace("__", " ")?.Trim() ?? String.Empty;
This gives me anything past the ':' and replaces every "__" in that string with a blank space. Therefore, after this code runs, editableField.Text = "John Smith", and since I'm setting the text, the value is prefilled in the text box.
In doing this, I have found the element via Contains without needing to know the value beforehand, and then prefilled the text box with that value.
|
Yes, if you search for the string “$T$:”, it will return the array items that contain the provided string.
But for the provided document, the array item contains only “$T$:”. The text is rendered and extracted from the PDF document based on the TJ operators, and an array item in the TextData contains the text available in a single TJ operator.
Please find the below screenshot showing that the text ($T$:) and (John__Smith) are preserved in two different TJ operators. So those texts are in different array items in the extracted text.
|
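Since the key and its value can land in separate array items, one possible workaround is to read the value from the same item when it is there and otherwise fall back to the item that follows the key. The sketch below only illustrates that idea: `ExtractedItem` is a hypothetical stand-in for the viewer's TextData item, `FindValue` is an illustrative helper name, and it assumes the items arrive in reading order, which the PDF structure does not guarantee.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for one extracted text item (one item per TJ operator).
public sealed class ExtractedItem
{
    public string Text { get; }
    public ExtractedItem(string text) { Text = text; }
}

public static class KeyedValueFinder
{
    // Returns the prefill text for the first item that contains `key`.
    // Assumption: `items` is in reading order, so the value either follows
    // the key inside the same item or is the very next item.
    public static string FindValue(IReadOnlyList<ExtractedItem> items, string key)
    {
        for (int i = 0; i < items.Count; i++)
        {
            string text = items[i].Text ?? string.Empty;
            int at = text.IndexOf(key, StringComparison.Ordinal);
            if (at < 0) continue;

            // Case 1: key and value share one item, e.g. "$T$:John__Smith".
            string rest = text.Substring(at + key.Length);

            // Case 2: the key stands alone, so take the following item instead.
            if (rest.Trim().Length == 0 && i + 1 < items.Count)
            {
                rest = items[i + 1].Text ?? string.Empty;
            }

            // Same clean-up as in the thread: "__" becomes a space.
            return rest.Replace("__", " ").Trim();
        }
        return string.Empty;
    }
}
```

With this approach the text box would presumably be positioned using the bounds of the value item rather than the key item; that part depends on the actual TextData type and is left out here.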
Hi Joe,
The EJ2 PDF Viewer server library allows you to extract the text from a page along with its bounds. ExtractText() returns the bounds of each character, and from those we can get the bounds of the words. We can then add the text box field for the words using PdfLoadedDocument, as per your requirement. We have created a sample for this and shared it in the following location,
Please find the below UG link to extract the text from the PDF document,
Please find the below KB to get the bounds of the words from the extracted text,
Please try this and let us know if you have any concerns on this.
Regards,
Akshaya
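As a rough illustration of how word bounds can be derived from per-character bounds, the sketch below merges the boxes of consecutive non-whitespace characters. It is not the code from the KB article: `Box`, `Glyph`, `Word`, and `WordGrouper` are hypothetical types introduced only for this example, and it assumes that words are separated by whitespace characters in the extracted sequence.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical axis-aligned rectangle; a real project would use the library's own bounds type.
public struct Box
{
    public float Left, Top, Right, Bottom;
    public Box(float left, float top, float right, float bottom)
    {
        Left = left; Top = top; Right = right; Bottom = bottom;
    }
    // Smallest box that covers both this box and `other`.
    public Box Union(Box other)
    {
        return new Box(Math.Min(Left, other.Left), Math.Min(Top, other.Top),
                       Math.Max(Right, other.Right), Math.Max(Bottom, other.Bottom));
    }
}

// Hypothetical per-character result: the character and where it sits on the page.
public sealed class Glyph
{
    public char Character;
    public Box Bounds;
    public Glyph(char character, Box bounds) { Character = character; Bounds = bounds; }
}

public sealed class Word
{
    public string Text;
    public Box Bounds;
    public Word(string text, Box bounds) { Text = text; Bounds = bounds; }
}

public static class WordGrouper
{
    // Groups consecutive non-whitespace glyphs into words and unions their bounds.
    public static List<Word> Group(IEnumerable<Glyph> glyphs)
    {
        var words = new List<Word>();
        var text = new StringBuilder();
        Box bounds = default(Box);
        foreach (Glyph g in glyphs)
        {
            if (char.IsWhiteSpace(g.Character))
            {
                AddWord(words, text, bounds);   // whitespace closes the current word
                continue;
            }
            bounds = text.Length == 0 ? g.Bounds : bounds.Union(g.Bounds);
            text.Append(g.Character);
        }
        AddWord(words, text, bounds);           // close the final word, if any
        return words;
    }

    private static void AddWord(List<Word> words, StringBuilder text, Box bounds)
    {
        if (text.Length == 0) return;
        words.Add(new Word(text.ToString(), bounds));
        text.Clear();
    }
}
```

A fuller version would probably also start a new word when the vertical position changes between lines; that is omitted to keep the sketch short.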
HttpContext.Current.Server.MapPath("~/Data/Test Form.pdf");
|
Hi Joe,
Based on the provided details, we suspect that the folder containing the project does not have write access. We have embedded the Pdfium rendering engine in our PDF Viewer for robust rendering, so the Pdfium DLL is generated while running the project. Kindly place your project in a folder that has read and write access. Otherwise, please copy the Pdfium assemblies manually into the Pdfium folder ('C:\Windows\SysWOW64\inetsrv\x86\') to resolve the reported issue. You can also build the project in a local folder (which has read and write access) and then copy the files for hosting on the remote machine; the Pdfium assembly will be available in this case.
Note: Once the Pdfium assemblies have been copied manually, the PDF Viewer will not generate the assembly again.
Regards,
Akshaya
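To confirm whether missing write access is the cause, a quick probe such as the one below may help. It is only a diagnostic sketch using standard .NET file APIs, not part of the PDF Viewer; `FolderAccess.CanWrite` is a hypothetical helper name, and the folder to test is whatever directory the hosting process runs from.

```csharp
using System;
using System.IO;

public static class FolderAccess
{
    // Returns true if the current process can create a file in `folder`.
    // The probe file is removed automatically when the stream is closed.
    public static bool CanWrite(string folder)
    {
        try
        {
            string probe = Path.Combine(folder, Path.GetRandomFileName());
            using (File.Create(probe, 1, FileOptions.DeleteOnClose)) { }
            return true;
        }
        catch (UnauthorizedAccessException)
        {
            return false;   // no permission to write here
        }
        catch (IOException)
        {
            return false;   // folder missing or otherwise not usable
        }
    }
}
```

Running the check under the same account as the web application matters, because the application pool identity often has fewer permissions than an interactive user.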