You might have a large number of PDF files on disk or in a database. Before you automate or process those files, you need to find any corrupted ones and take the necessary action. But opening every single file in a PDF reader to check whether it is corrupt is tedious.
To save time and effort, Syncfusion PDF Library lets you identify corrupted PDF files in C# or VB.NET. The PdfDocumentAnalyzer class finds corrupted PDF files by analyzing the PDF document structure and syntax.
Using these APIs, you can ensure that a PDF document is not corrupted before you start processing it.
For example, the following C# code checks whether a given PDF file is corrupted.
using System;
using System.IO;
using System.Text;
using Syncfusion.Pdf;
using Syncfusion.Pdf.Parsing;

static void Main(string[] args)
{
    //Load the PDF file as stream.
    using (FileStream pdfStream = new FileStream("inputFile.pdf", FileMode.Open, FileAccess.Read))
    {
        //Create a new instance of PDF document syntax analyzer.
        PdfDocumentAnalyzer analyzer = new PdfDocumentAnalyzer(pdfStream);

        //Analyze the syntax and return the results.
        SyntaxAnalyzerResult analyzerResult = analyzer.AnalyzeSyntax();

        //Check whether the document is corrupted or not.
        if (analyzerResult.IsCorrupted)
        {
            StringBuilder strBuilder = new StringBuilder();
            strBuilder.AppendLine("The PDF document is corrupted.");

            //List each syntax error reported by the analyzer.
            int count = 1;
            foreach (PdfException exception in analyzerResult.Errors)
            {
                strBuilder.AppendLine(count++.ToString() + ": " + exception.Message);
            }
            Console.WriteLine(strBuilder);
        }
        else
        {
            Console.WriteLine("No syntax error found in the provided PDF document");
        }

        //Release the resources held by the analyzer.
        analyzer.Close();
    }
}
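Since the motivation is checking many files at once, the single-file check above can be wrapped in a loop over a folder. The following is a minimal sketch, not part of the original sample: the class name PdfBatchChecker and the method FindCorruptedPdfs are hypothetical, and it uses only the PdfDocumentAnalyzer calls shown above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Syncfusion.Pdf;
using Syncfusion.Pdf.Parsing;

// Hypothetical helper (not part of Syncfusion PDF Library).
static class PdfBatchChecker
{
    // Returns the paths of all *.pdf files under 'folder' (including
    // subfolders) that the syntax analyzer reports as corrupted.
    public static List<string> FindCorruptedPdfs(string folder)
    {
        var corrupted = new List<string>();

        foreach (string path in Directory.EnumerateFiles(folder, "*.pdf", SearchOption.AllDirectories))
        {
            using (FileStream pdfStream = new FileStream(path, FileMode.Open, FileAccess.Read))
            {
                PdfDocumentAnalyzer analyzer = new PdfDocumentAnalyzer(pdfStream);
                try
                {
                    // Same check as the single-file example.
                    SyntaxAnalyzerResult result = analyzer.AnalyzeSyntax();
                    if (result.IsCorrupted)
                    {
                        corrupted.Add(path);
                    }
                }
                finally
                {
                    // Close the analyzer even if analysis throws.
                    analyzer.Close();
                }
            }
        }

        return corrupted;
    }
}
```

You could then print each path returned by PdfBatchChecker.FindCorruptedPdfs for a folder of your choice. The sketch does not handle files that cannot be opened (for example, because of missing permissions); those exceptions propagate to the caller.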
Syncfusion PDF Library can repair basic cross-reference offset issues in PDF files and open them for further processing. This is done using the PdfLoadedDocument constructor overloads that accept an openAndRepair parameter.
The following code example will repair the basic cross-reference offset issues and open the PDF document.
using System.IO;
using Syncfusion.Pdf.Parsing;

static void Main(string[] args)
{
    using (FileStream pdfStream = new FileStream(@"input.pdf", FileMode.Open, FileAccess.Read))
    {
        //Load the corrupted document with the openAndRepair flag set to true to repair it.
        PdfLoadedDocument loadedPdfDocument = new PdfLoadedDocument(pdfStream, true);

        //Do PDF processing.

        //Save the document.
        using (FileStream outputStream = new FileStream(@"result.pdf", FileMode.Create))
        {
            loadedPdfDocument.Save(outputStream);
        }

        //Close the document.
        loadedPdfDocument.Close(true);
    }
}
Note: The openAndRepair option cannot repair complex document corruption.
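Because repair is limited to basic issues, it can help to treat it as an attempt that may fail. The following is a minimal sketch under stated assumptions: the class PdfRepairHelper and the method TryRepair are hypothetical names, and the sketch assumes that a document the library cannot repair surfaces as a PdfException when it is loaded or saved. Check the exception type against the version of the library you use.

```csharp
using System;
using System.IO;
using Syncfusion.Pdf;
using Syncfusion.Pdf.Parsing;

// Hypothetical helper (not part of Syncfusion PDF Library).
static class PdfRepairHelper
{
    // Tries to open 'inputPath' with openAndRepair enabled and save a copy
    // to 'outputPath'. Returns false if the library reports a PDF error.
    public static bool TryRepair(string inputPath, string outputPath)
    {
        using (FileStream pdfStream = new FileStream(inputPath, FileMode.Open, FileAccess.Read))
        {
            PdfLoadedDocument document = null;
            try
            {
                // Second argument is the openAndRepair flag from the example above.
                document = new PdfLoadedDocument(pdfStream, true);

                using (FileStream outputStream = new FileStream(outputPath, FileMode.Create))
                {
                    document.Save(outputStream);
                }
                return true;
            }
            catch (PdfException ex)
            {
                // Assumption: unrepairable documents are reported as PdfException.
                Console.WriteLine("Could not repair " + inputPath + ": " + ex.Message);
                return false;
            }
            finally
            {
                if (document != null)
                {
                    document.Close(true);
                }
            }
        }
    }
}
```

A caller could run the syntax analyzer first and call TryRepair only for files reported as corrupted, then set aside the files for which it returns false for manual inspection.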
You can use these PDF corruption validation and repair APIs in .NET Framework, .NET Core, UWP, and Xamarin applications.
You can download the samples for detecting and repairing corrupted PDF files from this location.
As you can see, Syncfusion PDF Library provides APIs to find out whether a PDF file is corrupt or not by analyzing its structure and syntax. It also provides APIs to repair basic cross-reference offset-level corruption in PDF files. You can use these to avoid unexpected behavior while processing the PDF files in your .NET applications.
If you are new to our PDF Library, we highly recommend following our Getting Started guide.
If you’re already a Syncfusion user, you can download the product setup here. Otherwise, you can download a free, 30-day trial here.
Have any questions or require clarification about these features? Please let us know in the comments below. You can also contact us through our support forum, Direct-Trac or feedback portal. We are happy to assist you!
If you liked this blog post, we think you’ll also enjoy the following related blog posts: