PDF parsed as invalid #1718

Open
@theprogramsam

Description

What were you trying to do?

Fetch a PDF, load it with pdf-lib, add more content to its pages, and then write the result to a file.

How did you attempt to do it?

const headers = new Headers();
headers.append("Access-Control-Allow-Origin", "https://my-aws-link.s3.us-west-2.amazonaws.com");

const pdfBytes = await fetch(pdf["presigned_link"], { headers: headers }).then(res => res.arrayBuffer());
const pdfDoc = await PDFLib.PDFDocument.load(pdfBytes);
const pdfDataUriOriginal = await pdfDoc.saveAsBase64({ dataUri: true });

Setting the resulting data URI as the source of an iframe does not work. Decoding the files in Ruby gives:

# Original file
3.2.1 :017 > f = File.read(Rails.root.join("app", "public", "SOC2.pdf"))
 => "%PDF-1.7\r%\xE2\xE3\xCF\xD3\r\n1182 0 obj\r<</Linearized 1/L 2860508/O 1184/E 594904/N 4/T 2859044/H [ 510 260]>>\rendobj\r..."
# File containing the saved base64 payload of pdfDataUriOriginal
3.2.1 :012 > f = File.read(Rails.root.join("app", "public", "base64_pdf"))
 => "JVBERi0xLjcKJYGBgYEKCjIgMCBvYmoKPDwKL0xlbmd0aCA0Ngo+PgpzdHJlYW0KL0RldmljZVJHQiBDUwovRGV2aWNlUkdCIGNzCnEKL0UxIGdzCi9YMSBEbwp..."

3.2.1 :013 > de = Base64.decode64(f)
 => "%PDF-1.7\n%\x81\x81\x81\x81\n\n2 0 obj\n<<\n/Length 46\n>>\nstream\n/DeviceRGB CS\n/DeviceRGB cs\nq\n/E1 gs\n/X1 Do\nQ\n\ne..."

3.2.1 :014 > File.write(Rails.root.join("app", "public", "base64_pdf.pdf"), de)
(irb):14:in `write': "\x81" from ASCII-8BIT to UTF-8 (Encoding::UndefinedConversionError)

If I force the encoding, the file saves but is malformed.

File.write(Rails.root.join("app", "controllers", "e_signature", "base64_pdf.pdf"), de.force_encoding('ISO-8859-1').encode('UTF-8'))
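
For reference, the UndefinedConversionError above comes from the write step, not from the PDF data itself. Base64.decode64 returns a binary (ASCII-8BIT) string, and File.write appears to be transcoding it to UTF-8 here (probably because the Rails console sets Encoding.default_internal to UTF-8; that is an assumption about this environment). Relabeling the string as ISO-8859-1 and encoding it to UTF-8 avoids the exception, but it rewrites every byte above 0x7F as two bytes, which would explain the malformed file. Below is a minimal sketch of writing the decoded bytes unchanged with File.binwrite. It uses only the first 20 base64 characters quoted above, and the temporary file is a hypothetical stand-in for the real path.

```ruby
require "tempfile"

# First 20 base64 characters of the pdf-lib output quoted above.
encoded = "JVBERi0xLjcKJYGBgYEK"

# unpack1("m") is the decoding step that Base64.decode64 performs;
# the result is a binary (ASCII-8BIT) string.
decoded = encoded.unpack1("m")

# The workaround from the report: every \x81 byte becomes \xC2\x81,
# so the transcoded string is longer than the decoded one.
transcoded = decoded.dup.force_encoding("ISO-8859-1").encode("UTF-8")

# Stand-in for the real output path (hypothetical).
file = Tempfile.new(["roundtrip", ".pdf"])
file.close

# binwrite writes the bytes as they are, with no encoding conversion.
File.binwrite(file.path, decoded)
written = File.binread(file.path)

puts decoded.bytesize     # 15
puts transcoded.bytesize  # 19
puts written == decoded   # true
```

In the session above, the equivalent call would be File.binwrite(Rails.root.join("app", "public", "base64_pdf.pdf"), de).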

[Image: Screenshot 2025-01-08 at 11 48 06 PM]

I have attached the file.

[Image: Screenshot 2025-01-08 at 11 53 53 PM]

SOC2.pdf

What actually happened?

The iframe shows a blank screen instead of the PDF. The bytes of the original file differ from the decoded pdf-lib base64 output.

What did you expect to happen?

Loading the file through the library and saving it without any modifications should produce a PDF that displays correctly in the browser and has the same bytes as the original.
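
A note on the byte-for-byte part of this expectation, offered as a sketch and not a diagnosis: as far as I understand, pdf-lib parses the document and serializes it again on save, so the output will generally not match the input even when nothing was modified. The two dumps in the Ruby session above already show a different binary comment line, different line endings, and different object numbering. A byte comparison therefore cannot distinguish a valid re-save from a broken one. The snippet below uses only the leading bytes quoted in that session to show that the two files differ byte for byte while both still begin with a PDF 1.7 header.

```ruby
# Leading bytes of each file, copied from the Ruby session above.
# String#b returns a binary (ASCII-8BIT) copy of the literal.
original_head = "%PDF-1.7\r%\xE2\xE3\xCF\xD3\r\n".b
resaved_head  = "%PDF-1.7\n%\x81\x81\x81\x81\n\n".b

# Byte-for-byte equality fails on the header alone.
same_bytes = original_head == resaved_head

# Both still start with the same PDF version marker.
both_pdf = [original_head, resaved_head].all? { |head| head.start_with?("%PDF-1.7") }

puts same_bytes  # false
puts both_pdf    # true
```

A header check like this says nothing about whether the rest of the re-saved file is well formed; it only shows that differing bytes are not, by themselves, evidence of corruption.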

How can we reproduce the issue?

<script src="https://unpkg.com/pdf-lib"></script>
<script src="https://unpkg.com/@pdf-lib/fontkit/dist/fontkit.umd.min.js"></script>

Version

Whichever version is currently hosted on https://unpkg.com/pdf-lib

What environment are you running pdf-lib in?

Browser

Checklist

  • My report includes a Short, Self Contained, Correct (Compilable) Example.
  • I have attached all PDFs, images, and other files needed to run my SSCCE.

Additional Notes

No response
