Site2pdf Cloud #6

@laiso

I am considering offering site2pdf as an online service for mobile and non-technical users. A user would enter a website URL, and the service would convert the site into a PDF for download. To get there quickly, I evaluated Cloudflare's Browser Rendering API (a managed Puppeteer server).

https://gist.github.com/laiso/c36ac504afb2715831ef9410853753fb/

This code uses Cloudflare Workers to provide the following functions:

  1. Extract links from the specified URL and add PDF generation tasks to a queue.
  2. Select a random browser session using Puppeteer.
  3. Fetch messages from the queue, visit each URL using the browser session, generate PDFs, and save them as segments in the R2 bucket.
  4. Retrieve multiple PDF files from the R2 bucket, merge them, and return a single PDF file.
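
As a rough illustration of step 1, the sketch below pulls same-origin links out of a page's HTML and turns them into queue messages. It is not the code from the gist: the `PdfTask` shape and the `extractLinks` and `toTasks` helpers are hypothetical names, and the regex-based link matching is a simplification that a real crawler would replace with an HTML parser.

```typescript
// Hypothetical message shape for the PDF generation queue (an assumption,
// not the format used in the gist).
interface PdfTask {
  url: string;
  index: number; // position of the segment when the PDFs are merged later
}

// Collect unique same-origin links from raw HTML.
// Regex matching is used only to keep the sketch dependency-free.
function extractLinks(html: string, baseUrl: string): string[] {
  const base = new URL(baseUrl);
  const seen = new Set<string>();
  const hrefPattern = /<a\s[^>]*?href\s*=\s*["']([^"']+)["']/gi;
  let match: RegExpExecArray | null;
  while ((match = hrefPattern.exec(html)) !== null) {
    const href = match[1];
    if (!href) continue;
    let resolved: URL;
    try {
      resolved = new URL(href, base); // resolves relative links against the page URL
    } catch {
      continue; // skip malformed hrefs
    }
    if (resolved.origin !== base.origin) continue; // stay on the same site
    resolved.hash = ""; // "#section" variants point at the same page
    seen.add(resolved.href);
  }
  return Array.from(seen);
}

// Turn the link list into ordered queue messages.
function toTasks(urls: string[]): PdfTask[] {
  return urls.map((url, index) => ({ url, index }));
}
```

In a Worker, each task would then be handed to the queue producer binding. Keeping an `index` on every message is one way to preserve page order when the segments are merged in step 4.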

Discovered Issues

  1. Resource Constraints of the Browser Rendering API

    • Only two browser instances can run simultaneously, and each one occupies a consumer worker. Adding more queue consumers therefore does not increase throughput, so scaling out with a queue is not feasible. The API seems to be intended for in-house use.
  2. Execution Time and Memory Constraints of Cloudflare Workers

    • PDF generation consumes significant resources, and the execution time and memory limits of Cloudflare Workers are too low for it. These limits are a major obstacle.
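
To make the first constraint concrete, here is a back-of-the-envelope sketch of why extra queue consumers do not help. The cap of two sessions comes from the observation above. The function name and the per-page timing in the usage note are assumptions for illustration, not measurements.

```typescript
// Session cap observed in this issue.
const MAX_BROWSER_SESSIONS = 2;

// Estimate wall-clock time for a crawl, assuming every page takes the same
// time to render and each consumer needs its own browser session.
function estimateWallClockSeconds(
  pageCount: number,
  secondsPerPage: number,
  consumers: number
): number {
  if (consumers < 1) throw new Error("at least one consumer is required");
  // Consumers beyond the session cap have no browser to work with.
  const effectiveParallelism = Math.min(consumers, MAX_BROWSER_SESSIONS);
  const rounds = Math.ceil(pageCount / effectiveParallelism);
  return rounds * secondsPerPage;
}
```

Under this model, a 100-page site at a hypothetical 5 seconds per page takes 250 seconds with two consumers and still 250 seconds with ten, because the effective parallelism never exceeds two.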

Future Actions

  1. Consideration of Alternative Cloud Services

    • Firecrawl and Jina Reader are deployed to Fly.io and Cloud Functions. These platforms offer more resources and can scale out PDF generation tasks, which makes them a better fit.
  2. Development of a Desktop Application (#9)

    • Create a desktop application using Electron so that users generate PDFs with their own machine's resources. This approach avoids cloud resource constraints and enables smoother PDF generation.
