Hai3Ne/pdf-spider

Web to PDF Converter

A Node.js tool for converting GitBook documentation pages into PDF files. Supports multiple URLs and customizable output settings.

System Requirements

  • Node.js version 14.0.0 or higher
  • NPM (Node Package Manager)
  • At least 500 MB of free disk space (for Playwright and Chromium)
  • Stable internet connection

Installation

1. Install Node.js and NPM

  1. Visit the official Node.js website
  2. Download and install the LTS (Long Term Support) version
  3. Verify installation:
node --version
npm --version
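
The version requirement can also be checked from a script. The following is a minimal sketch, assuming the 14.0.0 minimum listed under System Requirements; the file name and the `meetsMinimum` helper are hypothetical and not part of this repository.

```javascript
// check-node.js (hypothetical helper, not part of this repository)

// Compares two dotted version strings numerically, part by part.
function meetsMinimum(current, minimum) {
  const cur = current.split(".").map(Number);
  const min = minimum.split(".").map(Number);
  for (let i = 0; i < Math.max(cur.length, min.length); i++) {
    const a = cur[i] || 0; // missing or non-numeric parts count as 0
    const b = min[i] || 0;
    if (a !== b) return a > b;
  }
  return true; // versions are equal
}

// process.versions.node is the running Node.js version without the leading "v".
if (!meetsMinimum(process.versions.node, "14.0.0")) {
  console.error(`Node.js 14.0.0 or higher is required, found ${process.versions.node}`);
  process.exitCode = 1;
}
```

Running `node check-node.js` prints nothing when the installed version is new enough and sets a non-zero exit code otherwise.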

2. Create Project

# Create project directory
mkdir gitbook-spider 
cd gitbook-spider

# Initialize Node.js project
npm init -y 

3. Install Dependencies

# Install Playwright
npm install playwright

# Install Playwright browsers
npx playwright install chromium

4. Configure Project Files

Create and set up the following files:

  1. config.json - Configuration settings:
{
  "siteConfig": {
    "chapterLinksElmSelector": "nav a",
    "bodySelector": "main",
    "bookContentSelector": "main",
    "headerSelector": "header",
    "navNextSelector": "nav",
    "sideBarSelector": "aside"
  },
  "browserConfig": {
    "headless": false,
    "timeout": 60000
  },
  "pdfConfig": {
    "format": "A4",
    "margin": {
      "top": "50px",
      "bottom": "50px",
      "left": "50px",
      "right": "50px"
    }
  },
  "books": [
    {
      "url": "https://your-gitbook-url.com",
      "title": "YourBookTitle"
    }
  ],
  "outputDir": "./output"
}
  2. pdfSpider.js and index.js - Copy the contents of these files from this repository into files of the same names in the project directory.
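
Before running the tool, it can help to confirm that config.json parses and has the expected top-level keys. This is a minimal sketch, assuming the key names from the example above; `loadConfig` and `findConfigProblems` are hypothetical helpers, and the repository's own index.js may load the file differently.

```javascript
// loadConfig.js (hypothetical helper, not part of this repository)
const fs = require("fs");
const path = require("path");

// Top-level keys taken from the config.json example above.
const REQUIRED_KEYS = ["siteConfig", "browserConfig", "pdfConfig", "books", "outputDir"];

// Returns a list of human-readable problems; an empty list means the shape looks right.
function findConfigProblems(config) {
  if (config === null || typeof config !== "object") {
    return ["config must be a JSON object"];
  }
  const problems = [];
  for (const key of REQUIRED_KEYS) {
    if (!(key in config)) problems.push(`missing key: ${key}`);
  }
  if ("books" in config && !Array.isArray(config.books)) {
    problems.push("books must be an array");
  }
  return problems;
}

// Reads and parses the file; JSON.parse throws on syntax errors, including comments.
function loadConfig(file = path.join(process.cwd(), "config.json")) {
  const config = JSON.parse(fs.readFileSync(file, "utf8"));
  const problems = findConfigProblems(config);
  if (problems.length > 0) {
    throw new Error(`Invalid config.json: ${problems.join("; ")}`);
  }
  return config;
}

module.exports = { loadConfig, findConfigProblems };
```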

Usage

1. Add URLs to Configuration

Update the books array in config.json:

"books": [
  {
    "url": "https://docs.twgamesdev.com/uhfps/guides/managing-inputs",
    "title": "ManagingInputs"
  },
  {
    "url": "https://your-second-url.com",
    "title": "SecondBook"
  }
]
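
Each entry in the books array pairs a url with a title. The sketch below shows one way such entries could be checked and mapped to output file names. It assumes one PDF per entry, named after the title; `validateBook` and `outputPathFor` are hypothetical helpers, not functions from this repository.

```javascript
// books.js (hypothetical helper, not part of this repository)
const path = require("path");

// Checks one entry of the books array: the URL must parse as http(s)
// and the title must be a non-empty string. Returns null when no problem is found.
function validateBook(book) {
  let parsed;
  try {
    parsed = new URL(book.url);
  } catch (err) {
    return `invalid url: ${book.url}`;
  }
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    return `unsupported protocol: ${parsed.protocol}`;
  }
  if (typeof book.title !== "string" || book.title.trim() === "") {
    return "title must be a non-empty string";
  }
  return null;
}

// Assumption: one PDF per book, named after its title.
// Characters outside a conservative safe set are replaced with underscores.
function outputPathFor(book, outputDir) {
  const safeTitle = book.title.replace(/[^A-Za-z0-9._-]+/g, "_");
  return path.join(outputDir, `${safeTitle}.pdf`);
}
```

Titles without spaces or punctuation, such as ManagingInputs, pass through unchanged.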

2. Run the Program

node index.js

PDF files will be generated in the directory set by outputDir in config.json (./output in the example configuration).
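
For orientation, the core of a page-to-PDF conversion with Playwright can be sketched as follows. This is an illustration only, not the code in pdfSpider.js: `buildPdfOptions` and `convertBook` are hypothetical names, and the sketch renders a single page per book without using the siteConfig selectors.

```javascript
// convert-sketch.js (illustrative only; the repository's pdfSpider.js may work differently)
const fs = require("fs");
const path = require("path");

// Maps a book entry and pdfConfig onto the options object that page.pdf() accepts.
function buildPdfOptions(book, pdfConfig, outputDir) {
  return {
    path: path.join(outputDir, `${book.title}.pdf`),
    format: pdfConfig.format,
    margin: pdfConfig.margin,
  };
}

async function convertBook(book, config) {
  // Required lazily so buildPdfOptions can be used without Playwright installed.
  const { chromium } = require("playwright");
  fs.mkdirSync(config.outputDir, { recursive: true });
  // Assumption: this sketch forces headless mode regardless of browserConfig.headless,
  // because Playwright documents page.pdf() as a headless Chromium feature.
  const browser = await chromium.launch({ headless: true });
  try {
    const page = await browser.newPage();
    await page.goto(book.url, { timeout: config.browserConfig.timeout });
    await page.pdf(buildPdfOptions(book, config.pdfConfig, config.outputDir));
  } finally {
    await browser.close(); // always release the browser, even after an error
  }
}
```

Calling `convertBook(config.books[0], config)` with a parsed config.json would write one PDF for the first entry.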

Customization

PDF Format Settings

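Modify pdfConfig in config.json. The // comments below are annotations only; JSON does not allow comments, so omit them from the actual file:

"pdfConfig": {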
  "format": "A4",  // A4, Letter, Legal...
  "margin": {
    "top": "50px",
    "bottom": "50px",
    "left": "50px",
    "right": "50px"
  }
}
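
A typo in the format or margin values may only surface when the PDF is written. The sketch below checks pdfConfig up front against the paper formats and margin units listed in Playwright's page.pdf() documentation; `checkPdfConfig` is a hypothetical helper, not part of this repository.

```javascript
// Paper formats listed in Playwright's page.pdf() documentation.
const PAPER_FORMATS = ["Letter", "Legal", "Tabloid", "Ledger", "A0", "A1", "A2", "A3", "A4", "A5", "A6"];

// Margin strings are a number followed by one of the units page.pdf() accepts.
const MARGIN_PATTERN = /^\d+(\.\d+)?(px|in|cm|mm)$/;

// Returns a list of problems; an empty list means the settings look usable.
function checkPdfConfig(pdfConfig) {
  const problems = [];
  if (pdfConfig.format !== undefined) {
    // Compared case-insensitively so that "a4" is not rejected.
    const format = String(pdfConfig.format).toLowerCase();
    if (!PAPER_FORMATS.some((name) => name.toLowerCase() === format)) {
      problems.push(`unknown format: ${pdfConfig.format}`);
    }
  }
  const margin = pdfConfig.margin || {};
  for (const side of ["top", "bottom", "left", "right"]) {
    const value = margin[side];
    // Omitted sides use the default; bare numbers are interpreted as pixels.
    if (value === undefined || typeof value === "number") continue;
    if (!MARGIN_PATTERN.test(String(value))) {
      problems.push(`margin.${side} has an unsupported value: ${value}`);
    }
  }
  return problems;
}
```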

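Browser Settings

Modify browserConfig in config.json. The // comments are annotations only and must be omitted from the actual file, since JSON does not allow comments:

"browserConfig": {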
  "headless": false,  // true for no GUI
  "timeout": 60000    // in milliseconds
}
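
How these two settings reach Playwright depends on pdfSpider.js. One plausible mapping, shown here as a sketch with hypothetical helper names, passes headless to `chromium.launch()` and applies timeout through `page.setDefaultTimeout()`.

```javascript
// Builds the options object for chromium.launch().
// Assumption: headless defaults to true unless the config explicitly sets it to false.
function buildLaunchOptions(browserConfig = {}) {
  return { headless: browserConfig.headless !== false };
}

// Falls back to 30000 ms, Playwright's default timeout, when the value is missing or invalid.
function resolveTimeout(browserConfig = {}) {
  const timeout = browserConfig.timeout;
  return Number.isFinite(timeout) && timeout >= 0 ? timeout : 30000;
}

async function openPage(browserConfig) {
  // Required lazily so the helpers above can be used without Playwright installed.
  const { chromium } = require("playwright");
  const browser = await chromium.launch(buildLaunchOptions(browserConfig));
  const page = await browser.newPage();
  // Applies to navigation and actions on this page; the value is in milliseconds.
  page.setDefaultTimeout(resolveTimeout(browserConfig));
  return { browser, page };
}
```

The caller is responsible for calling `browser.close()` once it has finished with the returned page.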
