
[components] Scrapeless - update actions #17493


Open · wants to merge 1 commit into base: master
Conversation

joy-chanboop
Contributor

@joy-chanboop joy-chanboop commented Jul 7, 2025

WHY

  • Updated @scrapeless-ai/sdk dependency to version 1.6.0 in package.json.
  • Updated Scraping API action to support Google Search and Google Trends with new parameters for better data retrieval.
  • Updated how the Scrapeless client is obtained.

Summary by CodeRabbit

  • New Features
    • Added support for Google Trends data scraping alongside Google Search, with comprehensive parameter options.
  • Improvements
    • Enhanced input customization for scraping actions based on selected API server.
    • Improved efficiency by reusing client instances during scraping operations.
    • Updated environment handling for the Scrapeless client to support asynchronous loading.
  • Dependency Updates
    • Upgraded the Scrapeless SDK to version 1.6.0.
  • Bug Fixes
    • Addressed issues with input property handling and client instantiation for more reliable scraping.
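
The "input customization based on selected API server" item above typically works through a Pipedream-style `additionalProps()` that returns a different prop schema per selection. This is a minimal stand-alone sketch; the prop names and server keys are illustrative assumptions, not the PR's actual definitions.

```javascript
// Hypothetical sketch of per-server dynamic props; the prop names and
// server keys are illustrative assumptions, not the PR's actual code.
const GOOGLE_SEARCH_PROPS = {
  q: { type: "string", label: "Search Query" },
  tbm: { type: "string", label: "Search Type", optional: true },
};

const GOOGLE_TRENDS_PROPS = {
  q: { type: "string", label: "Search Query" },
  dataType: { type: "string", label: "Data Type", optional: true },
};

// Pipedream calls additionalProps() to build the form after the user
// picks an API server; returning a different object changes the inputs.
function additionalProps(apiServer) {
  return apiServer === "googleTrends"
    ? GOOGLE_TRENDS_PROPS
    : GOOGLE_SEARCH_PROPS;
}
```

Returning plain objects keyed by server makes it easy to add another Google API later without touching the selection logic.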


vercel bot commented Jul 7, 2025

The latest updates on your projects (1 skipped deployment):

| Name | Status | Updated (UTC) |
| --- | --- | --- |
| pipedream-docs-redirect-do-not-edit | ⬜️ Ignored | Jul 7, 2025 6:36am |

Contributor

coderabbitai bot commented Jul 7, 2025

Walkthrough

The updates refactor and extend Scrapeless action modules, notably making the Scrapeless client initialization asynchronous and updating its usage across actions. The scraping API action now supports Google Trends with extensive parameterization. Versions were incremented, input property logic was refactored, and the Scrapeless SDK dependency was updated to version 1.6.0.

Changes

| File(s) | Change Summary |
| --- | --- |
| components/scrapeless/actions/crawler/crawler.mjs | Refactored `additionalProps` logic for input properties, updated the version, reordered methods, and cached the client instance for reuse. |
| components/scrapeless/actions/scraping-api/scraping-api.mjs | Added Google Trends support, introduced dynamic `additionalProps` for both Google Search and Google Trends, expanded parameter options, improved client usage, removed old prop logic, and updated the version. |
| components/scrapeless/actions/universal-scraping-api/universal-scraping-api.mjs | Made client initialization asynchronous in `run`, reordered a method, and incremented the version. |
| components/scrapeless/scrapeless.app.mjs | Changed `_scrapelessClient` from synchronous to asynchronous, switched to dynamic import, set environment variables, and updated error handling for a missing API key. |
| components/scrapeless/package.json | Updated the `@scrapeless-ai/sdk` dependency from `^1.4.0` to `1.6.0`. |
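
The "reusing client instances" and async-initialization changes summarized above follow a common memoized async getter pattern. The sketch below uses a stub factory to stay self-contained; the real code would dynamically import `@scrapeless-ai/sdk` inside the factory.

```javascript
// Minimal sketch of an async, memoized client getter: the factory runs
// once, and every later call reuses the same instance. The factory here
// is a stub, not the SDK's real API.
function makeClientGetter(createClient) {
  let clientPromise = null;
  return function getClient() {
    if (!clientPromise) {
      clientPromise = createClient(); // first call kicks off creation
    }
    return clientPromise; // all callers resolve to the same instance
  };
}

// Usage with a stub factory standing in for the dynamic SDK import.
let constructed = 0;
const getClient = makeClientGetter(async () => {
  constructed += 1;
  return { scrape: async (url) => ({ url }) };
});
```

Caching the promise (rather than the resolved value) also prevents two concurrent callers from constructing two clients.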

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant Action
    participant ScrapelessApp
    participant ScrapelessClient

    User->>Action: Trigger run()
    Action->>+ScrapelessApp: await _scrapelessClient()
    ScrapelessApp->>+ScrapelessClient: (Dynamic import, instantiate with API key)
    ScrapelessApp-->>-Action: ScrapelessClient instance
    Action->>ScrapelessClient: Call appropriate method (crawl, scrape, universal.scrape, etc.)
    ScrapelessClient-->>Action: Return results
    Action-->>User: Respond with summary and results
```
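
The flow in the diagram can be sketched with a stubbed app object. All method and field names here are assumptions for illustration, and the stub client stands in for the dynamically imported SDK.

```javascript
// Stubbed sketch of the diagram's flow: the action awaits the app's
// client getter, then calls a scraping method on the cached client.
const app = {
  _client: null,
  async _scrapelessClient() {
    if (!this._client) {
      // The real app would dynamically import the SDK and pass the API
      // key; a stub client keeps this sketch self-contained.
      this._client = {
        scraping: { scrape: async (params) => ({ ok: true, params }) },
      };
    }
    return this._client;
  },
};

async function run(params) {
  const client = await app._scrapelessClient(); // awaited, then reused
  return client.scraping.scrape(params);
}
```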

Poem

In the warren of code, a fresh breeze blew,
Async clients hopping, with versions anew.
Trends and searches, now both in the mix,
With props refactored for clever new tricks.
From package to action, the changes are clear—
A rabbit’s delight: Scrapeless runs with cheer!
🐇✨

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

components/scrapeless/actions/scraping-api/scraping-api.mjs

```
Oops! Something went wrong! :(

ESLint: 8.57.1

Error [ERR_MODULE_NOT_FOUND]: Cannot find package 'jsonc-eslint-parser' imported from /eslint.config.mjs
    at Object.getPackageJSONURL (node:internal/modules/package_json_reader:255:9)
    at packageResolve (node:internal/modules/esm/resolve:767:81)
    at moduleResolve (node:internal/modules/esm/resolve:853:18)
    at defaultResolve (node:internal/modules/esm/resolve:983:11)
    at ModuleLoader.defaultResolve (node:internal/modules/esm/loader:801:12)
    at #cachedDefaultResolve (node:internal/modules/esm/loader:725:25)
    at ModuleLoader.resolve (node:internal/modules/esm/loader:708:38)
    at ModuleLoader.getModuleJobForImport (node:internal/modules/esm/loader:309:38)
    at #link (node:internal/modules/esm/module_job:202:49)
```

The same `ERR_MODULE_NOT_FOUND` error (missing `jsonc-eslint-parser`) was also reported for:

  • components/scrapeless/actions/crawler/crawler.mjs
  • components/scrapeless/scrapeless.app.mjs
  • 1 other file

vercel bot commented Jul 7, 2025

@joy-chanboop is attempting to deploy a commit to the Pipedreamers Team on Vercel.

A member of the Team first needs to authorize it.

@adolfo-pd adolfo-pd added the User submitted Submitted by a user label Jul 7, 2025
@pipedream-component-development
Collaborator

Thank you so much for submitting this! We've added it to our backlog to review, and our team has been notified.

@pipedream-component-development
Collaborator

Thanks for submitting this PR! When we review PRs, we follow the Pipedream component guidelines. If you're not familiar, here's a quick checklist:

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🔭 Outside diff range comments (1)
components/scrapeless/scrapeless.app.mjs (1)

25-40: Remove Unnecessary Environment Variables

Lines setting SCRAPELESS_IS_ONLINE and SCRAPELESS_LOG_ROOT_DIR aren’t required by Scrapeless SDK v1.6.0 and should be removed unless you have a custom, documented use case:

  • File: components/scrapeless/scrapeless.app.mjs
  • Remove at lines 26–27:

```diff
-      process.env.SCRAPELESS_IS_ONLINE = "true";
-      process.env.SCRAPELESS_LOG_ROOT_DIR = "/tmp";
```

Keep only the environment variable(s) that are officially required (e.g., SCRAPELESS_API_KEY).

🧹 Nitpick comments (3)
components/scrapeless/package.json (1)

17-17: Consider using caret range for SDK dependency

The dependency version was changed from ^1.4.0 to a fixed 1.6.0. While this ensures compatibility with the async client changes, it prevents automatic patch updates that might include security fixes or bug fixes.

Consider using a caret range to allow patch updates:

```diff
-    "@scrapeless-ai/sdk": "1.6.0"
+    "@scrapeless-ai/sdk": "^1.6.0"
```
components/scrapeless/actions/scraping-api/scraping-api.mjs (2)

148-150: Clarify the tbs parameter description

The description "(to be searched) parameter" is unclear. Consider updating it to be more descriptive.

```diff
-        description: "(to be searched) parameter defines advanced search parameters that aren't possible in the regular query field. (e.g., advanced search for patents, dates, news, videos, images, apps, or text contents).",
+        description: "The tbs (to be searched) parameter defines advanced search parameters that aren't possible in the regular query field (e.g., advanced search for patents, dates, news, videos, images, apps, or text contents).",
```

179-182: Clarify the tbm parameter description

The description "(to be matched) parameter" could be clearer.

```diff
-        description: "(to be matched) parameter defines the type of search you want to do.\n\nIt can be set to:\n`(no tbm parameter)`: `regular Google Search`,\n`isch`: `Google Images API`,\n`lcl` - `Google Local API`\n`vid`: `Google Videos API`,\n`nws`: `Google News API`,\n`shop`: `Google Shopping API`,\n`pts`: `Google Patents API`,\nor any other Google service.",
+        description: "The tbm (to be matched) parameter defines the type of search you want to do.\n\nIt can be set to:\n`(no tbm parameter)`: `regular Google Search`,\n`isch`: `Google Images API`,\n`lcl`: `Google Local API`,\n`vid`: `Google Videos API`,\n`nws`: `Google News API`,\n`shop`: `Google Shopping API`,\n`pts`: `Google Patents API`,\nor any other Google service.",
```
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 827dd7b and 1c15185.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (5)
  • components/scrapeless/actions/crawler/crawler.mjs (3 hunks)
  • components/scrapeless/actions/scraping-api/scraping-api.mjs (5 hunks)
  • components/scrapeless/actions/universal-scraping-api/universal-scraping-api.mjs (2 hunks)
  • components/scrapeless/package.json (1 hunks)
  • components/scrapeless/scrapeless.app.mjs (1 hunks)
🔇 Additional comments (5)
components/scrapeless/actions/universal-scraping-api/universal-scraping-api.mjs (1)

63-88: Async client initialization correctly implemented

The changes properly handle the async client initialization and cache the client instance for reuse. This aligns with the updated _scrapelessClient method.

components/scrapeless/actions/crawler/crawler.mjs (2)

29-51: Clean refactoring of additionalProps method

The refactored method is more readable and maintains the same functionality. Good improvement!


68-86: Async client handling and improved formatting

Good implementation of async client caching and the backtick formatting for the URL in the summary message improves readability.

components/scrapeless/actions/scraping-api/scraping-api.mjs (2)

3-36: Well-structured imports and Google Trends support

Good addition of Google Trends support and proper use of constants for option values. The version increment appropriately reflects the new functionality.


282-368: Excellent implementation of async client and Google Trends support

The async client handling is consistent with other actions, and the Google Trends implementation is well-structured. The added logging will be helpful for debugging.

@joy-chanboop
Contributor Author

Hi @jcortes,
I've opened a new PR. I'm looking forward to your review; please let me know if you have any questions. Thanks a lot!

Collaborator

@jcortes jcortes left a comment


Hi @joy-chanboop, it's looking great. Just a few suggestions; other than that, it's Ready for QA!

@jcortes jcortes moved this to Changes Required in Component (Source and Action) Backlog Jul 7, 2025
@jcortes jcortes moved this from Changes Required to Ready for QA in Component (Source and Action) Backlog Jul 7, 2025
@vunguyenhung vunguyenhung moved this from Ready for QA to In QA in Component (Source and Action) Backlog Jul 8, 2025
@vunguyenhung vunguyenhung moved this from In QA to Ready for Release in Component (Source and Action) Backlog Jul 8, 2025
@vunguyenhung
Collaborator

Hi everyone, all test cases are passed! Ready for release!

Test report
https://vunguyenhung.notion.site/components-Scrapeless-update-actions-229bf548bb5e8162a2cbe13147a83779

@jcortes
Collaborator

jcortes commented Jul 8, 2025

Hi @joy-chanboop

Please increase the minor version of the following components as well:
components/scrapeless/actions/get-scrape-result/get-scrape-result.mjs
components/scrapeless/actions/submit-scrape-job/submit-scrape-job.mjs

Also change the version of components/scrapeless/package.json to 0.2.1

Labels
User submitted Submitted by a user
Projects
Status: Ready for Release