[components] Scrapeless - update actions #17493
- Updated @scrapeless-ai/sdk dependency to version 1.6.0 in package.json.
- Updated Scraping API action to support Google Search and Google Trends with new parameters for better data retrieval.
- Updated the way the Scrapeless client is obtained.
Walkthrough
The updates refactor and extend the Scrapeless action modules, notably making the Scrapeless client initialization asynchronous and updating its usage across actions. The scraping API action now supports Google Trends with extensive parameterization. Versions were incremented, input property logic was refactored, and the Scrapeless SDK dependency was updated to version 1.6.0.
Sequence Diagram(s)
sequenceDiagram
participant User
participant Action
participant ScrapelessApp
participant ScrapelessClient
User->>Action: Trigger run()
Action->>+ScrapelessApp: await _scrapelessClient()
ScrapelessApp->>+ScrapelessClient: (Dynamic import, instantiate with API key)
ScrapelessApp-->>-Action: ScrapelessClient instance
Action->>ScrapelessClient: Call appropriate method (crawl, scrape, universal.scrape, etc.)
ScrapelessClient-->>Action: Return results
Action-->>User: Respond with summary and results
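The flow in the diagram can be sketched as a lazily initialized, cached client. This is a minimal illustration and not the PR's actual code: `createApp`, `loadClient`, and `FakeClient` are hypothetical names, and `loadClient` stands in for the dynamic import of `@scrapeless-ai/sdk` that the walkthrough describes.

```javascript
// Hypothetical stand-in for the SDK client class. In the real component the
// class would come from a dynamic import of "@scrapeless-ai/sdk".
class FakeClient {
  constructor({ apiKey }) {
    this.apiKey = apiKey;
  }
}

// Simulates an asynchronous module load and counts calls to show caching.
let loadCount = 0;
async function loadClient() {
  loadCount += 1;
  return FakeClient;
}

// Builds an app object whose _scrapelessClient() resolves to a single,
// shared client instance that is created on first use.
function createApp(apiKey, loader = loadClient) {
  let clientPromise = null;
  return {
    _scrapelessClient() {
      if (!clientPromise) {
        clientPromise = loader().then((Client) => new Client({ apiKey }));
      }
      return clientPromise;
    },
  };
}

// An action's run() awaits the client before calling any method on it.
async function run(app) {
  const client = await app._scrapelessClient();
  return client.apiKey.length > 0;
}
```

Caching the promise rather than the resolved instance means concurrent callers share one load instead of each triggering its own.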
Warning
There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.
🔧 ESLint
components/scrapeless/actions/scraping-api/scraping-api.mjs
Oops! Something went wrong! :( ESLint: 8.57.1 Error [ERR_MODULE_NOT_FOUND]: Cannot find package 'jsonc-eslint-parser' imported from /eslint.config.mjs
components/scrapeless/actions/crawler/crawler.mjs
Oops! Something went wrong! :( ESLint: 8.57.1 Error [ERR_MODULE_NOT_FOUND]: Cannot find package 'jsonc-eslint-parser' imported from /eslint.config.mjs
components/scrapeless/scrapeless.app.mjs
Oops! Something went wrong! :( ESLint: 8.57.1 Error [ERR_MODULE_NOT_FOUND]: Cannot find package 'jsonc-eslint-parser' imported from /eslint.config.mjs
@joy-chanboop is attempting to deploy a commit to the Pipedreamers Team on Vercel. A member of the Team first needs to authorize it.
Thank you so much for submitting this! We've added it to our backlog to review, and our team has been notified.
Thanks for submitting this PR! When we review PRs, we follow the Pipedream component guidelines. If you're not familiar, here's a quick checklist:
Actionable comments posted: 0
🔭 Outside diff range comments (1)
components/scrapeless/scrapeless.app.mjs (1)
25-40: Remove unnecessary environment variables
Lines setting `SCRAPELESS_IS_ONLINE` and `SCRAPELESS_LOG_ROOT_DIR` aren't required by Scrapeless SDK v1.6.0 and should be removed unless you have a custom, documented use case:
- File: components/scrapeless/scrapeless.app.mjs
- Remove at lines 26–27:
  - process.env.SCRAPELESS_IS_ONLINE = "true";
  - process.env.SCRAPELESS_LOG_ROOT_DIR = "/tmp";
Keep only the environment variable(s) that are officially required (e.g., `SCRAPELESS_API_KEY`).
🧹 Nitpick comments (3)
components/scrapeless/package.json (1)
17-17: Consider using caret range for SDK dependency
The dependency version was changed from `^1.4.0` to a fixed `1.6.0`. While this ensures compatibility with the async client changes, it prevents automatic minor and patch updates that might include security fixes or bug fixes.
Consider using a caret range to allow compatible minor and patch updates:
- "@scrapeless-ai/sdk": "1.6.0"
+ "@scrapeless-ai/sdk": "^1.6.0"
components/scrapeless/actions/scraping-api/scraping-api.mjs (2)
148-150: Clarify the tbs parameter description
The description "(to be searched) parameter" is unclear. Consider updating it to be more descriptive.
- description: "(to be searched) parameter defines advanced search parameters that aren't possible in the regular query field. (e.g., advanced search for patents, dates, news, videos, images, apps, or text contents).",
+ description: "The tbs (to be searched) parameter defines advanced search parameters that aren't possible in the regular query field. (e.g., advanced search for patents, dates, news, videos, images, apps, or text contents).",
179-182: Clarify the tbm parameter description
The description "(to be matched) parameter" could be clearer.
- description: "(to be matched) parameter defines the type of search you want to do.\n\nIt can be set to:\n`(no tbm parameter)`: `regular Google Search`,\n`isch`: `Google Images API`,\n`lcl` - `Google Local API`\n`vid`: `Google Videos API`,\n`nws`: `Google News API`,\n`shop`: `Google Shopping API`,\n`pts`: `Google Patents API`,\nor any other Google service.",
+ description: "The tbm (to be matched) parameter defines the type of search you want to do.\n\nIt can be set to:\n`(no tbm parameter)`: `regular Google Search`,\n`isch`: `Google Images API`,\n`lcl`: `Google Local API`,\n`vid`: `Google Videos API`,\n`nws`: `Google News API`,\n`shop`: `Google Shopping API`,\n`pts`: `Google Patents API`,\nor any other Google service.",
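The `tbm` values listed in the description above can be captured in a small lookup table. This is only an illustrative sketch: `TBM_SEARCH_TYPES` and `describeTbm` are hypothetical names that do not appear in the PR, and the labels are copied from the description string.

```javascript
// Labels copied from the tbm description; the constant name is hypothetical.
const TBM_SEARCH_TYPES = Object.freeze({
  isch: "Google Images API",
  lcl: "Google Local API",
  vid: "Google Videos API",
  nws: "Google News API",
  shop: "Google Shopping API",
  pts: "Google Patents API",
});

// Returns a human-readable label for a tbm value.
// A missing or empty tbm means a regular Google Search.
function describeTbm(tbm) {
  if (tbm === undefined || tbm === null || tbm === "") {
    return "regular Google Search";
  }
  // hasOwnProperty avoids matching inherited keys such as "toString".
  const known = Object.prototype.hasOwnProperty.call(TBM_SEARCH_TYPES, tbm);
  return known ? TBM_SEARCH_TYPES[tbm] : `other Google service (${tbm})`;
}
```

For example, `describeTbm("nws")` yields the Google News label, while an omitted value falls back to a regular search.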
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (5)
components/scrapeless/actions/crawler/crawler.mjs (3 hunks)
components/scrapeless/actions/scraping-api/scraping-api.mjs (5 hunks)
components/scrapeless/actions/universal-scraping-api/universal-scraping-api.mjs (2 hunks)
components/scrapeless/package.json (1 hunks)
components/scrapeless/scrapeless.app.mjs (1 hunks)
🧰 Additional context used
🧠 Learnings (3)
components/scrapeless/package.json (1)
Learnt from: jcortes
PR: PipedreamHQ/pipedream#14935
File: components/sailpoint/package.json:15-18
Timestamp: 2024-12-12T19:23:09.039Z
Learning: When developing Pipedream components, do not add built-in Node.js modules like `fs` to `package.json` dependencies, as they are native modules provided by the Node.js runtime.
components/scrapeless/actions/scraping-api/scraping-api.mjs (1)
Learnt from: js07
PR: PipedreamHQ/pipedream#17375
File: components/zerobounce/actions/get-validation-results-file/get-validation-results-file.mjs:23-27
Timestamp: 2025-07-01T17:07:48.193Z
Learning: "dir" props in Pipedream components are hidden in the component form and not user-facing, so they don't require labels or descriptions for user clarity.
components/scrapeless/actions/universal-scraping-api/universal-scraping-api.mjs (2)
Learnt from: GTFalcao
PR: PipedreamHQ/pipedream#12731
File: components/hackerone/actions/get-members/get-members.mjs:3-28
Timestamp: 2024-07-04T18:11:59.822Z
Learning: When exporting a summary message in the `run` method of an action, ensure the message is correctly formatted. For example, in the `hackerone-get-members` action, the correct format is `Successfully retrieved ${response.data.length} members`.
Learnt from: GTFalcao
PR: PipedreamHQ/pipedream#12731
File: components/hackerone/actions/get-members/get-members.mjs:3-28
Timestamp: 2024-10-08T15:33:38.240Z
Learning: When exporting a summary message in the `run` method of an action, ensure the message is correctly formatted. For example, in the `hackerone-get-members` action, the correct format is `Successfully retrieved ${response.data.length} members`.
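The summary convention from this learning can be sketched as follows. `$.export("$summary", ...)` is how a Pipedream action sets its summary; `createFakeDollar` and the sample response are hypothetical stand-ins so the snippet can run outside Pipedream.

```javascript
// Hypothetical stand-in for the `$` object Pipedream passes to run().
function createFakeDollar() {
  const exported = {};
  return {
    exported,
    export(name, value) {
      exported[name] = value;
    },
  };
}

// Sketch of an action's run() method that exports a formatted summary.
async function run({ $ }) {
  // Sample data; a real action would call the service's API here.
  const response = { data: [{ id: 1 }, { id: 2 }, { id: 3 }] };
  $.export("$summary", `Successfully retrieved ${response.data.length} members`);
  return response;
}
```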
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Lint Code Base
- GitHub Check: Publish TypeScript components
- GitHub Check: Verify TypeScript components
- GitHub Check: pnpm publish
🔇 Additional comments (5)
components/scrapeless/actions/universal-scraping-api/universal-scraping-api.mjs (1)
63-88: Async client initialization correctly implemented
The changes properly handle the async client initialization and cache the client instance for reuse. This aligns with the updated `_scrapelessClient` method.
components/scrapeless/actions/crawler/crawler.mjs (2)
29-51: Clean refactoring of additionalProps method
The refactored method is more readable and maintains the same functionality. Good improvement!
68-86: Async client handling and improved formatting
Good implementation of async client caching and the backtick formatting for the URL in the summary message improves readability.
components/scrapeless/actions/scraping-api/scraping-api.mjs (2)
3-36: Well-structured imports and Google Trends support
Good addition of Google Trends support and proper use of constants for option values. The version increment appropriately reflects the new functionality.
282-368: Excellent implementation of async client and Google Trends support
The async client handling is consistent with other actions, and the Google Trends implementation is well-structured. The added logging will be helpful for debugging.
Hi @jcortes , |
Hi @joy-chanboop, it is looking great. Just a few suggestions; other than that, it is ready for QA!
Hi everyone, all test cases passed! Ready for release! Test report
Please increase the minor version of the following components as well. Also change the version of components/scrapeless/package.json to 0.2.1.