feat(pconline): add 太平洋电脑网 (PConline) adapter — search / list / info / param / price#2018
yixin-1024 wants to merge 3 commits
…/ param
PConline (product.pconline.com.cn) is one of China's oldest digital-product
catalogues. All three commands are PUBLIC fetches of GBK-encoded + gzip'd SSR
pages (Node's fetch inflates the gzip, then TextDecoder('gbk') decodes; regex
parsed — no login, cookies or signature):
- list product.pconline.com.cn/<category>/ → 产品大全 browse: name + 参考价 +
detail URL (scoped to #JlistItems, deduped). The discovery entry point.
- info <cat>/<brand>/<id>.html → overview: 名称 / 分类 / 品牌 / 重点参数
(highlight value from each span's title attr, else the text after :)
- param <cat>/<brand>/<id>_detail.html → full spec sheet (area-detailparams
th/td pairs; poptxt glossary <div class="tips"> popups and the CPU/GPU
天梯图 affordance stripped)
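The fetch-and-decode step described at the top of this commit message could look roughly like the sketch below. It is a minimal illustration, not the adapter's actual code: `decodeGbk` and `fetchGbkPage` are hypothetical names, and the sketch assumes a Node build with full ICU (the default for official builds) so that the `gbk` label is available.

```typescript
// Sketch only: decodeGbk and fetchGbkPage are hypothetical names,
// not functions taken from the adapter.

/** Decode raw GBK bytes into a JavaScript string. */
function decodeGbk(bytes: Uint8Array): string {
  // 'gbk' is a WHATWG Encoding label; Node needs full ICU to support it.
  return new TextDecoder('gbk').decode(bytes);
}

/** Fetch a page and decode its body as GBK. */
async function fetchGbkPage(url: string): Promise<string> {
  // fetch already inflates a gzip'd response body, so only the
  // charset conversion is left to do by hand.
  const res = await fetch(url);
  if (!res.ok) {
    throw new Error(`HTTP ${res.status} for ${url}`);
  }
  return decodeGbk(new Uint8Array(await res.arrayBuffer()));
}
```

Reading the body with `res.text()` would decode it as UTF-8 and garble the Chinese text, which is why the sketch goes through `arrayBuffer()` first.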
A product is addressed by its <category>/<brand>/<id> URL (from list) — the bare
id 404s on PConline, so info/param take the URL or the triple.
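As an illustration of the addressing rule above, here is a minimal sketch of normalising either input form into the triple. `ProductRef`, `parseProductRef`, `infoUrl` and `paramUrl` are hypothetical names, and the `https://` scheme is an assumption (the commit message gives only the host and path).

```typescript
// Hypothetical helpers for illustration; not the adapter's real API.
interface ProductRef {
  category: string;
  brand: string;
  id: string;
}

/** Accepts a product URL or a "<category>/<brand>/<id>" triple. */
function parseProductRef(input: string): ProductRef {
  const path = input
    .trim()
    // Drop an optional scheme and the product host.
    .replace(/^(?:https?:)?(?:\/\/)?product\.pconline\.com\.cn\//i, '')
    .replace(/^\/+/, '')
    // Drop ".html" and the "_detail" suffix used by the spec-sheet page.
    .replace(/(?:_detail)?\.html$/, '');
  const match = /^([a-z0-9_]+)\/([a-z0-9_]+)\/(\d+)$/i.exec(path);
  if (!match) {
    // A bare numeric id lands here: it cannot be resolved to a page.
    throw new Error(`expected <category>/<brand>/<id> or a product URL, got "${input}"`);
  }
  const [, category, brand, id] = match;
  return { category, brand, id };
}

/** Overview page for `info` (scheme assumed). */
function infoUrl(ref: ProductRef): string {
  return `https://product.pconline.com.cn/${ref.category}/${ref.brand}/${ref.id}.html`;
}

/** Spec-sheet page for `param` (scheme assumed). */
function paramUrl(ref: ProductRef): string {
  return `https://product.pconline.com.cn/${ref.category}/${ref.brand}/${ref.id}_detail.html`;
}
```

For example, `parseProductRef('mobile/apple/2718819')` and `parseProductRef('https://product.pconline.com.cn/mobile/apple/2718819_detail.html')` yield the same triple, while `parseProductRef('2718819')` throws.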
Deliberately NOT shipped (investigated, not login-gated — login wouldn't help):
keyword search (ks.pconline.com.cn sits behind a JS/anti-bot challenge → 503 to
a plain fetch), merchant 报价 (legacy shop_list API retired → 404, static page
is promo ads only), and 点评 (mtp-list API returns empty shells). list covers
discovery instead; the adapter never ships empty/unreliable data.
12 vitest cases against frozen fixtures (list = 手机大全; info/param =
iPhone17 Pro Max, mobile/apple/2718819); tsc clean; silent-column-drop &
typed-error-lint new=0; doc-coverage 172/172. Verified live across mobile /
notebook / cpu categories.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Re-checked PConline against a logged-in browser session: login itself gates
nothing new, but driving the real browser surfaced the *current* price API,
which the first cut had missed (it had probed the retired shop_list_new2015.jsp
that 404s). The live endpoint is a plain public JSON service:
ppc.pconline.com.cn/productPrice/list?pId=<id>&skuId=0&days=30&mallType=0
It's on a separate host from the rate-limited 产品库, needs no login/cookies/
signature, and is keyed by the numeric id alone — so `price` also accepts a
bare id (not just the cat/brand/id URL that info/param require).
- price → 历史最低价 (cheapest tracked price + date) + each mall's latest
tracked price (京东 jdList / 苏宁 snList); empty malls are omitted.
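A rough sketch of how the `price` request and the per-mall summary might be put together. Only the query parameters and the `jdList` / `snList` field names come from this commit message. The `https://` scheme, the entry fields (`date`, `price`), and the names `priceUrl` and `latestPerMall` are assumptions made for illustration, and the real response shape may differ.

```typescript
// Assumed shape: only jdList / snList are named in the commit message;
// the fields inside each entry are hypothetical.
interface PricePoint {
  date: string; // assumed ISO-style, e.g. "2025-01-05"
  price: number;
}

interface PriceResponse {
  jdList?: PricePoint[];
  snList?: PricePoint[];
}

/** Build the price-history URL for a numeric product id (scheme assumed). */
function priceUrl(id: string): string {
  const query = new URLSearchParams({ pId: id, skuId: '0', days: '30', mallType: '0' });
  return `https://ppc.pconline.com.cn/productPrice/list?${query.toString()}`;
}

/** Pick each mall's most recent tracked price, omitting malls with no data. */
function latestPerMall(data: PriceResponse): Record<string, PricePoint> {
  const malls: Array<[string, PricePoint[] | undefined]> = [
    ['京东', data.jdList],
    ['苏宁', data.snList],
  ];
  const out: Record<string, PricePoint> = {};
  for (const [mall, points] of malls) {
    if (!points || points.length === 0) continue; // empty malls are omitted
    // ISO-style date strings compare chronologically as plain strings.
    out[mall] = points.reduce((a, b) => (b.date > a.date ? b : a));
  }
  return out;
}
```

Because the endpoint is keyed by the numeric id alone, `priceUrl` needs no category or brand, unlike the `info` / `param` pages.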
search / 点评 remain out (verified still NOT login-gated): 快搜 sits behind a
slide-captcha anti-bot challenge (renders in a browser, 503 to a plain fetch),
and the 点评 API returns empty shells. `list` stays the discovery entry.
16 vitest cases (price parser against a frozen vivo S60 fixture with 京东 data);
tsc clean; silent-column-drop & typed-error-lint new=0; doc-coverage all green.
Verified live (vivo S60 ¥3599 + 京东; iPhone17 Pro Max ¥9999).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Added a 4th command: `price`. Re-checked PConline against a logged-in browser session. Login itself unlocks nothing new, but driving the real browser surfaced the current price API that the first cut missed (it had probed the retired `shop_list_new2015.jsp`, which returns 404).

The live endpoint is `ppc.pconline.com.cn/productPrice/list?pId=<id>&skuId=0&days=30&mallType=0`: a separate host from the rate-limited 产品库, no login/cookies/signature, keyed by the numeric id alone (so `price` also accepts a bare id).

16 vitest cases · tsc clean · silent-column-drop & typed-error-lint new=0 · doc-coverage green. Verified live (vivo S60 ¥3599 + 京东; iPhone17 Pro Max ¥9999).
…n Chrome)

Keyword search on PConline (ks.pconline.com.cn 快搜) sits behind a slide-captcha anti-bot challenge: a plain fetch gets HTTP 503, but the page renders fine in a real browser. So unlike the 4 PUBLIC commands, `search` is a browser command (Strategy.COOKIE, browser: true). It drives the browser bridge, lets the SSR result list render, then parses each `.item-wrap` card out of the DOM: `.item-name[title]` (clean name), the `.item-pic` detail link (→ category/brand/id) and the ¥ price. The parsed url feeds info/param; the id feeds price.

The anti-bot is NOT a login gate, but a logged-in Chrome is the reliable way to have the challenge already cleared. `list` still covers discovery with a plain fetch for anyone who doesn't want a browser.

PConline now has 5 commands: search (browser) + list/info/param/price (public).

19 vitest cases (parseSearchRows against a frozen 快搜 fixture: iPhone 15 → 苹果 iPhone 15 / ¥5999 etc.); tsc clean; silent-column-drop & typed-error-lint new=0; doc-coverage green. Verified live through the bridge: search "iPhone 15" → iPhone 15 / 15 Plus / Pro Max / Pro with real prices + 产品库 URLs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
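The card-to-row step described in this commit message might be sketched as follows, assuming the three raw strings have already been read from each `.item-wrap` card in the browser. `RawCard`, `SearchRow` and `toSearchRow` are hypothetical names (the PR's own parser is `parseSearchRows`, whose signature is not shown here), and the `https://` scheme is an assumption.

```typescript
// Hypothetical types for illustration; not the adapter's real interfaces.
interface RawCard {
  title: string;     // from .item-name[title]
  href: string;      // from the .item-pic detail link
  priceText: string; // visible price text, e.g. "¥5999"
}

interface SearchRow {
  name: string;
  url: string; // feeds info / param
  id: string;  // feeds price
  price: number | null;
}

/** Normalise one result card; returns null when the link is not a product page. */
function toSearchRow(card: RawCard): SearchRow | null {
  const link = /product\.pconline\.com\.cn\/([a-z0-9_]+)\/([a-z0-9_]+)\/(\d+)\.html/i.exec(card.href);
  if (!link) return null;
  const [, category, brand, id] = link;
  // Accept both the half-width and full-width yen sign.
  const priceMatch = /[¥￥]\s*(\d+(?:\.\d+)?)/.exec(card.priceText);
  return {
    name: card.title.trim(),
    url: `https://product.pconline.com.cn/${category}/${brand}/${id}.html`,
    id,
    price: priceMatch ? Number(priceMatch[1]) : null,
  };
}
```

Keeping this step a pure function over plain strings means it can be tested against a frozen fixture without a browser; only the DOM read needs the bridge.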
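Added a 5th command: `search`. Keyword search on 快搜 (`ks.pconline.com.cn`) sits behind a slide-captcha anti-bot challenge, so it runs as a browser command. The anti-bot is not a login gate, but a logged-in Chrome reliably has the challenge already cleared. PConline now has 5 commands: `search` (browser) + `list` / `info` / `param` / `price` (public).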
Adds a no-login (PUBLIC) adapter for 太平洋电脑网 (PConline, product.pconline.com.cn), one of China's oldest digital-product catalogues (产品库: phones, laptops, cameras, CPUs, GPUs, tablets, watches…).

All pages are GBK-encoded + gzip'd SSR; Node's `fetch` inflates the gzip, then `TextDecoder('gbk')` decodes and the HTML is regex-parsed. No login, cookies or signature (same GBK family as the ZOL / autohome adapters).

Commands

- `pconline list <category>` → product.pconline.com.cn/<cat>/
- `pconline info <product>` → <cat>/<brand>/<id>.html
- `pconline param <product>` → <cat>/<brand>/<id>_detail.html

A product is addressed by its `<category>/<brand>/<id>` URL (printed by `list`). The bare numeric id 404s on PConline, so `info` / `param` take the URL or the triple.

Deliberately not shipped (investigated; not login-gated, so login wouldn't help)

- Keyword search: ks.pconline.com.cn 快搜 sits behind a JS/anti-bot challenge (HTTP 503 to a plain fetch). `list` covers discovery instead.
- Merchant 报价: the legacy `shop_list_new2015.jsp` API is retired (404) and the static price page carries only promo ads.
- 点评: the `mtp-list.jsp` API returns empty shells.

The adapter never ships empty/unreliable data.

Quality gates (run locally)

- … mobile/apple/2718819)
- `tsc --noEmit` clean

🤖 Generated with Claude Code