How I Built a Bluesky Scraper with the AT Protocol API (and Published It on Apify)
Daniel Ainsw · 2026-05-28 · via DEV Community

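Earlier this year Bluesky reached 40 million users. Unlike Twitter, it runs on an open protocol, the AT Protocol, so public data is genuinely public and machine-readable by design. There is no $5,000-a-month enterprise API tier. There are no rate limits that take a lawyer to interpret. There is just a clean REST API that anyone can query.

I wanted to scrape it. This is how I built a production-grade Actor, and what I learned along the way.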

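Why Bluesky is easy to scrape (legally)

Most social media scrapers end up fighting Cloudflare, rotating proxies, and terms-of-service gray areas. Bluesky is different. The AT Protocol was designed for third-party clients and data access. Its public API at public.api.bsky.app accepts unauthenticated read requests. No fingerprinting, no CAPTCHAs, no DOM parsing.

The one catch: the search endpoint (app.bsky.feed.searchPosts) now requires authentication with a free app password. Everything else (author feeds, threads, profiles) works without credentials.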
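As a rough illustration of an unauthenticated read, the sketch below builds a request to app.bsky.feed.getAuthorFeed on the public host. The helper names (`authorFeedUrl`, `fetchAuthorFeedPage`) and the page size are illustrative assumptions, not code from the Actor.

```typescript
// Public host named above; it accepts unauthenticated reads.
const PUBLIC_API = "https://public.api.bsky.app";

// Hypothetical helper: build the XRPC URL for app.bsky.feed.getAuthorFeed.
function authorFeedUrl(actor: string, limit = 50, cursor?: string): string {
  const url = new URL("/xrpc/app.bsky.feed.getAuthorFeed", PUBLIC_API);
  url.searchParams.set("actor", actor);
  url.searchParams.set("limit", String(limit));
  if (cursor) url.searchParams.set("cursor", cursor);
  return url.toString();
}

// Hypothetical helper: fetch one page of a feed with no credentials attached.
async function fetchAuthorFeedPage(
  actor: string,
  cursor?: string,
): Promise<{ feed: unknown[]; cursor?: string }> {
  const res = await fetch(authorFeedUrl(actor, 50, cursor));
  if (!res.ok) throw new Error(`getAuthorFeed failed: HTTP ${res.status}`);
  return (await res.json()) as { feed: unknown[]; cursor?: string };
}
```

Passing the returned `cursor` back into the next call is how a caller would page through a longer feed.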

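The three modes I built

I wanted a single Actor that covers the main B2B use cases:

Search posts: search by keyword and hashtag, with date, language, and sort filters. Uses bsky.social/xrpc/app.bsky.feed.searchPosts with a token.

Author feed: fetch posts from one or more accounts. No authentication required. Useful for competitor monitoring or for researching an author's posting history.

Thread: fetch the full conversation around a post. The API returns a nested tree; I flatten it depth-first into an ordered list of posts.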
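The thread flattening can be sketched as a pre-order depth-first walk. This is a minimal version under assumptions: `ThreadNode` is a simplified stand-in for the thread response (real nodes carry more fields), and the `depth` field on the output is an illustrative choice rather than the Actor's confirmed schema.

```typescript
// Simplified, assumed shape of a thread node. Placeholder nodes
// (for example deleted or blocked posts) may have no `post`.
interface ThreadNode {
  post?: { uri: string; record: { text: string } };
  replies?: ThreadNode[];
}

interface FlatPost {
  uri: string;
  text: string;
  depth: number; // 0 = root post, 1 = direct reply, and so on
}

// Pre-order depth-first walk: each post is followed by its own replies.
function flattenThread(root: ThreadNode): FlatPost[] {
  const out: FlatPost[] = [];
  const stack: Array<{ node: ThreadNode; depth: number }> = [{ node: root, depth: 0 }];
  while (stack.length > 0) {
    const { node, depth } = stack.pop()!;
    if (!node.post) continue; // skip placeholder nodes
    out.push({ uri: node.post.uri, text: node.post.record.text, depth });
    const replies = node.replies ?? [];
    // Push in reverse so the first reply is visited first.
    for (let i = replies.length - 1; i >= 0; i--) {
      stack.push({ node: replies[i], depth: depth + 1 });
    }
  }
  return out;
}
```

An explicit stack avoids recursion limits on very deep threads; a recursive version would produce the same ordering.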

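The one gotcha: API routing

This one burned me. I was sending authenticated requests (with a JWT attached) to public.api.bsky.app. That host is proxied by Cloudflare and returns 403 if you send it an auth token; it serves unauthenticated traffic only.

The fix: authenticated calls go to bsky.social, and unauthenticated reads go to public.api.bsky.app. Authenticate against bsky.social, get the JWT, then use that JWT only for calls to bsky.social.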
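One way to make this routing rule hard to get wrong is to derive the host from whether a token is present. The sketch below assumes the standard com.atproto.server.createSession login with an app password; the helper names (`xrpcUrl`, `createSession`, `xrpcGet`) are illustrative, not the Actor's actual code.

```typescript
const PUBLIC_HOST = "https://public.api.bsky.app"; // unauthenticated reads only
const AUTH_HOST = "https://bsky.social"; // authenticated calls

// Hypothetical helper: a token implies the authenticated host, and vice versa.
function xrpcUrl(method: string, params: Record<string, string>, accessJwt?: string): string {
  const url = new URL(`/xrpc/${method}`, accessJwt ? AUTH_HOST : PUBLIC_HOST);
  for (const [key, value] of Object.entries(params)) url.searchParams.set(key, value);
  return url.toString();
}

// Exchange a handle and app password for an access JWT.
async function createSession(identifier: string, appPassword: string): Promise<string> {
  const res = await fetch(`${AUTH_HOST}/xrpc/com.atproto.server.createSession`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ identifier, password: appPassword }),
  });
  if (!res.ok) throw new Error(`createSession failed: HTTP ${res.status}`);
  const session = (await res.json()) as { accessJwt: string };
  return session.accessJwt;
}

// GET an XRPC method; the Authorization header is sent only to bsky.social.
async function xrpcGet(
  method: string,
  params: Record<string, string>,
  accessJwt?: string,
): Promise<unknown> {
  const headers: Record<string, string> = {};
  if (accessJwt) headers["Authorization"] = `Bearer ${accessJwt}`;
  const res = await fetch(xrpcUrl(method, params, accessJwt), { headers });
  if (!res.ok) throw new Error(`${method} failed: HTTP ${res.status}`);
  return res.json();
}
```

With this shape, a search call would pass the JWT and land on bsky.social, while an author feed or thread call would omit it and land on the public host.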

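The monorepo deployment problem

I'm building a portfolio of Apify Actors in a TypeScript monorepo with npm workspaces. The shared library (@apify-actors/shared) contains the PPE charging helpers and error classes. Locally, workspace resolution handles this cleanly. On Apify's build servers there is no monorepo, only the uploaded Actor folder.

The fix: copy the shared source into src/shared/ inside each Actor and use relative imports everywhere. tsup bundles everything into a single dist/main.js. The shared code lives in one canonical place in the repository, and each Actor bakes in its own copy at build time.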
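A per-Actor tsup configuration along these lines would produce that single bundle. This is a sketch under assumptions: the entry file name src/main.ts and the ESM output format are guesses, since the article only names the output file.

```typescript
// tsup.config.ts, placed in each Actor folder (assumed layout).
import { defineConfig } from "tsup";

export default defineConfig({
  entry: ["src/main.ts"], // assumed entry; it imports ./shared/... by relative path
  format: ["esm"], // emits dist/main.js when package.json sets "type": "module"
  outDir: "dist",
  clean: true, // clear dist/ before each build
});
```

Because the shared code is reached through relative imports, tsup treats it as local source and inlines it, so the uploaded folder needs no workspace resolution.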

Output format

Each post is returned as a flat JSON record:

{
  "url": "https://bsky.app/profile/user.bsky.social/post/3lhxxxxxxxxx",
  "text": "Post content here",
  "authorHandle": "user.bsky.social",
  "authorDisplayName": "User Name",
  "likeCount": 142,
  "repostCount": 28,
  "replyCount": 19,
  "images": [{ "thumb": "...", "fullsize": "...", "alt": "..." }],
  "externalEmbed": { "uri": "...", "title": "...", "description": "..." },
  "createdAt": "2025-11-15T10:30:00.000Z"
}
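A mapping from an API post object to a flat record of this shape might look like the following sketch. `PostView` is a simplified stand-in for the post view returned by the feed endpoints, and the fallbacks for missing values (empty string, zero, empty array, `null`) are assumptions rather than the Actor's documented behavior.

```typescript
interface PostRecord {
  url: string;
  text: string;
  authorHandle: string;
  authorDisplayName: string;
  likeCount: number;
  repostCount: number;
  replyCount: number;
  images: Array<{ thumb: string; fullsize: string; alt: string }>;
  externalEmbed: { uri: string; title: string; description: string } | null;
  createdAt: string;
}

// Simplified, assumed subset of the API's post view.
interface PostView {
  uri: string; // at://<did>/app.bsky.feed.post/<rkey>
  author: { handle: string; displayName?: string };
  record: { text: string; createdAt: string };
  likeCount?: number;
  repostCount?: number;
  replyCount?: number;
  embed?: {
    images?: Array<{ thumb: string; fullsize: string; alt: string }>;
    external?: { uri: string; title: string; description: string };
  };
}

function toRecord(post: PostView): PostRecord {
  // The record key is the last path segment of the AT URI.
  const rkey = post.uri.split("/").pop() ?? "";
  return {
    url: `https://bsky.app/profile/${post.author.handle}/post/${rkey}`,
    text: post.record.text,
    authorHandle: post.author.handle,
    authorDisplayName: post.author.displayName ?? "",
    likeCount: post.likeCount ?? 0,
    repostCount: post.repostCount ?? 0,
    replyCount: post.replyCount ?? 0,
    images: post.embed?.images ?? [],
    externalEmbed: post.embed?.external ?? null,
    createdAt: post.record.createdAt,
  };
}
```

Keeping every field at the top level is what makes the CSV and Excel exports line up as one row per post.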

You can export directly from Apify as JSON, CSV, or Excel, or connect Zapier or Make for no-code workflows.

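The Actor is live

If you want to use it without building anything: Bluesky Posts Scraper on the Apify Store

PPE pricing: $0.25 per run plus $0.003 per post ($3 per 1,000 posts). No subscription required.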
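To make the pricing concrete, here is a small estimator based only on the two rates quoted above. The function name is illustrative, and the sum is done in thousandths of a dollar to avoid floating-point drift.

```typescript
const RUN_FEE_MILLS = 250; // $0.25 per run, in thousandths of a dollar
const POST_FEE_MILLS = 3; // $0.003 per post

// Hypothetical helper: estimated charge in USD for a number of posts and runs.
function estimateCostUsd(posts: number, runs = 1): number {
  return (runs * RUN_FEE_MILLS + posts * POST_FEE_MILLS) / 1000;
}

console.log(estimateCostUsd(1000)); // one run, 1,000 posts: 3.25
```

So a single run that collects 1,000 posts comes to $3.25: the $3.00 in per-post charges plus the $0.25 run fee.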

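The AT Protocol makes Bluesky the cleanest data source available right now. If your use case involves sentiment monitoring, brand monitoring, or picking up lead signals from a fast-growing, tech-forward audience, it is worth adding to your toolkit.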