A crawler for column articles on a certain "Niu-Ke" site (name redacted) - Lan小站-嗯,不错!
Lan · 2023-10-08 · via Lan小站-嗯,不错! - Python

The code has been sanitized; replace the placeholder values with the real ones yourself.

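# @Time    : 2023/10/8 14:43
# @Author  : Lan
# @File    : niukespider.py
# @Software: PyCharm
import os
import time

import requests

# Seconds to wait for a response before giving up (an assumed value; tune as needed).
TIMEOUT = 10


def get_category(catalog='10klpm'):
    """Fetch the column's catalog (its list of articles) as parsed JSON."""
    # The host is a sanitized placeholder; replace it with the real site.
    url = f'https://www.lanol.cn.com/content/zhuanlan/index/catalog/{catalog}'
    response = requests.get(url, timeout=TIMEOUT)
    response.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error page
    return response.json()


# HTML page wrapper; the {{content}} placeholder is replaced with the article body.
c = """
<!doctype html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, user-scalable=no, initial-scale=1.0, maximum-scale=1.0, minimum-scale=1.0">
    <meta http-equiv="X-UA-Compatible" content="ie=edge">
    <title>Document</title>
</head>
<body>
{{content}}
</body>
</html>
"""


def get_content(catalog, entity):
    """Fetch a single article as parsed JSON."""
    # The `_` query parameter is the current time in milliseconds.
    url = f'https://www.lanol.cn.com/content/zhuanlan/index/detail/{catalog}/{entity}?_={int(time.time() * 1000)}'
    response = requests.get(url, headers={
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/117.0.0.0 Safari/537.36'
    }, timeout=TIMEOUT)
    response.raise_for_status()
    return response.json()


if __name__ == '__main__':
    catalog = 'Gj5x2m'
    # Create the output directory first so open() does not fail when it is missing.
    os.makedirs('./docs', exist_ok=True)

    for i in get_category(catalog)['data']['catalog']:
        content = get_content(catalog, i['uuid'])['data']
        # '/' cannot appear in a file name, so it is replaced in the title.
        with open('./docs/' + i['title'].replace('/', '-') + '.html', 'w', encoding='utf-8') as f:
            f.write(c.replace('{{content}}', content['content']))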
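
The loop above only replaces '/' in the article title before using it as a file name. Titles containing characters such as ':' or '?' may still fail on some file systems (Windows in particular). Below is a minimal sketch of a stricter cleanup. The helper name `safe_filename`, the replacement character, and the length limit are all assumptions for illustration and are not part of the original script.

```python
import re

# Hypothetical helper, not part of the original script.
# Matches characters that are invalid in file names on common file systems,
# plus ASCII control characters.
_INVALID_CHARS = re.compile(r'[\\/:*?"<>|\x00-\x1f]')


def safe_filename(title, replacement='-', max_length=100):
    """Return `title` with invalid file-name characters replaced.

    Leading and trailing spaces and dots are stripped, the result is
    truncated to `max_length` characters, and an empty result falls
    back to 'untitled'.
    """
    cleaned = _INVALID_CHARS.sub(replacement, title).strip(' .')
    return cleaned[:max_length] or 'untitled'
```

If this helper is adopted, the `i['title'].replace('/', '-')` expression in the loop could be swapped for `safe_filename(i['title'])`. For titles that contain only '/' as a problem character the result is the same as before.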