A Python tool for splitting a text file
Mojies · 2024-01-10 · via 博客园 - Mojies

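Background

Have you ever run into a text file that is too large to open? For example, a stress test that ran for several days and produced a log file of more than ten gigabytes.

The script below splits one large file into several smaller files.

Assume the script is saved as: splitfile.py

Usage: python splitfile.py log 20

This splits the file log into 20 files named log.0, log.1, log.2, ..., log.19, written to the current directory.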
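
After a split it can be worth confirming that no bytes were lost. The following is a minimal sketch, assuming the parts sit in the current directory and follow the log.0, log.1, ... naming described above. The helper names sha256_of_files and parts_match_original are made up for this illustration and are not part of splitfile.py.

```python
import hashlib
import os


def sha256_of_files(paths, block_size=1024 * 1024):
    """Return the SHA-256 hex digest of the given files read back to back."""
    digest = hashlib.sha256()
    for path in paths:
        with open(path, "rb") as fd:
            while True:
                block = fd.read(block_size)
                if not block:
                    break
                digest.update(block)
    return digest.hexdigest()


def parts_match_original(original, parts):
    """Check that <name>.0 ... <name>.<parts-1> concatenate back to the original.

    Hypothetical helper: it assumes the parts are in the current directory.
    """
    base = os.path.basename(original)
    part_paths = ["./%s.%d" % (base, i) for i in range(parts)]
    return sha256_of_files(part_paths) == sha256_of_files([original])
```

For the usage shown above, parts_match_original("log", 20) should return True if the 20 parts together contain exactly the bytes of log. Both the original and the parts are read in 1 MiB blocks, so memory use stays small even for a very large log.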

Source code

import os
import sys

CHUNK = 1024 * 1024  # read at most 1 MiB at a time

if len(sys.argv) < 3:
    print("Please specify the file and how many parts to split it into")
    print("python %s log 10" % sys.argv[0])
    sys.exit(-1)

logfile = sys.argv[1]
cutnbs = int(sys.argv[2])

if not os.path.isfile(logfile):
    print("%s is not a file" % logfile)
    sys.exit(-1)

if cutnbs < 1:
    print("the number of parts must be at least 1")
    sys.exit(-1)

f_basename = os.path.basename(logfile)
f_size = os.path.getsize(logfile)

# Ceiling division, so that cutnbs parts always cover the whole file.
sub_f_size = (f_size + cutnbs - 1) // cutnbs
print("%s: %d bytes, up to %d bytes per part" % (f_basename, f_size, sub_f_size))

# The split is done by byte count, so a line may be cut at a part boundary.
with open(logfile, 'rb') as infd:
    for i in range(cutnbs):
        # Parts are written to the current directory as <basename>.<index>.
        o_filename = "./%s.%d" % (f_basename, i)
        print("start dump: %s" % o_filename)

        size_count = 0
        # 'wb' truncates an existing part file, so no explicit remove is needed.
        with open(o_filename, 'wb') as o_fd:
            while size_count < sub_f_size:
                # Never read past the end of the current part.
                block = infd.read(min(CHUNK, sub_f_size - size_count))
                if not block:
                    break
                o_fd.write(block)
                size_count += len(block)
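
Because the script splits by byte count, a log line can end up divided between two neighbouring parts. If whole lines matter, one possible variant is to cut only at line ends. This is a sketch under stated assumptions, not the author's method: the function name split_on_lines and the out_dir parameter are hypothetical, lines are assumed to end with "\n", and each single line is assumed to fit in memory.

```python
import os


def split_on_lines(path, parts, out_dir="."):
    """Split `path` into `parts` files without cutting a line in half.

    Hypothetical helper, not part of splitfile.py. Trailing parts may be
    empty when earlier parts have already consumed the whole file.
    Returns the list of part file names that were written.
    """
    if parts < 1:
        raise ValueError("parts must be at least 1")
    base = os.path.basename(path)
    total = os.path.getsize(path)
    target = (total + parts - 1) // parts  # ceiling division
    written = []
    with open(path, "rb") as src:
        for i in range(parts):
            out_name = os.path.join(out_dir, "%s.%d" % (base, i))
            size = 0
            with open(out_name, "wb") as dst:
                # Keep adding whole lines until the part reaches the target.
                while size < target:
                    # readline() on a binary file returns one line,
                    # including its trailing b"\n" if there is one.
                    line = src.readline()
                    if not line:
                        break
                    dst.write(line)
                    size += len(line)
            written.append(out_name)
    return written
```

Each part stops at the first line end at or beyond the target size, so a part can exceed the target by up to one line. Every part except the last is at least as large as the target unless the input runs out, which is why the final part always receives whatever remains.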