A tool for automatically translating documents with Google Translate
cutepig · 2020-10-05 · via 博客园 - cutepig

Features:

  • Supports Word (.docx) files
  • Places the translation under each paragraph, next to the original

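# Version 1: translate one paragraph per request.
import sys

import docx  # python-docx
from googletrans import Translator

# Input file: first command-line argument, or the author's default path.
fname = sys.argv[1] if len(
    sys.argv) > 1 else r'F:\GoogleDriveSync3\jobrelated\The Fast Forward MBA in Project Management ( PDFDrive.com ).full.docx'


translator = Translator()
foname = fname + '-cn.docx'
doc = docx.Document(fname)      # source paragraphs, read only
docdes = docx.Document(fname)   # copy that receives the translations

N = len(doc.paragraphs)
for i in range(N):
    print(i / N)  # progress as a fraction of paragraphs processed
    subCont = doc.paragraphs[i].text
    if not subCont.strip():
        continue  # nothing to translate in an empty paragraph
    try:
        s = translator.translate(subCont, src='en', dest='zh-cn')
        # Append the translation to the same paragraph, below the original text.
        docdes.paragraphs[i].add_run('\n' + str(s.text) + '\n')
    except Exception as e:
        # Leave this paragraph untranslated and continue with the next one.
        print('except:', e)

docdes.save(foname)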
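
The loop above logs a failed request and moves on, which leaves that paragraph without a translation. One possible refinement is to retry a few times before giving up. The following is only a sketch: `with_retries` and its parameters are hypothetical names introduced here, not part of googletrans, and it assumes the failures are transient.

```python
import time


def with_retries(func, attempts=3, delay=1.0, sleep=time.sleep):
    """Call func() until it succeeds or `attempts` tries are used up.

    Hypothetical helper, not part of googletrans. The wait doubles after
    each failure; the last exception is re-raised if every try fails.
    """
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(delay * 2 ** attempt)
```

In the script it could wrap the call like this: `s = with_retries(lambda: translator.translate(subCont, src='en', dest='zh-cn'))`. The existing `except` branch would then only run after every attempt has failed.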

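# Version 2: batch several paragraphs per request, save checkpoints,
# and accept a directory of .docx files.
import os
import sys

import docx  # python-docx
from googletrans import Translator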

fname = sys.argv[1] if len(
    sys.argv) > 1 else r'D:\Users\cutep\Downloads\Throw-Away-the-First-90-Days.docx'

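def trans(fname):
    """Translate one .docx file and save the result as <fname>-cn.docx."""
    translator = Translator()
    foname = fname + '-cn.docx'
    doc = docx.Document(fname)      # source paragraphs, read only
    docdes = docx.Document(fname)   # copy that receives the translations

    N = len(doc.paragraphs)
    NextTarget = 0.1  # next progress fraction at which to save a checkpoint
    i = 0
    while i<N:
        percentage = 1.0*i/N
        if i%10==0: print(percentage)
        # Save a partial result roughly every 10% so a crash loses little work.
        if percentage>NextTarget:
            outputfile = '%s-%.2f-cn.docx'%(fname, NextTarget)
            print(outputfile)
            docdes.save(outputfile)
            NextTarget = NextTarget + 0.1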

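        # Batch consecutive paragraphs into one request, joined by a
        # separator line, to reduce the number of calls.
        spacer = '\n========================\n'
        spacer_short = '========================'
        subCont = doc.paragraphs[i].text
        j = i+1
        # Check the size before appending, so the batch stays under 4500
        # characters (a single longer paragraph is still sent on its own).
        while j<N and len(subCont) + len(spacer) + len(doc.paragraphs[j].text) < 4500:
            subCont = subCont + spacer + doc.paragraphs[j].text
            j = j+1
        print(i,j)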
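        if subCont.strip():
            s = translator.translate(subCont, src='en', dest='zh-cn')
            # Split the result back into one piece per paragraph. The separator
            # must come back unchanged for the counts to match.
            ss = s.text.split(spacer_short)
            assert len(ss)==j-i, 'expected %d pieces, got %d'%(j-i, len(ss))
            for k in range(j-i):
                # strip() drops the newlines left over from the separator lines.
                docdes.paragraphs[k+i].add_run('\n' + ss[k].strip() + '\n')
        i = j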

    docdes.save(foname)

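if __name__ == '__main__':
    if os.path.isfile(fname):
        trans(fname)
    else:
        # fname is a directory: translate every .docx in it, one process per file.
        from multiprocessing import Process

        ps=[]
        for filename in os.listdir(fname):
            name = filename.lower()
            # Skip outputs and checkpoints left by an earlier run.
            if name.endswith('.docx') and not name.endswith('-cn.docx'):
                p = Process(target=trans, args=(os.path.join(fname, filename),))
                p.start()
                ps.append(p)

        # Wait for all workers to finish.
        for p in ps:
            p.join()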
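
The batching step inside `trans` can also be pulled out into two small pure functions, which makes it testable without calling the translation service. This is a sketch under assumptions: `batch_paragraphs` and `split_batch` are hypothetical names introduced here, while the separator and the 4500-character budget are taken from the script above.

```python
SEPARATOR = '========================'
JOINER = '\n' + SEPARATOR + '\n'


def batch_paragraphs(texts, limit=4500):
    """Yield (start, end, joined) for runs of consecutive paragraphs.

    Each joined string stays under `limit` characters unless a single
    paragraph is already longer, in which case it forms a batch alone.
    """
    i = 0
    while i < len(texts):
        joined = texts[i]
        j = i + 1
        while j < len(texts) and len(joined) + len(JOINER) + len(texts[j]) < limit:
            joined += JOINER + texts[j]
            j += 1
        yield i, j, joined
        i = j


def split_batch(translated, expected):
    """Split a translated batch back into `expected` pieces.

    Returns None when the number of pieces is wrong, so the caller can
    fall back to translating those paragraphs one by one.
    """
    parts = [p.strip() for p in translated.split(SEPARATOR)]
    return parts if len(parts) == expected else None
```

Returning `None` instead of asserting would let the caller retry a batch paragraph by paragraph when the separator does not come back intact, so one bad batch need not abort the whole file.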