Installing and Using TensorFlow and kcws on RHEL7
Donal · 2016-11-30 · via 博客园 - Donal

0. Install dependencies

# Install the Python scientific computing libraries numpy, sklearn, and scipy with pip
su -
wget http://dl.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-8.noarch.rpm
yum install epel-release-7-8.noarch.rpm
yum install python2-pip.noarch
yum install gcc-c++.x86_64
pip install --upgrade pip
pip install numpy
# scikit-learn is the PyPI package that provides the sklearn module
pip install scikit-learn
pip install scipy
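
Before moving on, it can help to confirm that the tools installed in this step are actually on the PATH. The following is a minimal sketch and not part of the original steps: `missing_cmds` is a hypothetical helper name, and the list of commands it checks is an assumption based on the packages installed above.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): print every command
# name passed as an argument that cannot be found on the PATH.
missing_cmds() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Assumed tool list for this step; an empty result means all were found.
missing=$(missing_cmds wget gcc g++ pip)
if [ -n "$missing" ]; then
    echo "Missing commands:" $missing
else
    echo "All required commands found."
fi
```

Any name it prints points back to the `yum install` line that should have provided it.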

1. Install Bazel

Bazel is a Make-like build tool that Google tailored to the way it develops software internally. It was open-sourced in 2015.

cd ~
wget https://github.com/bazelbuild/bazel/archive/0.4.0.tar.gz
tar xzvf 0.4.0.tar.gz
cd bazel-0.4.0/
./compile.sh
sudo cp output/bazel /usr/bin/
which bazel
# Set up bash completion
bazel build //scripts:bazel-complete.bash
sudo cp bazel-bin/scripts/bazel-complete.bash /etc/bash_completion.d/
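
Since this guide pins Bazel 0.4.0, a quick check that the binary now on the PATH reports at least that version may save a failed build later. This is a sketch under assumptions: `bazel_label` and `version_ge` are hypothetical helper names, the parsing assumes `bazel version` prints a line beginning with `Build label:`, and the comparison relies on `sort -V` from GNU coreutils.

```shell
#!/bin/sh
# Hypothetical helpers (not from the original post).

# Read `bazel version` output on stdin and print the numeric part of the
# "Build label:" line, if there is one.
bazel_label() {
    sed -n 's/^Build label: \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Succeed when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Only query Bazel when it is installed.
if command -v bazel >/dev/null 2>&1; then
    installed=$(bazel version 2>/dev/null | bazel_label)
    if version_ge "${installed:-0}" 0.4.0; then
        echo "bazel $installed is new enough"
    else
        echo "bazel ${installed:-unknown} is older than 0.4.0" >&2
    fi
fi
```

A plain string comparison would rank 0.10.0 below 0.4.0, which is why the sketch uses a version-aware sort.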

2. Install TensorFlow

# Download the TensorFlow source code
git clone https://github.com/tensorflow/tensorflow
cd tensorflow/
./configure
# Create the pip package and install
bazel build -c opt //tensorflow/tools/pip_package:build_pip_package
bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
sudo pip install /tmp/tensorflow_pkg/tensorflow-*-cp27-none-linux_x86_64.whl
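
The name of the wheel written by `build_pip_package` depends on the TensorFlow version and the Python build, so the `cp27-none-linux_x86_64` pattern above may not match on every machine. As a hedged sketch, the hypothetical helper `find_tf_wheel` below (not part of the original post) looks for any `tensorflow-*.whl` in the output directory and reports clearly when there is none.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): print the path of the
# first tensorflow-*.whl file in the given directory, or fail with a message.
find_tf_wheel() {
    dir=$1
    for whl in "$dir"/tensorflow-*.whl; do
        # An unmatched glob stays literal, so confirm a real file exists.
        if [ -f "$whl" ]; then
            printf '%s\n' "$whl"
            return 0
        fi
    done
    echo "no tensorflow wheel found in $dir" >&2
    return 1
}

# Report what would be installed from the output directory used above.
if wheel=$(find_tf_wheel /tmp/tensorflow_pkg); then
    echo "found wheel: $wheel"
fi
```

The reported path can then be passed to `sudo pip install` in place of the hard-coded pattern.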

3. Install kcws, a deep-learning Chinese word segmenter with 97.5% accuracy (character embeddings + Bi-LSTM + CRF): https://github.com/koth/kcws

git clone https://github.com/koth/kcws.git
cd kcws/
# Download the corpus people2014.tar.gz
tar xzvf people2014.tar.gz # extracts to ~/kcws/2014
./configure
# Build the backend service
bazel build //kcws/cc:seg_backend_api
python kcws/train/process_anno_file.py ./2014 chars_for_w2v.txt
bazel build third_party/word2vec:word2vec
# Train on chars_for_w2v.txt with word2vec (note -binary 0) to get the character embeddings in vec.txt
./bazel-bin/third_party/word2vec/word2vec -train chars_for_w2v.txt -output kcws/models/vec.txt -size 50 -sample 1e-4 -negative 5 -hs 1 -binary 0 -iter 5
bazel build kcws/train:generate_training
./bazel-bin/kcws/train/generate_training kcws/models/vec.txt ./ all.txt
python kcws/train/filter_sentence.py all.txt
python kcws/train/train_cws_lstm.py --word2vec_path ./kcws/models/vec.txt --train_data_path ./train.txt --test_data_path test.txt --max_sentence_len 80 --learning_rate 0.001
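
Each command in this step reads a file written by an earlier one, so a step that fails quietly tends to surface later as a confusing error. The sketch below is one possible guard, not something from the kcws project: `require_files` is a hypothetical helper, and the file names are simply the ones passed to the training command above.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): succeed only if every
# named file exists and is non-empty; report each one that is not.
require_files() {
    status=0
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            status=1
        fi
    done
    return "$status"
}

# Files the training command above reads, as named in this post.
# Run from the kcws/ directory just before train_cws_lstm.py.
if require_files kcws/models/vec.txt train.txt test.txt; then
    echo "inputs look ready for training"
else
    echo "fix the steps that produce the files listed above" >&2
fi
```

The same helper can be called after any intermediate step, for example on chars_for_w2v.txt before running word2vec.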