惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

TaoSecurity Blog
TaoSecurity Blog
Jina AI
Jina AI
雷峰网
雷峰网
月光博客
月光博客
The GitHub Blog
The GitHub Blog
WordPress大学
WordPress大学
B
Blog RSS Feed
美团技术团队
C
CXSECURITY Database RSS Feed - CXSecurity.com
小众软件
小众软件
Security Latest
Security Latest
Microsoft Azure Blog
Microsoft Azure Blog
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
C
Cybersecurity and Infrastructure Security Agency CISA
Last Week in AI
Last Week in AI
A
Arctic Wolf
Latest news
Latest news
Attack and Defense Labs
Attack and Defense Labs
I
Intezer
F
Fortinet All Blogs
罗磊的独立博客
MongoDB | Blog
MongoDB | Blog
Webroot Blog
Webroot Blog
S
Secure Thoughts
Help Net Security
Help Net Security
Apple Machine Learning Research
Apple Machine Learning Research
博客园_首页
V
Visual Studio Blog
P
Proofpoint News Feed
博客园 - 【当耐特】
P
Privacy International News Feed
V
Vulnerabilities – Threatpost
Stack Overflow Blog
Stack Overflow Blog
Know Your Adversary
Know Your Adversary
云风的 BLOG
云风的 BLOG
Hacker News: Ask HN
Hacker News: Ask HN
L
LINUX DO - 最新话题
H
Help Net Security
爱范儿
爱范儿
酷 壳 – CoolShell
酷 壳 – CoolShell
S
SegmentFault 最新的问题
Forbes - Security
Forbes - Security
T
Tailwind CSS Blog
量子位
奇客Solidot–传递最新科技情报
奇客Solidot–传递最新科技情报
T
Tenable Blog
Cloudbric
Cloudbric
N
News and Events Feed by Topic
cs.AI updates on arXiv.org
cs.AI updates on arXiv.org
Hugging Face - Blog
Hugging Face - Blog

all models are wrong

-> Going both ways in R <- finding homologous probes using biomaRt More Segment HMMs Python and Numpy integers Nasty Python Things MICROCOSMOGRAPHIA ACADEMICA Pebl The Pirate Bay Trial Latex, Beamer, Python, Beauty
Profiling in R
2010-05-13 · via all models are wrong

Giving a quick talk about vectorising code in R next month. Need to learn about profiling in R. This is what I have discovered in the last hour or so (when I should have been working):

Rprof()

This is the basic profiler in R. One turns it on:

Rprof("profile.out")

One runs some code … and one turns it off using, as always, R’s intuitive API

Rprof(NULL)

To see the results, you can use `summaryRprof(“profile.out”)` which parses the file you made using Rprof and presents it in a nice clear format, which looks something like:

> summaryRprof("method1.out")
$by.self
self.time self.pct total.time total.pct
"strsplit"          4.08     94.0       4.10      94.5
"+"                 0.12      2.8       0.12       2.8
"is.character"      0.12      2.8       0.12       2.8
"as.logical"        0.02      0.5       0.02       0.5
$by.total
total.time total.pct self.time self.pct
"strsplit"           4.10      94.5      4.08     94.0
"+"                  0.12       2.8      0.12      2.8
"is.character"       0.12       2.8      0.12      2.8
"as.logical"         0.02       0.5      0.02      0.5

There’s more detail here about `Rprof()` and `summaryRprof()`.

proftools

This package presents an alternative to summaryRprof, using graphviz to generate teh pretteh. Not sure how massively useful it would be in practice, unless your code got pretty complex! I’d be interested to know how complicated people’s functions actually got when using R.

To make the graph, make sure you have graph and Rgraphviz properly installed, then bash out

plotProfileCallGraph("profile.out")

profr

I kind of want everything Hadley Wickham does to be awesome, so when I see he’s made a profiling tool, I give it a lot more attention than it necessarily deserves. For example, you won’t find me hunting for documentation through anyone else’s github account. Nevertheless, this is what I find myself doing for `profr` (which in my head I will be pronouncing “proffer”). Note: it does turn out that I could have simply typed ?profr in my R terminal (does this always work)?

To use, simply run

out <- profr(code_to_profile)

then to inspect it use the normal `head`, `tail` and `summary`, noting that `out` is (pretty much) a `data.frame`:

> head(out)
f level time start  end  leaf source
8      example     1 0.44  0.00 0.44 FALSE  utils
9  <Anonymous>     2 0.04  0.00 0.04 FALSE   <NA>
10     library     2 0.02  0.04 0.06 FALSE   base
11      source     2 0.38  0.06 0.44 FALSE   base
12  prepare_Rd     3 0.04  0.00 0.04 FALSE   <NA>
13        %in%     3 0.02  0.04 0.06 FALSE   base
> summary(out)
f          level             time          start
match        : 7   Min.   : 1.000   Min.   :0.02   Min.   :0.0000
%in%         : 6   1st Qu.: 5.000   1st Qu.:0.02   1st Qu.:0.1200
is.factor    : 6   Median : 7.000   Median :0.02   Median :0.1600
eval.with.vis: 6   Mean   : 8.113   Mean   :0.04   Mean   :0.1685
inherits     : 5   3rd Qu.:10.500   3rd Qu.:0.02   3rd Qu.:0.2200
<Anonymous>  : 3   Max.   :19.000   Max.   :0.44   Max.   :0.4200
(Other)      :82
end            leaf            source
Min.   :0.0200   Mode :logical   Length:115
1st Qu.:0.1400   FALSE:99        Class :character
Median :0.2000   TRUE :16        Mode  :character
Mean   :0.2085   NA's :0
3rd Qu.:0.2500
Max.   :0.4400

Here I should say that `code_to_profile` was the standard `example(glm)`. One can also use `plot()` or `ggplot()` to plot the ‘call tree’. I’m not quite sure what a ‘call tree’ is, and to be honest the (adimittedly pretty) graph that `ggplot(out)` produces doesn’t enlighten me a great deal. That’s OK though – all I really wanted was the time spent.

A quick point about Hadley’s code, that I think highlights something wrong with R in general (and right with Python) is that in `profr`, from a user perspective I don’t really have to learn anything new to make it work. I apply a function (handily called `profr`) to some code and I get back a `data.frame` on which I can use very standard tools in order to investigate. I don’t have to write to a file (or generate a temp file to write to or anything), I don’t have to learn to stop and start hidden things, or that to stop profiling I pass `NULL` to a function that started profiling. I don’t have to use a whole new tool to parse what is basically a table. And to plot I can just use `plot` rather than having to dissapear down the (admittedly fun) hole that is graphviz. It feels like every new thing I come across in R means learning a whole API and set of concepts and object structure. (Admittedly ggplot is the total opposite of this, but hey I guess that was the point)