Some Interesting Papers from NIPS 2012
Shicai Yang · 2012-12-06 · via 博客园 - Shicai Yang

W. Koolen, D. Adamskiy, M. Warmuth
Putting Bayes to sleep
Some signals behave roughly like a jump Markov process: the distribution of the data changes over time, so that there are segments with distribution A, then a switch to B, then perhaps back to A, and so on. A prediction procedure that “mixes past posteriors” works well in this setting, but it was not clear why. This paper provides a Bayesian interpretation of the predictor as mixing in a “sleeping experts” setting.

J. Duchi, M. Jordan, M. Wainwright, A. Wibisono
Finite Sample Convergence Rates of Zero-Order Stochastic Optimization Methods
This paper looked at stochastic gradient descent when function evaluations are cheap but gradient evaluations are expensive. The idea is to compute an unbiased approximation to the gradient by evaluating the function at \theta_t and at \theta_t + \mathrm{noise}, and then forming a finite-difference approximation to the gradient from the two values. Some of the attendees claimed this is similar to an approach proposed by Nesterov, but the distinction was unclear to me.
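The two-evaluation idea can be sketched as follows. This is a minimal illustration, not the estimator analyzed in the paper: the Gaussian perturbation, the smoothing radius `u`, the constant step size `lr`, and the toy objective `quadratic` are all assumptions made here for the example.

```python
import random


def two_point_gradient(f, theta, u=1e-3, rng=random):
    """Estimate the gradient of f at theta from two function evaluations."""
    # Random Gaussian direction z; u * z plays the role of the "noise".
    z = [rng.gauss(0.0, 1.0) for _ in theta]
    perturbed = [t + u * zi for t, zi in zip(theta, z)]
    # Finite difference along z, mapped back onto the direction z.
    slope = (f(perturbed) - f(theta)) / u
    return [slope * zi for zi in z]


def zero_order_sgd(f, theta0, steps=2000, lr=0.05, u=1e-3, seed=0):
    """Gradient descent that only ever calls f, never its gradient."""
    rng = random.Random(seed)
    theta = list(theta0)
    for _ in range(steps):
        g = two_point_gradient(f, theta, u, rng)
        theta = [t - lr * gi for t, gi in zip(theta, g)]
    return theta


def quadratic(theta):
    # Toy objective with minimizer at (1, -2); chosen only for illustration.
    return 0.5 * ((theta[0] - 1.0) ** 2 + (theta[1] + 2.0) ** 2)


theta_hat = zero_order_sgd(quadratic, [0.0, 0.0])
```

For this quadratic the estimate averages to the true gradient, so `theta_hat` should end up close to the minimizer even though no gradient is ever evaluated.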

J. Lloyd, D. Roy, P. Orbanz, Z. Ghahramani
Random function priors for exchangeable graphs and arrays
This paper looked at Bayesian modeling for structures like undirected graphs that represent interactions, such as protein-protein interactions. Infinite random graphs whose distributions are invariant under permutations of the vertex set can be associated with a structure called a graphon. Here they put a prior on graphons, namely a Gaussian process prior, and then do inference on real graphs, for example to estimate the kernel function of the process.
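To make the graphon picture concrete, here is a small sketch of the generative side only: each vertex gets a uniform latent value, and each edge is an independent coin flip whose bias is the graphon evaluated at the two latent values. The function `example_graphon` is a hypothetical fixed choice for illustration; in the paper the function is random (drawn via a Gaussian process prior) and has to be inferred from data, which this sketch does not attempt.

```python
import math
import random


def sample_graph_from_graphon(W, n, seed=0):
    """Sample an undirected graph on n vertices from a graphon W: [0,1]^2 -> [0,1]."""
    rng = random.Random(seed)
    # One latent uniform variable per vertex.
    u = [rng.random() for _ in range(n)]
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Edge (i, j) is present with probability W(u_i, u_j).
            if rng.random() < W(u[i], u[j]):
                adj[i][j] = adj[j][i] = 1
    return u, adj


def example_graphon(x, y):
    # Hypothetical symmetric function: vertices with nearby latent
    # values are more likely to be connected.
    return math.exp(-3.0 * abs(x - y))


latent, adjacency = sample_graph_from_graphon(example_graphon, 8)
```

Because the latent values are i.i.d. and the edge rule depends only on them, relabeling the vertices does not change the distribution of the graph, which is the exchangeability property the paragraph refers to.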

N. Le Roux, M. Schmidt, F. Bach
A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets
This paper was selected for oral presentation. The idea is that in gradient descent it is expensive to evaluate gradients if your objective function looks like \sum_{i=1}^{n} f(\theta, x_i), where the x_i are your data points and n is huge, because every iteration requires n gradient evaluations. On the other hand, stochastic gradient descent can be slow because at each iteration it picks a single i and takes a gradient step on f(\theta_t, x_i) alone. Here, at step t they pick a random point j and evaluate its gradient, but then take a gradient step over all n points. For points i \ne j they reuse the gradient from the last time i was picked. Let T_i(t) be the last time i was picked before time t, with T_j(t) = t. Then the step direction is \sum_{i = 1}^{n} \nabla f(\theta_{T_i(t)}, x_i). This works surprisingly well.
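The update described above can be sketched in a few lines. This is an illustrative version under stated assumptions, not the authors' implementation: stored gradients start at zero, the step is divided by n, and the step size `lr`, the toy `data`, and the helper `grad_i` are hypothetical choices made for the example.

```python
import random


def sag(grad_i, n, theta0, lr=0.05, steps=3000, seed=0):
    """Stochastic average gradient sketch.

    grad_i(theta, i) returns the gradient of the i-th term f(theta, x_i).
    """
    rng = random.Random(seed)
    theta = list(theta0)
    d = len(theta)
    stored = [[0.0] * d for _ in range(n)]  # last gradient computed for each i
    total = [0.0] * d                       # running sum of the stored gradients
    for _ in range(steps):
        j = rng.randrange(n)                # pick one random data point
        g = grad_i(theta, j)                # only this gradient is recomputed
        for k in range(d):
            # Swap the stale gradient of point j for the fresh one.
            total[k] += g[k] - stored[j][k]
        stored[j] = g
        # Step along the sum of one fresh and n - 1 stale gradients.
        theta = [t - lr * s / n for t, s in zip(theta, total)]
    return theta


# Toy problem (illustration only): f(theta, x_i) = 0.5 * (theta - x_i)^2,
# whose sum is minimized at the mean of the data.
data = [0.5, 1.5, 2.0, 4.0]


def grad_i(theta, i):
    return [theta[0] - data[i]]


theta_hat = sag(grad_i, len(data), [0.0])
```

Each iteration costs one gradient evaluation, as in stochastic gradient descent, but the step uses information from all n points. The price is the memory needed to keep one stored gradient per data point.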

Stephane Mallat
Classification with Deep Invariant Scattering Networks
This was an invited talk. Mallat was trying to explain why deep networks seem to learn well (it all seems a bit like black magic), but his explanation felt a bit heuristic to me in the end. His first main point was that wavelets are good at capturing geometric structure like translation and rotation, and appear to have favorable properties with respect to “distortions” in the signal. The notion of distortion is a little vague, but the idea is that if two signals (say images) are similar but one is slightly distorted, they should map to representations which are close to each other. The mathematics behind his analysis framework was group theoretic: he wants to estimate the group of actions which manipulate images. In a sense, this is a control-theory view of the problem (at least it seemed so to me). The second point that I understood was that sparsity in representation has a big role to play in building efficient and layered representations. I think I’d have to see the talk again to understand it better. In the end I wasn’t sure that I understood why deep networks are good, but I did understand some more interesting things about wavelet representations, which is cool.

From: http://ergodicity.net/2012/12/05/nips-2012-day-two/