SSE sqrt is still noticeably faster than the C math library's sqrtf
Len3d · 2017-11-28 · via 博客园 - Len3d
#include <stdio.h>
#include <stdlib.h>     // rand, RAND_MAX
#include <xmmintrin.h>  // SSE intrinsics
#define NOMINMAX
#include <windows.h>
#include <math.h>       // sqrtf
#include <time.h>       // clock

// Scalar square root via the SSE sqrtss instruction.
// __forceinline is MSVC-specific.
__forceinline float fast_sqrt(float x)
{
    return _mm_cvtss_f32(_mm_sqrt_ss(_mm_set_ss(x)));
}

int main(int argc, char *argv[])
{
    const int N = 100000000;
    float *buf = new float[N];
    // Fill the buffer with pseudo-random values in [0, 1000].
    for (int i = 0; i < N; ++i)
    {
        buf[i] = 1000.0f * (float)rand() / (float)RAND_MAX;
    }

    float sum;
    clock_t start_time;  // clock() returns clock_t, not int

    sum = 0.0f;
    start_time = clock();
    for (int i = 0; i < N; ++i)
    {
        sum += sqrtf(buf[i]);
    }
    printf("sum = %f   in clock %ld\n", sum, (long)(clock() - start_time));



    sum = 0.0f;
    start_time = clock();
    for (int i = 0; i < N; ++i)
    {
        sum += fast_sqrt(buf[i]);
    }
    printf("sum (fast) = %f   in clock %ld\n", sum, (long)(clock() - start_time));



    delete[] buf;
    return 0;
}

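Test results:

sum = 536870912.000000 in clock 391
sum (fast) = 536870912.000000 in clock 281

The SSE version needs about 28% fewer clock ticks (281 vs. 391). Assuming the Microsoft C runtime, where CLOCKS_PER_SEC is 1000, one tick is one millisecond.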