惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

Cloudbric
Cloudbric
E
Exploit-DB.com RSS Feed
SecWiki News
SecWiki News
Forbes - Security
Forbes - Security
N
News | PayPal Newsroom
S
Security @ Cisco Blogs
Schneier on Security
Schneier on Security
V
V2EX - 技术
S
Secure Thoughts
W
WeLiveSecurity
Google DeepMind News
Google DeepMind News
C
CERT Recently Published Vulnerability Notes
NISL@THU
NISL@THU
S
Securelist
S
Security Archives - TechRepublic
Know Your Adversary
Know Your Adversary
V
Vulnerabilities – Threatpost
Security Latest
Security Latest
Recent Commits to openclaw:main
Recent Commits to openclaw:main
G
GRAHAM CLULEY
H
Hacker News: Front Page
Microsoft Azure Blog
Microsoft Azure Blog
I
Intezer
Google Online Security Blog
Google Online Security Blog
美团技术团队
阮一峰的网络日志
阮一峰的网络日志
T
The Exploit Database - CXSecurity.com
钛媒体:引领未来商业与生活新知
钛媒体:引领未来商业与生活新知
Webroot Blog
Webroot Blog
Jina AI
Jina AI
Engineering at Meta
Engineering at Meta
P
Proofpoint News Feed
The Cloudflare Blog
I
InfoQ
L
LangChain Blog
U
Unit 42
P
Proofpoint News Feed
S
Schneier on Security
S
Security Affairs
Y
Y Combinator Blog
T
Tenable Blog
N
News and Events Feed by Topic
MyScale Blog
MyScale Blog
量子位
Google DeepMind News
Google DeepMind News
Cyberwarzone
Cyberwarzone
博客园 - 聂微东
D
Darknet – Hacking Tools, Hacker News & Cyber Security
GbyAI
GbyAI
AWS News Blog
AWS News Blog

博客园 - sundeepblue

const 堆和栈的区别 (转贴) The Linux Alternative Project Linux与windows 互访总结 Airplane: Cool!!! Lyx: The Document Processor! VC快捷键 几个一定要掌握的string 函数 **A Tree Manipulation Library What is Recursion? Iteration Vs. Recursion How to Rewrite Standard Recursion through a State Stack & Iteration Recursive 使用SIMD指令高度优化Matrix类(in C++) 发挥缓存的威力,提高代码效率,及如何实现16位浮点数 Inline Assembly in VC++ and gcc 写Makefile IQ, EQ, AQ & 阿Q 编写C/C++程序需要知道的基本问题
快速求平方根的算法
sundeepblue · 2007-07-31 · via 博客园 - sundeepblue
"

Any 3D engine draws it's power and speed from the mathematical models and implementations within, and trust John Carmack of ID software for using really good hacks. As it turns out, a very interesting hack is used in Quake III to calculate an inverse square root.  

Preface 

ID software has recently released the source code of Quake III engine with a GPL license. In this article we'll see Carmack work his black magic to calculate the square root of a floating point number blazingly fast.  

Carmack's Unusual Inverse Square Root  

A fast glance at the file game/code/q_math.c reveals many interesting performance hacks. The first thing that pops out is the use of the number 0x5f3759df in the function Q_rsqrt, which calculates the inverse square root of a float and why does this function actually work?  

Observe the original function from q_math.c 

float Q_rsqrt( float number )  

{  

  long i;  

  float x2, y; 

  const float threehalfs = 1.5F;  

  x2 = number * 0.5F;  

  y  = number;  

  i  = * ( long * ) &y;  // evil floating point bit level hacking  

  i  = 0x5f3759df - ( i >> 1 ); // what the fuck?  

  y  = * ( float * ) &i;  

  y  = y * ( threehalfs - ( x2 * y * y ) ); // 1st iteration  

  // y  = y * ( threehalfs - ( x2 * y * y ) ); // 2nd iteration, this can be removed  

  #ifndef Q3_VM  

  #ifdef __linux__  

    assert( !isnan(y) ); // bk010122 - FPE?  

  #endif 

  #endif 

  return y;}      

Not only does it work, on some CPU's Carmack's Q_rsqrt runs up to 4 times faster than (float)(1.0/sqrt(x), eventhough sqrt() is usually implemented using the FSQRT assembley instruction!  

In another file, code/common/cm_trace.c , a neater implementation of the same hack can be found. This time it is used for calculating the square root of a float - sqrt(x). Note that the only real difference is in the return value – instead of returning y, return number*y as the square root:  

/*  

================ 

 SquareRootFloat

================

*/

float SquareRootFloat(float number) { 

    long i; 

    float x, y; 

    const float f = 1.5F; 

    x = number * 0.5F; 

    y  = number; 

    i  = * ( long * ) &y; 

    i  = 0x5f3759df - ( i >> 1 ); 

    y  = * ( float * ) &i; 

     y  = y * ( f - ( x * y * y ) ); 

    y  = y * ( f - ( x * y * y ) ); 

     return number * y;

}   

Newton's Approximation of Roots

The code above implements the well known Newton Approximation of roots [3]. As in most other iterative approximation calculations, The Newton approximation is supposed to be ran in iterations; each iteration enhances the accuracy until enough iterations have been made for reaching the desired accuracy. 

 The general idea behind Newton's approximation is that whenever we have a guess y for the value of the square root of a number x, we can perform a simple manipulation to get a better guess (one closer to the actual square root) by averaging y with x/y. For example we can compute the square root of 2 as follows; Suppose the initial guess is 1:  

2/1 = 2 ;  (2 + 1) / 2 = 1.5  

2/1.5 = 1.3333; ( 1.5 + 1.3333 ) / 2 = 1.4167 

2/1.4167 = 1.4117;  ( 1.4167 + 1.4117 ) / 2 = 1.4142 

And so on... 

As mentioned before, Newton's approximation is a well known method of fast root calculations. However, Carmack picked an unusual first guess for the first iteration that enhances the approximation drastically and lowers the number of required iterations in Quake III's calculations to only one iteration!

A Witchcraft Number

The really interesting aspect of this function is the magic constant 0x5f3759df, used to calculate the initial guess, in :  

i  = 0x5f3759df - ( i >> 1 ); 

Hence, dividing the input by 2 and subtracting it from the magic constant. This constant works almost perfectly - Only one Newton approximation iteration is required for a low relative error of 10^-3. . As the code illustrates in the commented out second iteration, this approximation is good enough for the Quake III engine.

It turns out that the magic constant 0x5f3759df is a real mystery. In the article "Fast Inverse Square Root" [2] , mathematician Chris Lomont of Purdue University researches the meaning of this magic constant. Using several elaborate techniques, Lomont tries to mathematically calculate this constant himself. The results are very surprising – Lomont's well calculated mathematically optimal constant turns out to be slighltly different (0x5f37642f) , and in spite of being theoretically better, it yields worse results then the original constant used in the source code!! Indeed, John Carmack must have used genuine black magic to find this number.

Only by numerically searching for another better constant, Lomont found one, which is slightly better that the original. However, in practice the two constants are likely to yield similar results. Lomont proposes this function with the better constant:

float InvSqrt(float x)

{    float xhalf = 0.5f*x; 

  int i = *(int*)&x; // get bits for floating value 

  i = 0x5f375a86- (i>>1); // gives initial guess y0 

  x = *(float*)&i; // convert bits back to float 

   x = x*(1.5f-xhalf*x*x); // Newton step, repeating increases accuracy  

  return x;  

}     

References