Checking Whether Chinese Text Is UTF-8 Encoded
nasdaqhe · 2010-12-23 · via 博客园 - nasdaqhe

Code

        // Requires: using System.Text; (for StringBuilder)
        #region Check whether URL parameters are UTF-8 encoded

        public static bool IsUTF8(string url)
        {
            byte[] buf = GetUrlCodingToBytes(url);
            return IsTextUTF8(buf);
        }

        private static bool IsTextUTF8(byte[] buf)
        {
            int cOctets = 0; // continuation octets still expected in this UTF-8 encoded character
            bool bAllAscii = true;

            for (int i = 0; i < buf.Length; i++)
            {
                byte b = buf[i];
                if ((b & 0x80) != 0) bAllAscii = false;

                if (cOctets == 0)
                {
                    if (b >= 0x80)
                    {
                        // Count the leading 1 bits of the lead byte.
                        do
                        {
                            b <<= 1;
                            cOctets++;
                        }
                        while ((b & 0x80) != 0);

                        cOctets--; // the lead byte itself is not a continuation byte

                        // Chinese characters are expected to take exactly 3 bytes,
                        // so the lead byte must announce 2 continuation bytes.
                        if (cOctets != 2)
                            return false;
                    }
                }
                else
                {
                    // Every continuation byte must look like 10xxxxxx.
                    if ((b & 0xC0) != 0x80)
                        return false;
                    cOctets--;
                }
            }

            if (cOctets > 0)
                return false; // input ended in the middle of a character
            if (bAllAscii)
                return false; // pure ASCII gives no evidence of UTF-8
            return true;
        }

        private static byte[] GetUrlCodingToBytes(string url)
        {
            // Collect every "%XX" triplet found in the URL.
            StringBuilder sb = new StringBuilder();
            int i = url.IndexOf('%');
            while (i >= 0)
            {
                if (url.Length < i + 3)
                {
                    break;
                }
                sb.Append(url.Substring(i, 3));
                url = url.Substring(i + 3);
                i = url.IndexOf('%');
            }

            string urlCoding = sb.ToString();
            if (string.IsNullOrEmpty(urlCoding))
                return new byte[0];

            urlCoding = urlCoding.Replace("%", string.Empty);

            // Convert each pair of hex digits into one byte.
            // int.Parse throws FormatException if a pair is not valid hex.
            byte[] result = new byte[urlCoding.Length / 2];
            for (int index = 0; index < result.Length; index++)
            {
                string s = urlCoding.Substring(index * 2, 2);
                result[index] = (byte)int.Parse(s, System.Globalization.NumberStyles.HexNumber);
            }
            return result;
        }

        #endregion Check whether URL parameters are UTF-8 encoded

UTF-8 encoding rules reference:

http://blog.csdn.net/sandyen/archive/2006/08/23/1108168.aspx
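As a quick illustration of the encoding rules referenced above, here is a minimal sketch of how the lead byte of a UTF-8 sequence announces its length. The class name `Utf8LengthDemo` and the method `SequenceLength` are hypothetical names chosen for this example, not part of the original code.

```csharp
using System;
using System.Text;

public static class Utf8LengthDemo
{
    // Returns the total length of a UTF-8 sequence based on its lead byte,
    // or 0 if the byte cannot start a sequence.
    public static int SequenceLength(byte lead)
    {
        if ((lead & 0x80) == 0x00) return 1; // 0xxxxxxx: ASCII
        if ((lead & 0xE0) == 0xC0) return 2; // 110xxxxx
        if ((lead & 0xF0) == 0xE0) return 3; // 1110xxxx: common Chinese characters
        if ((lead & 0xF8) == 0xF0) return 4; // 11110xxx
        return 0;                            // 10xxxxxx (continuation) or invalid
    }

    public static void Main()
    {
        // U+4E2D is the character "中".
        byte[] bytes = Encoding.UTF8.GetBytes("\u4E2D");
        Console.WriteLine(BitConverter.ToString(bytes)); // E4-B8-AD
        Console.WriteLine(SequenceLength(bytes[0]));     // 3
    }
}
```

The lead byte 0xE4 is 11100100 in binary, so it starts a 3-byte sequence, and the two bytes after it both match the 10xxxxxx continuation pattern.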

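I found the code above online, but in its original form it failed to recognize most inputs. I then modified it based on the rule that a Chinese character is always encoded as 3 bytes in UTF-8 (this holds for characters in the Basic Multilingual Plane; rare supplementary-plane characters take 4 bytes), and it now identifies most cases correctly.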
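For comparison, the same question can be approached by letting the framework's strict decoder validate the bytes. This is a sketch of an alternative technique, not the approach used in the code above. It assumes the percent-encoded bytes have already been extracted into a `byte[]`, and `StrictUtf8Check` and `IsValidUtf8` are hypothetical names.

```csharp
using System;
using System.Text;

public static class StrictUtf8Check
{
    // No BOM on output, and throw on invalid byte sequences when decoding.
    private static readonly UTF8Encoding Strict = new UTF8Encoding(false, true);

    public static bool IsValidUtf8(byte[] bytes)
    {
        try
        {
            Strict.GetString(bytes);
            return true;
        }
        catch (DecoderFallbackException)
        {
            // Raised by the strict decoder when the bytes are not well-formed UTF-8.
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(IsValidUtf8(new byte[] { 0xE4, 0xB8, 0xAD })); // "中" in UTF-8
        Console.WriteLine(IsValidUtf8(new byte[] { 0xD6, 0xD0 }));       // "中" in GBK
    }
}
```

Unlike the 3-byte rule, this check accepts 1-, 2-, and 4-byte sequences and treats pure ASCII as valid. It remains a heuristic, because some short GBK byte pairs also happen to be well-formed UTF-8.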