
Notes, repost
rosanshao · 2017-12-25 · via 博客园 - rosanshao

http://docs.couchdb.org/en/latest/ddocs/views/collation.html

6.2.2. Views Collation

Basics

View functions specify a key and a value to be returned for each row. CouchDB collates the view rows by this key. In the following example, the LastName property serves as the key, thus the result will be sorted by LastName:

function(doc) {
    if (doc.Type == "customer") {
        emit(doc.LastName, {FirstName: doc.FirstName, Address: doc.Address});
    }
}

CouchDB allows arbitrary JSON structures to be used as keys. You can use JSON arrays as keys for fine-grained control over sorting and grouping.

Examples

The following clever trick returns both customer and order documents. The key is composed of a customer _id and a sorting token. Because the key for order documents begins with the _id of a customer document, all the orders will be sorted by customer. Because the sorting token for customers is lower than the token for orders, the customer document will come before the associated orders. The values 0 and 1 for the sorting token are arbitrary.

function(doc) {
    if (doc.Type == "customer") {
        emit([doc._id, 0], null);
    } else if (doc.Type == "order") {
        emit([doc.customer_id, 1], null);
    }
}

To list a specific customer with _id XYZ, and all of that customer’s orders, limit the startkey and endkey ranges to cover only documents for that customer’s _id:

startkey=["XYZ"]&endkey=["XYZ", {}]
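As a rough in-memory illustration of the trick above (not CouchDB itself), the sketch below runs the same map function over a few made-up documents with a stand-in emit(), sorts the rows, and picks out one customer's range. The sample documents, the rows array, and the simplified comparator are assumptions for illustration only; real CouchDB applies its own collation rules.

```javascript
// Stand-in for CouchDB's emit(): collects rows in memory (illustration only).
const rows = [];
let currentId = null;
function emit(key, value) {
  rows.push({ id: currentId, key, value });
}

// The map function from the text, unchanged.
function map(doc) {
  if (doc.Type == "customer") {
    emit([doc._id, 0], null);
  } else if (doc.Type == "order") {
    emit([doc.customer_id, 1], null);
  }
}

// Hypothetical sample documents, deliberately out of order.
const docs = [
  { _id: "o2", Type: "order", customer_id: "XYZ" },
  { _id: "ABC", Type: "customer" },
  { _id: "o1", Type: "order", customer_id: "XYZ" },
  { _id: "XYZ", Type: "customer" },
  { _id: "o3", Type: "order", customer_id: "ABC" },
];
for (const doc of docs) {
  currentId = doc._id;
  map(doc);
}

// Simplified ordering for [string, number] keys: customer id, then token,
// then document id as a tie-breaker between equal keys.
const cmpStr = (x, y) => (x < y ? -1 : x > y ? 1 : 0);
rows.sort((a, b) =>
  cmpStr(a.key[0], b.key[0]) || a.key[1] - b.key[1] || cmpStr(a.id, b.id));

// Equivalent of startkey=["XYZ"]&endkey=["XYZ", {}] for these two-element keys.
const forXYZ = rows.filter(r => r.key[0] === "XYZ").map(r => r.id);
console.log(forXYZ);
```

The customer row comes first because its token 0 sorts below the orders' token 1.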

It is not recommended to emit the document itself in the view. Instead, to include the bodies of the documents when requesting the view, request the view with ?include_docs=true.

Sorting by Dates

It may be convenient to store date attributes in a human-readable format (i.e. as a string) but still sort by date. This can be done by converting the date to a number in the emit() function. For example, given a document with a created_at attribute of 'Wed Jul 23 16:29:21 +0100 2013', the following emit function would sort by date:

emit(Date.parse(doc.created_at), null);

Alternatively, if you use a date format that sorts lexicographically, such as "2013/06/09 13:52:11 +0000", you can just

emit(doc.created_at, null);

and avoid the conversion. As a bonus, this date format is compatible with the JavaScript date parser, so you can use new Date(doc.created_at) in your client side JavaScript to make date sorting easy in the browser.
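To see why such a format needs no conversion, the sketch below sorts a few made-up timestamps once as plain strings and once by their numeric value, then checks that both orders agree. The sample values and the toMillis helper are hypothetical, and the check assumes every timestamp carries the same +0000 offset; with mixed offsets, string order and chronological order can differ.

```javascript
// Hypothetical timestamps in the "YYYY/MM/DD hh:mm:ss +0000" format from the text.
const stamps = [
  "2013/06/09 13:52:11 +0000",
  "2012/12/31 23:59:59 +0000",
  "2013/06/09 08:00:00 +0000",
  "2013/01/01 00:00:00 +0000",
];

// Convert to epoch milliseconds by hand (assumes the +0000 offset, i.e. UTC).
function toMillis(s) {
  const m = /^(\d{4})\/(\d{2})\/(\d{2}) (\d{2}):(\d{2}):(\d{2}) \+0000$/.exec(s);
  if (!m) throw new Error("unexpected format: " + s);
  const [y, mo, d, h, mi, sec] = m.slice(1).map(Number);
  return Date.UTC(y, mo - 1, d, h, mi, sec);
}

// Sort once as plain strings and once chronologically, then compare.
const byString = [...stamps].sort();
const byNumber = [...stamps].sort((a, b) => toMillis(a) - toMillis(b));
const sameOrder = JSON.stringify(byString) === JSON.stringify(byNumber);
console.log(sameOrder);
```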

String Ranges

If you need start and end keys that encompass every string with a given prefix, it is better to use a high-value Unicode character than a 'ZZZZ' suffix.

That is, rather than:

startkey="abc"&endkey="abcZZZZZZZZZ"

You should use:

startkey="abc"&endkey="abc\ufff0"
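When building such a query from client code, the keys are JSON values and need URL-encoding before they go into the query string. A minimal sketch, assuming a hypothetical helper named prefixRange (not part of any CouchDB client library):

```javascript
// Build startkey/endkey query parameters covering every string key that
// starts with `prefix`. Each key is JSON-encoded, then URL-encoded.
function prefixRange(prefix) {
  const startkey = encodeURIComponent(JSON.stringify(prefix));
  const endkey = encodeURIComponent(JSON.stringify(prefix + "\ufff0"));
  return `startkey=${startkey}&endkey=${endkey}`;
}

const query = prefixRange("abc");
console.log(query);
```

The high sentinel character is sent as its UTF-8 percent-encoding, so the resulting string can be appended to a view URL as is.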

Collation Specification

This section is based on the view_collation function in view_collation.js:

// special values sort before all other types
null
false
true

// then numbers
1
2
3.0
4

// then text, case sensitive
"a"
"A"
"aa"
"b"
"B"
"ba"
"bb"

// then arrays. compared element by element until different.
// Longer arrays sort after their prefixes
["a"]
["b"]
["b","c"]
["b","c", "a"]
["b","d"]
["b","d", "e"]

// then object, compares each key value in the list until different.
// larger objects sort after their subset objects.
{a:1}
{a:2}
{b:1}
{b:2}
{b:2, a:1} // Member order does matter for collation.
           // CouchDB preserves member order
           // but doesn't require that clients will.
           // this test might fail if used with a js engine
           // that doesn't preserve order
{b:2, c:2}
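The ordering listed above can be approximated in a few lines. The sketch below is an illustration under stated assumptions, not CouchDB's implementation: typeRank and collate are hypothetical names, and Intl.Collator("en") stands in for CouchDB's ICU string comparison.

```javascript
// Assumption: Intl.Collator("en") stands in for CouchDB's ICU string comparison.
const collator = new Intl.Collator("en");

// Rank of each JSON type in the documented order.
function typeRank(v) {
  if (v === null) return 0;
  if (v === false) return 1;
  if (v === true) return 2;
  if (typeof v === "number") return 3;
  if (typeof v === "string") return 4;
  if (Array.isArray(v)) return 5;
  return 6; // plain object
}

function collate(a, b) {
  const ra = typeRank(a), rb = typeRank(b);
  if (ra !== rb) return ra - rb;
  if (ra < 3) return 0;                        // null, false, true
  if (ra === 3) return a - b;                  // numbers
  if (ra === 4) return collator.compare(a, b); // strings
  // Arrays compare element by element; objects as a list of [key, value] pairs.
  const xs = ra === 5 ? a : Object.entries(a);
  const ys = ra === 5 ? b : Object.entries(b);
  for (let i = 0; i < Math.min(xs.length, ys.length); i++) {
    const c = collate(xs[i], ys[i]);
    if (c !== 0) return c;
  }
  return xs.length - ys.length; // a prefix sorts before its extensions
}

// The documented sequence, reversed and then sorted back into place.
const expected = [
  null, false, true, 1, 2, 3.0, 4,
  "a", "A", "aa", "b", "B", "ba", "bb",
  ["a"], ["b"], ["b", "c"], ["b", "c", "a"], ["b", "d"], ["b", "d", "e"],
  { a: 1 }, { a: 2 }, { b: 1 }, { b: 2 }, { b: 2, a: 1 }, { b: 2, c: 2 },
];
const sorted = [...expected].reverse().sort(collate);
console.log(JSON.stringify(sorted) === JSON.stringify(expected));
```

Comparing objects pair by pair relies on the JavaScript engine preserving member order, which is the same caveat the comments in the listing raise.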

Comparison of strings is done using ICU which implements the Unicode Collation Algorithm, giving a dictionary sorting of keys. This can give surprising results if you were expecting ASCII ordering. Note that:

  • All symbols sort before numbers and letters (even the “high” symbols like tilde, 0x7e)
  • Differing sequences of letters are compared without regard to case, so a < aa but also A < aa and a < AA
  • Identical sequences of letters are compared with regard to case, with lowercase before uppercase, so a < A

You can demonstrate the collation sequence for 7-bit ASCII characters like this:

require 'rubygems'
require 'restclient'
require 'json'

DB="http://127.0.0.1:5984/collator"

RestClient.delete DB rescue nil
RestClient.put "#{DB}",""

(32..126).each do |c|
    RestClient.put "#{DB}/#{c.to_s(16)}", {"x"=>c.chr}.to_json
end

RestClient.put "#{DB}/_design/test", <<EOS
{
    "views":{
        "one":{
            "map":"function (doc) { emit(doc.x,null); }"
        }
    }
}
EOS

puts RestClient.get("#{DB}/_design/test/_view/one")

This shows the collation sequence to be:

` ^ _ - , ; : ! ? . ' " ( ) [ ] { } @ * / \ & # % + < = > | ~ $ 0 1 2 3 4 5 6 7 8 9
a A b B c C d D e E f F g G h H i I j J k K l L m M n N o O p P q Q r R s S t T u U v V w W x X y Y z Z

Key ranges

Take special care when querying key ranges. For example, the query:

startkey="Abc"&endkey="AbcZZZZ"

will match “ABC” and “abc1”, but not “abc”. This is because UCA sorts as:

abc < Abc < ABC < abc1 < AbcZZZZ

For most applications, to avoid problems you should lowercase the startkey:

startkey="abc"&endkey="abcZZZZZZZZ"

will match all keys starting with [aA][bB][cC].
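The effect of the two ranges can be tried outside CouchDB. The sketch below assumes that Intl.Collator("en") orders these keys the way CouchDB's ICU collation does; inRange and the sample keys are hypothetical.

```javascript
// Assumption: Intl.Collator("en") approximates the ICU/UCA ordering CouchDB uses.
const uca = new Intl.Collator("en");

// Inclusive range filter, like a view query with startkey and endkey.
function inRange(keys, startkey, endkey) {
  return keys
    .filter(k => uca.compare(k, startkey) >= 0 && uca.compare(k, endkey) <= 0)
    .sort(uca.compare);
}

const keys = ["abc1", "ABC", "abd", "abc", "Abc"]; // hypothetical view keys

const mixedCase = inRange(keys, "Abc", "AbcZZZZ");
const lowercased = inRange(keys, "abc", "abcZZZZZZZZ");
console.log(mixedCase);
console.log(lowercased);
```

Under that assumption, the mixed-case range leaves out "abc" while the lowercased range includes it, and neither range picks up "abd".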

Complex keys

The query startkey=["foo"]&endkey=["foo",{}] will match most array keys with “foo” in the first element, such as ["foo","bar"] and ["foo",["bar","baz"]]. However, it will not match ["foo",{"an":"object"}], because a non-empty object sorts after the empty object {}.

_all_docs

The _all_docs view is a special case because it uses ASCII collation for doc ids, not UCA:

startkey="_design/"&endkey="_design/ZZZZZZZZ"

will not find _design/abc because ‘Z’ comes before ‘a’ in the ASCII sequence. A better solution is:

startkey="_design/"&endkey="_design0"
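The difference is easy to reproduce with plain JavaScript string comparison, which orders by UTF-16 code unit and therefore matches ASCII order for ASCII ids. The ids and the idRange helper below are made up for illustration; this is a stand-in for the _all_docs ordering, not a CouchDB call.

```javascript
// Inclusive range over document ids using code-unit (ASCII-like) ordering.
function idRange(ids, startkey, endkey) {
  return ids.filter(id => id >= startkey && id <= endkey).sort();
}

// Hypothetical document ids.
const ids = ["_design/abc", "customer:1", "_design/ABC", "_design/zzz"];

// 'Z' (0x5A) sorts before 'a' (0x61), so lowercase design docs fall outside.
const withZ = idRange(ids, "_design/", "_design/ZZZZZZZZ");

// '0' (0x30) is the character right after '/' (0x2F), so this end key sits
// just past every id that starts with "_design/".
const withZero = idRange(ids, "_design/", "_design0");

console.log(withZ);
console.log(withZero);
```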

Raw collation

To squeeze a little more performance out of views, you can specify "options":{"collation":"raw"} within the view definition for native Erlang collation, especially if you don’t require UCA. This gives a different collation sequence:

1
false
null
true
{"a":"a"}
["a"]
"a"

Beware that {} is no longer a suitable “high” key sentinel value. Use a string like "\ufff0" instead.