--- 1/draft-ietf-regext-rdap-sorting-and-paging-03.txt  2019-08-02 04:13:51.175247771 -0700
+++ 2/draft-ietf-regext-rdap-sorting-and-paging-04.txt  2019-08-02 04:13:51.223248981 -0700
@@ -1,21 +1,21 @@
 Registration Protocols Extensions                            M. Loffredo
 Internet-Draft                                             M. Martinelli
 Intended status: Standards Track                     IIT-CNR/Registro.it
-Expires: December 14, 2019                                 S. Hollenbeck
+Expires: February 2, 2020                                  S. Hollenbeck
                                                            Verisign Labs
-                                                           June 12, 2019
+                                                          August 1, 2019

   Registration Data Access Protocol (RDAP) Query Parameters for Result
                            Sorting and Paging
-              draft-ietf-regext-rdap-sorting-and-paging-03
+              draft-ietf-regext-rdap-sorting-and-paging-04

 Abstract

    The Registration Data Access Protocol (RDAP) does not include core
    functionality for clients to provide sorting and paging parameters
    for control of large result sets. This omission can lead to
    unpredictable server processing of queries and client processing of
    responses. This unpredictability can be greatly reduced if clients
    can provide servers with their preferences for managing large
    responses. This document describes RDAP query extensions that allow
@@ -30,21 +30,21 @@
    Internet-Drafts are working documents of the Internet Engineering
    Task Force (IETF). Note that other groups may also distribute
    working documents as Internet-Drafts. The list of current Internet-
    Drafts is at https://datatracker.ietf.org/drafts/current/.

    Internet-Drafts are draft documents valid for a maximum of six months
    and may be updated, replaced, or obsoleted by other documents at any
    time. It is inappropriate to use Internet-Drafts as reference
    material or to cite them other than as "work in progress."

-   This Internet-Draft will expire on December 14, 2019.
+   This Internet-Draft will expire on February 2, 2020.

 Copyright Notice

    Copyright (c) 2019 IETF Trust and the persons identified as the
    document authors. All rights reserved.
    This document is subject to BCP 78 and the IETF Trust's Legal
    Provisions Relating to IETF Documents
    (https://trustee.ietf.org/license-info) in effect on the date of
    publication of this document. Please review these documents
@@ -297,28 +297,32 @@
    the following:

    sort = "sort=" sortItem *( "," sortItem )
    sortItem = property-ref [":" ( "a" / "d" ) ]
    property-ref = ALPHA *( ALPHA / DIGIT / "_" )

    "a" means that the ascending sort MUST be applied, "d" means that the
    descending sort MUST be applied. If the sort direction is absent, an
    ascending sort MUST be applied (Figure 3).

-   https://example.com/rdap/domains?name=*nr.com&sort=ldhName
+   https://example.com/rdap/domains?name=*nr.com&sort=name

    https://example.com/rdap/domains?name=*nr.com&sort=registrationDate:d

-   https://example.com/rdap/domains?name=*nr.com&sort=lockedDate,ldhName
+   https://example.com/rdap/domains?name=*nr.com&sort=lockedDate,name

        Figure 3: Examples of RDAP query reporting the "sort" parameter

+   Servers MUST implement sorting according to the JSON value type of
+   the RDAP field the sorting property refers to: the lexicographic
+   sorting for strings and the numeric sorting for numbers.
+
    If the "sort" parameter reports an allowed sorting property, it MUST
    be provided in the "currentSort" field of the "sorting_metadata"
    element.

 2.3.1. Sorting Properties Declaration

    In the "sort" parameter ABNF syntax, property-ref represents a
    reference to a property of an RDAP object. Such a reference could be
    expressed by using a JSON Path. The JSON Path in a JSON document
    [RFC8259] is equivalent to the XPath [W3C.CR-xpath-31-20161213] in a
@@ -364,78 +367,83 @@
       * reinstantiationDate
       * transferDate
       * lockedDate
       * unlockedDate

    o  Object specific properties. With regard to the specific
       properties, some of them are already defined among the query
       paths. In the following a list of possible sorting properties,
       grouped by objects, is shown:

-      * Domain: ldhName
-      * Nameserver: ldhName, ipV4, ipV6.
+      * Domain: name
+      * Nameserver: name, ipV4, ipV6.
+
       * Entity: fn, handle, org, email, voice, country, cc, city.

    The correspondence between the sorting properties and the RDAP fields
    is shown in Table 1:

-   +-----------+-----------+-------------+---------+---------+---------+
-   | Object    | Sorting   | RDAP        | RFC7483 | RFC6350 | RFC8605 |
-   | class     | property  | property    |         |         |         |
-   +-----------+-----------+-------------+---------+---------+---------+
-   | Searchabl | Common pr | eventAction | 4.5.    |         |         |
-   | e objects | operties  | values      |         |         |         |
-   |           |           | suffixed by |         |         |         |
-   |           |           | "Date"      |         |         |         |
+   +-----------+-----------+---------------------+------+-------+------+
+   | Object    | Sorting   | RDAP property       | RFC  | RFC   | RFC  |
+   | class     | property  |                     | 7483 | 6350  | 8605 |
+   +-----------+-----------+---------------------+------+-------+------+
+   | Searchabl | Common pr | eventAction values  | 4.5. |       |      |
+   | e objects | operties  | suffixed by "Date"  |      |       |      |
    |           |           |                     |      |       |      |
-   | Domain    | ldhName   | ldhName     | 5.3.    |         |         |
+   | Domain    | name      | unicodeName/ldhName | 5.3. |       |      |
    |           |           |                     |      |       |      |
-   | Nameserve | ldhName   | ldhName     | 5.2.    |         |         |
+   | Nameserve | name      | unicodeName/ldhName | 5.2. |       |      |
    | r         |           |                     |      |       |      |
-   |           | ipV4      | v4          | 5.2.    |         |         |
-   |           |           | ipAddress   |         |         |         |
-   |           | ipV6      | v6          | 5.2.    |         |         |
-   |           |           | ipAddress   |         |         |         |
+   |           | ipV4      | v4 ipAddress        | 5.2. |       |      |
+   |           | ipV6      | v6 ipAddress        | 5.2. |       |      |
    |           |           |                     |      |       |      |
    | Entity    | handle    | handle              | 5.1. |       |      |
    |           | fn        | vcard fn            | 5.1. | 6.2.1 |      |
    |           | org       | vcard org           | 5.1. | 6.6.4 |      |
-   |           | voice     | vcard tel   | 5.1.    | 6.4.1   |         |
-   |           |           | with type=" |         |         |         |
-   |           |           | voice"      |         |         |         |
+   |           | voice     | vcard tel with      | 5.1. | 6.4.1 |      |
+   |           |           | type="voice"        |      |       |      |
    |           | email     | vcard email         | 5.1. | 6.4.2 |      |
-   |           | country   | country     | 5.1.    | 6.3.1   |         |
-   |           |           | name in     |         |         |         |
-   |           |           | vcard adr   |         |         |         |
-   |           | cc        | country     | 5.1.    |         | 3.1     |
-   |           |           | code in     |         |         |         |
+   |           | country   | country name in     | 5.1. | 6.3.1 |      |
    |           |           | vcard adr           |      |       |      |
-   |           | city      | locality in | 5.1.    | 6.3.1   |         |
+   |           | cc        | country code in     | 5.1. |       | 3.1  |
    |           |           | vcard adr           |      |       |      |
-   +-----------+-----------+-------------+---------+---------+---------+
+   |           | city      | locality in vcard   | 5.1. | 6.3.1 |      |
+   |           |           | adr                 |      |       |      |
+   +-----------+-----------+---------------------+------+-------+------+

                  Table 1: Sorting properties definition

    With regard to the definitions in Table 1, some further
-   considerations must be made to disambiguate cases where the RDAP
-   property is multivalued:
+   considerations must be made to disambiguate some cases:
+
+   o  since the response to a search on either domains or nameservers
+      might include both A-labels and U-labels ([RFC5890]) in general, a
+      consistent sorting policy shall take unicodeName and ldhName as
+      two formats of the same value rather than separately. Therefore,
+      the unicodeName value MUST be taken while sorting, when
+      unicodeName is missing, the value of ldhName MUST be considered
+      instead;
+
+   o  the jCard "sort-as" parameter MUST be ignored for the purpose of
+      the sorting capability as described in this document;

    o  even if a nameserver can have multiple IPv4 and IPv6 addresses,
       the most common configuration includes one address for each IP
       version. Therefore, the assumption of having a single IPv4 and/or
       IPv6 value for a nameserver cannot be considered too stringent;

    o  with the exception of handle values, all the sorting properties
       defined for entity objects can be multivalued according to the
       definition of vCard as given in RFC6350 [RFC6350]. When more than
-      a value is reported, sorting can be applied to the preferred value
-      identified by the parameter pref="1".
+      a value is reported, sorting will be applied to the preferred
+      value identified by the parameter pref="1". If the pref parameter
+      is missing, sorting will be applied to the first value.

    Each RDAP provider MAY define other sorting properties than those
    shown in this document.
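
Editorial illustration (not part of either draft revision): read as a procedure, the disambiguation rules added in -04 select a single value per object before any comparison takes place. The Python sketch below shows one possible reading. The helper names name_sort_key and vcard_sort_value are hypothetical, the objects are assumed to be already-parsed RDAP JSON (plain dicts and lists), and Python's default code-point string comparison is only assumed to stand in for the "lexicographic" ordering the draft asks for.

```python
from typing import Any, Optional


def name_sort_key(obj: dict) -> str:
    """Value used for the "name" sorting property of a domain or nameserver.

    Takes unicodeName when present and falls back to ldhName otherwise,
    following the rule that both are formats of the same value.
    """
    return obj.get("unicodeName") or obj.get("ldhName", "")


def vcard_sort_value(entity: dict, prop: str) -> Optional[Any]:
    """Value of a jCard property used for sorting an entity.

    Prefers the occurrence carrying pref "1"; if no occurrence is marked
    as preferred, the first occurrence is used. The jCard "sort-as"
    parameter is never consulted.
    """
    # A jCard is ["vcard", [[name, params, type, value], ...]].
    properties = entity.get("vcardArray", ["vcard", []])[1]
    candidates = [p for p in properties if p[0] == prop]
    if not candidates:
        return None
    for p in candidates:
        # pref may be serialized as the string "1" or the number 1.
        if str(p[1].get("pref")) == "1":
            return p[3]
    return candidates[0][3]


domains = [
    {"ldhName": "xn--zrich-kva.example", "unicodeName": "zürich.example"},
    {"ldhName": "yellow.example"},
]
ordered = sorted(domains, key=name_sort_key)
```

With this key, "yellow.example" sorts before the object whose unicodeName is "zürich.example", whereas sorting on ldhName alone would put the "xn--" form first. That difference is the reason the -04 text treats the two fields as one value.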
    The "jsonPath" field in the "sorting_metadata" element is used to
    clarify the RDAP field the sorting property refers to. The mapping
    between the sorting properties and the JSON Paths of the RDAP fields
    is shown in Table 2. The JSON Paths are provided according to the
    Goessner v.0.8.0 specification ([GOESSNER-JSON-PATH]):
@@ -458,24 +466,24 @@
    |       | e           | ction=="deletion")].eventDate               |
    |       | reinstantia | "$.domainSearchResults[*].events[?(@.eventA |
    |       | tionDate    | ction=="reinstantiation")].eventDate        |
    |       | transferDat | "$.domainSearchResults[*].events[?(@.eventA |
    |       | e           | ction=="transfer")].eventDate               |
    |       | lockedDate  | "$.domainSearchResults[*].events[?(@.eventA |
    |       |             | ction=="locked")].eventDate                 |
    |       | unlockedDat | "$.domainSearchResults[*].events[?(@.eventA |
    |       | e           | ction=="unlocked")].eventDate               |
    |       |             |                                             |
-   | Domai | ldhName     | $.domainSearchResults[*].ldhName            |
+   | Domai | name        | $.domainSearchResults[*].unicodeName        |
    | n     |             |                                             |
    |       |             |                                             |
-   | Names | ldhName     | $.nameserverSearchResults[*].ldhName        |
+   | Names | name        | $.nameserverSearchResults[*].unicodeName    |
    | erver |             |                                             |
    |       | ipV4        | $.nameserverSearchResults[*].ipAddresses.v4 |
    |       |             | [0]                                         |
    |       | ipV6        | $.nameserverSearchResults[*].ipAddresses.v6 |
    |       |             | [0]                                         |
    |       |             |                                             |
    | Entit | handle      | $.entitySearchResults[*].handle             |
    | y     |             |                                             |
    |       | fn          | $.entitySearchResults[*].vcardArray[1][?(@[ |
    |       |             | 0]="fn")][3]                                |
@@ -502,40 +510,40 @@
    sort criteria (Figure 4). Each link represents a reference to an
    alternate view of the results.

    {
      "rdapConformance": [
        "rdap_level_0",
        "sorting_level_0"
      ],
      ...
      "sorting_metadata": {
-        "currentSort": "ldhName",
+        "currentSort": "name",
         "availableSorts": [
         {
           "property": "registrationDate",
           "jsonPath": "$.domainSearchResults[*]
             .events[?(@.eventAction==\"registration\")].eventDate",
           "default": false,
           "links": [
             {
               "value": "https://example.com/rdap/domains?name=*nr.com
-                &sort=ldhName",
+                &sort=name",
               "rel": "alternate",
               "href": "https://example.com/rdap/domains?name=*nr.com
                 &sort=registrationDate",
               "title": "Result Ascending Sort Link",
               "type": "application/rdap+json"
             },
             {
               "value": "https://example.com/rdap/domains?name=*nr.com
-                &sort=ldhName",
+                &sort=name",
               "rel": "alternate",
               "href": "https://example.com/rdap/domains?name=*nr.com
                 &sort=registrationDate:d",
               "title": "Result Descending Sort Link",
               "type": "application/rdap+json"
             }
           ]
         },
      "domainSearchResults": [
         ...
@@ -545,68 +553,65 @@
      Figure 4: Example of a "sorting_metadata" instance to implement
                              result sorting

 2.4. "cursor" Parameter

    An RDAP query could return a response with hundreds, even thousands,
    of objects, especially when partial matching is used. For that
    reason, the cursor parameter addressing result pagination is defined
    to make responses easier to handle.

-   Using limit and offset operators represents the most common way to
-   implement results pagination. Both of them can be used individually:
+   Presently, the most popular methods to implement pagination in REST
+   API are: offset pagination and keyset pagination. Both two
+   pagination methods don't require the server to handle the result set
+   in a storage area across the requests since a new result set is
+   generated each time a request is submitted. Therefore, they are
+   preferred in comparison to any other method requiring the management
+   of a REST session.
+
+   Using limit and offset operators represents the traditionally used
+   method to implement results pagination. Both of them can be used
+   individually:

    o  "limit": means that the server must return the first N objects of
       the result set;

    o  "offset": means that the server must skip the first N objects and
       must return objects starting from position N+1.

    When limit and offset are used together, they allow to identify a
    specific portion of the result set. For example, the pair
    "offset=100,limit=50" returns first 50 objects starting from position
    101 of the result set.

-   However, offset pagination raises some well known drawbacks:
+   Despite its easiness of implementation, offset pagination raises some
+   well known drawbacks:

    o  when offset has a very high value, scrolling the result set could
       take some time;

    o  it always requires to fetch all the rows before dropping as many
       rows as specified by offset;

    o  it may return inconsistent pages when data are frequently updated
       (i.e. real-time data) but this doesn't seem the case of
       registration data.

-   An alternative approach to offset pagination is keyset pagination
-   [SEEK] which consists in adding a query condition that enables the
-   seletion of the only data not yet returned. This method has been
-   taken as the basis for the implementation of a "cursor" parameter
-   [CURSOR] by some REST API providers (e.g.
+   The keyset pagination [SEEK] consists in adding a query condition
+   that enables the seletion of the only data not yet returned. This
+   method has been taken as the basis for the implementation of a
+   "cursor" parameter [CURSOR] by some REST API providers (e.g.
    [CURSOR-API1],[CURSOR-API2]). The cursor is an opaque URL-safe
    string representing a logical pointer to the first result of the next
-   page (Figure 5). Basically, the cursor is the encryption of the key
-   value identifying the last row of the current page. For example, the
-   cursor value "a2V5PXRoZWxhc3Rkb21haW5vZnRoZXBhZ2UuY29t=" is the mere
-   Base64 encoding of "key=thelastdomainofthepage.com".
-
-   The ABNF syntax is the following:
-
-   cursor = "cursor=" ( ALPHA / DIGIT / "/" / "=" / "-" / "_" )
-
-   https://example.com/rdap/domains?name=*nr.com
-   &cursor=wJlCDLIl6KTWypN7T6vc6nWEmEYe99Hjf1XY1xmqV-M=
-
-   Figure 5: An example of RDAP query reporting the "cursor" parameter
+   page (Figure 5).

-   Nevertheless, even cursor pagination can be troublesome:
+   Nevertheless, even keyset pagination can be troublesome:

    o  it needs at least one key field;

    o  it does not allow to sort just by any field because the sorting
       criterion must contain a key;

    o  it works best with full composite values support by DBMS (i.e.
       [x,y]>[a,b]), emulation is possible but ugly and less performant;

    o  it does not allow to directly navigate to arbitrary pages because
@@ -633,33 +638,50 @@
       response, the time required by offset pagination to skip the
       previous pages could be much faster than the processing time
       needed to build the current page. In fact, RDAP objects are
       usually formed by information belonging to multiple data
       structures and containing multivalued properties (i.e. arrays)
       and, therefore, data selection might be a time consuming process.
       This situation occurs even though the selection is supported by
       indexes;

    o  depending on the access levels defined by each RDAP operator, the
-      increase of complexity and the decrease of flexibility of cursor
+      increase of complexity and the decrease of flexibility of keyset
       pagination with respect to the offset pagination could be
       considered impractical.

    Ultimately, both pagination methods have benefits and drawbacks.
-   That said, the cursor parameter can be used to encode not only the
-   key value but also the information about offset pagination. For
-   example, the cursor value "b2Zmc2V0PTEwMCxsaW1pdD01MAo=" is the mere
-   Base64 encoding of "offset=100,limit=50". This solution lets RDAP
-   providers to implement a pagination method according to their needs,
-   the user access levels, the submitted queries. In addition, servers
In addition, servers - can change the method over time without announcing anything to the - clients. + + That said, the cursor parameter defined in this specification can be + used to encode information about any pagination method. For example, + in the case of a simple implementation of the cursor parameter to + represent offset pagination information, the cursor value + "b2Zmc2V0PTEwMCxsaW1pdD01MAo=" is the mere Base64 encoding of + "offset=100,limit=50". Likewise, in a simple implementation to + represent keyset pagination information, the cursor value + "a2V5PXRoZWxhc3Rkb21haW5vZnRoZXBhZ2UuY29t=" represents the mere + Base64 encoding of "key=thelastdomainofthepage.com" where the key + value identifies the last row of the current page. + + This solution lets RDAP providers to implement a pagination method + according to their needs, the user access levels, the submitted + queries. In addition, servers can change the method over time + without announcing anything to the clients. + + The ABNF syntax of the cursor paramter is the following: + + cursor = "cursor=" 1*( ALPHA / DIGIT / "/" / "=" / "-" / "_" ) + + https://example.com/rdap/domains?name=*nr.com + &cursor=wJlCDLIl6KTWypN7T6vc6nWEmEYe99Hjf1XY1xmqV-M= + + Figure 5: An example of RDAP query reporting the "cursor" parameter 2.4.1. Representing Paging Links An RDAP server SHOULD use the "links" array of the "paging_metadata" element to provide a ready-made reference [RFC8288] to the next page of the result set (Figure 6). Examples of additional "rel" values a server MAY implements are "first", "last", "prev". { "rdapConformance": [ @@ -724,24 +746,20 @@ The implementation of the new parameters is technically feasible, as operators for counting, sorting and paging are currently supported by the major RDBMSs. 
Similar operators are completely or partially supported by the most known NoSQL databases (MongoDB, CouchDB, HBase, Cassandra, Hadoop) so the implementation of the new parameters seems to be practicable by servers working without the use of an RDBMS. - Furthermore, both two pagination methods don't require the server to - handle the result set in a storage area across the requests since a - new result set is generated each time a request is submitted. - 6. Implementation Status NOTE: Please remove this section and the reference to RFC 7942 prior to publication as an RFC. This section records the status of known implementations of the protocol defined by this specification at the time of posting of this Internet-Draft, and is based on a proposal described in RFC 7942 [RFC7942]. The description of implementations in this section is intended to assist the IETF in its decision processes in progressing @@ -854,20 +871,25 @@ [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA Considerations Section in RFCs", RFC 5226, DOI 10.17487/RFC5226, May 2008, . [RFC5234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", STD 68, RFC 5234, DOI 10.17487/RFC5234, January 2008, . + [RFC5890] Klensin, J., "Internationalized Domain Names for + Applications (IDNA): Definitions and Document Framework", + RFC 5890, DOI 10.17487/RFC5890, August 2010, + . + [RFC6350] Perreault, S., "vCard Format Specification", RFC 6350, DOI 10.17487/RFC6350, August 2011, . [RFC7230] Fielding, R., Ed. and J. Reschke, Ed., "Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing", RFC 7230, DOI 10.17487/RFC7230, June 2014, . [RFC7480] Newton, A., Ellacott, B., and N. Kong, "HTTP Usage in the @@ -972,31 +994,35 @@ Parameter". Removed "FOR DISCUSSION" items. Provided a more detailed description of both "sorting_metadata" and "paging_metadata" objects. 02: Removed both "offset" and "limit" parameters. Added ABNF syntax of cursor parameter. 
         Rearranged the layout of some sections. Removed some items from
         "Informative References" section. Changed "IANA Considerations"
         section.

    03:  Added "cc" to the list of sorting properties in "Sorting
         Properties Declaration" section. Added RFC8605 to the list of
         "Informative References".
+   04:  Replaced "ldhName" with "name" in the "Sorting Properties
+        Declaration" section. Clarified the sorting logic with respect
+        the JSON value types and the sorting policy for multivalued
+        fields.

 Authors' Addresses
-
    Mario Loffredo
    IIT-CNR/Registro.it
    Via Moruzzi,1
    Pisa 56124
    IT

    Email: mario.loffredo@iit.cnr.it
    URI: http://www.iit.cnr.it
+
    Maurizio Martinelli
    IIT-CNR/Registro.it
    Via Moruzzi,1
    Pisa 56124
    IT

    Email: maurizio.martinelli@iit.cnr.it
    URI: http://www.iit.cnr.it

    Scott Hollenbeck