Internet Engineering Task Force                         G. Bertrand, Ed.
Internet-Draft                                           I. Oprescu, Ed.
Intended status: Informational                   France Telecom - Orange
Expires: November 28, 2013                           F. Le Faucheur, Ed.
                                                           Cisco Systems
                                                          R. Peterkofsky
                                                           Skytide, Inc.
                                                            May 27, 2013

                         CDNI Logging Interface
                       draft-ietf-cdni-logging-02

Abstract

This memo specifies the Logging interface between a downstream CDN (dCDN) and an upstream CDN (uCDN) that are interconnected as per the CDN Interconnection (CDNI) framework. First, it describes a reference model for CDNI logging. Then, it specifies the CDNI Logging File format and the actual protocol for exchange of CDNI Logging Files.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on November 28, 2013.

Copyright Notice

Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document.
Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

This document may contain material from IETF Documents or IETF Contributions published or made publicly available before November 10, 2008. The person(s) controlling the copyright in some of this material may not have granted the IETF Trust the right to allow modifications of such material outside the IETF Standards Process. Without obtaining an adequate license from the person(s) controlling the copyright in such materials, this document may not be modified outside the IETF Standards Process, and derivative works of it may not be created outside the IETF Standards Process, except to format it for publication as an RFC or to translate it into languages other than English.

Table of Contents

1. Introduction
  1.1. Terminology
  1.2. Abbreviations
2. CDNI Logging Reference Model
  2.1. CDNI Logging interactions
  2.2. Overall Logging Chain
    2.2.1. Logging Generation and During-Generation Aggregation
    2.2.2. Logging Collection
    2.2.3. Logging Filtering
    2.2.4. Logging Rectification and Post-Generation Aggregation
    2.2.5. Log-Consuming Applications
      2.2.5.1. Maintenance/Debugging
      2.2.5.2. Accounting
      2.2.5.3. Analytics and Reporting
      2.2.5.4. Security
      2.2.5.5. Legal Logging Duties
      2.2.5.6. Notions common to multiple Log Consuming Applications
3. CDNI Logging File Format
  3.1. CDNI Logging File Directives
  3.2. Logging Records
    3.2.1. HTTP Request Logging Record
    3.2.2. CDNI Logging File Example
  3.3. Fields and Directives Formats
4. CDNI Logging File Exchange Protocol
  4.1. CDNI Logging Feed
  4.2. CDNI Logging File Pull
5. Open Issues
6. IANA Considerations
7. Security Considerations
  7.1. Authentication, Confidentiality, Integrity Protection
  7.2. Non Repudiation
  7.3. Privacy
8. Acknowledgments
9. References
  9.1. Normative References
  9.2. Informative References
Appendix A. Requirements
  A.1. Compliance with cdni-requirements
  A.2. Additional Requirements
    A.2.1. Timeliness
    A.2.2. Reliability
    A.2.3. Security
    A.2.4. Scalability
    A.2.5. Consistency between CDNI Logging and CDN Logging
    A.2.6. Dispatching/Filtering
Appendix B. Analysis of candidate protocols for Logging Transport
  B.1. Syslog
  B.2. XMPP
  B.3. SNMP
Authors' Addresses

1. Introduction

This memo specifies the Logging interface between a downstream CDN (dCDN) and an upstream CDN (uCDN). First, it describes a reference model for CDNI logging. Then, it specifies the CDNI Logging File format and the actual protocol for exchange of CDNI Logging Files.

The reader should be familiar with the following documents:

o CDNI problem statement [RFC6707] and framework [I-D.ietf-cdni-framework] identify a Logging interface,
o Section 8 of [I-D.ietf-cdni-requirements] specifies a set of requirements for Logging,
o [RFC6770] outlines real world use-cases for interconnecting CDNs. These use cases require the exchange of Logging information between the dCDN and the uCDN.

As stated in [RFC6707], "the CDNI Logging interface enables details of logs or events to be exchanged between interconnected CDNs".

The present document describes:

o The CDNI Logging reference model (Section 2),
o The CDNI Logging File format (Section 3),
o The CDNI Logging File Exchange protocol (Section 4).

1.1. Terminology

In this document, the first letter of each CDNI-specific term is capitalized. We adopt the terminology described in [RFC6707] and [I-D.ietf-cdni-framework], and extend it with the additional terms defined below.
For clarity, we use the word "Log" only for referring to internal CDN logs and we use the word "Logging" for any inter-CDN information exchange and processing operations related to the CDNI Logging interface. Log and Logging formats may be different.

CDN Logging information: logging information generated and collected within a CDN

CDNI Logging information: logging information exchanged across CDNs using the CDNI Logging Interface

Logging information: logging information generated and collected within a CDN or obtained from another CDN using the CDNI Logging Interface

CDNI Logging Field: an atomic element of information that can be included in a CDNI Logging Record. The time an event/task started, the IP address of an End user to whom content was delivered, and the URI of the content delivered are examples of CDNI Logging Fields.

CDNI Logging Record: an information record providing information about a specific event. This comprises a collection of CDNI Logging Fields.

Separator Character: a specific character used to enable the parsing of Logging Records. This character separates the Logging Fields that compose a Logging Record.

CDNI Logging File: a file containing CDNI Logging Records, as well as additional information facilitating the processing of the CDNI Logging Records.

CDN Reporting: the process of providing the relevant information that will be used to create a formatted content delivery report provided to the CSP in deferred time. Such information typically includes aggregated data that can cover a large period of time (e.g., from hours to several months). Uses of Reporting include the collection of charging data related to CDN services and the computation of Key Performance Indicators (KPIs).

CDN Monitoring: the process of providing content delivery information in real-time. Monitoring typically includes data in real time to provide visibility of the deliveries in progress, for service operation purposes. It presents a view of the global health of the services as well as information on usage and performance, for network services supervision and operation management. In particular, monitoring data can be used to generate alarms.

End-User experience management: study of Logging data using statistical analysis to discover, understand, and predict user behavior patterns.

Class-of-requests: A Class-of-requests identifies a set of content Requests, related to a specific CSP, received from clients in a given footprint and sharing common properties. These properties include:

o Any header, URL parameter, query parameter of an HTTP (or RTMP) content request
o Any header, or sub-domain of the FQDN of a DNS lookup request

Examples:

o Class-of-Requests = all the requests that include the HTTP header "User-Agent: Mozilla/5.0" related to CSP "http://*.cdn.example.com" from AS3215
o Class-of-Requests = all the DNS requests from anywhere and related to CSP "cdn*.example.com"

Delivery Service: A Delivery Service is defined by a set of Class-of-Requests and a list of parameters that apply to all these Class-of-Requests (logging format, delivery quality/capabilities requirements...).

Service Agreement: A service agreement is defined by a uCDN identifier, a dCDN identifier, a set of Delivery Services and a list of parameters that apply to the Service Agreement. Once a Service Agreement is agreed between the administrative entities managing the CDNs to be interconnected, the upstream CDN and the downstream CDN of the CDNI interconnection must be configured according to this agreed Service Agreement.

For instance, a given uCDN (uCDN1) may request a given dCDN (dCDN1) to configure one Delivery Service for handling requests for HTTP Adaptive streaming videos delegated by uCDN1 and related to a specific CSP (CSP1) and another one for handling requests for static pictures delegated by uCDN1 and related to CSP1. These Delivery services would belong to the Service Agreement between uCDN1 and dCDN1 for CSP1. In this simple example, uCDN1 may request dCDN1 to include Delivery Service information in its CDNI Logging, to help uCDN1 to provide relevant reports to CSP1.

1.2. Abbreviations

o API: Application Programming Interface
o CCID: Content Collection Identifier
o CDN: Content Delivery Network
o CDNP: Content Delivery Network Provider
o CoDR: Content Delivery Record
o CSP: Content Service Provider
o DASH: Dynamic Adaptive Streaming over HTTP
o dCDN: downstream CDN
o FTP: File Transfer Protocol
o HAS: HTTP Adaptive Streaming
o KPI: Key Performance Indicator
o PVR: Personal Video Recorder
o SID: Session Identifier
o SFTP: SSH File Transfer Protocol
o SNMP: Simple Network Management Protocol
o uCDN: upstream CDN

2. CDNI Logging Reference Model

2.1. CDNI Logging interactions

The CDNI logging reference model between a given uCDN and a given dCDN involves the following interactions:

o customization by the uCDN of the CDNI logging information to be provided by the dCDN to the uCDN (e.g. control of which logging fields are to be communicated to the uCDN for a given task performed by the dCDN, control of which types of events are to be logged). The dCDN takes into account this CDNI logging customization information to determine what logging information to provide to the uCDN, but it may, or may not, take into account this CDNI logging customization information to influence what CDN logging information is to be generated and collected within the dCDN (e.g. even if the uCDN requests a restricted subset of the logging information, the dCDN may elect to generate a broader set of logging information). The mechanism to support the customisation by the uCDN of CDNI Logging information is outside the scope of this document and left for further study.
  We note that the CDNI Control interface or the CDNI Metadata interface appear as candidate interfaces on which to potentially build such a customisation mechanism in the future. Before such a mechanism is available, the uCDN and dCDN are expected to agree off-line on what CDNI logging information is to be provided by the dCDN to the uCDN, and to rely on management plane actions to configure the CDNI Logging functions in the dCDN to generate this information and those in the uCDN to expect it.

o generation and collection by the dCDN of logging information related to the completion of any task performed by the dCDN on behalf of the uCDN (e.g., delivery of the content to an end user) or related to events happening in the dCDN that are relevant to the uCDN (e.g., failures or unavailability in dCDN). This takes place within the dCDN and does not directly involve CDNI interfaces.

o communication by the dCDN to the uCDN of the logging information collected by the dCDN relevant to the uCDN. This is supported by the CDNI Logging interface and in the scope of the present document. For example, the uCDN may use this logging information to charge the CSP, to perform analytics and monitoring for operational reasons, to provide analytics and monitoring views on its content delivery to the CSP or to perform trouble-shooting.

o customization by the dCDN of the logging to be performed by the uCDN on behalf of the dCDN. The mechanism to support the customisation by the dCDN of CDNI Logging information is outside the scope of this document and left for further study.

o generation and collection by the uCDN of logging information related to the completion of any task performed by the uCDN on behalf of the dCDN (e.g., serving of content by uCDN to dCDN for acquisition purposes by dCDN) or related to events happening in the uCDN that are relevant to the dCDN. This takes place within the uCDN and does not directly involve CDNI interfaces.
o communication by the uCDN to the dCDN of the logging information collected by the uCDN relevant to the dCDN. For example, the dCDN might potentially benefit from this information for security auditing or content acquisition troubleshooting. This is outside the scope of this document and left for further study.

Figure 1 provides an example of CDNI Logging interactions (focusing only on the interactions that are in the scope of this document) in a particular scenario where 4 CDNs are involved in the delivery of content from a given CSP: the uCDN has a CDNI interconnection with dCDN-1 and dCDN-2. In turn, dCDN-2 has a CDNI interconnection with dCDN-3. In this example, uCDN, dCDN-1, dCDN-2 and dCDN-3 all participate in the delivery of content for the CSP. In this example, the CDNI Logging interface enables the uCDN to obtain logging information from all the dCDNs involved in the delivery. In the example, uCDN uses the Logging data:

o to analyze the performance of the delivery operated by the dCDNs and to adjust its operations (e.g., request routing) as appropriate,
o to provide reporting (non real-time) and monitoring (real-time) information to CSP.

For instance, uCDN merges Logging data, extracts relevant KPIs, and presents a formatted report to the CSP, in addition to a bill for the content delivered by uCDN itself or by its dCDNs on its behalf. uCDN may also provide Logging data as raw log files to the CSP, so that the CSP can use its own logging analysis tools.

                           +-----+
                           | CSP |
                           +-----+
                              ^ Reporting and monitoring data
                              * Billing
                           ,--*--.
                Logging ,-'       `-.
                Data  =>(    uCDN    )<= Logging
                     //  `-.     _,-'  \\ Data
                     ||     `-'-'-'     ||
                  ,-----.            ,-----.
               ,-'       `-.      ,-'       `-.
              (   dCDN-1    )    (   dCDN-2    )<== Logging
               `-.       ,-'      `-.      _,-'  \\ Data
                  `--'--'            `--'-'      ||
                                               ,-----.
                                             ,'       `-.
                                            (   dCDN-3   )
                                             `.       ,-'
                                               `--'--'

   ===> CDNI Logging Interface
   ***> outside the scope of CDNI

          Figure 1: Interactions in CDNI Logging Reference Model

A dCDN (e.g., dCDN-2) integrates the relevant logging information obtained from its dCDNs (e.g., dCDN-3) in the logging information that it provides to the uCDN, so that the uCDN ultimately obtains all logging information relevant to a CSP for which it acts as the authoritative CDN.

Note that the format of Logging information that a CDN provides over the CDNI interface might be different from the one that the CDN uses internally. In this case, the CDN needs to reformat the Logging information before it provides this information to the other CDN over the CDNI Logging interface. Similarly, a CDN might reformat the Logging data that it receives over the CDNI Logging interface before injecting it into its log-consuming applications or before providing some of this logging information to the CSP. Such reformatting operations introduce latency in the logging distribution chain and introduce a processing burden. Therefore, there are benefits in specifying CDNI Logging formats that are suitable for use inside CDNs and are also close to the CDN Log formats commonly used in CDNs today.

2.2. Overall Logging Chain

This section discusses the overall logging chain within and across CDNs to clarify how CDN Logging information is expected to fit in this overall chain. Figure 2 illustrates the overall logging chain within the dCDN, across CDNs using the CDNI Logging interface and within the uCDN. Note that the logging chain illustrated in the Figure is only indicative and varies depending on the specific environments. For example, there may be more or fewer instances of each entity (e.g., there may be 4 Log consuming applications in a given CDN). As another example, there may be one instance of the Rectification process per Log Consuming Application instead of a shared one.
        Log Consuming    Log Consuming
             App              App
             /\               /\
             |                |
        Rectification----------
             /\
             |
         Filtering
             /\
             |
         Collection                          uCDN
          /\      /\
          |       |
          |    Generation
          |
   CDNI Logging ---------------------------------------------
   exchange
          /\     Log Consuming    Log Consuming
          |           App              App
          |           /\               /\
          |           |                |
     Rectification   Rectification------
          /\              /\
          |               |
          Filtering--------
             /\
             |
         Collection                          dCDN
          /\      /\
          |       |
     Generation  Generation

          Figure 2: CDNI Logging in the overall Logging Chain

The following subsections describe each of the processes potentially involved in the logging chain of Figure 2.

2.2.1. Logging Generation and During-Generation Aggregation

CDNs typically generate logging information for all significant task completions, events, and failures. Logs are typically generated by many devices in the CDN including the surrogates, the request routing system, and the control system.

The amount of Logging information generated can be huge. Therefore, during contract negotiations, interconnected CDNs often agree on a Logging retention duration, and optionally, on a maximum size of the Logging data that the dCDN must keep. If this size is exceeded, the dCDN must alert the uCDN but might not keep more Logs for the considered time period. In addition, CDNs may aggregate logs and transmit only summaries for some categories of operations instead of the full Logging data. Note that such aggregation leads to an information loss, which may be problematic for some usages of Logging (e.g., debugging).

[I-D.brandenburg-cdni-has] discusses logging for HTTP Adaptive Streaming (HAS). In accordance with the recommendations articulated there, it is expected that a surrogate will generate separate logging information for delivery of each chunk of HAS content. This ensures that separate logging information can then be provided to interconnected CDNs over the CDNI Logging interface.
Still in line with the recommendations of [I-D.brandenburg-cdni-has], the logging information for per-chunk delivery may include some information (a Content Collection IDentifier and a Session IDentifier) intended to facilitate subsequent post-generation aggregation of per-chunk logs into per-session logs. Note that a CDN may also elect to generate aggregate per-session logs when performing HAS delivery, but this needs to be in addition to, and not instead of, the per-chunk delivery logs. We note that this may be revisited in future versions of this document.

Note that in the case of non real-time logging, the trigger of the transmission or generation of the logging file appears to be a synchronous process from a protocol standpoint. The implementation algorithm can choose to enforce a maximum size for the logging file beyond which the transmission is automatically triggered (and thus allow for an asynchronous transmission process).

2.2.2. Logging Collection

This is the process that continuously collects logs generated by the log-generating entities within a CDN.

In a CDNI environment, in addition to collecting logging information from log-generating entities within the local CDN, the Collection process also collects logging information provided by another CDN, or other CDNs, through the CDNI Logging interface. This is illustrated in Figure 2 where we see that the Collection process of the uCDN collects logging information from log-generating entities within the uCDN as well as logging information coming through CDNI Logging exchange with the dCDN through the CDNI Logging interface.

2.2.3. Logging Filtering

A CDN may need to present different subsets of the collected logging information to different log-consuming applications. This is achieved by the Filtering process.
In particular, the Filtering process can also filter the right subset of information that needs to be provided to a given interconnected CDN. For example, the filtering process in the dCDN can be used to ensure that only the logging information related to tasks performed on behalf of a given uCDN is made available to that uCDN (thereby filtering all the logging information related to deliveries by the dCDN of content for its own CSPs). Similarly, the Filtering process may filter or partially mask some fields, for example, to protect End Users' privacy when communicating CDNI Logging information to another CDN. Filtering of logging information prior to communication of this information to other CDNs via the CDNI Logging interface requires that the downstream CDN can recognize the set of log records that relate to each interconnected CDN.

The CDN will also filter some internal scope information such as information related to its internal alarms (security, failures, load, etc.).

In some use cases described in [RFC6770], the interconnected CDNs do not want to disclose details on their internal topology. The filtering process can then also filter confidential data on the dCDNs' topology (number of servers, location, etc.). In particular, information about the requests served by every Surrogate may be confidential. Therefore, the Logging information must be protected so that data such as Surrogates' hostnames is not disclosed to the uCDN. In the "Inter-Affiliates Interconnection" use case, this information may be disclosed to the uCDN because both the dCDN and the uCDN are operated by entities of the same group.

2.2.4. Logging Rectification and Post-Generation Aggregation

If Logging is generated periodically, it is important that the sessions that start in one Logging period and end in another are correctly reported.
If they are reported in the starting period, then the Logging of this period will be available only after the end of the session, which delays the Logging generation.

A Logging rectification/update mechanism could be useful to reach a good trade-off between the Logging generation delay and the Logging accuracy. Depending on the selected Logging protocol(s), such a mechanism may be invaluable for real-time Logging, which must be provided rapidly and cannot wait for the end of operations in progress.

In the presence of HAS, some log-consuming applications can benefit from aggregate per-session logs. For example, for analytics, per-session logs allow display of session-related trends which are much more meaningful for some types of analysis than chunk-related trends. In the case where the log-generating entities have generated during-generation aggregate logs, those can be used by the applications. In the case where aggregate logs have not been generated, the Rectification process can be extended with a Post-Generation Aggregation process that generates per-session logs from the per-chunk logs, possibly leveraging the information included in the per-chunk logs for that purpose (Content Collection IDentifier and a Session IDentifier). However, in accordance with [I-D.brandenburg-cdni-has], this document does not define exchange of such aggregate logs on the CDNI Logging interface. We note that this may be revisited in future versions of this document.

2.2.5. Log-Consuming Applications

2.2.5.1. Maintenance/Debugging

Logging is useful to permit the detection (and limit the risk) of content delivery failures. In particular, Logging facilitates the resolution of configuration issues.

To detect faults, Logging must enable the reporting of any CDN operation success and failure, such as request redirection, content acquisition, etc. The uCDN can summarize such information into KPIs.
For instance, the Logging format should allow the computation of the number of times during a given epoch that content delivery related to a specific service succeeds/fails.

Logging enables the CDN providers to identify and troubleshoot performance degradations. In particular, Logging enables the communication of traffic data (e.g., the amount of traffic that has been forwarded by a dCDN on behalf of a uCDN over a given period of time), which is particularly useful for CDN and network planning operations.

2.2.5.2. Accounting

Logging is essential for accounting, to permit inter-CDN billing and CSP billing by uCDNs. For instance, Logging information provided by dCDNs enables the uCDN to compute the total amount of traffic delivered by every dCDN for a particular Content Provider, as well as the associated bandwidth usage (e.g., peak, 95th percentile), and the maximum number of simultaneous sessions over a given period of time.

2.2.5.3. Analytics and Reporting

The goal of analytics is to gather any relevant information to track audience, analyze user behavior, and monitor the performance and quality of content delivery. For instance, Logging enables the CDN providers to report on content consumption (e.g., delivered sessions per content) in a specific geographic area.

The goal of reporting is to gather any relevant information to monitor the performance and quality of content delivery and allow detection of delivery issues. For instance, reporting could track the average delivery throughput experienced by End-Users in a given region for a specific CSP or content set over a period of time.

2.2.5.4. Security

The goal of security is to prevent and monitor unauthorized access, misuse, modification, and denial of access of a service. A set of information is logged for security purposes.
In particular, a record of access to content is usually collected to permit the CSP to detect infringements of content delivery policies and other abnormal End User behaviors.

2.2.5.5. Legal Logging Duties

Depending on the country considered, the CDNs may have to retain specific Logging information during a legal retention period, to comply with judicial requisitions.

2.2.5.6. Notions common to multiple Log Consuming Applications

2.2.5.6.1. Logging Information Views

Within a given log-consuming application, different views may be provided to different users depending on privacy, business, and scalability constraints.

For example, an analytics tool run by the uCDN can provide one view to a uCDN operator that exploits all the logging information available to the uCDN, while the tool may provide a different view to each CSP exploiting only the logging information related to the content of the given CSP.

As another example, maintenance and debugging tools may provide different views to different CDN operators, based on their operational role.

2.2.5.6.2. Key Performance Indicators (KPIs)

This section presents, for explanatory purposes, a non-exhaustive list of Key Performance Indicators (KPIs) that can be extracted/produced from logs.

Multiple log-consuming applications, such as analytics, monitoring, and maintenance applications, often compute and track such KPIs.

In a CDNI environment, depending on the situation, these KPIs may be computed by the uCDN or by the dCDN. But it is usually the uCDN that computes KPIs, because uCDN and dCDN may have different definitions of the KPIs and the computation of some KPIs requires a view of all the deliveries performed by the uCDN and all its dCDNs.
Here is a list of important examples of KPIs:

o  Number of delivery requests received from End Users in a given region for each piece of content, during a given period of time (e.g., hour/day/week/month)

o  Percentage of delivery successes/failures among the aforementioned requests

o  Number of failures listed by failure type (e.g., HTTP error code) for requests received from End Users in a given region and for each piece of content, during a given period of time (e.g., hour/day/week/month)

o  Number and cause of premature delivery termination for End Users in a given region and for each piece of content, during a given period of time (e.g., hour/day/week/month)

o  Maximum and mean number of simultaneous sessions established by End Users in a given region, for a given Content Provider, and during a given period of time (e.g., hour/day/week/month)

o  Volume of traffic delivered for sessions established by End Users in a given region, for a given Content Provider, and during a given period of time (e.g., hour/day/week/month)

o  Maximum, mean, and minimum delivery throughput for sessions established by End Users in a given region, for a given Content Provider, and during a given period of time (e.g., hour/day/week/month)

o  Cache-hit and byte-hit ratios for requests received from End Users in a given region for each piece of content, during a given period of time (e.g., hour/day/week/month)

o  Top 10 of the most popularly requested content (during a given day/week/month)

o  Terminal type (mobile, PC, STB), if this information can be acquired, for example, from the browser type header.

Additional KPIs can be computed from sources of information other than the Logging, for instance, data collected by a content portal or by specific client-side application programming interfaces. Such KPIs are out of scope for the present memo.
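As an illustration of how a few of the KPIs listed above could be derived from logs, the following Python sketch computes the delivery success percentage, the number of failures per status code, and the cache-hit and byte-hit ratios. It is not part of this memo's specification. The function name compute_kpis, the dict-based record representation, the choice of counting status codes below 400 as successes, and the approximation of the byte-hit ratio from whole-response byte counts are all assumptions made for the example. The field names sc-status, sc-total-bytes, and s-cached are those of the HTTP Request Logging Record (Section 3.2.1).

```python
from collections import defaultdict


def compute_kpis(records):
    """Compute a few KPIs from an iterable of records.

    Each record is assumed to be a dict of strings keyed by field name
    ('sc-status', 'sc-total-bytes', 's-cached').
    """
    total = successes = hits = 0
    total_bytes = hit_bytes = 0
    failures_by_status = defaultdict(int)
    for rec in records:
        total += 1
        status = int(rec["sc-status"])
        size = int(rec["sc-total-bytes"])
        total_bytes += size
        if status < 400:  # assumption: 4xx/5xx count as delivery failures
            successes += 1
        else:
            failures_by_status[status] += 1
        if rec["s-cached"] == "1":  # "1" means the Surrogate had a cache hit
            hits += 1
            hit_bytes += size
    return {
        "requests": total,
        "success_pct": 100.0 * successes / total if total else 0.0,
        "failures_by_status": dict(failures_by_status),
        "cache_hit_ratio": hits / total if total else 0.0,
        "byte_hit_ratio": hit_bytes / total_bytes if total_bytes else 0.0,
    }


# Hypothetical records, for illustration only.
sample = [
    {"sc-status": "200", "sc-total-bytes": "1000", "s-cached": "1"},
    {"sc-status": "200", "sc-total-bytes": "3000", "s-cached": "0"},
    {"sc-status": "404", "sc-total-bytes": "500", "s-cached": "0"},
    {"sc-status": "206", "sc-total-bytes": "500", "s-cached": "1"},
]
kpis = compute_kpis(sample)
```

Per-region or per-content breakdowns, as in the list above, would follow the same pattern with the records grouped by the relevant key before aggregation.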
The KPIs used depend strongly on the considered log-consuming application: the CDN operator may be interested in different metrics than the CSP is. In particular, CDN operators are often interested in delivery and acquisition performance KPIs, information related to Surrogates' performance, caching information to evaluate the cache-hit ratio, information about the delivered file size to compute the volume of content delivered during peak hour, etc.

Some of the KPIs, for instance those providing an instantaneous view of the active sessions for a given CSP's content, are useful essentially if they are provided in real time. By contrast, some other KPIs, such as those averaged over a long period of time, can be provided in non-real time.

3.  CDNI Logging File Format

As defined in Section 1.1, a CDNI Logging Field is an atomic logging information element and a CDNI Logging Record is a collection of CDNI Logging Fields containing all logging information corresponding to a single logging event. This document defines a third level of structure, the CDNI Logging File, that is a collection of CDNI Logging Records. This structure is illustrated in Figure 3. The CDNI Logging File structure and encoding are specified in the present section.
+------------------------------------------------------+
|CDNI Logging File                                     |
|                                                      |
| +--------------------------------------------------+ |
| |CDNI Logging Record                               | |
| | +-------------+ +-------------+ +-------------+  | |
| | |CDNI Logging | |CDNI Logging | |CDNI Logging |  | |
| | |    Field    | |    Field    | |    Field    |  | |
| | +-------------+ +-------------+ +-------------+  | |
| +--------------------------------------------------+ |
|                                                      |
| +--------------------------------------------------+ |
| |CDNI Logging Record                               | |
| | +-------------+ +-------------+ +-------------+  | |
| | |CDNI Logging | |CDNI Logging | |CDNI Logging |  | |
| | |    Field    | |    Field    | |    Field    |  | |
| | +-------------+ +-------------+ +-------------+  | |
| +--------------------------------------------------+ |
|                                                      |
| +--------------------------------------------------+ |
| |CDNI Logging Record                               | |
| | +-------------+ +-------------+ +-------------+  | |
| | |CDNI Logging | |CDNI Logging | |CDNI Logging |  | |
| | |    Field    | |    Field    | |    Field    |  | |
| | +-------------+ +-------------+ +-------------+  | |
| +--------------------------------------------------+ |
+------------------------------------------------------+

              Figure 3: Structure of Logging Files

The CDNI Logging File format is inspired by the W3C Extended Log File Format [ELF]. However, it is fully specified by the present document. Where the present document differs from the W3C Extended Log File Format, an implementation of CDNI Logging MUST comply with the present document.

A CDNI Logging File MUST contain a sequence of lines containing US-ASCII characters [CHAR_SET], each terminated by either LF or the sequence CRLF. A CDNI Logging implementation consuming CDNI Logging Files MUST accept lines terminated by either LF or CRLF.

Each line of a CDNI Logging File MUST contain either a directive or a CDNI Logging Record.

Directives record information about the CDNI Logging process itself. Lines containing directives MUST begin with the "#" character. Directives are specified in Section 3.1.

Logging Records provide actual details of the logged event. Logging Records are specified in Section 3.2.

3.1.  CDNI Logging File Directives

An implementation of the CDNI Logging interface MUST support the following directives (formats specified in the form <...> are specified in Section 3.3):

o  Version:

   *  format: <digit>.<digit>

   *  semantic: indicates the version of the CDNI Logging File format. The value MUST be "1.0" for the version specified in the present document.

   *  occurrence: there MUST be one and only one instance of this directive. It MUST be the first line of the CDNI Logging File.

o  UUID:

   *  format: <string>

   *  semantic: this is the Universally Unique IDentifier for the CDNI Logging File, as specified in [RFC4122].

   *  occurrence: there MUST be one and only one instance of this directive.
o  Origin:

   *  format: <host>

   *  semantic: this identifies the entity transmitting the CDNI Logging File (e.g., the host in a dCDN supporting the CDNI Logging interface) or the entity responsible for transmitting the CDNI Logging File (e.g., the dCDN).

   *  occurrence: there MUST be zero or one instance of this directive. This directive MAY be included by the implementation transmitting the CDNI Logging File. When included by the transmitting side, it MUST be validated or overwritten by the receiving side. When it is not included by the transmitting side, it MAY be added locally by the receiving side.

   [Editor's Note, if we include a non-repudiation mechanism: discuss the fact that this will provide an incentive for the dCDN not to cheat, as it can be detected]

o  Record-Type:

   *  format: <string>

   *  semantic: indicates the type of the CDNI Logging Records that follow this directive, until another Record-Type directive (or the end of the CDNI Logging File). "cdni_http_request_v1" MUST be indicated in the Record-Type directive for CDNI Logging Records corresponding to HTTP requests (e.g., an HTTP delivery request) as specified in Section 3.2.1.

   *  occurrence: there MUST be at least one instance of this directive. The first instance of this directive MUST precede a Fields directive and precede any CDNI Logging Record.

o  Fields:

   *  format: <field-name>[ <field-name>], where the allowed <field-name> values are specified for each Record-Type in Section 3.2.

   *  semantic: this lists the names of all the fields for which a value is to appear in the CDNI Logging Records that follow this directive. The names of the fields, as well as their possible occurrences, are specified for each type of CDNI Logging Record in Section 3.2. The field names listed in this directive MUST be separated by a whitespace (" ").

   *  occurrence: there MUST be at least one instance of this directive per Record-Type directive. The first instance of this directive for a given Record-Type MUST precede any CDNI Logging Record for this Record-Type.

o  Integrity-Hash:

   *  format: <string>

   *  semantic: this directive permits the detection of a corrupted CDNI Logging File. This can be useful, for instance, if a problem occurs on the filesystem of the dCDN Logging system and leads to a truncation of a logging file.
      The Integrity-Hash value is computed, and included in this directive, by the entity that transmits the CDNI Logging File. It is obtained by applying the MD5 [RFC1321] cryptographic hash function to the CDNI Logging File, including all the directives and logging records up to the Integrity-Hash directive, but excluding the Integrity-Hash directive itself and, when present, also excluding the Non-Repudiation-Hash directive. The Integrity-Hash value is represented as a US-ASCII encoded hexadecimal number, 32 digits long (representing a 128-bit hash value).

      The entity receiving the CDNI Logging File computes the MD5 hash on the received CDNI Logging File in the same way and compares this hash to the value of the Integrity-Hash directive. If the two values are equal, then the received CDNI Logging File MUST be considered non-corrupted. If the two values are different, the received CDNI Logging File MUST be considered corrupted.
      The behavior of the entity that received a corrupted CDNI Logging File is outside the scope of this specification; we note that the entity MAY attempt to pull the same CDNI Logging File again from the transmitting entity.

   *  occurrence: there MUST be one and only one instance of this directive. This directive MUST be the last line of the CDNI Logging File when the Non-Repudiation-Hash is absent, and MUST be the second-to-last line of the CDNI Logging File when the Non-Repudiation-Hash is present.

o  Non-Repudiation-Hash:

   *  format: <string>

   *  semantic: this directive permits the non-repudiation of the CDNI Logging File by the entity that transmitted the CDNI Logging File.

   [Editor's Note: I need help for specifying the appropriate hash, i.e., the hash must be signed with the private key of the entity transmitting the CDNI Logging File]

   *  occurrence: there MAY be one and only one instance of this directive. When present, this directive MUST be the last line of the CDNI Logging File.
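The Integrity-Hash computation and check described above can be sketched as follows in Python. This is a minimal illustration under stated assumptions, not a normative procedure. It assumes that the directive is written as "#Integrity-Hash: <value>" (by analogy with the "#Version: 1.0" syntax used in the example of Section 3.2.2), that the hash covers every byte preceding that line, including line terminators, and that a file lacking the directive is treated as corrupted. The function names are hypothetical.

```python
import hashlib

MARKER = b"#Integrity-Hash:"  # assumed "#<name>: <value>" directive syntax


def split_at_integrity_hash(data):
    """Return (bytes preceding the Integrity-Hash line, hash value or None)."""
    covered = bytearray()
    for line in data.splitlines(keepends=True):
        if line.startswith(MARKER):
            value = line[len(MARKER):].strip().decode("ascii", "replace")
            return bytes(covered), value
        covered += line
    return bytes(covered), None


def append_integrity_hash(body):
    """Transmitting side: append the directive computed over `body`."""
    digest = hashlib.md5(body).hexdigest()  # 32 hexadecimal digits
    return body + MARKER + b" " + digest.encode("ascii") + b"\n"


def is_corrupted(data):
    """Receiving side: recompute the MD5 hash and compare it to the directive."""
    covered, value = split_at_integrity_hash(data)
    if value is None:
        # Assumption: the directive is mandatory, so its absence
        # (e.g., after truncation) is reported as corruption.
        return True
    return hashlib.md5(covered).hexdigest() != value.lower()


# Hypothetical file content, for illustration only.
body = b"#Version: 1.0\n#Record-Type: cdni_http_request_v1\n"
log_file = append_integrity_hash(body)
```

Because iteration stops at the Integrity-Hash line, a trailing Non-Repudiation-Hash directive is naturally left out of the hashed bytes, as the text requires.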
3.2.  Logging Records

A CDNI Logging Record consists of a sequence of CDNI Logging Fields relating to that single CDNI Logging Record.

CDNI Logging Fields MUST be separated by the "horizontal tabulation (TAB)" character.

Some CDNI Logging Field names use a prefix scheme similar to the one used in the W3C Extended Log File Format [ELF] to facilitate readability. The semantics of the prefixes in the present document are:

o  c: refers to the User Agent that issues the request (corresponds to the "client" of the W3C Extended Log Format)

o  s: refers to the dCDN Surrogate that serves the request (corresponds to the "server" of the W3C Extended Log Format)

o  cs: refers to communication from the User Agent towards the dCDN Surrogate

o  sc: refers to communication from the dCDN Surrogate towards the User Agent

[Editor's Note: see discussion with Rob about adding definition for "r"]

An implementation of the CDNI Logging interface as per the present specification MUST support the CDNI HTTP Delivery Records as specified in Section 3.2.1.
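To make the line-oriented structure concrete, the following Python sketch reads a CDNI Logging File into directives and records. It is only an illustration: the function name parse_cdni_log and the returned data layout are hypothetical, the "#<name>: <value>" directive syntax is assumed from the example in Section 3.2.2, and the sample date, time, and address values are placeholders, since the field formats of Section 3.3 are not reproduced here. The sketch accepts both LF and CRLF line terminators, splits the Fields directive on single spaces, and splits records on TAB.

```python
def parse_cdni_log(text):
    """Return (directives, records) parsed from the text of a CDNI Logging File."""
    directives = []      # (name, value) pairs, in file order
    records = []         # one entry per CDNI Logging Record
    record_type = None   # value of the most recent Record-Type directive
    field_names = []     # names from the most recent Fields directive
    for raw in text.split("\n"):
        line = raw[:-1] if raw.endswith("\r") else raw  # accept LF or CRLF
        if not line:
            continue
        if line.startswith("#"):
            # Split on the first ":" only, so values such as URNs stay intact.
            name, _, value = line[1:].partition(":")
            name, value = name.strip(), value.strip()
            directives.append((name, value))
            if name == "Record-Type":
                record_type, field_names = value, []
            elif name == "Fields":
                field_names = value.split(" ")
            continue
        values = line.split("\t")  # fields are TAB-separated
        if len(values) != len(field_names):
            raise ValueError(
                "record has %d values but Fields lists %d names"
                % (len(values), len(field_names))
            )
        records.append(
            {"record_type": record_type, "fields": dict(zip(field_names, values))}
        )
    return directives, records


# Hypothetical file content, for illustration only.
sample = (
    "#Version: 1.0\r\n"
    "#UUID: urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6\n"
    "#Record-Type: cdni_http_request_v1\n"
    "#Fields: date time c-ip cs-method sc-status\n"
    "2013-05-17\t00:38:06.825\t198.51.100.7\tGET\t200\r\n"
    "2013-05-17\t00:39:09.145\t198.51.100.8\tGET\t404\n"
)
directives, records = parse_cdni_log(sample)
```

A record whose number of values differs from the number of names in the current Fields directive is rejected here; how a real implementation should react to such a record is not stated in this document.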
[Editor's Note: other types of delivery records will be listed here if we specify other types for this version, e.g., Request Routing]

The formats listed in this section in the form <...> are specified in Section 3.3.

3.2.1.  HTTP Request Logging Record

The HTTP Request Logging Record contains the following CDNI Logging Fields, listed by their field name:

o  date:

   *  format: <date>

   *  semantic: the date at which the processing of the request started on the Surrogate.

   *  occurrence: there MUST be one and only one instance of this field.

o  time:

   *  format: <time>

   *  semantic: the time at which the processing of the request started on the Surrogate.

   *  occurrence: there MUST be one and only one instance of this field.

o  time-taken:

   *  format: <fixed>

   *  semantic: duration, in seconds, between the start of the processing of the request and the completion of the delivery by the Surrogate.

   *  occurrence: there MUST be one and only one instance of this field.

o  c-ip:

   *  format: <address>

   *  semantic: the source IPv4 or IPv6 address (i.e., the "client" address) in the request received by the Surrogate.

   *  occurrence: there MUST be one and only one instance of this field.

o  c-port:

   *  format: <integer>

   *  semantic: the source TCP port (i.e., the "client" port) in the request received by the Surrogate.

   *  occurrence: there MUST be zero or exactly one instance of this field.

o  s-ip:

   *  format: <address>

   *  semantic: the IPv4 or IPv6 address of the Surrogate that served the request (i.e., the "server" address).

   *  occurrence: there MUST be zero or exactly one instance of this field.

o  s-hostname:

   *  format: <host>

   *  semantic: the hostname of the Surrogate that served the request (i.e., the "server" hostname).

   *  occurrence: there MUST be zero or exactly one instance of this field.

o  s-port:

   *  format: <integer>

   *  semantic: the destination TCP port (i.e., the "server" port) in the request received by the Surrogate.

   *  occurrence: there MUST be zero or exactly one instance of this field.

o  cs-method:

   *  format: <string>

   *  semantic: this is the HTTP method of the HTTP request received by the Surrogate.

   *  occurrence: there MUST be one and only one instance of this field.

o  cs-uri: [Editor's Note: rename "sr-uri"?]
   *  format: <uri>

   *  semantic: this is the absolute-URI of the request received by the Surrogate. [Editor's Note: do we agree this should be an absolute-URI even if the request uses a relative-URI?]

   *  occurrence: there MUST be zero or exactly one instance of this field.

o  ucdn-centric-uri:

   *  format: <uri>

   *  semantic: this is an absolute URI derived from the absolute-URI of the request received by the Surrogate but modified by the entity generating or transmitting the CDNI Logging Record, in a way that is agreed upon between the two ends of the CDNI Logging interface. For example, the two ends of the CDNI Logging interface could agree that the ucdn-centric-uri strips the part of the delivery-uri that exposes which individual Surrogate actually performed the delivery. The details of the modification performed to generate the ucdn-centric-uri, as well as the mechanism to agree on these modifications between the two sides of the CDNI Logging interface, are outside the scope of the present document. [Editor's Note: do we agree this should be an absolute-URI even if the request uses a relative-URI?]

   *  occurrence: there MUST be one and only one instance of this field.

o  protocol:

   *  format: <string>

   *  semantic: this is the value of the HTTP-Version field, as specified in [RFC2616], of the Request-Line of the request received by the Surrogate (e.g., "HTTP/1.1").

   *  occurrence: there MUST be one and only one instance of this field.

o  sc-status:

   *  format: <digit><digit><digit>

   *  semantic: this is the HTTP Status-Code in the HTTP response from the Surrogate.

   *  occurrence: there MUST be one and only one instance of this field.

o  sc-total-bytes:

   *  format: <integer>

   *  semantic: this is the total number of bytes of the HTTP response sent by the Surrogate in response to the request. This includes the bytes of the Status-Line (including HTTP headers) and of the message-body.

   *  occurrence: there MUST be one and only one instance of this field.

o  sc-entity-bytes:

   *  format: <integer>

   *  semantic: this is the number of bytes of the message-body in the HTTP response sent by the Surrogate in response to the request. This does not include the bytes of the Status-Line (and therefore does not include the bytes of the HTTP headers).

   *  occurrence: there MUST be zero or exactly one instance of this field.

o  cs(<HTTP-header>):

   *  format: <string>

   *  semantic: the value of the HTTP header identified in the field name, as it appears in the request processed by the Surrogate.

   *  occurrence: there MUST be zero, one, or any number of instances of this field.

o  sc(<HTTP-header>):

   *  format: <string>

   *  semantic: the value of the HTTP header identified in the field name, as it appears in the response issued by the Surrogate to serve the request.

   *  occurrence: there MUST be zero, one, or any number of instances of this field.

o  s-ccid:

   *  format: [Editor's Note: to be based on cdni-metadata or relevant companion I-D]

   *  semantic: this contains the value of the Content Collection IDentifier specified in [I-D.ietf-cdni-metadata] and associated with the content served by the Surrogate through the CDNI Metadata interface.

   *  occurrence: there MUST be zero or exactly one instance of this field.

o  s-sid:

   *  format: [Editor's Note: add reference to the I-D defining the format of Session ID?]

   *  semantic: this contains the value of the Session IDentifier specified in ??? and associated with the request served by the Surrogate.

   *  occurrence: there MUST be zero or exactly one instance of this field.

o  s-cached: [Editor's Note: W3C uses "cached". Is "s-cached" better?]

   *  format: <string>

   *  semantic: this characterises whether the Surrogate could serve the request using content already stored on its local cache. The allowed values are "0" (for miss) and "1" (for hit). "1" MUST be used when the Surrogate could serve the request using exclusively content already stored on its local cache. "0" MUST be used otherwise (including cases where the Surrogate served the request using some, but not all, content already stored on its local cache).
Note that a "0" only means a cache miss in[RFC6770],the Surrogate andLogging could includedoes not provide any informationabouton whether theenforcementcontent was already stored, or not, in another device ofthese restrictions. Therefore, to easethesupport of varied services as well asdCDN i.e. whether this was a "dCDN hit" or "dCDN miss". * occurrence: there MUST be zero or exactly one instance offuture services,this field. o s-uri-signing: * format: <string> * semantic: this characterises theLogging interface should support optional Logging Records. 7. CDNI Logging File Format Interconnected CDNs may support various Logging formats. However, they must support at leasturi signing validation performed by thedefault Logging File format described here. 7.1. Logging Files [Ed. Note: How many files (one per type of Delivery Service (e.g., HTTP, WMP) and per type of Event (e.g., Errors, Delivery, Acquisition,...?)and what would be inside... These aspects needs toSurrogate on the request. The allowed values are: * + "0" : no uri signature validation performed + "1" : uri signature validation performed and validated + "2" : uri signature validation performed and rejected * occurrence: there MUST bedetailed...] 7.2. File Formatzero or exactly one instance of this field. TheLogging file format should be independent from the selected transport protocol,"Fields" directive corresponding toguaranteeaflexible choice of transport protocols. [Ed. note: for the real timeHTTP Request Loggingexchanges,Record MUST list all the fields whose occurrence is specified above as "There MUST be one and only one instance of thismightfield". These fields MUST behard] All Logging Recordspresent in every HTTP Request Logging Record. 
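   The occurrence rule and the enumerated values above can be
   illustrated with a minimal sketch.  The function names are
   hypothetical, and the set of mandatory fields is supplied by the
   caller because the example set used here is an assumption made for
   illustration, not the normative list.

```python
# Illustrative sketch only: helper names are hypothetical, and the
# mandatory-field set passed by the caller is an assumption.

ENUMERATED_VALUES = {
    "s-cached": {"0", "1"},            # "0" = miss, "1" = hit
    "s-uri-signing": {"0", "1", "2"},  # none / validated / rejected
}


def parse_fields_directive(line):
    """Return the field names listed in a '#Fields:' directive line."""
    prefix = "#Fields:"
    if not line.startswith(prefix):
        raise ValueError("not a Fields directive: %r" % line)
    return line[len(prefix):].split()


def missing_mandatory(listed_fields, mandatory_fields):
    """Return the mandatory fields absent from the directive, sorted."""
    return sorted(set(mandatory_fields) - set(listed_fields))


def is_valid_value(field_name, value):
    """Check a value against the enumerations defined for the field.

    A dash is accepted because it conveys an unavailable value.
    Fields without an enumeration are not checked here.
    """
    if value == "-":
        return True
    allowed = ENUMERATED_VALUES.get(field_name)
    return allowed is None or value in allowed
```

   A generator of CDNI Logging Files could run such a check before
   writing the directive, and a consumer could run it before parsing
   any Logging Record.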
   The "Fields" directive corresponding to an HTTP Request Logging
   Record MAY list all the fields whose occurrence is specified above
   as "there MUST be zero or exactly one instance of this field" or
   "there MUST be zero, one or any number of instances of this field".
   The set of such fields actually listed in the "Fields" directive is
   selected by the implementation generating the CDNI Logging File
   based on agreements between the interconnected CDNs established
   through mechanisms outside the scope of this specification (e.g.
   contractual agreements).  When such a field is not listed in the
   "Fields" directive, it MUST NOT be included in the Logging Record.
   When such a field is listed in the "Fields" directive, it MUST be
   included in the Logging Record; in that case, if the value for the
   field is not available, this MUST be conveyed via a dash character
   ("-").

   The fields listed in the "Fields" directive can be listed in the
   order in which they are listed in Section 3.2.1 or in any other
   order.

   [Editor's Note: discuss private fields]

3.2.2.  CDNI Logging File Example

   #Version: 1.0

   #UUID: urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6???

   #Origin: cdni-logging-entity.dcdn.example.com

   #Record-Type: cdni_http_request_v1

   #Fields: date time time-taken c-ip cs-method ucdn-centric-uri
   protocol sc-status sc-total-bytes cs(User-Agent) cs(Referer)
   s-cached

   2013-05-17 00:38:06.825 88.958 10.5.7.1 GET
   http://cdni-ucdn.dcdn.example.com/video/movie100.mp4 HTTP/1.1 200
   672989 Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US)
   AppleWebKit/533.4 (KHTML, like Gecko) Chrome/5.0.375.127
   Safari/533.4 host1.example.com 1

   2013-05-17 00:39:09.145 169.790 10.5.10.5 GET
   http://cdni-ucdn.dcdn.example.com/video/movie118.mp4 HTTP/1.1 200
   1579920 Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US)
   AppleWebKit/533.4 (KHTML, like Gecko) Chrome/5.0.375.127
   Safari/533.4 host1.example.com 1

   2013-05-17 00:42:53.437 2.879 10.5.10.5 GET
   http://cdni-ucdn.dcdn.example.com/video/picture11.mp4 HTTP/1.0 200
   17724 Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US)
   AppleWebKit/533.4 (KHTML, like Gecko) Chrome/5.0.375.127
   Safari/533.4 host5.example.com 0

   #Integrity-Hash: 9e107d9d372bb6826bd81d3542a419d6

   [Editor's Note: include the correct MD5-hash value for the actual
   example]

3.3.  Fields and Directives Formats

   [Editor's Note: still needs work to minimise the number of types
   defined across this section and specific types defined inside the
   field definitions themselves]

   o  <digit> = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" |
      "9"

   o  <integer> = 1*<digit>

   o  <address> = <integer> [ "." *<integer> ] [ ":" <integer> ]

   o  <host> = as specified in [RFC3986].

   o  <date> = 4<digit> "-" 2<digit> "-" 2<digit>

      *  Dates are recorded in the format YYYY-MM-DD where YYYY, MM and
         DD stand for the numeric year, month and day respectively.
         All dates are specified in Universal Time Coordinated (UTC).

   o  <time> = 2<digit> ":" 2<digit> ":" 2<digit> ["." *<digit>]

      *  Times are recorded in the form HH:MM:SS or HH:MM:SS.S where HH
         is the hour in 24 hour format, MM is minutes and SS is
         seconds.  All times are specified in Universal Time
         Coordinated (UTC).

   o  <uri> = <string> containing a URI as specified in [RFC3986].

   o  <fixed> = Fixed Format Float = 1*<digit> [. *<digit>]

   o  <HTTP-header> = <string> containing an HTTP header field name
      (e.g. "User-Agent", "Referer") as specified in [RFC2616].

4.  CDNI Logging File Exchange Protocol

   This document specifies a protocol for the exchange of CDNI Logging
   Files as specified in Section 3.  This protocol comprises:

   o  a CDNI Logging feed, allowing the dCDN to notify the uCDN about
      the CDNI Logging Files that can be retrieved by that uCDN from
      the dCDN, as well as all the information necessary for retrieving
      each of these CDNI Logging Files.  The CDNI Logging feed is
      specified in Section 4.1.

   o  a CDNI Logging File pull mechanism, allowing the uCDN to obtain
      from the dCDN a given CDNI Logging File at the uCDN's
      convenience.  The CDNI Logging File pull mechanism is specified
      in Section 4.2.

   An implementation of the CDNI Logging interface as per the present
   document generating CDNI Logging Files (i.e. on the dCDN side) MUST
   support the server side of the CDNI Logging feed and the server side
   of the CDNI Logging pull mechanism.

   An implementation of the CDNI Logging interface as per the present
   document consuming CDNI Logging Files (i.e. on the uCDN side) MUST
   support the client side of the CDNI Logging feed and the client side
   of the CDNI Logging pull mechanism.

   [Editor's note: verify that the client side and server side are well
   defined in the respective sections]

   We note that implementations of the CDNI Logging interface MAY also
   support other mechanisms to exchange CDNI Logging Files, for example
   in view of exchanging logging information with minimum time-lag
   (e.g. sub-minute or sub-second) between when the event occurred in
   the dCDN and when the corresponding Logging Record is made available
   to the uCDN (e.g. for log-consuming applications requiring extremely
   fresh logging information such as near-real-time content delivery
   monitoring).  Such mechanisms might be defined in a future version
   of the present document.

4.1.  CDNI Logging Feed

   [Editor's Note: text to be added.  Feed is based on ATOM and
   contains a UUID + URI for each CDNI Logging File in "window" - if
   appropriate the text should refer to the side generating the CDNI
   Logging Feed as "server-side", and the side consuming the Feed as
   the "client-side"].

4.2.  CDNI Logging File Pull

   A client-side implementation of the CDNI Logging interface MAY pull
   at its convenience any CDNI Logging File that is advertised by the
   server-side in the CDNI Logging Feed.  To do so, the client-side:

   o  MUST use HTTP v1.1

   o  SHOULD use TLS (i.e. use what is loosely referred to as "HTTPS")

   o  MUST use the URI associated to the CDNI Logging File in the CDNI
      Logging Feed

   o  SHOULD indicate the compression schemes it supports

   Note that a client-side implementation of the CDNI Logging interface
   MAY pull a CDNI Logging File that it has already pulled, as long as
   the file is still advertised by the server-side in the CDNI Logging
   Feed.
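   The client-side behaviour can be sketched as follows.  This is a
   minimal illustration under stated assumptions, not a reference
   implementation: the function names are hypothetical, the file URI is
   assumed to have been read from the CDNI Logging Feed, and the
   supported compression schemes are assumed to be signalled with the
   HTTP Accept-Encoding header and applied as a Content-Encoding
   (whether content-encoding or transfer-encoding is intended remains
   an open question in this document).

```python
import gzip
import urllib.request
import zlib


def build_pull_request(file_uri):
    """Build the GET request for a URI advertised in the feed.

    An "https" URI makes urllib use TLS.  The Accept-Encoding header
    is how this sketch indicates the supported compression schemes.
    """
    return urllib.request.Request(
        file_uri,
        headers={"Accept-Encoding": "gzip, deflate"},
        method="GET",
    )


def decode_body(body, content_encoding):
    """Undo the compression scheme named by Content-Encoding, if any."""
    encoding = (content_encoding or "identity").strip().lower()
    if encoding == "gzip":
        return gzip.decompress(body)
    if encoding == "deflate":
        # HTTP "deflate" is the zlib format, which zlib handles directly.
        return zlib.decompress(body)
    if encoding == "identity":
        return body
    raise ValueError("unsupported compression scheme: %s" % encoding)


def pull_logging_file(file_uri):
    """Pull one CDNI Logging File and return its uncompressed bytes."""
    request = build_pull_request(file_uri)
    with urllib.request.urlopen(request) as response:
        encoding = response.headers.get("Content-Encoding")
        return decode_body(response.read(), encoding)
```

   Since a file may be pulled again while it is still advertised, a
   client could call pull_logging_file() repeatedly with the same URI.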
   The server-side implementation MUST respond to any valid pull
   request by a client-side implementation for a CDNI Logging File
   advertised by the server-side in the CDNI Logging Feed.  The
   server-side implementation:

   o  MUST handle the client-side request as per HTTP v1.1

   o  MUST include the CDNI Logging File identified by the request URI
      inside the body of the HTTP response

   o  MUST support the gzip and deflate compression schemes

   o  MAY support other compression schemes

   o  when the client-side request indicates client-supported
      compression schemes, SHOULD use a compression scheme that it
      supports and is supported by the client-side

   [Editor's Note: discuss Non-Repudiation: it is a nice-to-have and
   how it could be supported, via a different digest than the one for
   integrity]

5.  Open Issues

   o  The proposed format for Date and Time is based on W3C and is only
      in UTC.  Is this all OK?  RFC 5322 (Section 3.3) format could be
      used or ISO 8601 formatted date and time in UTC (same format as
      proposed in [draft-caulfield-cdni-metadata-core-00]).  Also see
      RFC5424 Section 6.2.3.  We currently use same field names as W3C
      since we have same definition.

   o  (comment from Kevin) how are errors handled?  If the client gets
      handed a bunch of 403s and 404s, but still gets the content
      eventually, without triggering an event, are those still logged?
      For Bytes-Sent, if there were aborted requests, do those get
      counted as well?  Not all client behavior can be correlated with
      the simplified log

   o  Do we need to specify Logs for Request Routing performed by dCDN?
      Observation: Probably can be generalized to the requirement for
      "event" logging (e.g. dCDN request Router not able to redirect,
      dCDN cannot acquire metadata, dCDN cannot acquire content, "dCDN
      Busy Tone").  Recommendation: Try first to specify what events
      and what information need to be exchanged.  Depending on
      progress include in initial logging spec or not, i.e. handle as a
      [MED] requirement.

   o  Privacy: do we need some explicit support of IP address masking
      by dCDN to uCDN, or is it OK to assume that uCDN is to keep this
      info confidential (like dCDN is assumed to do already)?

   o  definition of field prefixes: add "r" is uCDN.  This one is less
      clear to me.  I need to see how you propose to use "r" below,
      before I can agree.  (Just for my own notes, I thought "r" could
      be used if the dCDN Surrogate was going to Log something related
      to acquisition of content by the dCDN Surrogate from some content
      source.  Also, in a delivery log generated by a dCDN Surrogate,
      how can it know about acquisition from uCDN that can be done by
      other devices than the dCDN Surrogate).  "ucdn-centric-uri":
      ROB> going back to the definitions of s/c/r suggested above, for
      a CDNI logfile field would then just be "sr-uri".  So we don't
      need to invent a new prefix for CDNI, we can use the basic w3c
      naming?  FRANCOIS: I am OK to use "sr-uri" as long as we feel
      confident that we will never need Surrogate to log information
      about how it acquires from within the dCDN (ie regular use of "r"
      prefix).  Are we confident?
   o  Do we need Record-Type as File Directive?: ROB> Is this needed -
      would a record type per file do the job? ... if we don't allow
      mixed record types, we can include the record type in the ATOM
      feed (to allow the reader to decide whether there might be
      records it's interested in without getting the logfile).  I can't
      think of a reason to mix, (for example) http/rtmp records, or
      delivery/req-routing.  Different things are likely to be
      generating those records anyway.  A version change can always be
      done by starting a new file.  <Francois> Here are a couple
      potential use cases for mixing record types in a single file:

      *  we later define "cdni_has_delivery_v1" record types for HTTP
         Adaptive BitRate sessions.  Then a dCDN Surrogate will be
         generating a continuous mixture of "cdni_http_request_v1"
         records for PDL requests and "cdni_has_request_v1" records for
         HAS sessions.  Why should we be forced to break those?

      *  we later define some record types for events taking place on
         Surrogates, which can happen any time in the middle of
         sessions.  Why should we be forced to break those into
         separate files?

      It seems wise to keep the flexibility in the File structure to
      allow the mix in the future.  And the overhead is very small
      since it is encoded in a Directive.

   o  Integrity-Hash: ROB> draft-snell-atompub-link-extensions adds a
      hash of the resource to the ATOM feed (not sure about the status
      of that doc, looks like it's stalled a bit).  But if we include
      that in the ATOM feed, the value in the feed would need to
      include this Integrity-Hash in the log file itself, which might
      mean re-calculating the hash (especially if the feed is not
      generated in the same place as the logfile).  So we probably only
      want one of the two?  I think my preference would be to keep it
      in the feed, saves any complications about what to hash (just
      running "md5sum" on a downloaded logfile would work, rather than
      needing to remove the last line).
      The draft-snell also allows other hashes, "sha1" and so on - for
      cdni interoperability, we could limit it to md5 or stick with
      draft-snell's base set.  <Francois> Very good point.  I agree we
      should probably want one of the two in a typical deployment.
      Leveraging draft-snell-atompub-link-extensions is attractive
      because it leverages generic ATOM features and expertise.  It has
      the potential drawback of introducing a dependency on a document
      that may be published later (or potentially never since it is not
      even a WG doc).  Defining our own hash in the file is attractive
      because we can be done right away, and there could be simple
      short term implementations that start using the CDNI Logging File
      without relying on the ATOM Feed.  At the same time we don't want
      to end up with two redundant hashes eventually.  How about an
      approach where:

      *  we define a simple MD5 hash only, and make it optional

      *  when there is no other mechanism to get the hash, it can be
         included in the file

      *  when there are other mechanisms (e.g.
         draft-snell-atompub-link-extensions), it is not included in
         the file.

   o  Compression: <Ben> When we say the server MUST support gzip &
      deflate we probably need to think through whether we mean
      content-encoding, transfer-encoding or both.
      The semantics get a little confusing so we probably just need to
      think them through to ensure we allow a server to store
      compressed logs as well as transmit them compressed.

6.  IANA Considerations

   TBD

7.  Security Considerations

7.1.  Authentication, Confidentiality, Integrity Protection

   The use of TLS for transport of the CDNI Logging feed mechanism
   (Section 4.1) and CDNI Logging File pull mechanism (Section 4.2)
   allows:

   o  the dCDN and uCDN to authenticate each other (to ensure they are
      transmitting/receiving CDNI Logging Files from an authenticated
      CDN)

   o  the CDNI Logging information to be transmitted with
      confidentiality

   o  the integrity of the CDNI Logging information to be protected
      during the exchange.

   The Integrity-Hash directive inside the CDNI Logging File provides
   additional integrity protection, this time targeting potential
   corruption of the CDNI Logging information during the CDNI Logging
   File generation.  This mechanism does not allow restoration of the
   corrupted CDNI Logging information, but it allows detection of such
   corruption and therefore triggering of appropriate corrective
   actions (e.g. discard of corrupted information, attempt to re-obtain
   the CDNI Logging information).

7.2.  Non Repudiation

   The Non-Repudiation-Hash directive in the CDNI Logging File allows
   support of non-repudiation of the CDNI Logging File by the dCDN.
   The optional Non-Repudiation-Hash can be used on the CDNI Logging
   interface where needed.

7.3.  Privacy

   CDNs have the opportunity to collect detailed information about the
   downloads performed by End-Users.  The provision of this information
   to another CDN introduces End-Users privacy protection concerns.

   [Editor's Note: see list of open questions]

8.  Acknowledgments

   This document borrows from the W3C Extended Log Format [ELF].

   The authors would like to thank Sebastien Cubaud, Pawel Grochocki,
   Christian Jacquenet, Yannick Le Louedec, Anne Marrec and Emile
   Stephan for their contributions on early versions of this document.

   The authors would like also to thank Rob Murray, Fabio Costa, Sara
   Oueslati, Yvan Massot, Renaud Edel, and Joel Favier for their input
   and comments.

   Finally, they thank the contributors of the EU FP7 OCEAN project for
   valuable inputs.

9.  References

9.1.  Normative References

   [I-D.ietf-cdni-metadata]
              Niven-Jenkins, B., Murray, R., Watson, G., Caulfield, M.,
              Leung, K., and K. Ma, "CDN Interconnect Metadata",
              draft-ietf-cdni-metadata-01 (work in progress), February
              2013.

   [RFC1321]  Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321,
              April 1992.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.
   [RFC2616]  Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
              Masinter, L., Leach, P., and T. Berners-Lee, "Hypertext
              Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.

   [RFC3986]  Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
              Resource Identifier (URI): Generic Syntax", STD 66, RFC
              3986, January 2005.

   [RFC4122]  Leach, P., Mealling, M., and R. Salz, "A Universally
              Unique IDentifier (UUID) URN Namespace", RFC 4122, July
              2005.

   [RFC5424]  Gerhards, R., "The Syslog Protocol", RFC 5424, March
              2009.

9.2.  Informative References

   [CHAR_SET] "IANA Character Sets registry", <http://www.iana.org/
              assignments/character-sets/character-sets.xml>.

   [ELF]      Phillip M. Hallam-Baker and Brian Behlendorf, "Extended
              Log File Format, W3C (work in progress),
              WD-logfile-960323",
              <http://www.w3.org/TR/WD-logfile.html>.

   [I-D.brandenburg-cdni-has]
              Brandenburg, R., Deventer, O., Faucheur, F., and K.
              Leung, "Models for adaptive-streaming-aware CDN
              Interconnection", draft-brandenburg-cdni-has-05 (work in
              progress), April 2013.

   [I-D.ietf-cdni-framework]
              Peterson, L. and B. Davie, "Framework for CDN
              Interconnection", draft-ietf-cdni-framework-03 (work in
              progress), February 2013.

   [I-D.ietf-cdni-requirements]
              Leung, K. and Y. Lee, "Content Distribution Network
              Interconnection (CDNI) Requirements",
              draft-ietf-cdni-requirements-06 (work in progress), April
              2013.

   [RFC6707]  Niven-Jenkins, B., Le Faucheur, F., and N. Bitar,
              "Content Distribution Network Interconnection (CDNI)
              Problem Statement", RFC 6707, September 2012.

   [RFC6770]  Bertrand, G., Stephan, E., Burbridge, T., Eardley, P.,
              Ma, K., and G. Watson, "Use Cases for Content Delivery
              Network Interconnection", RFC 6770, November 2012.

Appendix A.  Requirements

A.1.  Compliance with cdni-requirements

   This section checks that all the identified requirements in Section
   7 of [I-D.ietf-cdni-requirements] are fulfilled by this document.

   [Editor's note: to be written later]

A.2.  Additional Requirements

   This section identifies additional requirements that must also be
   met.

   [Editor's note: How do we incorporate this info into the I-D: in
   appendix? in main body? does it remain after publication or is
   temporary?]

A.2.1.  Timeliness

   Some applications consuming CDNI Logging information, such as
   accounting or trend analytics, only require logging information to
   be available with a timeliness of the order of a day or the hour.
   This document focuses on addressing this requirement.

   Some applications consuming CDNI Logging information, such as
   real-time analytics, require logging information to be available in
   real-time (i.e. of the order of a second after the corresponding
   event).  This document leaves this requirement out of scope.

A.2.2.  Reliability

   CDNI logging information must be transmitted reliably.  The
   transport protocol should contain an anti-replay mechanism.

A.2.3.  Security

   CDNI logging information exchange must allow authentication,
   integrity protection, and confidentiality protection.  Also, a
   non-repudiation mechanism is mandatory, and the transport protocol
   should support it.

A.2.4.  Scalability

   CDNI logging information exchange must support large scale
   information exchange, particularly so in the presence of HTTP
   Adaptive Streaming.
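   To give a feel for the orders of magnitude involved, the following
   sketch estimates Logging Record rates.  It is a back-of-the-envelope
   illustration: the function names are hypothetical, and it assumes
   that every delivered object (file, chunk or manifest) produces one
   Logging Record and that each client receives 2 Mb/s.

```python
def records_per_second_per_client(event_periods_s):
    """Sum the rates of all periodic logged events for one client.

    Each entry is the period, in seconds, between two Logging Records
    of the same kind (e.g. one video chunk every 2 seconds).
    """
    return sum(1.0 / period for period in event_periods_s)


def aggregate_rate(per_client_rate, clients):
    """Logging Records per second for a population of clients."""
    return per_client_rate * clients


def rate_per_gbps(per_client_rate, client_mbps):
    """Logging Records per second for every Gb/s of delivered traffic."""
    clients_per_gbps = 1000.0 / client_mbps
    return per_client_rate * clients_per_gbps


# Progressive download: one record per 10-minute (600 s) delivery.
pdl = records_per_second_per_client([600])
# Adaptive streaming: video and audio chunks every 2 s, manifest every 10 s.
has = records_per_second_per_client([2, 2, 10])

print("PDL, 100,000 clients: %.2f records/s" % aggregate_rate(pdl, 100000))
print("PDL, per Gb/s:        %.2f records/s" % rate_per_gbps(pdl, 2))
print("HAS, 100,000 clients: %.2f records/s" % aggregate_rate(has, 100000))
print("HAS, per Gb/s:        %.2f records/s" % rate_per_gbps(has, 2))
```

   Under these assumptions adaptive streaming generates several hundred
   times more Logging Records than progressive download for the same
   delivered traffic.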
   For example, if we consider a client pulling HTTP Progressive
   Download content with an average duration of 10 minutes, this
   represents 1/600 CDNI delivery Logging Records per second.  If we
   assume the dCDN is simultaneously serving 100,000 such clients on
   behalf of the uCDN, the dCDN will be generating 167 Logging Records
   per second to be communicated to the uCDN over the CDNI Logging
   interface.  Or equivalently, if we assume an average delivery rate of
   2 Mb/s per client (that is, 500 clients per Gb/s), the dCDN generates
   0.83 CDNI Logging Records per second for every Gb/s of streaming on
   behalf of the uCDN.

   For example, if we consider a client pulling HAS content and
   receiving a video chunk every 2 seconds, a separate audio chunk every
   2 seconds, and a refreshed manifest every 10 seconds, this represents
   1.1 delivery Logging Records per second (0.5 + 0.5 + 0.1).  If we
   assume the dCDN is simultaneously serving 100,000 such clients on
   behalf of the uCDN, the dCDN will be generating 110,000 Logging
   Records per second to be communicated to the uCDN over the CDNI
   Logging interface.  Or equivalently, if we assume an average delivery
   rate of 2 Mb/s per client, the dCDN generates 550 CDNI Logging
   Records per second for every Gb/s of streaming on behalf of the uCDN.

A.2.5.  Consistency between CDNI Logging and CDN Logging

   There are benefits in using a CDNI logging format as close as
   possible to the intra-CDN logging formats commonly used in CDNs
   today, in order to minimize systematic translation at the CDN/CDNI
   boundary.

A.2.6.  Dispatching/Filtering

   When a CDN is acting as a dCDN for multiple uCDNs, the dCDN needs to
   dispatch each CDNI Logging Record to the uCDN that redirected the
   corresponding request.  The CDNI Logging format needs to allow, and
   possibly facilitate, such dispatching.

   a "store log", which covers the objects currently kept on disk or
   removed ones, typically for debugging purposes.

Appendix B.  Requirements

B.1.  Additional Requirements

   Section 7 of [I-D.ietf-cdni-requirements] already specifies a set of
   requirements for Logging (LOG-1 to LOG-16).  Some security
   requirements also affect Logging (e.g., SEC-4).  This section is a
   placeholder for requirements identified in the work on logging,
   before they are proposed to the requirements draft authors.

   Logging data is sensitive as it provides the raw material for
   producing bills, etc.  Therefore, the protocol delivering Logging
   data must be reliable to avoid information loss.  In addition, the
   protocol must scale to support the transport of large amounts of
   Logging data.

   CDNs need to trust Logging information; thus, they want to know:

   o  who issued the Logging (authentication), and

   o  if the Logging has been modified by a third party (integrity).

   Logging also contains confidential data, and therefore, it should be
   protected from eavesdropping.

   All these needs translate into security requirements on both the
   Logging data format and on the Logging protocol.

   Finally, this protocol must comply with the requirements identified
   in [I-D.ietf-cdni-requirements].

   [Ed. note: cf. requirements draft: "SEC-4 [MED] The CDNI solution
   should be able to ensure that the Downstream CDN cannot spoof a
   transaction log attempting to appear as if it corresponds to a
   request redirected by a given Upstream CDN when that request has not
   been redirected by this Upstream CDN.  This ensures non-repudiation
   by the Upstream CDN of transaction logs generated by the Downstream
   CDN for deliveries performed by the Downstream CDN on behalf of the
   Upstream CDN."]

B.2.  Compliancy with Requirements draft

   This section checks that all the identified requirements in the
   Requirements draft are fulfilled by this document.

   [Ed. note: to be written later]

Appendix B.  Analysis of candidate protocols for Logging Transport

   This section will be expanded later with an analysis of alternative
   candidate protocols for transport of CDNI Logging in non-real-time as
   well as real-time.

B.1.  Syslog

   [Ed. note: to be written later]

B.2.  XMPP

   [Ed. note: to be written later]

B.3.
SNMP

Authors' Addresses

   Gilles Bertrand (editor)
   France Telecom - Orange
   38-40 rue du General Leclerc
   Issy les Moulineaux 92130
   FR

   Phone: +33 1 45 29 89 46
   Email: gilles.bertrand@orange.com


   Iuniana Oprescu (editor)
   France Telecom - Orange
   38-40 rue du General Leclerc
   Issy les Moulineaux 92130
   FR

   Phone: +33 6 89 06 92 72
   Email: iuniana.oprescu@orange.com


   Stephan Emile
   France Telecom - Orange
   2 avenue Pierre Marzin
   Lannion F-22307
   France

   Email: emile.stephan@orange.com


   Francois Le Faucheur (editor)
   Cisco Systems
   E.Space Park - Batiment D
   6254 Allee des Ormes - BP 1200
   Mougins cedex 06254
   FR

   Phone: +33 4 97 23 26 19
   Email: flefauch@cisco.com


   Roy Peterkofsky
   Skytide, Inc.
   One Kaiser Plaza, Suite 785
   Oakland CA 94612
   USA

   Phone: +01 510 250 4284
   Email: roy@skytide.com


   Francois Le Faucheur (editor)
   Cisco Systems
   Greenside, 400 Avenue de Roumanille
   Sophia Antipolis 06410
   FR

   Phone: +33 4 97 23 26 19
   Email: flefauch@cisco.com


   Pawel Grochocki
   Orange Polska
   ul. Obrzezna 7
   Warsaw 02-691
   Poland

   Email: pawel.grochocki@orange.com