[Docs] [txt|pdf|xml|html] [Tracker] [Email] [Nits]

Versions: 00 01

Applications Area Working Group                               F. Echtler
Internet-Draft                          Munich Univ. of Applied Sciences
Intended status: Informational                             March 7, 2011
Expires: September 8, 2011

            GISpL: Gestural Interface Specification Language


   This document introduces GISpL, the Gestural Interface Specification
   Language.  GISpL enables the unambiguous description of gestures used
   in human-computer interfaces.  This includes gestures on touch and
   multi-touch screens, with digital pens, with hand-held controllers or
   in free air.  A matching engine analyzes motion data produced by the
   input device(s) and triggers registered gestures.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 8, 2011.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as

Echtler                 Expires September 8, 2011               [Page 1]

Internet-Draft                    GISpL                       March 2011

   described in the Simplified BSD License.

Echtler                 Expires September 8, 2011               [Page 2]

Internet-Draft                    GISpL                       March 2011

1.  Introduction

   This document describes GISpL (Gestural Interface Specification
   Language), a formal language for describing human-computer interfaces
   which use gestures.  The term "gesture" is used in a very wide sense
   here, meaning any motion of the user which can be captured by an
   input device.

1.1.  Motivation

   As novel types of human-computer interfaces such as multi-touch
   screens, digital pens, hand-held controllers or even free-air
   gestures become more and more common, so does the number of special-
   purpose applications built to interact with these devices.

   Using a dedicated formal language to describe the various types of
   gestural interaction which are possible with an application has
   advantages for various distinct groups:

   Users   End users of gestural applications benefit in two ways: the
           behavior of such applications becomes more consistent across
           different platforms and the users are given the option to
           modify the behavior to suit their needs.

   Researchers  When the behavior of a gestural interface is specified
           formally, it is easier for user interface researchers to
           compare different interfaces.  Also, prototypical
           implementations of such interfaces are easier to modify and
           adapt during the design, development and evaluation cycle.

   Developers  Since the formal specification enables a dedicated
           matching engine to perform the actual detection and matching
           of gestures, developers are spared tedious tasks such as
           constant reimplementation of basic gesture matching

1.2.  Design Considerations

   GISpL serves two purposes:

   o  describe what motions the users have to execute in order to
      trigger a certain response

   o  deliver information about the actual execution of these motions to
      the application

   Moreover, GISpL should be usable across a very wide range of
   platforms, even unconventional ones.  One important example is the

Echtler                 Expires September 8, 2011               [Page 3]

Internet-Draft                    GISpL                       March 2011

   Firefox web browser which has recently acquired the capability to
   deliver multi-touch events to web applications.

   Consequently, the requirements regarding GISpL are:

   o  must be human-readable

   o  must be machine-readable

   o  must be supported on a wide range of platforms

   o  must be lightweight for low-latency network transmission

   Therefore, the decision was made to base GISpL on JSON [RFC4627].
   Compared to XML, JSON gives a better balance between readability for
   humans and code size.  It is supported in nearly all programming
   languages and platforms.

Echtler                 Expires September 8, 2011               [Page 4]

Internet-Draft                    GISpL                       March 2011

2.  Specification

   GISpL consists of three core elements: regions, gestures and

   For the full formal ABNF specifictation, please see the appendix.
   This section will use a relaxed syntax for easier readability, with
   rule names enclosed by angle braces < >.  JSON-related markup ( { } ,
   [ ] "" ) will be reproduced verbatim.

2.1.  Definitions

   In the context of GISpL, the terms _input object_ and _input event_
   will be used.  An _input object_ is any physical object which is
   detectable by the input hardware, such as a mouse, a Wiimote or the
   user's hand.  Input objects are classified into one of 32 categories
   as given in the appendix.  Every input object is given a unique ID
   number according to the capabilities of the hardware.  I.e., a
   uniquely identifiable tangible object always has the same ID, while
   anonymous touch points on an interactive surface will have IDs so
   that each touch point is uniquely identifiable during the duration of
   the touch.  An _input event_ is a single spatial measurement
   regarding the position (and optionally, orientation, dimensions etc.)
   of an input object as captured by one or more of the sensors used.

2.2.  Region

   Regions define spatial areas in which a certain set of gestures is
   valid and which capture motion data that falls within their
   boundaries.  Regions are defined in one of two reference coordinate

   o  A 2D screen coordinate system where the x and y coordinates range
      from 0.0 to 1.0 across the physical extension of the user's
      screen, with the origin being in the lower left corner.  This is
      indicated by the region flag "poly".  The list of boundary points
      is interpreted as a closed polygon in the x-y plane.  The z
      coordinate is not used.

   o  A 3D metric coordinate system with an arbitrary reference point as
      origin (usually defined by the input device).  This is indicated
      by the region flag "hull".  The list of boundary points is
      interpreted as defining the convex hull of the region.

   Regions are managed in an ordered list.  Arriving input events are
   checked against this list, starting from the first element.  If the
   position of the input event falls within the boundaries of the region
   _and_ if the bit in the filter bitmask which corresponds to the

Echtler                 Expires September 8, 2011               [Page 5]

Internet-Draft                    GISpL                       March 2011

   category of the input event is set, the the input event is captured
   by the region.  Captured events and their history are subsequently
   used to check whether one or more of the gestures attached to this
   region have been triggered.

   point     = [ <number>, <number>, <number> ]
   pointlist = [ null/<point> *( , <point> ) ]

   regionflags = "poly" / "hull"

   region = {

2.3.  Gesture

   Gestures appear in two variants: either as gesture specifications
   (describing what motions the user has to execute for a certain
   effect) or as gesture events (describing a) the fact that a matching
   motion event just took place and b) the motion data which triggered
   the event).

   Whether a gesture object is a gesture event can be determined from
   looking at the flags.  Should they contain the string "result", then
   it is a gesture event and the features composing this gesture will
   contain valid data in their "result" array.  Otherwise, the object is
   a gesture specification and the features will contain valid data in
   their "constraints" array.

   When a gesture has the "oneshot" flag, then it can only be triggered
   once by a given set of input IDs.  Repeated triggering is only
   possible when the set of IDs captured by the containing region

   When a gesture has the "default" flag set, it is added to a pool of
   default gestures.  When a gesture specification with an empty feature
   list is encountered, then the name is looked up in the default
   gesture pool and if a match is found, the feature list is copied from
   the default gesture.  Otherwise, the empty gesture is ignored.

Echtler                 Expires September 8, 2011               [Page 6]

Internet-Draft                    GISpL                       March 2011

   gesturelist  = [ null/<gesture> *( , <gesture> ) ]
   gestureflag  = "oneshot" / "default" / "result"
   gestureflags = [ null/<gestureflag> (* , <gestureflag> ) ]

   gesture = {

2.4.  Feature

   are the building blocks of gestures and are atomic mathematical
   properties of the raw motion data, such as the average motion vector
   or the diameter of the convex hull of all motion points.

   featurelist = [ null/<feature> *( , <feature> ) ]
   featureitem = <point>/<number>
   itemlist    = [ null/<featureitem>  *( , <featureitem>  ) ]

   feature = {

   A feature is classified by its type, a filter bitmask for input
   events as described in Section 2.2, a type-specific list of
   constraints and a type-specific list of results.  Depending on
   whether the containing gesture is a "result" gesture or not (see
   Section 2.3), either the constraint list or the result list will be
   empty.  All temporal relations (such as motion vector) are expressed
   relative to a time unit of one sensor frame, i.e. for a sensor setup
   running at 60 Hz, the temporal unit will be 16.66 ms.

   Features are divided into two groups.  The first one are _single-
   match_ features, which will only generate a single result instance
   regardless of the number of input objects.

Echtler                 Expires September 8, 2011               [Page 7]

Internet-Draft                    GISpL                       March 2011

   |  Feature |  Constraint |  Result  |          Description          |
   |   Type   |    Types    |   Types  |                               |
   |  Motion  |   <point>,  |  <point> |  Average motion vector, lower |
   |          |   <point>   |          |      and upper limits per     |
   |          |             |          |           component           |
   |          |             |          |                               |
   | Rotation |  <number>,  | <number> |   Relative rotation angle in  |
   |          |   <number>  |          |  rad, lower and upper limits  |
   |          |             |          |                               |
   |   Scale  |  <number>,  | <number> |  Relative size change, lower  |
   |          |   <number>  |          |        and upper limits       |
   |          |             |          |                               |
   |   Path   | <point> *(, | <number> |    Match for template path    |
   |          |   <point>)  |          |                               |
   |          |             |          |                               |
   |   Count  |  <number>,  | <number> |    Number of input objects,   |
   |          |   <number>  |          |     lower and upper limits    |
   |          |             |          |                               |
   |   Delay  |  <number>,  | <number> |  Number of frames since first |
   |          |   <number>  |          |   object entered, lower and   |
   |          |             |          |          upper limits         |

                           Single-Match Features

                                  Table 1

   The second group of features are _multi-match_ features, which will
   potentially generate one result instance for each input object
   (depending on the constraint values).

Echtler                 Expires September 8, 2011               [Page 8]

Internet-Draft                    GISpL                       March 2011

   |   Feature Type  |  Constraint  |   Result  |      Description     |
   |                 |     Types    |   Types   |                      |
   |     ObjectID    |   <number>,  |  <number> |    Match range of    |
   |                 |   <number>   |           | unique IDs (e.g. for |
   |                 |              |           |   tangible objects)  |
   |                 |              |           |                      |
   |   ObjectParent  |   <number>,  |  <number> |    Match range of    |
   |                 |   <number>   |           |   unique parent IDs  |
   |                 |              |           |   (e.g. for fingers  |
   |                 |              |           |     belonging to     |
   |                 |              |           |    specific user)    |
   |                 |              |           |                      |
   |  ObjectPosition |   <point>,   |  <point>  |  Position, lower and |
   |                 |    <point>   |           |   upper limits per   |
   |                 |              |           |       component      |
   |                 |              |           |                      |
   | ObjectDimension |   <point>,   |  <point>, |  Object dimensions,  |
   |                 |   <point>,   |  <point>, |    lower and upper   |
   |                 |   <point>,   |  <number> | limits per component |
   |                 |   <point>,   |           |  (major axis, minor  |
   |                 |   <number>,  |           |      axis, size)     |
   |                 |   <number>   |           |                      |
   |                 |              |           |                      |
   |   ObjectGroup   |   <number>,  | <number>, | Match group of input |
   |                 |   <number>,  |  <point>  | objects (min. count, |
   |                 |   <number>   |           |   max. count, max.   |
   |                 |              |           |   radius). Result =  |
   |                 |              |           |    count/centroid    |

                           Multi-Match Features

                                  Table 2

Echtler                 Expires September 8, 2011               [Page 9]

Internet-Draft                    GISpL                       March 2011

3.  Runtime Behavior

   This section details the behavior of the gesture matching engine.

   The following steps are performed every time an input event has been

   1.  ...

   2.  ...

   The following steps are performed every time after a full set of
   input events has been received:

   1.  ...

   2.  ...

Echtler                 Expires September 8, 2011              [Page 10]

Internet-Draft                    GISpL                       March 2011

4.  Examples

   For illustration purposes, this section gives some usage examples of


Echtler                 Expires September 8, 2011              [Page 11]

Internet-Draft                    GISpL                       March 2011

5.  Security Considerations

   Gesture events, if intercepted while in transit over the network, may
   disclose private data about the users' actions.  Consequently, they
   should preferably be transmitted over secure channels such as private
   networks, VPNs or SSL-encrypted links.

Echtler                 Expires September 8, 2011              [Page 12]

Internet-Draft                    GISpL                       March 2011

6.  IANA Considerations

   This document has no actions for IANA.

Echtler                 Expires September 8, 2011              [Page 13]

Internet-Draft                    GISpL                       March 2011

7.  References

7.1.  Normative References

   [RFC5234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234, January 2008.

7.2.  Informative References

   [RFC4627]  Crockford, D., "The application/json Media Type for
              JavaScript Object Notation (JSON)", RFC 4627, July 2006.

Echtler                 Expires September 8, 2011              [Page 14]

Internet-Draft                    GISpL                       March 2011

Appendix A.  Formal GISpL Specification

   The following formal specification of GISpL is given in ABNF
   [RFC5234].  JSON-related rule definitions (e.g. <number>, <string>
   ...) are to be found in [RFC4627].

Echtler                 Expires September 8, 2011              [Page 15]

Internet-Draft                    GISpL                       March 2011

quote = quotation-mark

point = begin-array
        number value-separator number value-separator number

item  = point / number

itemlist    = begin-array item    *( value-separator item    ) end-array
pointlist   = begin-array point   *( value-separator point   ) end-array
gesturelist = begin-array gesture *( value-separator gesture ) end-array
featurelist = begin-array feature *( value-separator feature ) end-array

filters     = quote "filter" quote name-separator number value-separator
regionid    = quote "id" quote name-separator string value-separator
regionflags = quote "flags" quote name-separator
              ( ( quote "poly" quote ) / ( quote "hull" quote ) )

gesturename  = quote "name" quote name-separator string value-separator
gestureflag  = ( quote "oneshot" quote ) / ( quote "default" quote )
gestureflags = quote "flags" quote name-separator
               begin-array gestureflag
               *( value-separator gestureflag )
               end-array value-separator

featuretype  = quote "type" quote name-separator string value-separator
results      = quote "result" quote name-separator itemlist
constraints  = quote "constraints" quote name-separator
               itemlist value-separator

feature      = begin-object
               featuretype constraints results

gesture      = begin-object
               gesturename gestureflags featurelist

region       = begin-object
               regionid regionflags filters
               pointlist value-separator gesturelist

                                 Figure 1

Echtler                 Expires September 8, 2011              [Page 16]

Internet-Draft                    GISpL                       March 2011

Appendix B.  Input Event Types

   TBD - see TUIO 2.0 spec

Echtler                 Expires September 8, 2011              [Page 17]

Internet-Draft                    GISpL                       March 2011

Author's Address

   Florian Echtler
   Munich Univ. of Applied Sciences
   Lothstr. 64
   Munich  80336

   Phone: +49 89 1265 3736
   Email: floe@butterbrot.org
   URI:   http://floe.butterbrot.org/

Echtler                 Expires September 8, 2011              [Page 18]

Html markup produced by rfcmarkup 1.126, available from https://tools.ietf.org/tools/rfcmarkup/