[wg11] Part 28 features we could lose

Tue Jun 8 15:49:56 EDT 2004

I offer the following as a guide to what one might want to do to 
simplify Part 28.

Not all features of the EXPRESS language need to be captured in the XML 
schemas.  I would divide the EXPRESS features into a "required" group, 
an "at most optional" group, and an "impossible" group:

Required:
  - entities and attributes
  - SUBTYPE and simple inheritance of attributes
  - multiple inheritance
  - ANDOR subtype overlaps
  - entity-valued attributes "by reference" (by identifier)
  - entity-valued attributes "by value"
  - primitive types
  - aggregation data types
  - ENUMERATION types
  - SELECT types
  - defined data types

At most optional:
  - ABSTRACT entity
  - references to entity instances not in the data set
  - INVERSE attributes
  - DERIVEd attributes (value only)
  - attribute redeclaration
  - UNIQUE rules
  - restrictions of STRING and BINARY data types (FIXED and max length)
  - constraints on the sizes of aggregate values
  - ordering and uniqueness in aggregates (SET/BAG vs. LIST)
  - ARRAY OF OPTIONAL
  - "references" to values that are not entity instances
  - specialization rules in defined data types

Impossible:
  - USE/REFERENCE (you can't import the XML schema for the interfaced
    EXPRESS schema selectively, and in general, you can't subtype any of
    its entity data types or extend EXTENSIBLE types)
  - substitution of specializations
  - WHERE clauses
  - expressions and FUNCTIONs
  - SUBTYPE CONSTRAINTs and SUPERTYPE clauses
  - EXTENSIBLE SELECT (only the extended SELECT can be mapped)
  - EXTENSIBLE ENUMERATION (only the extended ENUM can be mapped)
  - attribute RENAME (only the new name is available)

Doing the Required list elegantly requires all of the following features 
of XML schema:
- XML elements
   (for EXPRESS entities and attributes, and for some instances of
   non-entity data types)
- XML attributes
   (for instance ids and other "metadata")
- XML data types
   (for EXPRESS data types, nearly 1-to-1)
- sequence particles
   (for most structures)
- choice particles
   (for SELECT types)
- extensions of simpleTypes (complex types with simple content)
   (add XML attributes to BINARY and aggregate data types)
- restrictions of simple types
   (ENUMERATION, and empty restrictions for other defined data types)
- extensions of complex types
   (attributes of subtypes)
- restrictions of complex types with simple content
   (only for defined data types whose underlying type is another defined
   data type and whose fundamental type is BINARY or certain aggregates)
- restrictions of complex types with complex content
   (only for defined data types whose underlying type is another defined
   data type and whose fundamental type is certain aggregates)
- substitution groups
   (subtypes)
- XML identifiers
   (for "entity instance names" and references to them)
- nillable
   (for aggregates of entities to include references)
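
As a rough illustration of how several of these features combine (elements, an XML attribute of type ID, a sequence particle, a complex type extension and a substitution group), here is a minimal Python sketch. The entity names point and cartesian_point, their attributes, and the helpers subtype_map and substitution_members are invented for the example and are not taken from the CD. The schema fragment shows one plausible shape, not the Part 28 mapping.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema for an EXPRESS entity "point" with one subtype.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="point" type="point"/>
  <xs:element name="cartesian_point" type="cartesian_point"
              substitutionGroup="point"/>
  <xs:complexType name="point">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="id" type="xs:ID"/>
  </xs:complexType>
  <xs:complexType name="cartesian_point">
    <xs:complexContent>
      <xs:extension base="point">
        <xs:sequence>
          <xs:element name="coordinates" type="xs:double" maxOccurs="3"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
"""


def subtype_map(schema_text):
    """Return {complex type name: base type} for types derived by extension."""
    root = ET.fromstring(schema_text)
    result = {}
    for ctype in root.findall(f"{{{XS}}}complexType"):
        ext = ctype.find(f"{{{XS}}}complexContent/{{{XS}}}extension")
        if ext is not None:
            result[ctype.get("name")] = ext.get("base")
    return result


def substitution_members(schema_text, head):
    """Return names of top-level elements that may substitute for `head`."""
    root = ET.fromstring(schema_text)
    return [e.get("name") for e in root.findall(f"{{{XS}}}element")
            if e.get("substitutionGroup") == head]
```

The extension carries the subtype's added attributes, and the substitution group is what lets a cartesian_point element appear wherever a point element is allowed.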

Support for "multiple inheritance", i.e. SUBTYPE OF (a, b), requires 
construction of XML data types that do not show inheritance 
relationships at all for the type that has multiple inheritance and all 
of its supertypes.  (In 6.6, there are four rules for this.  Always 
choosing inheritance="true" eliminates only the first rule.  Choosing 
inheritance="false" eliminates all the rules, but it also effectively 
turns all supertypes into SELECT types in the mapping to XML schema.)
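
The underlying difficulty is that an XML schema complex type can be derived from only one base type, so a type with two supertypes has to carry all inherited attributes itself. A minimal Python sketch of that flattening follows. The entity table and the function name flattened_attributes are hypothetical, and the ordering used here is just one reasonable choice, not the rules of clause 6.6.

```python
# Hypothetical EXPRESS model: c is SUBTYPE OF (a, b); a and b share a root.
ENTITIES = {
    "root": {"supertypes": [], "attributes": ["r1"]},
    "a": {"supertypes": ["root"], "attributes": ["a1"]},
    "b": {"supertypes": ["root"], "attributes": ["b1"]},
    "c": {"supertypes": ["a", "b"], "attributes": ["c1"]},
}


def flattened_attributes(name, entities, seen=None):
    """List every attribute an entity has, supertypes first.

    Each supertype is visited once, so an ancestor reached by two
    paths contributes its attributes only once.
    """
    seen = set() if seen is None else seen
    if name in seen:
        return []
    seen.add(name)
    attrs = []
    for sup in entities[name]["supertypes"]:
        attrs.extend(flattened_attributes(sup, entities, seen))
    attrs.extend(entities[name]["attributes"])
    return attrs
```

The resulting flat list is what the XML data type for c would have to declare directly, with no derivation from the types for a or b.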

Support for ANDOR requires declaration of "partial entity instance" 
elements distinct from the entity elements, and inclusion of an 
"external mapping" element as an option for the content of an attribute 
whose value could be an ANDOR.  (Not currently in the CD, but needed.)

Optional XML features:
- abstract XML data types
   (to support ABSTRACT, but not 1-to-1)
- nested extensions and nested restrictions
   (to support redeclaration, and constraints on aggregation sizes and
   string lengths)
- keys
   (to support UNIQUE rules and reference validation)
- keyrefs with simple target XPaths
   (to support reference validation)
- keyrefs with compound target XPaths
   (to support reference validation with multiple inheritance or ANDOR)
- block
   (to prevent schema extension)
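
To make concrete what keys and keyrefs would buy, the following Python sketch performs roughly the checks that a key on an identifier attribute and a keyref on a reference attribute express declaratively: identifiers are unique, and every reference matches some identifier. The element names and the id and ref attribute names are assumptions made for the example, not the names used in the CD.

```python
import xml.etree.ElementTree as ET

# Hypothetical data set: the second employment refers to a missing instance.
DATA = """
<data>
  <person id="i1"><name>Ann</name></person>
  <person id="i2"><name>Bob</name></person>
  <employment><employee ref="i1"/></employment>
  <employment><employee ref="i9"/></employment>
</data>
"""


def check_references(xml_text):
    """Return (duplicate ids, dangling refs), each as a sorted list."""
    root = ET.fromstring(xml_text)
    ids = [e.get("id") for e in root.iter() if e.get("id") is not None]
    duplicates = sorted({i for i in ids if ids.count(i) > 1})
    known = set(ids)
    dangling = sorted({e.get("ref") for e in root.iter()
                       if e.get("ref") is not None
                       and e.get("ref") not in known})
    return duplicates, dangling
```

Without keys and keyrefs in the schema, a check of this kind is left to the receiving application.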

Support for attribute redeclaration involves cascading complex type 
restrictions in XML schema and several different rules (see clause 6.6.4).

Support for ARRAY OF OPTIONAL requires a different structure from the 
ARRAY structure for primitive types, and either representation of the 
subscripts of the elements present or "nil" representation of the 
elements absent.
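
The two alternatives can be sketched in Python as follows. The element name item, the attribute name pos, and both function names are made up for the illustration. Only the xsi:nil attribute is standard XML schema vocabulary.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"


def array_with_nil(values):
    """Write every position; absent elements become xsi:nil="true"."""
    arr = ET.Element("array")
    for v in values:
        item = ET.SubElement(arr, "item")
        if v is None:
            item.set(f"{{{XSI}}}nil", "true")
        else:
            item.text = str(v)
    return arr


def array_with_subscripts(values, low=1):
    """Write only the elements present, each tagged with its subscript."""
    arr = ET.Element("array")
    for pos, v in enumerate(values, start=low):
        if v is not None:
            item = ET.SubElement(arr, "item", {"pos": str(pos)})
            item.text = str(v)
    return arr
```

For the value [1.0, None, 3.0], the first form writes three item elements with the middle one nil, and the second writes two item elements with pos="1" and pos="3". Either way the content model differs from a plain sequence of values.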

Distinction between LIST and any other aggregate can only be "supported" 
by XML attributes that may be inspected by the recipient.  There is no 
XML schema equivalent of ARRAY, SET or BAG.

Support for aggregate bounds is easy for most instances of many 
aggregation data types, and slightly uglier for certain types, but a 
general solution requires a new XML data type for every different 
occurrence of an anonymous aggregate in the EXPRESS schema.
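
For aggregates whose members are written as repeated child elements, one way to picture the easy case is a direct translation of the EXPRESS bounds into minOccurs and maxOccurs, as in this Python sketch. The function names are hypothetical, and the sketch does not cover aggregates written as a single whitespace-separated list value, which would need length facets instead.

```python
def occurrence_attributes(low, high):
    """Map EXPRESS aggregate bounds [low:high] to XSD occurrence attributes.

    high=None stands for the EXPRESS indeterminate upper bound '?'.
    """
    if low < 0 or (high is not None and high < low):
        raise ValueError("invalid bounds")
    return {
        "minOccurs": str(low),
        "maxOccurs": "unbounded" if high is None else str(high),
    }


def element_declaration(name, item_type, low, high):
    """Build the member element declaration for one bounded aggregate."""
    occ = occurrence_attributes(low, high)
    return ('<xs:element name="%s" type="%s" minOccurs="%s" maxOccurs="%s"/>'
            % (name, item_type, occ["minOccurs"], occ["maxOccurs"]))
```

Because the bounds sit on the particle inside a type definition, two anonymous aggregates with different bounds cannot share one such definition, which is where the proliferation of XML data types comes from.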

Optional features of the CD that are not derived from EXPRESS concepts:

- Support for entity references out of the data set requires declaration 
of "proxy elements" distinct from the entity elements.

- Support for referenceable instances that are not entity instances 
requires instance elements that have the XML attributes of entity 
instance elements for all data types.

Configuration features of the CD designed to revise or augment the 
EXPRESS schema:
- invert
- entity name=
- attribute name= map=
- type name= map=
- aggregate name=
- exp-attribute="no-tag", exp-attribute="entity-tag"
- tag-source and tag-values
- exp-type
- contain, use-id
- notation
- naming-convention
- tagless for aggregates of STRING and BINARY

Configuration features of the CD derived from the refusal of the 
technical experts to make an engineering decision:
- sparse: standardize one of "true" or "false", delete the other
- flatten: standardize one of "true" or "false", delete the other
- tagless: standardize the current default (true for simple values, 
false for others), delete the options
- exp-attribute="attribute-tag","double-tag","attribute-content": 
standardize the current default (attribute-tag for simple values, 
double-tag for others), delete the options

(Note the number of subclauses and decision tables in 6.3.2, 7.8 and 
other places that would be completely eliminated by simply making these 
decisions.)

-Ed

P.S. I'm not really trying to make recommendations.  I'd like to focus 
the discussion of the complexity of the XML schemas on explicit issues.

-- 
Edward J. Barkmeyer                        Email: edbark at nist.gov
National Institute of Standards & Technology
Manufacturing Systems Integration Division
100 Bureau Drive, Stop 8264                Tel: +1 301-975-3528
Gaithersburg, MD 20899-8264                FAX: +1 301-975-4694

"The opinions expressed above do not reflect consensus of NIST,
  and have not been reviewed by any Government authority."