SEDS for ISO 10303-11: TYPEOF(x:NUMBER)

Thu Nov 29 16:44:54 EST 2001

(I thought I had sent this before.)

David Price wrote:

> I'm not sure I like the idea that the datatype of something depends on its
> instantiated value.

To be honest, as a compiler writer in my youth and a modeler in my third career, I was never comfortable with this idea, either.
But this idea is very clearly central to many manipulations in EXPRESS!  So, like it or not, we have to live with it.

> Is this example
> any different from x being declared as a BAG that happens to be instantiated
> with non-duplicate members and so you want TYPEOF to return SET?

We must be careful here.  Jochen's SEDS deals with "simple" types, and the BAG/SET problem is an "aggregate" type issue.  But
the same idea should apply to the SET/BAG relationship, precisely because Part 11 (clause 8.2.4) says that SET OF T is a
specialization of BAG OF T, just as 8.2.1 says that REAL is a specialization of NUMBER.  

It is the fact that REAL is a specialization of NUMBER that permits an actual parameter of type REAL to be supplied for a formal
parameter of type NUMBER; and it is the fact that that substitution can occur that makes it important for the body of the
FUNCTION to be able to determine via TYPEOF that the actual parameter is of type REAL.  In a similar way, an instance of type
SET OF T can be supplied for a formal parameter of type BAG OF T and the corresponding problem arises.  The body of the function
should be able to determine via TYPEOF that the actual parameter was of type SET OF T, in the same way that it can do this when
the formal parameter is of type AGGREGATE OF T.
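
For readers who prefer code, the substitution argument above can be sketched roughly as follows.  This is Python, not EXPRESS, and it is only a toy model: the table SPECIALIZES and the functions generalizations and typeof_in_body are names invented for this illustration, and the only specialization recorded is the REAL/NUMBER one cited above.

```python
# Toy model (assumption, not Part 11 text): each type maps to the type it specializes.
SPECIALIZES = {"REAL": "NUMBER"}


def generalizations(type_name):
    """Return type_name followed by every type it specializes, narrowest first."""
    chain = [type_name]
    while chain[-1] in SPECIALIZES:
        chain.append(SPECIALIZES[chain[-1]])
    return chain


def typeof_in_body(actual_type, formal_type):
    """Model TYPEOF(x) inside a function whose formal parameter x has formal_type.

    The call is only legal if the actual type is the formal type or a
    specialization of it; the result then includes the narrower type.
    """
    chain = generalizations(actual_type)
    if formal_type not in chain:
        raise TypeError(f"{actual_type} cannot be supplied for {formal_type}")
    return sorted(chain[: chain.index(formal_type) + 1])


print(typeof_in_body("REAL", "NUMBER"))    # ['NUMBER', 'REAL']
print(typeof_in_body("NUMBER", "NUMBER"))  # ['NUMBER']
```

Supplying a NUMBER where a REAL is required is rejected in this model, which is the other half of the specialization rule.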

But the problem that David addresses is technically different.  That is: if the value itself satisfies the requirements for an
instance of a specialization of the type (but was *never syntactically associated with that specialization*), should TYPEOF
return that specialization in the result list?

The problem here is that the underlying model of "data type" in EXPRESS is intrinsically syntactic, not semantic.  EXPRESS data
types do not have "value-based type predicates" -- EXPRESS does not define the rules for determining from its "value" whether a
"datum" in the universe of discourse is an instance of any given type!  In the EXPRESS model, type association is "by fiat" (aka
"by roster") -- some syntactic construct tells the implementation that this datum is an instance of that data type.  But Part 11
is careful to say that that syntactic construct doesn't have to appear in Part 11 itself, i.e. it could be in some 20-series
Part used in the representation of the datum.  So the "requirement for an instance of a specialization" is that some syntax
somewhere says this datum is an instance of that specialization!  There is no value-based rule for determining whether this
datum is "semantically a valid X" if it is only *declared* to be a Y.  E.g. Part 11 says "3.4" denotes an instance of REAL, but
the occurrence might actually represent an instance of length_measure as well, if it appears where a length_measure value is
syntactically required by Part 11.  You can't tell from 3.4 itself whether it is a length_measure; you can tell from Part 11
that every occurrence of 3.4 in the schema is a REAL.

We may believe that we have a type predicate for REAL or SET, and Part 11 says that our intuitive predicate is TRUE for every
instance of those types, but it does *not* say the converse: that every datum in the universe of discourse that satisfies our
type predicate is an instance of that type!  Rather Part 11 says that every datum in the UoD that is (syntactically) *said to
be* an instance of that type is one if it satisfies the stated "property predicates" (WHERE clauses, Part 11 rules), and is an
"invalid instance" if it doesn't!  I.e.

TYPE positive_length_measure = length_measure;
   WHERE SELF > 0;
END_TYPE;

does *not* say that every instance of length_measure whose value is greater than 0 is an instance of positive_length_measure. 
It only says that the value of every (valid) instance of positive_length_measure is greater than 0.
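
A rough Python rendering of that reading, for what it is worth.  The function classify and the idea of passing the declared types as a set are inventions of this sketch, not anything in Part 11; the point is only that the WHERE rule is consulted after the declaration, never instead of it.

```python
def classify(value, declared_types):
    """Classify a datum against positive_length_measure.

    declared_types is the set of types that some syntax has associated with
    the datum (hypothetical representation).  The WHERE rule (SELF > 0) only
    separates valid from invalid *declared* instances; it never makes an
    undeclared datum an instance.
    """
    if "positive_length_measure" not in declared_types:
        return "not an instance"
    return "valid instance" if value > 0 else "invalid instance"


# A length_measure with a positive value is still not a positive_length_measure.
print(classify(3.4, {"length_measure"}))
# Declared as positive_length_measure and satisfying the rule.
print(classify(3.4, {"length_measure", "positive_length_measure"}))
# Declared as positive_length_measure but violating the rule.
print(classify(-1.0, {"length_measure", "positive_length_measure"}))
```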

Further, the syntactic association of type is with the "occurrence", not with the "value".  That is, let 3,4 designate the
"mathematical real value" denoted in EXPRESS and Part 21 by "3.4". Then the 3,4 which is the "x" attribute of Cartesian_Point
#1234 is different from the 3,4 which is the "x" attribute of Cartesian_Point #4321 and from the 3,4 which is the "eccentricity"
attribute of Ellipse #1101.  Each of these 3,4s is a distinct "occurrence"; their "values" are equal.  The first two occurrences
are instances of type length_measure; the third is not.  There is nothing about the "value" 3,4 that determines whether it is a
length_measure, or even that it is a REAL!  Part 11 says that the occurrence of 3,4 that appears in the EXPRESS schema as 3.4 is
an instance of type REAL; Part 21 says that the occurrence of 3,4 that appears in the exchange structure in #4321 =
CARTESIAN_POINT(3.4, -1.0, 0.0) is an instance of type length_measure.
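
The occurrence/value distinction can be modeled in a few lines of Python.  The class Occurrence and its fields are hypothetical, and the type sets attached to each occurrence are assumptions made for the illustration (in particular, that the eccentricity is declared REAL and nothing narrower).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    value: float               # the "mathematical" value, here 3,4
    where: str                 # the place in the data where it appears
    declared_types: frozenset  # types that some syntax associates with it


x_1234 = Occurrence(3.4, "Cartesian_Point #1234, x",
                    frozenset({"REAL", "length_measure"}))
x_4321 = Occurrence(3.4, "Cartesian_Point #4321, x",
                    frozenset({"REAL", "length_measure"}))
ecc_1101 = Occurrence(3.4, "Ellipse #1101, eccentricity",
                      frozenset({"REAL"}))

# The values are equal ...
print(x_1234.value == x_4321.value == ecc_1101.value)
# ... but these are three distinct occurrences ...
print(x_1234 is not x_4321 and x_4321 is not ecc_1101)
# ... and only the type associations, not the value, tell them apart.
print("length_measure" in x_1234.declared_types)
print("length_measure" in ecc_1101.declared_types)
```

Nothing in the sketch derives declared_types from value; it is always supplied from outside, which is the "by roster" model.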

Now the rules for TYPEOF don't say that the syntactic type associated with the actual parameter by the caller is the narrowest
specialization to be considered.  When clause 15.25 includes "types actually present in the instance", it refers to *all*
syntactic type associations made to the occurrence in question, beginning with its original identification as a datum governed
by the schema.  But there is no notion of the intrinsic semantic nature of the value of the occurrence -- the "semantic type
predicate" in EXPRESS is: "the datum was somewhere declared to be an instance of this type and its properties satisfy the Part
11 rules and the local rules for this type".

All of this is to say that we can change Part 11 clause 15.25 to permit the inclusion of "subtypes and specializations actually
present in the instance" for simple and aggregate types *without* requiring the interpretation that David suggests:

> x being declared as a BAG that happens to be instantiated
> with non-duplicate members and so you want TYPEOF(x) to return SET

If x is *never* syntactically associated with type SET OF T, it doesn't make any difference whether it has duplicate members; x
is *not* an instance of SET OF T per Part 11.  And TYPEOF(x) should return ['BAG'].  But if the actual parameter y was
*declared* SET OF T (somewhere) and the formal parameter x is declared BAG OF T, when the function calls TYPEOF(x), *then*
TYPEOF should return ['BAG', 'SET'].
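
As a sketch of the behavior I am proposing (Python again, with invented names typeof and call_with_bag_formal, and with type names abbreviated to 'BAG' and 'SET' as in the text above):

```python
def typeof(roster):
    """Report the types on the roster; member values are never inspected."""
    return sorted(roster)


def call_with_bag_formal(members, declared_types):
    """Model calling a function whose formal parameter x is BAG OF T.

    declared_types is what was syntactically associated with the actual
    parameter before the call (hypothetical representation).  The call adds
    'BAG' to the roster; it never adds 'SET' on the strength of the members.
    """
    if not {"BAG", "SET"} & set(declared_types):
        raise TypeError("actual parameter is neither a BAG nor a SET")
    roster = set(declared_types) | {"BAG"}
    return typeof(roster)


# Declared BAG, happens to have no duplicates: still only a BAG.
print(call_with_bag_formal([1, 2, 3], {"BAG"}))  # ['BAG']
# Declared SET somewhere, passed for a BAG formal parameter.
print(call_with_bag_formal([1, 2, 3], {"SET"}))  # ['BAG', 'SET']
```

The members argument is deliberately unused: that is the whole point of the distinction.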

-Ed

P.S. The terms "instance", "value" and "occurrence" are not interchangeable.  In the above, I use "instance" only in "instance
of <type>" and I distinguish an "occurrence" from its "value".  And I deliberately used "datum in the UoD" so as to be unclear
as to which of these was meant -- the "value" or an "occurrence" of it.  Part 11 makes a distinction between "instance" and
"value" in clause 12.2.1.7, in distinguishing "instance comparison" (object-id comparison) from "value comparison" (attribute
value comparison).  And I don't use either word in that sense.  So if you have problems with my use of "instance" and "value"
and "occurrence" above, change them to "B-inst", "B-val" and "B-occ", i.e. treat them as reserved words that may not have the
meaning you would ascribe to them.

P.P.S. Apologies for information modeling lecture #316.4.

-- 
Edward J. Barkmeyer                       Email: edbark at nist.gov
National Institute of Standards & Technology
Manufacturing Systems Integration Division
100 Bureau Drive, Mail Stop 8260          Tel: +1 301-975-3528
Gaithersburg, MD 20899-8260               FAX: +1 301-975-4482

"The opinions expressed above do not reflect consensus of NIST,
and have not been reviewed by any Government authority."