[wg11] SEDS for EXPRESS (10303-11:2004)

Fri Jun 3 17:55:03 EDT 2005

Lothar,

Yes, we are writing a paper.  In fact, we are proposing a standard 
metamodel for EXPRESS and some changes to Part 11.  The fact that our 
models are this close means that we have a nearly common understanding, 
and we may have some consensus on text for the standard(s).

> I think we have to add the Express-schema(s) to the whole discussion
> to get a proper solution.
> 
> The question whether a combination of single entity data types/values
> is complete or not can only be answered in the scope of an express
> schema. 

Yes, of course.  The model we have produced says that the scope of an 
entity data type is a schema and that all the instances/values are 
described by one "governing_schema".  (Actually, some entity_data_types 
can have only an AlgorithmScope, but we don't really need to go there.)

> The same combination may be complete in one schema, but for
> another schema it is not.

That can be, but that concept is not in the scope of the model as we 
have it.  If you have entirely different schemas, the entity data type 
names don't necessarily refer to the same thing at all.  And in that 
case, the values are "cheese and chalk".

If one schema interfaces the other (via USE/REFERENCE) then the entity 
data type IS in the scope of the interfacing schema, and the interfacing 
rules (clause 11) require its supertypes to be at least implicitly 
interfaced as well.  So completeness with respect to the interfaced 
entity data type is exactly the same.  Validity may not be, but 
"completeness", in the sense of the required collection of 
single_entity_values, is the same.

We must be careful to define a "complete entity value" as a 
complex_entity_value in which every supertype of every entity data type 
that is represented by a single_entity_value in the value is also 
represented by a single_entity_value in the value.  Then, by definition, 
a complete_complex_entity_value is "complete with respect to" EVERY 
entity data type that appears in it as a single_entity_value.  (We need 
this rule to handle multi-leaf values and multiple inheritance and ANDOR.)
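
A minimal sketch of this rule, in Python rather than EXPRESS.  The names 
SUPERTYPES, all_supertypes and is_complete are hypothetical, and a 
complex_entity_value is reduced to the set of entity data type names 
whose single_entity_values it contains (attribute values are omitted):

```python
# Hypothetical schema: each entity data type maps to its direct supertypes.
SUPERTYPES = {
    "A": set(),
    "B": {"A"},
    "C": {"A"},
    "D": {"B", "C"},  # multiple inheritance
}


def all_supertypes(name, supertypes=SUPERTYPES):
    """Return every direct and inherited supertype of the named type."""
    found = set()
    stack = list(supertypes[name])
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(supertypes[parent])
    return found


def is_complete(value_types, supertypes=SUPERTYPES):
    """True if every supertype of every type present is also present."""
    return all(all_supertypes(t, supertypes) <= value_types
               for t in value_types)
```

Under these assumptions the multi-leaf value {A, B, C} is complete, 
while {B} alone and {A, B, D} are not.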

> So I agree that it does not make sense to
> have specific subtypes for complete or partial since this can only be
> answered in relation to a particular schema.

More than in relation to a given schema, it can only be complete in 
relation to a particular set of entity types.

I think you misunderstood this:
>>The main problem I have with it is that a "complete complex entity
>>value" is "complete" with respect to 1:? entity data types, but it may
>>also be "partial" with respect to 0:? entity data types as well.
> 
> Ok - this is a question of definition. So you say that a partial
> entity value may consist of 0 single entity values. ...

No, I don't say this at all.  I maintained that part of Lothar's model 
that says a given complex_entity_value is a collection of 1 or more 
(1:?) single_entity_values.  This also agrees with the model I presented.

What I was saying in the part Lothar quotes (above) is that a given 
complex_entity_value can be simultaneously
- complete with respect to 0:? entity data types, and
- incomplete ("partial") with respect to 0:? entity data types.
It is complete with respect to an entity data type if it contains all 
the single_entity_values required for that entity data type by the 
subtype/supertype graph (which is, as you say, schema-dependent).  It is 
"partial" with respect to any entity data type which requires some or 
all of those single_entity_values AND OTHERS.

In the simplest case, for example:
   ENTITY A; ... END_ENTITY;
   ENTITY B SUBTYPE OF (A); ... END_ENTITY;
The complex_entity_value consisting of exactly {A(...)} is "complete" 
with respect to A and "partial" with respect to B.
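
The two-entity case can be sketched in Python as follows.  This is only 
an illustration: the dictionary encoding of the schema and the function 
names required, complete_wrt and partial_wrt are assumptions, and the 
value is again reduced to the set of type names it contains.

```python
# Hypothetical encoding of the schema above: type -> direct supertypes.
SUPERTYPES = {"A": set(), "B": {"A"}}


def required(name):
    """The type itself plus all of its (transitive) supertypes."""
    needed = {name}
    for parent in SUPERTYPES[name]:
        needed |= required(parent)
    return needed


def complete_wrt(value_types, name):
    """The value contains everything the named type requires."""
    return required(name) <= value_types


def partial_wrt(value_types, name):
    """The named type requires some of the value's parts AND OTHERS."""
    needed = required(name)
    return bool(needed & value_types) and not (needed <= value_types)


value = {"A"}  # stands for the complex_entity_value {A(...)}
print(complete_wrt(value, "A"), partial_wrt(value, "B"))
```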

>>This is why in the revised diagram attached, I model 
>>"complete complex entity value" as complete with respect to a 
>>"reference_type" entity_type.
> 
> No - I don't think it is ok to introduce another object "entity_type"
> when already having "single_entity_type" and "(partial) complex entity
> type".

Ah.  But this is important.  If we define single_entity_type to mean 
exactly the set of (explicit) attributes appearing in the ENTITY 
declaration, what shall we use to represent the entity, for example B 
above, WITH THE INHERITED PROPERTIES that Part 11 says it has?  By 
definition, single_entity_type DOES NOT represent those properties.  So 
we need to introduce the concept entity_type to represent the entity 
with all of its inherited properties.  Now we can use 
complex_entity_type for that purpose (which matches the Part 11 usage), 
but then we need a different term (Part 11 uses "partial complex entity 
type") for arbitrary collections of single_entity_values that may or may 
not represent instances of "complex entity data types".

The point is that we need all of the following (7) symbols:

E1 type/value (LK:single_entity_type/_value, EB:SingleEntityType/Value) 
refers to the set of explicit attributes that appears in one entity 
declaration.

E2 type/value (LK:complex_entity_type/_value?, 
EB:PartialEntityType/Value) refers to a collection of one or more E1s.

E3 type (LK:?, EB:EntityType and entity_type) refers to the thing 
actually declared by an entity declaration: Per Part 11 9.2, that 
includes ALL of the properties (attributes and rules) appearing in the 
declaration itself and all of the properties inherited thru SUBTYPE 
relationships that appear in the declaration.

E3 instance (LK:entity_instance, EB:EntityInstance) refers to a named 
value of an entity data type, i.e. "E3 type", which corresponds to the 
view that is presented by the EXPRESS schema of an individual in the 
conceptual "entity" class corresponding to the E3 type.  (Part 11, 3.3.x)

E4 value (LK:complete_complex_entity_value, EB:EntityValue), subtype of 
E2 value, refers to a collection of E1 values that is sufficient to 
describe an E3 instance.  (As Lothar says, and I agree, this is a 
transformation that can occur.)  By the definition above, the E4 value 
will be sufficient to describe an instance that is an instance of every 
entity data type whose E1 value is contained in it.

It is my contention that we need all 7 concepts to explain the EXPRESS 
model of entity types and values, in the presence of the "partial entity 
constructor", the "group" operator, and the "complex entity 
constructor".  I don't care what we call them.

It is possible that I misconstrued the intended meaning of Lothar's 
symbols.  So I use only Part 11 terms, and the En terms, in the 
definitions above, so that we can sort this out.

>>But the point is that "complete" is only meaningful *in reference to* a
>>target entity_type.
> 
> I can't follow you here.
> "complete" can only be answered for a schema and this decides whether
> we can make an instance out of it.

OK.  I agree that the idea "reference_type" is wrong.  A 
complex_entity_value is "complete" if it meets the criterion above: it 
has all the single_entity_values needed for every entity data type (E3 
type) that appears in it.  And the relationship to the set of entity 
data types for which it is complete can be derived from the relationship 
to the single_entity_values it contains.

[BTW, this derivation can't be written with an EXPRESS DERIVEd 
attribute.  It is the set
  (s.of_type.declared_by FOR s <* SELF\single_entity_values)
But EXPRESS lacks the LISP 'mapcar' function that does this kind of thing.]
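
For comparison, a sketch of the same derivation in a language that does 
have a map construct.  This is Python, not EXPRESS; the class layout is 
an assumption that only mirrors the attribute names of_type, declared_by 
and single_entity_values used above, and declaring_types is a 
hypothetical name for the derived set.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class EntityType:            # "E3 type"
    name: str


@dataclass(frozen=True)
class SingleEntityType:      # "E1 type"
    declared_by: EntityType


@dataclass(frozen=True)
class SingleEntityValue:     # "E1 value"
    of_type: SingleEntityType
    attributes: tuple = ()


@dataclass(frozen=True)
class ComplexEntityValue:    # "E2 value"
    single_entity_values: FrozenSet[SingleEntityValue]

    @property
    def declaring_types(self):
        # The map step: one EntityType per contained single_entity_value.
        return {s.of_type.declared_by for s in self.single_entity_values}
```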

>   The next question is whether this instance can be assigned to a
> variable, parameter or entity-attribute. This is a different question
> and clearly covered by Express already I think.

My point is that, in Part 11 (but not necessarily in Part 22, I 
suppose), the only thing you can do with an expression that produces a 
"complete_complex_entity_value" is to assign it to something.  It is 
only when we do that assignment that we care that the value is 
"complete".  So I am *defining* complete_complex_entity_value ("E4 
value") to be a complex_entity_value ("E2 value") that can be converted 
to an instance of a *specified* entity data type (not *some* entity data 
type).

Upon reflection, we can indeed make the assignment a "different 
question".  And in that case, we don't need "E4 value" 
(complete_complex_entity_value) at all!  We could instead write, as Part 
11 does, a rule for the conversion operator, e.g.
  "The 'entity-value' operand of the (true) 'entity instance 
constructor' shall be a complex_entity_value that <satisfies the 
definition above>."
That constructor produces an entity instance that may be a valid 
instance of many entity data types, and whether that instance can be 
assigned to a given target is another matter.  The only problem is that 
it may not be a *valid* instance, because it doesn't satisfy some local 
or global rule, but I think that part is now "schema-dependent" in the 
larger sense.
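
A sketch of what such a constructor rule might look like, in Python.  
Everything here is hypothetical (the SUPERTYPES table, the EntityInstance 
class, the integer identity); it only illustrates that the completeness 
check moves into the constructor, and it deliberately says nothing about 
validity against local or global rules.

```python
from itertools import count

SUPERTYPES = {"A": set(), "B": {"A"}}  # hypothetical schema
_ids = count(1)                        # source of instance identities


def missing_supertypes(value_types):
    """Supertypes required by the value's types but absent from it."""
    missing, seen, pending = set(), set(), list(value_types)
    while pending:
        t = pending.pop()
        if t in seen:
            continue
        seen.add(t)
        for parent in SUPERTYPES[t]:
            if parent not in value_types:
                missing.add(parent)
            pending.append(parent)
    return missing


class EntityInstance:
    """Result of the (true) entity instance constructor."""

    def __init__(self, value_types):
        missing = missing_supertypes(value_types)
        if missing:
            raise ValueError(f"incomplete value, missing {sorted(missing)}")
        self.identity = next(_ids)
        self.value_types = frozenset(value_types)

    def is_instance_of(self, name):
        # An instance of every entity data type whose part it contains.
        return name in self.value_types
```

Whether the resulting instance can be assigned to a given target would 
then be a separate check, made at the point of assignment.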

But we note that both Lothar and I thought it useful to create a term 
and model that relationship.

> ...
>>For these reasons, I don't see the need to have complete_ and partial_
>>complex_entity_types, or partial_complex_entity_values.
> 
> Yes - get rid of those

OK.  We agree on this part.

>>...
>>The other addition I made is to the relationship between 
>>single_entity_type and complex_entity_type (and the same for _value).
>>This relates to the SingleEntityType SUBTYPE OF (PartialEntityType) 
>>issue you raise, and I discuss that below.
> 
> It is ok that a single_entity_type corresponds to exactly one
> complex_entity_type with the additional rule that the
> complex_entity_type has only one component and this single component
> is THIS single_entity_type.

Exactly.  (I thought I said this as well.)

> The attribute equivalent_type on the value level does not make sense I
> think - and this lead me to something else:

Yes.  I copied the relationship and forgot to change the text.
It should read "equivalent_value".

> --- start: conceptual versus implementation level ---
> In our Java implementation it is clear that we can have 2 instances of
> Java class EntityValue or 2 instance of Java class ComplexEntityValue
> with the same value. This is necessary to effectively work with this.
>   But in a conceptual data model (not one for implementation) we may
> say that an EntityValues and ComplexEntityValues must be unique in
> respect to their contents. This is because there is not way to
> distinguish between two Complex/Single entity values with the same
> contents - they don't have an identity like the entity instance has.
>
>  At the end we have to answer if we want to create a conceptual data
> model which will exactly represent the nature of Express or one which
> is suitable for implementation.
>   Only when we speak on such a conceptual level a single_entity_value
> has exactly one corresponding complex_entity_value. In a real
> implementation we may have 0, 1 or more.

I think I see the problem.

With respect to values, the EXPRESS-G diagram is correct but incomplete.  
What is intended is what the UML diagram shows:

single_entity_value.equivalent_value is unique.  For single entity value 
B(5.0, "cm"), the equivalent_value is the set {B(5.0, "cm")}, which is a 
complex_entity_value.  It is unique, because the complex_entity_value is 
defined to be a set, and any two sets having exactly the same members 
are equal, i.e. the same set.  And the INVERSE relationship to 
single_entity_value.equivalent_value is SET[0:1].  Each 
complex_entity_value that consists of exactly one single_entity_value 
has one instance of that inverse relationship, namely to that 
single_entity_value; all others have none.

But the single_entity_value B(5.0, "cm") can appear in any number of 
different sets, i.e. different complex_entity_values.  So the INVERSE 
relationship to complex_entity_value.components is, by contrast, SET[1:?].
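
The set argument can be illustrated with Python frozensets, which compare 
by membership.  This is only a sketch: the named tuples A and B and the 
function equivalent_value are hypothetical stand-ins for the model 
elements.

```python
from collections import namedtuple

# Hypothetical shapes for two kinds of single entity value.
A = namedtuple("A", ["label"])
B = namedtuple("B", ["magnitude", "unit"])


def equivalent_value(single):
    """The complex value consisting of exactly this single value."""
    return frozenset({single})


one = equivalent_value(B(5.0, "cm"))
two = equivalent_value(B(5.0, "cm"))
assert one == two            # same members, hence the same set
assert len({one, two}) == 1  # as values they collapse into one

# The same single value can be a component of many different sets.
larger = frozenset({A("x"), B(5.0, "cm")})
assert B(5.0, "cm") in one and B(5.0, "cm") in larger
assert one != larger
```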

But the problem Lothar raises is this: we have to decide whether we are 
talking about the value, or an occurrence of the value.  And that is the 
difference between the conceptual model and the implementation.

In an implementation, *all* occurrences have identity that is distinct 
from value.  Every occurrence of B(5.0, "cm") is distinct, and the one 
that appears in any given complex_entity_occurrence is different from 
the one that appears in any other complex_entity_occurrence.  They are 
equal in *value*, but their *identities* are distinct.  So any 
single_entity_occurrence appears in exactly one 
complex_entity_occurrence.  And that behavior is different from what is 
stated for single_entity_value.

Interestingly, this is also true of 5.0.  If I have two complex entity 
occurrences that contain that same single entity value, there are two 
distinct occurrences of B(5.0, "cm"), and there are also two distinct 
occurrences of 5.0.  In the implementation, the two occurrences of 5.0 
also have distinct identities.  And in the implementation, the 
expression "v->value_part == 5.0" involves another occurrence of 5.0 
that has an identity different from both of the occurrences in the 
single_entity_occurrences.  (If you look at the hardware implementation, 
the machine address of the comparand 5.0 is different from the machine 
address for the component of the struct.)
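
The distinction between equal values and distinct occurrences can be 
seen in most object languages.  A small sketch in Python (not Java), 
where the class B and its field names are hypothetical:

```python
from dataclasses import dataclass


@dataclass
class B:  # hypothetical occurrence class
    value_part: float
    unit: str


first = B(5.0, "cm")
second = B(5.0, "cm")

same_value = first == second     # generated __eq__ compares the fields
same_identity = first is second  # compares the identities of the objects
```

Here same_value is true and same_identity is false: two occurrences, one 
value.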

Now, Java gives us a syntax for referring to the Value 5.0, and 
different syntax for referring to some of its Occurrences.  Java gives 
us NO syntax for referring to the Value B(5.0, "cm"); it only provides a 
syntax for referring to Occurrences of it.

In my mind, *that* is the difference between the conceptual model and 
the implementation model *for Java*.  The implementation model deals 
with the limitations of the language in manipulating the concepts, and 
it does that by casting the concepts into constructs the language can 
support.  The fact that the Java support for REAL values and for B 
values is different causes the models to be the same in some places and 
different in others.

If you implement in LISP, you get a different collection of curious 
differences between the conceptual and implementation models.

And it is this last observation that tells me that "we" (SC4, OMG, et 
al.) don't want an "implementation model".  Because the one I make for 
Java and the one I make for LISP will be different!

>>Unfortunately, it also makes it unnecessary and inconvenient for 
>>SingleEntityType to be a data type!
> 
> Ups - no!
> The definition says that a data type is a "domain of values". Since we
> have single_entity_values for a single_entity_type this is a data
> type.

But a single_entity_value is not the value of anything.  There is no 
EXPRESS construct for a single_entity_value, and there is no EXPRESS 
operator that is said to produce one.  If we put a single_entity_type in 
the model as a data type, what would have that data type?

complex_entity_type is only the data type of a few kinds of Expression; 
single_entity_type is not the data type of any Expression; it is the 
data type of nothing.

So why would we have this data type?

>>The type of a SingleEntityValue is
>>a "partial complex entity data type", as Part 11 says.
> 
> Let us find those places and change it
> 
> 
>>There are two
>>operations that conceptually produce SingleEntityValues, but no 
>>operations are defined on SingleEntityValues.   Those operations are
>>said to produce a partial complex entity value consisting of one 
>>SingleEntityValue.  The group operator, the attribute reference, the
>>complex entity constructor are all defined to take operands that are
>>PartialEntityValues (partial complex entity values), and the "implicit"
>>EntityValue to EntityInstance conversion operator takes a partial 
>>complex entity value.  So if we construct a "single entity data type"
>>that is not a subtype of "partial complex entity data type", it isn't
>>the data type of anything!  And if we make SingleEntityValue an instance
>>of "single entity data type", it isn't a valid operand of any operator!
> 
> We have to change the text to say that the constructor and
> group_operator (in the case that an entity type is used) results in a
> single entity value. This single entity value is then implicitly
> transformed into a complex entity value and further on into an entity
> instance as needed by the other stuff around.

But the transform always occurs the moment the single_entity_value is 
produced, because *every* use of it needs a (partial) complex entity value.

Obviously we could do what Lothar suggests.  Do we really want to? 
Except for the complex entity constructor definition, the current text 
of Part 11 conveys the intent without using the term single entity value 
*at all*.  (I'm sure that is why TC#2, which introduced the term, didn't 
make all of these changes.)

We agree to use single_entity_value to designate a piece of a 
complex_entity_value, so that we can describe the entity constructors 
carefully, but we don't need it to be an Instance/Value to do that.  An 
indexed_element of an ARRAY value isn't a value; it's just a piece of one.

Why is it important to have single_entity_value be a value, and 
single_entity_type be a data type?  Is it just because we called them 
xxx_value and xxx_type?  Or is it because we need both ideas, and they 
relate to each other in the type/value way?

> We should try to minimise the number of changes.

I agree.  I think we now pretty much agree on the model, except for a 
couple of things in which we have choices that don't affect the meaning.

-Ed

-- 
Edward J. Barkmeyer                        Email: edbark at nist.gov
National Institute of Standards & Technology
Manufacturing Systems Integration Division
100 Bureau Drive, Stop 8264                Tel: +1 301-975-3528
Gaithersburg, MD 20899-8264                FAX: +1 301-975-4694

"The opinions expressed above do not reflect consensus of NIST,
  and have not been reviewed by any Government authority."