[wg11] SEDS for EXPRESS (10303-11:2004)
Ed Barkmeyer
edbark at nist.gov
Fri Jun 3 17:55:03 EDT 2005
Lothar,
Yes, we are writing a paper. In fact, we are proposing a standard
metamodel for EXPRESS and some changes to Part 11. The fact that our
models are this close means that we have a nearly common understanding,
and we may have some consensus on text for the standard(s).
> I think we have to add the Express-schema(s) to the whole discussion
> to get a proper solution.
>
> The question whether a combination of single entity data types/values
> is complete or not can only be answered in the scope of an express
> schema.
Yes, of course. The model we have produced says that the scope of an
entity data type is a schema and that all the instances/values are
described by one "governing_schema". (Actually, some entity_data_types
can have only an AlgorithmScope, but we don't really need to go there.)
> The same combination may be complete in one schema, but for
> another schema it is not.
That can be, but that concept is not in the scope of the model as we
have it. If you have entirely different schemas, the entity data type
names don't necessarily refer to the same thing at all. And in that
case, the values are "cheese and chalk".
If one schema interfaces the other (via USE/REFERENCE) then the entity
data type IS in the scope of the interfacing schema, and the interfacing
rules (clause 11) require its supertypes to be at least implicitly
interfaced as well. So completeness with respect to the interfaced
entity data type is exactly the same. Validity may not be, but
"completeness", in the sense of the required collection of
single_entity_values, is the same.
We must be careful to define a "complete entity value" as a
complex_entity_value in which every supertype of every entity data type
that is represented by a single_entity_value in the value is also
represented by a single_entity_value in the value. Then, by definition,
a complete_complex_entity_value is "complete with respect to" EVERY
entity data type that appears in it as a single_entity_value. (We need
this rule to handle multi-leaf values and multiple inheritance and ANDOR.)
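As a minimal sketch of this definition (in Python rather than EXPRESS, and
with a hypothetical SUPERTYPES table standing in for the subtype/supertype
graph of the governing schema), a value is complete when the set of entity
data types it contains is closed under "supertype of". The names
all_supertypes and is_complete are illustrative, not Part 11 terms.

```python
# Hypothetical schema, for illustration only: each entity data type is
# mapped to its direct supertypes. D inherits from both B and C.
SUPERTYPES = {
    "A": set(),
    "B": {"A"},
    "C": {"A"},
    "D": {"B", "C"},
}


def all_supertypes(entity, supertypes=SUPERTYPES):
    """Return every direct and inherited supertype of `entity`."""
    found = set()
    pending = list(supertypes[entity])
    while pending:
        current = pending.pop()
        if current not in found:
            found.add(current)
            pending.extend(supertypes[current])
    return found


def is_complete(value_types, supertypes=SUPERTYPES):
    """True if the value contains every supertype of every type it contains.

    `value_types` names the entity data types that have a
    single_entity_value in the complex_entity_value.
    """
    present = set(value_types)
    return all(all_supertypes(t, supertypes) <= present for t in present)
```

Under this sketch {A, B, C} is complete even though it has two leaves, while
{A, B, D} is not, because D also requires C.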
> So I agree that it does not make sense to
> have specific subtypes for complete or partial since this can only be
> answered in relation to a particular schema.
More than in relation to a given schema, it can only be complete in
relation to a particular set of entity types.
I think you misunderstood this:
>>The main problem I have with it is that a "complete complex entity
>>value" is "complete" with respect to 1:? entity data types, but it may
>>also be "partial" with respect to 0:? entity data types as well.
>
> Ok - this is a question of definition. So you say that a partial
> entity value may consist of 0 single entity values. ...
No, I don't say this at all. I maintained that part of Lothar's model
that says a given complex_entity_value is a collection of 1 or more
(1:?) single_entity_values. This also agrees with the model I presented.
What I was saying in the part Lothar quotes (above) is that a given
complex_entity_value can be simultaneously
- complete with respect to 0:? entity data types, and
- incomplete ("partial") with respect to 0:? entity data types.
It is complete with respect to an entity data type if it contains a
single_entity_value for every single_entity_type required for that entity
data type by the subtype/supertype graph (which is, as you say,
schema-dependent). It is
"partial" with respect to any entity data type which requires some or
all of those single_entity_types AND OTHERS.
In the simplest case, for example:
    ENTITY A; ... END_ENTITY;
    ENTITY B SUBTYPE OF (A); ... END_ENTITY;
The complex_entity_value consisting of exactly {A(...)} is "complete"
with respect to A and "partial" with respect to B.
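The A/B case can be worked through mechanically. The following Python sketch
assumes the two-entity schema above, encoded in a hypothetical SUPERTYPES
table; required and classify are illustrative helper names, not terms from
either model.

```python
# The two-entity schema from the example: B is a subtype of A.
SUPERTYPES = {"A": set(), "B": {"A"}}


def required(entity):
    """The entity data type itself plus all of its inherited supertypes."""
    needed = {entity}
    for parent in SUPERTYPES[entity]:
        needed |= required(parent)
    return needed


def classify(value_types):
    """Split the schema's entity data types by how the value relates to them.

    Returns (complete, partial): the value is complete for a type when it
    holds everything that type requires, and partial when it holds some of
    what the type requires but not all of it.
    """
    present = set(value_types)
    complete = {e for e in SUPERTYPES if required(e) <= present}
    partial = {
        e for e in SUPERTYPES
        if required(e) & present and not required(e) <= present
    }
    return complete, partial
```

For the value {A(...)}, classify({"A"}) gives A as complete and B as partial.
For {B(...)} alone it gives no complete types at all, which is the "0:?" case.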
>>This is why in the revised diagram attached, I model
>>"complete complex entity value" as complete with respect to a
>>"reference_type" entity_type.
>
> No - I don't think it is ok to introduce another object "entity_type"
> when already having "single_entity_type" and "(partial) complex entity
> type".
Ah. But this is important. If we define single_entity_type to mean
exactly the set of (explicit) attributes appearing in the ENTITY
declaration, what shall we use to represent the entity, for example B
above, WITH THE INHERITED PROPERTIES that Part 11 says it has? By
definition, single_entity_type DOES NOT represent those properties. So
we need to introduce the concept entity_type to represent the entity
with all of its inherited properties. Now we can use
complex_entity_type for that purpose (which matches the Part 11 usage),
but then we need a different term (Part 11 uses "partial complex entity
type") for arbitrary collections of single_entity_values that may or may
not represent instances of "complex entity data types".
The point is that we need all of the following (7) symbols:
E1 type/value (LK:single_entity_type/_value, EB:SingleEntityType/Value)
refers to the set of explicit attributes that appears in one entity
declaration.
E2 type/value (LK:complex_entity_type/_value?,
EB:PartialEntityType/Value) refers to a collection of one or more E1s.
E3 type (LK:?, EB:EntityType and entity_type) refers to the thing
actually declared by an entity declaration: Per Part 11 9.2, that
includes ALL of the properties (attributes and rules) appearing in the
declaration itself and all of the properties inherited thru SUBTYPE
relationships that appear in the declaration.
E3 instance (LK:entity_instance, EB:EntityInstance) refers to a named
value of an entity data type, i.e. "E3 type", which corresponds to the
view that is presented by the EXPRESS schema of an individual in the
conceptual "entity" class corresponding to the E3 type. (Part 11, 3.3.x)
E4 value (LK:complete_complex_entity_value, EB:EntityValue), subtype of
E2 value, refers to a collection of E1 values that is sufficient to
describe an E3 instance. (As Lothar says, and I agree, this is a
transformation that can occur.) By the definition above, the E4 value
will be sufficient to describe an instance that is an instance of every
entity data type whose E1 value is contained in it.
It is my contention that we need all 7 concepts to explain the EXPRESS
model of entity types and values, in the presence of the "partial entity
constructor", the "group" operator, and the "complex entity
constructor". I don't care what we call them.
It is possible that I misconstrued the intended meaning of Lothar's
symbols. So I use only Part 11 terms, and the En terms, in the
definitions above, so that we can sort this out.
>>But the point is that "complete" is only meaningful *in reference to* a
>>target entity_type.
>
> I can't follow you here.
> "complete" can only be answered for a schema and this decides whether
> we can make an instance out of it.
OK. I agree that the idea "reference_type" is wrong. A
complex_entity_value is "complete" if it meets the criterion above: it
has all the single_entity_values needed for every entity data type (E3
type) that appears in it. And the relationship to the set of entity
data types for which it is complete can be derived from the relationship
to the single_entity_values it contains.
[BTW, this derivation can't be written with an EXPRESS DERIVEd
attribute. It is the set
(s.of_type.declared_by FOR s <* SELF\single_entity_values)
But EXPRESS lacks the LISP 'mapcar' function that does this kind of thing.]
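For comparison, in a language that has a mapping construct the derivation is
a one-line comprehension. This Python sketch takes the attribute names
of_type, declared_by and single_entity_values from the expression above; the
class layout and the property name entity_types are assumptions made only for
illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityType:
    name: str


@dataclass(frozen=True)
class SingleEntityType:
    declared_by: EntityType


@dataclass(frozen=True)
class SingleEntityValue:
    of_type: SingleEntityType
    attributes: tuple


@dataclass(frozen=True)
class ComplexEntityValue:
    single_entity_values: frozenset

    @property
    def entity_types(self):
        """Map each contained single value to the entity type that declares it."""
        return {s.of_type.declared_by for s in self.single_entity_values}


# Example: a value made of one A part and one B part.
a_type, b_type = EntityType("A"), EntityType("B")
value = ComplexEntityValue(frozenset({
    SingleEntityValue(SingleEntityType(a_type), ()),
    SingleEntityValue(SingleEntityType(b_type), (5.0, "cm")),
}))
```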
> The next question is whether this instance can be assigned to a
> variable, parameter or entity-attribute. This is a different question
> and clearly covered by Express already I think.
My point is that, in Part 11 (but not necessarily in Part 22, I
suppose), the only thing you can do with an expression that produces a
"complete_complex_entity_value" is to assign it to something. It is
only when we do that assignment that we care that the value is
"complete". So I am *defining* complete_complex_entity_value ("E4
value") to be a complex_entity_value ("E2 value") that can be converted
to an instance of a *specified* entity data type (not *some* entity data
type).
Upon reflection, we can indeed make the assignment a "different
question". And in that case, we don't need "E4 value"
(complete_complex_entity_value) at all! We could instead write, as Part
11 does, a rule for the conversion operator, e.g.
"The "entity-value" operand of the (true) "entity instance
constructor" shall be a complex_entity_value that <satisfies the
definition above>."
That constructor produces an entity instance that may be a valid
instance of many entity data types, and whether that instance can be
assigned to a given target is another matter. The only problem is that
it may not be a *valid* instance, because it doesn't satisfy some local
or global rule, but I think that part is now "schema-dependent" in the
larger sense.
But we note that both Lothar and I thought it useful to create a term
and model that relationship.
> ...
>>For these reasons, I don't see the need to have complete_ and partial_
>>complex_entity_types, or partial_complex_entity_values.
>
> Yes - get rid of those
OK. We agree on this part.
>>...
>>The other addition I made is to the relationship between
>>single_entity_type and complex_entity_type (and the same for _value).
>>This relates to the SingleEntityType SUBTYPE OF (PartialEntityType)
>>issue you raise, and I discuss that below.
>
> It is ok that a single_entity_type corresponds to exactly one
> complex_entity_type with the additional rule that the
> complex_entity_type has only one component and this single component
> is THIS single_entity_type.
Exactly. (I thought I said this as well.)
> The attribute equivalent_type on the value level does not make sense I
> think - and this lead me to something else:
Yes. I copied the relationship and forgot to change the text.
It should read "equivalent_value".
> --- start: conceptual versus implementation level ---
> In our Java implementation it is clear that we can have 2 instances of
> Java class EntityValue or 2 instance of Java class ComplexEntityValue
> with the same value. This is necessary to effectively work with this.
> But in a conceptual data model (not one for implementation) we may
> say that an EntityValues and ComplexEntityValues must be unique in
> respect to their contents. This is because there is not way to
> distinguish between two Complex/Single entity values with the same
> contents - they don't have an identity like the entity instance has.
>
> At the end we have to answer if we want to create a conceptual data
> model which will exactly represent the nature of Express or one which
> is suitable for implementation.
> Only when we speak on such a conceptual level a single_entity_value
> has exactly one corresponding complex_entity_value. In a real
> implementation we may have 0, 1 or more.
I think I see the problem.
With respect to values, the EXPRESS-G diagram is correct but incomplete.
What is intended is what the UML diagram shows:
single_entity_value.equivalent_value is unique. For single entity value
B(5.0, "cm"), the equivalent_value is the set {B(5.0, "cm")}, which is a
complex_entity_value. It is unique, because the complex_entity_value is
defined to be a set, and any two sets having exactly the same members
are equal, i.e. the same set. And the INVERSE relationship to
single_entity_value.equivalent_value is SET[0:1]. Each
complex_entity_value that consists of exactly one single_entity_value
has one instance of that inverse relationship, namely to that
single_entity_value; all others have none.
But the single_entity_value B(5.0, "cm") can appear in any number of
different sets, i.e. different complex_entity_values. So the INVERSE
relationship to complex_entity_value.components is SET[1:?].
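The set argument can be illustrated with immutable sets in Python. This is
only a sketch of the conceptual model: SingleEntityValue and equivalent_value
are hypothetical stand-ins with no identity of their own, compared purely by
content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SingleEntityValue:
    """Hypothetical stand-in for a single_entity_value: content only."""
    type_name: str
    attributes: tuple


def equivalent_value(single):
    """The complex value consisting of exactly this one single value."""
    return frozenset({single})


b = SingleEntityValue("B", (5.0, "cm"))
a = SingleEntityValue("A", ("x",))

# Built twice, the one-member set is the same value both times.
once = equivalent_value(b)
again = equivalent_value(SingleEntityValue("B", (5.0, "cm")))

# The same single value is a component of more than one complex value.
containing_b = [once, frozenset({a, b})]
```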
But the problem Lothar raises is this: we have to decide whether we are
talking about the value, or an occurrence of the value. And that is the
difference between the conceptual model and the implementation.
In an implementation, *all* occurrences have identity that is distinct
from value. Every occurrence of B(5.0, "cm") is distinct, and the one
that appears in any given complex_entity_occurrence is different from
the one that appears in any other complex_entity_occurrence. They are
equal in *value*, but their *identities* are distinct. So any
single_entity_occurrence appears in exactly one
complex_entity_occurrence. And that behavior is different from what is
stated for single_entity_value.
Interestingly, this is also true of 5.0. If I have two complex entity
occurrences that contain that same single entity value, there are two
distinct occurrences of B(5.0, "cm"), and there are also two distinct
occurrences of 5.0. In the implementation, the two occurrences of 5.0
also have distinct identities. And in the implementation, the
expression "v->value_part == 5.0" involves another occurrence of 5.0
that has an identity different from both of the occurrences in the
single_entity_occurrences. (If you look at the hardware implementation,
the machine address of the comparand 5.0 is different from the machine
address for the component of the struct.)
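The distinction between equality of value and identity of occurrence can be
sketched in Python, used here only for illustration. The class BOccurrence is
hypothetical and is not taken from either implementation under discussion.

```python
from dataclasses import dataclass


@dataclass
class BOccurrence:
    """One occurrence of a B value; each object has its own identity."""
    value_part: float
    unit: str


first = BOccurrence(5.0, "cm")
second = BOccurrence(5.0, "cm")

# Equal in value: the generated __eq__ compares the fields.
same_value = first == second

# Distinct in identity: two separate objects were created.
same_identity = first is second
```

Here same_value is true and same_identity is false, which is the situation
described for two occurrences of B(5.0, "cm").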
Now, Java gives us a syntax for referring to the Value 5.0, and
different syntax for referring to some of its Occurrences. Java gives
us NO syntax for referring to the Value B(5.0, "cm"); it only provides a
syntax for referring to Occurrences of it.
In my mind, *that* is the difference between the conceptual model and
the implementation model *for Java*. The implementation model deals
with the limitations of the language in manipulating the concepts, and
it does that by casting the concepts into constructs the language can
support. The fact that the Java support for REAL values and for B
values is different causes the models to be the same in some places and
different in others.
If you implement in LISP, you get a different collection of curious
differences between the conceptual and implementation models.
And it is this last observation that tells me that "we" (SC4, OMG, et
al.) don't want an "implementation model". Because the one I make for
Java and the one I make for LISP will be different!
>>Unfortunately, it also makes it unnecessary and inconvenient for
>>SingleEntityType to be a data type!
>
> Ups - no!
> The definition says that a data type is a "domain of values". Since we
> have single_entity_values for a single_entity_type this is a data
> type.
But a single_entity_value is not the value of anything. There is no
EXPRESS construct for a single_entity_value, and there is no EXPRESS
operator that is said to produce one. If we put a single_entity_type in
the model as a data type, what would have that data type?
complex_entity_type is only the data type of a few kinds of Expression;
single_entity_type is not the data type of any Expression; it is the
data type of nothing.
So why would we have this data type?
>>The type of a SingleEntityValue is
>>a "partial complex entity data type", as Part 11 says.
>
> Let us find those places and change it
>
>
>>There are two
>>operations that conceptually produce SingleEntityValues, but no
>>operations are defined on SingleEntityValues. Those operations are
>>said to produce a partial complex entity value consisting of one
>>SingleEntityValue. The group operator, the attribute reference, the
>>complex entity constructor are all defined to take operands that are
>>PartialEntityValues (partial complex entity values), and the "implicit"
>>EntityValue to EntityInstance conversion operator takes a partial
>>complex entity value. So if we construct a "single entity data type"
>>that is not a subtype of "partial complex entity data type", it isn't
>>the data type of anything! And if we make SingleEntityValue an instance
>>of "single entity data type", it isn't a valid operand of any operator!
>
> We have to change the text to say that the constructor and
> group_operator (in the case that an entity type is used) results in a
> single entity value. This single entity value is then implicitly
> transformed into a complex entity value and further on into an entity
> instance as needed by the other stuff around.
But the transform always occurs the moment the single_entity_value is
produced, because *every* use of it needs a (partial) complex entity value.
Obviously we could do what Lothar suggests. Do we really want to?
Except for the complex entity constructor definition, the current text
of Part 11 conveys the intent without using the term single entity value
*at all*. (I'm sure that is why TC#2, which introduced the term, didn't
make all of these changes.)
We agree to use single_entity_value to designate a piece of a
complex_entity_value, so that we can describe the entity constructors
carefully, but we don't need it to be an Instance/Value to do that. An
indexed_element of an ARRAY value isn't a value; it's just a piece of one.
Why is it important to have single_entity_value be a value, and
single_entity_type be a data type? Is it just because we called them
xxx_value and xxx_type? Or is it because we need both ideas, and they
relate to each other in the type/value way?
> We should try to minimise the number of changes.
I agree. I think we now pretty much agree on the model, except for a
couple of things in which we have choices that don't affect the meaning.
-Ed
--
Edward J. Barkmeyer Email: edbark at nist.gov
National Institute of Standards & Technology
Manufacturing Systems Integration Division
100 Bureau Drive, Stop 8264 Tel: +1 301-975-3528
Gaithersburg, MD 20899-8264 FAX: +1 301-975-4694
"The opinions expressed above do not reflect consensus of NIST,
and have not been reviewed by any Government authority."