E2 and merging SUBTYPE_CONSTRAINTs

Wed May 8 14:19:16 EDT 2002

"Peter R Wilson" wrote:
> In the body of the DAM it gives a simple explanation of how TOTAL_OVER
> constraints are merged from different SUBTYPE_CONSTRAINTs for the same
> entity, but there is no hint as to how supertype_expressions are merged,
> just a pointer to an Appendix, which does not give a hint either, rather a
> complicated algorithm. It would be most helpful if the body at least
> outlined what is meant to happen.

I agree in spirit with this.  But nowhere in Part 11 does it say how
multiple RULEs for the same entity types are "merged", either.  
Logically they are "ANDed" together, in the sense that all of them
must hold for any given population to be valid.

SUBTYPE_CONSTRAINTs are rules, and they also must all hold.  And if you
rewrite them as LOGICAL constraint expressions, then they can be ANDed
together if that is a concern.  What other kind of "merge" is needed?
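
As a rough illustration of "all of them must hold", here is a small Python
sketch.  It is not taken from Part 11 or the DAM; the Population/Rule model
and the names is_valid, total_over_fm and total_over_ac are assumptions made
up for the example.  Each rule is an independent predicate over the
population, and validity is simply the logical AND of all of them.

```python
from typing import Callable, Dict, List, Set

# Hypothetical model: a population maps each entity type name to the set of
# instance ids that belong to that type.
Population = Dict[str, Set[str]]
Rule = Callable[[Population], bool]


def is_valid(pop: Population, rules: List[Rule]) -> bool:
    """A population is valid only if every rule holds (logical AND)."""
    return all(rule(pop) for rule in rules)


# Two independent constraints on p, each written as a set equation.
total_over_fm: Rule = lambda pop: pop["p"] == pop["f"] | pop["m"]
total_over_ac: Rule = lambda pop: pop["p"] == pop["a"] | pop["c"]

pop = {"p": {"#1", "#2"}, "f": {"#1"}, "m": {"#2"},
       "a": {"#1", "#2"}, "c": set()}
print(is_valid(pop, [total_over_fm, total_over_ac]))  # True
```

No special "merge" operator appears anywhere: adding another constraint just
means adding another predicate to the list.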

The problem with the previous SUPERTYPE clause is that it forced a 
conceptual set of population rules to be written as a single expression
that involved operators that did not have a clear LOGICAL interpretation.
The algorithm in Annex B was an effort to provide a formal definition of
the meaning and precedence of those operators, so that the meaning of
a SUPERTYPE expression could be determined.

SUBTYPE CONSTRAINTs alleviate the need for the single complex expression,
and it is my contention that we can eliminate the complex supertype
expressions altogether (subject of a previous email).

> From reading the DAM I end up mentally replacing TOTAL_OVER by
> AT_LEAST_ONEOF. 

Agreed.

> Looked at in this light, TOTAL_OVER(a, b, c) can be replaced
> by (a ANDOR b ANDOR c). 

Maybe.  TOTAL_OVER(a, b, c) doesn't mean anything without a reference
to the supertype for which the constraint is being declared.  And
 SUBTYPE_CONSTRAINT sc0 for p; TOTAL_OVER (a, b, c); ...
is very definitely *not* the same as:
 ENTITY p SUPERTYPE OF (a ANDOR b ANDOR c); ...
which says exactly the same thing as:
 ENTITY p;

I know Peter disagrees (see below), but I find it easier to understand 
it using RULEs:
  SUBTYPE_CONSTRAINT sc0 for p; TOTAL_OVER (a, b, c); END...
is exactly the same as:
  RULE sc0 FOR (p, a, b, c); WHERE p = a + b + c; END_RULE;
And now I can "merge" it with any other constraint by logical AND.
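
To make the set reading concrete, here is a minimal Python sketch of the
equation p = a + b + c.  The instance model and the helper names extent and
total_over are assumptions for illustration only; in particular, every
subtype instance is assumed to be listed as an instance of p as well.

```python
from typing import Dict, Iterable, Set

# Hypothetical model: each instance id maps to the set of entity types it
# instantiates (a subtype instance is also an instance of its supertype).
Instances = Dict[str, Set[str]]


def extent(instances: Instances, type_name: str) -> Set[str]:
    """All instance ids that are members of the given entity type."""
    return {i for i, types in instances.items() if type_name in types}


def total_over(instances: Instances, supertype: str,
               subtypes: Iterable[str]) -> bool:
    """TOTAL_OVER read as a set equation: supertype = union of subtypes."""
    covered: Set[str] = set()
    for s in subtypes:
        covered |= extent(instances, s)
    return extent(instances, supertype) == covered


covered_pop = {"#1": {"p", "a"}, "#2": {"p", "b", "c"}}
bare_p_pop = {"#1": {"p", "a"}, "#2": {"p"}}  # "#2" is a plain p

print(total_over(covered_pop, "p", ["a", "b", "c"]))  # True
print(total_over(bare_p_pop, "p", ["a", "b", "c"]))   # False
```

The second population contains an instance that is only a p, which the
TOTAL_OVER equation rejects.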

> Doing this, then the DAM says that the two
> SUBTYPE_CONSTRAINTs:
> SUBTYPE_CONSTRAINT sc1 FOR p;
>   TOTAL_OVER(f,m);
> END...
> and
> SUBTYPE_CONSTRAINT sc2 FOR p;
>   TOTAL_OVER(a,c);
> END...
> are merged to:
> SUBTYPE_CONSTRAINT merged1 FOR p;
>   (f ANDOR m) AND (a ANDOR c);
> END...
>  That is TOTAL_OVERs are merged by ANDing them in ED1 terminology.

This is false!  The "merged1" constraint as written permits both
 p and p&b (where b is some other entity declared SUBTYPE OF p)
as evaluated set members per Annex B, whereas the two TOTAL_OVER
constraints do not.  This is the consequence of the oversight indicated 
above.

SUBTYPE_CONSTRAINT merged1 is the same as:
  RULE merged1 FOR (p, a, c, f, m); WHERE f + m = a + c; END_RULE;
and as you can see, it says nothing about p itself.
The TOTAL_OVER constraints say that the two ANDOR expressions are 
equal *because* they are both equal to p!
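
A small Python sketch of the counterexample, using the set equations exactly
as written above.  The instance model and the names extent, merged1_rule and
two_total_overs are assumptions for illustration; this does not implement
the Annex B algorithm.

```python
from typing import Dict, Set

# Hypothetical model: instance id -> set of entity types it instantiates.
Instances = Dict[str, Set[str]]


def extent(instances: Instances, type_name: str) -> Set[str]:
    return {i for i, types in instances.items() if type_name in types}


def merged1_rule(inst: Instances) -> bool:
    # f + m = a + c (says nothing about p itself)
    return (extent(inst, "f") | extent(inst, "m")
            == extent(inst, "a") | extent(inst, "c"))


def two_total_overs(inst: Instances) -> bool:
    # p = f + m  AND  p = a + c
    p = extent(inst, "p")
    return (p == extent(inst, "f") | extent(inst, "m")
            and p == extent(inst, "a") | extent(inst, "c"))


p_and_b = {"#1": {"p", "b"}}  # b is some other subtype of p

print(merged1_rule(p_and_b))     # True: both unions are empty
print(two_total_overs(p_and_b))  # False: "#1" is in p but not in f + m
```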

>     Further, the body of the DAM makes it clear that:
> SUBTYPE_CONSTRAINT sc3 FOR p;
>   <expression>;
>   TOTAL_OVER(f, m);
> END...
> is equivalent to:
> SUBTYPE_CONSTRAINT equiv1 FOR p;
>   <expression> AND (f ANDOR m);
> END...

This also is not true.  Let <expression> be ONEOF(a, b).  Then
 SUBTYPE_CONSTRAINT sc3 FOR p;
   ONEOF(a,b);
   TOTAL_OVER(f, m);
 END...
is the same as:
  RULE sc3 FOR (p, a, b, f, m); WHERE 
   oneof_ab: a * b = 0; 
   total_over_fm: p = f + m;
  END_RULE;
but
 SUBTYPE_CONSTRAINT equiv1 FOR p;
   ONEOF(a,b) AND (f ANDOR m);
 END...
is the same as:
  RULE equiv1 FOR (p, a, b, f, m); WHERE 
   oneof_ab: a * b = 0;
   andrule:  a + b = f + m;
  END_RULE;
which says something quite different!
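
The difference can be checked on tiny populations.  The Python sketch below
encodes the two RULEs above as set predicates; the instance model and the
names extent, sc3 and equiv1 (as Python functions) are assumptions for
illustration only.

```python
from typing import Dict, Set

# Hypothetical model: instance id -> set of entity types it instantiates.
Instances = Dict[str, Set[str]]


def extent(instances: Instances, type_name: str) -> Set[str]:
    return {i for i, types in instances.items() if type_name in types}


def sc3(inst: Instances) -> bool:
    a, b = extent(inst, "a"), extent(inst, "b")
    f, m = extent(inst, "f"), extent(inst, "m")
    # oneof_ab: a * b = 0;  total_over_fm: p = f + m
    return not (a & b) and extent(inst, "p") == f | m


def equiv1(inst: Instances) -> bool:
    a, b = extent(inst, "a"), extent(inst, "b")
    f, m = extent(inst, "f"), extent(inst, "m")
    # oneof_ab: a * b = 0;  andrule: a + b = f + m
    return not (a & b) and a | b == f | m


only_f = {"#1": {"p", "f"}}  # an f that is neither a nor b
bare_p = {"#1": {"p"}}       # a plain p

print(sc3(only_f), equiv1(only_f))  # True False
print(sc3(bare_p), equiv1(bare_p))  # False True
```

Each form accepts a population that the other rejects, so neither implies
the other.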

Like RULEs and *unlike* supertype-constraints, the constraint expressions
separated by semicolons in a SUBTYPE_CONSTRAINT are *independent* rules.

>     It is not clear to me how expressions are meant to be combined.
> SUBTYPE_CONSTRAINT sc4 FOR p;
>   <expression4>;
> END...
> and
> SUBTYPE_CONSTRAINT sc5 FOR p;
>   <expression5>;
> END...
> can be combined as
> SUBTYPE_CONSTRAINT sc45 FOR p;
>   <expression4> ? <expression5>;
> END...
> where ? is some operator. The TOTAL_OVER examples suggest it is "AND", but
> is it? The options seem to be "AND", or "ANDOR" or ";" (the last of which
> begs the point).

And this is just the point.  ";" is exactly right!  The separate subtype
constraint expressions are separate rules for the content of the population. 
They can be combined by *logical AND*.  There is *NO* "supertype expression
operator" that combines them.  

>     My concerns about the above have arisen from two things:
> 
> 1) Given an existing supertype entity, S, with some ONEOF subtypes (A, B)
> specified in a SUBTYPE_CONSTRAINT for S, how to add a new subtype, C, for
> S and additional SUBTYPE_CONSTRAINT for S to make the resulting overall
> constraint ONEOF(A, B, C)?

That is easy:
 ENTITY C SUBTYPE OF S; ...
 SUBTYPE_CONSTRAINT independent_C FOR S;
  ONEOF(A, B, C);
 END...
This extends and does not contradict the existing ONEOF(A,B) constraint.
For the RULE view above, the equivalent is:
  A * B = 0 AND A * C = 0 AND B * C = 0;
And if I "AND" this with the existing rule: A*B = 0, I get:
  (A*B = 0) AND (A * B = 0 AND A * C = 0 AND B * C = 0);
which is redundant but consistent.
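
The same check as a short Python sketch, reading ONEOF as pairwise
disjointness of the subtype populations.  The function name oneof and the
sample sets are assumptions made up for the example.

```python
from itertools import combinations
from typing import Sequence, Set


def oneof(extents: Sequence[Set[str]]) -> bool:
    """ONEOF read as: every pair of populations has an empty intersection."""
    return all(not (x & y) for x, y in combinations(extents, 2))


A, B, C = {"#1"}, {"#2"}, {"#3"}

# Existing rule AND new rule: redundant but consistent.
print(oneof([A, B]) and oneof([A, B, C]))  # True

# An instance that is both A and C violates only the new constraint.
C_overlapping = {"#1", "#3"}
print(oneof([A, B]), oneof([A, B, C_overlapping]))  # True False
```

Any population that satisfies the three-way check also satisfies the
original two-way one, since the pair (A, B) is among the pairs tested.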

> 2) The proposed algorithm ("Guidance for generating long-form EXPRESS
> schemas for all editions of EXPRESS", Feeney, Haenisch, Price and Wasmer,
> 2001/08/27) for short to long forms includes an Ed2 to Ed1 algorithm, which
> suggests creating a RULE to represent a TOTAL_OVER. As I am in principle
> against RULEs except under dire circumstances I was looking for a non-RULE
> method of mapping TOTAL_OVERs. I think that mapping them as a regular (a
> ANDOR ...) constraint does the job better, or am I missing something?

"mapping them as a regular constraint" does the job *wrong*!  
The NIST position (adequately supported by the Feeney et al. guidance cited
above) is that RULEs have the advantage of clear mathematical interpretation.
The problem with RULEs in EXPRESS is that they can contain procedural code, in
addition to static mathematical expressions.  But when you are describing the
relationships between sets (which is what all supertype/subtype constraints
do), it is difficult to improve on the established mathematical operators:
equal, intersect, union, difference.

-Ed

-- 
Edward J. Barkmeyer                       Email: edbark at nist.gov
National Institute of Standards & Technology
Manufacturing Systems Integration Division
100 Bureau Drive, Mail Stop 8260          Tel: +1 301-975-3528
Gaithersburg, MD 20899-8260               FAX: +1 301-975-4482

"The opinions expressed above do not reflect consensus of NIST,
and have not been reviewed by any Government authority."