[wg11] Re: Conflict between the goals of EXPRESS modelers and AP
developers
Ed Barkmeyer
edbark at nist.gov
Fri Jun 18 19:17:59 EDT 2004
David Price wrote:
> Secondly, it misses the point about what technology is nice to have
> available vs. what needs to be an ISO standard.
I am still struggling with that question: What needs to be in the standard?
So I started from this question: What is the motivation for Part 28 in
the first place?
(1) There is a need for a standard XML representation of data modeled in
EXPRESS in ISO standards. An XML representation is useful for the
following reasons:
- the standards-conforming data elements can be processed using XML
encode/decode libraries available in your Java, VBasic, etc.,
programming toolkits;
- the standards-conforming documents can be displayed by browsers with
style sheets and other specialized XML tools;
- the standards-conforming documents can be transformed to other XML
forms using XSLT;
- the standards-conforming data sets can be embedded in e-business
transaction messages and other XML documents, without requiring
special-purpose decoders.
Using XML provides access to more and better tools, books and training,
and a large body of educated programmers and analysts.
All of these results accrue from defining a mapping from EXPRESS models
to XML formation rules, rather like Part 21. There is no direct need
for either DTD models or XML Schema models.
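To illustrate the point that generic XML libraries can process such data with no DTD or schema at hand, here is a minimal Python sketch using only the standard library. The element and attribute names (`data`, `product`, `id`, `name`, `description`) are invented for illustration and are not taken from Part 28.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML rendering of one EXPRESS entity instance.
# The element and attribute names are illustrative only, not Part 28 output.
DOCUMENT = """
<data>
  <product id="i1">
    <name>bracket</name>
    <description>mounting bracket</description>
  </product>
</data>
"""


def read_products(text):
    """Return {instance id: {attribute name: value}} for every <product>."""
    root = ET.fromstring(text)
    products = {}
    for elem in root.iter("product"):
        # Each child element is treated as one simple-valued attribute.
        products[elem.get("id")] = {child.tag: child.text for child in elem}
    return products


PRODUCTS = read_products(DOCUMENT)
```

The parser needs nothing but the document itself; what it cannot tell the programmer is which elements to expect, which is where a schema model helps.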
But programmers expect to see a DTD or XML schema model to tell them
what the standards-conforming data elements will look like. They don't
expect to see an EXPRESS model and a set of rules for rendering the
corresponding XML. So the use of toolkits, style sheets, and XSLT is
greatly improved by providing a DTD or XML schema model for the data.
And DTDs are not adequate for defining elegant models, and they are out
of favor in the perennial IT popularity contests. So:
(2) There is a need for an XML Schema specification of the data elements
and document constructs for a data set that corresponds to an EXPRESS model.
(3) There is a need to validate exchange data sets against the standard.
In general, there is no automated technology that can do this. But
there are several possible interpretations of this statement that can be
automated:
- Validate that the received data can be converted into the internal
model of the target application, using the provided XML schema for the
structure of the data and the programmer's understanding of the semantic
intent (derived from EXPRESS models and text, tutorials, etc.).
- Validate that the received data set satisfies as many of the EXPRESS
rules as your EXPRESS toolkit can actually implement (which is most but
not all), without any notion of what the data means.
- Validate that the received data set satisfies the rules stated in
the XML schema, without any notion of what the data means.
- Validate that the received data set is well-formed XML.
Application tools need to do the first. This is what users need.
Application tool vendors may use tools that do the EXPRESS-based
validation, as a means of debugging their own output routines, and as a
means of minimizing the confusion on input when something ugly is
encountered.
Testbeds and other "meta-work" facilities specific to SC4 work do the
EXPRESS-based validation without regard for the meaning, primarily as a
means of increasing vendor and user confidence in using the standards
and user confidence in the conformance of the tools to the intent of the
standards. (But they can't test conformance to intent this way -- they
can only test "syntactic conformance".)
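The weakest interpretation in the list above, well-formedness, is the only one that needs no model at all. A minimal Python sketch, assuming the standard-library parser and invented element names, shows how little it establishes:

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as XML; says nothing about its content."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Well-formed, although a schema might forbid the repeated <name>.
REPEATED_NAME = "<product><name>a</name><name>b</name></product>"

# Not well-formed: <name> is never closed.
UNCLOSED = "<product><name>a</product>"
```

A document such as `REPEATED_NAME` passes this check even if it violates every rule in the XML schema and the EXPRESS schema, so each stronger interpretation needs a model to check against.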
Much of the target audience for the XML representation does not read
EXPRESS or standards written in it, and they will not do EXPRESS-based
validation, ever. They still need to do the first. And the supporting
tools for debugging, testbeds, etc., for them might reasonably be
expected to do the XML schema validation, since there are available
off-the-shelf tools that do that. So:
(4) It is desirable to have an XML schema that captures the readily
testable constraints stated in the EXPRESS schema. It is not necessary
to have this, but it has value. This level of "validation" cannot be
equivalent to the EXPRESS-based validation:
- All supertype/subtype constraints and most WHERE clauses cannot be
stated in XML schema at all. Those that can require considerable
analysis and interpretation to be mapped to XML schema equivalents.
- Most simple EXPRESS constraints -- bounds on string length, constant
bounds on the sizes of aggregate values -- can be stated in XML schema.
- UNIQUE rules can be stated in XML schema.
- Referential constraints that arise from the XML schema
representation of the EXPRESS notion of "object identity" can be stated
in XML schema.
Many of these constraints use features of XML schema that are not
commonly used in e-business transactions and other XML exchange
standards. They are therefore unfamiliar to many programmers and
modelers who are otherwise "XML literate", and they have less reliable
support in the off-the-shelf tool kits. And the tools that will
validate against EXPRESS models don't need them. Therefore:
(5) It is desirable to have an option, or a conformance class, that
deletes from the derived XML schema all the constraints that use unusual
features of XML schema: complex type restriction, unique, key and keyref.
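As a rough illustration of what the uniqueness and referential constraints amount to, the following Python sketch performs comparable checks by hand on a parsed document. It assumes, hypothetically, that instance identity is carried in an `id` attribute and references in a `ref` attribute; the actual mapping is not stated here, and this is not how an XML schema validator is implemented.

```python
import xml.etree.ElementTree as ET


def check_identity(text, id_attr="id", ref_attr="ref"):
    """Return a list of problems: duplicate ids and dangling references."""
    root = ET.fromstring(text)
    problems = []
    seen = set()
    # First pass: every id value must occur only once (a "unique" style rule).
    for elem in root.iter():
        ident = elem.get(id_attr)
        if ident is not None:
            if ident in seen:
                problems.append("duplicate id: " + ident)
            seen.add(ident)
    # Second pass: every ref must name an existing id (a "keyref" style rule).
    for elem in root.iter():
        ref = elem.get(ref_attr)
        if ref is not None and ref not in seen:
            problems.append("dangling ref: " + ref)
    return problems


GOOD = '<data><org id="i1"/><person id="i2"><employer ref="i1"/></person></data>'
BAD = ('<data><org id="i1"/><org id="i1"/>'
       '<person id="i2"><employer ref="i9"/></person></data>')
```

If a conformance class drops these constraints from the schema, a receiving application that cares about them would have to do something of this kind itself, or rely on an EXPRESS-based tool.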
(6) From an entirely different starting point, it is desirable to be
able to use the XML schema models, with documentation or tutorials, as a
means of introducing SC4 standard models to the much larger community
that is unfamiliar with EXPRESS. This would require the XML schema
models to be in some sense "natural" to folks who are familiar with
hierarchical XML models. So, unlike Part 21:
- the XML schema must permit entity instances to be "contained within"
other entity instances, as well as, or perhaps instead of, pointed to by
"instance name". But since EXPRESS contains no hints for which things
should be contained and which should be pointed to, the best the XML
schema can do is to allow both.
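A consequence of allowing both forms is that every reader has to accept both. The Python sketch below assumes invented element names (`person`, `employer`, `organization`, `name`) and a hypothetical `id`/`ref` attribute convention; it only illustrates the burden on the consumer, not the actual Part 28 encoding.

```python
import xml.etree.ElementTree as ET

# The same information, once by reference and once by containment.
BY_REF = ('<data><organization id="i1"><name>Acme</name></organization>'
          '<person id="i2"><employer ref="i1"/></person></data>')
NESTED = ('<data><person id="i2"><employer>'
          '<organization id="i1"><name>Acme</name></organization>'
          '</employer></person></data>')


def employer_name(text):
    """Return the employer's name for the first person, in either form."""
    root = ET.fromstring(text)
    slot = root.find("person").find("employer")
    ref = slot.get("ref")
    if ref is not None:
        # By-reference form: look up the instance carrying the matching id.
        target = next(e for e in root.iter("organization")
                      if e.get("id") == ref)
    else:
        # Containment form: the instance is nested inside the attribute.
        target = slot.find("organization")
    return target.findtext("name")
```

Both `employer_name(BY_REF)` and `employer_name(NESTED)` yield the same value, but only because the reader was written to handle two shapes for one EXPRESS attribute.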
Because EXPRESS schemas in STEP standards are built around an
architecture that wasn't trying for elegance in the view presented to
the end-user or the implementor, the straightforward mapping to XML
schema will produce somewhat ugly XML schemas, and they will be hard for
XML literati to understand and use. So:
(7) It is desirable to be able to modify or re-organize the EXPRESS
schema to produce a more "accessible" XML schema version for this larger
target community.
But this means that the rote mapping of the EXPRESS schema to the XML
schema will not be the standard one, or not the only standard one. Yet
the standard should allow all users of a given EXPRESS-based standard to
use a common XML schema to reach the larger audience and, more
importantly, should make it clear to the programmer exactly
what the output data must look like and what the input data may look
like. So:
(8) It is required that Part 28 define one standard mapping that
determines a single XML schema as the standard representation of a given
EXPRESS schema. (With respect to item (5) above, it is possible that
there are two versions, both of which define exactly the same XML
structures, but one of them also states additional XML schema validation
constraints.)
This makes it necessary to provide wish-list item (7) OUTSIDE OF Part
28, possibly using EXPRESS-X beforehand, or XSLT after the mapping.
Regrettably, the team of experts that I worked with did not share the
opinion that (8) was a requirement. I understand that many of them
strongly believe in (6) and (7) and are willing to sacrifice (8) to
achieve that. I believe that (8) is the requirement, and (7) is merely
"desirable", and that where sacrifice is necessary it must go the other
way. But in any case, I believe that SC4 needs to make a decision as to
whether Rule (8) is a mandatory requirement.
If (8) is a requirement, the current draft is not even close to meeting
the objectives of Part 28! The draft allows thousands of XML schemas
that produce radically different XML data organizations to all be
conformant with Part 28 for one given EXPRESS schema! If (8) isn't a
requirement, then the current document is probably very close, and the
remaining issue is what we do about problem (5).
I'm sure everyone has his own chain of reasoning. The above is mine.
But IMNSHO, every chain of reasoning must still come to answering the
question: Is it a requirement that a given EXPRESS schema produce one
given conforming XML schema per Part 28 (and therefore one conforming
XML data structure) or not?
Please, please answer that question in your NB comments! And please
realize that any answer other than "YES, it is a requirement" will be
interpreted as "No", because that is the mindset of the developers!
-Ed
P.S. I would also observe that some of the configuration directives and
design decisions cannot be excused by any of the above logic, not even
choosing (7) over (8).
--
Edward J. Barkmeyer Email: edbark at nist.gov
National Institute of Standards & Technology
Manufacturing Systems Integration Division
100 Bureau Drive, Stop 8264 Tel: +1 301-975-3528
Gaithersburg, MD 20899-8264 FAX: +1 301-975-4694
"The opinions expressed above do not reflect consensus of NIST,
and have not been reviewed by any Government authority."