The XML2RDF mapping is part of the ReDeFer project. It moves metadata from the XML world to the Semantic Web in a transparent way: XML instances are mapped to semantically enriched RDF instances. The semantics are those previously made explicit by mapping the XSDs involved to OWL with the XSD2OWL tool.
For instance, it is possible to perform semantic queries on the resulting RDF that take into account the semantics of the substitutionGroup. If we use XQuery to retrieve MPEG-7 SegmentType descriptions from an XML database of MPEG-7 metadata, we must be aware of the hierarchy of segment types and write an XQuery that covers every kind of multimedia segment, i.e. VideoSegmentType, AnalyticClipType, AudioSegmentType, etc. Once the hierarchy of segment types is available in Web Ontology Language (OWL) form, semantic queries benefit from the now explicit semantics. A semantic query for SegmentType therefore retrieves instances of all its subclasses without additional development effort.
This is necessary because, although XML Schemas capture some of the semantics of the domain they model, XML tools are based on syntax. The captured semantics remain implicit from the point of view of XML processing tools. Therefore, when an XQuery searches for a SegmentType, the XQuery processor has no way of knowing that many other kinds of segment types can appear in its place, i.e. types that are more concrete kinds of segments.
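The difference between a syntactic and a semantic query can be sketched in a few lines of Python. This is only an illustration, not the ReDeFer implementation: the subclass table, the instance identifiers (seg1, agent1, ...) and the function names are assumptions made for the example, and the real hierarchy would come from the ontology generated by XSD2OWL.

```python
# Illustrative rdfs:subClassOf statements (child -> parent).
# Assumption: a flat hierarchy, used only for this example.
SUBCLASS_OF = {
    "VideoSegmentType": "SegmentType",
    "AudioSegmentType": "SegmentType",
    "AnalyticClipType": "SegmentType",
}

# Hypothetical rdf:type statements (instance -> class).
INSTANCE_TYPE = {
    "seg1": "VideoSegmentType",
    "seg2": "AudioSegmentType",
    "seg3": "AnalyticClipType",
    "seg4": "SegmentType",
    "agent1": "PersonType",
}


def is_subclass_of(cls, ancestor):
    """Follow subclass links upwards from cls until ancestor is reached."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def syntactic_query(type_name):
    """Match only instances whose declared type is exactly type_name."""
    return sorted(i for i, c in INSTANCE_TYPE.items() if c == type_name)


def semantic_query(type_name):
    """Match instances of type_name and of any of its subclasses."""
    return sorted(
        i for i, c in INSTANCE_TYPE.items() if is_subclass_of(c, type_name)
    )


if __name__ == "__main__":
    print(syntactic_query("SegmentType"))
    print(semantic_query("SegmentType"))
```

The syntactic query only sees the instance declared exactly as SegmentType, whereas the semantic query also returns the video, audio and analytic clip segments because it consults the subclass relation.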
The XML2RDF mapping can be tested on-line in the ReDeFer project web page.
Once all the metadata XML Schemas are available as mapped OWL ontologies, it is time to map the XML metadata that instantiates them. The intention is to produce RDF metadata as transparently as possible. Therefore, a structure-mapping approach has been selected. It is also possible to take a model-mapping approach. XML model-mapping is based on representing the XML information set using semantic tools. This approach is better when XML metadata is semantically exploited for concrete purposes. However, when the objective is semantic metadata that can be easily integrated, it is better to take a more transparent approach.
Transparency is achieved in structure-mapping models because they only try to represent the structure of the XML metadata, i.e. a tree, using RDF. The RDF model is based on a graph, so it is easy to model a tree with it. Moreover, we do not need to worry about the loss of semantics caused by structure-mapping: we have formalised the underlying semantics in the corresponding ontologies, and we attach them to the RDF metadata using the instantiation relation rdf:type.
Structure-mapping translates XML metadata instances into RDF instances that instantiate the corresponding OWL constructs. The most basic translation is between relation instances: xsd:element and xsd:attribute instances become rdf:Property instances. Concretely, they become owl:ObjectProperty instances for node-to-node relations and owl:DatatypeProperty instances for node-to-value relations. In some cases, however, a plain rdf:Property is needed, namely for an xsd:element that has both datatype and object values. Values are kept as simple types during the translation, and RDF blank nodes are introduced to serve as the source and destination of properties. These nodes remain blank until they are enriched with semantic information. The state of the mapping at this point is exemplified in Fig. 1.
Fig. 1. XML tree and resulting RDF graph models
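As a rough illustration of the structure-mapping step, the following Python sketch walks an XML tree and emits one triple per element or attribute, introducing a blank node for every element that has children or attributes. It is a minimal sketch under stated assumptions, not the XML2RDF code: the sample document, the function names and the "_:bN" blank node labels are invented for the example, namespaces are not handled, and text inside complex elements is ignored.

```python
import itertools
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
SAMPLE_XML = """
<VideoSegment id="seg1">
  <MediaTime>
    <MediaDuration>PT30S</MediaDuration>
  </MediaTime>
  <Title>Intro</Title>
</VideoSegment>
"""


def literal(value):
    """Render a simple value as a quoted literal."""
    return f'"{value}"'


def xml_to_triples(xml_text):
    """Return (subject, predicate, object) triples mirroring the XML tree."""
    ids = itertools.count(1)
    triples = []

    def new_blank():
        return f"_:b{next(ids)}"

    def emit(element, subject):
        # Leaf element without attributes: node-to-value relation.
        if len(element) == 0 and not element.attrib:
            text = (element.text or "").strip()
            triples.append((subject, element.tag, literal(text)))
            return
        # Complex element: node-to-node relation to a fresh blank node.
        node = new_blank()
        triples.append((subject, element.tag, node))
        # Attributes always carry simple values.
        for name, value in element.attrib.items():
            triples.append((node, name, literal(value)))
        for child in element:
            emit(child, node)

    # The first blank node stands for the document itself.
    emit(ET.fromstring(xml_text), new_blank())
    return triples


if __name__ == "__main__":
    for triple in xml_to_triples(SAMPLE_XML):
        print(triple)
```

At this stage the blank nodes carry no type information; they only preserve the shape of the tree.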
The resulting RDF graph model contains everything that can be obtained from the XML tree. It is already semantically enriched thanks to the rdf:type relation that connects each RDF property to the owl:ObjectProperty or owl:DatatypeProperty it instantiates. It can be enriched further if the blank nodes are related to the owl:Class that defines the package of properties and associated restrictions they contain. This semantic decoration of the graph is formalised using rdf:type relations from the blank nodes to the corresponding OWL classes.
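The semantic decoration described above can be sketched as a second pass over the triples that adds rdf:type statements. This is an assumption-laden illustration rather than the actual tool: the two lookup tables (property_kind and property_range) are hypothetical stand-ins for information that would be read from the ontologies produced by XSD2OWL, and the prefixed names are plain strings.

```python
RDF_TYPE = "rdf:type"


def decorate(triples, property_kind, property_range):
    """Return the triples plus rdf:type statements for properties and nodes.

    property_kind maps a property name to owl:ObjectProperty or
    owl:DatatypeProperty; property_range maps a property name to the
    class of the blank nodes it points to (both tables are hypothetical).
    """
    added = []
    seen = set()
    for _subject, predicate, obj in triples:
        statements = []
        if predicate in property_kind:
            statements.append((predicate, RDF_TYPE, property_kind[predicate]))
        # Only blank nodes are typed with a class; literals are left alone.
        if obj.startswith("_:") and predicate in property_range:
            statements.append((obj, RDF_TYPE, property_range[predicate]))
        for statement in statements:
            if statement not in seen:
                seen.add(statement)
                added.append(statement)
    return list(triples) + added


if __name__ == "__main__":
    structure = [
        ("_:b1", "VideoSegment", "_:b2"),
        ("_:b2", "Title", '"Intro"'),
    ]
    kinds = {
        "VideoSegment": "owl:ObjectProperty",
        "Title": "owl:DatatypeProperty",
    }
    ranges = {"VideoSegment": "VideoSegmentType"}
    for triple in decorate(structure, kinds, ranges):
        print(triple)
```

Keeping the decoration as a separate pass mirrors the text: the structural graph is complete on its own, and the rdf:type statements are layered on top of it.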