Friday, November 21, 2008

XML Namespaces

When I was first learning XML namespaces, one of the things that confused me most was the use of URLs (e.g. xmlns:sca="http://docs.oasis-open.org/ns/opencsa/sca/200712"). For some reason, I was convinced that this is how an XML document pointed to its XML schema for validation. It took me a while to realize that the fact that the namespace is [usually] a URL makes NO DIFFERENCE whatsoever to the XML document. "anyStringWhatsoever" would work just as well, so long as it is universally unique. In fact, XML namespaces and XML Schema aren't even defined by the same spec (http://www.w3.org/TR/REC-xml-names/ vs http://www.w3.org/TR/xmlschema-1/).
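
To make that concrete, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. It assumes nothing beyond the namespace strings mentioned above; the element name "composite", the "name" attribute, and the helper split_tag are made up for illustration. The parser never fetches anything: the namespace is simply glued onto the element name as an opaque string.

```python
import xml.etree.ElementTree as ET

# Two documents that differ only in the namespace string they declare.
WITH_URL = '<sca:composite xmlns:sca="http://docs.oasis-open.org/ns/opencsa/sca/200712" name="demo"/>'
WITH_ANY_STRING = '<sca:composite xmlns:sca="anyStringWhatsoever" name="demo"/>'


def split_tag(element):
    """Split ElementTree's '{namespace}local' tag into (namespace, local name)."""
    namespace, local_name = element.tag[1:].split("}", 1)
    return namespace, local_name


url_root = ET.fromstring(WITH_URL)
any_root = ET.fromstring(WITH_ANY_STRING)

# Both parse fine; no network access happens in either case.
print(split_tag(url_root))
print(split_tag(any_root))
```

The two roots end up with different qualified names only because the strings differ, not because one of them happens to look like a web address.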

So how do XML instance documents identify their XML Schema?
From www.w3.org/TR/xmlschema-1:

Schema Representation Constraint: Schema Document Location Strategy
Given a namespace name (or none) and (optionally) a URI reference from xsi:schemaLocation or xsi:noNamespaceSchemaLocation, schema-aware processors may implement any combination of the following strategies, in any order:
1 Do nothing, for instance because a schema containing components for the given namespace name is already known to be available, or because it is known in advance that no efforts to locate schema documents will be successful (for example in embedded systems);
2 Based on the location URI, identify an existing schema document, either as a resource which is an XML document or a <schema> element information item, in some local schema repository;
3 Based on the namespace name, identify an existing schema document, either as a resource which is an XML document or a <schema> element information item, in some local schema repository;
4 Attempt to resolve the location URI, to locate a resource on the web which is or contains or references a <schema> element;
5 Attempt to resolve the namespace name to locate such a resource.
Whenever possible configuration and/or invocation options for selecting and/or ordering the implemented strategies should be provided.
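
The xsi:schemaLocation attribute mentioned in that passage is a whitespace-separated list of namespace/location pairs, and it is only a hint. The following Python sketch shows one way a processor might read those hints and try strategies 2 and 3 against a local repository. The file name sca-core.xsd, the dictionaries standing in for a "local schema repository", and both function names are assumptions for illustration. It also only looks at the root element, whereas the attribute may appear on other elements too.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"
SCA = "http://docs.oasis-open.org/ns/opencsa/sca/200712"

# Hypothetical instance document; sca-core.xsd is a made-up location hint.
DOC = """<sca:composite
    xmlns:sca="http://docs.oasis-open.org/ns/opencsa/sca/200712"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://docs.oasis-open.org/ns/opencsa/sca/200712 sca-core.xsd"/>"""


def schema_location_hints(xml_text):
    """Return ({namespace: location}, noNamespaceSchemaLocation) read from the root element."""
    root = ET.fromstring(xml_text)
    tokens = root.get(f"{{{XSI}}}schemaLocation", "").split()
    hints = dict(zip(tokens[0::2], tokens[1::2]))  # pair up namespace, location
    return hints, root.get(f"{{{XSI}}}noNamespaceSchemaLocation")


def locate_schema(namespace, hints, by_location, by_namespace):
    """Try strategy 2, then strategy 3; fall back to strategy 1 (do nothing)."""
    location = hints.get(namespace)
    if location in by_location:        # strategy 2: local lookup by location URI
        return by_location[location]
    if namespace in by_namespace:      # strategy 3: local lookup by namespace name
        return by_namespace[namespace]
    return None                        # strategies 4 and 5 would need the network


hints, no_namespace_hint = schema_location_hints(DOC)
print(hints)
```

The order used here is just one choice; the spec explicitly allows any combination in any order.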


A couple of observations: First of all, why do they use an ordered list here if the order is not supposed to matter? Secondly, at least my preconceived notion of the link between namespaces and schema made the list (coming in at #5).

So why is it that just about every namespace in use is defined by a URL?

http://www.w3.org/TR/uri-clarification/

The XML Schema spec actually defines the namespace string to be a "Uniform Resource Identifier" (URI). A URI is then further broken into 2 types: Uniform Resource Locator (URL) and Uniform Resource Name (URN). The reason most (certainly not all) schemas use URLs is that they are dereferenceable (i.e. you can go there). Basically, a dereferenceable URI gives us 2 main benefits:
1) Creating UUIDs is hard...especially UUIDs that have a shot of being meaningful/informative to human readers. The web's address system (IP, DNS, and whatnot) is a scalable, performant, and proven solution in this space.
2) Providing a URL gives the document reader a place (docs.oasis-open.org/ns/opencsa/sca/200712) and method (http) to check for more information...or even look for a schema (e.g. #5 above).
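
The "place and method" split can be seen by taking a namespace URI apart with Python's urllib.parse. This is only a rough sketch: the urn:example:my-namespace string and the looks_dereferenceable helper are made up for illustration, and treating "http/https scheme plus a host" as dereferenceable is a simplifying assumption, not a rule from any spec.

```python
from urllib.parse import urlparse


def looks_dereferenceable(uri):
    """Rough heuristic: an http(s) scheme plus a host gives a method and a place."""
    parts = urlparse(uri)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


url_style = urlparse("http://docs.oasis-open.org/ns/opencsa/sca/200712")
urn_style = urlparse("urn:example:my-namespace")  # hypothetical URN-style namespace

# The URL carries a method (scheme) and a place (host + path).
print(url_style.scheme, url_style.netloc, url_style.path)
# The URN is just a name: it has a scheme but no host to visit.
print(urn_style.scheme, repr(urn_style.netloc), urn_style.path)
```

Both strings are equally valid as namespace names; only the first one gives a curious reader somewhere to go.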

For this reason, if you are defining your own namespace, please use a URL, preferably in a domain you control, and pretty please put a page up at that location to describe the namespace (links or text describing what it is and what it's intended to be used for). Dumb users like me will probably go there to find the schema.

http://www.xml.com/pub/a/2005/04/13/namespace-uris.html

Thursday, November 6, 2008

Mylyn Tip

Use lots of workspaces? Tired of creating Mylyn queries for each workspace? Try using a shared Mylyn folder by setting the Data directory under the Advanced section of the Tasks preference page.




More fun with p2

It's been a few months since I performed my pooled Eclipse Ganymede installations and since then the Eclipse train has continued to roll with the 3.4.1 Maintenance Release as well as the first three milestones of the 3.5 stream (Galileo).

One of the touted benefits of the new provisioning system is that it allows users to upgrade from one Eclipse release to another (in place). However, I had a few problems with my work install (it was left in a seemingly endless cycle of unresolved dependencies), so I decided to see if I could recreate them in a more controlled environment (and document that process here, of course).

When I tried to upgrade from my Ganymede install to the latest 3.5M3, I was hit with the following error:

Cannot complete the request. See the details.
Eclipse SDK is already installed, so an update will be performed instead.
Cannot find a solution satisfying the following requirements Match[requiredCapability: org.eclipse.equinox.p2.iu/org.eclipse.ltk.core.refactoring/[3.4.100.v20080806-1800,3.4.100.v20080806-1800]].
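
For what it's worth, the bracketed pair at the end of that requirement is a version range with inclusive bounds, and since both bounds are identical it can only be satisfied by exactly that one version. Here is a minimal Python sketch of that kind of matching, assuming OSGi-style major.minor.micro.qualifier versions with the qualifier compared as a plain string. The helper names are made up for illustration, and this is not p2's actual code.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    micro: int
    qualifier: str  # compared as a plain string after the numeric parts


def parse_version(text):
    """Parse 'major[.minor[.micro[.qualifier]]]' into a comparable tuple."""
    parts = text.strip().split(".", 3)
    numbers = [int(p) for p in parts[:3]]
    while len(numbers) < 3:
        numbers.append(0)
    qualifier = parts[3] if len(parts) == 4 else ""
    return Version(numbers[0], numbers[1], numbers[2], qualifier)


def in_range(version_text, range_text):
    """'[' and ']' are inclusive bounds, '(' and ')' are exclusive bounds."""
    range_text = range_text.strip()
    low_text, high_text = range_text[1:-1].split(",")
    low, high = parse_version(low_text), parse_version(high_text)
    version = parse_version(version_text)
    above = version >= low if range_text[0] == "[" else version > low
    below = version <= high if range_text[-1] == "]" else version < high
    return above and below


REQUIRED = "[3.4.100.v20080806-1800,3.4.100.v20080806-1800]"
print(in_range("3.4.100.v20080806-1800", REQUIRED))
```

Any other build of org.eclipse.ltk.core.refactoring, even one with the same major.minor.micro numbers but a different qualifier, falls outside such a range.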

When I tried installing the same IU to the other (minimal) Eclipse 3.4 instance, I received no errors/warnings but hit the following error during downloads:

An error occurred while collecting items to be installed
No repository found containing: org.eclipse.ant.core/osgi.bundle/3.2.100.v20080721
A quick retry seemed to get past this one and the update quickly finished without error.