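Reading the namespaces from the document and caching them
The next version of the NamespaceContext is somewhat better. It reads the namespaces only once, up front, in the constructor. Every later lookup of a namespace is answered from the cache. As a result, subsequent changes to the document have no effect, because the list of namespaces is cached when the Java object is created.
Listing 10. Caching the namespace resolution from the document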
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;

import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class UniversalNamespaceCache implements NamespaceContext {
    private static final String DEFAULT_NS = "DEFAULT";
    private final Map<String, String> prefix2Uri = new HashMap<String, String>();
    private final Map<String, String> uri2Prefix = new HashMap<String, String>();

    /**
     * This constructor parses the document and stores all namespaces it can
     * find. If toplevelOnly is true, only namespaces in the root are used.
     *
     * @param document
     *            source document
     * @param toplevelOnly
     *            restriction of the search to enhance performance
     */
    public UniversalNamespaceCache(Document document, boolean toplevelOnly) {
        // getDocumentElement() yields the root element even if a comment or
        // processing instruction precedes it (getFirstChild() would not)
        examineNode(document.getDocumentElement(), toplevelOnly);
        System.out.println("The list of the cached namespaces:");
        for (Map.Entry<String, String> entry : prefix2Uri.entrySet()) {
            System.out.println("prefix " + entry.getKey() + ": uri "
                    + entry.getValue());
        }
    }

    /**
     * A single node is read, the namespace attributes are extracted and stored.
     *
     * @param node
     *            to examine
     * @param attributesOnly
     *            if true no recursion happens
     */
    private void examineNode(Node node, boolean attributesOnly) {
        NamedNodeMap attributes = node.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            storeAttribute((Attr) attribute);
        }
        if (!attributesOnly) {
            NodeList children = node.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // only elements can carry namespace declarations
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    examineNode(child, false);
                }
            }
        }
    }

    /**
     * This method looks at an attribute and stores it, if it is a namespace
     * attribute.
     *
     * @param attribute
     *            to examine
     */
    private void storeAttribute(Attr attribute) {
        // examine the attributes in namespace xmlns
        if (attribute.getNamespaceURI() != null
                && attribute.getNamespaceURI().equals(
                        XMLConstants.XMLNS_ATTRIBUTE_NS_URI)) {
            // Default namespace xmlns="uri goes here"
            if (attribute.getNodeName().equals(XMLConstants.XMLNS_ATTRIBUTE)) {
                putInCache(DEFAULT_NS, attribute.getNodeValue());
            } else {
                // The defined prefixes are stored here
                putInCache(attribute.getLocalName(), attribute.getNodeValue());
            }
        }
    }

    private void putInCache(String prefix, String uri) {
        prefix2Uri.put(prefix, uri);
        uri2Prefix.put(uri, prefix);
    }

    /**
     * This method is called by XPath. It returns the default namespace, if the
     * prefix is null or "".
     *
     * @param prefix
     *            to search for
     * @return uri
     */
    public String getNamespaceURI(String prefix) {
        if (prefix == null || prefix.equals(XMLConstants.DEFAULT_NS_PREFIX)) {
            return prefix2Uri.get(DEFAULT_NS);
        } else {
            return prefix2Uri.get(prefix);
        }
    }

    /**
     * This method is not needed in this context, but can be implemented in a
     * similar way.
     */
    public String getPrefix(String namespaceURI) {
        return uri2Prefix.get(namespaceURI);
    }

    public Iterator<String> getPrefixes(String namespaceURI) {
        // Not implemented
        return null;
    }
}
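Note that the code contains a debug output. The attributes of each examined node are checked and stored. The child nodes, however, are not examined, because the boolean toplevelOnly passed to the constructor is set to true. If it is set to false, the children are examined once the attributes have been stored. One more point about this code: in DOM, the first node represents the document as a whole, so to reach the root element, whose attributes carry the namespace declarations, the code has to step down to a child exactly once.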
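The top-level lookup described above can be tried in isolation. The following is a minimal sketch, not the article's code: the class name RootNamespaceDemo, its helper methods, and the inline sample document are assumptions made for illustration (only the namespace URIs are taken from the listings). It also illustrates a precondition the cache relies on: the xmlns declarations are only reported in the XMLNS namespace when the parser is namespace-aware.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the article's sample code.
class RootNamespaceDemo {

    // Made-up sample document that reuses the namespace URIs from the listings.
    static final String SAMPLE =
        "<books:booklist xmlns:books=\"http://univNaSpResolver/booklist\""
        + " xmlns:fiction=\"http://univNaSpResolver/fictionbook\""
        + " xmlns=\"http://univNaSpResolver/book\">"
        + "<science:book xmlns:science=\"http://univNaSpResolver/sciencebook\"/>"
        + "</books:booklist>";

    /** Parses the XML string; checked exceptions are wrapped to keep the demo short. */
    static Document parse(String xml, boolean namespaceAware) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(namespaceAware);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
    }

    /** Collects only the namespace declarations found on the root element. */
    static Map<String, String> rootNamespaces(Document document) {
        Map<String, String> result = new LinkedHashMap<String, String>();
        // Step down exactly once: from the document node to its root element.
        Element root = document.getDocumentElement();
        NamedNodeMap attributes = root.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Attr attribute = (Attr) attributes.item(i);
            // Keep only attributes that live in the xmlns namespace.
            if (XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(attribute.getNamespaceURI())) {
                String prefix = XMLConstants.XMLNS_ATTRIBUTE.equals(attribute.getNodeName())
                        ? "DEFAULT" : attribute.getLocalName();
                result.put(prefix, attribute.getNodeValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("namespace-aware:   " + rootNamespaces(parse(SAMPLE, true)));
        System.out.println("namespace-unaware: " + rootNamespaces(parse(SAMPLE, false)));
    }
}
```

In this sketch the science prefix never shows up, because it is declared on a child element rather than on the root. With a namespace-unaware parser the attributes carry no namespace URI, so nothing is collected at all.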
Using UniversalNamespaceCache is then straightforward:
Listing 11. Example 3 with cached namespace resolution (top level only)
private static void example3(Document example)
        throws XPathExpressionException, TransformerException {
    sysout("\n*** Third example - namespaces of toplevel node cached ***");
    XPath xPath = XPathFactory.newInstance().newXPath();
    xPath.setNamespaceContext(new UniversalNamespaceCache(example, true));
    try {
        ...
        NodeList result1 = (NodeList) xPath.evaluate(
                "books:booklist/science:book", example,
                XPathConstants.NODESET);
        ...
    } catch (XPathExpressionException e) {
        ...
    }
    ...
    NodeList result2 = (NodeList) xPath.evaluate(
            "books:booklist/fiction:book", example, XPathConstants.NODESET);
    ...
    String result = xPath.evaluate(
            "books:booklist/fiction:book[1]/:author", example);
    ...
}
This produces the following output:
Listing 12. Output of example 3
*** Third example - namespaces of toplevel node cached ***
The list of the cached namespaces:
prefix DEFAULT: uri http://univNaSpResolver/book
prefix fiction: uri http://univNaSpResolver/fictionbook
prefix books: uri http://univNaSpResolver/booklist
Try to use the science prefix:
--> books:booklist/science:book
The cache only knows namespaces of the first level!
The fiction namespace is such a namespace:
--> books:booklist/fiction:book
Number of Nodes: 2
<?xml version="1.0" encoding="UTF-8"?>
<fiction:book xmlns:fiction="http://univNaSpResolver/fictionbook">
<title xmlns="http://univNaSpResolver/book">Faust I</title>
<author xmlns="http://univNaSpResolver/book">Johann Wolfgang von Goethe</author>
</fiction:book>
<?xml version="1.0" encoding="UTF-8"?>
<fiction:book xmlns:fiction="http://univNaSpResolver/fictionbook">
<title xmlns="http://univNaSpResolver/book">Faust II</title>
<author xmlns="http://univNaSpResolver/book">Johann Wolfgang von Goethe</author>
</fiction:book>
The default namespace works also:
--> books:booklist/fiction:book[1]/:author
Johann Wolfgang von Goethe
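The code above found only the namespaces of the root element. More precisely: only the namespaces of the node that the constructor passes to the method examineNode are cached. This speeds up the constructor, because it does not need to iterate over the whole document. However, as the output shows, the science prefix cannot be resolved, and the XPath expression results in an exception (XPathExpressionException).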
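The failure mode can be reproduced without the full cache class. The sketch below is illustrative only: UnknownPrefixDemo, topLevelContext, countMatches, and the inline sample document are hypothetical names and data, and the fixed map merely stands in for a cache that saw only the root element. It assumes the XPath implementation bundled with the JDK, which rejects an expression whose prefix the NamespaceContext cannot resolve.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; not part of the article's sample code.
class UnknownPrefixDemo {

    // Made-up document: science is declared on a child, not on the root.
    static final String SAMPLE =
        "<books:booklist xmlns:books=\"http://univNaSpResolver/booklist\""
        + " xmlns:fiction=\"http://univNaSpResolver/fictionbook\">"
        + "<fiction:book/><fiction:book/>"
        + "<science:book xmlns:science=\"http://univNaSpResolver/sciencebook\"/>"
        + "</books:booklist>";

    /** Stand-in for a top-level-only cache: knows just the root's prefixes. */
    static NamespaceContext topLevelContext() {
        final Map<String, String> prefix2Uri = new HashMap<String, String>();
        prefix2Uri.put("books", "http://univNaSpResolver/booklist");
        prefix2Uri.put("fiction", "http://univNaSpResolver/fictionbook");
        return new NamespaceContext() {
            public String getNamespaceURI(String prefix) {
                return prefix2Uri.get(prefix); // null for unknown prefixes
            }
            public String getPrefix(String namespaceURI) {
                return null; // not needed for XPath evaluation
            }
            public Iterator<String> getPrefixes(String namespaceURI) {
                return null; // not needed for XPath evaluation
            }
        };
    }

    /** Returns the number of matching nodes, or -1 if XPath rejects the expression. */
    static int countMatches(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document document = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(SAMPLE)));
            XPath xPath = XPathFactory.newInstance().newXPath();
            xPath.setNamespaceContext(topLevelContext());
            NodeList nodes = (NodeList) xPath.evaluate(
                    expression, document, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            // The prefix could not be resolved by the namespace context.
            return -1;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("fiction: " + countMatches("books:booklist/fiction:book"));
        System.out.println("science: " + countMatches("books:booklist/science:book"));
    }
}
```

Catching XPathExpressionException at the call site, as example3 does, keeps one unresolved prefix from aborting the remaining queries. The trade-off is the one described above: a faster constructor in exchange for prefixes that are declared deeper in the document being unknown.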