This repository has been archived by the owner on Jul 1, 2024. It is now read-only.
forked from elharo/xom
Todo.txt
Todo:
1.4:
---
digital signatures?
encryption?
1.3:
-----
DTD API?
catalog support????
RELAX NG?
add an XPath based NodeFactory that only instantiates nodes that
satisfy a certain pattern
A util package with the most useful samples and contribs
Do Serializer and canonicalizer need setOutputStream methods?
Should Builder have an option to cache all external entities so it doesn't keep reloading them, perhaps via a SAX EntityResolver?
A getAttributeValueInScope() method that searches up the tree for the nearest
ancestor element with the specified attribute. This would be useful for xml:lang, xml:space, and many other cases!!!!
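A rough sketch of what such a lookup might do. It is written against the JDK's org.w3c.dom API rather than XOM so that it runs without extra jars; the class name, the method signature (a qualified attribute name), and the demo document are all assumptions made for illustration, not XOM API.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class ScopedAttributes {

    // Hypothetical helper: returns the value of the named attribute on the
    // nearest element, starting with the element itself and walking up the
    // ancestors, or null if no element on that path carries the attribute.
    static String getAttributeValueInScope(Element start, String qualifiedName) {
        for (Node n = start; n instanceof Element; n = n.getParentNode()) {
            Element e = (Element) n;
            if (e.hasAttribute(qualifiedName)) {
                return e.getAttribute(qualifiedName);
            }
        }
        return null; // reached the document node without a match
    }

    // Small demo: xml:lang is declared on the root and looked up from a grandchild.
    static String demo() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader("<a xml:lang='fr'><b><c/></b></a>")));
            Element c = (Element) doc.getElementsByTagName("c").item(0);
            return getAttributeValueInScope(c, "xml:lang");
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }
}
```

An XOM version would presumably take a local name and namespace URI instead of a qualified name, to match the existing getAttributeValue methods.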
Should Builder/NodeFactory have a build(Document) method that allows one to pass an existing document through a node factory without reserializing?
Should/could the TrAX source and result, Jaxen adapter and so forth go public in the converters package?
Per Michael Kay:
Is there any way of building a XOM document from a stream of SAX events?
That is, something that implements ContentHandler, is called to receive the
SAX events, and returns the document node?
Look at some more of Wolfgang's optimizations
could some form of filterlist improve xpath performance?
should there be a special iterator for *; can this be combined with named iterator?
should //* somehow avoid the resort?
could we simplify /descendant-or-self::node()/child::* as
/descendant-or-self::*/child::* | /self::node()/child::*
Optimize normalization
test running in unsigned java web start
add a NodeFactory section to the tutorial for processing big documents
Make a build target for non-LGPL, closed source version
Sign up on Kagi or somewhere for software sales
Add a Buy XOM option to the main XOM sidebar, the unstable sidebar, and the
license page
Profile AllTests (or maybe FastTests).
Add Class-Path entries to manifest
Add more XPath samples based on XOM
Figure out a way to allow users to configure different text storage algorithms,
all stored as byte arrays.
Possibilities:
System property (though that's VM wide, which is really, really bad)
Method in text class and Builder
StAX Builder and converter sample to test API and implementation
Add separate implementations optimized for speed and size.
The current implementation is mostly optimized for size. The speedy version
would carry extra pointers (especially next sibling) to make navigation very quick.
Some final methods would have to be implemented by calling non-final package protected
methods that could be overridden instead.
Should XSLTransform have a constructor that takes a TrAX
Templates or Transformer object to allow
additional properties to be configured
A FastSerializer that does no indenting, no line-breaking, no character sets
except UTF-8 and UTF-16, no normalization, and does not use TextWriter
consider using a WeakHashMap to hold all mappings of nodes to base URIs
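A minimal sketch of the WeakHashMap idea, assuming nodes use identity-based equals() and hashCode(). The BaseUriRegistry class and its method names are hypothetical, and plain Object stands in for a node type.

```java
import java.util.Map;
import java.util.WeakHashMap;

final class BaseUriRegistry {

    // Keys are held weakly, so an entry can disappear once its node is no
    // longer referenced anywhere else and has been garbage collected.
    private final Map<Object, String> baseUris = new WeakHashMap<>();

    void setBaseURI(Object node, String uri) {
        baseUris.put(node, uri);
    }

    // Returns the empty string when no base URI has been recorded.
    String getBaseURI(Object node) {
        String uri = baseUris.get(node);
        return uri == null ? "" : uri;
    }
}
```

The trade-off to measure: nodes no longer need a field for the base URI, but every lookup pays for a hash probe, and the map is not thread safe without external synchronization.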
Could Builder hold a cache of interned strings to use while building?
Does NUX BinaryXMLCODEC do something like this?
Add a binary distribution that's like the full distribution but no source.
Should the source distribution not include the binary?
Could DOMConversion be done in parallel in two separate threads?
DHollenbeck suggests using char[] arrays internally rather than Strings
would save memory and time.
See http://lists.ibiblio.org/pipermail/xom-interest/2004-January/000898.html
Can XInclude be rewritten to be non-recursive? Isn't it already?
with XIncluder would it make more sense to use a private class
that stores all the variables rather than constantly passing them
back and forth from static methods? That is, make private methods instance
methods rather than static????
would it be possible to get a minimal interoperable subset of Big5, SJIS, etc
and escape all non-interoperable characters?
A streaming serializer based on a NodeFactory that can serialize arbitrarily large documents without having the entire document in memory
should I synchronize a single static TransformerFactory object inside XSLTransform?
Would it make sense to cache previously verified namespace prefixes
to avoid checking them every element and attribute?
Ditto, would it make sense to hash previously looked up names?
Must profile this.
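One possible shape for the prefix cache, as a sketch only. PrefixCache and its methods are hypothetical names, and isNCName here is a simplified ASCII-only stand-in for the real NCName check. The counter exists only to make the effect of the cache visible.

```java
import java.util.HashSet;
import java.util.Set;

final class PrefixCache {

    private final Set<String> verified = new HashSet<>();
    private int fullChecks = 0;

    // Runs the character-by-character check only the first time a prefix is seen.
    void checkPrefix(String prefix) {
        if (verified.contains(prefix)) {
            return; // already known to be legal
        }
        fullChecks++;
        if (!isNCName(prefix)) {
            throw new IllegalArgumentException("Not an NCName: " + prefix);
        }
        verified.add(prefix);
    }

    int fullChecks() {
        return fullChecks;
    }

    // Simplified: ASCII letters and underscore to start, then also digits,
    // hyphens, and periods. The real production allows far more characters.
    private static boolean isNCName(String s) {
        if (s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            boolean start = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c == '_';
            boolean rest = start || (c >= '0' && c <= '9') || c == '-' || c == '.';
            if (i == 0 ? !start : !rest) {
                return false;
            }
        }
        return true;
    }
}
```

Whether the hash lookup is actually cheaper than re-verifying a two or three character prefix is exactly what the profiling would need to show.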
Add a Fetch task that loads code from CVS
(Need to fix guest access on java.net first, or register a fake user)
Use GET task to grab servlet.jar and tagsoup.jar if necessary
XInclude: In order to minimize encoding errors for parse="text" processing,
please change the definition of the encoding attribute to include a
requirement that if the attribute has a legal value and the encoding is
supported and the protocol supports such action, that the server is
informed of the encoding attribute value, e.g. for encoding="iso-8859-2"
and a HTTP request, that the request includes
Accept-Charset: iso-8859-2
such that the server has a chance to provide a proper representation.
would it make sense to start with a larger size for attributes and children when building and trimToSize when done?
Clean up classpaths in build.xml.
could I set up JDK15Parser to compile as individual tasks
that are only compiled when dependencies are satisfied?
For 1.5 use ant.java.version property and an equals condition
Wolfgang's ant build file for smaller vs. faster jars (Text class)
Add tools page to web site (Ant, TagSoup, Clover, etc.)
cvs -d :pserver:[email protected]:/cvs rtag XOM_11a3 xom
Backport Fixes to make in 1.0.1
-------
TextWriter.java 1.1a2
XOMTestCase.java
XOMHandler.java for internal DTD subset
Serializer for NFC in consecutive text nodes
XIncluder for base URLs and better exception messages
XOMReader and SAXConverter to allow XSL to work with xml:base attributes
XOMHandler: Workaround for Crimson bug that fails to report asterisk after mixed content declarations
XOMHandler: Workaround for Crimson bug that makes xmlns:xml="correct URI" an error
XOMHandler: Workaround for Crimson external and internal DTD subset mixup bugs
Hi Elliote.
http://www.cafeconleche.org/XOM/apidocs/nu/xom/xslt/XSLTransform.html
has an example
public static Nodes transform(Document in)
throws XSLException, ParsingException, IOException {
Builder builder = new Builder();
Document stylesheet = builder.build("mystylesheet.xsl");
XSLTransform stylesheet = new XSLTransform(stylesheet);
^^^^^^^^^
return stylesheet.transform(doc);
}
stylesheet used twice !
~/projects/1.0.1$ cvs -d :pserver:[email protected]:/cvs rtag -r XOM_10 -b BR_1_0 xom
lock.c:222: failed assertion `strncmp (repository, current_parsed_root->directory, strlen (current_parsed_root->directory)) == 0'
cvs [rtag aborted]: received abort signal
cvs [rtag aborted]: received abort signal
lock.c:222: failed assertion `strncmp (repository, current_parsed_root->directory, strlen (current_parsed_root->directory)) == 0'
===============
Done 1.2.10 Release
===============
Android support
===============
Done 1.2.9 Release
===============
Exclude UserDataHandler from Jaxen files we copy in to avoid problems with some application servers.
===============
Done 1.2.5 Release
===============
Throw NullPointerException instead of MalformedUriException when a null Reader is passed to Builder.build.
Added a target that builds a maven2 jar archive.
===============
Done 1.2.4 Release
===============
More automatic deploy process
Fixed maven targets
Slight optimization to XPath by combining two loops.
===============
Done 1.2.3 Release
===============
Bug fix for some obscure corner cases
===============
Done 1.2.2 Release
===============
Support OSGI packaging
Repackages the internal copy of org.jaxen into nu.xom.jaxen to avoid accidental conflicts and classloader problems
===============
Done 1.2.1 Release
===============
Upgraded Info.java so java -jar xom.jar shows the right version number.
===============
Done 1.2 Release
===============
Fixed bug when escaping namespace URIs that contained ampersands in Element.toXML()
===============
Done 1.2b3
===============
Latest Unicode normalization tables. Shrunk and optimized UnicodeUtil.
Canonicalization bug fix.
Latest TagSoup
Upgraded to Jaxen 1.1.2
===============
Done 1.2b1
===============
xml:id attributes no longer checked for NCNames
Upgraded to Xerces 2.8.0, DTD-only version
DOMConverter can accept a NodeFactory to be used in creating the XOM document
No longer possible to set an attribute's type to null.
Jaxen source is bundled. Ant no longer checks it out of CVS.
Added a lookup method to XPathContext that retrieves a namespace URI given a prefix
===============
Done 1.1
===============
Documentation updates
===============
Done 1.1b7/RC1:
===============
Fixed bug that could unnecessarily escape carriage returns and linefeeds and numeric character references
Fixed bug that could sometimes change line breaks
Avoid leaking memory from Builder when not reusing it.
===============
Done 1.1b6:
===============
Fixed bug that could append a text node to a document when parsing a malformed document, followed by a well-formed document.
Fixed infinite loop in Canonicalizer when canonicalizing an element with at least two ancestors but in no document
Canonicalizer no longer puts detached elements in a document when canonicalizing them
===============
Done 1.1b5:
===============
Small optimizations in Attribute class
Fixed bug in Canonicalizer when canonicalizing a non-detached element
===============
Done 1.1b4:
===============
Fixed bug in SAXConverter where start/endNamespacePrefixMapping could be called multiple times for the same namespace
===============
Done 1.1b3:
===============
Added SUIDs to Serializable classes (mostly exceptions)
Numerous optimizations including:
Replacing several stacks with ArrayLists
Using an unsynchronized custom BufferedWriter for serialization
Using String instead of StringBuffer in characters() in XOMHandler
CDATASection.toXML() now outputs its value wrapped in a CDATA section rather than escaped.
Bundling Xalan 2.7 instead of 2.6 and Xerces 2.7.1
Various workarounds for bugs in Xalan 2.7 and changes in Xerces 2.7
===============
Done 1.1b2:
===============
Child nodes and attributes are now stored directly in arrays without an intermediary list.
This saves some memory.
A few bug fixes, especially in XPath
===============
Done 1.1b1:
===============
Lots of Jaxen fixes
Lots of Crimson fixes
===============
Done 1.1a3:
===============
Normalization form C serialization is now correct
even when the characters that need to be combined cross the
boundaries of consecutive text nodes
In addition, Serializer does its own normalization. There is no longer any dependence on ICU.
XInclude sometimes generates relative URLs when doing base URI fixup.
XSLT can now operate on xml:base attributes
===============
Done 1.1a2:
===============
XPath is an order of magnitude faster
XPathTypeException class
XPathDriver sample program
Small bug fix in XOMTestCase
Various bug fixes in Jaxen
===============
Done 1.1a1:
===============
A few small speed ups
Some bug fixes in XOMTestCase
===============
Done 1.1d6:
===============
The primary JAR file now bundles Jaxen so that in Java 1.4 and later the only
thing that needs to be on the classpath is xom-1.1d6.jar (provided you don't use NFC in the serializer)
A setParameter method in XSLTransform
All items in a Nodes must now be non-null
XPath expressions now recognize the xml: prefix, even if it hasn't been
specifically bound in a context
Fixed bug that unnecessarily duplicated xml: attributes
during document subset canonicalization
The Canonicalizer API has been reworked significantly.
Document subset canonicalization is now performed by passing in a Nodes
object rather than a Document and an XPath expression. The Canonicalizer(out, boolean, boolean) constructor
has been removed on the grounds of redundancy and confusion. The canonicalize(Document) method is now canonicalize(node) and canonicalizes the entire subtree represented by the node you pass to it.
===============
Done 1.1d5:
===============
Exclusive XML canonicalization
===============
Done 1.1d4:
===============
XPath expressions can now return Namespace nodes
Document subset canonicalization
Fixed bug that prevented round tripping of \r\n in attribute values
===============
Done 1.1d3:
===============
xml:id support
===============
Done 1.1d2:
===============
XPath support
Preserve all entity declarations in internal DTD subset
because these may be needed by the external DTD subset
Bugs Fixed:
-----------
Escape percent signs in the internal DTD subset
to prevent accidental interpretation as parameter entity references
Workaround for Crimson bug that does not use parentheses when reporting NOTATION names for attributes
Can now handle " and & in the internal entity declarations in the internal DTD subset
Better handling of weird filters that skip expected steps like startDocument
XOMTestCase.compare(Node, Node) throws ComparisonFailure when comparing nodes of different types.
XOMTestCase.compare(Node, Node) can now compare attributes
Parses better with parentless filters
===============
Done 1.1d1:
===============
setInternalDTDSubset
===============
Done 1.0:
===============
Update all versions to 1.0
Add an example of running canonicalizer and/or prettyprinter to README
===============
Done 1.0b11/RC5:
===============
Servlet samples restored, but build file only
compiles them if servlet.jar is present
LGPL, license, and readme files added to distro
===============
Done 1.0b8/RC2:
===============
The TagSoup and servlet JARs are no longer bundled. They're not needed to run XOM, just for one of the samples and for the JavaDoc
A few more optimizations to speed up the checking of namespace URIs, and a variety of other operations.
===============
Done 1.0b7:
===============
Comments whose data begins with a hyphen are now allowed.
Builder is considerably more robust against buggy parsers. It converts all
runtime exceptions thrown by such a parser
(including XOM XMLExceptions thrown by a NodeFactory)
into ParsingExceptions.
It uses a verifying factory for Saxon 7's AElfred derivative.
XIncluder treats bad encoding attributes as fatal errors
Various optimizations have sped up a lot of common operations
including getValue(), toXML(), DOM and SAX conversion, canonicalization,
and XSL transformation
The zip archives and CVS no longer contain files with names that are problematic on Windows.
The manifest file is now versioned.
In keeping with the recommendation in RFC2396bis that "For consistency, URI producers and normalizers should use uppercase hexadecimal digits for all percent-encodings", XOM now uses uppercase percent encodings for base URIs. There may still be a few places where lower case escapes are used.
Holler if you spot any.
Fixed bug where base URIs were not encoded in UTF-8 on all platforms. Mac OS X 10.3 was the particular offender here. Surprisingly the problem did not manifest on Mac OS X 10.2.
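To illustrate the two base URI notes above, here is a small sketch of percent-encoding that always goes through UTF-8 and always emits uppercase hex digits. PercentEncoder and escapeNonAscii are hypothetical names; this is not the code XOM uses, and it only escapes bytes outside printable ASCII (existing % signs and reserved characters are left alone).

```java
import java.nio.charset.StandardCharsets;

final class PercentEncoder {

    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    // Escapes spaces, control characters, and every non-ASCII character.
    // Bytes are taken from the UTF-8 encoding regardless of the platform's
    // default charset, and hex digits are always uppercase.
    static String escapeNonAscii(String uri) {
        StringBuilder sb = new StringBuilder(uri.length());
        for (byte b : uri.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            if (v > 0x20 && v < 0x7F) {
                sb.append((char) v);
            } else {
                sb.append('%').append(HEX[v >> 4]).append(HEX[v & 0x0F]);
            }
        }
        return sb.toString();
    }
}
```

Naming the charset explicitly is what avoids platform differences of the kind seen on Mac OS X 10.3.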
===============
Done 1.0b6:
===============
SAXConverter no longer converts XOM xml:base attributes into SAX attributes.
Instead the xml:base attributes are used to determine the URI information
the locator reports. Providing xml:base attributes as well would risk
double counting some relative URLs.
Fixed a number of bugs in converting file names to base URIs
Improved compatibility with Turkish locales that do not see I as the
upper case form of i and vice versa
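The usual cure for this class of bug is to pass an explicit locale when changing case. A sketch, with hypothetical class and method names, assuming the strings involved are things like encoding names:

```java
import java.util.Locale;

final class LocaleSafeCase {

    // Upper-cases with a fixed locale so that 'i' always becomes 'I',
    // whatever the default locale of the virtual machine is.
    static String upper(String s) {
        return s.toUpperCase(Locale.ENGLISH);
    }

    // Example use: case-insensitive comparison of an encoding name.
    static boolean isLatin1(String encodingName) {
        return "ISO-8859-1".equals(upper(encodingName));
    }
}
```

With a Turkish locale, "iso-8859-1".toUpperCase(locale) starts with a dotted capital I (U+0130) instead of a plain I, so a comparison against "ISO-8859-1" fails.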
Fixed bug where carriage returns in internal entity
replacement text in the internal DTD subset were not properly escaped
on reserialization
Fixed bug where carriage returns, less than signs, double
quotes, and ampersands in attribute default values in the
internal DTD subset were not properly escaped
on reserialization
Hid the error messages logged by Xerces and Xalan on System.err
when deliberately testing error conditions. Therefore, there should
be no output from the test cases when all tests pass.
Added a junithtml build target to convert JUnit results to HTML.
The strings returned by toString in Comment, ProcessingInstruction,
Attribute, and Text are all now truncated if they get too long.
Furthermore any embedded line breaks and tabs are escaped as \n, \t,
and \r. This makes the objects easier to inspect in various debuggers.
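A sketch of the kind of debugger-friendly string described above. The 40 character cut-off, the bracketed layout, and the DebugStrings name are assumptions for illustration; the actual limits and format in XOM may differ.

```java
final class DebugStrings {

    private static final int MAX = 40; // hypothetical cut-off

    // Escapes line breaks and tabs, and truncates long values with "...".
    static String summarize(String className, String value) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            if (sb.length() >= MAX) {
                sb.append("...");
                break;
            }
            char c = value.charAt(i);
            switch (c) {
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:   sb.append(c);
            }
        }
        return "[" + className + ": " + sb + "]";
    }
}
```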
The Ant build file now specifies that the input encoding of all .java files
is UTF-8. Most files are pure ASCII, but there are a couple of places where
non-ASCII characters are used.
Unit test coverage has been improved slightly.
Fixed a bug in Serializer that did not always properly trim whitespace
===============
Done 1.0b5:
===============
XSLTransform.setNodeFactory is deprecated. Instead use
the new XSLTransform(Document, NodeFactory) constructor.
XIncluder now resolves XPointers in xi:include elements
against the acquired infoset rather
than the source infoset.
Scheme specific errors in XPointers are treated as resource errors when
xincluding rather than fatal errors.
Much faster when building documents from File objects.
Deprecated constructors have been removed from XSLTransform
===============
Done 1.0b4:
===============
XSLT transformation is now based on SAX conversion rather than toXML.
This should save memory in transformations and probably speed things up.
All constructors in XSLTransform that take anything other than a Document are deprecated
and will be removed in the next release:
public XSLTransform(InputStream stylesheet)
public XSLTransform(Reader stylesheet)
public XSLTransform(String URL)
public XSLTransform(File stylesheet)
SAXConverter can now convert Nodes lists as well as Documents.
SAXConverter now provides location information for system IDs.
Various bug fixes in SAXConverter, especially with respect to startPrefixMapping
and endPrefixMapping
The toXML methods now use \n as the line separator, since this is more likely
to match the contents of text nodes created by parsing an XML document.
The goal is to minimize the number of documents with mixed line break strings.
DOMConverter can now convert XOM documents with only a single element to DOM.
Minor bug fixes to better handle line breaks in the internal DTD subset
===============
Done 1.0b3:
===============
Java encoding names are now recognized when using the repackaged Xerces
bundled with Java 1.5
Fixed several bugs in DOMConverter
===============
Done 1.0b2:
===============
Fixed various bugs that prevented the loading of JDK15_XML1_0Parser
Worked around bugs in JDK 1.5 beta 2 that limited elements to a single attribute.
The API documentation is now well-formed XHTML. (It might even be valid. I haven't checked.)
There's a new tools package that contains classes used to help build XOM. Currently this
contains the class to convert JavaDoc to XHTML using TagSoup.
===============
Done 1.0b1:
===============
The XInclude test suite is loaded and run from the W3C CVS
server if necessary
Worked around various JDK bugs that prevent round-tripping of
some characters in Japanese encodings
Improved compatibility with Java 1.5
===============
Done 1.0a5:
===============
ParsingException and ValidityException now supply the URI of the document that caused
the exception if it's available
OASIS XSLT conformance tests are now included in the unit test suite
Handling of additional namespaces in transforms now works with
recent versions of Xalan
Improved compatibility with pre-1.4 VMs
===============
Done 1.0a4:
===============
Nodes.remove(int) now returns the node removed
The IBM virtual machine 1.4.1 is no longer special cased.
The API documentation has undergone extensive editing.
The unpublished nu.xom.xerces package has been removed.
===============
Done 1.0a3:
===============
The Element copy constructor and copy methods are no longer recursive, so they
shouldn't cause stack overflows in deep documents. This necessitated adding a
protected shallowCopy() method that can be used to create an instance of a subclass
of Element. Overriding this is preferred to overriding copy() when one wishes
to maintain the objects' types after a copy.
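The general technique, sketched on a toy tree class rather than on Element itself. SimpleElement is a hypothetical stand-in, and an explicit stack is only one of several ways to avoid recursion; the real implementation may walk the tree differently.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class SimpleElement {

    final String name;
    final List<SimpleElement> children = new ArrayList<>();

    SimpleElement(String name) {
        this.name = name;
    }

    // Subclasses override this to return an instance of their own type,
    // with no children attached.
    protected SimpleElement shallowCopy() {
        return new SimpleElement(name);
    }

    // Copies the whole subtree using an explicit stack of
    // (original, copy) pairs instead of recursive calls.
    final SimpleElement copy() {
        SimpleElement result = shallowCopy();
        Deque<SimpleElement[]> pending = new ArrayDeque<>();
        pending.push(new SimpleElement[] { this, result });
        while (!pending.isEmpty()) {
            SimpleElement[] pair = pending.pop();
            for (SimpleElement child : pair[0].children) {
                SimpleElement childCopy = child.shallowCopy();
                pair[1].children.add(childCopy); // child order is preserved
                pending.push(new SimpleElement[] { child, childCopy });
            }
        }
        return result;
    }
}
```

Because copy() only ever calls shallowCopy(), a subclass that overrides shallowCopy() keeps its type throughout the copied tree, which is the reason given above for preferring that override.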
The getBaseURI() method is also no longer recursive.
The W3C XML Schema Language and WML and HTML DOMs have been removed from
the bundled version of Xerces to save space.
There is now a contributor license agreement.
XOM will now use character references only when necessary for
*all* encodings supported by the local virtual machine.
However, this may be quite a bit slower than the
explicitly supported encodings like UTF-8 and the ISO-8859
character sets. Measurements remain to be performed.
===============
Done 1.0a2:
===============
URI verification and base URI resolution are now performed
according to the RFC2396bis algorithm, rather than by using the
Xerces and java.net URI classes.
The Builder no longer sets any System properties for more compatibility
with applets and multiclassloader environments.
Fixed bug in DOMConverter
===============
Done 1.0a1:
===============
The base URI handling has been modified as follows:
1. getBaseURI() always returns an absolute URI or the empty string if the base URI is not known.
Other than the empty string it never returns a relative URI.
It never returns null.
2. Base URI of an element does not change when it is detached or copied
3. setBaseURI requires an absolute URI, and throws a MalformedURIException if you attempt
to pass it a relative URI, or a URI with a fragment identifier. (Relative URIs are still allowed in
xml:base attributes.)
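Rule 3 can be sketched as a simple check. This version leans on java.net.URI purely for brevity and throws IllegalArgumentException in place of MalformedURIException; both choices, and the BaseUriCheck name, are assumptions of the sketch rather than how XOM does it.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class BaseUriCheck {

    // Returns normally for an absolute URI without a fragment identifier;
    // throws IllegalArgumentException for anything else.
    static void checkBaseURI(String uri) {
        URI parsed;
        try {
            parsed = new URI(uri);
        } catch (URISyntaxException ex) {
            throw new IllegalArgumentException("Malformed URI: " + uri, ex);
        }
        if (!parsed.isAbsolute()) {
            throw new IllegalArgumentException("Relative URI: " + uri);
        }
        if (parsed.getRawFragment() != null) {
            throw new IllegalArgumentException("Fragment identifier not allowed: " + uri);
        }
    }
}
```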
XOM will not double verify when being fed data through Norm Walsh's catalog filter;
provided that the underlying parser is good.
Supports 2nd candidate recommendation syntax for XInclude
Constraints on parentage are not checked when building with Nonverifying factory
(fastAddAttribute, fastInsertChild)
DOMConverter is now non-recursive
An element's absolutized base URI is preserved when detaching
===============
Done 1.0d25:
===============
The checkFoo methods have been eliminated. All
setter and mutator methods in the node classes are now non-final.
NodeFactory.makeDocument has been renamed startMakingDocument
NodeFactory.endDocument has been renamed finishMakingDocument
Added a method to DOMConverter to convert a DocumentFragment to a Nodes
Added XSLTransform.toDocument() method that converts a Nodes to
a Document
Added UnavailableCharacterException, a subclass of XMLException,
to be thrown when attempting to serialize a
character that is not available in the current character set and cannot be escaped
Element.addAttribute is declared to throw the more specific MultipleParentException instead of
IllegalAddException
Added a non-recursive serializer sample
Removed checkDetach() method from Node. It was redundant with
checkRemoveChild() in ParentNode.
Recursion has been eliminated from several methods in Element
to make it work better in very deep documents; notably
toXML(), getValue(), and getNamespaceURI(prefix)
The canonicalizer has been made non-recursive
ParentNode.replaceChild() will not remove the old child unless it can insert
the new child. It can no longer do one but not the other.
Document.replaceChild now allows replacing of the DocType by another DocType
or the root element by another element
Element.removeChildren() now either removes all children or none.
It also returns a Nodes object containing the children removed
LeafNode has been removed. DocType, Text, Comment, and ProcessingInstruction
now directly extend Node.
Removed hasChildren method from Element, Node, ParentNode, and Document
Much better testing of canonicalizer. I am now fairly convinced
it is correct in all or almost all cases.
Line breaks are now used between declarations in internal DTD subset
Compiled jar without debugging symbols to save space.
(These can be turned on again easily enough in build.xml
if anyone needs them.)
Made a XOMSamples.jar
The core JAR archive is sealed
Many JavaDoc improvements
===============
Done 1.0d24:
===============
Fixed resource loading in servlet/multiclassloader environment
===============
Done 1.0d23:
===============
Added support for accept, accept-charset, and accept-language
attributes on include elements
MissingHrefException has been renamed NoIncludeLocationException
XOMTestCase is part of the published API.
CircularInclusionException has been renamed InclusionLoopException
Factory methods are now invoked in document order. Previously this
wasn't true for text nodes, which weren't flushed until after
the next tag, PI, etc. This was necessary to enable text nodes to
be maximally contiguous, though in fact they might not be if
the factory returned several text nodes in a row for non-text nodes.
In any case, with the default factory, or with a custom factory that
does not remove any nodes or change their base types (e.g. Comment to Text)
text nodes are still maximally contiguous after a build.
Added support for GB18030 encoding on output
(requires Java 1.4)
IllegalDataException and its subclasses have getData and setData methods
to get and set the exact text that caused the exception.
IllegalNameException, IllegalCharacterDataException, and IllegalTargetException are now
subclasses of IllegalDataException. IllegalCharacterDataException replaces most previous
uses of IllegalDataException.
NamespaceException has been subdivided into
IllegalNameException, MalformedURIException, and NamespaceConflictException.
Verifier is now based on table lookup.
XOM no longer contains any JDOM code.
Removed NodeFactory makeWhiteSpaceInElementContent() method
Serialization speed-ups for Non-Unicode, non-Latin-1
encodings
It is now possible to supply a NodeFactory to XSLTransform to be used for
constructing nodes in the result tree
Improved support for IBM JVM 1.4.1
Added support for Thai in ISO-8859-11/TIS-620 encoding
Speeded up Serializer for non-Unicode/non-Latin-1 encodings
Attribute.Type.toXML is now Attribute.Type.getName(). This was necessary
to be consistent with handling attributes of type ENUMERATION, which is not a DTD keyword
though it is referenced in the Infoset.
Removed no-args constructors from the various exception classes.
The Nodes class now has insert and remove methods,
in addition to append.
Supports the XInclude 2003 2nd last call working draft.
The methods that resolve Nodes objects have been marked private.
Added NoSuchAttributeException for parallelism with NoSuchChildException
Unit tests have been dramatically expanded. There are now over
700 separate test methods, many of which perform several tests.
No longer allow the namespace URI
http://www.w3.org/XML/1998/namespace
to have any prefix other than xml, per conformance with
the namespaces erratum
Allow the xml: prefix (with the right URI) to be used on elements
per conformance with the namespaces recommendation
NodeFactory make methods now return Nodes objects that may change the type
or number of nodes returned, subject to the usual XML well-formedness constraints.
Better exception messages when name and namespace arguments are swapped
getBaseURI returns null if the base URI can't be determined due
to a malformed xml:base attribute.
===============
Done 1.0d22:
===============
Serializer.preservebaseURI() is now Serializer.setPreserveBaseURI()
Carriage returns are no longer allowed in comment and processing instruction data
because they can't be roundtripped. (Character references aren't resolved inside
comment and processing instruction data.)
Initial white space is no longer allowed in processing instruction
data because this cannot be roundtripped.
DOMConverter.translate methods have been renamed DOMConverter.convert
DOMConverter can now convert individual DOM nodes into XOM objects.
It is no longer limited to converting entire documents.
ValidityException now has a getDocument() method which returns the
complete well-formed but invalid document. It also has getValidityError(int n),
getLineNumber(int n), and getColumnNumber(int n) methods which return
information about the successive validity errors in the document.
Numeric character references now use upper case.
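For illustration, assuming the references in question are hexadecimal ones, upper case output looks like this. CharRefs and its methods are hypothetical names, and the escaping helper ignores markup characters such as & and <, which a real serializer also has to handle.

```java
import java.util.Locale;

final class CharRefs {

    // Builds a hexadecimal character reference with uppercase digits.
    static String charRef(int codePoint) {
        return "&#x" + Integer.toHexString(codePoint).toUpperCase(Locale.ENGLISH) + ";";
    }

    // Replaces every non-ASCII character with a character reference.
    // Works on code points, so surrogate pairs become a single reference.
    static String escapeNonAscii(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp < 0x80) {
                sb.appendCodePoint(cp);
            } else {
                sb.append(charRef(cp));
            }
        });
        return sb.toString();
    }
}
```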
In Serializer, writeMarkup has been renamed writeRaw and writeText
has been renamed writeEscaped
since in subclasses these may not actually be writing markup
Much more fine-grained control of serialization from subclasses
using several new methods including writeXMLDeclaration(),
writeStartTag(), and writeEmptyElementTag().
Added an option to serialize using Unicode normalization form C.
Added a protected getColumnNumber() method to Serializer to assist subclasses that
wish to implement their own line breaking strategies.
Can now specify a Builder to be used when XIncluding
More XPointer syntax errors are detected when XIncluding
NodeList has been renamed Nodes.
Java encoding names such as ISO8859_1 are now recognized on input
if Xerces is the parser.
XIncludeException (and its subclasses) can now report the URI
of the document where the problem was detected
Upgraded to Xerces 2.6 nightly build to fix bug
involving relative URL resolution in documents
loaded from redirected URLs
Added unit tests for SAXConverter
Added DatabaseBuilder sample based on Example 8-13 from Processing XML with Java
Silently preserve CDATA sections from parse to output when possible.
Added SourceCodeGenerator sample program that converts a well-formed XML
document into the XOM statements necessary to create the document
Renamed ParseException to ParsingException
===============
Done 1.0d21:
===============
Added checkDetach protected method in Node. Could this
and checkRemoveChild in Document make code any simpler by preventing
detaching of root?
copy() method is no longer final in node classes
Cycles (an element acting as its own ancestor)
are no longer allowed. Attempting to create one throws a
CycleException.
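A sketch of how such a check can work: before attaching a child, walk up from the prospective parent and refuse if the child turns up among its ancestors. CheckedNode is a toy class and IllegalStateException stands in for CycleException; both are assumptions of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

final class CheckedNode {

    CheckedNode parent;
    final List<CheckedNode> children = new ArrayList<>();

    void appendChild(CheckedNode child) {
        // If child is this node or one of its ancestors, attaching it
        // would make a node its own ancestor.
        for (CheckedNode a = this; a != null; a = a.parent) {
            if (a == child) {
                throw new IllegalStateException("Cannot add a node to its own descendant");
            }
        }
        if (child.parent != null) {
            throw new IllegalStateException("Child already has a parent");
        }
        child.parent = this;
        children.add(child);
    }
}
```

The walk costs time proportional to the depth of the parent, so it is cheap in typical documents.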
NodeFactory.makeDocument() no longer takes an Element as an argument.
It is the responsibility of the NodeFactory to construct a suitable root
element. However, when parsing this will quickly be replaced by the
actual root element.
Serializer.setIndent throws an IllegalArgumentException
for negative values
Fixed bug where line breaks would be added if indenting, even in elements
where xml:space="preserve"
XInclude now consistently treats XPointers that don't match any
subresource as resource errors, rather than including nothing.
xml:base attributes added to XIncluded elements no longer
have fragment IDs