forked from pear/XML_HTMLSax3
-
Notifications
You must be signed in to change notification settings - Fork 1
/
package.xml
240 lines (232 loc) · 11.1 KB
/
package.xml
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
<?xml version="1.0" encoding="UTF-8"?>
<package packagerversion="1.7.0RC1" version="2.0" xmlns="http://pear.php.net/dtd/package-2.0" xmlns:tasks="http://pear.php.net/dtd/tasks-1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://pear.php.net/dtd/tasks-1.0 http://pear.php.net/dtd/tasks-1.0.xsd http://pear.php.net/dtd/package-2.0 http://pear.php.net/dtd/package-2.0.xsd">
<name>XML_HTMLSax3</name>
<channel>pear.php.net</channel>
<summary>A SAX parser for HTML and other badly formed XML documents</summary>
<description>XML_HTMLSax3 is a SAX based XML parser for badly formed XML documents, such as HTML.
The original code base was developed by Alexander Zhukov and published at http://sourceforge.net/projects/phpshelve/. Alexander kindly gave permission to modify the code and license for inclusion in PEAR.
NOTE!
This package is now dual licensed under PHP license v3.01 and LGPL 3.0
See the CVS repo link for the actual licenses
PEAR::XML_HTMLSax3 provides an API very similar to the native PHP XML extension (http://www.php.net/xml), allowing handlers using one to be easily adapted to the other. The key difference is HTMLSax will not break on badly formed XML, allowing it to be used for parsing HTML documents. Otherwise HTMLSax supports all the handlers available from Expat except namespace and external entity handlers. Provides methods for handling XML escapes as well as JSP/ASP opening and close tags.
Version 1.x introduced an API similar to the native SAX extension but used a slow character by character approach to parsing.
Version 2.x has had it's internals completely overhauled to use a Lexer, delivering performance *approaching* that of the native XML extension, as well as a radically improved, modular design that makes adding further functionality easy.
Version 3.x is about fine tuning the API, behaviour and providing a mechanism to distinguish HTML "quirks" from badly formed HTML (later functionality not yet implemented)
A big thanks to Jeff Moore (lead developer of WACT: http://wact.sourceforge.net) who's largely responsible for new design, as well input from other members at Sitepoint's Advanced PHP forums: http://www.sitepointforums.com/showthread.php?threadid=121246.
Thanks also to Marcus Baker (lead developer of SimpleTest: http://www.lastcraft.com/simple_test.php) for sorting out the unit tests.</description>
<lead>
<name>Harry Fuecks</name>
<user>hfuecks</user>
<email>[email protected]</email>
<active>no</active>
</lead>
<date>2007-12-01</date>
<time>22:09:38</time>
<version>
<release>3.0.0</release>
<api>3.0.0</api>
</version>
<stability>
<release>stable</release>
<api>stable</api>
</stability>
<license uri="http://www.php.net/license">PHP</license>
<notes>Fixed bug #1850 HTMLtoXHTML.php does not produce XHTML [dufuz]
Fixed bug #11607 Requesting License change, emails to listed authors bounce [cdake}
Fixed bug #12159 not clarified license [hfuecks]
This package is now dual licensed under PHP license v3.01 and LGPL 3.0</notes>
<contents>
<dir name="/">
<file baseinstalldir="XML" md5sum="6443030f28b37aef18648f80c0ebfa50" name="docs/examples/example.html" role="doc" />
<file baseinstalldir="XML" md5sum="927697c9a4090472c9b52acf5fa2f297" name="docs/examples/ExpatvsHtmlSax.php" role="doc" />
<file baseinstalldir="XML" md5sum="6f5bf3660dee8a549e0163ed371ee3f8" name="docs/examples/HTMLtoXHTML.php" role="doc" />
<file baseinstalldir="XML" md5sum="069662117a0fbea381abad112d47b150" name="docs/examples/SimpleExample.php" role="doc" />
<file baseinstalldir="XML" md5sum="010de9495fe6793a9c95e9dc7170b493" name="docs/examples/SimpleTemplate.php" role="doc" />
<file baseinstalldir="XML" md5sum="3d230b9d9301cc49bcab6314155d49c1" name="docs/examples/simpletemplate.tpl" role="doc" />
<file baseinstalldir="XML" md5sum="0ff3c67cf9e09dd8a0088aa14e633449" name="docs/examples/worddoc.htm" role="doc" />
<file baseinstalldir="XML" md5sum="fbf07a95c981cc40823dec345af99c3d" name="docs/examples/WordDoc.php" role="doc" />
<file baseinstalldir="XML" md5sum="48ff7cce1acdf3725a4f3c43f72ae491" name="docs/Readme" role="doc" />
<file baseinstalldir="XML" md5sum="ee68fca44daf5ea4ab0d1e89898dbac5" name="HTMLSax3/Decorators.php" role="php" />
<file baseinstalldir="XML" md5sum="53bc57b65555dbb2f11723c9cfef35c0" name="HTMLSax3/States.php" role="php" />
<file baseinstalldir="XML" md5sum="58a613410f75656db17d6d0ea27e2534" name="tests/index.php" role="test" />
<file baseinstalldir="XML" md5sum="3d7e5c0266f5c807e9b21a194630f17b" name="tests/unit_tests.php" role="test" />
<file baseinstalldir="XML" md5sum="4b67baeaf8d3032ffd55affef4c8a7ff" name="tests/xml_htmlsax_test.php" role="test" />
<file baseinstalldir="XML" md5sum="909dd989c6c6ece13efe37c028c4fefd" name="HTMLSax3.php" role="php" />
</dir>
</contents>
<dependencies>
<required>
<php>
<min>4.0.5</min>
</php>
<pearinstaller>
<min>1.4.0b1</min>
</pearinstaller>
</required>
<optional>
<extension>
<name>pcre</name>
</extension>
</optional>
</dependencies>
<phprelease />
<changelog>
<release>
<date>2004-06-02</date>
<version>
<release>3.0.0RC1</release>
<api>3.0.0RC1</api>
</version>
<stability>
<release>beta</release>
<api>beta</api>
</stability>
<license uri="http://www.php.net/license">PHP</license>
<notes>* Re PEAR version naming rules, you now include XML/HTMLSax3.php and the main class is called XML_HTMLSax3
* Now able to parse Word generated HTML - fixed bug with parsing of XML escape sequences
* API break (minor): no longer extends PEAR
* API break (minor): attributes with no value (like option selected) are now populated with NULL instead of TRUE
* API break (minor): replaced XML_OPTION_FULL_ESCAPES with XML_OPTION_STRIP_ESCAPES - by default you now get back the complete escape sequence
* Added some more examples</notes>
</release>
<release>
<version>
<release>2.1.2</release>
<api>2.1.2</api>
</version>
<stability>
<release>stable</release>
<api>stable</api>
</stability>
<date>2003-12-05</date>
<license uri="http://www.php.net/license">PHP</license>
<notes>* Bug fixed (thanks Jeff) where badly formed attributes resulted in infinite loop
* Added additional boolean argument to open and close handler calls to spot empty tags like br/ - should not break exising APIs
* Added XML_OPTION_FULL_ESCAPES which (when = 1) passes through the complete content in an XML escape, allowing comment / cdata reconstruction</notes>
</release>
<release>
<version>
<release>2.1.1</release>
<api>2.1.1</api>
</version>
<stability>
<release>stable</release>
<api>stable</api>
</stability>
<date>2003-10-08</date>
<license uri="http://www.php.net/license">PHP</license>
<notes>* Reporting of byte index with get_current_position() more accurate on opening tags (thanks to Alexander Orlov at x-code.com)
* All parser options now available to PHP versions lt 4.3.x, using implementation of html_entity_decode in PHP</notes>
</release>
<release>
<version>
<release>2.1.0</release>
<api>2.1.0</api>
</version>
<stability>
<release>stable</release>
<api>stable</api>
</stability>
<date>2003-09-10</date>
<license uri="http://www.php.net/license">PHP</license>
<notes>* Well (unit) tested with SimpleTest</notes>
</release>
<release>
<version>
<release>2.0.2</release>
<api>2.0.2</api>
</version>
<stability>
<release>alpha</release>
<api>alpha</api>
</stability>
<date>2003-08-11</date>
<license uri="http://www.php.net/license">PHP</license>
<notes>* API is backwards compatible apart from the renaming of parser options
* Performance dramatically increased. Not much slower than Expat
* Better handling of XML comments and CDATA
* Option to trigger additional data handler calls for linefeeds and tabs
* Option to trigger additional data handler calls for XML entities and parse them if required.
* Added public get_current_position() and get_length() methods</notes>
</release>
<release>
<version>
<release>1.1</release>
<api>1.1</api>
</version>
<stability>
<release>stable</release>
<api>stable</api>
</stability>
<date>2003-06-26</date>
<license uri="http://www.php.net/license">PHP</license>
<notes>* Bug fixes to Attribute_Parser to cope with newline, tag, forward slash and whitespace issues.</notes>
</release>
<release>
<version>
<release>1.0</release>
<api>1.0</api>
</version>
<stability>
<release>stable</release>
<api>stable</api>
</stability>
<date>2003-06-08</date>
<license uri="http://www.php.net/license">PHP</license>
<notes>* Modifications to file structure to place Attributes_Parser.php
and State_Machine.php in subdirectory HTMLSax
* XML_HTMLSax.php includes Attributes_Parser.php and State_Machine.php
using require_once()</notes>
</release>
<release>
<version>
<release>0.9.0rc2</release>
<api>0.9.0rc2</api>
</version>
<stability>
<release>beta</release>
<api>beta</api>
</stability>
<date>2003-05-18</date>
<license uri="http://www.php.net/license">PHP</license>
<notes>*First release under PEAR
*Changed package name to XML_HTMLSax
*Added patch from John Luxford to parse single quoted attributes
*Modified State_Machine to be a simple variable store</notes>
</release>
<release>
<version>
<release>0.9.0rc1</release>
<api>0.9.0rc1</api>
</version>
<stability>
<release>beta</release>
<api>beta</api>
</stability>
<date>2003-05-09</date>
<license uri="http://www.php.net/license">PHP</license>
<notes>A summary of the main differences between this version
of HTML_Sax and HTMLSax2002082201 are as follows;
*Instead of extending HTMLSax with your own "handlers" class,
you now use the set_object() method to pass an instance of the
class to HTMLSax.
*Class method callbacks are specified using the following methods;
*set_element_handler('startHandler','endHandler') <tag> and </tag>
*set_data_handler('dataHandler') for contents of an element
*set_pi_handler('piHandler') for <?php ?>, <?xml ?> etc.
*set_escape_handler(') for anything beginning with <!
*set_jasp_handler() - set listener for <% %> tags
*Attributes which no value are created and set to true
*Comments are handled and may contain entities; < >
*The callback handlers will all be passed an instance of HTMLSax
in the same way as the native PHP XML Expat extension
*Setting of parser options is handled specifically by the set_option()
method. Available options are;
*skipWhiteSpace; instruct the parser to ignore whitespace characters
*trimDataNodes; trim whitespace inside character data
*breakOnNewLine; newline characters found in character data are treated
as new events triggering another data callback
*caseFolding; converts element names to uppercase</notes>
</release>
</changelog>
</package>