<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<HTML><HEAD><TITLE>Man page of HTML::Parse</TITLE>
</HEAD><BODY>
<H1>HTML::Parse</H1>
Section: User Contributed Perl Documentation (3pm)<BR>Updated: 2019-01-13<BR><A HREF="#index">Index</A>
<HR>
<A NAME="lbAB">&nbsp;</A>
<H2>NAME</H2>
HTML::Parse - Deprecated, a wrapper around HTML::TreeBuilder
<A NAME="lbAC">&nbsp;</A>
<H2>VERSION</H2>
This document describes version 5.07 of
HTML::Parse, released August 31, 2017
as part of HTML-Tree.
<A NAME="lbAD">&nbsp;</A>
<H2>SYNOPSIS</H2>
<PRE>
See the documentation for HTML::TreeBuilder
</PRE>
<A NAME="lbAE">&nbsp;</A>
<H2>DESCRIPTION</H2>
Disclaimer: This module is provided only for backwards compatibility
with earlier versions of this library. New code should <I>not</I> use
this module; use the HTML::Parser and HTML::TreeBuilder modules
directly instead.
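<P>
As a rough illustration of the direct approach recommended above, the following
sketch builds a tree with <TT>&quot;HTML::TreeBuilder&quot;</TT> and reads one element from it.
The sample markup and the variable names are assumptions made for this example;
see the HTML::TreeBuilder documentation for the authoritative interface.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical sample input, not taken from the module's documentation.
my $html = '<html><head><title>Demo</title></head>'
         . '<body><p>Hello</p></body></html>';

# new_from_content() creates the tree, parses the string and finalizes it.
my $tree = HTML::TreeBuilder->new_from_content($html);

# In scalar context look_down() returns the first matching element.
my $title      = $tree->look_down(_tag => 'title');
my $title_text = $title->as_text;
print "$title_text\n";

# Release the tree once it is no longer needed.
$tree->delete;
```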
<P>
The <TT>&quot;HTML::Parse&quot;</TT> module provides functions to parse <FONT SIZE="-1">HTML</FONT> documents.
There are two functions exported by this module:
<DL COMPACT>
<DT id="1">parse_html($html) or parse_html($html, $obj)<DD>
This function is really just a synonym for <TT>$obj</TT>-&gt;parse($html), where <TT>$obj</TT>
is assumed to be an instance of a subclass of <TT>&quot;HTML::Parser&quot;</TT>. Refer to
HTML::Parser for more documentation.
<P>
If <TT>$obj</TT> is not specified, it defaults to an internally
created new <TT>&quot;HTML::TreeBuilder&quot;</TT> object configured with <B>strict_comment()</B>
turned on. That class implements a parser that builds (and is) an <FONT SIZE="-1">HTML</FONT>
syntax tree with HTML::Element objects as nodes.
<P>
The return value from <B>parse_html()</B> is <TT>$obj</TT>.
<DT id="2">parse_htmlfile($file, [$obj])<DD>
Same as <B>parse_html()</B>, but reads the <FONT SIZE="-1">HTML</FONT> to parse from the named file.
<P>
Returns <TT>&quot;undef&quot;</TT> if the file could not be opened, or <TT>$obj</TT> otherwise.
</DL>
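<P>
The two functions described above might be exercised as in the following sketch. The
sample markup, the temporary file and all variable names are assumptions made for
illustration. A complete, well-closed document is used so that the example does
not depend on how trailing input is buffered.

```perl
use strict;
use warnings;
use File::Temp   qw(tempdir);
use Scalar::Util qw(refaddr);
use HTML::Parse;          # exports parse_html and parse_htmlfile
use HTML::TreeBuilder;

# Hypothetical sample input.
my $html = '<html><body><p>Hello <b>world</b></p></body></html>';

# One-argument form: an HTML::TreeBuilder object is created internally.
my $tree      = parse_html($html);
my $bold_text = $tree->look_down(_tag => 'b')->as_text;

# Two-argument form: the caller supplies the parser object,
# and the same object comes back as the return value.
my $builder  = HTML::TreeBuilder->new;
my $returned = parse_html($html, $builder);
my $same_obj = refaddr($returned) == refaddr($builder);

# parse_htmlfile() reads the markup from a file instead of a string.
my $dir  = tempdir(CLEANUP => 1);
my $path = "$dir/sample.html";
open my $out, '>', $path or die "cannot write $path: $!";
print {$out} $html;
close $out or die "cannot close $path: $!";

my $from_file = parse_htmlfile($path);
my $file_text = $from_file->look_down(_tag => 'b')->as_text;

# A file that cannot be opened yields undef rather than an exception.
my $missing = parse_htmlfile("$dir/does-not-exist.html");
print "could not open file\n" unless defined $missing;

# Clean up the trees that were built.
$_->delete for $tree, $builder, $from_file;
```

Checking the return value of <B>parse_htmlfile()</B> with <TT>defined</TT> before using it
avoids calling a method on <TT>&quot;undef&quot;</TT> when the file is missing.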
<P>
When this module creates an <TT>&quot;HTML::TreeBuilder&quot;</TT> object internally, the following
variables control how parsing takes place:
<DL COMPACT>
<DT id="3">$HTML::Parse::IMPLICIT_TAGS<DD>
Setting this variable to true will instruct the parser to try to
deduce implicit elements and implicit end tags. If this variable is
false you get a parse tree that just reflects the text as it stands,
which might be useful for quick &amp; dirty parsing. Default is true.
<P>
Implicit elements have the <B>implicit()</B> attribute set.
<DT id="4">$HTML::Parse::IGNORE_UNKNOWN<DD>
This variable controls whether unknown tags are left out of the parse
tree. If true, unknown tags are not represented as elements. Default is true.
<DT id="5">$HTML::Parse::IGNORE_TEXT<DD>
Do not represent the text content of elements. This saves space if
all you want is to examine the structure of the document. Default is
false.
<DT id="6">$HTML::Parse::WARN<DD>
Call <B>warn()</B> with an appropriate message for syntax errors. Default is
false.
</DL>
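<P>
Because these are package variables, one way to change them for a single parse
is to use <TT>local</TT> inside a block, as in the sketch below. The markup and the
unknown <TT>frob</TT> tag are hypothetical and chosen only for illustration.

```perl
use strict;
use warnings;
use HTML::Parse;

# Hypothetical input containing a tag that is not part of HTML.
my $html = '<html><body><p>Hello <frob>odd</frob></p></body></html>';

# Defaults: IGNORE_UNKNOWN is true, so no element is created for frob.
my $default_tree = parse_html($html);
my $default_frob = $default_tree->look_down(_tag => 'frob');

my ($kept_frob_tag, $structure_text);
{
    # Keep unknown tags for parses started inside this block only.
    local $HTML::Parse::IGNORE_UNKNOWN = 0;
    my $tree = parse_html($html);
    my $frob = $tree->look_down(_tag => 'frob');
    $kept_frob_tag = defined $frob ? $frob->tag : undef;
    $tree->delete;
}
{
    # Drop text content and keep only the element structure.
    local $HTML::Parse::IGNORE_TEXT = 1;
    my $tree = parse_html($html);
    $structure_text = $tree->as_text;
    $tree->delete;
}

$default_tree->delete;
```

Using <TT>local</TT> restores the previous values when the block ends, so other
callers of <B>parse_html()</B> in the same program are not affected.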
<A NAME="lbAF">&nbsp;</A>
<H2>REMEMBER!</H2>
HTML::TreeBuilder objects should be explicitly destroyed when you're
finished with them. See HTML::TreeBuilder.
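<P>
A minimal sketch of explicit destruction, assuming the <B>delete()</B> method that
HTML::TreeBuilder objects inherit from HTML::Element. The markup is hypothetical.
Values needed later are copied out of the tree before it is destroyed.

```perl
use strict;
use warnings;
use HTML::Parse;

my $tree = parse_html('<html><body><p>Done</p></body></html>');

# Copy out what is needed while the tree still exists.
my $text = $tree->look_down(_tag => 'p')->as_text;

# Explicitly destroy the tree; do not use its elements afterwards.
$tree->delete;

print "$text\n";
```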
<A NAME="lbAG">&nbsp;</A>
<H2>SEE ALSO</H2>
HTML::Parser, HTML::TreeBuilder, HTML::Element
<A NAME="lbAH">&nbsp;</A>
<H2>AUTHOR</H2>
Current maintainers:
<DL COMPACT>
<DT id="7">&bull;<DD>
Christopher J. Madsen <TT>&quot;&lt;perl&nbsp;AT&nbsp;cjmweb.net&gt;&quot;</TT>
<DT id="8">&bull;<DD>
Jeff Fearn <TT>&quot;&lt;jfearn&nbsp;AT&nbsp;cpan.org&gt;&quot;</TT>
</DL>
<P>
Original HTML-Tree author:
<DL COMPACT>
<DT id="9">&bull;<DD>
Gisle Aas
</DL>
<P>
Former maintainers:
<DL COMPACT>
<DT id="10">&bull;<DD>
Sean M. Burke
<DT id="11">&bull;<DD>
Andy Lester
<DT id="12">&bull;<DD>
Pete Krawczyk <TT>&quot;&lt;petek&nbsp;AT&nbsp;cpan.org&gt;&quot;</TT>
</DL>
<P>
You can follow or contribute to HTML-Tree's development at
&lt;<A HREF="https://github.com/kentfredric/HTML-Tree">https://github.com/kentfredric/HTML-Tree</A>&gt;.
<A NAME="lbAI">&nbsp;</A>
<H2>COPYRIGHT AND LICENSE</H2>
Copyright 1995-1998 Gisle Aas, 1999-2004 Sean M. Burke,
2005 Andy Lester, 2006 Pete Krawczyk, 2010 Jeff Fearn,
2012 Christopher J. Madsen.
<P>
This library is free software; you can redistribute it and/or
modify it under the same terms as Perl itself.
<P>
The programs in this library are distributed in the hope that they
will be useful, but without any warranty; without even the implied
warranty of merchantability or fitness for a particular purpose.
<P>
<HR>
<A NAME="index">&nbsp;</A><H2>Index</H2>
<DL>
<DT id="13"><A HREF="#lbAB">NAME</A><DD>
<DT id="14"><A HREF="#lbAC">VERSION</A><DD>
<DT id="15"><A HREF="#lbAD">SYNOPSIS</A><DD>
<DT id="16"><A HREF="#lbAE">DESCRIPTION</A><DD>
<DT id="17"><A HREF="#lbAF">REMEMBER!</A><DD>
<DT id="18"><A HREF="#lbAG">SEE ALSO</A><DD>
<DT id="19"><A HREF="#lbAH">AUTHOR</A><DD>
<DT id="20"><A HREF="#lbAI">COPYRIGHT AND LICENSE</A><DD>
</DL>
</BODY>
</HTML>