perl-HTML-Strip - Perl extension for stripping HTML markup from text

Property Value
Distribution openSUSE Leap 42.3
Repository Packman all
Package name perl-HTML-Strip
Package version 2.10
Package release 1.4
Package architecture aarch64
Package type rpm
Installed size 83.07 KB
Download size 26.04 KB
Official Mirror

This module simply strips HTML-like markup from text rapidly and brutally.
It could easily be used to strip XML or SGML markup instead; but as
removing HTML is a much more common problem, this module lives in the
HTML:: namespace.

It is written in XS, and thus about five times quicker than using regular
expressions for the same task.

It does _not_ do any syntax checking (if you want that, use HTML::Parser);
instead, it merely applies the following rules:

  1. Anything that looks like a tag, or group of tags, will be replaced with a
     single space character. Tags are considered to be anything that starts
     with a '<' and ends with a '>', with the caveat that a '>' character may
     appear in either of the following without ending the tag:

     * Quote
       Quotes are considered to start with either a ''' or a '"' character,
       and end with a matching character _not_ preceded by an odd number of
       escaping slashes (i.e. '\"' does not end the quote but '\\\\"' does).

     * Comment
       If the tag starts with an exclamation mark, it is assumed to be a
       declaration or a comment. Within such tags, '>' characters do not end
       the tag if they appear within pairs of double dashes (e.g. '<!-- <a
       href="old.htm">old page</a> -->' would be stripped completely). No
       parsing for quotes is performed within comments, so for instance
       '<!-- comment with both ' quote types " -->' would be entirely
       stripped.

  2. Anything that appears within what we term _strip tags_ is stripped as
     well. By default, these tags are 'title', 'script', 'style' and 'applet'.
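
The two rules above can be illustrated with a short script. This is a minimal sketch, assuming the module is installed; the variable names and the sample markup are made up for illustration, and only the new(), parse() and eof() calls described on this page are used.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Strip;

my $hs = HTML::Strip->new();

# Rule 1: tags are replaced by whitespace. The '>' inside the quoted
# attribute and the markup inside the comment should not end a tag early.
my $raw  = '<p title="a > b">Hello</p>'
         . '<!-- <a href="old.htm">old page</a> -->'
         . '<b>world</b>';
my $text = $hs->parse($raw);
$hs->eof;    # finished with this (illustrative) document

# Rule 2: everything inside the default strip tags is dropped as well.
my $page = '<title>Ignored</title>'
         . '<script>alert("hi");</script>'
         . '<p>Visible</p>';
my $body = $hs->parse($page);
$hs->eof;

print "[$text]\n[$body]\n";
```

The exact whitespace in the output depends on how the module emits spaces for adjacent tags, so it is safest to trim or collapse whitespace afterwards rather than rely on a particular layout.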

HTML::Strip maintains state between calls, so you can parse a document in
chunks should you wish. If one chunk ends half-way through a tag, quote, or
comment, it will remember this and expect the next call to parse to start
with the remainder of that tag.

If this is not going to be the case, be sure to call $hs->eof() between
calls to $hs->parse(). Alternatively, you may set 'auto_reset' to true in
the constructor, or at any time afterwards with 'set_auto_reset', so that the
parser always operates on a one-shot basis (resetting after each parsed
chunk).
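
The difference between chunked parsing and one-shot parsing can be sketched as follows. This is an illustrative example, not part of the package documentation: the chunk contents and variable names are hypothetical, and it assumes the 'auto_reset' constructor option behaves as described above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Strip;

# Chunked parsing: the first chunk ends in the middle of a quoted
# attribute, so the parser has to carry its state into the second chunk.
my $hs     = HTML::Strip->new();
my @chunks = (
    '<p>first <a href="http://exam',
    'ple.com/page">link</a> text</p>',
);
my $doc = join '', map { $hs->parse($_) } @chunks;
$hs->eof;    # the document is complete; reset before any unrelated input

# One-shot parsing: with auto_reset the parser forgets the dangling tag
# left by the first fragment, so the second fragment is kept as text.
my $one_shot  = HTML::Strip->new( auto_reset => 1 );
my @fragments = ( 'broken <a href="x', 'next fragment' );
my @clean     = map { $one_shot->parse($_) } @fragments;

print "[$doc]\n";
print "[$_]\n" for @clean;
```

Without 'auto_reset' (or an explicit eof() call), the second fragment would be treated as the continuation of the unfinished tag from the first one.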


Package Version Architecture Repository
perl-HTML-Strip-2.10-1.6.x86_64.rpm 2.10 x86_64 Packman
perl-HTML-Strip-2.10-1.4.armv7hl.rpm 2.10 armv7hl Packman
perl-HTML-Strip - - -


Name Value
perl(:MODULE_COMPAT_5.18.2) -
perl(Test::Exception) -


Name Value
perl(HTML::Strip) = 2.10
perl-HTML-Strip = 2.10-1.4
perl-HTML-Strip(aarch-64) = 2.10-1.4


Type URL
Binary Package perl-HTML-Strip-2.10-1.4.aarch64.rpm
Source Package perl-HTML-Strip-2.10-1.4.src.rpm

Install Howto

  1. Add the Packman repository:
    # zypper addrepo packman
  2. Install perl-HTML-Strip rpm package:
    # zypper install perl-HTML-Strip
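
After installation, a quick way to confirm that Perl can find the module is a small check script. This is a sketch only; the output format is arbitrary, and the version printed should match the packaged module version listed above if the install succeeded.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Strip;    # fails at compile time if the module is not installed

# Report the module version and the file it was loaded from.
my $version = HTML::Strip->VERSION;
my $path    = $INC{'HTML/Strip.pm'};
print "HTML::Strip $version loaded from $path\n";
```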




2016-05-05 -
- updated to 2.10
see /usr/share/doc/packages/perl-HTML-Strip/Changes
2.10  Fri Apr 22 12:16:17 BST 2016
- fix to building on Windows / MSVC (RT#102389)
- fix duplicate DESTROY in Strip(.pm,.xs) warning (RT#104379, Debian bug #785032)
2015-04-16 -
- updated to 2.09
see /usr/share/doc/packages/perl-HTML-Strip/Changes
2.09  Mon Jan  5 16:51:17 GMT 2015
- fixed latin1 support, added test case for it (RT#100969)
2.08  Tue Dec  9 15:02:02 GMT 2014
- replaced html entities in russian.html (read by utf8 test), as the
test should not fail due to problems with HTML::Entities
2.07  Thu Dec  4 14:07:03 GMT 2014
- improvements for Kwalitee
2.06  Thu Dec  4 12:59:54 GMT 2014
- strip_spaces in utf8 test was using perl v5.14+ features
- reading of DATA in utf8 test should be native utf8 not use Encode,
which mangles it on some platforms
2.05  Wed Dec  3 16:05:13 GMT 2014
- fix to bug in t/300_utf8.t causing whitespace not to be stripped
2.04  Tue Nov 25 11:14:08 GMT 2014
- many cpan tester failures due to whitespace in utf8 test,
main test done with whitespace stripped, todo test as before
2.03  Mon Nov 24 13:48:44 GMT 2014
- removed trailing libicu deps
- perl minimum version to 5.8 (needed for unicode support)
- cleaned up test suite
- version bump in META.YML (RT#100457)
- 'use feature' breaking perl 5.8, removed (RT#100453)
- added Test::Exception to build_requires
2.02  Thu Nov 20 11:21:35 GMT 2014
- removed dependency on libicu-dev, which isn't as universal as expected
and was causing a bunch of cpan tester failures
2.01  Wed Nov 19 10:48:04 GMT 2014
(patch contributed by Michi Steiner)
- clean buffer needs an extra char when emit_spaces=1 and the input has
nothing to be removed (RT#41035)
2.00  Tue Nov 18 16:14:42 GMT 2014
- utf8 support via libicu (RT#42834)
- smoke test and utf8 test, tests ordered
1.10  Tue Sep 30 14:34:47 UTC 2014
- Fix for RT#99207 (script mathematical symbol bug)
1.09  Tue Sep 30 10:39:47 UTC 2014
- offbyone.t disabled under Windows (RT#99219)
1.08  Fri Sep 26 15:02:37 UTC 2014
- system perl used in offbyone.t (RT#99151)
1.07  Tue Sep 23 14:44:08 UTC 2014
- fix to bug RT#19036 - tags not replaced with spaces when only a single
character is between the tags
- fix to bug RT#35345 - mathematical comparisons within <script> tags
(patches contributed by Adriano Ferreira)
- Exporter was never needed
- Allow other filtering operations than just decoding of HTML entities
- Modernised test suite
- Adds 'auto_reset' attribute, which allows automagic use of $hs->eof
- fixes quotes in html comments (RT#32355)
(patch contributed by Reini Urban)
- MSVC doesn't define strcasecmp, use stricmp instead
(patch contributed by Damyan Ivanov)
- fixes POD errors
2010-12-01 -
- switch to perl_requires macro
2010-07-07 -
- initial package created by cpanspec 1.78

See Also

Package Description
perl-HTTP-Cache-Transparent-0.7-2.3.aarch64.rpm Cache the result of http get-requests persistently
perl-HTTP-Cache-Transparent-0.7-2.3.armv7hl.rpm Cache the result of http get-requests persistently
perl-HTTP-Cache-Transparent-0.7-2.5.x86_64.rpm Cache the result of http get-requests persistently
perl-Inline-C-0.78-1.3.aarch64.rpm C Language Support for Inline
perl-Inline-C-0.78-1.3.armv7hl.rpm C Language Support for Inline
perl-Inline-C-0.78-1.5.x86_64.rpm C Language Support for Inline
perl-Lingua-EN-Numbers-Ordinate-1.02-8.3.aarch64.rpm Convert cardinal numbers (3) to ordinals ("3rd")
perl-Lingua-EN-Numbers-Ordinate-1.02-8.3.armv7hl.rpm Convert cardinal numbers (3) to ordinals ("3rd")
perl-Lingua-EN-Numbers-Ordinate-1.02-8.5.x86_64.rpm Convert cardinal numbers (3) to ordinals ("3rd")
perl-Lingua-Preferred-0.2.4-12.3.aarch64.rpm Perl extension to choose a language
perl-Lingua-Preferred-0.2.4-12.3.armv7hl.rpm Perl extension to choose a language
perl-Lingua-Preferred-0.2.4-12.5.x86_64.rpm Perl extension to choose a language
perl-Log-TraceMessages-1.4-13.3.aarch64.rpm Perl extension for trace messages used in debugging
perl-Log-TraceMessages-1.4-13.3.armv7hl.rpm Perl extension for trace messages used in debugging
perl-Log-TraceMessages-1.4-13.5.x86_64.rpm Perl extension for trace messages used in debugging