Pod::Man(3pm) Perl Programmers Reference Guide Pod::Man(3pm)
NAME
Pod::Man - Convert POD data to formatted *roff input
SYNOPSIS
use Pod::Man;
my $parser = Pod::Man->new (release => $VERSION, section => 8);
# Read POD from STDIN and write to STDOUT.
$parser->parse_file (\*STDIN);
# Read POD from file.pod and write to file.1.
$parser->parse_from_file ('file.pod', 'file.1');
DESCRIPTION
Pod::Man is a module to convert documentation in the POD format (the
preferred language for documenting Perl) into *roff input using the man
macro set. The resulting *roff code is suitable for display on a
terminal using nroff(1), normally via man(1), or printing using
troff(1). It is conventionally invoked using the driver script
pod2man, but it can also be used directly.
By default (on non-EBCDIC systems), Pod::Man outputs UTF-8. Its output
should work with the man program on systems that use groff (most Linux
distributions) or mandoc (most BSD variants), but may result in mangled
output on older UNIX systems. To choose a different, possibly more
backward-compatible output mangling on such systems, set the "encoding"
option to "roff" (the default in earlier Pod::Man versions). See the
"encoding" option and "ENCODING" for more details.
See "COMPATIBILITY" for the versions of Pod::Man with significant
backward-incompatible changes (other than constructor options, whose
versions are documented below), and the versions of Perl that included
them.
CLASS METHODS
new(ARGS)
Create a new Pod::Man object. ARGS should be a list of key/value
pairs, where the keys are chosen from the following. Each option
is annotated with the version of Pod::Man in which that option was
added with its current meaning.
center
[1.00] Sets the centered page header for the ".TH" macro. The
default, if this option is not specified, is "User Contributed
Perl Documentation".
date
[4.00] Sets the left-hand footer for the ".TH" macro. If this
option is not set, the contents of the environment variable
POD_MAN_DATE, if set, will be used. Failing that, the value of
SOURCE_DATE_EPOCH, the modification date of the input file, or
the current time if stat() can't find that file (which will be
the case if the input is from "STDIN") will be used. If taken
from any source other than POD_MAN_DATE (which is used
verbatim), the date will be formatted as "YYYY-MM-DD" and will
be based on UTC (so that the output will be reproducible
regardless of local time zone).
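As a minimal sketch of the "date" option (the POD text, the page name "EXAMPLE", and the date itself are placeholders chosen for illustration), an explicit date can be passed to the constructor and then located on the generated ".TH" line:

```perl
use strict;
use warnings;
use Pod::Man;

# Hypothetical POD input, used only for this illustration.
my $pod = "=head1 NAME\n\nexample - show the date option\n";

# An explicit date takes precedence over POD_MAN_DATE, SOURCE_DATE_EPOCH,
# the modification time of the input, and the current time.
my $parser = Pod::Man->new(
    name    => 'EXAMPLE',
    section => 1,
    date    => '2024-01-01',
);

my $output;
$parser->output_string(\$output);
$parser->parse_string_document($pod);

# Extract and print the .TH line, which carries the date.
my ($th) = $output =~ /^(\.TH .*)$/m;
print "$th\n" if defined $th;
```

Because the date no longer depends on the environment, repeated runs over the same input should produce the same ".TH" line.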
encoding
[5.00] Specifies the encoding of the output. The value must be
an encoding recognized by the Encode module (see
Encode::Supported), or the special values "roff" or "groff".
The default on non-EBCDIC systems is UTF-8.
If the output contains characters that cannot be represented in
this encoding, that is an error that will be reported as
configured by the "errors" option. If error handling is other
than "die", the unrepresentable character will be replaced with
the Encode substitution character (normally "?").
If the "encoding" option is set to the special value "groff"
(the default on EBCDIC systems), or if the Encode module is not
available and the encoding is set to anything other than
"roff", Pod::Man will translate all non-ASCII characters to
"\[uNNNN]" Unicode escapes. These are not traditionally part
of the *roff language, but are supported by groff and mandoc
and thus by the majority of manual page processors in use
today.
If the "encoding" option is set to the special value "roff",
Pod::Man will do its historic transformation of (some) ISO
8859-1 characters into *roff escapes that may be adequate in
troff and may be readable (if ugly) in nroff. This was the
default behavior of versions of Pod::Man before 5.00. With
this encoding, all other non-ASCII characters will be replaced
with "X". It may be required for very old troff and nroff
implementations that do not support UTF-8, but its
representation of any non-ASCII character is very poor and
often specific to European languages.
If the output file handle has a PerlIO encoding layer set,
setting "encoding" to anything other than "groff" or "roff"
will be ignored and no encoding will be done by Pod::Man. It
will instead rely on the encoding layer to make whatever output
encoding transformations are desired.
WARNING: The input encoding of the POD source is independent
from the output encoding, and setting this option does not
affect the interpretation of the POD input. Unless your POD
source is US-ASCII, its encoding should be declared with the
"=encoding" command in the source. If this is not done,
Pod::Simple will attempt to guess the encoding and may be
successful if it's Latin-1 or UTF-8, but it will produce
warnings. See perlpod(1) for more information.
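The following sketch illustrates the "groff" value described above. It assumes Pod::Man 5.00 or later (older versions do not have this option), and the POD text and page name are made up for the example. The input is given as raw UTF-8 bytes and declares its own encoding with "=encoding":

```perl
use strict;
use warnings;
use Pod::Man;

# The input is raw bytes: "\xC3\xAF" is the UTF-8 encoding of U+00EF, and
# the =encoding command declares that encoding to the POD parser.
my $pod = "=encoding utf-8\n\n=head1 NAME\n\nexample - encoding demo\n\n"
        . "=head1 DESCRIPTION\n\nA na\xC3\xAFve example.\n";

# Ask for \[uNNNN] escapes instead of UTF-8 output.
my $parser = Pod::Man->new(
    name     => 'EXAMPLE',
    section  => 1,
    encoding => 'groff',
);

my $output;
$parser->output_string(\$output);
$parser->parse_string_document($pod);

# Report whether the non-ASCII character was turned into a groff escape.
my $escaped = $output =~ /\\\[u00EF\]/ ? 1 : 0;
print "U+00EF was written as a groff escape\n" if $escaped;
```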
errors
[2.27] How to report errors. "die" says to throw an exception
on any POD formatting error. "stderr" says to report errors on
standard error, but not to throw an exception. "pod" says to
include a POD ERRORS section in the resulting documentation
summarizing the errors. "none" ignores POD errors entirely, as
much as possible.
The default is "pod".
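For instance, a caller that wants to stop on bad POD might combine "die" with eval. This is only a sketch: the POD text (which deliberately leaves an "=over" unclosed) and the page name are invented for the example.

```perl
use strict;
use warnings;
use Pod::Man;

# Hypothetical POD with a deliberate mistake: =over is never closed by =back.
my $pod = "=head1 NAME\n\nexample - broken POD\n\n"
        . "=over 4\n\n=item one\n\nText.\n";

my $parser = Pod::Man->new(
    name    => 'EXAMPLE',
    section => 1,
    errors  => 'die',
);

my $output;
$parser->output_string(\$output);

# With errors => 'die', a POD formatting error is thrown as an exception,
# so wrap the conversion in eval to catch it.
my $ok = eval { $parser->parse_string_document($pod); 1 };
my $error = $ok ? q{} : $@;
print "conversion failed: $error" if !$ok;
```

The individual problems are still reported on standard error before the exception is thrown.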
fixed
[1.00] The fixed-width font to use for verbatim text and code.
Defaults to "CW". Some systems prefer "CR" instead. Only
matters for troff output.
fixedbold
[1.00] Bold version of the fixed-width font. Defaults to "CB".
Only matters for troff output.
fixeditalic
[1.00] Italic version of the fixed-width font (something of a
misnomer, since most fixed-width fonts only have an oblique
version, not an italic version). Defaults to "CI". Only
matters for troff output.
fixedbolditalic
[1.00] Bold italic (in theory, probably oblique in practice)
version of the fixed-width font. Pod::Man doesn't assume you
have this, and defaults to "CB". Some systems (such as
Solaris) have this font available as "CX". Only matters for
troff output.
guesswork
[5.00] By default, Pod::Man applies some default formatting
rules based on guesswork and regular expressions that are
intended to make writing Perl documentation easier and require
less explicit markup. These rules may not always be
appropriate, particularly for documentation that isn't about
Perl. This option allows turning all or some of it off.
The special value "all" enables all guesswork. This is also
the default for backward compatibility reasons. The special
value "none" disables all guesswork. Otherwise, the value of
this option should be a comma-separated list of one or more of
the following keywords:
functions
Convert function references like foo() to bold even if they
have no markup. The function name accepts valid Perl
characters for function names (including ":"), and the
trailing parentheses must be present and empty.
manref
Make the first part (before the parentheses) of manual page
references like foo(1) bold even if they have no markup.
The section must be a single number optionally followed by
lowercase letters.
quoting
If no guesswork is enabled, any text enclosed in C<> is
surrounded by double quotes in nroff (terminal) output
unless the contents are already quoted. When this
guesswork is enabled, quote marks will also be suppressed
for Perl variables, function names, function calls,
numbers, and hex constants.
variables
Convert Perl variable names to a fixed-width font even if
they have no markup. This transformation will only be
apparent in troff output, or some other output format
(unlike nroff terminal output) that supports fixed-width
fonts.
Any unknown guesswork name is silently ignored (for potential
future compatibility), so be careful about spelling.
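A short sketch of how these values might be compared follows. It assumes Pod::Man 5.00 or later, and the helper name format_with() and the sample POD are hypothetical, introduced only for this illustration:

```perl
use strict;
use warnings;
use Pod::Man;

# Hypothetical POD text with an unmarked function and manual page reference.
my $pod = "=head1 NAME\n\nexample - guesswork demo\n\n"
        . "=head1 DESCRIPTION\n\nCall foo() and then read bar(1).\n";

# Format the given POD source with one guesswork setting and return the
# resulting *roff text.  The date is fixed so that runs are comparable.
sub format_with {
    my ($source, $guesswork) = @_;
    my $parser = Pod::Man->new(
        name      => 'EXAMPLE',
        section   => 1,
        date      => '2024-01-01',
        guesswork => $guesswork,
    );
    my $output;
    $parser->output_string(\$output);
    $parser->parse_string_document($source);
    return $output;
}

my $all  = format_with($pod, 'all');
my $none = format_with($pod, 'none');
my $some = format_with($pod, 'functions,manref');

print "guesswork changed the output\n" if $all ne $none;
```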
language
[5.00] Add commands telling groff that the input file is in the
given language. The value of this setting must be a language
abbreviation for which groff provides supplemental
configuration, such as "ja" (for Japanese) or "zh" (for
Chinese).
Specifically, this adds:
.mso <language>.tmac
.hla <language>
to the start of the file, which configure correct line breaking
for the specified language. Without these commands, groff may
not know how to add proper line breaks for Chinese and Japanese
text if the manual page is installed into the normal manual
page directory, such as /usr/share/man.
On many systems, this will be done automatically if the manual
page is installed into a language-specific manual page
directory, such as /usr/share/man/zh_CN. In that case, this
option is not required.
Unfortunately, the commands added with this option are specific
to groff and will not work with other troff and nroff
implementations.
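As an illustration (assuming Pod::Man 5.00 or later; the POD text and page name are placeholders), the added commands can be seen by filtering the generated output:

```perl
use strict;
use warnings;
use Pod::Man;

# Hypothetical POD input, used only for this illustration.
my $pod = "=head1 NAME\n\nexample - language demo\n";

my $parser = Pod::Man->new(
    name     => 'EXAMPLE',
    section  => 1,
    language => 'ja',
);

my $output;
$parser->output_string(\$output);
$parser->parse_string_document($pod);

# Show only the groff-specific language commands, if any were added.
my @commands = grep { /^\.(?:mso|hla)\b/ } split /\n/, $output;
print "$_\n" for @commands;
```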
lquote
rquote
[4.08] Sets the quote marks used to surround C<> text.
"lquote" sets the left quote mark and "rquote" sets the right
quote mark. Either may also be set to the special value
"none", in which case no quote mark is added on that side of
C<> text (but the font is still changed for troff output).
Also see the "quotes" option, which can be used to set both
quotes at once. If both "quotes" and one of the other options
is set, "lquote" or "rquote" overrides "quotes".
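The sketch below contrasts custom quote marks with no quote marks at all. The helper format_pod(), the sample POD, and the choice of "<<" and ">>" are all assumptions made for the example rather than anything Pod::Man prescribes:

```perl
use strict;
use warnings;
use Pod::Man;

# Hypothetical POD input containing C<> text.
my $pod = "=head1 NAME\n\nexample - quote demo\n\n"
        . "=head1 DESCRIPTION\n\nRun C<make test> first.\n";

# Format the given POD source with extra constructor options and return
# the resulting *roff text.
sub format_pod {
    my ($source, %options) = @_;
    my $parser = Pod::Man->new(
        name    => 'EXAMPLE',
        section => 1,
        date    => '2024-01-01',
        %options,
    );
    my $output;
    $parser->output_string(\$output);
    $parser->parse_string_document($source);
    return $output;
}

# Different left and right quote marks.
my $custom = format_pod($pod, lquote => '<<', rquote => '>>');

# No quote marks at all around C<> text.
my $plain = format_pod($pod, quotes => 'none');

print "the two settings produce different output\n" if $custom ne $plain;
```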
name
[4.08] Set the name of the manual page for the ".TH" macro.
Without this option, the manual name is set to the uppercased
base name of the file being converted unless the manual section
is 3, in which case the path is parsed to see if it is a Perl
module path. If it is, a path like ".../lib/Pod/Man.pm" is
converted into a name like "Pod::Man". This option, if given,
overrides any automatic determination of the name.
If generating a manual page from standard input, the name will
be set to "STDIN" if this option is not provided. In this
case, providing this option is strongly recommended to set a
meaningful manual page name.
nourls
[2.27] Normally, L<> formatting codes with both a URL and anchor
text are formatted to show both the anchor text and the URL.
In other words:
L<foo|http://example.com/>
is formatted as:
foo <http://example.com/>
This option, if set to a true value, suppresses the URL when
anchor text is given, so this example would be formatted as
just "foo". This can produce less cluttered output in cases
where the URLs are not particularly important.
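A small sketch of the difference, using the link from the example above (the helper format_pod() and the surrounding POD are hypothetical):

```perl
use strict;
use warnings;
use Pod::Man;

# Hypothetical POD input containing a link with anchor text.
my $pod = "=head1 NAME\n\nexample - nourls demo\n\n"
        . "=head1 DESCRIPTION\n\nSee L<foo|http://example.com/> for details.\n";

# Format the given POD source with extra constructor options and return
# the resulting *roff text.
sub format_pod {
    my ($source, %options) = @_;
    my $parser = Pod::Man->new(name => 'EXAMPLE', section => 1, %options);
    my $output;
    $parser->output_string(\$output);
    $parser->parse_string_document($source);
    return $output;
}

my $with_url    = format_pod($pod);
my $without_url = format_pod($pod, nourls => 1);

print "default output mentions the URL\n" if $with_url    =~ /example\.com/;
print "nourls output omits the URL\n"     if $without_url !~ /example\.com/;
```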
quotes
[4.00] Sets the quote marks used to surround C<> text. If the
value is a single character, it is used as both the left and
right quote. Otherwise, it is split in half, and the first
half of the string is used as the left quote and the second is
used as the right quote.
This may also be set to the special value "none", in which case
no quote marks are added around C<> text (but the font is still
changed for troff output).
Also see the "lquote" and "rquote" options, which can be used
to set the left and right quotes independently. If both
"quotes" and one of the other options is set, "lquote" or
"rquote" overrides "quotes".
release
[1.00] Set the centered footer for the ".TH" macro. By
default, this is set to the version of Perl you run Pod::Man
under. Setting this to the empty string will cause some *roff
implementations to use the system default value.
Note that some system "an" macro sets assume that the centered
footer will be a modification date and will prepend something
like "Last modified: ". If this is the case for your target
system, you may want to set "release" to the last modified date
and "date" to the version number.
section
[1.00] Set the section for the ".TH" macro. The standard
section numbering convention is to use 1 for user commands, 2
for system calls, 3 for functions, 4 for devices, 5 for file
formats, 6 for games, 7 for miscellaneous information, and 8
for administrator commands. There is a lot of variation here,
however; some systems (like Solaris) use 4 for file formats, 5
for miscellaneous information, and 7 for devices. Still others
use 1m instead of 8, or some mix of both. About the only
section numbers that are reliably consistent are 1, 2, and 3.
By default, section 1 will be used unless the file ends in
".pm" in which case section 3 will be selected.
stderr
[2.19] If set to a true value, send error messages about
invalid POD to standard error instead of appending a POD ERRORS
section to the generated *roff output. This is equivalent to
setting "errors" to "stderr" if "errors" is not already set.
This option is for backward compatibility with Pod::Man
versions that did not support "errors". Normally, the "errors"
option should be used instead.
utf8
[2.21] This option used to set the output encoding to UTF-8.
Since this is now the default, it is ignored and does nothing.
INSTANCE METHODS
As a derived class from Pod::Simple, Pod::Man supports the same methods
and interfaces. See Pod::Simple for all the details. This section
summarizes the most-frequently-used methods and the ones added by
Pod::Man.
output_fh(FH)
Direct the output from parse_file(), parse_lines(), or
parse_string_document() to the file handle FH instead of "STDOUT".
output_string(REF)
Direct the output from parse_file(), parse_lines(), or
parse_string_document() to the scalar variable pointed to by REF,
rather than "STDOUT". For example:
my $man = Pod::Man->new();
my $output;
$man->output_string(\$output);
$man->parse_file('/some/input/file');
Be aware that the output in that variable will already be encoded
in UTF-8.
parse_file(PATH)
Read the POD source from PATH and format it. By default, the
output is sent to "STDOUT", but this can be changed with the
output_fh() or output_string() methods.
parse_from_file(INPUT, OUTPUT)
parse_from_filehandle(FH, OUTPUT)
Read the POD source from INPUT, format it, and output the results
to OUTPUT.
parse_from_filehandle() is provided for backward compatibility with
older versions of Pod::Man. parse_from_file() should be used
instead.
parse_lines(LINES[, ...[, undef]])
Parse the provided lines as POD source, writing the output to
either "STDOUT" or the file handle set with the output_fh() or
output_string() methods. This method can be called repeatedly to
provide more input lines. An explicit "undef" should be passed to
indicate the end of input.
This method expects raw bytes, not decoded characters.
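As a sketch of incremental use (the POD lines and page name are placeholders), input can be supplied in batches and then terminated with "undef":

```perl
use strict;
use warnings;
use Pod::Man;

my $parser = Pod::Man->new(name => 'EXAMPLE', section => 1);

# Collect the result in a scalar instead of writing to STDOUT.
my $output;
$parser->output_string(\$output);

# Feed the document in two batches, one line per list element.
$parser->parse_lines('=head1 NAME', '', 'example - incremental parsing', '');
$parser->parse_lines('=head1 DESCRIPTION', '', 'Some text.', '');

# An explicit undef marks the end of input.
$parser->parse_lines(undef);

print length($output), " bytes of *roff generated\n";
```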
parse_string_document(INPUT)
Parse the provided scalar variable as POD source, writing the
output to either "STDOUT" or the file handle set with the
output_fh() or output_string() methods.
This method expects raw bytes, not decoded characters.
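If the POD source is held as a decoded character string, one way to satisfy the raw-bytes requirement is to encode it first. The following is a sketch under that assumption, using Encode::encode() and made-up POD text:

```perl
use strict;
use warnings;
use Encode qw(encode);
use Pod::Man;

# A decoded (character) string, as might result from reading a file
# through an encoding layer.  "\x{E9}" is the character U+00E9.
my $chars = "=encoding utf-8\n\n=head1 NAME\n\nexample - bytes demo\n\n"
          . "=head1 DESCRIPTION\n\nThe caf\x{E9} menu.\n";

# Convert the characters to raw UTF-8 bytes, matching the =encoding line.
my $bytes = encode('UTF-8', $chars);

my $parser = Pod::Man->new(name => 'EXAMPLE', section => 1);
my $output;
$parser->output_string(\$output);
$parser->parse_string_document($bytes);

print length($output), " bytes of *roff generated\n";
```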
ENCODING
As of Pod::Man 5.00, the default output encoding for Pod::Man is UTF-8.
This should work correctly on any modern system that uses either groff
(most Linux distributions) or mandoc (Alpine Linux and most BSD
variants, including macOS).
The user will probably have to use a UTF-8 locale to see correct
output. This may be done by default; if not, set the LANG or LC_CTYPE
environment variables to an appropriate locale. The locale "C.UTF-8" is
available on most systems if one wants correct output without changing
the other things locales affect, such as collation.
The backward-compatible output format used in Pod::Man versions before
5.00 is available by setting the "encoding" option to "roff". This may
produce marginally nicer results on older UNIX versions that do not use
groff or mandoc, but none of the available options will correctly
render Unicode characters on those systems.
Below are some additional details about how this choice was made and
some discussion of alternatives.
History
The default output encoding for Pod::Man has been a long-standing
problem. troff and nroff predate Unicode by a significant margin, and
their implementations for many UNIX systems reflect that legacy. It's
common for Unicode to not be supported in any form.
Because of this, versions of Pod::Man prior to 5.00 maintained the
highly conservative output of the original pod2man, which output pure
ASCII with complex macros to simulate common western European accented
characters when processed with troff. The nroff output was awkward and
sometimes incorrect, and characters not used in western European
scripts were replaced with "X". This choice maximized backwards
compatibility with man and nroff/troff implementations at the cost of
incorrect rendering of many POD documents, particularly those
containing people's names.
The modern implementations, groff (used in most Linux distributions)
and mandoc (used by most BSD variants), do now support Unicode. Other
UNIX systems often do not, but they're now a tiny minority of the
systems people use on a daily basis. It's increasingly common (for
very good reasons) to use Unicode characters for POD documents rather
than using ASCII conversions of people's names or avoiding non-English
text, making the limitations in the old output format more apparent.
Four options have been proposed to fix this:
o Optionally support UTF-8 output but don't change the default. This
is the approach taken since Pod::Man 2.1.0, which added the "utf8"
option. Some Pod::Man users use this option for better output on
platforms known to support Unicode, but since the defaults have not
changed, people continued to encounter (and file bug reports about)
the poor default rendering.
o Convert characters to troff "\(xx" escapes. This requires
maintaining a large translation table and addresses only a tiny part
of the problem, since many Unicode characters have no standard troff
name. groff has the largest list, but if one is willing to assume
groff is the formatter, the next option is better.
o Convert characters to groff "\[uNNNN]" escapes. This is implemented
as the "groff" encoding for those who want to use it, and is
supported by both groff and mandoc. However, it is no better than
UTF-8 output for portability to other implementations. See "Testing
results" for more details.
o Change the default output format to UTF-8 and ask those who want
maximum backward compatibility to explicitly select the old encoding.
This fixes the issue for most users at the cost of backwards
compatibility. While the rendering of non-ASCII characters is
different on older systems that don't support UTF-8, it's not always
worse than the old output.
Pod::Man 5.00 and later makes the last choice. This arguably produces
worse output when manual pages are formatted with troff into PostScript
or PDF, but doing this is rare and normally manual, so the encoding can
be changed in those cases. The older output encoding is available by
setting "encoding" to "roff".
Testing results
Here are the results of testing "encoding" values of "utf-8" and "groff"
on various operating systems. The testing methodology was to create
man/man1 in the current directory, copy encoding.utf8 or encoding.groff
from the podlators 5.00 distribution to man/man1/encoding.1, and then
run:
LANG=C.UTF-8 MANPATH=$(pwd)/man man 1 encoding
If the locale is not explicitly set to one that includes UTF-8, the
Unicode characters were usually converted to ASCII (by, for example,
dropping an accent) or deleted or replaced with "<?>" if there was no
conversion.
Tested on 2022-09-25. Many thanks to the GCC Compile Farm project for
access to testing hosts.
OS                  UTF-8    groff
------------------  -------  -------
AIX 7.1             no [1]   no [2]
Alpine 3.15.0       yes      yes
CentOS 7.9          yes      yes
Debian 7            yes      yes
FreeBSD 13.0        yes      yes
NetBSD 9.2          yes      yes
OpenBSD 7.1         yes      yes
openSUSE Leap 15.4  yes      yes
Solaris 10          yes      no [2]
Solaris 11          no [3]   no [3]
I did not have access to a macOS system for testing, but since it uses
mandoc, its behavior is probably the same as the BSD hosts.
Notes:
[1] Unicode characters were converted to one or two random ASCII
characters unrelated to the original character.
[2] Unicode characters were shown as the body of the groff escape
rather than the indicated character (in other words, text like
"[u00EF]").
[3] Unicode characters were deleted entirely, as if they weren't there.
Using "nroff -man" instead of man to format the page showed the
same results as Solaris 10. Using "groff -k -man -Tutf8" to format
the page produced the correct output.
PostScript and PDF output using groff on a Debian 12 system do not
support combining accent marks or SMP characters due to a lack of
support in the default output font.
Testing on additional platforms is welcome. Please let the author know
if you have additional results.
DIAGNOSTICS
roff font should be 1 or 2 chars, not "%s"
(F) You specified a *roff font (using "fixed", "fixedbold", etc.)
that wasn't either one or two characters. Pod::Man doesn't support
*roff fonts longer than two characters, although some *roff
extensions do (the canonical versions of nroff and troff don't
either).
Invalid errors setting "%s"
(F) The "errors" parameter to the constructor was set to an unknown
value.
Invalid quote specification "%s"
(F) The quote specification given (the "quotes" option to the
constructor) was invalid. A quote specification must be either one
character long or an even number (greater than one) of characters
long.
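The three constructor diagnostics above can be provoked deliberately, as in this sketch. The invalid values are arbitrary examples, and the exact wording of each message may differ slightly between Pod::Man versions:

```perl
use strict;
use warnings;
use Pod::Man;

# One deliberately invalid value for each option.
my %bad_options = (
    quotes => 'abc',      # odd length greater than one
    errors => 'bogus',    # not one of die, stderr, pod, or none
    fixed  => 'ABC',      # font name longer than two characters
);

# Try each option on its own and record the exception text, if any.
my %message;
for my $option (sort keys %bad_options) {
    my $ok = eval { Pod::Man->new($option => $bad_options{$option}); 1 };
    $message{$option} = $ok ? q{} : $@;
    print "$option: $message{$option}" if !$ok;
}
```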
POD document had syntax errors
(F) The POD document being formatted had syntax errors and the
"errors" option was set to "die".
ENVIRONMENT
PERL_CORE
If set and Encode is not available, silently fall back to an
encoding of "groff" without complaining to standard error. This
environment variable is set during Perl core builds, which build
Encode after podlators. Encode is expected to not (yet) be
available in that case.
POD_MAN_DATE
If set, this will be used as the value of the left-hand footer
unless the "date" option is explicitly set, overriding the
timestamp of the input file or the current time. This is primarily
useful to ensure reproducible builds of the same output file given
the same source and Pod::Man version, even when file timestamps may
not be consistent.
SOURCE_DATE_EPOCH
If set, and POD_MAN_DATE and the "date" option are not set, this
will be used as the modification time of the source file,
overriding the timestamp of the input file or the current time. It
should be set to the desired time in seconds since UNIX epoch.
This is primarily useful to ensure reproducible builds of the same
output file given the same source and Pod::Man version, even when
file timestamps may not be consistent. See
<https://reproducible-builds.org/specs/source-date-epoch/> for the
full specification.
(Arguably, according to the specification, this variable should be
used only if the timestamp of the input file is not available and
Pod::Man uses the current time. However, for reproducible builds
in Debian, results were more reliable if this variable overrode the
timestamp of the input file.)
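As a sketch of the effect (the POD text and page name are placeholders, and the epoch value of 86400 is chosen only because it corresponds to 1970-01-02 in UTC), the variable can be set locally around a conversion:

```perl
use strict;
use warnings;
use Pod::Man;

# Hypothetical POD input, used only for this illustration.
my $pod = "=head1 NAME\n\nexample - reproducible date\n";

my $output;
{
    # POD_MAN_DATE would take precedence, so remove it for this block,
    # then pin the time to one day after the UNIX epoch.
    delete local $ENV{POD_MAN_DATE};
    local $ENV{SOURCE_DATE_EPOCH} = 86400;

    my $parser = Pod::Man->new(name => 'EXAMPLE', section => 1);
    $parser->output_string(\$output);
    $parser->parse_string_document($pod);
}

# Extract and print the .TH line, which carries the date.
my ($th) = $output =~ /^(\.TH .*)$/m;
print "$th\n" if defined $th;
```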
COMPATIBILITY
Pod::Man 1.02 (based on Pod::Parser) was the first version included
with Perl, in Perl 5.6.0.
The current API based on Pod::Simple was added in Pod::Man 2.00.
Pod::Man 2.04 was included in Perl 5.9.3, the first version of Perl to
incorporate those changes. This is the first version that correctly
supports all modern POD syntax. The parse_from_filehandle() method was
re-added for backward compatibility in Pod::Man 2.09, included in Perl
5.9.4.
Support for anchor text in L<> links of type URL was added in Pod::Man
2.23, included in Perl 5.11.5.
parse_lines(), parse_string_document(), and parse_file() set a default
output file handle of "STDOUT" if one was not already set as of
Pod::Man 2.28, included in Perl 5.19.5.
Support for SOURCE_DATE_EPOCH and POD_MAN_DATE was added in Pod::Man
4.00, included in Perl 5.23.7, and generated dates were changed to use
UTC instead of the local time zone. This is also the first release
that aligned the module version and the version of the podlators
distribution. All modules included in podlators, and the podlators
distribution itself, share the same version number from this point
forward.
Pod::Man 4.10, included in Perl 5.27.8, changed the formatting for
manual page references and function names to bold instead of italic,
following the current Linux manual page standard.
Pod::Man 5.00 changed the default output encoding to UTF-8, overridable
with the new "encoding" option. It also fixed problems with bold or
italic extending too far when used with C<> escapes, and began
converting Unicode zero-width spaces (U+200B) to the "\:" *roff escape.
It also dropped attempts to add subtle formatting corrections in the
output that would only be visible when typeset with troff, which had
previously been a significant source of bugs.
BUGS
There are numerous bugs and language-specific assumptions in the nroff
fallbacks for accented characters in the "roff" encoding. Since the
point of this encoding is backward compatibility with the output from
earlier versions of Pod::Man, and it is deprecated except when
necessary to support old systems, those bugs are unlikely to ever be
fixed.
Pod::Man doesn't handle font names longer than two characters. Neither
do most troff implementations, but groff does as an extension. It
would be nice to support this as an option for those who want to use it.
CAVEATS
Sentence spacing
Pod::Man copies the input spacing verbatim to the output *roff
document. This means your output will be affected by how nroff
generally handles sentence spacing.
nroff dates from an era in which it was standard to use two spaces
after sentences, and will always add two spaces after a line-ending
period (or similar punctuation) when reflowing text. For example, the
following input:
=pod
One sentence.
Another sentence.
will result in two spaces after the period when the text is reflowed.
If you use two spaces after sentences anyway, this will be consistent,
although you will have to be careful to not end a line with an
abbreviation such as "e.g." or "Ms.". Output will also be consistent
if you use the *roff style guide (and XKCD 1285
<https://xkcd.com/1285/>) recommendation of putting a line break after
each sentence, although that will consistently produce two spaces after
each sentence, which may not be what you want.
If you prefer one space after sentences (which is the more modern
style), you will unfortunately need to ensure that no line in the
middle of a paragraph ends in a period or similar sentence-ending
punctuation. Otherwise, nroff will add two spaces after that sentence
when reflowing, and your output document will have inconsistent
spacing.
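For those who prefer one space after sentences, a rough check of the POD source can help find the problem lines. The following is a hypothetical helper, not part of Pod::Man, and it only approximates POD paragraph rules (it skips command lines and lines that start with whitespace):

```perl
use strict;
use warnings;

# Return the 1-based numbers of lines that end in sentence-ending
# punctuation but are followed by another line of the same paragraph.
sub risky_line_endings {
    my ($text) = @_;
    my @lines = split /\n/, $text;
    my @risky;
    for my $i (0 .. $#lines - 1) {
        my ($line, $next) = @lines[$i, $i + 1];
        next if $line =~ /^(?:\s|=)/;    # verbatim text or a POD command
        next if $next =~ /^\s*$/;        # last line of the paragraph
        push @risky, $i + 1 if $line =~ /[.?!]["')\]]*\s*$/;
    }
    return @risky;
}

# The first paragraph repeats the example above; the second is harmless.
my $pod = "=pod\n\nOne sentence.\nAnother sentence.\n\nA line that is\nfine.\n";
my @risky = risky_line_endings($pod);
print "line $_ ends a sentence mid-paragraph\n" for @risky;
```

Only line 3 is reported here, since "Another sentence." is the last line of its paragraph.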
Hyphens
The handling of hyphens versus dashes is somewhat fragile, and one may
get the wrong one under some circumstances. This will normally only
matter for line breaking and possibly for troff output.
AUTHOR
Written by Russ Allbery <rra@cpan.org>, based on the original pod2man
by Tom Christiansen <tchrist@mox.perl.com>.
The modifications to work with Pod::Simple instead of Pod::Parser were
contributed by Sean Burke <sburke@cpan.org>, but I've since hacked them
beyond recognition and all bugs are mine.
COPYRIGHT AND LICENSE
Copyright 1999-2010, 2012-2020, 2022 Russ Allbery <rra@cpan.org>
Substantial contributions by Sean Burke <sburke@cpan.org>.
This program is free software; you may redistribute it and/or modify it
under the same terms as Perl itself.
SEE ALSO
Encode::Supported(3), Pod::Simple(3), perlpod(1), pod2man(1), nroff(1),
troff(1), man(1), man(7)
Ossanna, Joseph F., and Brian W. Kernighan. "Troff User's Manual,"
Computing Science Technical Report No. 54, AT&T Bell Laboratories.
This is the best documentation of standard nroff and troff. At the
time of this writing, it's available at <http://www.troff.org/54.pdf>.
The manual page documenting the man macro set may be man(5) instead of
man(7) on your system.
See perlpodstyle(1) for documentation on writing manual pages in POD if
you've not done it before and aren't familiar with the conventions.
The current version of this module is always available from its web
site at <https://www.eyrie.org/~eagle/software/podlators/>. It is also
part of the Perl core distribution as of 5.6.0.
perl v5.38.2 2023-11-28 Pod::Man(3pm)
