perlxstypemap(1)       Perl Programmers Reference Guide       perlxstypemap(1)



NAME

       perlxstypemap - Perl XS C/Perl type mapping


DESCRIPTION

       The more you think about interfacing between two languages, the more
       you'll realize that the majority of programmer effort has to go into
       converting between the data structures that are native to either of the
       languages involved.  This trumps other matters such as differing calling
       conventions because the problem space is so much greater.  There are
       simply more ways to shove data into memory than there are ways to
       implement a function call.

       Perl XS' attempt at a solution to this is the concept of typemaps.  At
       an abstract level, a Perl XS typemap is nothing but a recipe for
       converting from a certain Perl data structure to a certain C data
       structure and vice versa.  Since there can be C types that are
       sufficiently similar to one another to warrant converting with the same
       logic, XS typemaps are represented by a unique identifier, henceforth
       called an XS type in this document.  You can then tell the XS compiler
       that multiple C types are to be mapped with the same XS typemap.

       In your XS code, when you define an argument with a C type or when you
       are using a "CODE:" and an "OUTPUT:" section together with a C return
       type of your XSUB, it'll be the typemapping mechanism that makes this
       easy.

   Anatomy of a typemap
       In more practical terms, the typemap is a collection of code fragments
       which are used by the xsubpp compiler to map C function parameters and
       values to Perl values.  The typemap file may consist of three sections
       labelled "TYPEMAP", "INPUT", and "OUTPUT".  An unlabelled initial
       section is assumed to be a "TYPEMAP" section.  The INPUT section tells
       the compiler how to translate Perl values into variables of certain C
       types.  The OUTPUT section tells the compiler how to translate the
       values from certain C types into values Perl can understand.  The
       TYPEMAP section tells the compiler which of the INPUT and OUTPUT code
       fragments should be used to map a given C type to a Perl value.  The
       section labels "TYPEMAP", "INPUT", or "OUTPUT" must begin in the first
       column on a line by themselves, and must be in uppercase.

       Each type of section can appear an arbitrary number of times and does
       not have to appear at all.  For example, a typemap may commonly lack
       "INPUT" and "OUTPUT" sections if all it needs to do is associate
       additional C types with core XS types like T_PTROBJ.  Lines that start
       with a hash "#" are considered comments and ignored in the "TYPEMAP"
       section, but are considered significant in "INPUT" and "OUTPUT". Blank
       lines are generally ignored.

       Traditionally, typemaps needed to be written to a separate file,
       conventionally called "typemap" in a CPAN distribution.  With
       ExtUtils::ParseXS (the XS compiler) version 3.12 or better which comes
       with perl 5.16, typemaps can also be embedded directly into XS code
       using a HERE-doc like syntax:

         TYPEMAP: <<HERE
         ...
         HERE

       where "HERE" can be replaced by other identifiers, as with normal Perl
       HERE-docs.  All details below about the typemap textual format remain
       valid.

       The "TYPEMAP" section should contain one pair of C type and XS type per
       line as follows.  An example from the core typemap file:

         TYPEMAP
         # all variants of char* are handled by the T_PV typemap
         char *          T_PV
         const char *    T_PV
         unsigned char * T_PV
         ...

       The "INPUT" and "OUTPUT" sections have identical formats, that is, each
       unindented line starts a new in- or output map respectively.  A new in-
       or output map must start with the name of the XS type to map on a line
       by itself, followed by the code that implements it indented on the
       following lines. Example:

         INPUT
         T_PV
           $var = ($type)SvPV_nolen($arg)
         T_PTR
           $var = INT2PTR($type,SvIV($arg))

       We'll get to the meaning of those Perlish-looking variables in a little
       bit.

       Finally, here's an example of the full typemap file for mapping C
       strings of the "char *" type to Perl scalars/strings:

         TYPEMAP
         char *  T_PV

         INPUT
         T_PV
           $var = ($type)SvPV_nolen($arg)

         OUTPUT
         T_PV
           sv_setpv((SV*)$arg, $var);
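
       To make the two-step indirection concrete, here is a minimal sketch
       in plain C (it is not part of xsubpp) of the lookup these sections
       describe: a C type selects an XS type through the "TYPEMAP" section,
       and the XS type selects a code fragment from the "INPUT" section.
       The struct and function names are hypothetical, and the tables hold
       only the entries shown in the examples above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of a TYPEMAP line: C type -> XS type. */
struct typemap_entry { const char *ctype; const char *xstype; };

/* Hypothetical in-memory form of an INPUT entry: XS type -> code template. */
struct input_entry { const char *xstype; const char *code; };

static const struct typemap_entry typemap_section[] = {
    { "char *",          "T_PV" },
    { "const char *",    "T_PV" },
    { "unsigned char *", "T_PV" },
};

static const struct input_entry input_section[] = {
    { "T_PV",  "$var = ($type)SvPV_nolen($arg)" },
    { "T_PTR", "$var = INT2PTR($type,SvIV($arg))" },
};

/* Returns the XS type registered for ctype, or NULL if there is none. */
static const char *xstype_for(const char *ctype)
{
    size_t i;
    assert(ctype != NULL);
    for (i = 0; i < sizeof typemap_section / sizeof typemap_section[0]; i++)
        if (strcmp(typemap_section[i].ctype, ctype) == 0)
            return typemap_section[i].xstype;
    return NULL;
}

/* Returns the INPUT code template for ctype, or NULL if it is unmapped. */
static const char *input_code_for(const char *ctype)
{
    const char *xstype = xstype_for(ctype);
    size_t i;
    if (xstype == NULL)
        return NULL;
    for (i = 0; i < sizeof input_section / sizeof input_section[0]; i++)
        if (strcmp(input_section[i].xstype, xstype) == 0)
            return input_section[i].code;
    return NULL;
}
```

       Under these assumptions, all three "char *" variants resolve to the
       same T_PV template, while a C type that is absent from the table
       (such as "double" in this sketch) yields NULL.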

       Here's a more complicated example: suppose that you wanted "struct
       netconfig" to be blessed into the class "Net::Config".  One way to do
       this is to use underscores (_) to separate package names, as follows:

         typedef struct netconfig * Net_Config;

       And then provide a typemap entry "T_PTROBJ_SPECIAL" that maps
       underscores to double-colons (::), and declare "Net_Config" to be of
       that type:

         TYPEMAP
         Net_Config      T_PTROBJ_SPECIAL

         INPUT
         T_PTROBJ_SPECIAL
           if (sv_derived_from($arg, \"${(my $ntt=$ntype)=~s/_/::/g;\$ntt}\")){
             IV tmp = SvIV((SV*)SvRV($arg));
             $var = INT2PTR($type, tmp);
           }
           else
             croak(\"$var is not of type ${(my $ntt=$ntype)=~s/_/::/g;\$ntt}\")

         OUTPUT
         T_PTROBJ_SPECIAL
           sv_setref_pv($arg, \"${(my $ntt=$ntype)=~s/_/::/g;\$ntt}\",
                        (void*)$var);

       The INPUT and OUTPUT sections replace underscores with double-colons
       on the fly, giving the desired effect.  This example demonstrates some
       of the power and versatility of the typemap facility.

       The "INT2PTR" macro (defined in perl.h) casts an integer to a pointer
       of a given type, taking care of the possible different size of integers
       and pointers.  There are also "PTR2IV", "PTR2UV", "PTR2NV" macros, to
       map the other way, which may be useful in OUTPUT sections.
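
       The idea behind these macros can be sketched outside of perl.h with
       the standard "intptr_t" type, which is wide enough to hold a pointer.
       "MY_PTR2IV" and "MY_INT2PTR" below are hypothetical stand-ins and not
       the real macros, and "struct netconfig_demo" is an invented type used
       only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for Perl's IV; assumed wide enough for a pointer. */
typedef intptr_t my_iv;

/* Hypothetical stand-ins for perl.h's PTR2IV and INT2PTR. */
#define MY_PTR2IV(p)        ((my_iv)(intptr_t)(p))
#define MY_INT2PTR(type, i) ((type)(intptr_t)(i))

struct netconfig_demo { int id; };

/* Round-trips a pointer through an integer, the way an OUTPUT map followed
 * by an INPUT map would, and reads a member through the recovered pointer. */
static int roundtrip_id(struct netconfig_demo *nc)
{
    my_iv stored = MY_PTR2IV(nc);   /* C to Perl direction */
    struct netconfig_demo *back =
        MY_INT2PTR(struct netconfig_demo *, stored);   /* Perl to C direction */
    assert(back == nc);
    return back->id;
}
```

       Going through an intermediate integer type of pointer width is what
       avoids size-mismatch problems when casting directly between a
       pointer and an integer of a different size.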

   The Role of the typemap File in Your Distribution
       The default typemap in the lib/ExtUtils directory of the Perl source
       contains many useful types which can be used by Perl extensions.  Some
       extensions define additional typemaps which they keep in their own
       directory.  These additional typemaps may reference INPUT and OUTPUT
       maps in the main typemap.  The xsubpp compiler will allow the
       extension's own typemap to override any mappings which are in the
       default typemap.  Instead of using an additional typemap file, typemaps
       may be embedded verbatim in XS with a heredoc-like syntax.  See the
       documentation on the "TYPEMAP:" XS keyword.

       For CPAN distributions, you can assume that the XS types defined by the
       perl core are already available. Additionally, the core typemap has
       default XS types for a large number of C types.  For example, if you
       simply return a "char *" from your XSUB, the core typemap will have
       this C type associated with the T_PV XS type.  That means your C string
       will be copied into the PV (pointer value) slot of a new scalar that
       will be returned from your XSUB to Perl.

       If you're developing a CPAN distribution using XS, you may add your own
       file called typemap to the distribution.  That file may contain
       typemaps that either map types that are specific to your code or that
       override the core typemap file's mappings for common C types.

   Sharing typemaps Between CPAN Distributions
       Starting with ExtUtils::ParseXS version 3.13_01 (comes with perl 5.16
       and better), it is rather easy to share typemap code between multiple
       CPAN distributions. The general idea is to share it as a module that
       offers a certain API and have the dependent modules declare that as a
       build-time requirement and import the typemap into the XS. An example
       of such a typemap-sharing module on CPAN is
       "ExtUtils::Typemaps::Basic". Two steps to getting that module's
       typemaps available in your code:

       o   Declare "ExtUtils::Typemaps::Basic" as a build-time dependency in
           "Makefile.PL" (use "BUILD_REQUIRES"), or in your "Build.PL" (use
           "build_requires").

       o   Include the following line in the XS section of your XS file:
           (don't break the line)

             INCLUDE_COMMAND: $^X -MExtUtils::Typemaps::Cmd
                              -e "print embeddable_typemap(q{Basic})"

   Writing typemap Entries
       Each INPUT or OUTPUT typemap entry is a double-quoted Perl string that
       will be evaluated in the presence of certain variables to get the final
       C code for mapping a certain C type.

       This means that you can embed Perl code in your typemap (C) code using
       constructs such as "${ perl code that evaluates to scalar reference
       here }". A common use case is to generate error messages that refer to
       the true function name even when using the ALIAS XS feature:

         ${ $ALIAS ? \q[GvNAME(CvGV(cv))] : \qq[\"$pname\"] }

       For many typemap examples, refer to the core typemap file that can be
       found in the perl source tree at lib/ExtUtils/typemap.

       The Perl variables that are available for interpolation into typemaps
       are the following:

       o   $var - the name of the input or output variable, e.g. RETVAL for
           return values.

       o   $type - the raw C type of the parameter, any ":" replaced with "_".
           e.g. for a type of "Foo::Bar", $type is "Foo__Bar"

       o   $ntype - the supplied type with "*" replaced with "Ptr".  e.g. for
           a type of "Foo*", $ntype is "FooPtr"

       o   $arg - the stack entry, that the parameter is input from or output
           to, e.g. ST(0)

       o   $argoff - the argument stack offset of the argument, i.e. 0 for the
           first argument, etc.

       o   $pname - the full name of the XSUB, including the "PACKAGE" name,
           with any "PREFIX" stripped.  This is the non-ALIAS name.

       o   $Package - the package specified by the most recent "PACKAGE"
           keyword.

       o   $ALIAS - non-zero if the current XSUB has any aliases declared with
           "ALIAS".
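
       The two name rewrites described for $type and $ntype can be sketched
       in plain C as follows.  This is an illustration of the rules as
       stated in the list above, not the code xsubpp runs; the function
       names are hypothetical, and dropping spaces in the $ntype sketch is
       an assumption made so that "foo_t **" and "foo_t**" give the same
       result.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* $type rule: copy src into dst, replacing every ':' with '_'. */
static void sketch_type(const char *src, char *dst, size_t dstlen)
{
    size_t n = 0;
    assert(dstlen > 0);
    for (; *src != '\0' && n + 1 < dstlen; src++)
        dst[n++] = (*src == ':') ? '_' : *src;
    dst[n] = '\0';
}

/* $ntype rule: copy src into dst, replacing every '*' with "Ptr".
 * Skipping spaces is an assumption of this sketch. */
static void sketch_ntype(const char *src, char *dst, size_t dstlen)
{
    size_t n = 0;
    assert(dstlen > 0);
    for (; *src != '\0'; src++) {
        if (*src == ' ')
            continue;
        if (*src == '*') {
            if (n + 3 >= dstlen)
                break;              /* no room for "Ptr" plus the NUL */
            memcpy(dst + n, "Ptr", 3);
            n += 3;
        } else {
            if (n + 1 >= dstlen)
                break;              /* no room for one char plus the NUL */
            dst[n++] = *src;
        }
    }
    dst[n] = '\0';
}
```

       With these helpers, "Foo::Bar" becomes "Foo__Bar" under the $type
       rule and "Foo*" becomes "FooPtr" under the $ntype rule, matching the
       examples in the list.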

   Full Listing of Core Typemaps
       Each C type is represented by an entry in the typemap file that is
       responsible for converting perl variables (SV, AV, HV, CV, etc.)  to
       and from that type. The following sections list all XS types that come
       with perl by default.

       T_SV
           This simply passes the C representation of the Perl variable (an
           SV*) in and out of the XS layer. This can be used if the C code
           wants to deal directly with the Perl variable.

       T_SVREF
           Used to pass in and return a reference to an SV.

           Note that this typemap does not decrement the reference count when
           returning the reference to an SV*.  See also:
           T_SVREF_REFCOUNT_FIXED

       T_SVREF_REFCOUNT_FIXED
           Used to pass in and return a reference to an SV.  This is a fixed
           variant of T_SVREF that decrements the refcount appropriately when
           returning a reference to an SV*. Introduced in perl 5.15.4.

       T_AVREF
           From the perl level this is a reference to a perl array.  From the
           C level this is a pointer to an AV.

           Note that this typemap does not decrement the reference count when
           returning an AV*. See also: T_AVREF_REFCOUNT_FIXED

       T_AVREF_REFCOUNT_FIXED
           From the perl level this is a reference to a perl array.  From the
           C level this is a pointer to an AV. This is a fixed variant of
           T_AVREF that decrements the refcount appropriately when returning
           an AV*. Introduced in perl 5.15.4.

       T_HVREF
           From the perl level this is a reference to a perl hash.  From the C
           level this is a pointer to an HV.

           Note that this typemap does not decrement the reference count when
           returning an HV*. See also: T_HVREF_REFCOUNT_FIXED

       T_HVREF_REFCOUNT_FIXED
           From the perl level this is a reference to a perl hash.  From the C
           level this is a pointer to an HV. This is a fixed variant of
           T_HVREF that decrements the refcount appropriately when returning
           an HV*. Introduced in perl 5.15.4.

       T_CVREF
           From the perl level this is a reference to a perl subroutine (e.g.
           $sub = sub { 1 };). From the C level this is a pointer to a CV.

           Note that this typemap does not decrement the reference count when
           returning a CV*. See also: T_CVREF_REFCOUNT_FIXED

       T_CVREF_REFCOUNT_FIXED
           From the perl level this is a reference to a perl subroutine (e.g.
           $sub = sub { 1 };). From the C level this is a pointer to a CV.

           This is a fixed variant of T_CVREF that decrements the refcount
           appropriately when returning a CV*. Introduced in perl 5.15.4.

       T_SYSRET
           The T_SYSRET typemap is used to process return values from system
           calls.  It is only meaningful when passing values from C to perl
           (there is no concept of passing a system return value from Perl to
           C).

           System calls return -1 on error (setting errno with the reason) and
           (usually) 0 on success. If the return value is -1 this typemap
           returns "undef". If the return value is not -1, this typemap
           translates a 0 (perl false) to "0 but true" (which is perl true) or
           returns the value itself, to indicate that the command succeeded.

           The POSIX module makes extensive use of this type.
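
           The decision rule described above can be sketched in plain C.
           The function below is hypothetical and does not touch any Perl
           data structure: it returns NULL to stand in for "undef" and
           otherwise a string holding what Perl would see.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Maps a system-call return value to the text Perl would see.
 * NULL stands in for undef; buf is scratch space for non-zero values. */
static const char *sysret_sketch(long rv, char *buf, size_t buflen)
{
    assert(buf != NULL && buflen > 0);
    if (rv == -1)
        return NULL;            /* failure: Perl sees undef */
    if (rv == 0)
        return "0 but true";    /* success, and still true in Perl */
    snprintf(buf, buflen, "%ld", rv);
    return buf;                 /* any other value is passed through */
}
```

           The point of the "0 but true" string is that Perl code can test
           the result for truth to detect failure while still treating a
           successful zero as the number 0.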

       T_UV
           An unsigned integer.

       T_IV
           A signed integer. This is cast to the required integer type when
           passed to C and converted to an IV when passed back to Perl.

       T_INT
           A signed integer. This typemap converts the Perl value to a native
           integer type (the "int" type on the current platform). When
           returning the value to perl it is processed in the same way as for
           T_IV.

           Its behaviour is identical to using an "int" type in XS with T_IV.

       T_ENUM
           An enum value. Used to transfer an enum component from C. There is
           no reason to pass an enum value to C since it is stored as an IV
           inside perl.

       T_BOOL
           A boolean type. This can be used to pass true and false values to
           and from C.

       T_U_INT
           This is for unsigned integers. It is equivalent to using T_UV but
           explicitly casts the variable to type "unsigned int".  The default
           type for "unsigned int" is T_UV.

       T_SHORT
           Short integers. This is equivalent to T_IV but explicitly casts the
           return to type "short". The default typemap for "short" is T_IV.

       T_U_SHORT
           Unsigned short integers. This is equivalent to T_UV but explicitly
           casts the return to type "unsigned short". The default typemap for
           "unsigned short" is T_UV.

           T_U_SHORT is used for type "U16" in the standard typemap.

       T_LONG
           Long integers. This is equivalent to T_IV but explicitly casts the
           return to type "long". The default typemap for "long" is T_IV.

       T_U_LONG
           Unsigned long integers. This is equivalent to T_UV but explicitly
           casts the return to type "unsigned long". The default typemap for
           "unsigned long" is T_UV.

           T_U_LONG is used for type "U32" in the standard typemap.

       T_CHAR
           Single 8-bit characters.

       T_U_CHAR
           An unsigned byte.

       T_FLOAT
           A floating point number. This typemap guarantees to return a
           variable cast to a "float".

       T_NV
           A Perl floating point number. Similar to T_IV and T_UV in that the
           return type is cast to the requested numeric type rather than to a
           specific type.

       T_DOUBLE
           A double precision floating point number. This typemap guarantees
           to return a variable cast to a "double".

       T_PV
           A string (char *).

       T_PTR
           A memory address (pointer). Typically associated with a "void *"
           type.

       T_PTRREF
           Similar to T_PTR except that the pointer is stored in a scalar and
           the reference to that scalar is returned to the caller. This can be
           used to hide the actual pointer value from the programmer since it
           is usually not required directly from within perl.

           The typemap checks that a scalar reference is passed from perl to
           XS.

       T_PTROBJ
           Similar to T_PTRREF except that the reference is blessed into a
           class.  This allows the pointer to be used as an object. Most
           commonly used to deal with C structs. The typemap checks that the
           perl object passed into the XS routine is of the correct class (or
           part of a subclass).

           The pointer is blessed into a class that is derived from the name
           of the type of the pointer but with all '*' in the name replaced
           with 'Ptr'.

           For "DESTROY" XSUBs only, a T_PTROBJ is optimized to a T_PTRREF.
           This means the class check is skipped.

       T_REF_IV_REF
           NOT YET

       T_REF_IV_PTR
           Similar to T_PTROBJ in that the pointer is blessed into a scalar
           object.  The difference is that when the object is passed back into
           XS it must be of the correct type (inheritance is not supported)
           while T_PTROBJ supports inheritance.

           The pointer is blessed into a class that is derived from the name
           of the type of the pointer but with all '*' in the name replaced
           with 'Ptr'.

           For "DESTROY" XSUBs only, a T_REF_IV_PTR is optimized to a
           T_PTRREF. This means the class check is skipped.

       T_PTRDESC
           NOT YET

       T_REFREF
           Similar to T_PTRREF, except the pointer stored in the referenced
           scalar is dereferenced and copied to the output variable. This
           means that T_REFREF is to T_PTRREF as T_OPAQUE is to T_OPAQUEPTR.
           All clear?

           Only the INPUT part of this is implemented (Perl to XSUB) and there
           are no known users in core or on CPAN.

       T_REFOBJ
           Like T_REFREF, except it does strict type checking (inheritance is
           not supported).

           For "DESTROY" XSUBs only, a T_REFOBJ is optimized to a T_REFREF.
           This means the class check is skipped.

       T_OPAQUEPTR
           This can be used to store bytes in the string component of the SV.
           Here the representation of the data is irrelevant to perl and the
           bytes themselves are just stored in the SV. It is assumed that the
           C variable is a pointer (the bytes are copied from that memory
           location).  If the pointer is pointing to something that is
           represented by 8 bytes then those 8 bytes are stored in the SV (and
           length() will report a value of 8). This entry is similar to
           T_OPAQUE.

           In principle the unpack() command can be used to convert the bytes
           back to a number (if the underlying type is known to be a number).

           This entry can be used to store a C structure (the number of bytes
           to be copied is calculated using the C "sizeof" operator) and can
           be used as an alternative to T_PTRREF without having to worry about
           a memory leak (since Perl will clean up the SV).
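
           As a rough illustration of what "storing the bytes" means, the
           plain C sketch below copies "sizeof" bytes of a struct into a
           byte buffer and back.  The buffer stands in for the string
           component of the SV; "struct point_demo" and both functions are
           hypothetical.  The sketch copies the bytes back out with
           memcpy(), which is a choice made here to sidestep alignment
           concerns and is not a claim about what the core typemap does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct point_demo { int x; int y; };

/* C to Perl direction: copy the bytes that p points at into a byte buffer.
 * Returns the number of bytes stored (what length() would report). */
static size_t opaque_store(unsigned char *bytes, size_t cap,
                           const struct point_demo *p)
{
    assert(cap >= sizeof *p);
    memcpy(bytes, p, sizeof *p);
    return sizeof *p;
}

/* Perl to C direction: rebuild the struct from the stored bytes. */
static struct point_demo opaque_load(const unsigned char *bytes, size_t len)
{
    struct point_demo p;
    assert(len == sizeof p);
    memcpy(&p, bytes, sizeof p);
    return p;
}
```

           Because the buffer owns a copy of the data, nothing has to be
           freed separately once the buffer itself is released, which is
           the property the paragraph above contrasts with T_PTRREF.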

       T_OPAQUE
           This can be used to store data from non-pointer types in the string
           part of an SV. It is similar to T_OPAQUEPTR except that the typemap
           retrieves the pointer directly rather than assuming it is being
           supplied. For example, if an integer is imported into Perl using
           T_OPAQUE rather than T_IV the underlying bytes representing the
           integer will be stored in the SV but the actual integer value will
           not be available, i.e. the data is opaque to perl.

           The data may be retrieved using the "unpack" function if the
           underlying type of the byte stream is known.

           T_OPAQUE supports input and output of simple types.  T_OPAQUEPTR
           can be used to pass these bytes back into C if a pointer is
           acceptable.

       Implicit array
           xsubpp supports a special syntax for returning packed C arrays to
           perl. If the XS return type is given as

             array(type, nelem)

           xsubpp will copy the contents of "nelem * sizeof(type)" bytes from
           RETVAL to an SV and push it onto the stack. This is only really
           useful if the number of items to be returned is known at compile
           time and you don't mind having a string of bytes in your SV.  Use
           T_ARRAY to push a variable number of arguments onto the return
           stack (they won't be packed as a single string though).

           This is similar to using T_OPAQUEPTR but can be used to process
           more than one element.

       T_PACKED
           Calls user-supplied functions for conversion. For "OUTPUT" (XSUB to
           Perl), a function named "XS_pack_$ntype" is called with the output
           Perl scalar and the C variable to convert from.  $ntype is the
           normalized C type that is to be mapped to Perl. Normalized means
           that all "*" are replaced by the string "Ptr". The return value of
           the function is ignored.

           Conversely for "INPUT" (Perl to XSUB) mapping, the function named
           "XS_unpack_$ntype" is called with the input Perl scalar as argument
           and the return value is cast to the mapped C type and assigned to
           the output C variable.

           An example conversion function for a typemapped struct "foo_t *"
           might be:

             static void
             XS_pack_foo_tPtr(SV *out, foo_t *in)
             {
               dTHX; /* alas, signature does not include pTHX_ */
               HV* hash = newHV();
               hv_stores(hash, "int_member", newSViv(in->int_member));
               hv_stores(hash, "float_member", newSVnv(in->float_member));
               /* ... */

               /* mortalize as thy stack is not refcounted */
               sv_setsv(out, sv_2mortal(newRV_noinc((SV*)hash)));
             }

           The conversion from Perl to C is left as an exercise to the reader,
           but the prototype would be:

             static foo_t *
             XS_unpack_foo_tPtr(SV *in);

           Instead of an actual C function that has to fetch the thread
           context using "dTHX", you can define macros of the same name and
           avoid the overhead. Also, remember that you may need to free the
           memory allocated by "XS_unpack_foo_tPtr".

       T_PACKEDARRAY
           T_PACKEDARRAY is similar to T_PACKED. In fact, the "INPUT" (Perl to
           XSUB) typemap is identical, but the "OUTPUT" typemap passes an
           additional argument to the "XS_pack_$ntype" function. This third
           parameter indicates the number of elements in the output so that
           the function can handle C arrays sanely. The variable needs to be
           declared by the user and must have the name "count_$ntype" where
           $ntype is the normalized C type name as explained above. For the
           example above and a C type of "foo_t **", the signature of the
           function would be:

             static void
             XS_pack_foo_tPtrPtr(SV *out, foo_t **in, UV count_foo_tPtrPtr);

           The type of the third parameter is arbitrary as far as the typemap
           is concerned. It just has to be in line with the declared variable.

           Of course, unless you know the number of elements in the "sometype
           **" C array, within your XSUB, the return value from "foo_t **
           XS_unpack_foo_tPtrPtr(...)" will be hard to decipher.  Since the
           details are all up to the XS author (the typemap user), there are
           several solutions, none of which is particularly elegant.  The most
           commonly seen solution has been to allocate memory for N+1 pointers
           and assign "NULL" to the (N+1)th to facilitate iteration.

           Alternatively, using a customized typemap for your purposes in the
           first place is probably preferable.
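
           The N+1 convention mentioned above can be sketched in plain C as
           follows.  "foo_demo_t" and both functions are hypothetical and
           are unrelated to any Perl API; the sketch only shows how a
           trailing "NULL" lets a consumer recover the element count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct { int int_member; } foo_demo_t;

/* Allocates n+1 pointers; the first n point at items[0..n-1] and the last
 * one is NULL.  Returns NULL on allocation failure.  The caller frees the
 * returned array (but not the items, which it does not own). */
static foo_demo_t **make_terminated(foo_demo_t *items, size_t n)
{
    foo_demo_t **out = malloc((n + 1) * sizeof *out);
    size_t i;
    if (out == NULL)
        return NULL;
    for (i = 0; i < n; i++)
        out[i] = &items[i];
    out[n] = NULL;      /* sentinel marking the end of the array */
    return out;
}

/* Counts elements by walking the array until the NULL sentinel. */
static size_t count_terminated(foo_demo_t *const *arr)
{
    size_t n = 0;
    assert(arr != NULL);
    while (arr[n] != NULL)
        n++;
    return n;
}
```

           The cost of this approach is one extra pointer per array and the
           restriction that "NULL" cannot be a valid element.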

       T_DATAUNIT
           NOT YET

       T_CALLBACK
           NOT YET

       T_ARRAY
           This is used to convert the perl argument list to a C array and for
           pushing the contents of a C array onto the perl argument stack.

           The usual calling signature is

             @out = array_func( @in );

           Any number of arguments can occur in the list before the array but
           the input and output arrays must be the last elements in the list.

           When used to pass a perl list to C the XS writer must provide a
           function (named after the array type but with 'Ptr' substituted for
           '*') to allocate the memory required to hold the list. A pointer
           should be returned. It is up to the XS writer to free the memory on
           exit from the function. The variable "ix_$var" is set to the number
           of elements in the new array.

           When returning a C array to Perl the XS writer must provide an
           integer variable called "size_$var" containing the number of
           elements in the array. This is used to determine how many elements
           should be pushed onto the return argument stack. This is not
           required on input since Perl knows how many arguments are on the
           stack when the routine is called. Ordinarily this variable would be
           called "size_RETVAL".

           Additionally, the type of each element is determined from the type
           of the array. If the array uses type "intArray *" xsubpp will
           automatically work out that it contains variables of type "int" and
           use that typemap entry to perform the copy of each element. All
           pointer '*' and 'Array' tags are removed from the name to determine
           the subtype.
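
           The responsibilities of the XS writer on the input side can be
           sketched in plain C.  "intArrayPtr" follows the naming rule
           given above for an "intArray *" type; "sum_list" is a
           hypothetical stand-in for an XSUB body, and "ix_array" plays the
           role that "ix_$var" plays in the generated code.  None of this
           touches the Perl stack; it only shows the allocate, fill, use
           and free sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef int intArray;

/* Allocator named after the array type with '*' replaced by 'Ptr'.
 * Returns zero-initialised storage for nelem elements, or NULL. */
static intArray *intArrayPtr(int nelem)
{
    assert(nelem >= 0);
    return calloc((size_t)nelem, sizeof(intArray));
}

/* Stand-in for an XSUB body: copies the argument list into a freshly
 * allocated array, sums it, and frees the array before returning.
 * Returns 0 if the allocation fails (a simplification of this sketch). */
static long sum_list(const int *args, int nargs)
{
    intArray *array = intArrayPtr(nargs);
    long total = 0;
    int ix_array = 0;   /* ends up as the number of elements copied */
    int i;

    if (array == NULL)
        return 0;
    while (ix_array < nargs) {
        array[ix_array] = args[ix_array];
        ix_array++;
    }
    for (i = 0; i < ix_array; i++)
        total += array[i];
    free(array);        /* the XS writer must free on exit */
    return total;
}
```

           Freeing the array before every return path is the part that is
           easiest to forget, since nothing in the typemap does it for you.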

       T_STDIO
           This is used for passing perl filehandles to and from C using "FILE
           *" structures.

       T_INOUT
           This is used for passing perl filehandles to and from C using
           "PerlIO *" structures. The file handle can be used for reading and
           writing. This corresponds to the "+<" mode, see also T_IN and
           T_OUT.

           See perliol for more information on the Perl IO abstraction layer.
           Perl must have been built with "-Duseperlio".

           There is no check to assert that the filehandle passed from Perl to
           C was created with the right open() mode.

           Hint: The perlxstut tutorial covers the T_INOUT, T_IN, and T_OUT XS
           types nicely.

       T_IN
           Same as T_INOUT, but the filehandle that is returned from C to Perl
           can only be used for reading (mode "<").

       T_OUT
           Same as T_INOUT, but the filehandle that is returned from C to Perl
           is set to use the open mode "+>".

perl v5.38.2                      2023-11-28                  perlxstypemap(1)
