11.3 Two-Way Communications with Another Process
    From: brennan@whidbey.com (Mike Brennan)
    Newsgroups: comp.lang.awk
    Subject: Re: Learn the SECRET to Attract Women Easily
    Date: 4 Aug 1997 17:34:46 GMT
    Message-ID: <5s53rm$eca@news.whidbey.com>

    On 3 Aug 1997 13:17:43 GMT, Want More Dates???
    <tracy78@kilgrona.com> wrote:
    >Learn the SECRET to Attract Women Easily
    >
    >The SCENT(tm) Pheromone Sex Attractant For Men to Attract Women

    The scent of awk programmers is a lot more attractive to women than
    the scent of perl programmers.
    --
    Mike Brennan
It is often useful to be able to send data to a separate program for processing and then read the result. This can always be done with temporary files:
    # Write the data for processing
    tempfile = ("mydata." PROCINFO["pid"])
    while (not done with data)
        print data | ("subprogram > " tempfile)
    close("subprogram > " tempfile)

    # Read the results, remove tempfile when done
    while ((getline newdata < tempfile) > 0)
        process newdata appropriately
    close(tempfile)
    system("rm " tempfile)
This works, but not elegantly. Among other things, it requires that the program be run in a directory that cannot be shared among users; for example, ‘/tmp’ will not do, as another user might happen to be using a temporary file with the same name.
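The temporary-file approach described above can be made concrete with a small runnable sketch. It assumes that the system sort utility stands in for "subprogram" and that three made-up words stand in for the data; neither choice comes from the original text.

```awk
BEGIN {
    # Same naming scheme as above: the file name includes gawk's process ID
    tempfile = ("mydata." PROCINFO["pid"])

    # Assumption: sort plays the role of "subprogram"
    command = "LC_ALL=C sort > " tempfile

    # Write the (illustrative) data for processing
    n = split("pear apple fig", data, " ")
    for (i = 1; i <= n; i++)
        print data[i] | command
    close(command)      # wait until sort has finished writing tempfile

    # Read the results, remove tempfile when done
    while ((getline newdata < tempfile) > 0)
        print "got", newdata
    close(tempfile)
    system("rm " tempfile)
}
```

The file is created in the current directory, so the caveat about shared directories applies to this sketch as well.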
However, with gawk, it is possible to
open a two-way pipe to another process. The second process is
termed a coprocess, since it runs in parallel with gawk.
The two-way connection is created using the ‘|&’ operator
(borrowed from the Korn shell, ksh):(62)
    do {
        print data |& "subprogram"
        "subprogram" |& getline results
    } while (data left to process)
    close("subprogram")
The first time an I/O operation is executed using the ‘|&’
operator, gawk creates a two-way pipeline to a child process
that runs the other program. Output created with print or printf
is written to the program’s standard input, and
output from the program’s standard output can be read by the gawk
program using getline.
As is the case with processes started by ‘|’, the subprogram
can be any program, or pipeline of programs, that can be started by
the shell.
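A minimal round trip might look like the following sketch. The coprocess here is a hypothetical shell loop, chosen only because it answers each input line immediately; the word list is likewise made up for illustration.

```awk
BEGIN {
    # Hypothetical coprocess: a shell loop that replies to every line
    # as soon as it is read, so its output is never held in a buffer.
    command = "while read line; do echo \"reply: $line\"; done"

    n = split("alpha beta gamma", words, " ")
    for (i = 1; i <= n; i++) {
        print words[i] |& command               # to the coprocess's standard input
        if ((command |& getline reply) > 0)     # from its standard output
            print reply
    }
    close(command)      # close both ends; the loop sees end-of-file and exits
}
```

Each word should come back prefixed with "reply: ". Storing the command in a variable keeps the string identical in every `|&` and close() call, which is how gawk matches them to the same coprocess.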
There are some cautionary items to be aware of:
- As the code inside gawk currently stands, the coprocess’s standard
  error goes to the same place that the parent gawk’s standard error
  goes. It is not possible to read the child’s standard error
  separately.
- I/O buffering may be a problem. gawk automatically flushes all
  output down the pipe to the coprocess. However, if the coprocess
  does not flush its output, gawk may hang when doing a getline
  in order to read the coprocess’s results. This could lead to a
  situation known as deadlock, where each process is waiting for the
  other one to do something.
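Because the coprocess is started by the shell, one possible workaround for the first item is to merge the child's standard error into its standard output with the shell's "2>&1" redirection. The sketch below assumes that the path /no/such/path does not exist, so that ls writes a diagnostic; both the command and the path are illustrative. The diagnostic then arrives mixed in with ordinary output rather than on a separate channel.

```awk
BEGIN {
    # Hypothetical command: ls is given a path assumed not to exist, so
    # it complains on standard error. "2>&1" is shell syntax that sends
    # standard error down the same pipe as standard output.
    command = "ls /no/such/path 2>&1"

    # Nothing is written to this coprocess and ls exits on its own,
    # so reading until end-of-file cannot deadlock here.
    while ((command |& getline line) > 0)
        print "coprocess said:", line
    close(command)
}
```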
It is possible to close just one end of the two-way pipe to
a coprocess, by supplying a second argument to the close()
function of either "to" or "from"
(see section Closing Input and Output Redirections).
These strings tell gawk to close the end of the pipe
that sends data to the coprocess or the end that reads from it,
respectively.
This is particularly necessary in order to use
the system sort utility as part of a coprocess;
sort must read all of its input
data before it can produce any output.
The sort program does not receive an end-of-file indication
until gawk closes the write end of the pipe.
When you have finished writing data to the sort
utility, you can close the "to" end of the pipe, and
then start reading sorted data via getline.
For example:
    BEGIN {
        command = "LC_ALL=C sort"
        n = split("abcdefghijklmnopqrstuvwxyz", a, "")

        for (i = n; i > 0; i--)
            print a[i] |& command
        close(command, "to")

        while ((command |& getline line) > 0)
            print "got", line
        close(command)
    }
This program writes the letters of the alphabet in reverse order, one
per line, down the two-way pipe to sort. It then closes the
write end of the pipe, so that sort receives an end-of-file
indication. This causes sort to sort the data and write the
sorted data back to the gawk program. Once all of the data
has been read, gawk terminates the coprocess and exits.
As a side note, the assignment ‘LC_ALL=C’ in the sort
command ensures traditional Unix (ASCII) sorting from sort.
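The same write, close "to", then read sequence can be wrapped in a function so that it is reusable. The following is only a sketch: the function name sort_lines and the sample data are hypothetical, and it assumes that no array element contains a newline.

```awk
# Sort arr[1..n] in place by passing the elements through the system
# sort utility as a coprocess. Returns the number of lines read back.
function sort_lines(arr, n,    command, i, line)
{
    if (n < 1)          # with nothing written, sort would wait forever
        return 0

    command = "LC_ALL=C sort"
    for (i = 1; i <= n; i++)
        print arr[i] |& command
    close(command, "to")        # sort now receives end-of-file

    i = 0
    while ((command |& getline line) > 0)
        arr[++i] = line
    close(command)              # a later call then starts a fresh sort
    return i
}

BEGIN {
    n = split("pear apple fig", fruit, " ")
    n = sort_lines(fruit, n)
    for (i = 1; i <= n; i++)
        print fruit[i]
}
```

The early return matters: if getline were the first operation on the coprocess, sort would start with its input still open and neither side would make progress.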
You may also use pseudo-ttys (ptys) for
two-way communication instead of pipes, if your system supports them.
This is done on a per-command basis, by setting a special element
in the PROCINFO array
(see section Built-in Variables That Convey Information),
like so:
    command = "sort -nr"           # command, save in convenience variable
    PROCINFO[command, "pty"] = 1   # update PROCINFO
    print … |& command             # start two-way pipe
    …
Using ptys avoids the buffer deadlock issues described earlier, at some
loss in performance. If your system does not have ptys, or if all the
system’s ptys are in use, gawk automatically falls back to
using regular pipes.
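As a rough illustration of a line-at-a-time exchange over a pty, consider the sketch below. The choice of tr as the coprocess is an assumption made for the example: like many utilities, it is expected to flush its output at each newline when writing to a terminal, but to buffer it when writing to a pipe.

```awk
BEGIN {
    # Hypothetical coprocess: tr converts lowercase letters to uppercase
    command = "tr a-z A-Z"
    PROCINFO[command, "pty"] = 1    # must be set before the first |& use

    n = split("one two three", words, " ")
    for (i = 1; i <= n; i++) {
        print words[i] |& command
        # Over a pty, tr's reply should arrive after each line.
        if ((command |& getline line) > 0)
            print "got", line
    }
    close(command)
}
```

If gawk falls back to regular pipes, this exchange may block at the first getline, because tr would then hold its output in a buffer. That is the deadlock scenario described earlier.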