2.9.1 A Simple CGI Library 
HTTP is like being married: you have to be able to handle whatever
you're given, while being very careful what you send back.
Phil Smith III,
http://www.netfunny.com/rhf/jokes/99/Mar/http.html
In A Web Service with Interaction,
we saw the function CGI_setup as part of the web server
“core logic” framework. The code presented there handles almost
everything necessary for CGI requests.
One thing it doesn't do is handle encoded characters in the requests.
For example, an ‘&’ is encoded as a percent sign followed by
the hexadecimal value: ‘%26’.  These encoded values should be
decoded.
Following is a simple library to perform these tasks.
This code is used for all web server examples
in the rest of this web page.
If you want to use it for your own web server, store the source code
in a file named ‘inetlib.awk’. You can then include
these functions in your code by placing the following statement
on the first line of your script:
|  | @include inetlib.awk
 | 
Beware, however: this mechanism works only if you invoke your
web server script with igawk
instead of the usual awk or gawk.
Here is the code:
|  | # CGI Library and core of a web server
# Global arrays
#   GETARG --- arguments to CGI GET command
#   MENU   --- menu items (path names)
#   PARAM  --- parameters of form x=y
# Optional variable MyHost contains host address
# Optional variable MyPort contains port number
# Needs TopHeader, TopDoc, TopFooter
# Sets MyPrefix, HttpService, Status, Reason
BEGIN {
  if (MyHost == "") {
     "uname -n" | getline MyHost
     close("uname -n")
  }
  if (MyPort ==  0) MyPort = 8080
  HttpService = "/inet/tcp/" MyPort "/0/0"
  MyPrefix    = "http://" MyHost ":" MyPort
  SetUpServer()
  while ("awk" != "complex") {
    # header lines are terminated this way
    RS = ORS    = "\r\n"
    Status      = 200             # this means OK
    Reason      = "OK"
    Header      = TopHeader
    Document    = TopDoc
    Footer      = TopFooter
    if        (GETARG["Method"] == "GET") {
        HandleGET()
    } else if (GETARG["Method"] == "HEAD") {
        # not yet implemented
    } else if (GETARG["Method"] != "") {
        print "bad method", GETARG["Method"]
    }
    Prompt = Header Document Footer
    print "HTTP/1.0", Status, Reason     |& HttpService
    print "Connection: Close"            |& HttpService
    print "Pragma: no-cache"             |& HttpService
    len = length(Prompt) + length(ORS)
    print "Content-length:", len         |& HttpService
    print ORS Prompt                     |& HttpService
    # ignore all the header lines
    while ((HttpService |& getline) > 0)
        continue
    # stop talking to this client
    close(HttpService)
    # wait for new client request
    HttpService |& getline
    # do some logging
    print systime(), strftime(), $0
    CGI_setup($1, $2, $3)
  }
}
function CGI_setup(method, uri, version,    i, j)
{
    delete GETARG
    delete MENU
    delete PARAM
    GETARG["Method"] = method
    GETARG["URI"] = uri
    GETARG["Version"] = version
    i = index(uri, "?")
    if (i > 0) {  # is there a "?" indicating a CGI request?
        split(substr(uri, 1, i-1), MENU, "[/:]")
        split(substr(uri, i+1), PARAM, "&")
        for (i in PARAM) {
            PARAM[i] = _CGI_decode(PARAM[i])
            j = index(PARAM[i], "=")
            GETARG[substr(PARAM[i], 1, j-1)] = \
                                         substr(PARAM[i], j+1)
        }
    } else { # there is no "?", no need for splitting PARAMs
        split(uri, MENU, "[/:]")
    }
    for (i in MENU)          # decode characters in path
        if ((i + 0) > 4)     # but not those in host name (numeric compare)
            MENU[i] = _CGI_decode(MENU[i])
}
 | 
This isolates the details of request parsing in a single function, CGI_setup.
Decoding of encoded characters is delegated to a helper function,
_CGI_decode. The leading underscore (‘_’) in
the function name indicates that it is an “internal”
function, although nothing enforces this:
|  | function _CGI_decode(str,   hexdigs, i, pre, code1, code2,
                            val, result)
{
   hexdigs = "123456789abcdef"
   i = index(str, "%")
   if (i == 0) # no work to do
      return str
   do {
      pre = substr(str, 1, i-1)   # part before %xx
      code1 = substr(str, i+1, 1) # first hex digit
      code2 = substr(str, i+2, 1) # second hex digit
      str = substr(str, i+3)      # rest of string
      code1 = tolower(code1)
      code2 = tolower(code2)
      val = index(hexdigs, code1) * 16 \
            + index(hexdigs, code2)
      result = result pre sprintf("%c", val)
      i = index(str, "%")
   } while (i != 0)
   if (length(str) > 0)
      result = result str
   return result
}
 | 
This works by splitting the string apart around each encoded character.
The two hex digits are converted to lowercase and looked up in a string
of hex digits.  Note that 0 is deliberately absent from that string:
index returns zero when the character is not found, which is automatically
the correct value!  Once the two digits have been converted from
characters in a string into a numerical value, sprintf
converts that value back into a real character.
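To make the lookup trick concrete, the following small sketch converts a single pair of hex digits the same way _CGI_decode does. The function name hexpair is hypothetical and is not part of the library; it exists only for illustration.

```awk
# Sketch only: hexpair() is a hypothetical helper, not part of inetlib.awk.
# It converts two hex digits to a number with the same index() trick.
function hexpair(d1, d2,    hexdigs)
{
    hexdigs = "123456789abcdef"      # "0" is deliberately absent
    # index() yields 1..15 for "1".."f" and 0 for "0" (not found)
    return index(hexdigs, tolower(d1)) * 16 + index(hexdigs, tolower(d2))
}

BEGIN {
    # %26 is "&" and %25 is "%"
    printf "%d %s\n", hexpair("2", "6"), sprintf("%c", hexpair("2", "6"))
    printf "%d %s\n", hexpair("2", "5"), sprintf("%c", hexpair("2", "5"))
    printf "%d\n", hexpair("0", "A")
}
```

Running this prints ‘38 &’, ‘37 %’, and ‘10’. One consequence of the trick is that any character that is not a hex digit also maps to zero, so malformed escapes such as ‘%zz’ are not detected.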
The following is a simple test harness for the above functions:
|  | BEGIN {
  CGI_setup("GET",
  "http://www.gnu.org/cgi-bin/foo?p1=stuff&p2=stuff%26junk" \
       "&percent=a %25 sign",
  "1.0")
  for (i in MENU)
      printf "MENU[\"%s\"] = %s\n", i, MENU[i]
  for (i in PARAM)
      printf "PARAM[\"%s\"] = %s\n", i, PARAM[i]
  for (i in GETARG)
      printf "GETARG[\"%s\"] = %s\n", i, GETARG[i]
}
 | 
And this is the result when we run it:
|  | $ gawk -f testserv.awk
-| MENU["4"] = www.gnu.org
-| MENU["5"] = cgi-bin
-| MENU["6"] = foo
-| MENU["1"] = http
-| MENU["2"] =
-| MENU["3"] =
-| PARAM["1"] = p1=stuff
-| PARAM["2"] = p2=stuff&junk
-| PARAM["3"] = percent=a % sign
-| GETARG["p1"] = stuff
-| GETARG["percent"] = a % sign
-| GETARG["p2"] = stuff&junk
-| GETARG["Method"] = GET
-| GETARG["Version"] = 1.0
-| GETARG["URI"] = http://www.gnu.org/cgi-bin/foo?p1=stuff&
p2=stuff%26junk&percent=a %25 sign
 |
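The library calls SetUpServer and HandleGET and reads TopHeader, TopDoc, and TopFooter, but it defines none of them; the program that uses the library must supply them. The following is a minimal sketch of what such hooks might look like. The page content, the ‘hello’ path, and the ‘name’ parameter are all made up for illustration; only the function and variable names come from the library code above.

```awk
# Sketch only: hypothetical hook functions for use with inetlib.awk.
# SetUpServer, HandleGET, TopHeader, TopDoc, TopFooter, Document,
# MENU, GETARG and MyPrefix are names used by the library; the page
# content and the "hello"/"name" request are invented for this example.
function SetUpServer()
{
    TopHeader = "<HTML><HEAD><TITLE>Demo</TITLE></HEAD><BODY>"
    TopDoc    = "<P>Try <A HREF=\"" MyPrefix \
                "/hello?name=world\">this link</A>.</P>"
    TopFooter = "</BODY></HTML>"
}

function HandleGET()
{
    # For a request line such as "GET /hello?name=world HTTP/1.0",
    # CGI_setup leaves MENU[1] empty and puts "hello" into MENU[2].
    if (MENU[2] == "hello" && ("name" in GETARG)) {
        # A real server should escape this value before echoing it.
        Document = "<P>Hello, " GETARG["name"] "!</P>"
    }
}
```

Assuming these functions are stored in a file of their own (say ‘demo.awk’, a hypothetical name), they could be combined with the library by running ‘gawk -f inetlib.awk -f demo.awk’. When the request does not match, HandleGET leaves Document untouched, so the server answers with the default page from TopDoc.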