The rc shell, on which CSA is based, exports all variables and functions to the program environment, making them global to large portions of code (IMHO this is both an advantage and a nuisance, but I won't get into that discussion here). Because of this, we need a precise naming convention for all names, which also helps make them as self-documenting as possible.
In the CSA name space, a few name prefixes are either reserved for CSA or they are used internally by the rc shell, by AWK or by other underlying utility programs. The reserved prefixes are:
Prefixes reserved for CSA use. They SHOULD NOT be used by application programs to identify their own commands, functions and variable names.
rc function definitions.
In addition to reserved prefixes there are others that, although not strictly reserved, are simply typical, or conventional, and they are:
Decoded HTTP GET/POST variables.
XML-encoded strings (PCDATA), that can be safely included in XML files and templates.
Template variables that are to be inserted in HTML response pages and forms. Such variables are derived from their unencoded versions by replacing special characters with their ISO representations (e.g. the newline becomes its numeric character reference, and so on). CGI response variables SHOULD always be encoded like this, also to prevent the so-called cross-site scripting (XSS) vulnerabilities.
These are similar to ISO_ variables, but the escaping of special characters is done according to RFC 1738 and the resulting values are meant to be inserted in the QUERY_STRING part of Uniform Resource Identifier (URI) strings (e.g. the newline becomes %0A, and so on).
These are a special version of the URI_ variables, meant to be used in the PATH_INFO part of a URI. They are slightly different from their URI_ equivalents, to account for the decoding done by the CGI interface on the PATH_INFO part of a URI.
Recommended prefix for what I dubbed "Custom Name Space" (CNS) variables. These are global application-level variables that application programmers may use for their own Inter-Program Communications (IPC) needs.
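The difference between the encoded variants described above can be sketched as follows. This is only an illustration, written in Python rather than in rc or AWK; the function names and the set of "safe" characters are assumptions of mine, and I am assuming that the ISO_ form uses numeric character references. CSA's own library code may well differ.

```python
from urllib.parse import quote
from xml.sax.saxutils import escape

# Characters passed through unchanged by iso_encode(). This set is an
# assumption made for the example, not necessarily the one CSA uses.
SAFE_CHARS = set(
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789 .,-_"
)

def xml_encode(value):
    """XML_-style: escape &, < and > so the text is valid PCDATA."""
    return escape(value)

def iso_encode(value):
    """ISO_-style: replace every non-safe character with &#N;."""
    return "".join(
        ch if ch in SAFE_CHARS else "&#%d;" % ord(ch) for ch in value
    )

def uri_encode(value):
    """URI_-style: percent-encode for use in a QUERY_STRING."""
    return quote(value, safe="")
```

With these definitions, a newline in the input comes out as %0A from uri_encode(), matching the example given for the URI_ variables.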
For an application name-space not to clash with the CSA one, just make sure that your own CSA applications always use mixed-case function and command names (e.g. "someFunction", "MyCommand", and the like), and either lower- or mixed-case variable names. Never define names that begin with either "csa" or "CSA". In fact, casual all-upper-case names (like "SOME_NAME") SHOULD be avoided altogether. If you need them nevertheless, you SHOULD prefix them with the string "CNS_" (e.g. "CNS_MYVAR") if they are to be exported to the environment, or "CNS" (like "CNSMYVAR") otherwise. This stands for "Custom Name Space", and it is the prefix that CSA sets aside for use by application programs, as already explained.
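The rules in the paragraph above are simple enough to check mechanically. Here is a minimal sketch of such a check, in Python for illustration only; the function name is_app_name_ok is hypothetical, and the check deliberately ignores the distinction between function and variable names.

```python
def is_app_name_ok(name, exported=False):
    """Return True if an application-level name follows the conventions.

    exported: whether the name is meant to be exported to the environment.
    """
    # Names beginning with "csa" or "CSA" belong to CSA itself.
    if name.startswith(("csa", "CSA")):
        return False
    # All-upper-case names are only acceptable in the Custom Name Space.
    if name.isupper():
        return name.startswith("CNS_" if exported else "CNS")
    # Lower- and mixed-case names are fine.
    return True
```

For example, is_app_name_ok("someFunction") is true, while is_app_name_ok("SOME_NAME") and is_app_name_ok("csaFoo") are both false.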
In addition to the naming conventions that we have seen so far, there are others that concern the naming of files on disk. The following few filename extensions are typical in a CSA application:
A generic data file, usually a TAB-delimited table.
RCS/CVS repository of file.data
If file.data is a table, this is the associated cross-reference file.
Clustered table directory.
Advisory lock on file.data.
CSA sets and uses a number of environment variables for its own purposes. All of these variables are available also to the application program, but not all of them may be freely modified by the latter, or things may break. This is the complete list of current environment variables. See the following paragraph for more detailed documentation on many of them.
In the explanations that follow, each variable is tagged with a Scope attribute that defines whether it can be set/unset by the application program or is reserved for CSA internal use. A "CSA" scope means that the variable can still be tested and used by an application program, but it SHOULD be considered read-only. If the scope is "profile", then the value can still be set by an application program, but the most proper place to set it is the application profile "$CSA_ROOT/csa.rc".
The single-quote character. Default: "\047". Scope: CSA.
The horizontal-tab character. Default: "\009". Scope: CSA.
The empty rc value. Default: ''. Scope: CSA.
The newline character. Default: "\010". Scope: CSA.
The carriage-return character. Default: "\013". Scope: CSA.
Current CSA UNIX user name. Default: the value returned by whoami(1). Scope: CSA.
UNIX account that the CSA application MUST run as. If the application finds itself running under a different account, it MUST "commit suicide", i.e. stop immediately with an error message. CGI programs are often executed by setuid wrappers, and this security measure is meant to prevent the application program from running as root if the wrapper crashes, with unpredictable and usually dangerous effects on the integrity and security of the system. Default: "nobody". Scope: profile.
If set to "1", then any changes to files done by the csaCommit function will be subject to rcs(1) versioning. Do not set this variable if change-management is already done with something different/better, like CVS for example. Default: unset. Scope: profile.
"domain" attribute of the User Session Cookie. Default: unset. Scope: profile.
"path" attribute of the User Session Cookie. Default: unset. Scope: profile.
"secure" attribute of the User Session Cookie. Default: unset. Scope: profile.
URL, either relative or absolute, of the CGI program directory on the CSA Web server. Default: "/cgi-bin". Scope: profile.
Same as CSA_CGIBIN, but for SSL Web connections. Default: "$CSA_CGIBIN". Scope: profile.
Name and arguments of the local md5sum(1) command, which is run through a library function, as its output format varies between different versions of UNIX/Linux. Default: "md5sum". Scope: profile.
Name and arguments of the local ps(1) command, whose syntax can vary between systems. Default: "ps h -u $CSA_ALLOW_USER". Scope: profile.
Reserved for CSA internal use. See mainlib.rc. Default: unset. Scope: CSA.
Common prefix of the HTTP cookies sent to the Web client by CSA programs. See section HTTP Cookies for more on this. Default: unset. Scope: all.
If set to "1", then the $CSA_ROOT/var/debug.log file will be created, containing the CSA function-call trace and a lot of other useful debugging information. Writing this file is quite expensive in terms of extra system resources, so it should be avoided when not strictly necessary, by setting CSA_DEBUG=0. Default: "0". Scope: profile.
Document-root directory of either the Web Server or the Virtual Host. Default: $DOCUMENT_ROOT. Scope: profile.
Same as CSA_DOCROOT, but for SSL Web connections. Default: $CSA_DOCROOT. Scope: profile.
List of Globally Unique IDs generated during the current run. Default: unset. Scope: CSA.
Host-name of the Web Server running the current CSA application instance. Default: the value returned by the "hostname -s" command. Scope: CSA.
File-name extension of template files. Default: "html". Scope: profile.
CSA application name. It MUST comprise only characters in the set "[A-Za-z0-9_-]", such as "foo", "test", "example", etc. Default: unset. Scope: profile.
CSA installation directory. Default: "/usr/local/csa". Scope: profile.
Contents of the QUERY_STRING variable of an ISINDEX HTTP request. Given the current MIME-RPC CGI calling conventions, pure ISINDEX queries may no longer occur, as the "?" URL argument is already used to identify the target CSA program. Default: unset. Scope: CSA.
Local language code for messages and Web pages, according to the usual classification (en, en_US, it, en_UK, es, etc.). The selected language MUST correspond to message and template directories that actually exist on the server. Currently, CSA provides messages only in the "it" and "en_US" versions. Default: "en_US". Scope: profile.
List of lock-files (semaphores) created so far by the CSA program through the csaLock function. Default: unset. Scope: CSA.
Max. no. of active processes allowed on the system. If this limit is exceeded, new requests will be denied with an error message. This check is normally not active. Default: unlimited. Scope: profile.
Current message group name. Default: "CSA_SYSTEM". Scope: CSA.
Current message number. Default: "0000". Scope: CSA.
Current message text. Default: unset. Scope: CSA.
Extra actions to be performed by csaCommit. It MUST be a valid rc program fragment, properly escaped to make it eval-safe. It is up to the application program to make sure that special rc characters have been properly escaped in the fragment. Default: unset. Scope: all.
Name of the current CSA program, to be used in messages printed by csaPrintMsg. Default: "CSA". Scope: CSA.
NFS-safe unique identifier of the current CSA program. Default: $CSA_HOST.$pid. Scope: CSA.
Current request URL. Default: $REQUEST_URI.$pid. Scope: CSA.
Used by many CSA library functions to return the function result to the caller, mainly to save a `{} subprocess. Default: unset. Scope: all.
Current request-ID. Default: $CSA_HOST^_$pid. Scope: CSA.
CSA application installation directory. Default: "/". Scope: profile.
Path to a temporary file containing the RPC request. Default: $TMPDIR/rpc$pid.tmp. Scope: CSA.
Max. size in bytes of GET/POST data. Default: 10000. Scope: CSA.
Path to a temporary file containing the CSA program call (GET/POST) variables, in rc syntax. Default: dynamically set. Scope: CSA.
Generic CSA error flag, used by some of the library functions. Default: "0". Scope: all.
Generic flag to tell programs that we are running in test mode. Whether to test this flag is up to the application programs. Default: "0". Scope: profile.
Base URL of the current CSA server application. Default: "http://localhost". Scope: profile.
Same as CSA_URL, but for SSL Web connections. Default: "$CSA_URL". Scope: profile.
Web User ID, for authenticated sessions. Default: unset. Scope: CSA.
User Path-Based Clustering index. This is the relative directory tree under which pieces of data belonging to the current Web user can be found in PBC structures. For instance, if the current user-ID is "goofy" then the associated CSA_USER_PBC value will be "g/o", as explained in section Path-Based Clustering. Default: unset. Scope: CSA.
Current CSA version. Default: fixed. Scope: CSA.
Write-back message. This is a variable through which a sub-program, i.e. a called process, can request the parent rc program to print a CSA message through csaPrintMsg. This capability requires that the caller capture the called program output into a temporary file, which is then sourced with the "." shell operator. After sourcing the generated script file, the caller will then test the content of said variable and take the proper actions. Building a source-safe script is the responsibility of the called program. Default: unset. Scope: CSA.
List of the work-files created by the CSA application during the current run. Such files will be removed on exit by the csaExit function. Default: unset. Scope: CSA.
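The run-as check described earlier in this list (the UNIX account that the application MUST run as, with default "nobody") boils down to comparing the effective user with the allowed one and stopping at once on a mismatch. A minimal sketch of the idea, in Python for illustration only; the function names are hypothetical and the real check lives in the CSA rc libraries:

```python
import os
import pwd
import sys

def current_user():
    """Name of the account the process is effectively running as."""
    return pwd.getpwuid(os.geteuid()).pw_name

def enforce_allowed_user(allowed="nobody", actual=None):
    """Stop immediately unless we run as the allowed account.

    'actual' can be passed explicitly (handy for testing); by default
    it is looked up from the effective UID of the process.
    """
    if actual is None:
        actual = current_user()
    if actual != allowed:
        # sys.exit() with a string prints it to stderr and exits with 1.
        sys.exit("fatal: running as '%s', expected '%s'" % (actual, allowed))
```

The point of doing this first thing in the program is that nothing else runs, not even housekeeping that touches files, if the process turns out to be root.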
A number of CSA variables are used to hold date and time values in various formats. The values account for DST as appropriate. These variables are set by storing the output of a single invocation of the following shell command:
* =`{date -d now '+%Y %m %d %H %M %S %Z %a %b %s %z'}
The "now" argument of date(1) can be overridden by a new explicit call to the csaSetTime function. Here's the complete list of the date/time variables, with their settings:
CSA_TIME_YEAR = $1
CSA_TIME_MONTH = $2
CSA_TIME_DAY = $3
CSA_TIME_HOUR = $4
CSA_TIME_MIN = $5
CSA_TIME_SEC = $6
CSA_TIME_TZ = $7
CSA_TIME_STAMP = $1$2$3$4$5$6
CSA_TIME_ISO = $1$2$3^T$4:$5:$6
CSA_TIME_ISO2 = $1-$2-$3' '$4:$5:$6
CSA_TIME_ISO3 = $1-$2-$3^T$4:$5:$6$11
CSA_TIME_ISO4 = $1-$2-$3^T$4:$5:$6
CSA_TIME_DNAME = $8
CSA_TIME_MNAME = $9
CSA_TIME_UNIX = $10
CSA_TIME_LOG = $CSA_TIME_ISO2.$CSA_TIME_TZ
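To make the positional mapping above easier to follow, here is a sketch that derives the same set of values from a single timestamp. It is written in Python for illustration and uses strftime instead of date(1); the function name csa_time_fields is hypothetical, and the "^" in the list above is taken to be rc's concatenation operator.

```python
from datetime import datetime, timezone

def csa_time_fields(now):
    """Build the CSA_TIME_* values from one timezone-aware datetime."""
    y, mo, d, h, mi, s = now.strftime("%Y %m %d %H %M %S").split()
    tz = now.strftime("%Z")       # $7, time zone name
    offset = now.strftime("%z")   # $11, numeric UTC offset
    iso2 = "%s-%s-%s %s:%s:%s" % (y, mo, d, h, mi, s)
    iso4 = "%s-%s-%sT%s:%s:%s" % (y, mo, d, h, mi, s)
    return {
        "CSA_TIME_YEAR": y,
        "CSA_TIME_MONTH": mo,
        "CSA_TIME_DAY": d,
        "CSA_TIME_HOUR": h,
        "CSA_TIME_MIN": mi,
        "CSA_TIME_SEC": s,
        "CSA_TIME_TZ": tz,
        "CSA_TIME_STAMP": y + mo + d + h + mi + s,
        "CSA_TIME_ISO": "%s%s%sT%s:%s:%s" % (y, mo, d, h, mi, s),
        "CSA_TIME_ISO2": iso2,
        "CSA_TIME_ISO3": iso4 + offset,
        "CSA_TIME_ISO4": iso4,
        "CSA_TIME_DNAME": now.strftime("%a"),      # $8
        "CSA_TIME_MNAME": now.strftime("%b"),      # $9
        "CSA_TIME_UNIX": str(int(now.timestamp())),  # $10
        "CSA_TIME_LOG": iso2 + "." + tz,
    }

fields = csa_time_fields(datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc))
```

Since every value comes from the same instant, the variables are always mutually consistent, which is the reason for using one single date(1) invocation in the first place.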
To date, the CSA libraries provide the following functions. Providing documentation for all of them in this document is going to be a major effort. In the meantime please refer to the explanatory comments that are contained in the library files, and to the example programs.
The rc shell, on which CSA is largely based, exports all names and function definitions to the program environment by default. Function definitions, in particular, may cause the environment to become really big and cluttered. To try and mitigate this problem, different CSA shell libraries often re-define previously defined shell functions. For instance, the csaExit.fault function is defined in mainlib.rc and cgilib.rc. In all cases the function serves the same purpose: exiting on errors. The way it accomplishes its job, however, may differ from one library to another. A command-line shell script will only load mainlib.rc, and the version of csaExit.fault contained in that library will simply exit non-zero (after doing some housekeeping) if called in that context. A CGI program, however, besides loading mainlib.rc will also load cgilib.rc. This second library will provide its own re-definition of csaExit.fault. The latter, if called by the CGI program, will do the housekeeping, send an error HTML page to the client and exit non-zero.
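The re-definition mechanism just described can be modelled as libraries that install functions into one shared name space, with a later load replacing an earlier definition of the same name. The sketch below is in Python, for illustration only: the load_* functions are hypothetical stand-ins for sourcing the rc libraries, and the returned lists of steps merely describe what each version would do instead of really exiting.

```python
def load_mainlib(namespace):
    """Stand-in for sourcing mainlib.rc."""
    def exit_fault(message):
        # Command-line flavour: housekeeping, then a non-zero exit.
        return ["housekeeping", "exit 1"]
    namespace["csaExit.fault"] = exit_fault

def load_cgilib(namespace):
    """Stand-in for sourcing cgilib.rc after mainlib.rc."""
    def exit_fault(message):
        # CGI flavour: same name, but an error page is sent as well.
        return ["housekeeping", "send error page: " + message, "exit 1"]
    namespace["csaExit.fault"] = exit_fault  # replaces the mainlib version

# A command-line script loads mainlib only.
cli = {}
load_mainlib(cli)

# A CGI program loads both, so the cgilib definition wins.
cgi = {}
load_mainlib(cgi)
load_cgilib(cgi)
```

Callers always invoke the same name and get the behaviour suited to the context they were loaded in, while the environment carries only one definition per function name.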
In this way, by re-using the same function names for different context-specific code, I managed to:
In the AWK libraries provided by CSA, I have occasionally tried to mimic C concepts. For instance, in csalib.awk there are functions like strdup(), ctime(), stat(), creat() and others, that try to behave somewhat like their C-library counterparts. The similarities are rough at best, so do not expect to use those functions exactly in the same way as you would in a real C program, but the basics should be there. Besides C-like function names, I have also tried to use C-like error codes. To date, the following Linux-style symbolic error codes have been defined (see errno(3) and <errno.h> for more info):
ENOENT = 2
EIO = 5
EACCES = 13
EISDIR = 21
EINVAL = 22
ENOMSG = 42
EMSGSIZE = 90
Note that these error flags have rc(1) counterparts with the same name, prefixed by the string "CSA_" as mandated by CSA naming conventions for environment variables.
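The relationship between the symbolic codes and their rc counterparts can be sketched as a simple lookup table. This is a Python illustration only; the helper names rc_name and symbol_for are hypothetical, and the numeric values are the ones listed above.

```python
# Symbolic error codes as listed above (Linux-style values).
AWK_ERRNO = {
    "ENOENT": 2,
    "EIO": 5,
    "EACCES": 13,
    "EISDIR": 21,
    "EINVAL": 22,
    "ENOMSG": 42,
    "EMSGSIZE": 90,
}

def rc_name(symbol):
    """Name of the rc(1) counterpart of an AWK error symbol."""
    if symbol not in AWK_ERRNO:
        raise KeyError(symbol)
    return "CSA_" + symbol

def symbol_for(code):
    """Reverse lookup: symbolic name for a numeric code, or None."""
    for symbol, value in AWK_ERRNO.items():
        if value == code:
            return symbol
    return None
```

So an AWK function failing with ENOENT (2) corresponds to CSA_ENOENT on the rc side.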
In addition to the few global AWK variables described above, here is the complete list of global CSA names that are defined in the relevant AWK library functions. Please refer to the associated files for more info on them.
This is the list of the AWK functions that are currently provided by the relevant CSA libraries. As was the case with the list of rc(1) functions, describing all of them in this document will be a major task. In the meantime please refer directly to the comments in the associated library files.