=back
-=head REQUEST-RELATED FUNCTIONS AND METHODS
+=head1 REQUEST-RELATED FUNCTIONS AND METHODS
All of these are only valid after C<check_divert> or C<check_ok> has
been called. (In the case of C<check_ok> it won't normally be sensible
call this function. See L</REQUEST TYPES>, and
L<GENERATING URLS, FORMS AND AJAX QUERIES>.
-C<check_mutate> will either return successfully, indicating that all
+C<check_nonpage> will either return successfully, indicating that all
is well and the request should proceed, or it will die, like
C<check_mutate>.
-=head RESPONSE-RELATED FUNCTIONS AND METHODS
+=head1 RESPONSE-RELATED FUNCTIONS AND METHODS
=item C<< $authreq->url_with_query_params($params, [$nonpagetype]) >>
cookie in the forms generated by C<check_ok>. You may also set it
yourself (and indeed you must do so if you use C<check_divert>).
-=item C<< $authreq->chain_params() >>
-
-Returns a hash of the "relevant" parameters to this request, in a form
-used by XXX. This is all of the query parameters which are not
-related to CGI::Auth::Flexible. The PATH_INFO from the request is
-returned as the parameter C<< '' >>.
-
-xxx why use this function
-
=back
-=head OTHER FUNCTIONS AND METHODS
+=head1 OTHER FUNCTIONS AND METHODS
=over
See L</REQUEST TYPES>.
-=item C<< $verifier_or_authreq->($data) | CGI::Auth::Flexible->>>
+=item C<< CGI::Auth::Flexible::srcdump_dir_cpio($cgi,$verifier,$dumpdir,$dir,$outfn,$how,$script) >>
+
+Helper function for implementing the C<srcdump_process_item> hook.
+Generates a tarball using cpio and includes it in the prepared source
+code distribution.
+
+The arguments are mostly the same as for that hook. C<$dir> is the
+root directory at which to start the archive. C<$how> is a short text
+string which will be mentioned in the log.
+
+C<$script> is a shell script fragment which must output a
+nul-separated list of filenames (e.g. the output of C<find -print0>).
+It is textually surrounded by C<( )> and will be executed with C<set -e>
+in force. Its cwd will be C<$dir>.
+
+=item C<< $verifier_or_authreq->($data) | CGI::Auth::Flexible-> >>
Hashes the supplied data using the hash function specified by the
C<hash_algorithm> setting, and converts the result to a string of hex
=back
+=head1 REQUEST TYPES
+
+The C<$reqtype> values understood by C<check_nonpage> are strings.
+They are:
+
+=over
+
+=item C<PAGE>
+
+A top-level HTML page load. May contain confidential information for
+the benefit of the logged-in user.
+
+=item C<FRAME>
+
+An HTML frame. May contain confidential information for
+the benefit of the logged-in user.
+
+=item C<IFRAME>
+
+An HTML iframe. May contain confidential information for
+the benefit of the logged-in user.
+
+=item C<SRCDUMP>
+
+Source dump request, whether for the licence or actual source code
+tarball; returned value is not secret.
+
+=item C<STYLESHEET>
+
+CSS stylesheet. B<MUST NOT> contain any confidential data. If the
+stylesheet depends on the user, then attackers may be able to
+determine what stylesheet the user is using. Hopefully this is not a
+problem.
+
+=item C<FAVICON>
+
+"Favicon" - icon for display in the browser's url bar etc. We aren't
+currently aware of a way that attackers can get a copy of this.
+
+=item C<ROBOTS>
+
+C<robots.txt>. Should not contain any confidential data (obviously).
+
+=item C<IMAGE>
+
+Inline image, for an C<< <img src=...> >> element.
+
+Unfortunately it is not possible to sensibly show top-level
+confidential images (that is, have the user's browser directly visit a
+url which resolves to an image rather than an HTML page with an inline
+image). This is because images need to have a per-session hidden form
+parameter to avoid cross-site scripting, which breaks bookmarks etc.
+
+=item C<SCRIPT>
+
+JavaScript for a C<< <script> >> element. (Possibly confidential for
+the user.)
+
+=item C<AJAX-XML>
+
+C<< XMLHttpRequest >> returning XML data. (Possibly
+confidential for the user.)
+
+=item C<AJAX-JSON>
+
+C<< XMLHttpRequest >> returning JSON data. (Possibly
+confidential for the user.)
+
+=item C<AJAX-OTHER>
+
+C<< XMLHttpRequest >> returning data of some other kind. (Possibly
+confidential for the user.)
+
+=back.
+
+=head1 DIVERT SPEC
+
+The return value from C<check_divert> indicates how the request should
+be handled. It is C<undef> if all is well and the user is logged in.
+
+Otherwise the return value is a hash ref with the following keys:
+
+=over
+
+=item C<Kind>
+
+Scalar string indicating the kind of diversion required.
+
+=item C<Message>
+
+Scalar string for display to the user in relation to the diversion.
+Has already been translated. In HTML but normally does not contain
+any tags.
+
+=item C<CookieSecret>
+
+The secret cookie which should be set along with whatever response is
+sent to the client. The value in the hash is the actual secret value
+of the cookie as a string. C<undef> means no cookie setting header
+should be sent; C<''> means the cookie should be cleared.
+
+=item C<Params>
+
+The extra hidden form parameters (and the C<PATH_INFO>) which should
+be set when the subsequent request bounces back from the client, in
+the form used by C<url_with_query_params>.
+
+The contents of this hashref does not include the CAF-specific
+parameters such as the secret cookie, those which follow from the kind
+of diversion requested, etc.
+
+It is correct to always include the contents of C<Params> as hidden
+parameters in the urls for all redirections, and as hidden input
+fields in all generated forms. The specific cases where C<Params> is
+currently relevant are also mentioned in the text for each divert
+kind.
+
+=back
+
+The values of C<Kind> are:
+
+=over
+
+=item C<SRCDUMP->I<item>
+
+We should respond by sending our application source code. I<item>
+(which will contain only word characters, and no lower case) is the
+specific item to send, normally C<SOURCE> or C<LICENCE>.
+
+=item C<REDIRECT-HTTPS>
+
+We should respond with an HTTP redirect to the HTTPS instance of our
+application.
+
+=item C<REDIRECT-LOGGEDOUT>
+
+We should redirect to a page showing that the user has been logged
+out. (Ie, to a url with one of the the C<loggedout_param_names> set.)
+
+=item C<SMALLPAGE-LOGGEDOUT>
+
+We should generate a page showing that the user has been logged out.
+There can be a link on the page pointing to the login page so that the
+user can log back in.
+
+=item C<SMALLPAGE-NOCOOKIE>
+
+We should generate a page reporting that the user does not have
+cookies enabled. It should probably contain a link pointing to the
+login page with additionally all the parameters in C<Params>. When
+this divert spec is generated, C<Message> will explain the problem
+with cookies so there is no need to do that again in the page body if
+you include the contents of C<Message>.
+
+=item C<LOGIN-STALE>
+
+The user's session was stale (this is described in C<Message>). We
+should generate a login form.
+
+=item C<LOGIN-BAD>
+
+The user supplied bad login credentials. The details are in
+C<Message>. We should generate a login form (with additionally the
+parameters from C<Params> as hidden fields).
+
+=item C<LOGIN-INCOMINGLINK>
+
+We should generate a login form (with the specified parameters); the
+user is entering the site via a cross-site link but is not yet logged
+in.
+
+=item C<LOGIN-FRESH>
+
+We should generate a login form. The user is not yet logged in.
+
+=item C<REDIRECT-LOGGEDIN>
+
+We should redirect to our actual application, with the specified
+parameters. (The user has just logged in.)
+
+=item C<MAINPAGEONLY>
+
+We should generate our main page but B<ignoring all form parameters>
+and B<ignoring the path_info>. Most applications will find this
+difficult to implement.
+
+An alternative is to generate a small page with a form or link which
+submits our own main page without any parameters.
+
+(Applications which set C<promise_check_mutate> do not see this divert
+kind.)
+
=head1 SETTINGS
C<new_verifier> and C<new_request> each take a list of settings, as
%implicit_settings_hash{'some_hook'}($cgi,$authreq,@stuff)
would be a legal call. (However, the settings hash is not exposed.)
+When a hook's default implementation is mentioned and named, that
+function won't also be described in the section on the module's
+functions.
+
=over
=head2 GENERAL SETTINGS
=head2 SETTINGS RELATED TO HTML GENERATION
-These are only used if you call C<check_ok> or xxx some other functions?.
+These are only used if you call C<check_ok> (or other functions
+mentioned in this section).
Settings whose names are of the form C<gen_...> are hooks which each
return an array of strings, normally HTML strings, for use by
description C<gettext("Password")>, and a login submit button (with
description C<gettext("Login")>.
-Default is available as the class function C<gen_plain_login_form>.
+Default is available as the module function C<gen_plain_login_form>.
=item C<gen_login_link>($cgi,$authreq))
<a href="http:...">gettext(Log in again to continue.)</a>
-Default is available as the class function C<gen_plain_login_link>.
+Default is available as the module function C<gen_plain_login_link>.
=item C<gen_postmainpage_form>($cgi,$authreq,$params))
<input type="submit" ... value=getext('Continue')>
-Default is available as the class function C<gen_postmainpage_form>.
+Default is available as the module function C<gen_postmainpage_form>.
=item C<gen_start_html>($cgi,$authreq,$title)
[gen_source_link_html].
</address>
-Default is available as the class function C<gen_plain_footer_html>.
+Default is available as the module function C<gen_plain_footer_html>.
=item C<gen_licence_link_html>($cgi,$authreq)>
downloading the licence, and returns:
<a href="...">GNU Affero GPL</a>
-Default is available as the class function C<gen_plain_licence_link_html>.
+Default is available as the module function C<gen_plain_licence_link_html>.
=item C<gen_source_link_html>($cgi,$authreq)>
downloading the source, and returns:
<a href="...">Source available</a>
-Default is available as the class function C<gen_plain_source_link_html>.
+Default is available as the module function C<gen_plain_source_link_html>.
=item C<form_entry_size>
If this is a relative path, it is in C<dir>.
-=back
+=item C<srcdump_dump($cgi,$authreq,$srcobj)>
+
+Dump the source code (C<$srcobj='source'> or licence data
+(C<$srcobj='licence'>). The default implementation checks that
+C<$srcobj> has reasonable syntax and uses the files C<$srcobj.data>
+and C<$srcobj.ctype> with the C<dump> hook.
+
+=item C<dump($cgi,$authreq,$contenttype,$datafilehandle)>
+
+Responds to the request by sending the contents of $datafilehandle
+(which should just have been opened) and specifying a content type of
+$contenttype.
+
+The default implmentation uses the C<print> hook, and also calls
+C<$cgi->header('-type' => $contenttype>, and is available as the
+module function C<dump_plain>.
+
+=item C<srcdump_prepare($cgi,$verifier)>
+
+Prepares the source code for download when requested. Invoked by
+C<new_verifier>, always, immediately before it returns the
+just-created verifier object.
+
+The default implementation is the module function
+C<srcdump_dirscan_prepare>, which prepares a manifest, licence file
+and source code tarball of tarballs, as follows:
+
+It processes each entry in the return value from C<srcdump_listitems>.
+These are the software's include directories and any other directories
+containing source code. It handles C<.> specially (see
+C<srcdump_filter_cwd>).
-=head2 SETTINGS FOR HTML GENERATION
+For each entry it looks, relative to that, for the licence as a file
+with a name mentioned in C<srcdump_licence_files>. The first such
+file found is considered to be the licence. It then calls the hook
+C<srcdump_process_item> for the entry.
-=over
+The licence, a manifest file, and all the outputs generated by the
+calls to C<srcdump_process_item>, are tarred up and compressed as a
+single source tarball.
-=back
+It uses the directory named by C<srcdump_path> as its directory for
+working and output files. It uses the filename patterns
+C<generate.*>, C<licence.*>, C<s.[a-z][a-z][a-z].*>, C<manifest.*>,
+C<source.*> in that directory.
+
+=item C<srcdump_process_item>($cgi,$verifier,$dumpdir,$item,\&outfn,\$needlicence,\%dirsdone)>
+
+Processes a single include directory or software entry, so as to
+include the source code found there. Called only by the default
+implementation of C<srcdump_prepare>.
+
+C<$dumpdir> is the directory for working and output files. C<$item>
+is the real (no symlinks) absolute path to the item.
+
+C<\$needlicence> is a ref to a scalar: this scalar is undef if we have
+already found the licence file; otherwise it is the filename to which
+the licence should be copied. If the referent is undef on entry,
+C<srcdump_process_item> needs to see if it finds the licence; if it
+does it should copy it to the named file and then set the scalar to
+undef.
+
+C<\%dirsdone> is a ref to the hash used by C<srcdump_prepare> to avoid
+including a single directory more than once. If
+C<srcdump_process_item> decides to process a directory other than
+C<$item> it should check this hash with the real absolute path of the
+other directoy as a key: if the hash entry is true, it has already
+been done and should be skipped; otherwise the hash entry should be set.
+
+C<\&outfn> is a coderef which C<srcdump_process_item> should call each
+time it wants to generate a file which should be included as part of
+the source code. It should be called using one of these patterns:
+ $outfn->("message for manifest");
+ $outfile = $outfn->("message for manifest", "extension");
+The former simply prints the message into the manifest in the form
+ none: message for manifest
+The latter generates and returns a filename which should then
+be created and filled with some appropriate data. C<"extension">
+should be a string for the file extension, eg C<"txt">. The output
+can be written directly to the named file: there is no need to
+write to a temporary file and rename. C<$outfn> writes the filename
+and the message to the manifest, in the form
+ filename leaf: message
+In neither case is the actual name of C<$dir> on the system
+disclosed per se although of course some of the contents of some of
+the files in the source code dump may mention it.
+
+The default implementation is the module function
+C<srcdump_process_item>.
+
+It skips directories for which C<srcdump_system_dir> returns true.
+
+It then searches the item and its parent
+directories for a vcs metadata directory (one of the names in
+C<srcdump_vcs_dirs>); if found, it calls the C<srcdump_byvcs> hook
+(after checking and updaeing C<%dirsdone>).
+Otherwise it calls the C<srcdump_novcs> hook.
+
+=item C<srcdump_novcs($cgi,$verifier,$dumpdir,$item,$outfn)>
+
+Called by the default implementation of C<srcdump_process_item>, with
+the same arguments, if it doesn't find vcs metadata.
+The default implementation is the module function C<srcdump_novcs>.
+If C<$item> is a directory, it uses C<srcdump_dir_cpio> to prepare a
+tarball of all the files under C<$item> which have the world read bit
+set. Directories are not included (and their permissions are
+disregarded). The contents of C<srcdump_excludes> are excluded.
+
+If it's a plain file it uses C<srcdump_file> to include the file.
+
+=item C<srcdump_byvcs($cgi,$verifier,$dumpdir,$item,$outfn,$vcs)>
+
+Called by the default implementation of C<srcdump_process_item>, with
+the same arguments, if it finds vcs metadata. The additional argument
+C<$vcs> is derived from the entry of C<srcump_vcs_dirs> which was
+used: it's the first sequence of word characters, lowercased.
+
+The default implementation is the module function C<srcdump_byvcs>.
+It simply calls C<srcdump_dir_cpio> with a script from the setting
+C<srcdump_vcsscript>.
+
+=item C<srcdump_vcs_dirs>
+
+Array ref of leaf names of vcs metadata directories. Used by the
+default implementation of C<srcdump_process_item>. The default value
+is C<['.git','.hg','.bzr','.svn']>.
+
+=item C<srcdump_vcs_script>
+
+Hash ref of scripts for generating vcs metadata. Used by the default
+implementation of C<srcdump_byvcs>. The keys are values of C<$vcs>
+(see C<srcdump_byvcs>); the values are scripts as for
+C<srcdump_dir_cpio>.
+
+The default has an entry only for C<git>:
+ git ls-files -z
+ git ls-files -z --others --exclude-from=.gitignore
+ find .git -print0
+
+=item C<srcdump_excludes>
+
+Array ref of exclude glob patterns, used by the default implementation
+of C<srcdump_novcs>. The default value is C<['*~','*.bak','*.tmp','#*#']>.
+
+Entries must not contain C<'> or C<\>.
+
+=item C<srcdump_listitems($cgi,$verifier)>
+
+Returns an array of directories which might contain source code of the
+web application and which should be therefore be considered for
+including in the source code delivery.
+
+Used by the default implementation of C<srcdump_prepare>.
+
+Entries must be directories, plain files, or nonexistent; they may
+also be symlinks which resolve to one of those.
+
+If C<.> is included it may be treated specially - see
+C<srcdump_filter_cwd>.
+
+The default implementation returns
+C<(@INC, $ENV{'SCRIPT_FILENAME'}, $0)>.
+
+=item C<srcdump_system_dir($cgi,$verifier,$dir)>
+
+Determines whether C<$dir> is a "system directory", in which any
+source code used by the application should nevertheless not be
+included in the source code dump.
+
+Used by the default implementation of C<srcdump_item>.
+
+The default implementation is as follows: Things in C</etc/> are
+system directories. Things in C</usr/> are too, unless they are in
+C</usr/local/> or C</usr/lib/cgi*>.
+
+=item C<srcdump_filter_cwd>
+
+Boolean which controls the handling of C<.> if it appears in the
+return value from C<srcdump_listitems>. Used only by the default
+implementation of C<srcdump_prepare>.
+
+If set to false, C<.> is treated normally and no special action is
+taken.
+
+However often the current directory may be C</>, or a data directory,
+or some other directory containing data which is confidential, or
+should not be included in the public source code distribution for
+other reasons. And for historical reasons Perl has C<@INC> containing
+C<.> by default (which is arguably dangerous and wrong).
+
+So the default this setting is true, which has the following effects:
+
+C<.> is not searched for source code even if it appears in C<@INC>.
+C<.> is removed from C<@INC> and C<%INC> is checked to see if any
+modules appear to have already been loaded by virtue of C<.> appearing
+in C<@INC> and if they have it is treated as a fatal error.
+
+Only the literal string C<.> is affected. If the cwd is included by
+any other name it is not treated specially regardless of this setting.
+
+=back
=head1 DATABASE TABLES
xxx make _db_setup_do explicitly overrideable
-xxx divert spec
-xxx reqtype
-xxx settings
-xxx html generators
-xxx document cookie
+xxx remaining settings
+ assocdb_password
+ username_password_error
+ login_ok
+ get_cookie_domain
+ gettext
+ print
+ debug
+
+xxx document cookie usage
+xxx document construct_cookie fn
xxx bugs wrong default random on Linux
xxx bugs wrong default random on *BSD