upd_stats utilities submitted by Douglas Wilson Version dated: Jul 11, 2002 ---------------------------------------------------------------------------- Generic unix utilities: dirsz - Give size(s) of directorie(s) with some nifty options. I used to write this sort of utility in shell/awk, but this is better/faster. notail - like 'tail' except return all BUT last 'n' lines tailf - like 'tail -f' but exit upon reaching some given regexp(s). tac - my perl implementation of 'tac' utility, cat file in reverse order. sendfile - send email with file attachment(s). Requires Mail::Sender module. Informix specific utilities: onfileschk - check permissions of Informix files against what's in onfiles file. locks - info on who has what type of locks on what. wholock - info on who has locks on a certain table and what rows are locked. itags - create 'tags' file for 4gl files in current directory. tuser - Shell script to report on locks on Standard Engine tables. Fourgen specific utilities: db - set DBSRC path (from libs in Makefile) and start 4gl debugger. litags - create 'tags' file for library functions of Fourgen generated modules. ---------------------------------------------------------------------------- Fixed wholock which didn't always use the correct table number. Added db debugger utility for Fourgen users. ---------------------------------------------------------------------------- Updated sendfile to work better with later version of library Updated tuser to output file name for table tuser wasn't reporting on the ppid of the locking process if there was more than one lock on the file. ---------------------------------------------------------------------------- Updated litags to skip files it can't open, and added tuser to archive. Updated some perl scripts to silence warnings that weren't complaining in earlier versions of perl. ---------------------------------------------------------------------------- Utility: updstats Synopsis: Generates optimal update statistics statements according to the Informix 7.3 Performance Guide, and optionally executes them. Author: Douglas Wilson Version(s) supported: only 7.xx has been tested File(s): updstats Comments: Requires perl (5.005 or later), and the perl modules DBI (1.13 or later) and DBD::Informix (0.95 or later) modules (available from www.cpan.org). DBD::Informix installation requires ESQL/C (maybe not, read the docs). You may have to change the path to perl at the top line of the script, depending on where your perl is installed, and a default database is hardcoded in the script which you may want to change. The 'connect' statements in the program may need a username and password depending on your database setup. I just have the empty strings '' for those which works for me. The output of the script by default goes to stdout, so you may redirect it to a file, or you can immediately execute it with the -x option, which just pipes the output to xupdstats (which must be in your PATH). If you want to simultaneously output to a file and execute it, thats what 'tee' is for. Yes, this does pretty much the same thing (I hope) as Art's dostats.ec utility (BTW thanks for the help Art), and without as many options (so far), (ok, now there is a bunch of options which do all sorts of obscure things) but I just wanted to do it in perl, and I wanted to automatically execute the statements in parallel (which gets done in xupdstats). Release Notes: Version 1.1: Allows multiple database selection with '-d' option, allows host selection with '-h' option, removed '-a' ansi option which is unneccessary since according to DBD::Informix man pages, all databases are in 'AutoCommit' mode by default. No longer outputs column list for MEDIUM updates when there are no indexes on the table. Single column indexes were generating update LOW statements, but no longer (except with -F option). Added -F option to disable single column optimization. Version 1.2: Added -p option to update statistics for stored procedures. Added optional argument to -x option to start n sub-processes. Updated usage message. Version 1.22: '-t all' was not updating all tables, now it is. Version 1.23: added '-pt' option to work with xupdstats '-t' option. Version 1.24: The Performance Guide says to update high on tables with a small number of rows. It neglects to say what is a small number. So I added an option to update high on tables with less than or equal to whatever number of rows you would like to define as a small number. The number of rows is retrieved from 'systables.nrows' so this will be inaccurate for tables which have never had statistics updated, but will save a bit of time over doing a count(*) (and saves me programming time). Maybe I'll change it or add an option to actually do a count(*) on the table in the future since a count(*) doesn't really take that much time. Also, changed all options that take an argument to actually require an argument. Version 1.25: No longer generates duplicate UPDATE LOW statements for indexes with duplicate column lists. Version 1.26: Added support for standard engine with '-s' option. This is untested though since I don't have access to a standard engine database. Added -R,-r,and -C options to specify resolution and confidence for HIGH and MEDIUM updates. No error checking is done on the values supplied, you're expected to read the manual on such things. Also since the Performance guide suggests different resolution on large tables, I added a -rn option to only add the resolution clause if the number of rows is >= some number. Added -c option to perform a count(*) on all the tables instead of using 'systables.nrows', this about doubles the execution time (from ~12 to ~24 seconds on my database of about 1000 tables), but I also streamlined the code in the process (saving about 1 second on the aforementioned database). Version 1.30: Now the '-s' option for SE should actually work, also the '-pt' option, with the help of patches from Roderick Schertler (Thanks again). Since there's no sysmaster database in SE, I use a DBI function (data_sources) to get the available databases if needed (e.g. if you say 'all' databases or use a wildcard in the database name), but the function only works for the local host, so using '-h' will not get the database NAMES from the specified host for SE. I thought about aborting in this case, but who knows it might still behave the way you want if you have duplicate database names on systems. Version 1.31: Slight bug if you had wildcards in the database name. You will still be out of luck if you use '-h' and also have a database name that contains an explicit host name, e.g. 'db@host'; this will probably cause an error. Or if you use 'all' to get all available databases and also include a 'db@host' database arg, then you will lose the 'db@host' arg. I may remedy that later if its desired and after I think about it more. ---------------------------------------------------------------------------- Utility: xupdstats Synopsis: Executes the sql output from updstats utility. Several child processes are spawned, and each batch of statements with a common tablename are sent to a process. As each batch finishes in each process, another is sent. Author: Douglas Wilson Version(s) supported: only 7.xx has been tested File(s): xupdstats Comments: Requires perl (5.005 or later?), and the perl DBI and DBD::Informix modules (available from www.cpan.org). DBD::Informix installation requires ESQL/C. You may have to change the path to perl at the top line of the script, depending on where your perl is installed, and a default database is hardcoded in the script which you may want to change. The 'connect' statements in the program may need a username and password depending on your database setup. I just have the empty strings '' for those which works for me. Release Notes: Version 1.1: Allows multiple databases via 'database' statements in the sql script. Connection to a new database is made in the subprocesses only when neccessary. Allows host selection with '-h' option, removed '-a' ansi option which is unneccessary since according to DBD::Informix man pages, all databases are in 'AutoCommit' mode by default. Version 1.2: Allow update statistics for stored procedures. Set lock mode to wait for the sql connections since the system tables being updated have page level locking. Version 1.22: Stored procedure statistics were not really being updated, now they are. Added an option to execute all statements for a table in parallel. I think think is ok since we're generating complete column lists and adding 'DISTRIBUTIONS ONLY' whenever possible. Added some signal handling, so that this program is automatically nohup'd if you run it in the background but if you kill any processes (with -15 (SIGTERM), please) then they will all die after finishing the current sql statement. I'm open to suggestions on how to kill the processes more cleanly when one of them is TERM'd. None of that matters if you just let it run all the way through anyway. Version 1.23: Program waits for update stats on tables to finish before starting on stored procedures. Version 1.24: Last modification fixed. We were only waiting for tables to finish on the first database; after that, we were waiting, but in the wrong place. Also, changed all options that take an argument to require an argument. Version 1.25: Somehow a bug sneaked in where -t was being done for all tables even if -t was not specified. Fixed it. Version 1.30: Doing a 'ps' when this is running should be more informative thanks to patch from Roderick (see above). Version 1.31: Better error reporting. Main program exits w/error status if any child sql statement fails. Again thanks to Roderick for patch. ----------------------------------------------------------------------------