July 1994 Newsletter
Volume 4, No. 3
Highlights of This Issue
User Group Activities and Events
Second Annual Special Issue for New Members
This is the second year we have produced a special edition of our newsletter for our new
members and for the Informix Worldwide User Conference in Tampa, Florida. In the last year
we have grown from 200 to over 500 Informix users on our mailing lists. We held a one day
Forum last November with 220 attendees, 16 speakers and 10 exhibitors. We had technical
presentations on Informix 4GL programming tips, networking, Online DSA 6.0. Our
quarterly meetings have included presentations on support services through Informix Link and
Informix on Windows NT. This has been a very exciting year for our user group and we look
forward to another challenging year. Our objective is to provide technical tips, new product
information and the opportunity to network with other Informix Users.
The Informix User Group in the Washington area was started in May 1991. Nineteen people
from a variety of companies and federal agencies met to hear about release 5.0 and make plans
for a user group. At the meeting we identified the goals of the user group as to share
information, learn about new products, provide technical input, discuss problems, share
solutions, and network with other Informix Users. Our first newsletter was sent out in June
1991. This issue is also being mailed to many people who are not on our membership lists in
the Maryland, Virginia and the District of Columbia. If you would like to receive regular
mailings from our user group please send in the membership form on the last page.
Next Meeting
Our next meeting is Wednesday, August 24, 1994 from 5:00 to 9:00 pm. The meeting will be
held at the Informix Office, 2110 Wilson Blvd., 9th Floor, Arlington, VA. Please note the
new address and time for the meeting.
Agenda:
- New products from Informix (Informix 4GL++, and Online DSA 6.0)
- Question and answer session.
- Discussion on plans for Forum 94 - a one day Informix User Group event
Newsletter Sponsorship
The user group has been supported by many companies over the years. If your company
would like to sponsor a mailing of the newsletter please call for more information. We would
like to thank the following companies for sponsoring this newsletter:
- Advanced DataTools Corporation
- Business Systems Support Group, Inc
- C Associates
- Four Seasons Software
- Summit Data Group
Last Meeting - Informix Database on Windows NT
Our May meeting was a joint meeting with the Advanced Systems User Group (Windows NT
User Group). Pat Higbie, President of DataFocus, Incorporated, demonstrated Informix SE
on Windows NT. DataFocus helped developed the POSIX subsystem in NT and ported
Informix SE to Windows NT for Informix. They are experts in the area of UNIX porting and
compatibility and have a product to port from Unix to NT. This was a rare behind-the-scenes
look at the Informix database and Windows NT. Microsoft Corporation hosted the meeting.
Elections for the Board of Directors
At the August meeting we will be holding elections to the WAIUG Board of Directors. The
Board is composed of volunteers who plan our activities and work hard at putting them
together. If you are interested in volunteering to be on the Board, or would like to nominate
a member, please contact the Secretary of the WAIUG, Rick Montgomery, at 703-756-7273.
User Group Membership
Our user group has grown to over 500 names on our mailing list this year. We are
incorporated as a non-profit organization. Our objective is to share information, learn about
new products, provide technical input, discuss problems, share solutions, and network with
other Informix users. In order to cover our expenses and support continued growth, the board
of directors decided to implement membership dues in 1994. The membership dues will be
$20.00. Everyone who registered and paid for the User Group Forum in 1993 will
automatically be covered as a member for one year. Membership will allow you to receive all
our newsletters and a discount on our forum next year. For more information about
membership, please call our Membership Director, John Petruzzi, at 703-490-4598
SQL Optimization - the Set Explain Command
by Kevin Fennimore
When performing queries on a database the engine must determine how to access the tables.
This is the job of the optimizer. The optimizer determines the appropriate order to read from
tables when performing joins and how to access the data whether it is via an index,
sequentially or some other method. It analyzes various information about the table or tables
involved in the query such as the size of the table, the index on the table and the selectivity of
filters in the query (for a discussion of selectivity see the Guide to SQL Tutorial Manual).
The order that the optimizer will use to read from the tables involved in a join is called the
query path. When three tables are joined the optimizer may decide to read from table 2, then
table 3 and finally table 1. This table2-table3-table1 ordering is the query path. The right
query path can prevent the engine from doing extra work by reducing the amount of data that
is read to execute a join. This might be accomplished by reading from a smaller table first and
then joining to a larger table. Consider the following query:
SELECT * FROM table, table2
WHERE table1.x = table2.y
Table1 has 100 records and table2 has 50,000 records. For every record we read from table1
(100 reads) we would read 50,000 records from table2 to satisfy the join (assuming no indexes
on table2). This would be a total of 5,000,100 reads (100 table1 reads + 100 * 50,000 table2
reads). If, however, we read from table2 first (50,000 reads) and then from table1 (100 for
each table2 record), we would perform 5,050,000 reads (50,000 table2 reads + 50,000 * 100
table2 reads). This order requires an additional 49,900 reads.
Now consider what would happen if we added indexes to the table. Let's assume that an
indexed read into table1 requires 2 index reads and 1 data read for a total of 3 reads to get a
particular value and table2 requires 4 reads (3 index reads and 1 data read). When we read
from table2 first, we'd read each record from table2 and do an indexed read into table1. This
would require 200,000 reads (50,000 reads from table2 + 50,000 * 3 reads from table1). But,
when we read from table1 first and then do an indexed read into table2 we only need 500 reads
(100 reads from table1 + 100 * 4 reads from table2).
As you can see the number of reads required can be reduced by just reading tables in the right
order and even further reduced by adding indexes to the join columns. In addition to reading
fewer records other savings can be realized. By reading rows from a table via an index, which
will return rows sorted based on the index key, the need for additional sorting is eliminated.
In order to determine what the optimizer is doing and how data is being accessed, we can use
the SET EXPLAIN ON command. This command is available in SQL, 4GL and any of the
embedded-SQL languages. For example:
SET EXPLAIN ON;
SELECT 1;
SELECT 2;
SET EXPLAIN OFF;
When this set of SQL commands is run a query plan is output to the file sqexplain.out in the
current directory. The file is created if it does not already exist and is appended to if it does
exist. Following is an example of some queries run with the SET EXPLAIN turned on and an
explanation of the output. Explanations are in italics.
All queries begin with the following line, followed by the query statement:
QUERY:
------
select * from stock order by description
The next two lines are common to all queries.
Estimated Cost: 6
The estimated cost is a number assigned to the query and is
primarily used as a comparison to other query paths. When this
query is run over and over again this number may vary slightly.
This number is useful when a change is made to the query (i.e.
adding an index) and it drops significantly.
Estimated # of Rows Returned: 15
The estimated number of rows returned is only an estimate. I have
seen cases where a query returns thousands of rows and this number
was only 2 or 3.
Temporary Files Required For: Order By
This line reports if the optimizer has determined that a temporary
file is required. In this example a temporary file is required for
the order by. As you may recall from the "Indexing Strategies"
article, creating an index on the description field would eliminate
the need for this temporary table since the optimizer could use the
index to order the rows, as in the next example. This would
eliminate the need for a temporary table but may not be desirable
since the description field is character.
The rest of the query plan lists the table access strategy. This
is the order that the tables will be read and how they will be
accessed.
1) informix.stock: SEQUENTIAL SCAN
In the example above only one table is involved so it will be read first and it will be read via a
SEQUENTIAL SCAN. In this case we are selecting all of the rows with no joins so other than
adding the index for the order by there isn't much else we could improve.
QUERY:
------
select * from stock where unit_price>20 order by stock_num
Estimated Cost: 3
Estimated # of Rows Returned: 5
1) informix.stock: INDEX PATH
Filters: informix.stock.unit_price > 20
(1) Index Keys: stock_num manu_code
This is an example (above) of the same query except it is ordered on stock_num and has a filter
on unit_price. Notice that no temporary files are required and that the access this time is INDEX
PATH and not SEQUENTIAL SCAN. We can also see the columns that make up the index used
on the Index Keys line (stock_num, manu_code). This is a good example of the engine using
the leading portion (stock_num) of a composite index (stock_num, manu_code). The Filters are
applied to each row that is read from the database to determine if it meets the the query criteria.
QUERY:
------
select manu_code from stock
Estimated Cost: 2
Estimated # of Rows Returned: 15
1) informix.stock: INDEX PATH
(1) Index Keys: stock_num manu_code (Key-Only)
For this query the optimizer uses an index path to read the information from the table. It uses
the index because the optimizer is able to take advantage of the Key-Only read capabilities
available with OnLine (this type of access is not available with the Standard Engine). A Key-
Only read is when the engine reads all of the selected data from the key of an index. This can
greatly reduce the number of reads required to satisfy the query. Since many keys can fit into one
node, reading one node could yield many rows of data. Also note here that the engine is able to
perform Key-Only reads for a column(manu_code) that is not the leading portion of a composite
index as is the case when using composite indexes for joins and ordering.
QUERY:
------
select * from stock,items
where stock.stock_num=items.stock_num
and items.quantity>1
Estimated Cost: 9
Estimated # of Rows Returned: 22
When two or more tables are used in a query the tables are listed in the order in which they will
be accessed. In this example we read from the stock table sequentially and then access the items
table via the index on stock_num, manu_code. The filter is also applied on the quantity column
of the items table.
1) informix.stock: SEQUENTIAL SCAN
2) informix.items: INDEX PATH
Filters: informix.items.quantity > 1
(1) Index Keys: stock_num manu_code
Lower Index Filter: informix.items.stock_num =
informix.stock.stock_num
When an indexed read is done it is necessary to find the first key value where the indexed read
will begin. This starting position is defined in the Lower Index Filter. As we will see in the next
example, there is also an Upper index Filter which defines the key value at which to stop reading
the index.
QUERY:
------
select * from stock where stock_num>10
and stock_num<14
Estimated Cost: 1
Estimated # of Rows Returned: 1
1) informix.stock: INDEX PATH
(1) Index Keys: stock_num manu_code
Lower Index Filter: informix.stock.stock_num > 10
Upper Index Filter: informix.stock.stock_num < 14
Here we are doing an indexed read of the table and using the Lower and Upper Index Filters to
satisfy the filter condition on stock_num.
QUERY:
------
select * from items,stock
where items.total_price=stock.unit_price
Estimated Cost: 19
Estimated # of Rows Returned: 59
1) informix.items: SEQUENTIAL SCAN
SORT SCAN: informix.items.total_price
2) informix.stock: SEQUENTIAL SCAN
SORT SCAN: informix.stock.unit_price
MERGE JOIN
Merge Filters: informix.stock.unit_price =
informix.items.total_price
This example uses the concept of the Sort Merge Join to join the two tables together where no
suitable index can be found. The SORT SCAN indicates that a sort will be done on the specified
columns in preparation for a Merge Join. Then the MERGE JOIN indicates that the sorted
column are to be merged. This type of joining is not very efficient since additional overhead is
incurred during the sort and merge processes. This query is a prime candidate for adding indexes
to the join columns to improve performance.
Note: The Sort Merge Join is only available in versions 5.0 and beyond. Prior to version 5.0 an
AUTOINDEX PATH would have been used. This is even less efficient than the Sort Merge Join
since it actually builds a temporary index on the columns being joined. Both of these access
methods should be avoided whenever possible.
QUERY:
------
select * from qps@ctl0104:tord_stat
Estimated Cost: 2
Estimated # of Rows Returned: 7
1) qps@ctl0104:informix.tord_stat: REMOTE PATH
Remote SQL Request:
select x0.ord_stat_ind ,x0.ord_stat_desc ,x0.user_stamp
,x0.dt_t
imestamp ,x0.trans_dt_timestamp from qps:"informix".tord_stat
x0
This query is accessing a table on a remote database server. This is the reason for the REMOTE
PATH. The request that is actually sent to the remote server is also output. There is no way to
access the query plan for the request on the remote server.
when using the SET EXPLAIN command you should keep in mind the affect of the UPDATE
STATISTICS command on the optimizer. As the database changes the UPDATE
STATISTICS command should be run occasionally to assure that the statistics used by the
optimizer are up to date. Running update statistics on a table can completely change the
query plan in the SET EXPLAIN output.
The SET EXPLAIN command is the best tool available for analyzing slow queries. But it can
also be a useful tool in the development process. It can be used to double check that a query
is using the indexes you think it should be using. You can also use the SET EXPLAIN
command in a program to analyze all of the queries that are performed during the program's
execution. This output can be reviewed in a code walk through or used to follow the logic of
unfamiliar code. Development applications could include a call to a function that checks an
environment variable and based on its value turns the SET EXPLAIN on. This way no
recompilation is necessary to turn the explain feature on, simply change the environment
variable. It is a good idea not to have the SET EXPLAIN turned on at all times since it will
append to the sqexplain.out file in the current directory which could cause some rather large
files.
Kevin Fennimore, UCI Consulting, 2413 Arbor Lane, Hillsborough, NC 27278, Phone: 919-644-1073
Informix 4GL Programming Tips
By Lester Knutsen
This article is a result of a presentation I did for the Washington Area Informix Users Group
last March, on 10 Informix 4GL Programming Tips. Since then, I have expanded on the ideas
with additional input from a second presentation for the Atlanta User Group, training
seminars I have conducted, and a discussion in comp.databases.informix on the fastest way to
perform an update. It has grown to 12 4GL tips, most of which also apply to SQL. Most of
these tips are basic, and this is targeted for the new Informix 4GL programmer.
1. Only select columns that you really need
When you select data it goes from disk to memory. Ten un-needed bytes of data from 10,00
rows translates into reading 100,000 bytes unnecessarily. Avoid "SELECT *" statements and
only select the fields you will be using in your program. Some of the biggest performance gains
I have seen have come from eliminating unused fields from select statements. In an network
environment this will also reduce network traffic.
2. Only select rows when they are really needed
Several programs I have done performance reviews on started out by selecting all the rows in
a table, and making the user wait while this select was occurring. Even if the user only wants
to locate a few records they have to wait until all records are located. Avoid select statements
without a where clause, and statements that select rows a user may not need.
3. Use the dual cursor model for selecting and updating data
One approach is to use a dual cursor model. The first cursor selects all the rowids a user may
need, and as the user picks a next row, a second cursor selects the columns of data from a row.
I have an example program at the end of the article that uses the dual cursor model.
4. Only update the columns that need to change
We want to avoid transferring unnecessary bytes between memory and disk. Only update the
columns that need to change. Avoid updating all columns unnecessarily, for example, using
"set tab.* = rec.*". Also, avoid updating columns with indexes, as the data has to be written
to disk and the index updated, requiring two or more writes to disk.
5. Use "UPDATE WHERE CURRENT OF" syntax
When you are updating a row, you most likely have already selected the row and loaded it in
memory. The statement "update table set X=X where rowid = rec.rowid" re-selects the data
from disk again before updating it. Use the statement "update where current of" to update
a row you have already selected. This will speed up your updates. See the example program
at the end of this article.
6. Use of direct updates vs. selecting data for update
There was a very interesting discussion on the Internet in comp.database.informix on the
fastest way to update 2 million rows. The general consensus was to turn off logging and let
the database engine do the update for the fastest results. I did some tests and this was 5 to 6
times faster than selecting the data and then updating it using the dual cursor method.
7. Using LET vs INITIALIZE
The INITIALIZE statement performs a subroutine which is slower than the LET statement
which performs a byte copy. At the beginning of a program, I like to initialize a null record
(null_rec.*). Then in the program, use "LET p_rec.* = null_rec.*". This is faster than calling
INITIALIZE repeatedly.
8. Using GLOBAL variables vs. passing parameters
Passing variables to functions requires push and pop functions on the 4GL stack. Look at the
C code generated by 4GL and examine what happens when you pass a variable to a function.
There is some extra overhead in this. I recommend carefully selecting variables that are passed
frequently to functions and making them GLOBAL variables. GLOBAL variables are
accessed directly in memory.
9. Program initialization functions
Avoid repeatedly opening and closing the same forms, preparing static cursors, and initialing
NULL variables throughout a program. At the beginning of a program, have a function to
perform all initialization of null variables, opening forms and preparing cursors.
10. Update statistics
The database optimizer needs correct information about what is in your database to perform
well. After adding or deleting many rows in a 4GL program, use the statement "UPDATE
STATISTICS FOR table_name" to update the information the optimizer uses.
11. Plan your locking method
Row locking allows many people to update data at the same time. However, it can consume
system resources and may fail if your system does not have enough locks available. If users are
updating single rows, then row level locking works best. However, for batch updates, consider
locking the entire table. One rule I have used is if the update will involve more than 200 rows
or 25% of the rows, the table needs to be locked.
12. Plan transaction length
Avoid long processes that may fill up your logs and lock the database engine. This is one
advantage of the dual cursor model vs. the direct update model. The direct update model is
one long transaction and may fill up your logs. Using a cursor and committing after each row
is updated will ensure that your logs do not fill up.
Figure 1. Example Program Using the Dual Cursor Model
######################################################################
# Copyright 1994 Advanced DataTools Corporation
# Function: updcursor.4gl
# Author: Lester B. Knutsen
# Description: A program to demo the dual cursor approach
# Date Name Comments
#----- ----- ----------------
#020194 LK Created
######################################################################
database stores
globals
define price like items.total_price,
new_price like items.total_price,
rowno integer
end globals
main
## Declare the big cursor selecting only the rowid or a unique indexed field for
all rows required.
prepare pc_select_all from " select rowid from items "
declare c_select_all cursor with hold for pc_select_all
## The with hold option will keep the cursor open after a commit work
## Declare the small update cursor to lock one row at a time - locking the
smallest data possible.
prepare pc_lock from "select total_price from items where rowid = ? for update
of total_price"
declare c_lock cursor for pc_lock
## Prepare the update statement to update the current row in memory.
prepare pc_upd from "update items set ( total_price ) = ( ? ) where current of
c_lock "
## Fetch the Big cursor.
foreach c_select_all into rowno, price
## On every row - fetch the small cursor. This will lock and update one row
at a time
begin work ## Used when database logging is turned on
open c_lock using rowno
fetch c_lock into price
let new_price = price * 1.1
execute pc_upd using new_price
display "Update from ", price, " to ", new_price
commit work ## Used when database logging is turned on
end foreach
display "Update complete "
display ""
end main
Lester Knutsen, Advanced DataTools Corporation, 4510 Maxfield Drive, Annandale, VA
22003, Phone: 703-256-0267 or Email: lester@access.digex.net.
Avoiding Database Crashes
by Joe Lumbley
In an environment with multiple programmers who are familiar with Unix, your most common
problem will occur when somebody kills an engine process. Informix does not take lightly to
the untimely death of a sqlturbo engine. This often causes the database to abort, necessitating
a restart and recovery that can take anywhere from minutes to hours.
In some instances, you will have a "rogue" sqlturbo process running that cannot seem to be
killed. If a user kills this process at just the wrong time, the entire database could go down.
More sophisticated Unix users will often just resort to the "kill -9" in order to try to stop their
queries.
This problem can best be solved by enforcing a prohibition that NOBODY but the DBA can
"kill -9" a sqlturbo process. It's safe for anyone to "kill -15" a process. The problem here is
that if a Unix user gives a "kill -15" command, the process will not necessarily die immediately.
If the sqlturbo process goes into a rollback status because it received the command in the
middle of a transaction, the process will remain around until the rollback completes. An
impatient user may try a few "kill -15's" and finally try for a sure kill with "kill -9". If the
system is in rollback, OnLine will probably crash. If the process was holding any locks or if
it was in a critical write process, the OnLine can crash. The Informix command "tbmode -z"
command works in the same way. The job doesn't immediately disappear.
Both "kill -15" and "tbmode -z" are safe to use any time, as long as the user is prepared to be
patient. It's usually best to have the users check with the DBA if the jobs don't die within
about 15 minutes.
Processes that take longer than fifteen minutes can pose a quandary even to the DBA. It's
possible that the query was a disjoint query and that the system is trying to sort a billion rows
for you. You could be in a long transaction on the verge of either going into forced rollback
or filling up all your logs, causing a crash. If the DBA can't identify exactly what's happening
it is usually best to let the process finish its job, unless this will cause other problems in the
database, such as filling up the logs. If the runaway job begins to affect performance or
response time the DBA usually needs to manage to kill it.
By the time the DBA sees the typical rogue sqlturbo process, the user has probably already
tried to kill it with "kill -15" or "tbmode -z". It's probably in rollback. Rollback can be
identified by the flags "--R--X--" in a "tbstat -u" process. Using the Unix "ps" command
will tell the DBA whether the sqlturbo process is getting any time. If so, your rollback is in
process. No matter what you do, the rollback has to complete. It'll either complete with the
system online or it'll complete during the recovery stage of database startup. Online is usually
best.
The worst case for the DBA is the process that was written in ESQL/C or that has other
reasons why the process only recognizes a "kill -9". Here, if you absolutely have to stop the
process, a "kill -9" is your last resort. If your Unix has process control (do you have a "bg"
command to put a job in the background), read the next paragraph before doing a "kill -9".
Otherwise, do it and cross your fingers.
Some Unix operating systems that support placing jobs in the foreground or background have
another option to "kill" that could possibly give you an out in the above situation. If you can
either stop the process by process control or with a "kill -STOP" command, you may see that
the Unix "ps" command shows that the process eventually is getting no time and is in a
sleeping state. You can then use tbstat -u and check the flags for the terrorist process. If the
flags don't show an "X" indicating that the process is not in a critical state, you have a chance.
If it is in a critical state, use your equivalent of a "kill -CONT" command to crank it up again.
Keep stopping and starting it until it gets no time and doesn't have an "X". Note the
username column of the tbstat -u output. There is also a column named "locks". If the
process is holding no locks, you're probably safe in doing a "kill -9". If it doesn't work and
you really needed to kill the process, you haven't lost anything. You would have had to do the
"kill -9" anyway.
No matter what you do, no matter how careful you are, you will crash like this sometimes. If
the consequences are bad enough, the users usually learn to make the DBA kill the tough jobs.
This is an exerpt from a new book on Database Administration by Joe Lumbley.
The Informix Database Administrator's Survival Guide. ISBN 0-13-124314-1, Informix Press.
Joe Lumbley, 819 N. Bishop Street, Dallas, TX 75208, Phone: 214-450-9896, Email:
jlumbley@netcom.com
The PREPARE Statement
by Kevin Fennimore
As defined by the Informix Guide to SQL manual, the prepare statement is used to "parse,
validate, and generate an execution plan for SQL statements whose structure is unknown
when your program is compiled." The most common use of the prepare statement is for
implementing query-by-examples, as in the following segment of code:
CONSTRUCT BY NAME where_clause ON screen_rec.*
LET stmt = "SELECT * FROM some_table WHERE ", where_clause
CLIPPED
PREPARE p1 FROM stmt
DECLARE c1 CURSOR FOR p1
FOREACH c1 INTO some_rec.*
. . .
END FOREACH
However, there are other uses for the prepare statement, such as significantly increasing
performance. To understand how the prepare statement impacts performance, let's take a
closer look at what happens when a statement is prepared and when a statement is run in
an application.
When a statement is prepared, the following actions are performed by the engine(i.e. the
sqlexec or sqlturbo process):
- The statement is checked for correct syntax.
- The current user's permissions are checked for the statement.
- and -
- The statement is optimized.
These actions require reads of the system catalogues( systables, syscolumns, sysindexes, etc.
). The syntax checks require the reading of catalogues like systables and syscolumns to
check for the existence of specified table and column names. The permission checks require
reading the sysuers, systabauth and syscolauth tables. The optimization requires the
reading of catalogues such as sysindexes and systables. This is just an example of some of
the catalogues involved in preparing a statement but there are many others.
When a statement is run in any Informix application it must be
prepared and then executed. For example, the following
statement:
INSERT INTO some_table VALUES( p1, p2, p3 )
when run in 4GL or an ESQL program, actually translates to a
prepare statement and an execute statement. The above insert
statement is the same as the following statements:
PREPARE p1 FROM "INSERT INTO some_table VALUES( ?, ?, ? )"
EXECUTE p1 USING p1, p2, p3
Let's consider an example of an insert statement inside a loop:
LOOP 1000 TIMES
INSERT INTO some_table VALUES( p1, p2, p3 )
END LOOP
In this example the statement is prepared 1000 times and then executed 1000 times. Recall
from above, that when a statement is prepared, it must be checked for syntax and
permissions and the statement is optimized. All of which requires the overhead of reading
the system catalogues and additional processing time. However, the statement only needs
to be prepared once which means that the above example is incurring a large amount of
overhead by preparing the statement an additional 999 times. A better way to perform this
insert is as follows:
PREPARE p1 FROM "INSERT INTO some_table VALUES( ?, ?, ? )"
LOOP 1000 TIMES
EXECUTE p1 USING p1, p2, p3
END LOOP
This example prepares the statement once and then executes it 1000 times removing the
overhead of unnecessarily preparing the statement 999 times.
This concept can be carried over into other statements as well, such as select and update
statements. Consider the following function which validates a state:
FUNCTION valid_state( entered_state )
DEFINE entered_state CHAR(2)
DEFINE not_used CHAR(2)
SELECT state INTO not_used FROM state_table
WHERE state = entered_state
IF( STATUS = NOTFOUND )
THEN
RETURN FALSE
ELSE
RETURN TRUE
END IF
END FUNCTION
If this function was called in an after field clause of an input statement the user might
notice a temporary pause in the movement of the cursor to the next field. This could be
due to the system performance in searching the state table. More likely, however, there will
be a pause while the program prepares the statement and then executes the select. A more
optimal way of writing this function would be:
FUNCTION init_valid_state()
DEFINE stmt CHAR(50)
LET stmt="SELECT state FROM state_table WHERE state =
?"
PREPARE p_valid_state FROM stmt
DECLARE c_valid_state CURSOR FOR p_valid_state
END FUNCTION
FUNCTION valid_state( entered_state )
DEFINE entered_state CHAR(2)
DEFINE not_used CHAR(2)
OPEN c_valid_state USING entered_state
FETCH NEXT c_valid_state INTO not_used
IF( STATUS = NOTFOUND )
THEN
RETURN FALSE
ELSE
RETURN TRUE
END IF
END FUNCTION
The function init_valid_state() would be called at the beginning of the program to prepare
the select statement and declare a cursor for it. Then when the valid_state() function is
called, it simply opens the cursor and fetches the value. This avoids the need to prepare the
cursor each time the function is called. There are drawbacks to preparing too many
statements. One is that the select is prepared every time the program is run, even if the
valid_state() function is never called. Also, if there are several statements being prepared
at the beginning of the program it will take longer to start up. There is also a limit to the
number of statements that can be prepared at any given time; version 4.01 has a limit of
256 and earlier versions have a limit of 64. However, statements can be free'ed to release
their resources and allow other statements to be prepared.
The most common question about prepares is "How many times does a statement have to
be run before it should be prepared?" There is no definitive answer to this question. The
thing to keep in mind is that any statement that is run will be prepared and then executed.
This means that preparing a statement that is only executed one time doesn't hurt
performance. When using the Standard-Engine, preparing statements will show a greater
performance improvement over those same statements in an OnLine environment. This is
due to the fact that the Standard-Engine does not have the buffering capabilities of
OnLine. Typically, OnLine will have most of the system catalogs buffered in shared
memory so the preparation of a statement will not require as many physical disk reads.
However, the preparation will still require processing time which can still be costly.
Preparing statements in any application is always a good idea. Overall performance is
increased when a statement that is run more than once is prepared once and then executed
many times. Obviously, the more a statement is executed, the more noticeable the increase
in performance will be when it is prepared. The practice of preparing statements whenever
possible will enhance performance before performance becomes a problem. Remember, it
is always easier to take out a few unnecessary prepare statements than to go back and add
the code to prepare 256 statements.
Kevin Fennimore, UCI Consulting, 2413 Arbor Lane, Hillsborough, NC 27278, Phone: 919-644-1073.
An Introduction to INFORMIX Database Security
by Lester Knutsen
Informix applies security at the database, table and column level. All security is based on a
users login name. Using Informix SQL, the creator of a database can give and remove
privileges. The SQL command to give access is the grant command and the revoke
command removes privileges.
Database Privileges
There are three levels of database privileges: DBA, Resource and Connect. Connect
privilege allows a user to access the database and add, modify and delete data. Resource
privilege allows a user all Connect privileges, and in addition, the ability to create new
tables, indexes and procedures. DBA privilege allows connect and resource privileges, and
also gives the user the authority to grant resource, connect or DBA to another user. A user
with DBA privilege can also drop any object in the database, and/or drop the database
entirely.
Connect privilege is sufficient for most database activity. A programmer or developer who
is creating new objects may require Resource privilege. DBA privilege is like the "super-user" of the Unix Operating System. Only a few selected users should be granted DBA
privileges.
Table Privileges
There are six basic table privileges and a seventh privilege that was added with Informix
release 5.0. The privileges are:
Select - view data in a table
Insert - add new rows of data to a table
Update- change existing rows of data in a table
Delete - remove rows of data from a table
Index - add indexes to a table
A user must also have the database privilege of
resource to index a permanent table.
Alter - modify the structure of a table.
Reference - ability to reference columns in referential
constraints
This is a new feature with Release 5.0.
Column Privileges
There are two column level privileges - select and update. In order for column level
privileges to be effective, the user should not have select or update privileges at the table
level. If the user has table level select or update privileges, these will override the column
level privileges. Column level privileges limit what a user can see or change to specified
columns of a record in a table.
Public Privileges
Informix uses the keyword public to identify privileges that apply to all users. To allow all
users on a system to access a database, the database privilege Connect would be granted to
public. To allow all users with access to a database to select data from a column, select
privileges would be granted to public. The public keyword is a very powerful tool to allow
open access to your data. By the same token, to prevent all users not specifically
authorized from connecting to a database and selecting data, public privileges can be
revoked entirely.
Owner Privileges
The Owner of an object (database, table, etc.) in a database has special privileges over that
object. The owner has all seven table level privileges. These cannot be revoked from the
user who created an object. This can create problems when a system moves from
development to production, or when the original owner of a table or database leaves the
organization. This privilege is also not easily identified. When you use the Informix SQL
"info" command to see table privileges, owners' privileges are not identified. Another user,
even a DBA, cannot revoke privileges from an owner. This requires that when a database
or table is created, the owner must be carefully decided. The owner of a table is also the
only one who can grant other users privileges for that table.
Informix Defaults
When an Informix database is created, the only database level privilege granted is DBA
privilege to the creator. When a table is created, by default public is granted select, update,
insert and delete privileges on that table. This means that unless the privileges are
changed, any person with database access can view and modify a table.
Database Security Model
Establishing database security takes planning. For best results, start when the database is
created. A production environment is different from an environment where programs and
the database are still under development. To facilitate database security, two environments
are needed - one for the development of software, and one for the production system.
Another key component of an effective database security model is definition and
assignment of the roles of Database Administrator, and Database Security Officer. These
roles may be combined and handled by one individual, or user login.
The Database Administrator
It is strongly recommended that an abstract user login be created as the Database
Administrator (DBA). This is essentially a super-user for a database. Establishing a
separate login minimizes the risk of a user accidently dropping a database they have
created. This is especially important in a production environment. The DBA login name
should be the creator of the database and all tables in the database. The DBA should be
the only user with alter privileges on any table in the database. As the owner of the
database and all tables within the database, the DBA can grant privileges to other users.
Lester Knutsen, Advanced DataTools Corporation, 4510 Maxfield Drive, Annandale, VA
22003, Phone: 703-256-0267 or Email: lester@access.digex.net.
Washington Area Informix User Group Sponsorship
The following options are available for companies who would like to participate in our
activities:
Presentation at Meetings
The user group meets every other month. We would like to have one presentation per
meeting from vendors that have products that work with Informix. If you would like to
make a presentation at a meeting please contact Lester Knutsen at 703-256-0267.
Newsletter Sponsorship
The newsletter is produced quarterly. Each mailing goes to over 500 users in the
Washington area. Companies sponsoring the newsletter may place a one page ad. If you
would like to sponsor an issue, please contact Lester Knutsen at 703-256-0267.
Forum 1994 - An Informix User Group One Day Event
We are planning a one day user group forum in November. John Petruzzi is the
coordinator for this event. If you would be interested in demonstrating your products,
speaking or conducting a seminar please contact John Petruzzi at 703-490-4598.
This Newsletter is published by the Washington Area Informix Users Group.
Lester Knutsen, President/Editor
Washington Area Informix Users Group
4216 Evergreen Lane, Suite 136, Annandale, VA 22003
Phone: 703-256-0267
lester@access.digex.net
|