create database finderr with buffered log;
create table errtable (errcol lvarchar(10000));
load from 'errtable.unl' insert into errtable;

The Blade Manager was then used to register the BTS module to the finderr database, and the index created with:
create index err_bts on errtable (errcol bts_lvarchar_ops)
using bts (delete='immediate') in bts_extspace;

#!/bin/sh
dbaccess finderr <<EOF
select errcol from errtable where bts_contains(errcol,"$*");
EOF
Here is an example of the script in action:

The BTS extension is a fast and flexible way to make freeform text searches available in applications. It is also under continued development; look out for enhancements in future releases of BTS, such as user-configurable stop words.
This article describes how opaque, row and collection types are stored on disk.
Storing Opaque Types
What is Byte Alignment on Opaque Types?
The first thing to consider when trying to understand byte alignment is how the data is parsed, so consider the following C structures:
typedef struct {
    int   i1;
    short s1;
    char  c1;
    char  c2;
} mystruct1_t;

typedef struct {
    int   i1;
    short s1;
    char  c1;
} mystruct2_t;

typedef struct {
    int   i1;
    char  c1;
    short s1;
} mystruct3_t;
The size requirement of each structure element is strictly defined, so the size of the example structures can easily be calculated: the members of mystruct1_t total 8 bytes, and the members of mystruct2_t and mystruct3_t total 7 bytes each. When these structures are in memory it is inefficient to parse them byte by byte; it is much better to parse by the word. A word is the term used for a group of bytes that is handled as a unit. Usually a word is the same size as an integer, but this can and does vary depending on the operating system. In a 32-bit operating system, each word and each integer is 32 bits, or 4 bytes, in length. In a 64-bit OS a word is 8 bytes long, as is the C 'long' type on most Unix and Linux systems, although the C 'int' type usually remains 4 bytes.
With a size of 8 bytes mystruct1_t is perfect; the structure fits exactly into two 4-byte words or one 8-byte word. However, mystruct2_t and mystruct3_t are both 1 byte short of a word boundary, so each of these structures is given a 1-byte pad when stored in memory so that it occupies a whole number of words.
So that should be easy, but life is never easy and things are a little more complicated. On 32-bit systems the internal alignments are 1, 2 or 4 bytes; on 64-bit systems the values are 1, 2, 4 and, depending on the system, 8 bytes. Because the alignment value depends on the data type, the padding can differ between structures of the same size but with different elements. The 'alignment rules' mean a structure component always has to sit at an offset that is appropriate for its data type.
| Type | Size (bytes) | Alignment (bytes) |
|---|---|---|
| Char | 1 | 1 |
| Short | 2 | 2 |
| Integer | 4 | 4 |
| Float | 4 | 4 |
| Double [1] | 8 | 8 (Windows), 4 (Linux) |

[1] Assuming an x86 processor
So how can the actual physical structures be viewed? The Unix command od can be used to see the structure on disk. For the structures defined earlier, the layout would look like the following on a 32-bit system.
| Structure | Word 1, bytes 0-3 | Word 2, byte 4 | byte 5 | byte 6 | byte 7 |
|---|---|---|---|---|---|
| mystruct1_t | integer | short | short | char | char |
| mystruct2_t | integer | short | short | char | pad |
| mystruct3_t | integer | char | pad | short | short |
Prior to IDS 9.x, Informix stored all information in a data stream. That is, no padding was done so that tuples could be stored as compactly as possible. When the data was read into memory, the SQL parser read the data and padded any extra bytes required for word alignment in memory.
In IDS version 9.x, data is not compacted for opaque data types; the opaque type is regarded as a black box, and the server performs no compaction or alignment on it. This makes it necessary for the designer of the opaque type to handle alignment issues.
Storing Row Types
First, there is no difference between the way named row type columns and unnamed row type columns are stored.
Since row type data is nothing more than a structure consisting of other data types, it is simply stored in the same way that its element data is stored. There is one exception: row types always begin on a 4-byte boundary, with at least one extra, unused byte at the start of the row type. This means that if the previous column ends on a 4-byte boundary, 4 unused bytes are added and the row type column starts on the next boundary.
When tables are created using the 'OF TYPE' syntax, the columns take the same names as the fields of the specified row type. Once the table is created the row type is no longer associated with the table.
Storing Collection Types
As long as the collection fits on a single page then it is stored like any other non-blob data type, i.e. in a tblspace data page. If the collection size is greater than a page then it will be written to a blobpage in the tblspace and a descriptor maintained in the data row to indicate the collection column is in a blob.
Although not always used, the space required for the descriptor is always allocated and written to the row. So when a collection is small, the collection data is stored as part of the descriptor. For large collections the descriptor points to the location of the blobpage that holds the collection data. A flag in the descriptor records how the data is stored, either in a blobpage or in the tblspace.
The oncheck output below shows a simple collection of three items ("nn", "ii", "tt") that fits easily in a single page:
oncheck -pD mytest:mycollection
TBLspace data check for mytest:informix.mycollection
page_type rowid length fwd_ptr
HOME 101 81 0
0:61 61 61 0 11 0 0 0 0 0 0 1 0 0 0 9 aaa.............
16:6e 6e 69 69 74 74 0 0 0 0 0 0 0 0 0 0 nniitt..........
32: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ................
48: 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 ................
64: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 62 62 ..............aa
80:61 a...............
TBLOB: tblspace bstamp flags
0 0 1 BLOBISNULL
The underlined area shows the mostly unused descriptor area, with the flag set to indicate that the collection values are stored in row, i.e. in the tblspace.
In the next example the collection data does not fit in row and is stored in a blobpage (0x1201ef), at row 0x101 within the page:
oncheck -pD mytest:mycollection
TBLspace data check for mytest:informix.mycollection
page_type rowid length fwd_ptr
HOME 301 83 0
0:63 63 63 63 63 63 63 63 63 0 d 0 0 0 0 0 ccccccccc.......
16: 0 0 0 2 0 0 0 0 ff ff 7 a7 0 12 1 ef ...........'...m
32: 0 0 0 0 0 0 9 a8 0 0 9 a8 0 0 1 1 .......(...(....
48: 0 0 0 0 0 0 0 0 0 0 fe a2 ff ff 0 8 ..........~"....
64: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ................
80:63 63 63 ccc.............
TBLOB: tblspace bstamp flags
1201ef -350 8
addr family vol size bstamp coloff flags type medium
101 0 0 2472 -350 1959 8 PNBLOB FIX_MAG
BLOBPAGE: addr size bstamp nbpage nbstamp
101 2008 -350 201 -346
201 464 -346 ffffffff 0
Further Reading
http://www-128.ibm.com/developerworks/library/pa-dalign/