Hi,
First, I should say that I don't know IDS 11 (only up to IDS 10), and the checkpoint algorithms and monitoring have changed between these versions.
But I would have a look at two things:
onstat -R |tail
That will show whether there is I/O writing dirty pages to disk.
More interesting should be:
onstat -u | grep X
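That will show threads in critical sections (strictly, those with an X in the fifth position of the flags column; a plain grep also matches an X anywhere else on the line). Maybe you can find some that are responsible for the block.
If it is not an internal thread (btree cleaner, backup, ...), you can
look further at the corresponding session (column 3 of the onstat -u output).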
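Since a plain grep X also matches an X in other places (for example the rollback flag in the first flags position, or an X in a user name), here is a minimal sketch of a stricter filter. It assumes the usual onstat -u layout, where flags is the second column with the critical-section X in its fifth position and sessid is the third column. I have not checked this against IDS 11, and the function name critical_sessions is made up for illustration.

```shell
# Hypothetical helper: reads onstat -u output on stdin and prints the
# session id (column 3) of every thread whose flags column (column 2)
# has an X in position 5, i.e. the thread is in a critical section.
# Assumes the column layout described above; verify on your version.
critical_sessions() {
    awk 'substr($2, 5, 1) == "X" { print $3 }'
}
```

Used as onstat -u | critical_sessions; each session id it prints can then be inspected with onstat -g ses followed by that id.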
With IDS 10 we have had some issues with long checkpoints caused by INSERTs into tables with many changes (insert/delete) and very large indexes. We have to rebuild these indexes quite often to avoid excessive checkpoints.
Regards,
Andreas Kutsche
-------------------------------------------
SPAR Österreichische Warenhandels-AG
Hauptzentrale
A - 5015 Salzburg, Europastrasse 3
FN 34170 a
Tel: +43 662 4470 24223
Mobile: +43 664 6259575
E-Mail: Andreas.KUTSCHE@spar.at
Internet: http://www.spar.at
-------------------------------------------
-----Original Message-----
> From: ids-bounces@iiug.org [mailto:ids-bounces@iiug.org] On Behalf Of Tim Ertl
> Sent: Tuesday, September 30, 2008 02:47
> To: ids@iiug.org
> Subject: RE: Informix 11.10FC2W5 hangs & crashes [13522]
>
> Obnoxio the Clown,
> Wow, the best question in the world is the one nobody can answer. The
> customer (me) always says, Nothing in the past 6 weeks? But even I have to
> ask what changed. We have had an increase in database activity and one of
> the tables had a new field added. Three programs had some business logic
> changed, but even after backing off all changes prior to the first
> Assertion we still had 5 more problems. So it looks like the increase in
> load, I guess.
>
> I am told by IBM that the fix is to add a Btree scanner called alice. I
> read a little about it and it looks more like a performance tweak. I cannot
> understand how that could crash a whole server twice and hang it 4 times,
> all 6 times on a Blocked Checkpoint. I am one of the "admin free" kinda
> sites. We set things up and they run for 8 years. Then I upset the cart by
> installing 11.10FC2 on a new Sun 6 weeks ago and now it goes bump, mostly
> after the onbar backups have finished (3 times it did that).
>
> If I get a full night's sleep tonight I might be willing to accept Alice,
> but I am still skeptical. I wish I had asked more about the one-line fix
> to the onconfig file. I may have to go to another user group meeting to
> get that answer.
>
> Tim Ertl
> 413-442-9000 x6211
>
> -----Original Message-----
> From: ids-bounces@iiug.org [mailto:ids-bounces@iiug.org] On Behalf Of
> Obnoxio The Clown
> Sent: Monday, September 29, 2008 7:07 PM
> To: ids@iiug.org
> Subject: Re: Informix 11.10FC2W5 hangs & crashes [13521]
>
> Tim Ertl wrote:
> > We have about 6 or 7 MAJOR failures of Informix 11.10FC2W5 in the past
> > few days. I have a pmr going and a tech support person working on it for
> > several days now.
> >
> > I am hoping someone else might have seen this problem and fixed it
> > themselves. I am all Googled out.
> >
> > We have had Assertions and HANGS on Blocked Check Points. Sometimes we
> > can stay up for only 3 hours other times we stay up for a whole day.
> > This morning alone we hung at 6am and then again at 9. Now we are quiet
> > until 17:30. We have restricted usage to Medical Emergencies Only to
> > prevent any unnecessary usage. We moved some stuff over to MS SQL
> > (something some people around here looked for excuses to do).
> >
> > We have been running 11.10FC2 since Aug 16th with no problems. Then all of
> > a sudden we had an assertion at 3 o'clock last Thursday morning. Since
> > then we have had about 6/7 Informix failures.
> > IBM Support had us upgrade to W5 and since we did that we have had 4
> > more problems. IBM has added an onmode -C alice 6 for some reason I do
> > not understand. We cannot continue like this.
> >
> > ALL of these problems start off with about an hour of the informix log
> > saying Txns blocked:1 and checkpoint durations greater than 0. We used
> > to have zero all the time for these. During the hour before lock up we
> > have slow periods of up to 5 minutes then all of a sudden things free up
> > and we go back to normal, on and off hangs. Then after about an hour of
> > this the system freezes on blocked check point.
> >
> > Physical log is at 1% used and there are 7 logical logs and only one is
> > in use. The others are backed up.
> >
> > I have tons of onstat -a's. Has anyone had some experience with blocked
> > checkpoints that might be able to shed some light on this?
> >
> > Why would informix go belly up all of a sudden like this?
> >
> > Running on a Sun Sparc T5120, 32 GB memory, 64 virtual processors (cores
> > or whatever they call them). Tons of disk space now. We upgraded Aug
> > 16th and things have been great, then crash!
> >
> > Thanks for any suggestions.
> > We may need to hire special help!
> >
>
> What has changed?
>
> --
> Cheers,
> Obnoxio the Clown
>
> http://obotheclown.blogspot.com
>
> *******************************************************************************
> Forum Note: Use "Reply" to post a response in the discussion forum.