 |
IDS Forum
Re: Two table join
Posted By: Jack Parker Date: Thursday, 28 July 2005, at 8:47 a.m.
In Response To: RE: dbimport error (Tomasz.Lato@breinwest.com.pl)
You have limited control over where sorting and grouping occur. You can attempt to make it more efficient 1 - increase the number of tempspaces up to 2 per cpu (3 per cpu is the max that the engine will use, but that's overkill) You could get fancy and set up RAMdisks as temp spaces if this is what you are trying to achieve. I've never bothered to try it. 2 - export PSORT_NPROCS=n where n is the number of parallel processes you want to SORT (and group) with.
80M of DS Memory is not a lot to work with. DS memory comes out of your virtual memory and you control how much of it you use with PDQPRIORITY - although the engine may not choose to use that much - and it certainly is not going to use it for a sort or group - it prefers to use the tempspace. DSMemory is most useful in a hash join. You monitor this memory with onstat -g mgm (or 'rgm' depending on your engine) - I am a little confused here, you started this thread in the xps group and now you've shifted it over to the ids group - so I am unclear as to which engine you have. If you are monitoring memory with 'mgm' then it's ids.
PDQPRIORITY also controls the degree of parallelism, but then read this article: http://www-128.ibm.com/developerworks/db2/library/techarticle/dm-0205parker/
It's a cookbook article on how to tune queries, it goes into PDQ as well.
Sounds like the query you want is:
select a.image_loc,a.access_domain,min(a.access_datetime) access_datetime from access a where a.access_success='F' and (a.image_loc is not null or not exists (select 0 from image where a.image_loc=image.image_loc)) and not exists (select 0 from access where a.key=access.key and a.access_success='F' and access_success='S') group by 1,2
That's not the prettiest of data structures against which to run this sort of query. If you've only got 100K rows, then it's not a big deal, but if you are going to scale this up, then I recommend not trying to search a history table of this nature for this sort of information. Better to keep track of current status of an image in a status table, which can be indexed and searched without joins.
cheers j.
---
Sorting in memory:
One of my Sql statement having an group by clause is using temporary dbspace to sort. I need this sorting to happen in memory itself. How can it be achieved. I tried setting the PDQ priority parameter but still the sorts are going to temp space.
PDQ memory:
I am bit confused from where the PDQ memory is allocate. Pls clarify it.
I have set shmvirtsize of 80M and DS_total_memory as 80M and MAXPDQPRIORITY and PDQPRIORITY as 100. when i run a query and issue a onstat -g mgm i find the mgm is showing 80M is occupied but in onstat -g ses that amount is not taken. It shows only around 125K(in memory column) in both onstat -g seg and onstat -g ses. I think the mgm allocates the memory from Virtual memory. so when i give a query with this settings, ideally there should not be any free memory in virtual memory.
----- Original Message ----- From: "pathri chakravarthy" <ckvarthy@yahoo.com> To: "Jack Parker" <vze2qjg5@verizon.net>; <ids@iiug.org> Sent: Thursday, July 28, 2005 1:02 AM Subject: Re: Two table join [39]
> Hi, > > Yes your right the statistics are not being updated. > > My requirement is to fetch all the failed image > locations. The image location may fail at one point in > time(status F) and later succeed with a status of 'S'. > I am doing a self join to get only the failed > ones(which are not succeeded in future). > > Also i want to know whether the server allocates > necessary memory as and when required or it allocates > some default memory for the joins and sorts. > > Is there a way i can control the allocation of memory > for joins and sorts. Also can these joins be performed > parallely. > > > > bye. > > --- Jack Parker <vze2qjg5@verizon.net> wrote: > > > I should have been clearer. > > > > 1 select nrows from systables where tabname = > > 'access'; Just see what the > > database thinks you have there for rows - bet it's > > small. This is where > > 'update statistics' will help. Your estimated cost > > value from the explain > > plan screams that stats are not updated. You might > > consider an update > > statistics medium for the columns involved. > > > > 2. The 'not in' is more expensive than an 'exists' > > clause, consider > > replacing it with: > > 'and not exists (select 0 from image where > > a.image_loc=image.image_loc)' > > > > 3. you don't indicate what is indexed. That would > > be useful information, as > > well as rowsize and fragmentation of both tables > > (access and image) > > > > 4. Why do you have the self join in there? You > > aren't using anything from > > the second copy of the table and aren't using the > > second table to filter > > anything. > > > > select > > a.image_loc,a.access_domain,min(a.access_datetime) > > access_datetime > > from access a > > where a.access_success='F' > > and (a.image_loc is not null or not exists > > (select 0 from image where > > a.image_loc=image.image_loc)) > > group by 1,2 > > > > Should do exactly the same thing. > > > > Fix these things first, then let's talk about the > > rest of it. > > > > j. > > ----- Original Message ----- > > From: "PARAMESHWAR...." <pcdudyala@yahoo.com> > > To: <xps@iiug.org> > > Sent: Wednesday, July 27, 2005 9:14 AM > > Subject: Two table join [35] > > > > > > > Hi, > > > I have a two table self join. The table has around > > 1 lakh rows. I did a > > set explain on the sql and found the cost was 4313. > > WHen i execute the query > > it is taking hours together to complete. I feel > > there might be some resource > > allocation issues. What can i do to allocate more > > memory for the join. > > > > > > the query is as follows. > > > > > > select > > a.image_loc,a.access_domain,min(a.access_datetime) > > access_datetime > > > from > > > access a,access b where a.image_loc=b.image_loc > > > and a.access_success='F' > > > and b.access_success='F' > > > and (a.image_loc is not null or a.image_loc not in > > (select image_loc from > > image) > > > ) > > > group by 1,2Estimated Cost: 4413 > > > Estimated # of Rows Returned: 1 > > > Maximum Threads: 1 > > > Temporary Files Required For: Group By > > > > > > 1) informix.a: INDEX PATH > > > > > > Filters: (informix.a.image_loc IS NOT NULL OR > > informix.a.image_loc != AL > > > L <subquery> ) > > > > > > (1) Index Keys: access_success access_domain > > (Parallel, fragments: ALL) > > > Lower Index Filter: informix.a.access_success = > > 'F' > > > > > > 2) informix.b: INDEX PATH > > > > > > Filters: informix.b.access_success = 'F' > > > > > > (1) Index Keys: image_loc (Parallel, fragments: > > ALL) > > > Lower Index Filter: informix.a.image_loc = > > informix.b.image_loc > > > NESTED LOOP JOIN > > > > > > Subquery: > > > --------- > > > Estimated Cost: 4361 > > > Estimated # of Rows Returned: 54600 > > > Maximum Threads: 1 > > > > > > 1) ciaadm.image: INDEX PATH > > > > > > (1) Index Keys: image_loc (Key-Only) (Parallel, > > fragments: ALL) > > > > > > > > > bye > > > > > > > > > > > > > > > > > > > ___________________________________________________________ > Yahoo! Messenger - NEW crystal clear PC to PC calling worldwide with voicemail http://uk.messenger.yahoo.com
Messages In This Thread
- dbimport error
Tomasz.Lato@breinwest.com.pl -- Monday, 20 January 2003, at 2:56 a.m.
- RE: dbimport error
Kokane Rajesh, IBM -- Monday, 20 January 2003, at 3:53 a.m.
- RE: dbimport error
Tomasz.Lato@breinwest.com.pl -- Monday, 20 January 2003, at 4:22 a.m.
- Re: Two table join
Jack Parker -- Thursday, 28 July 2005, at 8:47 a.m.
- Re: Two table join
Jack Parker -- Friday, 29 July 2005, at 7:29 a.m.
- RE: dbimport error
Colin Bull -- Monday, 20 January 2003, at 4:48 a.m.
IDS Forum is maintained by Administrator with WebBBS 5.12.
|
 |