데이터베이스를 사랑하는 사람들의 모임 데이터베이스 사랑넷

DB2 Q&A 1046 게시물 읽기

No. 1046

쿼리문좀 봐주세요~!
작성자	이문선(yurina99)	작성일	2006-07-11 15:13	조회수	7,955

select * from
(
select * from tableA
where cd1>'00' and cd2>'00'
) bb, tableB tb
where not exists
(
select npdt.no
from
(
select indexed, indexed2, code from
(
select indexed, indexed2, code from CodeA
union all
select indexed, indexed2, code from CodeB
)aa
group by indexed, indexed2, code
)aa
where aa.index1 = bb.index1
and aa.index2 = bb.index2
and aa.index3 = bb.index3
)and bb.no = tb.no

-----------------------------------

tableA : 4400만건
tableB : 14000건
CodeA + CodeB : 4500건

A테이블중 cd1>'00' and cd2>'00'중에

codeA, codeB 테이블의 코드값과 일치하지 않는 데이터중에

tableB의 넘버와 일치하는 경우를 뽑고자 하는데..

전체 건수가 너무 많아서...시간이..너무 많이 걸리더군요.

먼가 효과적인 다른방법은 없을까요?

이 글에 대한 댓글이 총 10건 있습니다.

select *

from

(

select *

from tableA bb

where not exists

(

select 'x'

from

(

select indexed, indexed2, code from

(

select indexed, indexed2, code from CodeA

union all

select indexed, indexed2, code from CodeB

)aa

group by indexed, indexed2, code

)aa

where aa.index1 = bb.index1

and aa.index2 = bb.index2

and aa.index3 = bb.index3

)

)cc, tableB tb

where cc.no = tb.no

tableA에서 먼저 code테이블의 not exist인 경우를 찾고 그것의 결과를 tableB와 조인하는 방법으로 순서를 바꿔봤습니다. 순서를 바꾼 이유는 tableA와 tableB를 조인하면 상대적으로 not exist를 위한 full scan이 줄어들것이라 생각하여 먼저 기준 집합을 줄인후 full scan을 하도록 유도하기 위함이었습니다.

자잘한 에러를 고치시면서 실제 테이블이 있는 DB에서 돌려보시기 바랍니다.

이대현(root)님이 2006-07-11 16:07에 작성한 댓글입니다.
이 댓글은 2006-07-11 16:08에 마지막으로 수정되었습니다.

위방법으로도 해보았으나

같은 실행계획이 나오면서

수행시간도 거의 비슷하게 나오더라구요..

이문선(yurina99)님이 2006-07-11 16:17에 작성한 댓글입니다.

아래 2 sql의 값이 궁굼합니다.

1. select count(*) from tableA
where cd1>'00' and cd2>'00'

2. select count(*) from
(
select * from tableA
where cd1>'00' and cd2>'00'
) bb where not exists
(
select npdt.no
from
(
select indexed, indexed2, code from
(
select indexed, indexed2, code from CodeA
union all
select indexed, indexed2, code from CodeB
)aa
group by indexed, indexed2, code
)aa

where aa.index1 = bb.index1
and aa.index2 = bb.index2
and aa.index3 = bb.index3
)

도님이 2006-07-11 16:29에 작성한 댓글입니다. Edit Ｘ

1. 약3500만건

2, 약190만건

select * from
(
select * from tableA
where cd1>'00' and cd2>'00'
) bb

where not exists
(
select npdt.no
from
(
select indexed, indexed2, code from
(
select indexed, indexed2, code from CodeA
union all
select indexed, indexed2, code from CodeB
)aa
group by indexed, indexed2, code
)aa
where aa.index1 = bb.index1
and aa.index2 = bb.index2
and aa.index3 = bb.index3
)

실제로 이부분에서 시간을 많이 차지하는것 같습니다.

select npdt.no
from

select * from tableA
where cd1>'00' and cd2>'00'
) bb,
(
select indexed, indexed2, code from
(
select indexed, indexed2, code from CodeA
union all
select indexed, indexed2, code from CodeB
)aa
group by indexed, indexed2, code
)aa
where aa.index1 = bb.index1
and aa.index2 = bb.index2
and aa.index3 = bb.index3

여기까지만 해서는 얼마 안걸리는데

not exists부분에서 너무 많은 시간을 소요하는듯 합니다.

이문선(yurina99)님이 2006-07-11 16:50에 작성한 댓글입니다.
이 댓글은 2006-07-11 17:27에 마지막으로 수정되었습니다.

tablea에 index1+index2+index3 index가 있는지와 plan에서 이 인덱스를 사용하는지 궁굼합니다.

도님이 2006-07-11 18:54에 작성한 댓글입니다. Edit Ｘ

현재는 세개로 묶여진 인덱스는 없으며

인덱스를 추가할 상황이 못되어 추가도 어려운 실정입니다..

이문선(yurina99)님이 2006-07-12 09:24에 작성한 댓글입니다.

문제는 tableA 4400만건 full scan에 있는것같습니다.

이 sql은 자주 실행하지 않으실것 같은데..

tableA에서 필요 컬럼들만 export를 받아 tableA_temp라는 테이블에 import해서 이 테이블로 sql을 실행해보세요. tableA보다 tableA_temp테이블의 크기가 훨씬 작아진다면 함 고려해보세요.

tableA_temp테이블에는 index1+index2+index3 로 index를 생성해주시고 runstats를 실행해주세요. tableA 테이블의 과거 데이터의 변경이 없다면 기간별로 tableA_temp에 tableA의 데이터를 insert해주시면서 사용해보시는것도 ..

월배치 테이블을 만들수는 없나요?

도님이 2006-07-12 09:36에 작성한 댓글입니다.
이 댓글은 2006-07-12 10:04에 마지막으로 수정되었습니다. Edit Ｘ

현재는 tableA의 인덱스를 수정하거나 결과를 따로 temp 테이블에 넣는 등의 일을 할수 없는..상황이라...^^;;;

이것저것 해보고 주위 분들께 물어보았습니다.

select count(*) from
(
select * from tableA where cd1>'00' and cd2>'00'

except

select *
from
(select * from tableA where cd1>'00' and cd2>'00')bb,
(
  select indexed, indexed2, code from
  (
   select indexed, indexed2, code from CodeA
   union all
   select indexed, indexed2, code from CodeB
  )aa group by indexed, indexed2, code
)aa
where aa.index1 = bb.index1
and aa.index2 = bb.index2
and aa.index3 = bb.index3

)aa, tableB tb
where aa.no = tb.no

이런식으로 except를 쓰니까 시간이 20-30분안에 결과가 나오더라구요..

음...정확한 이유는 잘 모르겠습니다..

전체에서 비율이 적은 결과를 not exist로 구하는것과 except로 구하는것의 차이는..좀더 생각해 봐야겠습니다..

이문선(yurina99)님이 2006-07-12 10:53에 작성한 댓글입니다.
이 댓글은 2006-07-12 10:55에 마지막으로 수정되었습니다.

select count(*) from
(
select * from tableA where cd1>'00' and cd2>'00'

==>

select * from
(
select * from tableA where cd1>'00' and cd2>'00'

이런식으로 변경을 해도 속도가 빠른가요?

도님이 2006-07-12 11:02에 작성한 댓글입니다. Edit Ｘ

위 상황과 똑같이 하려했으나...결과가 너무 많은관계로

똑같은 tableA 에서

select count(*) from
(
select * from tableA where cd1='00' and cd2>'00'

==>

select * from
(
select * from tableA where cd1='00' and cd2>'00'

이경우로 해보았습니다.

전체 4400만건중 540건이 출력됩니다.

이것도 전에 수시간을 잡아먹었는데..

출력과, 카운트 모두 7-8분정도의 시간을 소요하였습니다.

이문선(yurina99)님이 2006-07-12 11:07에 작성한 댓글입니다.
이 댓글은 2006-07-12 11:17에 마지막으로 수정되었습니다.

[Top]

No.	제목	작성자	작성일	조회
1049	schema 생성 [2]	김민영	2006-07-14	6957
1048	instance를 추가하면.. [2]	김민영	2006-07-13	7765
1047	인스턴스 변경 [2]	김민영	2006-07-12	6480
1046	쿼리문좀 봐주세요~! [10]	이문선	2006-07-11	7955
1045	maxappls 값 변경하는 방법 [1]	궁금이	2006-07-10	6518
1044	db2 connect 시 에러 [1]	궁금이	2006-07-10	7376
1043	TOAD 평가판 사용하시는분 실행이..	하늘	2006-07-07	7343