오늘 있었던 일 입니다. 상기님의 도움을 받아 해결해 보려고
노력했는데 결국 시간에 쫓기기도 해서 이 DB 클러스터는 포기하고 새로
initdb해서 덤프 했던걸 리스토어 하고 있습니다.. ㅠ_ㅠ;;;
- 설정 상태
- fsync=off
- checkpoint_segments = 16
- checkpoint_timeout = 3600
- checkpoint_warning = 0
- work_mem = 2048
- maintenance_work_mem = 65536
- wal_buffers =256
- autovacuum = off
- 시스템
- CPU: P4 3Ghz 싱글(2개로 잡힘)
- MEM: 512MB
- pgsql: 8.1.3
- 상황
- BEGIN;
- 약 58만개의 row 삭제
- 새로운 제약조건 생성
ALTER TABLE "working"."tb_object"
ADD CONSTRAINT "tb_object_mid" FOREIGN KEY ("mid")
REFERENCES "public"."tb_member"("mid")
ON DELETE CASCADE
ON UPDATE CASCADE
DEFERRABLE
INITIALLY DEFERRED;
- COMMIT;
- VACUUM FULL;
- EXPLAIN SELECT 로 인덱스 사용 확인
- VACUUM ANALYZE VRBOSE;
- 시스템 다운
- 리부팅
- pgsql 시작 -> startup 프로세스가 한참 작업하다가
죽어버림(시그널 6)
[postgres@onpamdb01 /]$ WARNING: autovacuum not started because of misconfiguration
HINT: Enable options "stats_start_collector" and "stats_row_level".
LOG: database system was interrupted while in recovery at 2006-04-25 16:34:19 KST
HINT: This probably means that some data is corrupted and you will have to use the last backup for recovery.
LOG: checkpoint record is at 2/4112B198
LOG: redo record is at 2/4110001C; undo record is at 0/0; shutdown FALSE
LOG: next transaction ID: 1314926; next OID: 759566
LOG: next MultiXactId: 1; next MultiXactOffset: 0
LOG: database system was not properly shut down; automatic recovery in progress
LOG: redo starts at 2/4110001C
PANIC: block 13 unfound
LOG: startup process (PID 2788) was terminated by signal 6
LOG: aborting startup due to startup process failure
- fsck -> 결과 정상
- 상기님의 도움으로 pg_resetxlog 수행
- 역시 시작 안됨
- pg_resetxlog -f 로 강제로 리셋 수행
- 시작 됨
- 문제의 DB로 접근 -> 에러나며 접근 안됨(psql과
vacuumdb명령)
FATAL: index "pg_class_oid_index" is not a btree
psql: FATAL: index "pg_class_oid_index" is not a btree
- base/1 내의 해당 인덱스의 OID를 찾아서 복사
- pg_ctl restart -> 다시 접근 -> 다른 인덱스가
문제됨
- 문제되는 인덱스를 모두 찾아 template1에서 복사해줌
- 새로운 에러와 함께 접근시 아래와 같은 프로세스 상태가
되며 서버가 재시작됨
postgres 3091 1.7 1.2 32004 6372 pts/0 S 17:12 0:00 /usr/local/pgsql8.1.3/bin/postmaster
postgres 3099 0.0 1.2 32004 6412 pts/0 S 17:12 0:00 \_ postgres: writer process
[postgres@onpamdb01 pg_xlog]$ vacuumdb MagicHome
LOG: server process (PID 3141) was terminated by signal 11
LOG: terminating any other active server processes
vacuumdb: could not connect to database MagicHome: server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
LOG: all server processes terminated; reinitializing
[postgres@onpamdb01 pg_xlog]$ LOG: database system was interrupted at 2006-04-25 17:13:59 KST
LOG: checkpoint record is at 2/590000E8
LOG: redo record is at 2/590000E8; undo record is at 0/0; shutdown TRUE
LOG: next transaction ID: 1316400; next OID: 759566
LOG: next MultiXactId: 1; next MultiXactOffset: 0
LOG: database system was not properly shut down; automatic recovery in progress
LOG: record with zero length at 2/5900012C
LOG: redo is not required
LOG: database system is ready
LOG: transaction ID wrap limit is 1075055134, limited by database "MagicHome"
- 도저히 해결 안됨;;
- 결과
- VACUUM ANALYZE 도중 절대 크래쉬 나면 안된다;;;;
|