Logging
ACID
ACID transactions are:
- Atomic: Either the whole or the none of the transaction is done.
- Consistent: Database constraints reserved. For example, a + b = 10, if a changed, b also changes.
- Isolated: It appears to the user that only one process is executing at one time.
- Durable: Effects of a process survive a crash.
Correctness Principle
If a transaction executes in the absence of any other transactions or system errors, and it starts with the database in a consistent state, then the database is also in a consistent state when the transaction ends.
State: a DB has state means a value for each of its elements.
Consistent state: certain state satisfies all constraints.
Another description of correctness principle:
- a transaction is atomic, and the complete execution of a single transaction is from one consistent state to another consistent state. That is to say, if only part of a transaction executes, then db state may not be consisitent.
- multiple transactions execute at the same time may lead to inconsistency unless using concurrecy control.
So, a transaction is a collection of actions that preserve consistency.
If T starts with consistent state and executes in isolation, then T leaves with consistent state.
Three address spaces:
- Local address space of transaction (Memory)
- Buffer (Memory)
- Space of disk
INPUT(X): block containing X disk to buffer/memory
OUTPUT(X): block containing X buffer to disk
READ(X, t): do input(X) if necessary; read value of x in buffer to t in local address space of transaction.
WRITE(X, t): do output(X) if necessary; write t in local address space of transaction to the value of x in buffer.
Undo Logging
only update record:
- transaction T has changed database element X and its former value was v
The change reflected by an update record normally occurs in memory, not disk.
- the log record is a response to a WRITE action, not an OUTPUT action
undo log must be written to disk in the following order:
- The log records that indicating changed database elements.
- The changed database elements themselves.
- The COMMIT log record.
So we got 2 rules:
U1:
U2:
In order to force log records to disk:
- The log manager needs a flush-log command that tells the buffer manager to copy to disk any log blocks that have not previously been copied to disk.
- FLUSH LOG
When a transaction begins, a
Then, execute OUTPUT(A), OUTPUT(B), … Only after the value of A, B, … is written to disk, a log
Then FLUSH LOG (after execute OUTPUT B, let’s assume that B is the last one) and
Recovery Rules
Incomplete transactions must be undone!
Scan the undo log from the end:
Let S = set if transactions with
in log but no or record in log For each
in log, in reverse order (latest to earliest) do: If Ti in S then
WRITE(X, v)
,OUTPUT(X)
For each Ti in S do: write
to log
Simple checkpointing
In a simple checkpoint,
Stop accepting new transactions.
Wait until all currently active transactions commit or abort and have written a COMMIT or ABORT record on the log.
Flush the log to disk.
Write a log record
, and flush the log again. Resume accepting transactions.
Problem: must shut the system while the ckpt is being made
Nonquiescent checkpointing
Nonquiescent checkpointing
Write a log record
and flush the log.Wait until all of T1, …, TK commit or abort, but do not prohibit other transactions from starting
When all of T1, …, TK have completed, write a log record
and flush the log
Two cases:
If we first meet an
, then we know that all incomplete transactions began after the previous . Scan backwards as far as If we first meet an
, then the crash occurred during the checkpoint
A general rule:
Once an
Redo Logging
Problem for undo logging: First write all its changed data to disk then commit a transaction — too many disk I/Os.
Save disk I/Os: Let changed data reside only in main memory for a while.
Redo logging requires the COMMIT record appear on disk before any changed values reach disk.
Redo logging rules
R1: Before modifying any database element X on disk, it is necessary that all log records pertaining to this modification of X, including both the update record
- The log records indicating changed db elements.
- The COMMIT log record.
- The changed db elements themselves.
First COMMIT then OUTPUT.
To recover, using a redo log, after a system crash:
Identify the committed transactions.
Scan the log forward from the beginning. For each log record <T, X, v> encountered:
a) If T is not a committed transaction, do nothing.
b) If T is committed, write value v for database element X. (That’s because no COMMIT no values on disk)
For each incomplete transaction T, write an
record to the log and flush the log.
The steps to perform a nonquiescent checkpoint of redo log:
Write a log record
, where T1, …, Tk are all the active (uncommitted) transactions, and flush the log Write to disk all database elements that were written to buffers but not yet to disk by transactions that had already committed when the START CKPT record was written to the log
Write an
record to the log and flush the log
Undo/redo Logging
The update log record:
- the former value is v, the new value is w
UR1: Before modifying any db element X on disk because of changes made by some transaction T, it is necessary that the update record
UR2: A
The undo/redo recovery policy is:
- Redo all the commited transactions in the order earliest-first.
- Undo all the incomplete transactions in the order latest-first.
系统故障造成数据库不一致状态的原因
未完成事务对数据库的更新已写入数据库
已提交事务对数据库的更新还留在缓冲区没来得及写入数据库
恢复方法
\1. Undo 故障发生时未完成的事务
\2. Redo 已完成的事务
系统故障的恢复由系统在重新启动时自动完成,不需要用户干预
系统故障的恢复步骤
- 正向扫描日志文件(即从头扫描日志文件)
重做(REDO) 队列: 在故障发生前已经提交的事务
这些事务既有BEGIN TRANSACTION记录,也有COMMIT记录
撤销 (Undo)队列:故障发生时尚未完成的事务
这些事务只有BEGIN TRANSACTION记录,无相应的COMMIT记录
- 对撤销(Undo)队列事务进行撤销(UNDO)处理
反向扫描日志文件,对每个UNDO事务的更新操作执行逆操作
即将日志记录中“更新前的值”写入数据库
- 对重做(Redo)队列事务进行重做(REDO)处理
正向扫描日志文件,对每个REDO事务重新执行登记的操作
即将日志记录中“更新后的值”写入数据库
A nonquiescent checkpoint is somewhat simpler for undo/redo logging than for other logging methods
Write a
record to the log, where T1, …, Tk are all the active transactions, and flush the log Write to disk all the buffers that are dirty; i.e., they contain one or more changed database elements. Unlike redo logging, we flush all dirty buffers, not just those written by committed transactions
Write an
record to the log and flush the log
Archiving
Maintaining a copy of the db separate from the db itself.
- shut down the db for a while
- make a backup copy on some storage medium
- Store the copy in some remote secure location
Two levels of archiving:
- a full dump, in which the entire db is copied
- an incremental dump, in which only those db elements changed since the previous full or incremental dump are copied
Concurrency Control
The process of assuring that transactions preserve consistency when executing simultaneously.
The scheduler takes read/write requests from transactions and executes them in buffers or delay them.
Serial and Serializable Schedules
A schedule is serial if its actions consist of all the actions of one transaction, then all the actions of another transaction, and so on.
The correctness principle tells us that every serial schedule will preserve consistency of the db state.
A schedule is serializable if there is a serial schedule S’ such that for every initial db state, the effects of S and S’ are the same.
Conflicts: A pair of consecutive actions in a schedule such that, if their order is interchanged then the behaviour of at least one of the transactions involved can change.
本文作者 : preccrep
原文链接 : https://preccrep.github.io/2021/06/24/Logging/
版权声明 : 本博客所有文章除特别声明外,均采用 CC BY-NC-SA 4.0 许可协议。转载请注明出处!