Main Memory Database Systems
Behnam Dezfouli, Fall 2006
Main Memory Database
Uses physical memory as primary storage, and probably a disk subsystem for backup
Primary copy lives permanently in memory
Main Memory Database
Main Memory Databases Vs. Disk Resident Databases
MMDB: the primary copy of the data lives permanently in main memory
DRDB: the primary copy of the data is permanently disk resident
MMDB: there can be a backup copy, resident on disk
DRDB: data can be temporarily cached in main memory to speed up access
Main Memory Databases Vs. Disk Resident Databases (Differences)
Access time for main memory is orders of magnitude less than for disks
MMDB storage is normally volatile
Nonvolatile memory is possible (at some cost)
Disks have a high fixed cost per access, independent of the amount of data retrieved (block oriented)
Sequential access is not important in MMDB, while it is important in DRDB
MMDB data are more vulnerable to software errors, since they can be directly accessed by the processor
Why and when to use MMDB?
Motivations:
In some cases the DB is of limited size
Short access/response time and better transaction throughput are required
Applications handling high data traffic, e.g., routers
Real-time applications: telecommunication, radar tracking
Why and when to use MMDB?
When the DB does not fit in memory:
There might be different classes of data (e.g., satellite image data)
Hot data: frequently accessed, low volume
Cold data: accessed rarely, voluminous
So partition the data into logical DBs
Possibility of migration
In telecommunication:
Routing table: Hot
Fast Path
Customer's data: Cold
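The hot/cold split above can be sketched as two logical stores with migration between them. This is only an illustration under stated assumptions: the class name `TieredStore`, the access-count promotion rule, and the plain dict standing in for the disk-resident DB are all hypothetical and do not come from any particular system.

```python
class TieredStore:
    """Toy two-tier store: hot data in memory, cold data on 'disk'.

    Hypothetical sketch: the cold tier is a plain dict standing in for a
    disk-resident database, and the promotion rule is an assumption.
    """

    def __init__(self, promote_after=3):
        self.hot = {}            # memory-resident logical DB
        self.cold = {}           # stand-in for the disk-resident logical DB
        self.hits = {}           # access counts for cold keys
        self.promote_after = promote_after

    def put_hot(self, key, value):
        self.hot[key] = value

    def put_cold(self, key, value):
        self.cold[key] = value

    def get(self, key):
        if key in self.hot:
            return self.hot[key]
        value = self.cold[key]   # KeyError if the key is in neither tier
        self.hits[key] = self.hits.get(key, 0) + 1
        if self.hits[key] >= self.promote_after:
            # Migration: frequently accessed data moves to the hot tier.
            self.hot[key] = self.cold.pop(key)
            del self.hits[key]
        return value


store = TieredStore(promote_after=2)
store.put_hot("route:10.0.0.0/8", "port-3")
store.put_cold("customer:42", {"name": "A. Customer"})
store.get("customer:42")
store.get("customer:42")         # second access triggers migration
```

After the second read, the customer record lives in the hot tier. A real system would also need a rule for demoting data that has gone cold.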
MMDB Vs. DRDB with a very large cache
A DRDB with a very large cache does not take full advantage of memory
Index structures are designed for disk access (B-Tree)
Data is accessed through the buffer manager
Better to use memory addresses directly (optimization)
Special hardware for nonvolatile memory? Is that reliable?
Even if special hardware can enhance main memory reliability (UPS, battery, error detection), periodic backup is necessary
Several factors:
1. MMDB is vulnerable due to direct access by the processor; its content is lost if the system crashes
2. If a single memory board fails, the entire machine must be powered down, losing all data. A recent backup is needed
3. Battery-backed memory and UPS are active devices. Disks are passive devices: they don't have to do anything to remember data
Concurrency Control
Lock based, used in MMDB
Faster transaction completion; lock contention may not be as important as in DRDB
The advantage of small lock granules (reduced contention) is removed
If the entire DB is a single lock granule:
serial execution (desirable), like TPK at Princeton, but not practical with long transactions
Fewer cache flushes
Conventional systems: hash table for locked objects
In MMDB: IMS uses two bits in each object
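Keeping lock state inside each object, rather than in a separate hash table, might look like the sketch below. The slide only says "two bits"; the meaning given to each bit here (one for "locked", one for "someone is waiting") is an assumption. The code is single-threaded, so a real system would need an atomic test-and-set and a separate list of waiting transactions.

```python
LOCKED = 0b01    # bit 0: object is exclusively locked (assumed meaning)
WAITERS = 0b10   # bit 1: at least one transaction is waiting (assumed meaning)


class Record:
    def __init__(self, value):
        self.value = value
        self.lock_bits = 0       # lock state lives inside the object itself


def try_lock(record):
    """Try to take an exclusive lock; return True on success."""
    if record.lock_bits & LOCKED:
        record.lock_bits |= WAITERS   # remember that someone is waiting
        return False
    record.lock_bits |= LOCKED
    return True


def unlock(record):
    """Release the lock; return True if waiters need to be woken."""
    had_waiters = bool(record.lock_bits & WAITERS)
    record.lock_bits = 0
    return had_waiters


r = Record("balance=100")
first = try_lock(r)      # lock acquired
second = try_lock(r)     # already locked, so the waiters bit is set
wake = unlock(r)         # reports that a waiter should be woken
```

In the common uncontended case, locking costs one bit test and one bit set on data the transaction is about to touch anyway, with no hash-table lookup.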
Commit Processing
Necessary to have a backup and a log (in stable storage)
Before commit, activity records are written to the log; this affects response time and throughput (the log becomes a bottleneck)
Solutions:
Small amount of stable main memory for the log (special processor), as in MARS and MM-DBMS
Pre-committing: releasing locks as soon as log records are in the log, without waiting for propagation to disk
Group commit: reduces the log bottleneck; a single operation commits multiple transactions
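Group commit can be illustrated with a toy log that collects commit records and writes them together. This is a sketch under assumptions, not any system's actual implementation: `GroupCommitLog`, the fixed group size, and the list standing in for stable storage are all hypothetical.

```python
class GroupCommitLog:
    """Toy log that makes several transactions durable with one flush."""

    def __init__(self, group_size=3):
        self.group_size = group_size
        self.pending = []        # (txn_id, record) pairs not yet flushed
        self.stable = []         # stand-in for the log on stable storage
        self.flushes = 0         # number of simulated disk writes

    def commit(self, txn_id, record):
        """Queue a commit record and flush when the group is full.

        Returns the ids of the transactions made durable by this call.
        """
        self.pending.append((txn_id, record))
        if len(self.pending) >= self.group_size:
            return self.flush()
        return []

    def flush(self):
        """Write all pending records in a single operation."""
        if not self.pending:
            return []
        self.stable.extend(self.pending)
        self.flushes += 1
        committed = [txn_id for txn_id, _ in self.pending]
        self.pending = []
        return committed


log = GroupCommitLog(group_size=3)
log.commit("T1", "x=1")
log.commit("T2", "y=2")
done = log.commit("T3", "z=3")   # third commit fills the group
```

Three transactions become durable with one simulated write. A practical version would also flush on a timer so that a partly filled group does not wait indefinitely.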
MMDB Access Methods
Index structures like B-Trees are designed for block-oriented storage; they are not useful in MMDB
Hashing:
Fast lookup and update
Not as space efficient as a tree
MMDB trees need not be short and bushy
Trees such as the T-Tree are designed for MMDB
Index structures store pointers to the indexed data, which eliminates the problem of storing variable-length fields in an index, and so saves space
Pointers are of fixed, short length
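The point about storing pointers instead of key copies can be sketched with a small sorted index. This is not a T-Tree; it is a plain sorted array with binary search, chosen only to keep the example short. `PointerIndex` and `Record` are hypothetical names, and Python object references stand in for memory pointers.

```python
class Record:
    def __init__(self, name, city):
        self.name = name         # variable-length field, stored once
        self.city = city


class PointerIndex:
    """Sorted index whose entries are references to records, not key copies."""

    def __init__(self, key):
        self.key = key           # function extracting the key from a record
        self.entries = []        # fixed-size references, kept sorted by key

    def _position(self, k):
        # Binary search; each comparison follows a pointer to read the key.
        lo, hi = 0, len(self.entries)
        while lo < hi:
            mid = (lo + hi) // 2
            if self.key(self.entries[mid]) < k:
                lo = mid + 1
            else:
                hi = mid
        return lo

    def insert(self, record):
        self.entries.insert(self._position(self.key(record)), record)

    def find(self, k):
        i = self._position(k)
        if i < len(self.entries) and self.key(self.entries[i]) == k:
            return self.entries[i]
        return None


index = PointerIndex(key=lambda r: r.name)
for rec in (Record("Mori", "Osaka"), Record("Adams", "Austin"), Record("Zhou", "Xi'an")):
    index.insert(rec)
hit = index.find("Mori")
```

Every index entry has the same size regardless of how long the indexed name is, because the name itself is read through the reference during comparison.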
MMDB Data Representation
Conventional systems are space consuming because of duplicate values
In an MMDB:
Relational tuples can be represented as a set of pointers to data values
Use of pointers is space efficient when large values appear multiple times in the database: the actual value needs to be stored only once
Pointers also simplify the handling of variable-length fields, since variable-length data can be represented using pointers into a heap
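A minimal sketch of tuples as sets of pointers to shared values follows. `ValuePool` and `make_tuple` are hypothetical names, and Python object references stand in for pointers into a heap.

```python
class ValuePool:
    """Stores each distinct value once and hands out references to it."""

    def __init__(self):
        self._values = {}

    def intern(self, value):
        # setdefault returns the already-stored object if an equal one exists.
        return self._values.setdefault(value, value)

    def __len__(self):
        return len(self._values)


pool = ValuePool()


def make_tuple(*fields):
    """A relational tuple as a Python tuple of references into the pool."""
    return tuple(pool.intern(f) for f in fields)


# Build the long value at run time, once per tuple, as two separate loads would.
dept_a = " ".join(["Department", "of", "Computer", "Engineering"])
dept_b = " ".join(["Department", "of", "Computer", "Engineering"])

t1 = make_tuple("Alice", dept_a)
t2 = make_tuple("Bob", dept_b)
```

Both tuples end up pointing at the same stored department string, so the long value occupies space once however many tuples use it.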
MMDB Query Processing
Query processing for DRDB focuses on reducing disk access costs
Query processing techniques based on fast sequential access lose that advantage in MMDB
Query processing for MMDB must focus on reducing processing costs
Operation costs vary from system to system
No general optimization technique
Example: joining relations R and S over a common attribute
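The slide only names the join example, so the sketch below is one possible reading, not the slide's own method. It assumes the pointer-based representation in which each attribute value is stored once and keeps back-pointers to the tuples that use it. The join then scans the smaller relation and follows pointers, with no sorting and no sequential scan of the larger relation. All names (`Value`, `add_r`, `add_s`, `join_on_a`) are hypothetical.

```python
class Value:
    """A shared attribute value with back-pointers to the tuples using it."""

    def __init__(self, data):
        self.data = data
        self.r_tuples = []       # R tuples whose join attribute is this value
        self.s_tuples = []       # S tuples whose join attribute is this value


values = {}


def value_of(data):
    if data not in values:
        values[data] = Value(data)
    return values[data]


def add_r(name, a):
    v = value_of(a)
    t = (name, v)                # R tuple: (name, pointer to value of A)
    v.r_tuples.append(t)
    return t


def add_s(a, label):
    v = value_of(a)
    t = (v, label)               # S tuple: (pointer to value of A, label)
    v.s_tuples.append(t)
    return t


R = [add_r("r1", 10), add_r("r2", 20), add_r("r3", 10)]
S = [add_s(10, "s1"), add_s(30, "s2")]


def join_on_a(smaller):
    """Scan the smaller relation (S) and follow pointers to matching R tuples."""
    result = []
    for s in smaller:
        shared = s[0]                # pointer to the shared attribute value
        for r in shared.r_tuples:    # back-pointers give the matching R tuples
            result.append((r[0], shared.data, s[1]))
    return result


joined = join_on_a(S)
```

The cost here is dominated by pointer following and result construction, which is the kind of processing cost an MMDB optimizer has to weigh.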
Recovery
In MMDB:
Checkpointing and recovery are the only reasons to access the disk-resident copy of the DB
Application transactions never require access to disk-resident data
Disk access can be tailored to suit the needs of checkpointing alone
Disk I/O using a very large block size
After failure:
Restore data from the disk-resident backup
Update using the log
Load blocks of the DB "on demand"
Disk striping or disk arrays
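The two recovery steps after a failure (restore from the backup, then update using the log) can be sketched as follows. The record layout is an assumption: each log record is `(lsn, key, new_value)`, the log holds only committed updates, and the checkpoint remembers the last log sequence number it already reflects. None of these details come from the slides.

```python
def recover(checkpoint, log):
    """Rebuild the in-memory DB from a backup image plus a redo log.

    checkpoint: dict with the backup contents ("data") and the last log
                sequence number ("lsn") the backup already reflects.
    log: list of (lsn, key, new_value) redo records in LSN order.
    """
    db = dict(checkpoint["data"])            # restore from the backup copy
    for lsn, key, new_value in log:
        if lsn > checkpoint["lsn"]:          # skip records already in the backup
            db[key] = new_value              # redo the update
    return db


checkpoint = {"lsn": 2, "data": {"a": 1, "b": 2}}
log = [(1, "a", 1), (2, "b", 2), (3, "a", 5), (4, "c", 7)]
db = recover(checkpoint, log)
```

Only the log records newer than the checkpoint are replayed, which is why frequent checkpoints shorten recovery time.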
Application Programming Interface and Protection
In conventional systems: the application calls the database system, giving the object id and the address of a private buffer in its address space
In MMDB: the application uses the actual memory position of the object
First access by relation name and primary key
Subsequent accesses by memory address
Eliminates translation and buffer copying
Commits the system to leaving the object in place
Potential problems:
Direct access allows unauthorized access
The system has no way of knowing what has been modified
Solution: run transactions compiled by a special DB system compiler
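The access pattern above (first access by name and key, later accesses by address) might be sketched like this. Python object references stand in for memory addresses, and `MemoryDB` is a hypothetical class, not a real library API.

```python
class MemoryDB:
    def __init__(self):
        self.relations = {}      # relation name -> {primary key: record}

    def insert(self, relation, key, record):
        self.relations.setdefault(relation, {})[key] = record

    def lookup(self, relation, key):
        """First access: by relation name and primary key.

        Returns the record object itself, not a copy in a private buffer.
        """
        return self.relations[relation][key]


db = MemoryDB()
db.insert("accounts", 7, {"owner": "Lee", "balance": 100})

handle = db.lookup("accounts", 7)    # first access resolves name + key
handle["balance"] += 50              # later accesses go through the reference
```

The update through `handle` changes the stored record without any call into `MemoryDB`, which illustrates both the saving (no translation or buffer copy) and the protection problem: the database has no way of knowing the record was modified.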
Conclusion
Main memory has a short response time, and its decreasing cost makes it affordable and suitable for real-time applications
As memory becomes cheaper, it becomes cost effective to keep more and more data permanently in memory. This implies that memory-resident database systems will become more common in the future
The mechanisms and optimizations discussed in this paper will become commonplace