August 2009 Master of Science in Information Technology (MScIT-NEW) – Semester 3
MT0046: Oracle9i – 4 Credits (Book ID: B0894)

Assignment Set – 1 (40 Marks)

[Book ID: B0894] Each question carries 5 marks

1. Mention the different modular database application programs in PL/SQL

Structure of PL/SQL Blocks
PL/SQL is a block-structured language. Each block builds a (named) program unit, and blocks can be nested. Blocks that build a procedure, a function, or a package must be named. A PL/SQL block has an optional declare section, a part containing PL/SQL statements, and an optional exception-handling part. Thus the structure of a PL/SQL block looks as follows (brackets [ ] enclose optional parts):

[<block header>]
[declare
    <constants> <variables> <cursors> <user defined exceptions>]
begin
    <PL/SQL statements>
[exception
    <exception handling>]
end;

The block header specifies whether the PL/SQL block is a procedure, a function, or a package.

If no header is specified, the block is said to be an anonymous PL/SQL block. Each PL/SQL block again builds a PL/SQL statement. Thus blocks can be nested like blocks in conventional programming languages. The scope of declared variables (i.e., the part of the program in which one can refer to the variable) is analogous to the scope of variables in programming languages such as C or Pascal.

4.1.3 Declarations
Constants, variables, cursors, and exceptions used in a PL/SQL block must be declared in the declare section of that block. Variables and constants can be declared as follows:

<variable name> [constant] <data type> [not null] [:= <expression>];

Valid data types are SQL data types (see Section 1.1) and the data type boolean. Boolean data may only be true, false, or null. The not null clause requires that the declared variable must always have a value different from null. <expression> is used to initialize a variable. If no expression is specified, the value null is assigned to the variable. The clause constant states that once a value has been assigned to the variable, the value cannot be changed (thus the variable becomes a constant). Example:

declare
    hire_date    date;                          /* implicit initialization with null */
    job_title    varchar2(80) := 'Salesman';
    emp_found    boolean;                       /* implicit initialization with null */
    salary_incr  constant number(3,2) := 1.5;   /* constant */
    ...
begin
    ...
end;
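To make the distinction between the modular program units concrete, here is a minimal sketch of the two kinds of blocks described above: an anonymous block and a named program unit (a stored procedure). The emp table and its columns are illustrative assumptions, not part of the original text.

-- Anonymous block: no header; compiled and executed immediately
declare
    v_count number;
begin
    select count(*) into v_count from emp;   -- emp is a hypothetical table
    dbms_output.put_line('Employees: ' || v_count);
end;
/

-- Named block: a stored procedure that can be called repeatedly
create or replace procedure raise_salary(p_empno number, p_pct number) is
begin
    update emp set sal = sal * (1 + p_pct / 100)
    where empno = p_empno;
end;
/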

2. Write the syntax of creating a PL/SQL trigger

Use the CREATE TRIGGER statement to create and enable a database trigger, which is either:

• A stored PL/SQL block associated with a table, a schema, or the database, or
• An anonymous PL/SQL block or a call to a procedure implemented in PL/SQL or Java

Oracle automatically executes a trigger when specified conditions occur. When you create a trigger, Oracle enables it automatically. You can subsequently disable and enable a trigger with the DISABLE and ENABLE clauses of the ALTER TRIGGER or ALTER TABLE statement.
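For example (the trigger and table names here are illustrative assumptions):

ALTER TRIGGER emp_audit_trg DISABLE;
ALTER TRIGGER emp_audit_trg ENABLE;
ALTER TABLE emp DISABLE ALL TRIGGERS;
ALTER TABLE emp ENABLE ALL TRIGGERS;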

Prerequisites
Before a trigger can be created, the user SYS must run a SQL script commonly called DBMSSTDX.SQL. The exact name and location of this script depend on your operating system.

• To create a trigger in your own schema on a table in your own schema, or on your own schema (SCHEMA), you must have the CREATE TRIGGER privilege.
• To create a trigger in any schema on a table in any schema, or on another user's schema (schema.SCHEMA), you must have the CREATE ANY TRIGGER privilege.
• In addition to the preceding privileges, to create a trigger on DATABASE, you must have the ADMINISTER DATABASE TRIGGER system privilege.

If the trigger issues SQL statements or calls procedures or functions, then the owner of the trigger must have the privileges necessary to perform these operations. These privileges must be granted directly to the owner rather than acquired through roles.

Syntax
create_trigger::= (syntax diagram shown in the original document)
dml_event_clause::= (syntax diagram shown in the original document)
referencing_clause::= (syntax diagram shown in the original document)

Semantics

OR REPLACE Specify OR REPLACE to re-create the trigger if it already exists. Use this clause to change the definition of an existing trigger without first dropping it.

schema Specify the schema to contain the trigger. If you omit schema, then Oracle creates the trigger in your own schema.

trigger Specify the name of the trigger to be created. If a trigger produces compilation errors, then it is still created, but it fails on execution. This means it effectively blocks all triggering DML statements until it is disabled, replaced by a version without compilation errors, or dropped. You can see the associated compiler error messages with the SQL*Plus command SHOW ERRORS.
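Since the syntax diagrams are not reproduced above, the following is a hedged sketch of the general textual shape of a DML row trigger; the trigger, table, and column names are assumptions for illustration only:

CREATE OR REPLACE TRIGGER emp_audit_trg       -- [schema.]trigger
    BEFORE INSERT OR UPDATE ON emp            -- dml_event_clause
    REFERENCING OLD AS old NEW AS new         -- referencing_clause
    FOR EACH ROW
BEGIN
    :new.last_changed := SYSDATE;             -- assumes emp has a last_changed column
END;
/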

3. Differentiate between the shared pool and the large pool of the Oracle memory components

Introduction
The Oracle shared pool provides critical services for sharing complex objects among large numbers of users. DBAs often spend much time learning about shared pool memory management and configuring and tuning shared pool usage. The Automatic Shared Memory Management (ASMM) features introduced in 10gR1 solved this problem by providing the DBA with a simple, automatic self-tuning mechanism for configuring the shared memory components of the SGA, including the buffer cache and the shared pool.

Purpose of the Shared Pool
The shared pool was introduced as a feature of the Oracle database in version 7, primarily as a repository for shared SQL and PL/SQL. Although it has evolved since that time, its primary role remains the caching of cursors shared between sessions connected to the database.

At the most fundamental level, the shared pool is a metadata cache. Whereas the buffer cache is almost completely utilized for the caching of data, the shared pool is used for caching complex objects describing where the data is stored, how it relates to other data, and how it can be retrieved. Much of the shared pool usage is to support the execution of shared SQL and PL/SQL packages; but in order to build a cursor or compile a PL/SQL procedure, we need to know all about the database objects referenced by the SQL or PL/SQL being compiled. For example, in order to build a cursor for a simple select statement from a table, we need to know metadata about the table, including column names, data types, indexes, and optimizer statistics. All of this additional metadata is also cached in the shared pool, independently of the cursors or program units.

Components of the Shared Pool
Generally, what is of far greater interest to the DBA are the allocations of memory that can be aged in and out of the cache, since building new objects in the shared pool is expensive and impacts scalability and performance. Memory allocations for objects that can be rebuilt are sometimes referred to as 'recreatable'.

V$SGASTAT
The easiest way of examining the contents of the shared pool at a given time is to query the fixed view V$SGASTAT. Since Oracle keeps running totals of component memory allocations in the shared pool, selecting from this view is inexpensive and will not impact the production system. This view has been enhanced in 10gR2 to provide a finer granularity of memory allocation data. Example:

SELECT *
FROM (SELECT name, bytes/(1024*1024) mb
      FROM v$sgastat
      WHERE pool = 'shared pool'
      ORDER BY bytes DESC)
WHERE ROWNUM <= 10;

NAME                           MB
--------------------------  ----------
ktcmvcb                      6856.96056
KGH: NO ACCESS               1610.02487
sql area                      571.81263
free memory                  502.999672
CCursor                      301.799118
PCursor                       176.69886
library cache                 112.56636
kglsim object batch          77.3775787
gcs resources                75.5597076
sql area:PLSQL               51.2854691

Multiple selects from V$SGASTAT will show that some components remain constant in size.

Large Pool
The large pool reduces the strain on the shared pool. The following will use the large pool if it is configured:

• Shared server processes – the UGAs will be stored here
• Parallel execution servers – if you have enabled parallel query, the parallel servers communicate via the large pool
• I/O slave processes – if you have enabled DB writer slaves or tape I/O slaves
• RMAN – the RMAN channel processes use the large pool for caching buffers during backup and restore operations

Large pool configuration

Large Pool Size:
    show parameter large_pool_size;
    select * from v$sgastat where pool='large pool';

Java Pool

Java stored procedures are loaded from the data dictionary into the shared pool. The purpose of the Java pool is to provide room for the runtime memory structures used by a Java application. It is controlled by three instance parameters:

• java_pool_size – dynamically creates the Java pool
• java_max_sessionspace_size – the maximum amount of space allowed for any one session (static)
• java_soft_sessionspace_limit – a message is written to the trace file when the limit is breached, which can help with monitoring (static)

Java pool configuration

Java Pool Size:
    show parameter java_pool_size;
    select * from v$sgastat where pool='java pool';
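As a sketch of how either pool is sized (the values below are arbitrary; with a server parameter file, SCOPE = SPFILE records the change for the next restart):

ALTER SYSTEM SET large_pool_size = 64M SCOPE = SPFILE;
ALTER SYSTEM SET java_pool_size = 32M SCOPE = SPFILE;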

4. How are background processes useful in an Oracle database? Explain

To maximize performance and accommodate many users, a multiprocess Oracle Database system uses background processes. Background processes consolidate functions that would otherwise be handled by multiple database programs running for each user process. Background processes asynchronously perform I/O and monitor other Oracle Database processes to provide increased parallelism for better performance and reliability. Table 4-4 describes the basic Oracle Database background processes, many of which are discussed in more detail elsewhere in this book. The use of additional database server features or options can cause more background processes to be present. For example, when you use Advanced Queuing, the queue monitor (QMNn) background process is present. When you specify the FILE_MAPPING initialization parameter for mapping datafiles to physical devices on a storage subsystem, then the FMON process is present.

Table 4-4 Oracle Database Background Processes

Process Name

Description

Database writer (DBWn)

The database writer writes modified blocks from the database buffer cache to the datafiles. Oracle Database allows a maximum of 20 database writer processes (DBW0-DBW9 and DBWa-DBWj). The DB_WRITER_PROCESSES initialization parameter specifies the number of DBWn processes. The database selects an appropriate default setting for this initialization parameter or adjusts a user-specified setting based on the number of CPUs and the number of processor groups. For more information about setting the DB_WRITER_PROCESSES initialization parameter, see the Oracle Database Performance Tuning Guide.

Log writer (LGWR)

The log writer process writes redo log entries to disk. Redo log entries are generated in the redo log buffer of the system global area (SGA). LGWR writes the redo log entries sequentially into a redo log file. If the database has a multiplexed redo log, then LGWR writes the redo log entries to a group of redo log files. See Chapter 10, "Managing the Redo Log" for information about the log writer process.

Checkpoint (CKPT)

At specific times, all modified database buffers in the system global area are written to the datafiles by DBWn. This event is called a checkpoint. The checkpoint process is responsible for signalling DBWn at checkpoints and updating all the datafiles and control files of the database to indicate the most recent checkpoint.

System monitor (SMON)

The system monitor performs recovery when a failed instance starts up again. In an Oracle Real Application Clusters database, the SMON process of one instance can perform instance recovery for other instances that have failed. SMON also cleans up temporary segments that are no longer in use and recovers dead transactions skipped during system failure and instance recovery because of file-read or offline errors. These transactions are eventually recovered by SMON when the tablespace or file is brought back online.

Process monitor (PMON)

The process monitor performs process recovery when a user process fails. PMON is responsible for cleaning up the cache and freeing resources that the process was using. PMON also checks on the dispatcher processes (described later in this table) and server processes and restarts them if they have failed.

Archiver (ARCn)

One or more archiver processes copy the redo log files to archival storage when they are full or a log switch occurs. Archiver processes are the subject of Chapter 11, "Managing Archived Redo Logs".

Recoverer (RECO)

The recoverer process is used to resolve distributed transactions that are pending because of a network or system failure in a distributed database. At timed intervals, the local RECO attempts to connect to remote databases and automatically complete the commit or rollback of the local portion of any pending distributed transactions. For information about this process and how to start it, see Chapter 33, "Managing Distributed Transactions".

Dispatcher (Dnnn)

Dispatchers are optional background processes, present only when the shared server configuration is used. Shared server was discussed previously in "Configuring Oracle Database for Shared Server".

Global Cache Service (LMS)

In an Oracle Real Application Clusters environment, this process manages resources and provides interinstance resource control.

5. Explain the notions of transparency and site autonomy.

Site autonomy means that each server participating in a distributed database is administered independently from all other databases. Although several databases can work together, each database is a separate repository of data that is managed individually. Some of the benefits of site autonomy in an Oracle Database distributed database include:

• Nodes of the system can mirror the logical organization of companies or groups that need to maintain independence.
• Local administrators control corresponding local data. Therefore, each database administrator's domain of responsibility is smaller and more manageable.
• Independent failures are less likely to disrupt other nodes of the distributed database. No single database failure need halt all distributed operations or be a performance bottleneck.
• Administrators can recover from isolated system failures independently from other nodes in the system.
• A data dictionary exists for each local database. A global catalog is not necessary to access local data.
• Nodes can upgrade software independently.

Although Oracle Database permits you to manage each database in a distributed database system independently, you should not ignore the global requirements of the system. For example, you may need to:

• Create additional user accounts in each database to support the links that you create to facilitate server-to-server connections.
• Set additional initialization parameters such as COMMIT_POINT_STRENGTH and OPEN_LINKS, as in the sketch below.
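A sketch of setting these parameters (the values are illustrative; both parameters are static, so with a server parameter file the change takes effect at the next restart):

ALTER SYSTEM SET open_links = 8 SCOPE = SPFILE;
ALTER SYSTEM SET commit_point_strength = 10 SCOPE = SPFILE;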

6. Describe the Oracle architecture with a neat labeled diagram

All buildings, from the home you live in to the place you work, have an architecture that keeps the building together. This architecture (if it's well designed) will keep the building from falling apart, keep you nice and warm, and provide you with easy access to facilities like water fountains and restrooms. A well-crafted architecture will also help you work more efficiently. The same is true for databases. Despite the name "Oracle", there is no magic! The architecture that Oracle has built its database around is designed to perform quickly, efficiently, and without errors. In this section we will introduce you to the Oracle architecture. This architecture includes the following components:

• The System Global Area (SGA) and other memory areas that utilize RAM
• Database-related background processes
• Tablespaces and datafiles
• Database-related files
• The instance and the database

Next we will look at each of these components in more detail. Finally we put the components together into a single "big picture" view, and see how all these pieces fit together to complete the Oracle puzzle. Keep in mind that the discussion in this chapter is limited to the overall database architecture. In later chapters we will get into the finer details of managing these structures.

With this in mind, let's drill down a bit deeper into the architecture of Oracle and learn more about memory structures, the part of Oracle that utilizes your system's RAM.

Diving into the RAM Pools
All computers have memory. Memory should not be confused with disk storage. Memory is volatile, which means its content is lost after the power is removed. Memory is also very fast: RAM access times are expressed in nanoseconds (billionths of a second), while disk access times are expressed in milliseconds (thousandths of a second). In Oracle, RAM is hundreds of times faster than disk. Don't get burned by tiny pools; see the Google search "oracle cache disk speed" for details. Disk storage is non-volatile, which means that the data stored on a disk remains after the power is turned off. Disks are always slower than RAM, but disks are hundreds of times cheaper than RAM. There is a trade-off between memory and disks: memory is fast but expensive (about $1,000 per gigabyte at the time of writing), whereas disks are slower but very cheap. Thus, memory is used for short-term storage of information that is frequently needed, and disks are used for long-term storage of information. Oracle has a number of memory areas that it uses to store information. In this section we will address the main Oracle memory areas:

• The System Global Area (SGA) – RAM areas shared by the Oracle programs
• The Program Global Area (PGA) – private RAM areas for individual client connections to Oracle

The Instance and the Database
We are almost at the end of our introduction to the Oracle database architecture. We can't complete this journey, however, until we define two more terms: instance and database. The Oracle instance is the combination of the Oracle SGA and the related background processes (PMON, SMON, and so on). When the SGA RAM memory is successfully allocated and the Oracle database processes are running normally, the instance is said to be "up". However, an instance is sometimes different from a database. You can have the instance running, but the database might not be mounted or open. The Oracle database includes the physical files we have discussed: the datafiles, the control file, and the redo log files. When the Oracle instance is running, it can attach itself to a database, "mount" the control file, and finally "open" the datafiles and redo log files. This is an important distinction because many Oracle operations must be done with the instance started but the database not open.
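The distinction shows up directly in the standard startup sequence from SQL*Plus (run as a privileged user; each command brings up one more layer):

STARTUP NOMOUNT;        -- instance only: SGA allocated, background processes started
ALTER DATABASE MOUNT;   -- control file attached; database still closed
ALTER DATABASE OPEN;    -- datafiles and redo log files opened; database available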

Oracle Architecture Concepts
In the previous section we discussed the Oracle physical architecture. Things like files, programs, and hardware are all considered physical pieces (or physical properties) of the Oracle database. In this section we are concerned with the logical pieces of the Oracle database. Oracle segregates the "physical components" (the .dbf files on disk) by mapping them into "logical" containers called tablespaces. In turn, we allocate our tables and indexes inside these tablespaces, and Oracle takes care of the interface to the physical disk files.

7. Explain the individual components of the Architecture.

The Oracle database has a logical layer and a physical layer. The physical layer consists of the files that reside on the disk; the components of the logical layer map the data to these physical components.

The Physical Layer
The physical layer of the database consists of three types of files:

• One or more datafiles – Datafiles store the information contained in the database. You can have as few as one datafile or as many as hundreds of datafiles. The information for a single table can span many datafiles, or many tables can share a set of datafiles. Spreading tablespaces over many datafiles can have a significant positive effect on performance. The number of datafiles that can be configured is limited by the Oracle parameter MAXDATAFILES.
• Two or more redo log files – Redo log files hold information used for recovery in the event of a system failure. Redo log files, known as the redo log, store a log of all changes made to the database. This information is used in the event of a system failure to reapply changes that have been made and committed but that might not have been made to the datafiles. The redo log files must perform well and be protected against hardware failures (through software or hardware fault tolerance). If redo log information is lost, you cannot recover the system.
• One or more control files – Control files contain information used to start an instance, such as the location of datafiles and redo log files; Oracle needs this information to start the database instance. Control files must be protected. Oracle provides a mechanism for storing multiple copies of control files.

The Logical Layer
The logical layer of the database consists of the following elements:

• One or more tablespaces.
• The database schema, which consists of items such as tables, clusters, indexes, views, stored procedures, database triggers, sequences, and so on.

Tablespaces and Datafiles
The database is divided into one or more logical pieces known as tablespaces. A tablespace is used to logically group data together. For example, you can create one tablespace for accounting and a separate tablespace for purchasing. Segmenting groups into different tablespaces simplifies the administration of these groups. Tablespaces consist of one or more datafiles. By using more than one datafile per tablespace, you can spread data over many different disks to distribute the I/O load and improve performance. Figure 2.1 in the original text shows the relationship between the database, tablespaces, and datafiles. As part of the process of creating the database, Oracle automatically creates the SYSTEM tablespace for you. Although a small database can fit within the SYSTEM tablespace, it's recommended that you create a separate tablespace for user data. The SYSTEM tablespace is where the data dictionary is kept. The data dictionary contains information about tables, indexes, clusters, and so on. Datafiles can be operating system files or, in the case of some operating systems, RAW devices. Datafiles and data access methods are described in detail on Day 12.
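For example, a separate tablespace for accounting data could be created as follows; the datafile path and size are illustrative assumptions:

CREATE TABLESPACE accounting
    DATAFILE '/u01/oradata/orcl/accounting01.dbf' SIZE 100M;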

The Database Schema
The database schema is a collection of logical-structure objects, known as schema objects, that define how you see the database's data. These schema objects consist of structures such as tables, clusters, indexes, views, stored procedures, database triggers, and sequences.

• Table – A table, which consists of a table name and rows and columns of data, is the basic logical storage unit in the Oracle database. Columns are defined by name and data type. A table is stored within a tablespace; often, many tables share a tablespace.
• Cluster – A cluster is a set of tables physically stored together as one table that shares a common column. If data in two or more tables is frequently retrieved together based on data in the common column, using a clustered table can be quite efficient. Tables can be accessed separately even though they are part of a clustered table. Because of the structure of the cluster, related data requires much less I/O overhead if accessed simultaneously.
• Index – An index is a structure created to help retrieve data more quickly and efficiently (just as the index in this book allows you to find a particular section more quickly). An index is declared on a column or set of columns. Access to the table based on the value of the indexed column(s) (as in a WHERE clause) will use the index to locate the table data.

NOTE: A new feature in Oracle8 is the index-only table. In an index-only table, the data and index are stored together. This is discussed in detail on Day 13, "Using Indexes and Sequences."

• View – A view is a window into one or more tables. A view does not store any data; it presents table data. A view can be queried, updated, and deleted as a table without restriction. Views are typically used to simplify the user's perception of data access by providing limited information from one table, or a set of information from several tables transparently. Views can also be used to prevent some data from being accessed by the user or to create a join from multiple tables.
• Stored procedure – A stored procedure is a predefined SQL query that is stored in the data dictionary. Stored procedures are designed to allow more efficient queries. Using stored procedures, you can reduce the amount of information that must be passed to the RDBMS and thus reduce network traffic and improve performance.
• Database trigger – A database trigger is a procedure that is run automatically when an event occurs. This procedure, which is defined by the administrator or developer, is run whenever the event occurs. The event could be an insert, a deletion, or even a selection of data from a table.
• Sequence – The Oracle sequence generator is used to automatically generate a unique sequence of numbers in cache. By using the sequence generator, you can avoid the steps necessary to create this sequence on your own, such as locking the record that has the last value of the sequence, generating a new value, and then unlocking the record.
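As a short sketch of that last item, a sequence is created once and then drawn from during inserts; the emp table and its columns are hypothetical:

CREATE SEQUENCE emp_seq START WITH 1 INCREMENT BY 1 CACHE 20;

INSERT INTO emp (empno, ename)
VALUES (emp_seq.NEXTVAL, 'SMITH');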

Segments, Extents, and Data Blocks
Within Oracle, the space used to store data is controlled by the use of logical structures. These structures consist of the following:

• Data blocks – A block is the smallest unit of storage in an Oracle database. The database block contains header information concerning the block itself as well as the data.
• Extents – Extents consist of data blocks.
• Segments – A segment is a set of extents used to store a particular type of data, as shown in Figure 2.2 (segments, extents, and data blocks).

Segments
An Oracle database can use four types of segments:

• Data segment – Stores user data within the database.
• Index segment – Stores indexes.
• Rollback segment – Stores rollback information used when data must be rolled back.
• Temporary segment – Created when a SQL statement needs a temporary work area; these segments are destroyed when the SQL statement is finished. These segments are used during various database operations, such as sorts.

Extents
Extents are the building blocks of segments; in turn, they consist of data blocks. An extent is used to minimize the amount of wasted (empty) storage. As more and more data is entered into tablespaces in your database, the extents used to store that data can grow or shrink as necessary. In this manner, many tablespaces can share the same storage space without preallocating the divisions between those tablespaces. At tablespace-creation time, you can specify the minimum number of extents to allocate as well as the number of extents to add at a time when that allocation has been used. This arrangement gives you efficient control over the space used in your database.

Data Blocks
Data blocks are the smallest pieces of an Oracle database; they are physically stored on disk. Although the data block in most systems is 2KB (2,048 bytes), you can change this size for efficiency depending on your application or operating system.
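As a sketch with illustrative values, extent allocation can be controlled when a tablespace is created (dictionary-managed storage syntax of this era of Oracle), and the configured block size can be inspected with a parameter query:

CREATE TABLESPACE user_data
    DATAFILE '/u01/oradata/orcl/user_data01.dbf' SIZE 50M
    DEFAULT STORAGE (INITIAL 64K NEXT 64K MINEXTENTS 1 PCTINCREASE 0);

show parameter db_block_size;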

NOTE: Oracle blocks do not need to be, and may not be the same as, operating system data blocks. In fact, in most cases they are not.

The Oracle Instance The Oracle instance consists of the Oracle processes and shared memory necessary to access information in the database. The instance is made up of the user processes, the Oracle background processes, and the shared memory used by these processes (see Figure 2.3).

The Oracle Memory Structure
Oracle uses shared memory for several purposes, including caching of data and indexes as well as storing shared program code. This shared memory is broken into various pieces, or memory structures. The basic memory structures associated with Oracle are the System Global Area (SGA) and the Program Global Area (PGA).
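Two quick ways to see the SGA from SQL*Plus (standard commands; they assume only a privileged connection):

SHOW SGA
SELECT * FROM v$sga;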

8. Discuss the features of the heterogeneous services

Oracle9i, like previous versions, supports heterogeneous services to allow data in non-Oracle databases to be queried using SQL. This support has been in the form of transparent gateways, which are vendor specific, or generic connectivity, which uses ODBC or OLEDB to make the connections. The functionality supported by generic connectivity is typically more limited than that possible when using vendor-specific gateways, but it is quick and simple to configure. The steps listed below can be used to connect Oracle to any ODBC-compliant database (MS Access, SQL Server, etc.) on the local server:

• Make sure a valid username and password are present for the non-Oracle database (MyUser/MyPassword).
• Create an ODBC DataSource for the non-Oracle database (Access1).
• Create a file called "ORACLE_HOME\hs\admin\initACCESS1.ora" which contains:

  HS_FDS_CONNECT_INFO = Access1  # ODBC DSN
  HS_FDS_TRACE_LEVEL = OFF



• Add the following entry to the tnsnames.ora file:

  ACCESS1.WORLD =
    (DESCRIPTION=
      (ADDRESS=(PROTOCOL=tcp)(HOST=localhost)(PORT=1521))
      (CONNECT_DATA=(SID=ACCESS1))
      (HS=OK)
    )



• Add the following entry to the listener.ora file:

  (SID_DESC=
    (SID_NAME=ACCESS1)
    (ORACLE_HOME=D:\Oracle\Ora\9011)
    (PROGRAM=hsodbc)
  )

• Reload the listener:

  lsnrctl reload

• Create a database link using:

  CREATE DATABASE LINK access1.world
  CONNECT TO "MyUser" IDENTIFIED BY "MyPassword"
  USING 'ACCESS1.WORLD';



• Query a table over the link using (substitute a table that exists in the Access database for the placeholder):

  SELECT * FROM <table_name>@access1.world;

With MS Access the first query is slow due to the Access application being loaded. As expected, subsequent queries do not display a similar lag.

August 2009 Master of Science in Information Technology (MScIT-NEW) – Semester 3
MT0046: Oracle9i – 4 Credits (Book ID: B0894)

Assignment Set – 2 (40 Marks)
[Book ID: B0894] Each question carries 8 marks

1. List the objectives of query processing
2. Explain the features of the RMAN repository
3. Discuss the relative advantages of centralized and distributed databases
4. What benefits does strict two-phase locking provide? What disadvantages result?
5. What are the characteristics of a distributed query processor? Explain

1. List the objectives of query processing

Ans.: The main objective of query processing in a distributed environment is to transform a high-level query on a distributed database, which is seen as a single database by the users, into an efficient execution strategy expressed in a low-level language on local databases. An important point of query processing is query optimization. Because many execution strategies are correct transformations of the same high-level query, the one that optimizes (minimizes) resource consumption should be retained. Good measures of resource consumption are:

i. The total cost that will be incurred in processing the query. It is the sum of all times incurred in processing the operations of the query at various sites, plus the time for intersite communication.

ii. The response time of the query. This is the time elapsed for executing the query. Since operations can be executed in parallel at different sites, the response time of a query may be significantly less than its total cost. Obviously the total cost should be minimized.

In a distributed system, the total cost to be minimized includes CPU, I/O, and communication costs. These costs can be minimized by reducing the number of I/O operations through fast access methods to the data and efficient use of main memory. The communication cost is the time needed for exchanging data between the sites participating in the execution of the query. In centralized systems, only CPU and I/O costs have to be considered.
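As a concrete sketch, following the usual textbook formulation (the coefficient names are illustrative), the total cost can be written as a weighted sum of its component costs:

total_cost = c_cpu * #instructions        (CPU cost)
           + c_io  * #disk_I/Os           (I/O cost)
           + c_msg * #messages            (fixed cost of initiating messages)
           + c_tr  * #bytes_transferred   (data transmission cost)

In a wide area network the last two terms usually dominate, which is why distributed optimizers concentrate on minimizing data transfer.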


Objectives of query processing
– Transform a high-level query on a DDB into an efficient execution strategy expressed in a low-level language on local DBs:
  high-level user query → query processor → low-level data manipulation commands

General steps in query processing
– Query specified by a high-level query language
  » SQL is the most widely used declarative language
– Query execution plan
  » low-level physical steps for executing high-level (declarative) user queries
– Query optimization
  » how do we determine the "best" execution plan?

Layers of query processing: relational calculus query on the DDB → relational algebraic query on the DDB → algebraic query on fragments → optimized fragment query with communication operations → optimized local queries.

Selecting alternatives
– A calculus query may have many equivalent algebra queries
– Select the execution strategy that minimizes execution time or total resource consumption
  » a computationally intractable problem, so choose a solution close to the optimum
– In a DDB, communication cost for data exchange also occurs, so select the best sites to process the data, and possibly the way data should be transferred

2. Explain the features of the RMAN repository

Ans. The RMAN repository is the collection of metadata about the target databases that RMAN uses for backup, recovery, and maintenance. RMAN always stores this information in records in the control file. The version of this information in the control file is the authoritative record of RMAN's backups of your database. This is one reason why protecting your control file is an important part of your backup strategy. RMAN can conduct all necessary backup and recovery operations using just the control file to store the RMAN repository information, and maintains all records necessary to meet your configured retention policy.

You can also create a recovery catalog, an external Oracle database in which to store this information. The control file has finite space for records of backup activities, while a recovery catalog can store a much longer history. The added complexity of operating a recovery catalog database can be offset by the convenience of having the extended backup history available if you have to do a recovery that goes further back in time than the history in the control file. There are also a few features of RMAN that only function when you use a recovery catalog. For example, RMAN stored scripts are stored in the recovery catalog, so commands related to them require the use of a recovery catalog. Other RMAN commands are specifically related to managing the recovery catalog and so are not available (and not needed) if RMAN is not connected to a recovery catalog.

The recovery catalog's version of the RMAN repository is maintained solely by RMAN. The target instance never accesses it directly. RMAN propagates information about the database structure, archived redo logs, backup sets, and datafile copies into the recovery catalog from the target database's control file after any operation that updates the repository, and also before certain operations.

Storage of the RMAN Repository in the Control File
Because most information in the recovery catalog is also available in the target database's control file, RMAN supports an operational mode in which it uses the target database control file instead of a recovery catalog. This mode is especially appropriate for small databases where installation and administration of a separate recovery catalog database is burdensome. The only RMAN feature that is not supported in NOCATALOG mode is stored scripts.

Types of Records in the Control File
When you do not use a recovery catalog, the control file is the exclusive source of information about backups and copies, as well as other relevant information. The control file contains two types of records: circular reuse records and noncircular reuse records.

Circular Reuse Records
Circular reuse records contain noncritical information that is eligible to be overwritten if the need arises. These records contain information that is continually generated by the database. Circular reuse records are arranged in a logical ring. When all available record slots are full, the database either expands the control file to make room for a new record or overwrites the oldest record. The CONTROL_FILE_RECORD_KEEP_TIME initialization parameter specifies the minimum age in days of a record before it can be reused.

Noncircular Reuse Records
Noncircular reuse records contain critical information that does not change often and cannot be overwritten. Some examples of information in noncircular reuse records include datafiles, online redo logs, and redo threads.

Recovery Without a Recovery Catalog
To make it easier to restore and recover the database without using a recovery catalog, Oracle recommends that you:

• Enable the control file autobackup feature, which causes RMAN to automatically back up the control file, and also enables RMAN to restore the control file autobackup without access to a repository



• Keep a record of your DBID, which you may need to recover your database in the event that you lose your control file



• Use a minimum of two multiplexed or mirrored control files on separate disks



• Keep all Recovery Manager backup logs.

If you lose the current control files, then you can restore a control file autobackup even if you do not use a recovery catalog.

Storage of the RMAN Repository in the Recovery Catalog
It is recommended that you store the recovery catalog in a dedicated database. If you store the recovery catalog alongside other data in some other database, then if you lose that other database you will lose your recovery catalog as well. This will make your recovery more difficult.


Registration of Databases in the Recovery Catalog
The enrolling of a database in a recovery catalog is called registration. You can register more than one target database in the same recovery catalog. For example, you can register databases prod1, prod2, and prod3 in a single catalog owned by catowner in the database catdb. Because RMAN distinguishes databases by unique database identifier (DBID), each database registered in a given catalog must have a unique DBID.

Contents of the Recovery Catalog
The recovery catalog contains information about RMAN operations, including:

• Datafile and archived redo log backup sets and backup pieces
• Datafile copies
• Archived redo logs and their copies
• Tablespaces and datafiles on the target database
• Stored scripts, which are named user-created sequences of RMAN commands
• Persistent RMAN configuration settings
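A minimal sketch of registering a target database in a recovery catalog; the connect strings, usernames, and passwords are illustrative assumptions:

rman TARGET sys/password@prod1 CATALOG rman/rman@catdb

RMAN> REGISTER DATABASE;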

Resynchronization of the Recovery Catalog
The recovery catalog obtains crucial RMAN metadata from the target database control file. Resynchronization of the recovery catalog ensures that the metadata that RMAN obtains from the control file stays current. Resynchronizations can be full or partial.

Backups of the Recovery Catalog
A single recovery catalog is able to store information for multiple target databases. Consequently, loss of the recovery catalog can be disastrous. You should back up the recovery catalog frequently. If the recovery catalog is destroyed and no backups of it are available, then you can partially reconstruct the catalog from the current control file or control file backups. Nevertheless, you should always aim to have a valid, recent backup of the catalog.

Compatibility of the Recovery Catalog
When you use RMAN with a recovery catalog in an environment where you have run past versions of the database, you can wind up with versions of the RMAN client, recovery catalog database, recovery catalog schema, and target database that all originated in different releases of the database.

Media Management
Oracle's Media Management Layer (MML) API lets third-party vendors build a media manager: software that works with RMAN and the vendor's hardware to allow backups to sequential media devices such as tape drives. The media manager handles loading, unloading, and labeling of sequential media such as tapes. You must install media manager software to use RMAN with sequential media devices. When backing up or restoring, the RMAN client connects to the target instance and directs the instance to send requests to its media manager. No direct communication occurs between the RMAN client and the media manager.

Performing Backup and Restore with a Media Manager
Before performing backup or restore to a media manager, you must allocate one or more channels to handle the communication with the media manager. You can also configure default channels for use with the media manager, which will be applied for all backup and recovery tasks that use the media manager where you do not explicitly allocate channels. For example, this sequence of commands would configure channels for the media manager and back up the database to the media manager:

RMAN> CONFIGURE DEVICE TYPE sbt PARALLELISM 1;
RMAN> CONFIGURE DEFAULT DEVICE TYPE TO sbt;
RMAN> CONFIGURE CHANNEL DEVICE TYPE sbt PARMS 'ENV=(NSR_SERVER=bksvr1)';
RMAN> BACKUP DATABASE;

When RMAN executes the BACKUP DATABASE command, it sends the backup request to the database server session performing the backup. The database server session identifies the output channel as a media management device and makes a request to the media manager to write the output. RMAN does not issue specific commands to load, label, or unload tapes. When backing up, RMAN gives the media manager a stream of bytes and associates a unique name with that stream. When RMAN needs to restore the backup, it asks the media manager to retrieve the byte stream. All details of how and where that stream is stored are handled entirely by the media manager. The media manager labels and keeps track of the tape and the names of files on each tape, and automatically loads and unloads tapes, or signals an operator to do so.

Some media managers support proxy copy functionality, in which they handle the entire data movement between datafiles and the backup devices. Such products may use technologies such as high-speed connections between storage and media subsystems to reduce the load on the primary database server. RMAN provides a list of files requiring backup or restore to the media manager, which in turn makes all decisions regarding how and when to move the data.


3. Discuss the relative advantages of centralized and distributed databases

Ans.: A distributed database is a database that is under the control of a central database management system (DBMS) in which storage devices are not all attached to a common CPU. It may be stored in multiple computers located in the same physical location, or may be dispersed over a network of interconnected computers. Collections of data (e.g. in a database) can be distributed across multiple physical locations. A distributed database can reside on network servers on the Internet, on corporate intranets or extranets, or on other company networks. Replication and distribution of databases improve database performance at end-user worksites.

Relative advantages of distributed databases over centralized databases:

• Reflects organizational structure – database fragments are located in the departments they relate to.
• Local autonomy – a department can control its own data (as its members are the ones familiar with it).
• Protection of valuable data – if there were ever a catastrophic event such as a fire, all of the data would not be in one place, but distributed in multiple locations.
• Improved performance – data is located near the site of greatest demand, and the database systems themselves are parallelized, allowing load on the databases to be balanced among servers. (A high load on one module of the database won't affect other modules of the database in a distributed database.)
• Economics – it costs less to create a network of smaller computers with the power of a single large computer.
• Modularity – systems can be modified, added, and removed from the distributed database without affecting other modules (systems).
• Reliable transactions – due to replication of the database, a single-site failure does not affect the performance of the system.

All transactions follow the ACID properties:

• Atomicity – the transaction takes place as a whole or not at all.
• Consistency – a transaction maps one consistent DB state to another.
• Isolation – each transaction sees a consistent DB.
• Durability – the results of a committed transaction must survive system failures.


4. What benefits does strict two-phase locking provide? What disadvantages result?

Ans. The management of distributed transactions deals with interrelated problems like reliability, concurrency control, and the efficient utilization of the resources of the complete system. Two types of protocols are used: the two-phase commit protocol for recovery, and two-phase locking for concurrency control. The basic idea of locking is that whenever a transaction accesses a data item, it locks it, and a transaction that wants to lock a data item already locked by another transaction must wait until the other transaction has released the lock (unlock). A locking mechanism known as two-phase locking, which incorporates all these principles, is normally used. Under this scheme each transaction has two separate phases:

i. Growing phase: a first phase during which new locks are acquired.
ii. Shrinking phase: a second phase during which locks are only released.

Strict two-phase locking is the variant in which a transaction holds all its (exclusive) locks until it commits or aborts. Its benefits are that it guarantees serializable schedules and, because uncommitted changes are never visible to other transactions, it avoids cascading rollbacks and ensures recoverable schedules. The disadvantages that result are reduced concurrency, since locks are held longer than strictly necessary, and the possibility of deadlock, which must be detected or prevented separately.

LOCKING IN A CENTRALIZED DATABASE:

i. Lock modes: a transaction locks a data item in one of the following modes:
   • Shared mode: to read the data item.
   • Exclusive mode: to edit (write) the data item.
ii. Well-formed transactions: a transaction is well formed if it always locks a data item in shared mode before reading it, and always locks a data item in exclusive mode before writing it.
iii. Compatibility rules exist between lock modes:
   • A transaction can lock a data item in shared mode if it is not locked at all or if it is locked in shared mode by another transaction.
iv. Conflicts: two transactions conflict if they want to lock the same data item in incompatible modes. There are two types of conflict: read-write conflicts and write-write conflicts.
v. Granularity of locking: this term relates to the size of the objects that are locked with a single lock operation.

LOCKING IS SUCCESSFUL IF THE FOLLOWING RULES ARE FOLLOWED:

• Transactions are well-formed.
• Compatibility rules are observed.
• Each transaction does not request new locks after it has released a lock.

SYNTAX:
(Begin application)
    Begin transaction
    Acquire locks before reading or writing
    Commit
    Release locks
(End application)
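A minimal SQL sketch of this pattern, assuming a hypothetical accounts table. In Oracle, row locks taken by DML or SELECT ... FOR UPDATE are held until the transaction ends, so COMMIT performs the entire shrinking phase at once, which is exactly the strict variant of two-phase locking:

-- Growing phase: acquire all locks before touching the data
SELECT balance FROM accounts WHERE id = 1 FOR UPDATE;   -- exclusive row lock
SELECT balance FROM accounts WHERE id = 2 FOR UPDATE;   -- exclusive row lock

-- Work performed while the locks are held
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;

-- Shrinking phase: COMMIT releases every lock at once
COMMIT;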


5. What are the characteristics of a distributed query processor? Explain

Ans. It is very difficult to give the characteristics which differentiate centralized and distributed query processors, but some of them are listed here. Of these, the first four are common to both, and the next four are particular to distributed query processors.

Characterization of Query Processing
i. Languages
ii. Types of optimization
iii. Optimization timing
iv. Statistics
v. Decision sites
vi. Exploitation of the network topology
vii. Exploitation of replicated fragments

i. Languages: The input language to the query processor can be based on relational calculus or relational algebra. In the distributed context, the output language is generally some form of relational algebra augmented with communication primitives.

ii. Types of optimization:
– Cost-based: exhaustive search
  » can potentially find the optimal plan
  » combinatorial complexity in the number of relations
– Heuristics
  » may not be optimal in general
  » commonly used heuristics: regroup common sub-expressions; perform selection and projection first; reorder operations to reduce intermediate relation size; replace a join by a series of semijoins

iii. Optimization timing:
– Static
  » optimization before execution, i.e., at compile time
  » needs to estimate the size of the intermediate results
  » good for queries executed multiple times
  » (ex) R*
– Dynamic
  » optimization at query execution time
  » potentially exact information on intermediate results, but must be repeated for each execution
  » good for ad-hoc queries
  » (ex) Distributed INGRES
– Hybrid
  » compile using a static algorithm; if the error in estimated sizes exceeds a threshold, reoptimize at run time
  » (ex) MERMAID

iv. Statistics: the effectiveness of query optimization relies on statistics on the DB.
– Relation: cardinality, size of a tuple, join selectivity
– Attribute: cardinality of the domain, actual number of distinct values
– Common assumptions: independence between different attribute values; uniform distribution of attribute values within their domain
– Statistics need to be periodically updated

v. Decision sites: who makes the decision?
– Centralized
  » a single site determines the "best" schedule
  » simple, but needs knowledge about the entire DDB
– Distributed
  » cooperation among sites to determine the schedule
  » needs only local information, but carries a cost of cooperation
– Hybrid
  » one site makes the major decision, and other sites can make local decisions
  » (ex) System R*

vi. Network topology:
– Wide area networks (WAN): low bandwidth, low speed, high protocol overhead
  » communication cost will dominate; ignore all other cost factors
  » global schedule to minimize communication cost; local schedules according to centralized query optimization
– Local area networks (LAN): communication costs are comparable to I/O costs
  » I/O and CPU costs need to be considered
  » broadcasting can be exploited (e.g., for joins)

vii. Use of semijoins:
– good for reducing the size of data transfers
– (ex) SDD-1 makes extensive use of semijoins; SDD-1 was designed for slow WANs
– (ex) System R* does not employ semijoins; System R* assumes faster networks

Layers of distributed query processing:
– Query decomposition: decompose the distributed calculus query into an algebraic query on global relations
– Data localization: transform the distributed query into a fragment query
– Global query optimization: produce an optimized fragment query (e.g., join ordering) with communication operations
– Local query optimization: produce optimized local queries
