ASSIGNMENT No. 2
DATABASE TERMINOLOGY: GIVE THE DEFINITION FOR EACH TERM. THE INFORMATION MUST BE CLEAR AND INCLUDE AN EXAMPLE.
ITT (Information Technology in Tourism)
Submitted to: Saritha Pradhan
Submitted by: Bal Gopal Subudhi, PGDM (TT), Roll No. 18
9th Oct 2009
KEY TERMS

1. FOREIGN KEY
In the context of relational databases, a foreign key is a referential constraint between two tables. The foreign key identifies a column or a set of columns in one (referencing) table that refers to a column or set of columns in another (referenced) table. The referenced columns must be the primary key or another candidate key of the referenced table. The values in one row of the referencing columns must occur in a single row in the referenced table. Thus, a row in the referencing table cannot contain values that don't exist in the referenced table (except potentially NULL). This is how references link information together, and it is an essential part of database normalization. Multiple rows in the referencing table may refer to the same row in the referenced table; most of the time, this reflects a one (master, or referenced, table) to many (child, or referencing, table) relationship. The referencing and referenced table may be the same table, i.e. the foreign key refers back to the same table. Such a foreign key is known in SQL:2003 as a self-referencing or recursive foreign key.
Example
An accounts database has a table of invoices, and each invoice is associated with a particular supplier. Supplier details (such as address or phone number) are kept in a separate table; each supplier is given a 'supplier number' to identify it. Each invoice record has an attribute containing the supplier number for that invoice. In the relational schema, the 'supplier number' is the primary key in the Supplier table, and the foreign key in the Invoices table points to that primary key.

2. FORMS GENERATOR

3. HETEROGENEOUS DATA ENVIRONMENT

4. HIERARCHICAL DATABASE MODEL
A hierarchical data model is a data model in which the data is organized into a tree-like structure.
The structure allows repeating information using parent/child relationships: each parent can have many children, but each child has only one parent. All attributes of a specific record are listed under an entity type. In a database, an entity type is the equivalent of a table; each individual record is represented as a row and each attribute as a column. Entity types are related to each other using 1:N mappings, also known as one-to-many relationships. The most recognized and widely used hierarchical database is IMS, developed by IBM.
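The supplier/invoice relationship described under term 1 (foreign key) can be sketched as a short SQLite session. The table layout and column names below are illustrative assumptions, not taken from any real accounts system:

```python
import sqlite3

# Illustrative schema for the supplier/invoice foreign-key example.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("""
    CREATE TABLE Supplier (
        SupplierNumber INTEGER PRIMARY KEY,
        Name           TEXT,
        Phone          TEXT
    )
""")
conn.execute("""
    CREATE TABLE Invoice (
        InvoiceNumber  INTEGER PRIMARY KEY,
        Amount         REAL,
        SupplierNumber INTEGER REFERENCES Supplier(SupplierNumber)
    )
""")
conn.execute("INSERT INTO Supplier VALUES (1, 'Acme Travel', '555-0100')")
conn.execute("INSERT INTO Invoice VALUES (1001, 250.0, 1)")  # accepted: supplier 1 exists

# A row in the referencing table cannot point at a nonexistent supplier:
try:
    conn.execute("INSERT INTO Invoice VALUES (1002, 99.0, 42)")
except sqlite3.IntegrityError as e:
    print("rejected:", e)
```

The rejected insert shows the referential constraint at work: the referencing column must contain a value that occurs in the referenced table (or NULL).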
5. MAIN MEMORY
Primary storage, now commonly called memory, is the only storage directly accessible to the CPU. The CPU continuously reads instructions stored there and executes them as required. Any data actively operated on is also stored there in a uniform manner. Historically, early computers used delay lines, Williams tubes, or rotating magnetic drums as primary storage. By 1954, those unreliable methods were mostly replaced by magnetic core memory, which was still rather cumbersome. A revolution began with the invention of the transistor, which soon enabled then-unbelievable miniaturization of electronic memory via solid-state silicon chip technology, leading to modern random-access memory (RAM). RAM is small and light, but also comparatively expensive. (The particular types of RAM used for primary storage are also volatile, i.e. they lose the information when not powered.)

6. MEMORY BUFFER
A Memory Buffer Register (MBR) is the register in a computer's central processing unit (CPU) that stores the data being transferred to and from the immediate access store. It acts as a buffer, allowing the processor and memory units to act independently without being affected by minor differences in operation. A data item is copied to the MBR ready for use at the next clock cycle, when it can be either used by the processor or stored in main memory. This register holds the contents of memory that are to be transferred from memory to other components, or vice versa. A word to be stored must be transferred to the MBR, from where it goes to the specific memory location; arithmetic data to be processed in the ALU first goes to the MBR, then to the accumulator register, and is then processed in the ALU.

7. METADATA
Metadata (meta data, or sometimes metainformation) is "data about data", of any sort in any medium. Metadata is text, voice, or imagery that describes what the audience wants or needs to see or experience.
The audience could be a person, group, or software program. Metadata is important because it aids in clarifying and finding the actual data. An item of metadata may describe an individual datum (content item) or a collection of data comprising multiple content items and hierarchical levels, such as a database schema. In data processing, metadata provides information about, or documentation of, other data managed within an application or environment. This commonly defines the structure or schema of the primary data.
Example
Metadata would document data about data elements or attributes (name, size, data type, etc.), data about records or data structures (length, fields, columns, etc.), and data about data itself (where it is located, how it is associated, ownership, etc.). Metadata may include descriptive information about the context, quality and condition, or characteristics of the data. It may be recorded with high or low granularity.
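In a relational database, the catalog that describes each table is everyday metadata: data about the data the table holds. A minimal sketch using SQLite's schema-introspection PRAGMA (the Booking table is purely illustrative):

```python
import sqlite3

# Data about data: read a table's column names and declared types
# back from the database's own catalog.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Booking (BookingId INTEGER PRIMARY KEY, Guest TEXT, Nights INTEGER)")

# PRAGMA table_info returns one row per column: (cid, name, type, notnull, default, pk)
metadata = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(Booking)")]
print(metadata)  # [('BookingId', 'INTEGER'), ('Guest', 'TEXT'), ('Nights', 'INTEGER')]
```

The list printed here describes the structure of the Booking data without containing any booking records, which is exactly the schema-level metadata discussed above.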
8. NETWORK DATABASE MODEL
The network model is a database model conceived as a flexible way of representing objects and their relationships. Its distinguishing feature is that the schema, viewed as a graph in which object types are nodes and relationship types are arcs, is not restricted to being a hierarchy or lattice. The network model's original inventor was Charles Bachman, and it was developed into a standard specification published in 1969 by the CODASYL Consortium.

9. NON-VOLATILE STORAGE
Non-volatile memory (nonvolatile memory, NVM, or non-volatile storage) is computer memory that can retain the stored information even when not powered. Examples of non-volatile memory include read-only memory, flash memory, most types of magnetic computer storage devices (e.g. hard disks, floppy disks, and magnetic tape), optical discs, and early computer storage methods such as paper tape and punched cards. Non-volatile memory is typically used for the task of secondary storage, or long-term persistent storage. The most widely used form of primary storage today is a volatile form of random access memory (RAM), meaning that when the computer is shut down, anything contained in RAM is lost. Unfortunately, most forms of non-volatile memory have limitations that make them unsuitable for use as primary storage: typically, non-volatile memory either costs more or performs worse than volatile random access memory. Several companies are working on developing non-volatile memory systems comparable in speed and capacity to volatile RAM; for instance, IBM is currently developing MRAM (magnetoresistive RAM). Not only would such technology save energy, but it would allow for computers that could be turned on and off almost instantly, bypassing the slow start-up and shutdown sequence. Non-volatile data storage can be categorized into electrically addressed systems (e.g. read-only memory) and mechanically addressed systems (hard disks, optical discs, magnetic tape, holographic memory, and so on).
Electrically addressed systems are expensive but fast, whereas mechanically addressed systems have a low price per bit but are slow. Non-volatile memory may one day eliminate the need for comparatively slow forms of secondary storage systems, such as hard disks.

10. OBJECT-ORIENTED DATABASE MODEL
The object-oriented paradigm has been applied to database technology, creating a new programming model known as object databases. These databases attempt to bring the database world and the application programming world closer together, in particular by ensuring that the database uses the same type system as the application program. This aims to avoid the overhead (sometimes referred to as the impedance mismatch) of converting information between its representation in the database (for example, as rows in tables) and its representation in the application program (typically as objects). At the same time, object databases attempt to introduce the key ideas of object programming, such as encapsulation and polymorphism, into the world of databases. A variety of ways have been tried for storing objects in a database. Some products have approached the problem from the application programming end, by making the objects manipulated by the program persistent. This typically also requires the addition of some kind of query language, since conventional programming languages do not have the ability to find objects based on their information content. Others have attacked the problem from the database end, by defining an object-oriented data model for the database and a database programming language that allows full programming capabilities as well as traditional query facilities.

11. OBJECT-RELATIONAL DATABASE MODEL
An object-relational database (ORD), or object-relational database management system (ORDBMS), is a database management system (DBMS) similar to a relational database, but with an object-oriented database model: objects, classes and inheritance are directly supported in database schemas and in the query language. In addition, it supports extension of the data model with custom data types and methods. An object-relational database can be said to provide a middle ground between relational databases and object-oriented databases (OODBMSs). In object-relational databases, the approach is essentially that of relational databases: the data resides in the database and is manipulated collectively with queries in a query language. At the other extreme are OODBMSs, in which the database is essentially a persistent object store for software written in an object-oriented programming language, with a programming API for storing and retrieving objects and little or no specific support for querying.

12. OPERATING SYSTEM SOFTWARE
An operating system (OS) is an interface between hardware and user, responsible for the management and coordination of activities and the sharing of the resources of the computer; it acts as a host for the applications run on the machine. As a host, one of the purposes of an operating system is to handle the details of the operation of the hardware. This relieves application programs from having to manage those details and makes it easier to write applications. Almost all computers (including handheld computers, desktop computers, supercomputers and video game consoles), as well as some robots, domestic appliances (dishwashers, washing machines), and portable media players, use an operating system of some type. Some of the oldest models may, however, use an embedded operating system that may be contained on a compact disc or other data storage device.

13. PHYSICAL DATA POINTER
14. PRIMARY KEY
In relational database design, a unique key or primary key is a candidate key that uniquely identifies each row in a table. A unique key or primary key comprises a single column or set of columns: no two distinct rows in a table can have the same value (or combination of values) in those columns. Depending on its design, a table may have arbitrarily many unique keys but at most one primary key. A unique key must uniquely identify all possible rows that could exist in a table, not only the currently existing rows.
Example
Social Security numbers (associated with a specific person) or ISBNs (associated with a specific book) can serve as keys. Telephone books and dictionaries cannot use names, words, or Dewey Decimal system numbers as candidate keys because they do not uniquely identify telephone numbers or words.

15. PRODUCTION DATABASE (DBMS)

16. QUERY
A query is, literally, a question asked about data in the database: a command, written in a query language, that defines selection criteria and sort order and is used to generate an ad hoc list of records. The term also refers to the output subset of data produced in response to a query.

17. QUERY OPTIMIZER
The query optimizer is the component of a database management system that attempts to determine the most efficient way to execute a query. The optimizer considers the possible query plans for a given input query and attempts to determine which of those plans will be the most efficient. Cost-based query optimizers assign an estimated "cost" to each possible query plan and choose the plan with the smallest cost. Costs estimate the runtime cost of evaluating the query in terms of the number of I/O operations required, the CPU requirements, and other factors determined from the data dictionary. The set of query plans examined is formed by examining the possible access paths (e.g. index scan, sequential scan) and join algorithms (e.g. sort-merge join, hash join, nested loops).
The search space can become quite large depending on the complexity of the SQL query. Generally, the query optimizer cannot be accessed directly by users: once queries are submitted to the database server and parsed by the parser, they are passed to the query optimizer, where optimization occurs. However, some database engines allow guiding the query optimizer with hints.

18. QUERY PROCESSOR
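The access-path choice described under term 17 can be observed directly in SQLite, whose EXPLAIN QUERY PLAN statement reports the plan its optimizer chose. The table and index names below are illustrative:

```python
import sqlite3

# Before an index exists, the optimizer's only access path is a sequential
# scan; after the index is created, it switches to an index search.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Invoice (InvoiceNumber INTEGER, SupplierNumber INTEGER)")

def plan(sql):
    # EXPLAIN QUERY PLAN returns rows whose last column describes the plan.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM Invoice WHERE SupplierNumber = 7"
before = plan(query)  # sequential scan of Invoice
conn.execute("CREATE INDEX idx_supplier ON Invoice(SupplierNumber)")
after = plan(query)   # search using idx_supplier
print(before)
print(after)
```

This is only a sketch of cost-based plan selection; a production optimizer weighs many more access paths and join algorithms than this two-table-free example shows.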
19. RAID
RAID is an acronym first defined by David A. Patterson, Garth A. Gibson, and Randy Katz at the University of California, Berkeley in 1987 to describe a redundant array of inexpensive disks, a technology that allowed computer users to achieve high levels of storage reliability from low-cost and less reliable PC-class disk-drive components, via the technique of arranging the devices into arrays for redundancy. More recently, marketers representing industry RAID manufacturers reinvented the term to describe a redundant array of independent disks, as a means of dissociating a "low cost" expectation from RAID technology. "RAID" is now used as an umbrella term for computer data storage schemes that can divide and replicate data among multiple hard disk drives. The different schemes/architectures are named by the word RAID followed by a number, as in RAID 0, RAID 1, etc. RAID's various designs involve two key design goals: increasing data reliability and/or increasing input/output performance. When multiple physical disks are set up to use RAID technology, they are said to be in a RAID array. This array distributes data across multiple disks, but the array is seen by the computer user and operating system as one single disk. RAID can be set up to serve several different purposes.

20. RELATIONAL DATABASE MODEL (DBMS)
The relational model for database management is a database model based on first-order predicate logic, first formulated and proposed in 1969 by E.F. Codd. Its core idea is to describe a database as a collection of predicates over a finite set of predicate variables, describing constraints on the possible values and combinations of values. The content of the database at any given time is a finite (logical) model of the database, i.e. a set of relations, one per predicate variable, such that all predicates are satisfied. A request for information from the database (a database query) is also a predicate.

21. RELATIONSHIP

22. REPORT WRITER
Report writing refers to the transfer of data into some form or document. It occurs not only during the origination step and the distribution step, but throughout the processing cycle.

23. SCHEMA
The schema (pronounced skee-ma) of a database system is its structure described in a formal language supported by the database management system (DBMS). In a relational database, the schema defines the tables, fields, relationships, views, indexes, packages, procedures, functions, queues, triggers, types, sequences, materialized views, synonyms, database links, directories, Java, XML schemas, and other elements. Schemas are generally stored in a data dictionary. Although a schema is defined in a text database language, the term is often used to refer to a graphical depiction of the database structure.

24. SECONDARY STORAGE
Secondary storage in popular usage differs from primary storage in that it is not directly accessible by the CPU. The computer usually uses its input/output channels to access secondary storage and transfers the desired data using an intermediate area in primary storage. Secondary storage does not lose the data when the device is powered down; it is non-volatile. Per unit, it is typically also an order of magnitude less expensive than primary storage. Consequently, modern computer systems typically have an order of magnitude more secondary storage than primary storage, and data is kept there for a longer time. In modern computers, hard disk drives are usually used as secondary storage. The time taken to access a given byte of information stored on a hard disk is typically a few thousandths of a second, or milliseconds. By contrast, the time taken to access a given byte of information stored in random access memory is measured in billionths of a second, or nanoseconds. This illustrates the very significant access-time difference that distinguishes solid-state memory from rotating magnetic storage devices: hard disks are typically about a million times slower than memory. Rotating optical storage devices, such as CD and DVD drives, have even longer access times.

25. STRUCTURED QUERY LANGUAGE (SQL)
SQL (Structured Query Language) is a database computer language designed for managing data in relational database management systems (RDBMS). Its scope includes data query and update, schema creation and modification, and data access control. SQL was one of the first languages for Edgar F. Codd's relational model, in his influential 1970 paper "A Relational Model of Data for Large Shared Data Banks", and became the most widely used language for relational databases.

26.
TABLE (Database)
In relational databases and flat-file databases, a table is a set of data elements (values) organized using a model of vertical columns (identified by name) and horizontal rows. A table has a specified number of columns but can have any number of rows. Each row is identified by the values appearing in a particular column subset that has been identified as a candidate key. Table is another term for relation, although there is a difference: a table is usually a multiset (bag) of rows, whereas a relation is a set and does not allow duplicates. Besides the actual data rows, tables generally have associated with them some meta-information, such as constraints on the table or on the values within particular columns.
The data in a table does not have to be physically stored in the database. Views are also relational tables, but their data are calculated at query time. Another example is a nickname, which represents a pointer to a table in another database.

27. TRANSACTION
A database transaction comprises a unit of work performed within a database management system (or similar system) against a database, treated in a coherent and reliable way independent of other transactions. Transactions in a database environment have two main purposes: to provide reliable units of work that allow correct recovery from failures and keep a database consistent even in cases of system failure, when execution stops (completely or partially) and many operations upon a database remain uncompleted, with unclear status; and to provide isolation between programs accessing a database concurrently (without isolation, the programs' outcomes are typically erroneous). A database transaction, by definition, must be atomic, consistent, isolated and durable. Database practitioners often refer to these properties of database transactions using the acronym ACID. Transactions provide an "all-or-nothing" proposition: each unit of work performed in a database must either complete in its entirety or have no effect whatsoever. Further, the system must isolate each transaction from other transactions, results must conform to existing constraints in the database, and transactions that complete successfully must be written to durable storage.

28. UNIFIED MODELING LANGUAGE (UML)
Unified Modeling Language (UML) is a standardized general-purpose modeling language in the field of software engineering. UML is used to specify, visualize, modify, construct and document the artifacts of an object-oriented, software-intensive system under development.
UML offers a standard way to visualize a system's architectural blueprints, including elements such as actors, business processes, (logical) components, activities, programming language statements, database schemas, and reusable software components.
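The "all-or-nothing" behaviour described under term 27 (transaction) can be sketched with SQLite: when one statement in a unit of work fails, the whole unit is rolled back. The Account table, names and balances are illustrative:

```python
import sqlite3

# Two accounts; the CHECK constraint forbids negative balances.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Account (Name TEXT PRIMARY KEY, Balance INTEGER CHECK (Balance >= 0))")
conn.executemany("INSERT INTO Account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

try:
    with conn:  # opens a transaction; commits on success, rolls back on error
        conn.execute("UPDATE Account SET Balance = Balance + 200 WHERE Name = 'bob'")
        # alice only has 100, so this debit violates the CHECK constraint:
        conn.execute("UPDATE Account SET Balance = Balance - 200 WHERE Name = 'alice'")
except sqlite3.IntegrityError:
    pass  # the whole unit of work was undone, including bob's credit

balances = dict(conn.execute("SELECT Name, Balance FROM Account"))
print(balances)  # {'alice': 100, 'bob': 50}
```

Although the credit to bob succeeded on its own, the failed debit aborts the transaction, so both rows return to their original values: no partial transfer is ever visible.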
29. VIEW
In database theory, a view consists of a stored query accessible as a virtual table composed of the result set of the query. Unlike ordinary (base) tables in a relational database, a view does not form part of the physical schema: it is a dynamic, virtual table computed or collated from data in the database. Changing the data in a table alters the data shown in subsequent invocations of the view. Just as functions in programming can provide abstraction, database users can create abstraction by using views. In another parallel with functions, database users can manipulate nested views: one view can aggregate data from other views. Without the use of views, the normalization of databases above second normal form would become much more difficult. Views can make it easier to create a lossless join decomposition.

30. ATTRIBUTE
In computing, an attribute is a specification that defines a property of an object, element, or file. An attribute of an object usually consists of a name and a value; of an element, a type or class name; of a file, a name and extension. Each named attribute has an associated set of rules called operations: one doesn't add characters to an integer array or process it as an image object, and one doesn't process text as a floating-point (decimal) number. It follows that an object definition can be extended by imposing data typing: a representation format, a default value, and legal operations (rules) and restrictions ("Division by zero is not to be tolerated!") are all potentially involved in defining an attribute, or, conversely, may be spoken of as attributes of that object's type. A JPEG file is not decoded by the same operations (however similar they may be; these are all graphics data formats) as a PNG or BMP file, nor is a floating-point typed number operated upon by the rules applied to typed long integers.
Example
In computer graphics, line objects can have attributes such as thickness (with real values), color (with descriptive values such as brown or green, or values defined in a certain color model, such as RGB), dashing attributes, and so on. A circle object can be defined by similar attributes plus an origin and radius.

31. BINARY LARGE OBJECT (BLOB) DATA TYPE
A binary large object, also known as a blob, is a collection of binary data stored as a single entity in a database management system. Blobs are typically images, audio or other multimedia objects, though sometimes binary executable code is stored as a blob. Database support for blobs is not universal. Blobs were originally just amorphous chunks of data invented by Jim Starkey at DEC, who describes them as "the thing that ate Cincinnati, Cleveland, or whatever". Later, Terry McKiever, a marketing person for Apollo, felt that it needed to be an acronym and invented the backronym Basic Large Object. Informix then invented an alternative backronym, Binary Large Object.

32. CENTRALIZED MODEL (DBMS)
A database system is centralized if the data is stored at a single computer site. A centralized model can support many users, but the DBMS and the database themselves reside totally at a single computer site.

33. CONCURRENCY CONTROL
In computer science, especially in the fields of computer programming (see also concurrent programming and parallel programming), operating systems, multiprocessors, and databases, concurrency control ensures that correct results are generated for concurrent operations, while getting those results as quickly as possible. Concurrency control in database management systems (DBMS) ensures that database transactions are performed concurrently without violating the data integrity of the database. Executed transactions should follow the ACID rules (see term 27). The DBMS must guarantee that only serializable (unless serializability is intentionally relaxed), recoverable schedules are generated. It also guarantees that no effect of committed transactions is lost, and no effect of aborted (rolled-back) transactions remains in the related database.

34. CRUD
Create, read, update and delete (CRUD) are the four basic functions of persistent storage. Sometimes CRUD is expanded with the word retrieve instead of read, or destroy instead of delete. It is also sometimes used to describe user interface conventions that facilitate viewing, searching, and changing information, often using computer-based forms and reports.

35. DATA ABSTRACTION
The characteristic that allows program-data independence and program-operation independence is called data abstraction.

36. DATABASE ENGINE
A database engine (or "storage engine") is the underlying software component that a database management system (DBMS) uses to create, read, update and delete (CRUD) data in a database.
One may command the database engine via the DBMS's own user interface, and sometimes through a network port.

37. DATABASE MANAGEMENT SYSTEM (DBMS)
A Database Management System (DBMS) is a set of computer programs that controls the creation, maintenance, and use of the database of an organization and its end users. It allows organizations to place control of organization-wide database development in the hands of database administrators (DBAs) and other specialists. DBMSs may use any of a variety of database models, such as the network model or relational model. In large systems, a DBMS allows users and other software to store and retrieve data in a structured way. It helps to specify the logical organization for a database and to access and use the information within a database. It provides facilities for controlling data access, enforcing data integrity, managing concurrency control, and restoring the database.

38. DATABASE PRACTITIONER

39. DATABASE SOFTWARE
Database software is another name for a Database Management System (DBMS): the set of computer programs that controls the creation, maintenance, and use of an organization's database, as described under term 37.

40. DATA CATALOG

41. DATA DICTIONARY
A data dictionary, as defined in the IBM Dictionary of Computing, is a "centralized repository of information about data such as meaning, relationships to other data, origin, usage, and format." The term may have one of several closely related meanings pertaining to databases and database management systems (DBMS): a document describing a database or collection of databases; an integral component of a DBMS that is required to determine its structure; or a piece of middleware that extends or supplants the native data dictionary of a DBMS.

42. DATA REPOSITORY
A data repository, like a data dictionary, is a "centralized repository of information about data such as meaning, relationships to other data, origin, usage, and format" (IBM Dictionary of Computing).

43. DATA TYPE
Almost all programming languages explicitly include the notion of data type, though different languages may use different terminology.
Most programming languages also allow the programmer to define additional data types, usually by combining multiple elements of other types and defining the valid operations of the new data type. For example, a programmer might create a new data type named "Person" that specifies that data interpreted as Person would include a name and a date of birth. Common data types include integers, floating-point numbers (decimals), and alphanumeric strings.
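The "Person" data type mentioned above can be sketched as a user-defined type in Python; the field names and the age operation are illustrative choices, not prescribed by the definition:

```python
from dataclasses import dataclass
from datetime import date

# A new data type combining elements of existing types (a string and a date),
# together with a valid operation defined for it.
@dataclass
class Person:
    name: str
    date_of_birth: date

    def age_on(self, day: date) -> int:
        # Age in whole years on a given day: one of the operations that
        # is meaningful for Person values, unlike raw arithmetic on them.
        years = day.year - self.date_of_birth.year
        if (day.month, day.day) < (self.date_of_birth.month, self.date_of_birth.day):
            years -= 1
        return years

p = Person("Asha", date(1985, 3, 14))
print(p.age_on(date(2009, 10, 9)))  # 24
```

This mirrors the point made under term 30 as well: the type determines which operations are legal for its values.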
44. DECISION SUPPORT DATABASE
A database from which data is extracted and analysed statistically (but not modified) in order to inform business or other decisions. This is in contrast to an operational database, which is continuously updated. For example, a decision support database might provide data to determine the average salary of different types of workers, whereas an operational database containing the same data would be used to calculate paycheck amounts. Often, decision support data is extracted from operational databases.

45. DIRECT MEMORY ACCESS (DMA)
Direct memory access (DMA) is a feature of modern computers and microprocessors that allows certain hardware subsystems within the computer to access system memory for reading and/or writing independently of the central processing unit. Many hardware systems use DMA, including disk drive controllers, graphics cards, network cards and sound cards. DMA is also used for intra-chip data transfer in multi-core processors, especially in multiprocessor systems-on-chip, where each processing element is equipped with a local memory (often called scratchpad memory) and DMA is used for transferring data between the local memory and the main memory. Computers that have DMA channels can transfer data to and from devices with much less CPU overhead than computers without a DMA channel. Similarly, a processing element inside a multi-core processor can transfer data to and from its local memory without occupying processor time, allowing computation and data transfer to proceed concurrently.

46. DISTRIBUTED MODEL (DBMS)
In a distributed model, the actual database and the DBMS software are distributed over many sites, connected by a computer network.

47. ENTITY
An entity is something that has a distinct, separate existence, though it need not be a material existence. In particular, abstractions and legal fictions are usually regarded as entities. In general, there is also no presumption that an entity is animate.
Entities are used in system development models that display communications and internal processing of, say, documents compared to order processing. In software engineering, an Entity-Relationship Model (ERM) is an abstract and conceptual representation of data. Entity-relationship modeling is a database modeling method used to produce a type of conceptual schema or semantic data model of a system, often a relational database, and its requirements in a top-down fashion.
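The top-down mapping described above can be sketched in miniature: each entity type becomes a table, and a one-to-many relationship between entity types becomes a foreign key. The entity and attribute names below are illustrative:

```python
import sqlite3

# A tiny ER-to-relational mapping: entity types as tables, a one-to-many
# relationship (one Tourist, many Bookings) as a foreign key column.
entities = {
    "Tourist": ["TouristId INTEGER PRIMARY KEY", "Name TEXT"],
    "Booking": ["BookingId INTEGER PRIMARY KEY",
                "TouristId INTEGER REFERENCES Tourist(TouristId)"],
}

conn = sqlite3.connect(":memory:")
for entity, attributes in entities.items():
    conn.execute(f"CREATE TABLE {entity} ({', '.join(attributes)})")

tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)  # ['Booking', 'Tourist']
```

Real ER modeling also covers many-to-many relationships (which map to a separate junction table) and attribute constraints; this sketch shows only the simplest entity-to-table step.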