DATA AND INFORMATION Data • Can include text, numbers, images, audio and video • Used to generate information Information • Are processed data • Data that is organized, meaningful and useful • Can be in the form of documents, audio, images and video • Ex: bank statements; membership card information; receipt information; stock analyses; credit reports • Computers process data in a database into information. Data Integrity • Identifies the quality of data in a database • The more errors the data contains, the lower its integrity • Data integrity is important because computers and people use information to take actions and make decisions GIGO (Garbage In, Garbage Out) • Points out that the accuracy of a computer’s output depends on the accuracy of the input • Incorrect data will give you incorrect results/information Qualities of Valuable Information • Accurate o Error free; incorrect information lead to incorrect decisions • Verifiable o Can be proven as correct or incorrect • Timely o The information has age suited to its use • Organized o Arranged to suit the needs and requirements of the decision maker • Accessible o Available when the decision maker needs it o Having to wait for information may delay an important decision • Useful Information o Has meaning to the person to the person who receives it o Avoid distributing useless information • Cost-effective o Should give more value than it costs to produce Hierarchy of Data Character ⇒ Represents a single number, letter, punctuation mark or other symbols Field ⇒ A combination of one or more related characters ⇒ Field names uniquely identify each column in a table in Access o Text, Numeric, AutoNumber, Currency, Date, Memo, Yes/No, Hyperlink, Object Record
⇒ represents a collection of attributes that describe a real-world entity. A record consists of fields, with each field describing an attribute of the entity (object, person, place, or event). ⇒ a group of related fields o Primary key – a key field, which uniquely identifies each record in a file File ⇒ A collection of related records stored on a storage medium such as HD, CD, DVD ⇒ Files are frequently classified by the application for which they are primarily used (employee file). Database ⇒ is an integrated collection of logically related records or files. A database consolidates records previously stored in separate files into a common pool of data records that provides data for many applications. The data is managed by systems software called database management systems (DBMS). The data stored in a database is independent of the application programs using it and of the types of secondary storage devices on which it is stored.
NORAMLIZATION ⇒ Normalization organizes a database into one or several normal forms to remove ambiguous relationships between data and minimize data redundancy ⇒ The objective of normalization is to prevent replication of data, with all its negative consequences. ⇒ In a relational database, each table is normalized, which means that o all its fields will contain single data elements o all its records will be distinct, and o each table will describe only a single class of entities ⇒ When a database suffers from poor table organization, experts say it's not normalized. There are rules that govern how a relational database should store its tables. These rules are called the Rules of Data Normalization.
Rule#1: Avoid Repeated Information (Fields/Columns) Let’s say that you want to keep contact information on your customers along with a record of each transaction they make. If you kept it all in one table, you would have to repeat the customer’s full name, address and phone number each time you entered a new transaction! And it would also be a nightmare if the customer’s address changed; you would have to make the change to every transaction. CUSTOMER NAME
CUSTOMER ADDRESS
ABC Plumbing ABC Plumbing ABC Plumbing Jack’s Emporium Jack’s Emporium Millie’s Pizza
201 W. 44th St. 201 W. 44th St. 201 W. 44th St. 1155 Conner Ave. 1155 Conner Ave. 108 Ponting St.
CUSTOMER PHONE (317)555-2394 (317)555-2394 (317)555-2394 (317)555-4501 (317)555-4501 (317)554-2349
ORDER DATE 2/5/96 5/14/96 7/9/96 6/6/95 7/26/96 8/29/96
ORDER TOTAL $155.90 $90.24 $224.50 $1,592.99 $990.41 $39.95
A better way is to assign each customer an ID number. Include that ID number in a table that contains names and addresses (see the Customers Table). Then you can include the same ID number as a link in separate table that contains transactions (see the Orders Table).
Customers Table CUSTOMER ID CUSTOMER NAME 1 ABC Plumbing 2 Jack’s Emporium 3 Millie’s Pizza Orders Table CUSTOMER ID 1 1 1 2 2 3
ORDER DATE 2/5/96 5/14/96 7/9/96 6/6/95 7/26/96 8/29/96
CUSTOMER ADDRESS 201 W. 44th St. 1155 Conner Ave. 108 Ponting St.
CUSTOMER PHONE (317)555-2394 (317)555-4501 (317)554-2349
ORDER TOTAL $155.90 $90.24 $224.50 $1,592.99 $990.41 $39.95
Rule#2: Avoid Redundant Data Let’s say you want to keep track of which employees have attended certain training classes. There are lots of employees and lots of classes. One way to do this would be to keep it all in a single Personnel table, like this: Employee Name Phil Sharp Becky Rowan Nick Gianti Jan Donato Andrea Jafar
Employee Address 211 W. 16th St. 40 Westfield Ct. 559 Ponting St. 720 E. Warren 3904 110th St.
Employee Phone (317)555-4321 (317)555-3905 (317)555-7683 (317)555-2930 (317)555-0293
Training Date 5/5/96 5/5/96 6/15/96 5/5/96 6/15/96
Class Taken Leadership Skills Customer Service Public Speaking Public Speaking Leadership Skills
Credit Hours 3 2 9 9 3
But what if employee takes more than one class? You’d have to add a duplicate line in the table to list it, and then you’d have the problem described as multiple records with identical field entries. And what if the only employee who has taken a certain class leaves the company? When you delete that employee’s record, you delete the information about the class’s credit hours, too. A better way would be to create separate tables for Employees, Classes, and Training. Here are some examples of what such tables might look like: Employee Table Employee ID Employee Name 1 Phil Sharp 2 Becky Rowan 3 Nick Gianti 4 Jan Donato 5 Cynthia Hedge Class Table Class ID C1 C2 C3
Employee Address 211 W. 16th St. 40 Westfield Ct. 559 Ponting St. 720 E. Warren 108 Carroll St.
Class Taken Leadership Skills Customer Service Public Speaking
Training Table Employee ID Training Date 1 5/5/96
Class C1
Credit Hours 3 2 9 Passed? Yes
Employee Phone (317)555-4321 (317)555-3905 (317)555-7683 (317)555-2930 (317)555-5990
Passed? Yes Yes Yes No Yes
2 3 4 5 6
5/5/96 6/15/96 5/5/96 6/15/96 6/15/96
C2 C3 C3 C2 C1
Yes Yes No Yes Yes
Guidelines in developing a database: 1. Determine the purpose of the database 2. Design the tables Design tables on paper first Each table should contain data about one subject Ex. Member table contains data about members only 3. Design the records and fields for each table 4. Be sure every record has a primary key (A Primary key is a key field, which uniquely identifies each record in a file) Use separate fields for logically distinct items (ex. A name could be stored in six fields: Title [Mr., MRs., Dr., etc.] First name Middle NameLast Name Suffix [Jr., Sr., etc.] Nickname
5.
Do not create fields for information that can be derived from entries in other fields. Ex. Do not include a field for Age. Instead store the birthdate and compute the age
Determine the relationship among the tables.