The Er Model

November 2019
PDF

Download

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA

Overview

Download & View The Er Model as PDF for free.

More details

Words: 4,875
Pages: 16

Preview
Full text

2

The ER model In this chapter: • Approaches to database design • Definitions of entity, attribute, and relationship • Representing one-to-many relationships

“How do I start to design a database?” This is a question that students frequently will ask. It’s also a good question for anyone who has just purchased a database software package, and wants to put it to use. We can think of several ways to get started: •

Pick an example from the ones supplied with your software, or use the software’s built-in guides (Access, for example, calls these “wizards”). If your application looks like one of theirs, this may be a good method. But you probably won’t learn the reasoning behind the examples, and some of the “wizards” may not be as smart as they should be. (We’ll show examples of this.)

•

List all of the data elements that you need, and build them into a single table. This is what many naïve designers do, and the result is guaranteed to be a disaster for anyone who has to use it or maintain it. We have seen many “horror-stories” of databases in real organizations that were (non-) designed this way; perhaps you have, too. Actually, listing all of the necessary data isn’t a bad way to start, but you will need to develop the skills to turn the list into a well-designed database. From a software engineering perspective, you could call this a “bottom-up” design strategy. You’ll learn how to do it later in this book.

24

Practical Relational Database Design

•

Think of the real-world enterprise that you are modeling. Try to identify what objects, events, or facts have to be represented; what characteristics of these objects should be stored in the database; and how the objects are related to one another. This could be called a “top-down” design strategy, and we believe that it is the easiest and most effective way to start.

The ER model supports the third design strategy, real-world enterprise modeling. There are only three elements in an ER model: entities, attributes, and relationships.

2.1 Entities An entity is any “thing” that is part of the enterprise and has to be represented in a database. It could be a physical object: a person, a piece of furniture, or an item for sale in a store. It could be a conceptual object: a movie, for example, that has a title, a plot, dialog and action but no physical form until it is stored on film or video tape. It could be an event that happens: a customer places an order, a student enrolls in a class, or a bus makes a stop to pick up passengers. It could simply be a fact: a medical patient’s blood serum level is measured at a specific value. For our first try at ER modeling, let’s consider the enterprise of a library. You’re familiar with this enterprise; you’ve used libraries and you know their purpose and how they operate. What are the Figure 2-1: “things” in this enterprise that might need to be repLibrary entities resented in a database? Well, books would probably be the first to come to mind. But there are also customers (who check out books), librarians (who catalog and maintain the books), shelves (to store the books), reading tables, and probably many more. Of these, which are relevant to a database? Since our purpose in this example is to keep track of which books are currently loaned out to which customers, those two (books and customers) should be the entities we start with. We’ll diagram them as simple labeled rectangles or boxes. As it happens, both of these entities are physical objects, but we will soon show examples of entities that you can’t touch or see. We will use the word “entity” (and the names of entities) in different ways, which normally should be clear from the context. Where it

Chapter 2 The ER model

may not be clear, we will specifically identify which one of the following we are talking about.

Figure 2-2: Entity concepts

•

An entity individual is one specific “thing” (physical or conceptual object, fact, or event) associated with the enterprise. We will always refer to an entity individual in the singular: “a book” or “a customer.”

•

An entity type is a class of individuals associated with the enterprise, which can be described by a finite set of syntactic and semantic rules determined by the enterprise—according to their intrinsic, real-world structure and their meaning in the enterprise. Any individual can be classified as belonging to or not belonging to the entity type by evaluating it according to these rules. The entity type is what is shown in the ER diagram; you might also think of it as a prototype that defines the characteristics of each individual. We will normally refer to the entity type in the singular (BOOK or BOOK type), unless the context of a sentence requires the plural.

•

The active set of an entity type consists of all those individuals of the given type who are currently participating in the enterprise. We will always refer to the active set in the plural (BOOKS). As a student, you might fit the semantic rules for being a library customer—but if you don’t have a library card yet, you aren’t in the active set of library customers. (Notice that you don’t have to have any books checked out to be in the active set of customers.) The concept of entity individuals and active sets will be important when we discuss relationships

25

26

Practical Relational Database Design

and mapping (below). At the ER model level, we are not concerned yet with which specific entity individuals are active in the enterprise—this will be represented in the relational model (Chapter 5) Remember that an entity (individual, type, or active set) is something that exists in the real world, as we show in Figure 2-2. You can’t magically store it in a computer. You have to decide what characteristics of it can be (and need to be) represented in the actual database.

2.2 Attributes In Chapter 1, we claimed that building a model was a process of selecting relevant characteristics of whatever we were modeling. The Civil War soldier was represented by his name, rank, height, eye color, and so on. The pipe, painted by Magritte, was represented by its shape and color. The cities, roads, and other features on the map were represented by lines, symbols, colors, and text. The characteristics of each entity that you select to include in your ER model are called the attributes of that entity. When you are first thinking of attributes, try to decide what (relevant) characteristics are needed to describe the entity. Or to put it another way: what does the client need to know about this entity? Unlike entities, attributes can actually be represented in a computer. We can say that an attribute defines storable values that can be stored on disk, transmitted over a network, and retrieved to an output device in the form of a text string, a number, a digitized picture, an audio or video file, and so on. We can also think of an attribute more formally as a function that assigns a single storable value (from a well-defined set of storable values) to each individual in a set of real-world entities. So name (you) = “Your Name,” birthdate (you) = “1 July 1975” (or whatever it is), and so on. The set of storable values that each attribute can assume is called the domain of that attribute. You will learn more about attribute domains in Chapter 9. Let’s think about the entities in our library model. What characteristics—attributes—describe each of them? What does the client need to know about them? •

Books have titles and authors, but in a library they also have call numbers and possibly copy numbers (if the library has

Chapter 2 The ER model

multiple copies). If you want to find a specific book, you look up the call number and go to the shelf where it is stored. The combination of call number and copy number (if needed) is enough to identify the particular book that you have in your hand. When you check out the book, this is how the library will keep track of it, so it’s clearly an important attribute. For simplicity in this example, we’ll say that the title is also an attribute of the book. Later on, we will think of the title as an attribute of another (conceptual) entity. We’ll avoid the problem of the author attribute for the moment, since books can be written by more than one author. You’ll find out how to handle this situation in the next chapter. You could probably think of more attributes of a book: whether it is cloth-bound or paperback, what size it is, and so on. Some of these may be important to the library: size, for example, if extra-large books are shelved in a separate room. Again, we’ll keep the example simple by excluding these. •

Customers have names, addresses, and telephone numbers, at the minimum (so the library can send out over-due notices). We’ll include these as attributes in our model. Other attributes of a customer such as status (student/faculty/staff) could be added later. But we need to think for a moment about the name attribute. What are the chances that two customers have the same name? If the library is large, this is very likely—and you will need to distinguish between them in some way. We’ll fix the problem by giving each customer an identification number, which we will guarantee is unique to each person. (We’ll look at this problem again below, and in much more detail in Chapter 4.)

We will list the attributes of each entity in the same boxes that we drew before. This makes an uncluttered, easy-to-read graphic presentation, which is also reasonably consistent with many of the modern ER design packages.

Figure 2-3: Customer and book attributes

27

28

Practical Relational Database Design

The attributes that are starred are those that serve to uniquely identify individuals of the entity type. Attributes are defined for each entity type according to the rules of the type. If the attribute value A(j) is defined for an individual j of type E whenever j belongs to the active set for E, then the attribute A is said to be defined for the type E. The attribute set SE for an entity type E is the set of all attributes defined for E. If an attribute is defined for every individual of a given type then it is defined for that type. However, the converse is not true. Some attributes that are defined for the type may not be defined for individuals that are not in the active set. For example, books that are not in the library do not have call numbers. However, every book in the library has a call number. Thus, the attribute call number is defined for the entity type BOOK in the library, even though call number is not defined for every book individual in the universe.

2.3 Relationships Once we have our first entities, and know at least a few of their attributes, we need to decide how they are connected—what they have to do with each other. (If an entity has absolutely nothing to do with any other entity in the model, it shouldn’t be there.) The connection between two entities is called, not surprisingly, a relationship.

2.3.1 Relationships represent mappings between entities A relationship describes the connection between two entity types. Entity Relationship It actually represents a mapping concept concept between members of each entity Entity type Relationship set. When an individual of one Active set Mapping entity set is paired with an individual of another entity set, we Individual Link say that these individuals are linked by the relationship. There- Figure 2-4: Correspondence between entity and relationship concepts fore, the mapping between entity sets is the set of all links between individuals of those entity sets. These concepts are easiest to understand with an example, for which we will use the library’s customers and books.

Chapter 2 The ER model

In general, what is the connection between customers and books; what do they have to do with each other? Clearly, customers Figure 2-5: Relationship between customer and book entities borrow books—or, said the other way, books are borrowed by customers. A relationship always can be described in either direction, and we’ll show it by simply drawing a line between the two related entities. (We’ve labeled the line in the illustration. Some development tools let you do this; others don’t—but you should try to do so whenever possible.) Let’s be a bit more precise in our definition of this relationship. Think about only one customer. How many books can that customer borrow at a given time? Normally, the answer is many—although a given customer might have only one book (or none at all) checked out right now. We are concerned here with the largest number, though, so we will describe the relationship as “each customer borrows many books.” Now think about the relationship the other way, but start with only one book. If we are only concerned with the current status of the book (whether or not it is checked out, and if so, to whom), then each book is borrowed by (at most) one customer at a time. The relationship between the entity types, shown in Figure 2-5 above, represents the mapping of customers (members of the active CUSTOMERS entity set) to books (members of the active BOOKS entity set) that they have currently borFigure 2-6: Mapping Customers to Books rowed. Figure 2-6 illustrates the mapping; each line connecting a customer to a book represents a single link in this mapping.

29

30

Practical Relational Database Design

•

We call this a one-to-many relationship between customers and books. For each customer that participates in the relationship (by checking out books), there can be many books participating in the relationship (by being checked out).

•

But it is a many-to-one relationship between books and customers. For each book that participates in the relationship (by being checked out), there is only one customer participating (by having checked out the book).

The easiest way to understand the one-to-many and many-to-one mappings of a relationship is to always Figure 2-7: ER representation of one-todescribe them with many relationship sentences constructed like the ones we have shown above. Form these sentences in standard English syntax: subject – verb – direct object. •

The subject—one entity name—is always modified by the determiner “each.”

•

The verb describes the relationship.

•

The object—the other entity name—is always modified by a determiner of either “one” or “many.”

•

Do this in both directions! “Each customer borrows many books.” “Each book is borrowed by one customer.”

Actually, the second sentence, “Each book is borrowed...,” is in the form that your English writing teacher would call “passive voice”— with the object coming first and the subject coming after the (passive) verb. In writing, this is usually a mistake; in describing relationships, it’s unavoidable.

Chapter 2 The ER model

2.3.2 Mappings are constrained in number The mapping between entities is always constrained in number as it is here between customers and books. We call this constraint the cardinality of the relationship. (Some books use the term maximum Figure 2-8: Cardinality cardinality.) Informally, we can say that symbols cardinality is a constraint on a relationship that describes the maximum number of individuals in one active set that may be linked to a single individual in the other set. Cardinality may be oneto-many or many-to-one, as in the example above. It may also be oneto-one, for a relationship in which each member of one entity set is mapped to (at most) only one member of the other entity set, in both directions. It may not be many-to-many; you will find out how to handle what appear to be “many-to-many” relationships in the next chapter. The symbols we use to represent “one” and “many” are commonly used (but not standardized) in textbooks and database software.

2.4 Representing relationships: local, surrogate, and foreign attributes Before we complete the model, we have to identify attributes that will represent the relationship. Said another way, for every book that is checked out, we need to know which customer has it (including the customer’s name, address, and phone number). While the book is checked out, the attributes of the customer who has it temporarily become attributes of the book itself. We are actually dealing here with three types of attributes: local, surrogate and foreign. •

Local attribute: An attribute A for an entity type E is local if it is defined from intrinsic properties of E. That means, every individual p in the active set of E has a well defined value A(p) which is obtained by applying the semantic rules of E directly to the individual p. The name, address, and telephone number are local attributes of a CUSTOMER; the call number, copy number, and title are local attributes of a BOOK.

31

32

Practical Relational Database Design

•

Surrogate attribute: A surrogate attribute for an entity type E is an attribute that is manufactured by the database designer and has no relationship to the semantic rules which define E. The customer ID is a good example of a surrogate attribute. Surrogate attributes exist solely for the convenience of the database designer, and have no descriptive value. In this case, the surrogate attribute is designed to insure that we can distinguish between two customers who have the same name. We will look at surrogate attributes in detail in Chapter 4.

•

Foreign attribute: An attribute A of an entity type E is a foreign attribute if it imports its definition from another entity type F. The foreign attribute A can only be defined when E relates to another type F via a relationship r that is one-to-one or many-to-one. Moreover, r must map each individual in the active set of E to some individual q in type F. (That is, E’s participation in the relationship is mandatory.) In this case attribute A is defined as follows. For each individual p in the active set of E, A(p)= A'(q) where A’ is an attribute of F and q is the unique individual of type F that is paired with p by the relationship r. All of the attributes of the customer (customer ID, name, address, phone) become foreign (imported) attributes of the book when the book is checked out.

Although these definitions are wordy, the concepts are straightforward. Local attributes are intrinsic to the type—home grown. Surrogate attributes are useful but artificial creations. Foreign attributes are defined outside the entity type in which they appear, and their meaning is imported.

2.4.1 Each attribute is local to one entity An important design guideline is that every attribute must be local to at most one entity type. We certainly don’t want to repeat all of the information about a customer in the BOOK entity. Fortunately, we don’t need to. We designed the customer identification so that it uniquely identifies each customer. (It is called the primary key of the customer entity, and is starred in the ER diagram. We’ll define this term in Chapter 4.) So all we need to know (about a checked-out book) is the identification number of the customer who has checked it out. Given that, we can always find other attributes of the customer

Chapter 2 The ER model

(name, address, etc.) because they are part of the customer entity. Remember how the soldier in our Civil War example could have accomplished the same thing by using an officer’s initials to save writing the name over and over. In other words, the customer ID is the only foreign attribute that has to appear in the BOOK entity; all of the other customer attributes will be inherited by the book along with it. This special foreign attribute, used to represent the relationship between a customer and a book, is called a foreign key (which we will also define in Chapter 4). A primary key value and a matching foreign key value form the link between two individuals of their respective entity types. In our ER diagrams, we show foreign keys in italic, so you can immediately see their function. (Some designers add “fk” to foreign key attribute names.) Notice that each customer ID will appear as a member of the active CUSTOMERS entity set only once (remember that we designed it to be guaranteed unique). Importantly, the local attributes associated with that customer ID (name, address, and phone) will also appear only once in the database, as our design guideline above suggested. But because the relationship of customers to books is one-to-many, the same customer ID might appear as a foreign key attribute value of many books. When the customer returns a book, the customer ID attribute value will be removed from that book. In other words, a book that is not checked out will not have a customer ID associated with it. When an attribute value like this is missing we call it null. The special constant value null, compatible with the domain of any attribute in a relational database, means that the attribute is undefined for this entity individual.

2.4.2 This model has a problem Although our model at this point has correctly represented the one-tomany relationship between customers and books—we can show which customer has which book checked out—there is still a fairly serious problem: where should we put the due date? Clearly, this is not an attribute of the CUSTOMER entity, since each customer may have many books checked out. Almost as clearly, it really doesn’t make much sense as an attribute of the BOOK, either—although for the moment, this is the only place to put it. (Notice that the due date, like

33

34

Practical Relational Database Design

the customer ID, is entered only if a book is checked out; otherwise, it will be null.) What has happened is that we’ve not yet identified all of the entities that are needed to model the checkout process. This is Figure 2-9: Library, Version 1 (incorrect) realistic—complete models don’t just appear, they are built step-by-step. An apparent problem like the awkwardly located and frequently null customer ID and due date can help you to identify where a model needs further work. We’ll refine this one in the next chapter; for the time being, our library ER model looks like Figure 2-9.

2.5 Summary: one-to-many relationships The one-to-many relationship (or many-to-one, read in the other direction) is the most common relationship in the ER model. As a final example, we will look at another enterprise: a medical clinic. We wish to track the level of various substances (for example, cholesterol or alcohol) in the blood of patients. To do this we develop a model with two entities: PATIENT and BLOOD SAMPLE. Within this model there are local, surrogate and foreign attributes. BLOOD SAMPLE has the local attributes Substance, Date that the sample Figure 2-10: Patient and Blood Sample entities was taken, and Level that was measured. PATIENT has the local attribute patient name and the surrogate attribute patient ID, a number defined by the hospital to identify the patient and protect privacy. It is important to note that patient ID can also be defined for every BLOOD SAMPLE individual. Each sample is related to exactly one patient, although each patient may give many samples. So,

Chapter 2 The ER model

patient ID (of a particular blood sample) = patient ID (of the patient who gave the sample). The surrogate attribute patient ID is the primary key of PATIENT. Thus the single foreign attribute patient ID in BLOOD SAMPLE constitutes a foreign key. The attribute in BLOOD SAMPLE that corresponds to patient ID need not have the same name in the ER model, although we usually give matching attributes the same name. The designer could call it donor ID if this seems more appropriate. In that case: donor ID (of a particular blood sample) = patient ID (of the patient who gave the sample). So far, we’ve seen only relationships in which at least one side has a cardinality of one. In fact, we’ve said that relationships can only be one-to-many, many-to-one, or one-to-one. What happens if both sides of a relationship appear to have a value of many? We’ll look at this situation in the next chapter.

2.6 Glossary Entity: any “thing” that is part of the enterprise and has to be represented in a database. (p. 24) Entity individual: one specific “thing” (physical or conceptual object, fact, or event) associated with the enterprise. (p. 25) Entity type: a class of individuals associated with the enterprise, which can be described by a finite set of syntactic and semantic rules determined by the enterprise—according to their intrinsic, real-world properties and their meaning (semantics) in the enterprise. (p. 25) Active set (of an entity type): all those individuals of the given type who are currently participating in the enterprise. (p. 25) Attribute: a characteristic of an entity that you select to include in your ER model. Formally, an attribute is a function that assigns a single storable value (from a well-defined set of storable values) to each individual in a set of real-world entities. (p. 26) Domain: the set of storable values that each attribute can assume. (p. 26)

35

36

Practical Relational Database Design

Relationship: a description of the connection between two entity types, which represents a mapping of members of one active entity set to members of the other active entity set. (p. 28) Mapping: the set of all links between individual members of the active sets of two entity types. (p. 28) Link: a pairing of an individual of one active entity set with an individual of another active entity set. (p. 28) Cardinality: a constraint on a relationship that describes the maximum number of individuals in one active entity set that may be mapped to a single individual in the other set. Cardinality may be oneto-many, many-to-one, or one-to-one. (p. 31) Local attribute: an attribute of an entity type E that is defined from intrinsic properties of E. (p. 31) Surrogate attribute: an attribute of an entity type E that is manufactured by the database designer and has no relationship to the semantic rules that define E. (p. 32) Foreign attribute: an attribute of an entity type E that imports its definition from another entity type F. (p. 32) Null: a special constant value, compatible with the domain of any attribute in a relational database, which means that the attribute is undefined for this entity individual. (p. 33)

2.7 Exercises 2-1. Any university database will have an entity type called STUDENT. Develop a model of this entity type, including any attributes which you think are appropriate. To demonstrate the concept of storable values, try to include some attributes that cannot be represented in text (character) form. Describe in your own words what (or who) a set of students might include, and give an example of an individual member of the student entity set. 2-2. Many large companies are organized into different departments (for example, sales, manufacturing, administration, and so on). Employees are typically hired by (or administratively assigned to) only one department, regardless of where they actually work. Create the ER diagram of employees, departments, and the relationship

Chapter 2 The ER model

between them. Don’t forget to show local and foreign attributes, and the cardinality of the relationship. 2-3. You are developing a database which will contain information about cities and states in the United States. Each city is located in only one state (Texarkana TX is different than Texarkana, AR). Create the ER diagram of cities, states, and the relationship between them. Again, include local and foreign attributes, and the cardinality of the relationship. 2-4. Think of an activity or enterprise that you are interested in. It could be something from home, school, work, or recreation. Identify two entities in this enterprise that have a one-to-many relationship, and develop the ER diagram to illustrate this. 2-5. Build the tables of the library or the patients database, as we have shown them in this chapter. If your software permits setting up relationships at design time, link the tables on their related attributes. • Enter data in each of the tables. Be sure to illustrate how customer IDs may be copied several times into the BOOKS table (and patient IDs into the BLOOD SAMPLES table). •

Develop a query that demonstrates how the one-to-many relationship is actually represented by the data. Normally, the query result will show only the books that are currently checked our (compare with the BOOKS table) and the customers that currently have books checked out (compare with the CUSTOMERS table). Review Figure 2-5 for a conceptual view of the mapping between these entities.

•

What happens if you enter a customer ID in the BOOKS table that isn’t already entered in the CUSTOMERS table? (Results may vary, depending on your software.)

37

38

Practical Relational Database Design

The Er Model

Overview

More details

Related Documents

The Er Model

Er

Er

Er

Er

Er