DB 문서들
DSN 갤러리
DBMS Columns 85 게시물 읽기
 News | Q&A | Columns | Tutorials | Devel | Files | Links
No. 85
Entity Relationship Modeling from an ORM Perspective: Part 1 -- By Dr. Terry Halpin
2001-11-19 20:58

원본 출처 :<br><hr noshade width=100%><br>



<p align="center"><strong><font face="Arial Black">Entity

Relationship Modeling from an ORM Perspective: Part 1</font> -- <em><font face="Arial">By Dr. Terry



<font face="Arial Black">Introduction</font>

<p><font face="Arial">Entity Relationship modeling (ER) views the

application domain in terms of entities that have attributes and participate

in relationships. For example, the fact that an employee was born on a date

is modeled by assigning a birthdate attribute to the Employee entity type,

whereas the fact that an employee works for a department is modeled as a

relationship between them. This view of the world is quite intuitive, and in

spite of the recent rise of UML for modeling object-oriented applications,

ER is still the most popular data modeling approach for database


<p><font face="Arial">The ER approach was originally proposed by Peter Chen

in 1976, in the very first issue of an influential ACM journal [2]. As shown

in Figure 1, Chen's original notation used rectangles for entity types,

diamonds for relationships, and ellipses for attributes. The double ellipse

indicates unique identifier attributes, and the "n and "1"

indicate the relationship is many to one (each employee works for at most

one department, but many employees may work for the same department).</font></p>

<p><img border="0" src="graphics/halpin1.gif" width="318" height="105"></p>

<p><font size="2"><b>Figure 1</b> <i>An early ER notation used by Chen</i></font></p>

<p><font face="Arial">The direction in which relationship names are to be

read is formally undecided, unless we add some additional marks (e.g.

arrows) or rules (e.g. always read from left to right and from top to

bottom). For example, does the employee work for the department, or does the

department work for the employee? Although we can use our background

knowledge to informally disambiguate this example, it is quite common

nowadays to see ER models with relationships whose intended direction can

only be guessed at by anybody other than the model's creator. For example,

consider the impact of misreading the intended direction for the following:

Person killed Animal; Person is loved by Person. This problem is exacerbated

if the verb phrase used to name the relationship is shortened to one word

(e.g. "work", "love"), unfortunately still a fairly

common practice.</font></p>

<p><font face="Arial">Chen's notation evolved over time. His current

ER-Designer tool uses hexagons instead of diamonds, and supports n-ary

relationships. Outside academia however, Chen's notation seems to be rarely

used nowadays, so I'll say no more about it here. One of the problems with

the ER approach is that there are so many versions of it, with no single

standard. In industrial practice, the most popular versions of ER are the

Oracle and Information Engineering (IE) notations. Another popular data

modeling notation is IDEF1X, but since this is a hybrid of ER and relational

notation, I don't count it as a true ER representative. As discussed in

earlier articles [3], UML class diagrams can be regarded as an extended

version of ER. The rest of this article focuses on basic aspects of the

Oracle notation for ER. Later articles will examine IE and IDEF1X.</font></p>

<p><font face="Arial Black">Oracle ER: the basics</font></p>

<p><font face="Arial">The Oracle ER notation has been used for many years in

CASE tools from Oracle Corporation, and the discussion here is based on the

classic treatment by Richard Barker [1]. Oracle's Object Designer tool now

supports UML as an alternative to its traditional ER notation. For database

applications, many modelers still prefer Oracle's ER notation in preference

to UML, and it will be interesting to see whether this changes over time.

Dave Hay, an experienced modeler and ardent fan of the Oracle ER notation,

argues that "there is no such thing as 'object-oriented analysis'"

[5], only object-oriented design, and that "UML is ?not suitable for

analyzing business requirements in cooperation with business


<p><font face="Arial">While I agree with Dave Hay that UML class diagrams

are less than ideal for data modeling, I feel that his preferred ER notation

shares some of UML's weaknesses in being attribute-based. As I've discussed

before in a UML context [3, 4], using attributes in a base conceptual model

adds complexity and instability, while making it harder to validate models

with domain experts using verbalization and sample populations. Attributes

are great for logical design, since they allow compact diagrams that

directly represent the data structures (e.g. relations or object-relations)

used for the actual design. However when I'm performing conceptual analysis,

I just want to know what the facts and rules are about the business, and I

want to communicate this information in sentences, so that the model can be

understood by the domain experts. I sure don't want to bother about how

facts are grouped into multi-fact structures. Whether some fact will end up

in the design as an attribute is not a conceptual issue to me. As Ron Ross

says, "Sponsors of business rule projects must sign off on the

sentences-not on graphical data models. Most methodologies and CASE tools

have this more or less backwards" [6, p.15]. The ORM reporting

facilities in Visio Enterprise allow the domain expert to inspect ORM models

fully verbalized into sentences with examples, making validation much easier

and safer.</font></p>

<p><font face="Arial">Now that I've stated my bias up front, let's examine

the Oracle ER notation itself. The basic conventions are illustrated in

Figure 2. Entity types are shown as soft rectangles (rounded corners) with

their name in capitals. Attributes are written below the entity type name.

Some constraint information may appear before an attribute name. A

"#" indicates that the attribute is the primary identifier of the

entity type, or at least a component of its primary identification scheme. A

"*" or heavy dot "?quot; indicates the attribute is

mandatory (i.e. each instance in the database population of the entity type

must have a non-null value recorded for this attribute). A "?quot;

indicates the attribute is optional. Some modelers also use a period

"." to indicate the attribute is not part of the identifier.</font></p>

<p><img border="0" src="graphics/halpin2.gif" width="280" height="85"></p>

<p><font size="2"><b>Figure 2</b> <i>The basic ER notation used by Oracle</i></font></p>

<p><font face="Arial">Relationships are restricted to binaries (no unaries,

ternaries or longer relationships), and are shown as lines with a

relationship name at the end from which that relationship name is to be

read. This name placement overcomes the ambiguous direction problem

mentioned earlier. Both forward and inverse readings may be displayed for a

binary relationship, one on either side of the line. This makes the notation

superior to UML for verbalizing relationships.</font></p>

<p><font face="Arial">From an ORM perspective, each end (or half) of a

relationship line corresponds to a role. Like ORM, Oracle ER treats role

optionality and cardinality as distinct, orthogonal concepts, instead of

lumping them together into a single concept (e.g. multiplicity in UML). A

solid line-half denotes a mandatory role, and a dotted line-half indicates

an optional role. For cardinality, a fork or crow's foot intuitively

indicates "many", by its many "toes". The absence of a

crow's foot intuitively indicates "one". The crow's foot notation

was invented by Gordon Everest, who originally used the term "inverted

arrow" but now just calls it a "fork". The basic

correspondence with the ORM notation for uniqueness and mandatory role

constraints is shown in Figure 3.</font></p>

<p><img border="0" src="graphics/halpin3.gif" width="290" height="154"></p>

<p><font size="2"><b>Figure 3</b> <i>The ER diagram (a) is equivalent to the

ORM diagram (b)</i></font></p>

<p><font face="Arial">If each of the two roles in a binary association may

be assigned one of optional/mandatory and one of many/one, there are sixteen

patterns. The equivalent Oracle ER and ORM diagrams for the first eight of

these cases are shown in Figure 4.</font></p>

<p><img border="0" src="graphics/halpin4.gif" width="455" height="470"></p>

<p><font size="2"><b>Figure 4</b> <i>Some equivalent cases</i></font></p>

<p><font face="Arial">The other eight cases are shown in Figure 5. Although

all eight are legal in ORM, the last case where both roles of a many:many

relationship are mandatory is considered illegal in Oracle.</font></p>

<p><img border="0" src="graphics/halpin5.gif" width="455" height="478"></p>

<p><font size="2"><b>Figure 5</b> <i>Other equivalent cases</i></font></p>

<p><font face="Arial">Ring associations that are illegal in Oracle are shown

in Figure 6(a). Although rare, they sometimes occur in reality, so should be

allowed at the conceptual level, as permitted in ORM. As an exercise, you

may wish to invent satisfying populations for the ORM associations in Figure

6 (b).</font></p>

<p><img border="0" src="graphics/halpin6.gif" width="454" height="189"></p>

<p><font size="2"><b>Figure 6</b> <i>Illegal ring associations in Oracle (a)

that are rare but allowed in ORM (b)</i></font></p>

<p><font face="Arial">In Oracle ER, a bar "|" across one end of a

relationship indicates that the relationship is a component of the primary

identifier for the entity type at that end. In Figure 7 for example,

Employee and Building have simple identifiers, but Room has a composite

reference scheme, being identified partly by its room number and partly by

the building in which it is included.</font></p>

<p><img border="0" src="graphics/halpin7.gif" width="377" height="85"></p>

<p><font size="2"><b>Figure 7</b> <i>Room is identified by combining its

room nr and its relationship to Building</i></font></p>

<p><font face="Arial">The use of identification bars provides some of the

functionality afforded by external uniqueness constraints in ORM. For

example, the schemas in Figure 8 are equivalent. The other attributes of

Room and Building in ORM would be modeled in ORM as relationships. ORM's

external uniqueness notation seems to me to convey more intuitively the idea

that each RoomNr, Building combination is unique (i.e. refers to at most one

room). But maybe I'm biased. At any rate, this constraint (as well as any

other graphic constraint) can be automatically verbalized in natural


<p><img border="0" src="graphics/halpin8.gif" width="463" height="127"></p>

<p><font size="2"><b>Figure 8</b> <i>Composite identification in Oracle ER

(a) and ORM (b)</i></font></p>

<p><font face="Arial">Some people misread the bar notation for composite

identification as a "1", since this is what the symbol means in

many other ER notations. But this isn't a problem if you don't have to work

with multiple versions of ER. The main problem with the bar notation is that

it doesn't generalize, since it can't be used for declaring composite

identification schemes involving only attributes (no relationships). Because

ORM always uses relationships instead of attributes, it doesn't have this

problem. As an example, suppose we wanted to model the information shown in

Table 1, as well as other facts about rooms.</font></p>

<p><font size="2"><span style="mso-bidi-font-family: Arial"><a name="_Ref469051116"><b><span style="mso-bidi-font-family:

Arial">Table <span style="mso-field-code: SEQ Table \* ARABIC \s 1; mso-bidi-font-family: Arial">1</span></span></b></a><span style="mso-tab-count: 2; mso-bidi-font-family: Arial">

</span><span style="mso-bidi-font-family:Arial;font-weight:normal"><i>A

simple data use case for room scheduling</i></span></span></font></p>

<table border="1" width="59%" cellspacing="0" cellpadding="0">


<td width="23%" bgcolor="#C0C0C0">Room</td>

<td width="28%" bgcolor="#C0C0C0">Time</td>

<td width="23%" bgcolor="#C0C0C0">ActivityCode</td>

<td width="59%" bgcolor="#C0C0C0">ActivityName</td>



<td width="23%">20</td>

<td width="28%">Mon 9 am</td>

<td width="23%">VMC</td>

<td width="59%">VisioModeler class</td>



<td width="23%">20</td>

<td width="28%">Tue 2 pm</td>

<td width="23%">VMC</td>

<td width="59%">VisioModeler class</td>



<td width="23%">33</td>

<td width="28%">Mon 9 am</td>

<td width="23%">AQD</td>

<td width="59%">ActiveQuery demo</td>



<td width="23%">33</td>

<td width="28%">Fri 5 pm</td>

<td width="23%">SP</td>

<td width="59%">Staff party</td>



<td width="23%">...</td>

<td width="28%">...</td>

<td width="23%">...</td>

<td width="59%">...</td>



<p><font face="Arial">The table suggests that rooms can be simply identified

by room numbers, so let's accept that. One way of modeling the situation in

Oracle ER is shown in Figure 9. Here the bar notation is used to show that

RoomTimeSlot is identified by combining its time and room number.</font></p>

<p><img border="0" src="graphics/halpin9.gif" width="307" height="178"></p>

<p><font size="2"><b>Figure 9</b> <i>An ER diagram for room scheduling</i></font></p>

<p><font face="Arial">The use of attributes in this model makes it hard to

verbalize and populate the schema for validation purposes. Moreover, there

is at least one constraint missing. Compare this with the populated ORM

model for the same situation. Here the facts are naturally verbalized as a

ternary (Room at Time is used Activity) and a binary (Activity has

ActivityName). The associated fact tables include the original facts, as

well as counter-facts (italicized) to test the constraints. The first

counter-row (20, Mon 9 am, AQD) tests the uniqueness constraint that a room

at a time is used for at most one activity. The second counter-row tests the

uniqueness constraint that at most one room can be used for a given activity

at a given time. This constraint may well be wrong, but at least we can

express it and test it in ORM. With the ER model there is no way of even

specifying the constraint, much less testing it. The counter rows (SP, Sales

phonecalls) and (PTY, Staff party) are designed to check the uniqueness

constraints that each Activity has at most one Activity name and vice versa.

If these are rejected, the association really is 1:1, as its basic

population suggests. Since the ER notation being discussed doesn't include a

way of indicating that attributes other than the primary identifier are

unique, it isn't very helpful here. As a small point, the Y2K row has been

added to the original population to indicate that it is possible for some

listed activities to be unscheduled.</font></p>

<p><img border="0" src="graphics/halpin10.gif" width="413" height="213"></p>

<p><font size="2"><b>Figure 10</b> <i>An ORM diagram for room scheduling,

with sample and counter data</i></font></p>

<p><font face="Arial">In case it looks like I'm just bashing attribute-based

approaches like ER in this article, let me say again that I find

attribute-based models useful for compact overviews and for getting closer

to the implementation model. However I generate these by mapping from ORM,

which I use exclusively for conceptual analysis. This makes it easier to get

the model right in the first place, and to modify it as the underlying

domain evolves. Unlike ER (and UML for that matter), ORM was built from a

linguistic basis, and its graphic notation was carefully chosen to exploit

the potential of sample populations. To reap the benefits of verbalization

and population for communication with and validation by domain experts, it's

better to use a language that was designed with this in mind. An added

benefit of ORM is that its graphic notation can capture many more business

rules than popular ER notations.</font></p>

<p><font face="Arial Black">Next issues</font></p>

<p><font face="Arial">Later articles in this series will consider more

advanced aspects of the Oracle ER notation, including exclusion constraints,

frequency constraints, subtyping and non-transferable relationships, and

then examine the Information Engineering notation for ER, before concluding

with a discussion of IDEF1X.</font></p>

<p><font face="Arial Black">References</font></p>


<li><font face="Arial">Barker, R. 1990, CASE*Method: Tasks and

Deliverables, Addison-Wesley, Wokingham, England.</font></li>

<li><font face="Arial">Chen, P.P. 1976, 'The entity-relationship

model-towards a unified view of data', ACM Transactions on Database

Systems, vol. 1, no. 1, pp. 9-36.</font></li>

<li><font face="Arial">Halpin, T.A. 1998-9, 'UML data models from an ORM

perspective: Parts 1-10', Journal of Conceptual Modeling, InConcept,

Minneapolis USA.</font></li>

<li><font face="Arial">Halpin, T.A. & Bloesch, A.C. 1999, 'Data

modeling in UML and ORM: a comparison', Journal of Database Management,

vol. 10, no. 4, Idea group Publishing Company, Hershey, USA, pp. 4-13.</font></li>

<li><font face="Arial">Hay, D.C. 1999, 'There is no object-oriented

analysis', DataToKnowledge Newsletter, vol. 27, no. 1, Business Rule

Solutions, Inc., Houston TX, USA.</font></li>

<li><font face="Arial">Hay, D.C. 1999, 'Object orientation and information

engineering: UML', The Data Administration Newsletter, no. 9, (June

1999), ed. R.S. Reiner, available online at <a href=""></a>.</font></li>

<li><font face="Arial">Ross, R.G. 1998, Business Rule Concepts, Business

Rule Solutions, Inc., Houston TX, USA.</font></li>





<td width="100%" valign="top" style="text-align: justify">




<p><em>Dr Terry Halpin, BSc, DipEd, BA, MLitStud, PhD, is a Program Manager in

Database Modeling for the Enterprise Frameworks and Tools Unit, Microsoft Corporation, USA., Seattle WA, USA.

During a lengthy career as an academic in computer science, he also worked in industry on

database modeling technology and as a data modeling consultant. His recent positions

include head of database research at Asymetrix Corporation, and research director of

InfoModelers Inc., which was acquired by Visio Corporation. For several years, his

research has focused on conceptual modeling and conceptual query technology for

information systems, using a business rules approach. Dr Halpin has presented papers and

tutorials at many international conferences. His doctoral thesis provided the first full

formalization of Object-Role Modeling (ORM/NIAM), and his publications include over

ninety technical papers, as well as four books, including Information Modeling

and Relational Databases (Morgan Kaufmann, 2001).</em></p>


<p><em>Contact Information:</em></p>


<p><em>Dr Terry Halpin



Program Manager, Database Modeling&nbsp;&nbsp;&nbsp; <br>

Enterprise Framework & Tools Unit, Microsoft Corporation



One Microsoft Way<br>

Redmond WA 98052-6399 (USA)<br>

<a href=""></a><br>

(425) 705 9190<br>

fax: (425) 936 7329<br>

<a href=""></a></em></p>






116생명 정보공학이란 무엇인가?
114기지개 펴는 객체 데이터베이스
100Linux May Be More Ready Than You Think
85Entity Relationship Modeling from an ORM Perspective: Part 1 -- By Dr. Terry Halpin
54SQLite 2.0
32Open source Database 가 충분하지 못할 때
Valid XHTML 1.0!
All about the DATABASE... Copyleft 1999-2023 DSN, All rights reserved.
작업시간: 0.057초, 이곳 서비스는
	PostgreSQL v14.6으로 자료를 관리합니다