Home > Link, Microsoft SQL Server, MSBI, Optimization, Query, SQL Mentalist, SQL PraRup, SQL Query, SQL Server, SQL Tricks, Technology,, Vishal Pawar > BISQL – Laymen to SQL Developer # 30 – Functional Dependencies and Normalization For Relational Databases #1 – Introduction, Information Design Guide Lines for Relational Database

BISQL – Laymen to SQL Developer # 30 – Functional Dependencies and Normalization For Relational Databases #1 – Introduction, Information Design Guide Lines for Relational Database

Hi Folks,

This post is part of Series Database Management Systems

Currently running topic for this series is listed as below :

Series of Database Management Systems

>>Chapter 1 : DBMS [Database Management Systems]

>>Chapter 2 : Database Core Concepts and  Applications

>>Chapter 3 : Record Storage and Primary File Organization

>>Chapter 4 : Index Structures of Files

>>Chapter 5 : Entity-Relationship Model

>>Chapter 6 : Relational Algebra

>>Chapter 7 : SQL

>>Chapter 8 : Functional Dependencies and  Normalization For Relational Databases<You are Here>

Continuing from my previous post on this series.

We are going to Cover the Following Points in this article

  • Introduction
  • Information Design Guide Lines for Relational Database

Introduction

Normalization is the process of building database structures to store data, because any application ultimately depends on its data structures. If the data structures are poorly designed, the application will start from a poor foundation. This will require a lot more work to create a useful and efficient application. Normalization is the formal process for deciding which attributes should be grouped together in a relation. Normalization serves as a tool for validating and improving the logical design, so that the logical design avoids unnecessary duplication of data, i.e. it eliminates redundancy and promotes integrity. In the normalization process we analyze and decompose the complex relations into smaller, simpler and well-structured relations.

Information Design Guide Lines for Relational Database

Some criteria for good and bad relation schemas are:

· Semantics of attributes

· Reducing the redundant values in tuples

· Reducing the null values in tuples

· Disallowing spurious tuples.

Semantics of the Attributes:

Understanding the meaning of the attribute values in the tuple is the attribute values in the tuple are referred as semantics. Addition to this semantic also specifies how they are related to one another. Whenever we group attributes to form a relation, we assume that a certain meaning is associated with the attributes. This meaning is called Semantics, and specifies how the attribute values in a tuple relate to one another.

E.g.: consider company database schema. The various relations considered for this database are:

EMPLOYEE

ENAME

SSN

BDATE

ADDRESS

DNUMBER

DEPARTMENT f.k

clip_image002

DNAME DNUMBER DMGRSSN

Fig. Simplified version of the COMPANY relational database schema

The meaning of the Employee relation is quite simple, each tuple represents an employee. The Dnumber attribute is a foreign key that represents an implicit relationship between EMPLOYEE and DEPARTMENT relations.

Guideline-1: design a relation schema so that it is easy to explain its meaning. Do not combine attributes from multiple entity types and relationship types into a single relation.

Reducing redundant values on tuples:

Storage space is one of the most important considerations of a relational schema. Improper grouping of attributes has a significant effect on the storage space of the relational schema.

Emp.no

Emp.Name

Salary

Address

       

Figure A

Dept_no Dname D_location

Figure B

In figure B each department information appears only once in the department relation.

If we integrate figure (A) and figure (B) as single table Emp_dept.

Emp.no

Emp.Name

Salary

Addr

Dept.no

D.Name

D.loc

             

Figure C: Emp_dept

There will be serious problem in using Figure C; that is insertion anomalies, deletion anomalies and modification anomalies.

Here whenever we are inserting tuples, there maybe n employees in department 10, Dept.no, D.name, D_loc values are repeated n times, which leads to data redundancy.

Update anomalies: Update anomalies are those problems which arise from the data redundancy of the un-normalized database table.

The following are the Update anomalies.

· Insertion Anomalies:

It is difficult to insert a ne department that has no employees as yet in the Emp_dept relation. This causes a problem because Emp.no is the primary key of Emp_dept. This problem does not occur in the design of fig.(B), because a department is entered in the DEPARTMENT relation, whether or not any employee works for it.

· Deletion Anomalies:

If we deletie the lost employee of a department from the emp_dept relation, than the whole information about that department will be lost. This problem does not occur in the database of fig.(B) because DEPARTMENT tuples are stored separately.

· Modification Anomalies:

In Emp_dept. if we change the value of one of the attributes of a particular department, say location of department 5, we must update the tuples of employees who work in that department, otherwise DB will become inconsistent.

Guide-line 2:

Design DB so that no insertion, deletion or modification anomalies are present in that relation. If there are any anomalies, note them clearly, so that proper actions can be taken.

NULL values in tuples:

These include unnecessary attributes in the relation. If many of the attributes do not take any values, we insert NULL values. This can waste space at the storage level, and it also leads to problems in understanding the meaning of the attributes and specifying join operation. Null’s may lead to counting problems while using aggregate functions.

Guideline 3:

As far as possible avoid using NULL values for attributes in a relation.

Disallowing spurious tuples:

Design relational schema so that they can be joined with equality conditions.

Emp_loc

Emp_Name P_loc

Emp_project

SSN

PNO

P_Name

P_Loc

Figure B

If we attempt a natural join operation on figure A and Figure B, the result produces many more tuples than the actual combination of tuples. Additional tuples are called Spurious Tuples,_ because they represent wrong information.

Guideline 4:

Design relation schemas so that they can be joined with equality conditions on attributes that are either primary key or foreign key. It guarantees that no spurious tuples are generated.

Hope you will like Series of Database Management Systems series !

If you have not yet subscribe this Blog , Please subscribe it from “follow me” tab !

So that you will be updated @ real time and all updated knowledge in your mail daily for free without any RSS subscription OR news reading !!

Happy Learning and Sharing !!

For More information related to BI World visit our all Mentalist networks Blog

SQL Server Mentalist … SQL Learning Blog

Business Intelligence Mentalist … Business Intelligence World

Microsoft Mentalist … MVC,ASP.NET, WCF & LinQ

MSBI Mentalist … MS BI and SQL Server

NMUG Bloggers …Navi Mumbai User Group Blog

Architectural Shack … Architectural implementation and design patterns

DBA Mentalist …Advance SQL Server Blog

MVC Mentalist … MVC Learning Blog

Link Mentalist … Daily Best link @ your email

Infographics Mentalist … Image worth explaining thousand Words

Hadoop Mentalist … Blog on Big Data

BI Tools Analysis … BI Tools

Connect With me on

| Facebook |Twitter | LinkedIn| Google+ | Word Press | RSS | About Me |

Advertisement
  1. No comments yet.
  1. No trackbacks yet.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: