BISQL – Laymen to SQL Developer # 16 – Record Storage and Primary File Organization #1 – Introduction, Memory Hierarchy, Secondary Storage Devices
Hi Folks,
This post is part of Series Database Management Systems
Currently running topic for this series is listed as below :
Series of Database Management Systems
>>Chapter 1 : DBMS [Database Management Systems]
>>Chapter 2 : Database Core Concepts and Applications
>>Chapter 3 : Record Storage and Primary File Organization <You are Here>
Continuing from my previous post on this series, If you have missed any link please visit link below
We are going to Cover the Following Points in this article
- Introduction
- Memory Hierarchy
- Secondary Storage Devices
Introduction
In this unit you will study about the definition of the storage media and how the data is stored in the database. You will also come to know the difference between the conventional file systems and database management systems.
The collection of data is stored on some computer storage medium. The DBMS software can then retrieve update and process this data as needed. Computer storage media includes two main categories:
Ø Primary Storage: This category includes storage media that can be operated directly by the computer central processing unit [CPU], such as the computer main memory and smaller but faster cache memories.
Ø Secondary storage devices include magnetic disks, optical disks, tapes and drums, and usually are of larger capacity, cost less and provide slower access to data than primary storage devices. Data in secondary storage cannot be processed directly by the CPU; it must first be copied into primary storage.
The following reasons show why databases are stored on secondary storage.
Ø Databases are too large to fit entirely in main memory.
Ø Secondary storage devices are nonvolatile storage, whereas main memory is often called volatile storage.
Ø The cost of storage per unit of data is less for disk than for primary storage.
Ø Indexes are used to speed up the retrieval of records.
Ø Indexes can be created using one or more columns, providing the basis for both rapid random lookups and efficient ordering of access to records.
Ø The disk space required to store the index is typically less than the storage of the table (since indexes usually contain only the key-fields according to which the table is to be arranged, and exclude all the other details in the table).
Ø Index file consists of two fields, the first field contains the value and second field contains the list of pointers to address values in the disk block.
Ø Searching an index is much faster than searching the table because the index is sorted and its rows are very small.
Ø Index access structure is usually defined on a single field of a file, called an indexing field.
Memory Hierarchy
The orderly arrangement o f the storage in t e system architecture is called the memory hierarchy.
The memory hierarchy has following level:
« Processor register: it is usually used for the fastest possible access at the rate of 1 CPU cycle. It can access only hundreds of bytes in size.
« Level 1 Cache: level 1 cache is often accessed in just a few cycles, usually in tens of kilobytes.
« Level 2 cache: It has got higher latency than L1 by 2 times to 10 times in
512 KiB or more (KiB=KibiByte refered as Kilo Binary bytes, 1 Kibibyte =
210 bytes. Its value is nearer to kilobyte.
« Main memory: it can take hundred of cycles in multiple gigabytes.
« Disk storage: it has the latency of million of cycles, but very large.
« Tertiary storage: Latency of several records can be huge.
Secondary Storage Devices
This section describes some characteristics of magnetic disk and magnetic tape storage devices.
Hardware Description of Disk Devices:
« Magnetic disks are used for storing large amounts of data.
« The most basic unit of data on the disk is a single bit of information. By magnetizing an area on disk in certain ways, we can represent a bit value of either 0 [zero] or 1 [one]. The capacity of a disk is the number of bytes it can store, usually in kilobytes [Kbytes or 1000 bytes] megabytes and gigabytes [Gbyte or 1 billion bytes].
« Disks are all made of magnetic material shaped as a thin circular disk and protected by plastic or acrylic cover.
« A disk is single sided if its stores information on only one of its surfaces, and double-sided if both surfaces are used.
« Information is stored on disk surface on concentric circles, each having a distinct diameter. Each circle is called a track. For disk packs, the tracks with the same diameter on the various surfaces are called a cylinder. The concept of a cylinder is important, because data stores on the same cylinder can be retrieved much faster than if it distributed among different cylinders. Track usually contains a large amount of information; it is divided into smaller blocks or sectors. A disk is called a random access addressable device. Transfer of data between main memory and disk takes place in units of blocks.
« The actual hardware mechanism that reads or writes a block is the disk read/write head, which is part of a system called a disk drive.
A read/write head includes an electronic component attached to a mechanical arm. Disk packs with multiple surfaces are controlled by several read/write heads – one for each surface.
The disk drive begins to rotate the disk whenever a particular read or write request is initiated. Once the read/write head is positioned on the right track, and the block specified in the block address moves under the read/write head, the electronic component of the read/write head is activated to transfer the data.
To transfer a disk block, given its address, the disk drive must first mechanically position the read/write head on the correct track. The time required to do this is called the seek time. Following that, there is another delay – called the rotational delay or latency – while the beginning of the desired block rotates into position under the read/write head. Some additional time is needed to transfer the data; this is called the block transfer time. Hence, the total time needed to locate and transfer an arbitrary block, given its address, is the sum of the seek time, rotational delay, and block transfer time. The seek time and rotational delay are usually much larger than the block transfer time.
RAID: RAID stands for Redundant Array of Inexpensive Disks. In order to balance the I/O load you have to equate the dbase files across all the disk drives. Therefore maintaining of individual disk drives in large number is cumbersome. For the sake of simplification of this task RAID was introduced.
RAID system can be configured in different ways depending upon your needs. The criterion for configurations is of fault tolerance. These are known as levels in RAID.
Even though these levels work separately but serve the same purpose of creating a logical disk drive out of two or more physical disks. The multiple disk drives are configured in the form of array to provide the desired performance and the fault tolerance properties.
The RAID array is classified into following levels.
« RAID 0
« RAID 1
« RAID 0+1
« RAID 5
RAID 0: In this level even though redundancy is not there it is state as RAID level. This level is also known as „striping‟. It is basic of all RAID levels.
RAID 0 helps in splitting up the logical volume to physical request. These requests are sent one by one but they are executed simultaneously. In this level performance level has reached to maximum since the processing doesn’t‟t have overhead. Since all the disk space is used, if single disk drive fails all the data is lost. So there is no fault tolerance.
RAID 1: It is also known as mirroring. RAID 1 makes each disk to duplicate itself entirely there by avoiding the loss of all the data when the single disk drive fails. But this is expensive since it requires the number of disks as storage system.
RAID 0+1: This level is the combinations of RAID 0 and RAID 1. IN RAID 0+1 disk drives are mirrored and then striped. Hence it includes both the properties of large disk space and performance and mirroring.
RAID 5: It uses parity check method for fault tolerance. This eliminates the use of double the number of disk drives just by adding a drive to store parity.
RAID comparison
RAID level |
Read performance |
Write performance |
Fault tolerance |
Cost |
RAID 0 |
Good |
Good |
None |
Low |
RAID 1 & 0+1 |
Good |
Ok 1 logical write = 2 physical I/Os |
Excellent |
Highest |
RAID 5 |
Good |
Poor 1 logical write = 4 physical I/Os |
Ok |
Best for fast tolerance |
Mainly RAID is used for improving Reliability and performance in large organizations.
Magnetic Tape Storage Devices:
« Magnetic tapes are sequential access devices;
« To access the nth block on tape, we must first read the proceeding n – 1 block.
« Data is stored on reels of high – capacity magnetic tape, somewhat similar to audio or videotapes.
« Tape access can be slow, and tapes are not used to store on-line data.
However, tapes serve a very important function – that of backing up the database. One reason for backup is to keep copies of disk files in case the data is lost because of disk crash, which can happen if the disk read/write head touches the disk surface because of mechanical malfunction.
Hope you will like Series of Database Management Systems series !
If you have not yet subscribe this Blog , Please subscribe it from “follow me” tab !
So that you will be updated @ real time and all updated knowledge in your mail daily for free without any RSS subscription OR news reading !!
Happy Learning and Sharing !!
For More information related to BI World visit our all Mentalist networks Blog
SQL Server Mentalist … SQL Learning Blog
Business Intelligence Mentalist … Business Intelligence World
Microsoft Mentalist … MVC,ASP.NET, WCF & LinQ
MSBI Mentalist … MS BI and SQL Server
NMUG Bloggers …Navi Mumbai User Group Blog
Architectural Shack … Architectural implementation and design patterns
DBA Mentalist …Advance SQL Server Blog
MVC Mentalist … MVC Learning Blog
Link Mentalist … Daily Best link @ your email
Infographics Mentalist … Image worth explaining thousand Words
Hadoop Mentalist … Blog on Big Data
BI Tools Analysis … BI Tools
Connect With me on
| Facebook |Twitter | LinkedIn| Google+ | Word Press | RSS | About Me |