Normalization to 3NF

What is Normalization?

Normalization is a database design technique that reduces data redundancy and eliminates undesirable characteristics such as insertion, update and deletion anomalies. Normalization rules divide larger tables into smaller tables and link them using relationships. The purpose of normalization in SQL is to eliminate redundant (repetitive) data and ensure data is stored logically.

Definition:

Normalization is a way of organising the data in a database to eliminate data duplication and the insertion, update and deletion anomalies.

Normalization is the transformation of complex user views and data stores into a set of smaller, stable data structures. In addition to being simpler and more stable, normalized data structures are more easily maintained than other data structures.

Normal Forms of Normalization:

The normal forms are listed below:

  • 1NF
  • 2NF
  • 3NF
  • BCNF
  • 4NF
  • 5NF
  • 6NF

The theory of data normalization in SQL is still being developed. For example, there are discussions even on a sixth normal form. However, in most practical applications, normalization achieves its best in third normal form. The evolution of normalization theory is illustrated below.

Normalization to 3NF Image 1
(source: https://www.guru99.com/database-normalization.html)

What is Database Normalization

Database normalization is a method of organising the data in a database. Normalization is a formal process that applies a set of rules to associate attributes with entities. Normalization is used when designing a database.

Database normalization is mainly used to:

  • Eliminate redundant data.
  • Ensure data is logically stored (which results in a more flexible data model).

Normalization of a data model consists of several steps. These steps are called normalization rules. Each rule is referred to as a normal form (1NF, 2NF, 3NF). The first three forms are the most important ones. There are more than three normal forms, but those forms are rarely used and can be ignored without resulting in an inflexible data model. Each normal form constrains the data more than the previous normal form. This means that you must first achieve the first normal form (1NF) in order to be able to achieve the second normal form (2NF). You must achieve the second normal form before you can achieve the third normal form (3NF).

Main steps of Normalization

Starting with either a user view or a data store developed for a data dictionary, the analyst normalizes a data structure in three steps, as shown in the figure below. Each step involves an important procedure, one that simplifies the data structure.

Normalization to 3NF Image 2
(source: https://www.w3computing.com/systemsanalysis/normalization-steps-example/)

The relation derived from the user view or data store will most likely be unnormalized. The first step of the process includes removing all repeating groups and identifying the primary key. To do so, the relation needs to be broken up into two or more relations. At this point, the relations may already be in third normal form, but it is likely that more steps will be needed to transform the relations to third normal form.

The second step ensures that all non-key attributes are fully dependent on the primary key. All partial dependencies are removed and placed in another relation.

The third step removes any transitive dependencies, that is, cases in which non-key attributes depend on other non-key attributes.

KEY?

A key is a value used to identify a record in a table uniquely. A key can be a single column or a combination of multiple columns.

Note: Columns in a table that are not used to identify a record uniquely are called non-key columns.

Primary Key?

A primary key is a single-column value used to identify a database record uniquely.

Normalization to 3NF Image 3
(Source: https://www.guru99.com/database-normalization.html)

It has the following attributes:

  • A primary key cannot be NULL
  • A primary key value must be unique
  • The primary key values should rarely be changed
  • The primary key must be given a value when a new record is inserted.

Composite Key?

A composite key is a primary key composed of multiple columns used to identify a record uniquely. In our database, we have two people with the same name, Robert Phil, but they live in different places.

Normalization to 3NF Image 4
(Source: https://www.guru99.com/database-normalization.html)

Hence, we need both Full Name and Address to identify a record uniquely. That is a composite key.
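
The idea of a composite key can be sketched with Python’s built-in sqlite3 module. This is a minimal illustration only: the table name person, the column names and the two addresses are assumptions made for the example and are not taken from the figure.

```python
import sqlite3

# In-memory database; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE person (
        full_name TEXT NOT NULL,
        address   TEXT NOT NULL,
        PRIMARY KEY (full_name, address)  -- composite key
    )
    """
)

# Two people share a name but live at different (made-up) addresses.
conn.execute("INSERT INTO person VALUES ('Robert Phil', 'Address A')")
conn.execute("INSERT INTO person VALUES ('Robert Phil', 'Address B')")

# Repeating an existing (full_name, address) pair violates the key.
try:
    conn.execute("INSERT INTO person VALUES ('Robert Phil', 'Address B')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
print(duplicate_rejected, row_count)
```

Neither column is unique on its own here; only the pair is, which is why the key has to span both columns.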

Anomalies in DBMS

There are three types of anomalies that occur when the database is not normalized. These are the insertion, update and deletion anomalies.

Example:

Suppose a manufacturing company stores the employee details in a table named employee that has four attributes: EmployeeID for storing the employee’s id, EmployeeName for storing the employee’s name, EmployeeAddress for storing the employee’s address and EmployeeDept for storing the details of the department in which the employee works. At some point in time the table looks like this:

EmployeeID  EmployeeName  EmployeeAddress  EmployeeDept
1           Asif          House            Math
2           Atif          House2           Biology
3           Arif          House3           English
4           Saf           House4           Physics
5           Kef           House5           Botany

Update anomaly:

Suppose the table holds two rows for employee Asif because he belongs to two departments of the company. If we want to update the address of Asif, then we have to update it in both rows or the data will become inconsistent. If somehow the correct address gets updated in one department’s row but not in the other, then according to the database Asif would have two different addresses, which is not correct and would lead to inconsistent data.

Insert anomaly:

Suppose a new employee joins the company who is under training and not currently assigned to any department. If the EmployeeDept field does not allow nulls, we will not be able to insert the data into the table.

Delete anomaly:

Suppose that at some point the company closes a department, say D890. Deleting the rows that have EmployeeDept as D890 would also delete the details of an employee such as Maggie, because she is assigned only to this department.
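
The three anomalies can be reproduced in a small sketch using Python’s built-in sqlite3 module. The rows are hypothetical: Asif is given a second department and Arif is treated as the new trainee purely for illustration, and the NOT NULL constraint on EmployeeDept is an assumption that matches the insert scenario described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "EmployeeID INTEGER, EmployeeName TEXT, "
    "EmployeeAddress TEXT, EmployeeDept TEXT NOT NULL)"
)
# Hypothetical rows: Asif appears twice because he works in two departments.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        (1, "Asif", "House", "Math"),
        (1, "Asif", "House", "Physics"),
        (2, "Atif", "House2", "Biology"),
    ],
)

# Update anomaly: only one of Asif's two rows receives the new address.
conn.execute(
    "UPDATE employee SET EmployeeAddress = 'NewHouse' "
    "WHERE EmployeeID = 1 AND EmployeeDept = 'Math'"
)
asif_addresses = {
    row[0]
    for row in conn.execute(
        "SELECT EmployeeAddress FROM employee WHERE EmployeeID = 1"
    )
}

# Insert anomaly: a trainee without a department cannot be stored.
try:
    conn.execute("INSERT INTO employee VALUES (3, 'Arif', 'House3', NULL)")
    insert_blocked = False
except sqlite3.IntegrityError:
    insert_blocked = True

# Delete anomaly: closing Biology also erases every trace of Atif.
conn.execute("DELETE FROM employee WHERE EmployeeDept = 'Biology'")
atif_rows = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE EmployeeName = 'Atif'"
).fetchone()[0]

print(sorted(asif_addresses), insert_blocked, atif_rows)
```

All three problems come from storing employee facts and department membership in the same row.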

First Normal Form (1NF)

A table in first normal form has four properties.

  • It should only have single-valued (atomic) attributes
  • Values stored in a column should be of the same domain
  • All the columns in a table should have unique names
  • The order in which data is stored does not matter

Second Normal Form (2NF):

A table in second normal form has two properties.

  • It should be in the first normal form.
  • It should not have any partial dependency.

Third Normal Form (3NF)

A table in third normal form has two properties.

  • It is in the second normal form.
  • It does not have any transitive dependency.

Example of Normalization:

A user view for the Al S. Well Hydraulic Equipment Company is shown below. The report shows (1) the SALESPERSON-NUMBER, (2) the SALESPERSON-NAME and (3) the SALES-AREA. The body of the report shows (4) the CUSTOMER-NUMBER and (5) the CUSTOMER-NAME. Next is (6) the WAREHOUSE-NUMBER that will service the customer, followed by (7) the WAREHOUSE-LOCATION, which is where the company is located. The final information contained in the user view is (8) the SALES-AMOUNT. The rows (one for each customer) on the user view show that items 4 through 8 form a repeating group.

Normalization to 3NF Image 5
(Source: https://www.w3computing.com/systemsanalysis/normalization-steps-example/)

If the analyst were using a data flow/data dictionary approach, the same information in the user view would appear in a data structure. The figure below shows how the data structure would appear at the data dictionary stage of analysis. The repeating group is also indicated in the data structure by an asterisk (*) and indentation.

Normalization to 3NF Image 6
(Source: https://www.w3computing.com/systemsanalysis/normalization-steps-example/)

Before proceeding, note the data associations of the data elements shown in the figure below. This type of illustration is called a bubble diagram or data model diagram. Each entity is enclosed in a circle, and arrows are used to show the relationships. Although it is possible to draw these relationships with an E-R diagram, it is sometimes easier to use the simpler bubble diagram to model the data.

Normalization to 3NF Image 7
(Source: https://www.w3computing.com/systemsanalysis/normalization-steps-example/)

In this example, there is only one SALESPERSON-NUMBER assigned to each SALESPERSON-NAME, and that person will cover only one SALES-AREA, but each SALES-AREA may be assigned to many salespeople: hence, the double arrow notation from SALES-AREA to SALESPERSON-NUMBER. For each SALESPERSON-NUMBER, there may be many CUSTOMER-NUMBER(s).

Furthermore, there is a one-to-one correspondence between CUSTOMER-NUMBER and CUSTOMER-NAME; the same is true for WAREHOUSE-NUMBER and WAREHOUSE-LOCATION. Each CUSTOMER-NUMBER will have only one WAREHOUSE-NUMBER and WAREHOUSE-LOCATION, but each WAREHOUSE-NUMBER or WAREHOUSE-LOCATION may service many CUSTOMER-NUMBER(s). Finally, to determine the SALES-AMOUNT for one salesperson’s calls to a particular company, it is necessary to know both the SALESPERSON-NUMBER and the CUSTOMER-NUMBER.
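
The dependencies described above can be checked against sample data with a short Python sketch. The helper name holds and every row value are made up for illustration. Note that a check like this can only show that a sample does not contradict a dependency; whether the dependency really holds is a business rule.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[name] for name in lhs)
        value = tuple(row[name] for name in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


COLUMNS = (
    "SALESPERSON-NUMBER", "CUSTOMER-NUMBER", "CUSTOMER-NAME",
    "WAREHOUSE-NUMBER", "WAREHOUSE-LOCATION", "SALES-AMOUNT",
)
# Hypothetical sample data, one tuple per salesperson/customer pair.
DATA = [
    (1, 100, "Acme", 4, "North", 500),
    (1, 200, "Bolt", 3, "South", 300),
    (2, 100, "Acme", 4, "North", 700),
    (2, 300, "Cogs", 3, "South", 300),
]
rows = [dict(zip(COLUMNS, values)) for values in DATA]

# CUSTOMER-NUMBER determines CUSTOMER-NAME in this sample.
name_by_customer = holds(rows, ["CUSTOMER-NUMBER"], ["CUSTOMER-NAME"])
# WAREHOUSE-NUMBER determines WAREHOUSE-LOCATION in this sample.
location_by_warehouse = holds(rows, ["WAREHOUSE-NUMBER"], ["WAREHOUSE-LOCATION"])
# Neither number alone determines SALES-AMOUNT ...
amount_by_customer = holds(rows, ["CUSTOMER-NUMBER"], ["SALES-AMOUNT"])
amount_by_salesperson = holds(rows, ["SALESPERSON-NUMBER"], ["SALES-AMOUNT"])
# ... but the two together do.
amount_by_both = holds(
    rows, ["SALESPERSON-NUMBER", "CUSTOMER-NUMBER"], ["SALES-AMOUNT"]
)

print(name_by_customer, location_by_warehouse)
print(amount_by_customer, amount_by_salesperson, amount_by_both)
```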

The main objective of the normalization process is to simplify all the complex data items that are often found in user views. For example, if the analyst were to take the user view discussed previously and attempt to make a relational table out of it, the table would look like the one shown below. Because this relation is based on our initial user view, we refer to it as SALES-REPORT.

If the data were listed in an unnormalized table, there could be repeating groups.

SALES-REPORT is an unnormalized relation, because it has repeating groups. It is also important to observe that a single attribute such as SALESPERSON-NUMBER cannot serve as the key. The reason is clear when one examines the relationships between SALESPERSON-NUMBER and the other attributes in the figure below.

Normalization to 3NF Image 8
(Source: https://www.w3computing.com/systemsanalysis/normalization-steps-example/)

FIRST NORMAL FORM (1NF)

The first step in normalizing a relation is to remove the repeating groups. In our example, the unnormalized relation SALES-REPORT will be broken into two separate relations. These two new relations will be named SALESPERSON and SALESPERSON-CUSTOMER. The figure below shows how the original, unnormalized relation SALES-REPORT is normalized by separating it into two new relations. Notice that the relation SALESPERSON contains the primary key SALESPERSON-NUMBER and all the attributes that were not repeating (SALESPERSON-NAME and SALES-AREA).

Normalization to 3NF Image 9
(Source: https://www.w3computing.com/systemsanalysis/normalization-steps-example/)

The second relation, SALESPERSON-CUSTOMER, contains the primary key from the relation SALESPERSON (the primary key of SALESPERSON is SALESPERSON-NUMBER), as well as all the attributes that were part of the repeating group (CUSTOMER-NUMBER, CUSTOMER-NAME, WAREHOUSE-NUMBER, WAREHOUSE-LOCATION, and SALES-AMOUNT). Knowing the SALESPERSON-NUMBER, however, does not automatically mean that you will know the CUSTOMER-NAME, SALES-AMOUNT, WAREHOUSE-LOCATION, and so on. In this relation, one must use a concatenated key (both SALESPERSON-NUMBER and CUSTOMER-NUMBER) to access the rest of the information. It is possible to write the relations in shorthand notation as follows:

SALESPERSON (SALESPERSON-NUMBER, SALESPERSON-NAME, SALES-AREA)

SALESPERSON-CUSTOMER (SALESPERSON-NUMBER, CUSTOMER-NUMBER, CUSTOMER-NAME, WAREHOUSE-NUMBER, WAREHOUSE-LOCATION, SALES-AMOUNT)

The relation SALESPERSON-CUSTOMER is a first normal relation, but it is not in its ideal form. Problems arise because some of the attributes are not functionally dependent on the primary key (that is, SALESPERSON-NUMBER, CUSTOMER-NUMBER). In other words, some of the non-key attributes are dependent only on CUSTOMER-NUMBER and not on the concatenated key. The data model diagram in the figure below shows that SALES-AMOUNT is dependent on both SALESPERSON-NUMBER and CUSTOMER-NUMBER, but the other three attributes are dependent only on CUSTOMER-NUMBER.

Normalization to 3NF Image 10
(Source: https://www.w3computing.com/systemsanalysis/normalization-steps-example/)
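
The move to first normal form can be sketched in plain Python. The nested list stands in for the unnormalized SALES-REPORT; the names, areas and amounts are invented for illustration, and the CUSTOMERS field is a hypothetical way of modelling the repeating group.

```python
# Unnormalized SALES-REPORT: each salesperson carries a repeating group.
sales_report = [
    {
        "SALESPERSON-NUMBER": 1,
        "SALESPERSON-NAME": "Ann",
        "SALES-AREA": "West",
        # Repeating group: (CUSTOMER-NUMBER, CUSTOMER-NAME, WAREHOUSE-NUMBER,
        # WAREHOUSE-LOCATION, SALES-AMOUNT)
        "CUSTOMERS": [
            (100, "Acme", 4, "North", 500),
            (200, "Bolt", 3, "South", 300),
        ],
    },
    {
        "SALESPERSON-NUMBER": 2,
        "SALESPERSON-NAME": "Ben",
        "SALES-AREA": "East",
        "CUSTOMERS": [
            (100, "Acme", 4, "North", 700),
        ],
    },
]

salesperson = []           # key: SALESPERSON-NUMBER
salesperson_customer = []  # key: (SALESPERSON-NUMBER, CUSTOMER-NUMBER)

for record in sales_report:
    number = record["SALESPERSON-NUMBER"]
    salesperson.append(
        (number, record["SALESPERSON-NAME"], record["SALES-AREA"])
    )
    for customer in record["CUSTOMERS"]:
        # Prefix each repeating-group entry with the salesperson's key.
        salesperson_customer.append((number, *customer))

print(salesperson)
print(salesperson_customer)
```

Every row in both results now holds single values only, but customer and warehouse details are still repeated for each salesperson who serves the same customer.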

SECOND NORMAL FORM (2NF):

In second normal form, all the attributes will be functionally dependent on the primary key. Therefore, the next step is to remove all the partially dependent attributes and place them in another relation. The figure below shows how the relation SALESPERSON-CUSTOMER is split into two new relations: SALES and CUSTOMER-WAREHOUSE. These relations can also be expressed as follows:

SALES (SALESPERSON-NUMBER, CUSTOMER-NUMBER, SALES-AMOUNT)

CUSTOMER-WAREHOUSE (CUSTOMER-NUMBER, CUSTOMER-NAME, WAREHOUSE-NUMBER, WAREHOUSE-LOCATION)

Normalization to 3NF Image 11
(Source: https://www.w3computing.com/systemsanalysis/normalization-steps-example/)
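
A minimal Python sketch of this step follows. The rows of SALESPERSON-CUSTOMER are made-up sample data; the split simply separates the attribute that depends on the whole concatenated key from the attributes that depend on CUSTOMER-NUMBER alone.

```python
# SALESPERSON-CUSTOMER in first normal form (hypothetical rows):
# (SALESPERSON-NUMBER, CUSTOMER-NUMBER, CUSTOMER-NAME,
#  WAREHOUSE-NUMBER, WAREHOUSE-LOCATION, SALES-AMOUNT)
salesperson_customer = [
    (1, 100, "Acme", 4, "North", 500),
    (1, 200, "Bolt", 3, "South", 300),
    (2, 100, "Acme", 4, "North", 700),
]

# SALES keeps the attribute that depends on the whole concatenated key.
sales = [
    (salesperson, customer, amount)
    for salesperson, customer, _, _, _, amount in salesperson_customer
]

# CUSTOMER-WAREHOUSE keeps the attributes that depend on CUSTOMER-NUMBER
# alone; a dict keyed by CUSTOMER-NUMBER stores each customer only once.
customer_warehouse = {
    customer: (name, warehouse_number, warehouse_location)
    for _, customer, name, warehouse_number, warehouse_location, _
    in salesperson_customer
}

print(sales)
print(customer_warehouse)
```

Customer 100 appeared twice in the input but is stored once in customer_warehouse, which is the redundancy this step removes.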

The relation CUSTOMER-WAREHOUSE is in second normal form. It can still be simplified further because there are additional dependencies within the relation. Some of the non-key attributes are dependent not only on the primary key, but also on a non-key attribute. This dependency is referred to as a transitive dependency.

The figure below shows the dependencies within the relation CUSTOMER-WAREHOUSE. For the relation to be in second normal form, all the attributes must be dependent on the primary key CUSTOMER-NUMBER, as shown in the diagram. WAREHOUSE-LOCATION, however, is obviously dependent on WAREHOUSE-NUMBER as well. To simplify this relation, another step is needed.

Normalization to 3NF Image 12
(Source: https://www.w3computing.com/systemsanalysis/normalization-steps-example/)

THIRD NORMAL FORM (3NF):

A normalized relation is in third normal form if all the non-key attributes are fully functionally dependent on the primary key and there are no transitive dependencies. In a manner similar to the previous steps, it is possible to break apart the relation CUSTOMER-WAREHOUSE into two relations, as shown in the figure below.

Normalization to 3NF Image 13
(Source: https://www.w3computing.com/systemsanalysis/normalization-steps-example/)

The two new relations are called CUSTOMER and WAREHOUSE, and can be written as follows:

CUSTOMER (CUSTOMER-NUMBER, CUSTOMER-NAME, WAREHOUSE-NUMBER)

WAREHOUSE (WAREHOUSE-NUMBER, WAREHOUSE-LOCATION)

The primary key for the relation CUSTOMER is CUSTOMER-NUMBER, and the primary key for the relation WAREHOUSE is WAREHOUSE-NUMBER.

In addition to these primary keys, we can identify WAREHOUSE-NUMBER as a foreign key in the relation CUSTOMER. A foreign key is any attribute that is non-key in one relation but a primary key in another relation. In the figures, WAREHOUSE-NUMBER is designated as a foreign key by underlining it with a dashed line.
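
The CUSTOMER and WAREHOUSE relations, including the foreign key, can be sketched with Python’s built-in sqlite3 module. The column names use underscores because hyphens are not valid in unquoted SQL identifiers, and all row values are invented for illustration. SQLite only enforces foreign keys when the foreign_keys pragma is switched on for the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE warehouse (
        warehouse_number   INTEGER PRIMARY KEY,
        warehouse_location TEXT NOT NULL
    );
    CREATE TABLE customer (
        customer_number  INTEGER PRIMARY KEY,
        customer_name    TEXT NOT NULL,
        warehouse_number INTEGER NOT NULL
            REFERENCES warehouse (warehouse_number)  -- foreign key
    );
    INSERT INTO warehouse VALUES (3, 'South'), (4, 'North');
    INSERT INTO customer VALUES
        (100, 'Acme', 4), (200, 'Bolt', 3), (300, 'Cogs', 3);
    """
)

# A customer pointing at a warehouse that does not exist is rejected.
try:
    conn.execute("INSERT INTO customer VALUES (400, 'Dyna', 9)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Changing a location touches one row, however many customers use it.
conn.execute(
    "UPDATE warehouse SET warehouse_location = 'East' "
    "WHERE warehouse_number = 3"
)
locations = conn.execute(
    "SELECT c.customer_name, w.warehouse_location "
    "FROM customer AS c JOIN warehouse AS w USING (warehouse_number) "
    "ORDER BY c.customer_number"
).fetchall()

print(orphan_rejected, locations)
```

Because WAREHOUSE-LOCATION now lives only in the warehouse table, the transitive dependency is gone and the update anomaly cannot occur for it.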

Finally, the original, unnormalized relation SALES-REPORT has been transformed into four third normal form (3NF) relations. In reviewing the relations shown in the figure below, one can see that the single relation SALES-REPORT was transformed into the following four relations:

Normalization to 3NF Image 14
(Source: https://www.w3computing.com/systemsanalysis/normalization-steps-example/)

The third normal form is adequate for most database design problems. The simplification gained from transforming an unnormalized relation into a set of 3NF relations is a tremendous benefit when it comes time to insert, delete, and update information in the database.

An E-R diagram for the database is shown in the figure below. One SALESPERSON serves many CUSTOMER(s), who generate SALES and receive their items from one WAREHOUSE (the closest WAREHOUSE to their location). Take the time to notice how the entities and attributes relate to the database.

Normalization to 3NF Image 15
(Source: https://www.w3computing.com/systemsanalysis/normalization-steps-example/)

Using the Entity-Relationship Diagram to Determine Record Keys

The E-R diagram can be used to determine the keys required for a record or a database relation. The first step is to construct the E-R diagram and label a unique (primary) key for each data entity. The figure below shows an E-R diagram for a customer order system. There are three data entities: CUSTOMER, with a primary key of CUSTOMER-NUMBER; ORDER, with a primary key of ORDER-NUMBER; and ITEM, with ITEM-NUMBER as the primary key.

One CUSTOMER can place many orders, but each ORDER can be placed by only one CUSTOMER, so the relationship is one-to-many. Each ORDER may contain many ITEM(s), and each ITEM may be contained in many ORDER(s), so the ORDER-ITEM relationship is many-to-many.

Normalization to 3NF Image 16
(Source: https://www.w3computing.com/systemsanalysis/normalization-steps-example/)
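
One common way to hold a many-to-many relationship such as ORDER-ITEM in relational tables is an associative (junction) table keyed by both primary keys. The sketch below uses Python’s built-in sqlite3 module; the source does not prescribe this layout, and the table names (orders is used because ORDER is a reserved word in SQL), the order_item table and all values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customer (customer_number INTEGER PRIMARY KEY);
    CREATE TABLE orders (
        order_number    INTEGER PRIMARY KEY,
        -- one-to-many: each order belongs to exactly one customer
        customer_number INTEGER NOT NULL
            REFERENCES customer (customer_number)
    );
    CREATE TABLE item (item_number INTEGER PRIMARY KEY);
    -- many-to-many: one row per (order, item) pair
    CREATE TABLE order_item (
        order_number INTEGER NOT NULL REFERENCES orders (order_number),
        item_number  INTEGER NOT NULL REFERENCES item (item_number),
        PRIMARY KEY (order_number, item_number)
    );
    INSERT INTO customer VALUES (1);
    INSERT INTO orders VALUES (10, 1), (11, 1);
    INSERT INTO item VALUES (500), (501);
    INSERT INTO order_item VALUES (10, 500), (10, 501), (11, 500);
    """
)

# An order can contain many items ...
items_in_order_10 = [
    row[0]
    for row in conn.execute(
        "SELECT item_number FROM order_item "
        "WHERE order_number = 10 ORDER BY item_number"
    )
]
# ... and an item can appear in many orders.
orders_with_item_500 = [
    row[0]
    for row in conn.execute(
        "SELECT order_number FROM order_item "
        "WHERE item_number = 500 ORDER BY order_number"
    )
]

print(items_in_order_10, orders_with_item_500)
```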

A foreign key, however, is a data field in a given file that is the primary key of a different master file. For example, a DEPARTMENT-NUMBER indicating a student’s major may exist on the STUDENT MASTER table. DEPARTMENT-NUMBER would also be the unique key of the DEPARTMENT MASTER table.

Advantages of DBMS Normalization

Database normalization provides the following basic advantages:

  • Normalization increases data consistency, as it avoids the duplication of data by storing the data in one place only.
  • Normalization helps in grouping like or related data under the same schema, thereby resulting in better grouping of data.
  • Normalization makes searching faster, as indexes can be created faster. Hence, the normalized database or table is used for OLTP (Online Transaction Processing).

Disadvantages of Database Normalization

DBMS normalization has the following disadvantages:

  1. We cannot find the associated data for, say, a product or an employee in one place, and we have to join more than one table. This causes a delay in retrieving the data.
  2. Thus, normalization is not a good option for OLAP (Online Analytical Processing) workloads.

Before we proceed further, let’s understand the following terms:

Entity:

  • An entity is a real-life object whose associated data is stored in a table. Examples of such objects are employees, departments, students, and so on.

Attributes:

  • Attributes are the characteristics of the entity that give some information about it. For example, if tables are entities, then the columns are their attributes.

Purpose of Normalization:

The primary purpose of normalization is to reduce data redundancy, i.e. the data should only be stored once. This avoids the data anomalies that could arise when we store the same data in two different tables, but changes are applied only to one and not to the other.

Denormalization:

Denormalization is a technique to increase the performance of the database. It adds redundant data to the database, in contrast to normalization, which removes redundancy. This is done in huge databases where executing a JOIN to get data from multiple tables is an expensive operation. Redundant data is therefore stored in multiple tables to avoid JOIN operations.
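
The trade-off can be sketched with Python’s built-in sqlite3 module. The table customer_flat is a hypothetical denormalized copy, and the rows are invented: it answers the same question as the JOIN without joining, at the price of repeating the warehouse location on every customer row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE warehouse (
        warehouse_number   INTEGER PRIMARY KEY,
        warehouse_location TEXT
    );
    CREATE TABLE customer (
        customer_number  INTEGER PRIMARY KEY,
        customer_name    TEXT,
        warehouse_number INTEGER REFERENCES warehouse (warehouse_number)
    );
    INSERT INTO warehouse VALUES (3, 'South'), (4, 'North');
    INSERT INTO customer VALUES
        (100, 'Acme', 4), (200, 'Bolt', 3), (300, 'Cogs', 3);

    -- Denormalized copy: the location is repeated on every customer row.
    CREATE TABLE customer_flat AS
        SELECT c.customer_number, c.customer_name, w.warehouse_location
        FROM customer AS c JOIN warehouse AS w USING (warehouse_number);
    """
)

# Normalized read: needs a JOIN.
joined = conn.execute(
    "SELECT c.customer_name, w.warehouse_location "
    "FROM customer AS c JOIN warehouse AS w USING (warehouse_number) "
    "ORDER BY c.customer_number"
).fetchall()

# Denormalized read: a single table, no JOIN.
flat = conn.execute(
    "SELECT customer_name, warehouse_location "
    "FROM customer_flat ORDER BY customer_number"
).fetchall()

# The cost: 'South' is now stored once per customer instead of once.
south_copies = conn.execute(
    "SELECT COUNT(*) FROM customer_flat WHERE warehouse_location = 'South'"
).fetchone()[0]

print(joined == flat, south_copies)
```

Any later change to a warehouse location would have to be applied to every copy in customer_flat, which is exactly the update anomaly that normalization removes.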

Conclusion:

So far, we have gone through three database normalization forms.

Theoretically, there are higher forms of database normalization such as Boyce-Codd Normal Form, 4NF and 5NF. However, 3NF is the most widely used normalization form in production databases.

Summary:

The various forms of database normalization are useful when designing the schema of a database so that there is no data duplication that could lead to inconsistencies. When designing the schema for applications, we should always think about how we can make use of these forms.

References:

  1. Demba, M. (2013). Algorithm for relational database Normalization up to 3NF. International Journal of Database Management Systems, 5(3), 39.
  2. Kolahi, S., & Libkin, L. (2006, June). On redundancy vs dependency preservation in normalization: an information-theoretic study of 3NF. In Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems (pp. 114-123).
  3. Yazici, A., & Karakaya, Z. (2006, May). Normalizing relational database schemas using mathematica. In International Conference on Computational Science (pp. 375-382). Springer, Berlin, Heidelberg.
  4. Makinouchi, A. (1977, October). A Consideration on Normal Form of Not-Necessarily-Normalized Relation in the Relational Data Model. In VLDB (Vol. 1977, pp. 447-453).
  5. https://beginnersbook.com/2015/05/normalization-in-dbms/
  6. https://www.guru99.com/database-normalization.html
  7. https://www.w3computing.com/systemsanalysis/normalization-steps-example/
  8. https://www.studytonight.com/dbms/database-normalization.php
  9. https://en.wikipedia.org/wiki/Third_normal_form
  10. https://opentextbc.ca/dbdesign01/chapter/chapter-12-normalization/
