lossless, dependency-preserving 3NF decomposition

That is a mouthful of unfamiliar vocabulary to me…

Let’s start with the first term:

Lossless

Lossless is a type of data compression method. The opposite of this is lossy.

Lossless keeps all data which in the sense of databases can be extremely useful. However, it can also become repetitive if the data is repetitive since lossless keeps everything.

In our everyday world, lossless data compression can be seen in the form of .zip files.

Dependency-preserving

Dependency-preserving is quite straightforward once you know the context of how this term is used.

Dependencies are often used in databases; where one attribute is dependent on another attribute or table. For example, if I have a table class, that class cannot exist without students, instructor, and a classroom. So for that class table, students, instructor, and a classroom are all needed for class, aka class is dependent on those other tables.

Preserving is not getting rid of data. This means in the context of a database, we’re preventing major changes to the dependencies.

These two together means that we’re trying to reduce redundancy without losing the result of the dependencies.

In our everyday world, dependency-preserving can look like this:

We have a social security number, our legal name, our birthdate, and our permanent address. When we apply for a credit card, they require all of the information. However, credit card companies really have all the information required just from our social security number and our current permanent address alone, so they technically don’t need our legal name or our birthdate.
That is dependency-preservation because the results of the credit card being approved will be the same whether we have just the our social security number and our current permanent address OR social security number, our legal name, our birthdate, and our permanent address.

3NF Decomposition

This one is a bit more complex.

3NF is short for Third Normal Form.

…for all functional dependencies in F+ of the form α → β, where α ⊆ R
and β ⊆ R, at least one of the following holds:
• α → β is a trivial functional dependency.
• α is a superkey for R.
• Each attribute A in β − α is contained in a candidate key for R.

Database System Concepts by Silberschatz, Korth, and Sudarshan

Example

I got this question for one of my homework assignments (I’ll adjust the question a bit so I don’t just full-fledge post the answers).

Let R = (A, B, C, D, E, G) and F = {A → BC, C → A, DEG → B, BG → D, CA → E}, give a
lossless, dependency-preserving 3NF decomposition of R.

Step 1: Find Candidate Keys
(AG)+, (CG)+
Step 2: Tables w/ Candidate Keys
R is in 3NF
AG is contained in a candidate key
CG is a super key
Step 3: Compute the canonical cover (work not shown)
Step 4: The for loop generates the following 3NF schema:
(A, C)
(C, A)
(DEG, B)
(CA, E)
Step 5: Detect and delete schemas that are subsets of other schemas:
Delete (A,C) and (C,A) since we already have (CA, E)
Step 6 (Result):
Simplified 3NF schema is:
(DEG, B)
(CA, E)


Discover more from Angela Cui

Subscribe to get the latest posts sent to your email. It’s FREE!