Possible
references:
Topic objective: Students
should understand the function and usage of databases, including
relational databases, and the related relational operations.
|
Database – a collection of ordered data, and the software that maintains it. |
In a
database, the data contained is independent from the program which maintains
it.
Types of data:
data
operations:
statistical data –
◊
comparison
◊
mathematical operations
nominal,
ordinal – comparison only (equality – nominal & ordinal; position (higher/lower) – ordinal)
complex data – special operations
depending on data e.g. for DNA: length, percentage
of pyrimidines, etc
data domain – set of allowed
values
maintenance of data collection
–
provide bin for each data category
–
provide slot in every bin for each data
–
label each slot and bin
–
catalogue bins and their slots
–
add new bin and slot when needed, and update
description catalogue
Data model Entity Relationship (ER) – can be used to lessen data redundancy
Data relationships
are reduced to three components
-
entities
(data elements)
-
attributes
(possessed by entities)
-
relations
(between entities)

Normalization –
Some
relations can be broken up to smaller relations without loss of information
→ normal form
Relational database(Codd
model, 1969)
-
correct
and complete
-
based on relations
|
relation – set of domains which together descibe an entity/event |
made up of –
head – domain name (, attribute)
–
body – data
cf.
table (but table – columns in
specific order; relation – not in specific order)
|
e,g. relation called MATCH its head <PTR : PLBS> <PPL : PLBS> <GOLTR : INT> <GOLPL : INT> <TP : DATE> (domain
PTR with attribute PLBS, domain PPL with attribute PLBS, …) body: Terengganu Kelantan 5
0 Kelantan Johor 3
4 05/01/01 Kelantan Melaka 2
1 07/01/01 - rows of tuples which
are indexed by a key (certain attribute or attributes) can refer to [relation
name].[attribute] for example MATCH.PTR |
Relational algebra:
union
– H1 ∪ H2 –
relations with same structure (same sets of attributes) – combine
bodies [example]
intersectionn – H1 ∩ H2 –
relations with same structure – retain common tuples only [example]
difference
– H1 – H2 – relations with same structure
– tuples in one which are not in the other [example]
Cartesian product – H1 × H2 –
not necessarily same structure – gives all possibilities [example]
limit /
choose operation – σattrib=‘…’ (H) – give
new relation with body containing selected tuples from original relation [example]
project
operation – πl1,l2,…(H) – give new relation with selected
attributes from original relation [example]
combine
/ join operation – (Cartesian product + project +
limit) give new relation by combining two relations – repeated
attributes discarded, tuples chosen according to key [example]
–
equal join – H1
atrib1=atrib1' H2
– combined according to one common column (also greater join –
H1
atrib1>atrib1' H2
dsb)
example:
|
|
relation A |
|
relation B |
|||
|
|
<V> |
<W> |
|
<X> |
<Y> |
<Z> |
|
|
r t p |
2 4 6 |
|
5 4 4 2 |
g d t m |
p e f q |
if
C ← JOIN A and B where A.W=B.X [C= A
A.W=B.X B]
then
C is:
|
|
relation C |
|||
|
|
<A.V> |
<A.W>=<B.X> |
<B.Y> |
<B.Z> |
|
|
r t t |
2 4 4 |
m d t |
q e f |
for
C ← JOIN A and B where A.W<B.X [C= A
A.W<B.X B]
C
is:
|
|
relation C |
||||
|
|
<A.V> |
<A.W> |
<B.X> |
<B.Y> |
<B.Z> |
|
|
r r r t |
2 2 2 4 |
5 4 4 5 |
g d t g |
p e f p |
natural join – H1
H2 – combine all common columns but keep one of
each column only.
external join:
H1
H2 left Natural
join complete in H1 though not fitting to common column
e.g.
C= A
B, with A.W and B.X as common columns; C
is:
|
|
relation C |
|||
|
|
<A.V> |
<A.W>=<B.X> |
<B.Y> |
<B.Z> |
|
|
r t t p |
2 4 4 6 |
m d t null |
q e f null |
null placed in columns which do not correspond
H1
H2 right Natural
join complete in H2 though not fitting to common column
H1
H2 full Natural
join complete in H1 and H2 though nt fitting to common column
partial
join – H1
P H2 = πA(H1
P
H2) where P is a certain predicate
and A represents all attributes in H1
divide
operation – H1 ÷ H2 – (limit + project) find all values in non-key attributes
of first relation which share tuples with every value in second relation
[example]
comparison
operations – (comparison of relations!)
= equal
!= not equal
<= subset
< proper subset
>= superset
> proper superset
additional operations (additional
to Codd)
- extend – add attribute (opposite
to project)
- insert – copy all tuples from a
relation to the body of another relation (of same structure)
- update – modify the value of an attribute
- delete – delete tuple depending on value of
certain attribute
- rename attribute – ρH'(atrib1',atrib2',…)(H)
dll.
Relational calculus – non-operational description of
relations – (has same capability as relational algebra)
e.g.
MATCH.PTR WHERE (MATCH.TP=05/01/01 AND MATCH.GOLTR >
MATCH.GOLPL)
WHERE
AND
OR
EXISTS (true
if true for one case)
FORALL (true if true
for all)
IF … THEN
IFF – if and
only if
-
addition
– have to be certain that external refers to an existing tuple
e.g. adding tuple Melaka
Johor … to PERLAWANAN requires existence of
Melaka Stadium Kubu in
STADIUM
-
deletion
– avoid ‘orphan’ tuples
e.g. delete tuple Terengganu
… from PERLAWANAN results in
orphan tuple Terengganu… in STADIUM
-
update
– like deletion and addition
domain integrity:
cases for example where –
if A happens, B cannot happen
e.g.
a brown-eyed person cannot have both parents blue-eyed
if A happens, B must happen
e.g.
if someone scores a goal, his team must have at least 1
goal in its favour
the value of A constrains the value of B
e.g.
the date of death of a person must be bigger than his date of birth
|
Commit/rollback protocol: A data ‘transaction’ may involve several processing steps. As a defence against damage, a copy (log) of the transaction is done, and when the copy is complete – commit point – if there is damage the transaction can be rebuilt. If a problem occurs before the commit point, the log can be used to rollback the steps that have been carried out. Problem: other transactions may have used the data updated by the halfway steps, and so on – cascading rollback. Lock: To prevent problems from transactions executing when other transactions are executing (problem of wrong summary, problem of lost updates) – locking mechanism for locking data – shared lock (other parties can read but not write), exclusive lock (other parties cannot read nor write). To prevent deadlocks : bleed-wait protocol – ‘young’ processes give way (wait, or unlock and rollback) to ‘old’ processes. |
Data
organized in a hierarchical form
Data
organized in a network form
Data organisation
uses the concept of the object-oriented
approach
etc
misal
suruhan, operator, dsb:
SELECT
l1, l2, …, ln FROM
j1, j2, …, jm WHERE
P yg bermaksud π l1, l2, …, ln (σP(j1× j2×…× jm))
ORDER
BY
GROUP
BY
HAVING
AND
CROSS
JOIN
JOIN
LEFT
JOIN
CREATE
TABLE
UPDATE
INSERT
dsb
Contoh
perisian alatan:
ACCESS
ADABAS
dBASE III
ORACLE
dsb
(KDD - Penemuan Pengetahuan
dalam Pangkalan Data)
[Data yang banyak
(dan kompleks) dalam pangkalan-pangkalan data
→]
pengeluaran tak remeh maklumat tersirat,
yang belum diketahui, dan berkemungkinan berguna, dari data
Teknik
statistik dan pembelajaran mesin digunakan termasuk:
pengelompokan, perumusan data,
pembelajaran hukum pengelasan, pencarian rangkaian kegantungan, analisis
perubahan, pengesanan anomali untuk penemuan, dan teknik
gambaran untuk persembahan maklumat dalam bentuk yang mudah
difahami.