Database Basic Notes

Basic Concepts

Feature

Massive
Persistent
Safe
Multi-user
Convenient
Efficient
Reliable

Common Words

create/drop(from)/insert into/delete from/update
restricts
sub-queries
views(shorthand for queries)
left/right join on ...
primary/foreign key
references

(id, name, birth, major, grade) is not normalized, because grade is not relevant to student id (id, name, birth) + (id, major, grade) is normalized (name, os, lang) is not normalized, because os isn't relevant to lang (name, os) + (name, lang) is normalized Data-intensive applications may not use DBMS/Query Language at all e.g Hadoop, all operations on data stores in files

Data Format

XML

single root element
matched tags with proper nesting
unique attributes within elements

DTD

Document Type Definition:

similar grammar to regular expression(*?)
ID/IDRef should be unique
CDATA: character data
(#PCDATA): parsed character data (plain text between tags)

<?xml version="1.0" encoding="utf-8"?>

<!DOCTYPE Bookstore [
    <!ELEMENT Bookstore (Book)*>
    <!ELEMENT Book (Title, Authors, Remark?)>
    <!ATTLIST Book  ISBN ID #REQUIRED
                    Price CDATA #REQUIRED
                    Authors IDREFS #REQUIRED>
    <!ELEMENT Title (#PCDATA)>
    <!ELEMENT Remark (#PCDATA | BookRef)*>
    <!ELEMENT BookRef EMPTY>
    <!ATTLIST BookRef book IDREF #REQUIRED>
    <!ELEMENT Authors (Author)*>
    <!ELEMENT Author (First_Name, Last_Name)>
    <!ATTLIST Author Ident ID #REQUIRED>
    <!ELEMENT First_Name (#PCDATA)>
    <!ELEMENT Last_Name (#PCDATA)>
]>

<Bookstore>
    <Book ISBN="ISBN-0-23-23333-233" Price="233" Authors="Sabertazimi">
        <Title>Kind of sword</Title>
        <Authors>
            <Author Ident="Sabertazimi">
                <First_Name>Yilong</First_Name>
                <Last_Name>Liu</Last_Name>
            </Author>
        </Authors>
    </Book>
</Bookstore>

Tools: XML Copy Editor, XML Linter:

xmllint --valid --noout Bookstore.xml

XSD

XML Schema Definition:

xmllint -schema Bookstore.xsd -noout Bookstore.xml

JSON

JavaScript Object Notation:

serializing data objects in files
human-readable data
semi-structured data
number/boolean/string/array/object(empty or key-value pair) recursive constructs

Relational Algebra

Operators

select operator σ(sigma): σ(sID < 100 ^ sAge > 20)Table_Name set constraints
project operator π(pi) : π(sID, GPA)Table_Name select certain columns
cross-product operator x: Table1 x Table2, m tuples(rows) x n tuples(rows) => m*n tuples(rows)
natural join operator ∞: σ(E1.A1 = E2.A1 ^ E1.A2 = E2.A2 ...) (E1 x E2)
theta join operator ∞(condition): σ(condition) (E1 x E2), call condition as ϴ
difference operator -: matching schemas => change rows/tuples
union/intersection operator ∪ / ∩: matching schemas => change rows/tuples
rename operator ρ: change schemas(attributes name), different schemas <=> same schemas (union/intersection/self-join)
assign statement :=
tree notation

π(sID, GPA) (σ(sID < 100 ^ GPA > 3.7) Student)

Higher-Level Database Design Models

Higher-Level Database Design Models -Translator-> Relational implemented by RDBMS

UML

Unified Modeling Language: PlantUML

Classes

for data modeling:

add PK(primary key)
drop methods

-----------
| student |
|---------|
|sID   PK |
|sName    |
|GPA      |
|---------|
|<methods>|
-----------

Associations

relationships between objects of 2 classes):

one to one: 1..1 --- 1..1.
many to one: * --- 1..1
one to many: 1..1 --- *.
many to many: * --- *.

-----------                   ---------
| student |                   |college|
|---------|                   |       |
|sID   PK |x..y   Apply   m..n|       |
|sName    |-------------------|       |
|GPA      |                   |       |
|---------|                   |       |
|<methods>|                   |       |
-----------                   ---------

Associations Classes

classes store information of relationship edge between 2 data classes
unnecessary if 0..1 or 1..1

c1 * --- 1..1 c2
information of relationship edge can stored in c1
owing to every object of c1 only associated with 1 object of c2

Subclasses

children classes

Composition and Aggregation

Entity Relationship Model

SQL

Basic SQL Statements

增删查改:

-- 插入
INSERT INTO table_name (column_name) VALUES (sql_value);

-- 删除
DELETE FROM table_name WHERE column_name = sql_value;

-- 查询
SELECT * FROM table_name WHERE column_name = sql_value;

-- 更新
UPDATE table_name SET column1 = value1 WHERE column2 = value2;

Table Management

-- 添加列
ALTER TABLE table_name ADD COLUMN column_name;

-- 删除列
ALTER TABLE table_name DROP COLUMN column_name;

-- 修改列
ALTER TABLE table_name MODIFY COLUMN column_name;

-- 重命名列
ALTER TABLE table_name RENAME COLUMN column_name TO new_column_name;

-- 重命名表
ALTER TABLE table_name RENAME TO new_table_name;

WHERE Condition

=
<>
>
>=
<
<=
NOT
AND
OR
IN (collection)
BETWEEN value1 AND value2
LIKE '%王_'
REGEXP 'regexp'
IS NULL
IS NOT NULL

Order

排序:

ORDER BY column_name [ASC|DESC]

Group

分组:

GROUP BY column_name [HAVING condition]

Aggregation

聚合函数:

AVG(column_name)

COUNT(column_name)

MAX(column_name)

MIN(column_name)

SUM(column_name)

Limit

限制数量:

LIMIT [offset,] length

Distinct

去重:

DISTINCT column_name

Set Operations

并集:

query UNION query

交集:

query INTERSECT query

差集:

query EXCEPT query

Join

表关联:

-- 内连接
INNER JOIN table_name ON condition

-- 左连接
LEFT JOIN table_name ON condition

-- 右连接
RIGHT JOIN table_name ON condition

-- 外连接
OUTER JOIN table_name ON condition

Index

索引:

CREATE [UNIQUE|FULLTEXT|SPATIAL] INDEX index ON table_name (column1, column2, ...);

DROP INDEX index ON table_name;

SHOW INDEX FROM table_name;

ALTER TABLE table_name ADD INDEX index (column1, column2, ...);

View

视图:

CREATE VIEW view AS query;

DROP VIEW view;

ALTER VIEW view AS query;

DROP VIEW IF EXISTS Standings;
DROP VIEW IF EXISTS Count;
DROP VIEW IF EXISTS Wins;
DROP TABLE IF EXISTS Matches;
DROP TABLE IF EXISTS Players;

-- Players Table
CREATE TABLE Players (
  id SERIAL primary key,
  name varchar(255)
);

-- Matches Table
CREATE TABLE Matches (
  id SERIAL primary key,
  player int references Players(id),
  opponent int references Players(id),
  result int
);

-- Wins View shows number of wins for each Player
CREATE VIEW Wins AS
  SELECT Players.id, COUNT(Matches.opponent) AS n
  FROM Players
  LEFT JOIN (SELECT * FROM Matches WHERE result>0) as Matches
  ON Players.id = Matches.player
  GROUP BY Players.id;

-- Count View shows number of matches for each Player
CREATE VIEW Count AS
  SELECT Players.id, Count(Matches.opponent) AS n
  FROM Players
  LEFT JOIN Matches
  ON Players.id = Matches.player
  GROUP BY Players.id;

-- Standings View shows number of wins and matches for each Player
CREATE VIEW Standings AS
  SELECT Players.id,Players.name,Wins.n as wins,Count.n as matches
  FROM Players,Count,Wins
  WHERE Players.id = Wins.id and Wins.id = Count.id;

Export and Import

mysqldump -u <user> -p<password> <database> > sync.sql

mysql -u <user> -p<password> <database> < sync.sql

Relational Design

Decomposition

start with mega-relations: including all attributes
decompose into smaller relations(BCNF/4NF)

Functional Dependencies

A -> B => 1-1/n-1 mapping
key sets: closure of sets contains all attributes

assuming relation R(A, B, C, D, ..., G) and closure of A, B {A, B} + A->C->D, B->E->F, F->G => {A, B}+ = {A, B, C, ..., G} then, {A, B} is a key if there no exists such closure, then treat all-attributes as a key

BCNF

boyce-codd normal form:

for each A -> B having A is super key && B isn't key
not exists A -> B -> C
here's the algorithm:

/*
 * @brief fixed point algorithm just like most algorithms from compiler
 *
 * by decomposing to transform non-key dependent attributes to key dependent attributes
 */

compute FDs for R
compute key for R using its FDs

while (there is relation R' aren't in BCNF) {
    pick any R' with A -> B that violates BCNF (A is not its key)
    decompose R' into R1(A, B) and R2(A, rest)
    compute FDs for R1 and R2
    compute keys for R1 and R2 using their FDs
}

Multi Valued Dependencies

A -> B && rest attributes => A ->> B.
A ->> B (1-n mapping), A ->> C (1-n mapping), no B -> C/C ->> B, B * C redundant tuples/rows.
A ->>B && A ->>C => A ->> B∩C.
A ->>B && B ->>C => A ->> C-B.

4NF

Forth normal form:

if A ->> B then A is key && B isn't key
here's the algorithm:

/*
 * @brief fixed point algorithm just like most algorithms from compiler
 *
 * by decomposing to transform non-key dependent attributes to key dependent attributes
 */

compute FDs and MVDs for R
compute key for R using its FDs

while (there is relation R' aren't in 4NF) {
    pick any R' with A ->> B that violates 4NF(A is not its key)
    decompose R' into R1(A, B) and R2(A, rest)
    compute FDs and MVDs for R1 and R2
    compute keys for R1 and R2 using their FDs
}

Normalized Design

every row has the same number of columns
every row has a unique key(PRIMARY KEY)
everything in a row is all relevant to unique key
everything in a row is all relevant to each other

Indexes

primary mechanism to improve performance of database
persist data structures stored in database (hash tables/B trees/B+ trees)
trade off: scale of database and workload(query/update rate) as input of physical design advisors

CREATE INDEX IndexName on T(A)
CREATE INDEX IndexName on T(A1, A2, ..., An)
CREATE UNIQUE INDEX IndexName on T(A)
DROP INDEX IndexName

Transactions

a sequence of one/more SQL operations treated as a unit
target: concurrency and failures recovery

Transaction Standard

all or nothing(atomicity)
transaction begins automatically on first SQL statement
on "commit": old transaction ends, new one begins
on session termination: current transaction ends
"AutoCommit" turns each statement into transaction

ACID Properties

Atomicity(Logging)
Consistency
Isolation: guarantee serializability(Locking)
Durability(Logging)

Isolation Level

weaker isolation level: read uncommitted < read committed < repeatable read < serializable

increased concurrency + decreased overhead = increased performance
weaker consistency guarantees
some system default: repeatable read

SET TRANSACTION READ ONLY;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;

Integrity Constraints

CREATE TABLE TableName (
    ... PRIMARY KEY,
    ... UNIQUE,
    ... CHECK (Condition),
    ... references TableName(ForeignKey),
    ... references TableName(ForeignKey) ON DELETE/UPDATE RESTRICT/SET NULL/CASCADE,
    ... ,
    PRIMARY KEY (Attr1, Attr2, ...),
    UNIQUE (Attr1, Attr2, ...),
    CHECK (Condition),
    FOREIGN KEY (Attr1, Attr2, ...) references
      TableName(Bttr1, Bttr2, ...) [ ON ... (default RESTRICT) ]
);

CREATE ASSERTION AssertionName
CHECK (Condition);

Triggers (DBMS Level Constraints)

CREATE TRIGGER TriggerName
BEFORE|AFTER|INSTEAD OF Events(INSERT/UPDATE OF/DELETE ON TableName)
[ referencing-variables ]
[ FOR EACH ROW ]
WHEN ( Condition )
[ BEGIN ]
Action
[ END ];

CREATE TRIGGER Cascade
After DELETE ON S
REFERENCING OLD ROW AS O
FOR EACH ROW
DELETE FROM R WHERE A = O.B (R.A = S.B)

CREATE TRIGGER Cascade
After DELETE ON S
REFERENCING OLD TABLE AS OT
DELETE FROM R WHERE A IN (SELECT B FROM BT)

Views

logical layer: hiding data from users
modularity and reuse of query

VIEW ViewName = VIEWQUERY (R1, R2, ..., Rn)

CREATE VIEW ViewName (T1, T2, ..., Tn) AS
< Query >

Modifications on Views

owing to views are logical layer, it's senseless to modify data on views

Implements Modification with Triggers

CREATE TRIGGER TriggerName
INSTEAD OF DELETE/UPDATE OF/INSERT ON ViewName
[ referencing-variables ]
[ FOR EACH ROW ]
WHEN ( Condition )
[ BEGIN ]
Action
[ END ];

SQL Standard - Updatable Views

SELECT (no DISTINCT) on single table T
no GROUP BY/HAVING or Aggregation
attributes can't be NULL/default values
sub-queries cant' refer to table T

CREATE VIEW
...
WITH CHECK OPTION;

MySQL

MySQL Setup

Setup in Docker:

services:
  mysql:
    image: mysql:8.0.33
    container_name: mysql
    restart: unless-stopped
    environment:
      # 时区上海
      TZ: Asia/Shanghai
      # root 密码
      MYSQL_ROOT_PASSWORD: 123456
      # 初始化数据库
      MYSQL_DATABASE: ry-vue
    ports:
      - '3306:3306'
    volumes:
      # 配置挂载
      - ./docker/mysql/conf:/etc/mysql/conf.d/
      - ./docker/mysql/conf/ry.cnf:/etc/mysql/conf.d/ry.cnf:ro
      # 数据挂载
      - ./docker/mysql/data:/var/lib/mysql/
      # 日志挂载
      - ./docker/mysql/logs:/var/log/mysql
      # 数据库初始化
      - ./docker/mysql/init:/docker-entrypoint-initdb.d
      # 数据库脚本
      - ./docker/mysql/sql:/opt/mysql/sql
    privileged: true
    command:
      # 将mysql8.0默认密码策略 修改为 原先 策略 (mysql8.0对其默认策略做了更改 会导致密码无法匹配)
      --default-authentication-plugin=mysql_native_password
      --character-set-server=utf8mb4
      --collation-server=utf8mb4_general_ci
      --explicit_defaults_for_timestamp=true
      --lower_case_table_names=1

允许外部连接:

mysql -uroot -proot
GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY 'root' WITH GRANT OPTION;
FLUSH PRIVILEGES;
chmod 644 /etc/mysql/conf.d/mysql.cnf
mysql -h 127.0.0.1 -P 3306 -uroot -proot

MySQL Infrastructure

MySQL Server Layer

MySQL Infrastructure

MongoDB

MongoDB Setup

MongoDB Installation

sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv EA312927
echo "deb http://repo.mongodb.org/apt/ubuntu trusty/mongodb-org/3.2 multiverse"
\ | sudo tee /etc/apt/sources.list.d/mongodb-org-3.2.list
sudo apt-get update
sudo apt-get install -y mongodb-org mongodb-org-server
\ mongodb-org-shell mongodb-org-mongos mongodb-org-tools

MongoDB Not Upgrade

echo "mongodb-org hold" | sudo dpkg --set-selections
echo "mongodb-org-server hold" | sudo dpkg --set-selections
echo "mongodb-org-shell hold" | sudo dpkg --set-selections
echo "mongodb-org-mongos hold" | sudo dpkg --set-selections
echo "mongodb-org-tools hold" | sudo dpkg --set-selections

MongoDB Control

sudo service mongod start
sudo service mongod stop
sudo service mongod restart

MongoDB Uninstall

sudo service mongod stop
sudo apt-get purge mongodb-org*
sudo rm -r /var/log/mongodb
sudo rm -r /var/lib/mongodb

Shell Instruction

create and drop

create

use test
show dbs

drop

use dbToDrop
db.dropDatabase()

query

db.collection.find().pretty()

insert

db.collection.insert();

information

database

db.getName()
db.stats()
db.version()
db.getMongo()

collection

db.getCollectionNames()
db.printCollectionStats()

Basic Concepts​

Feature​

Common Words​

Data Format​

XML​

DTD​

XSD​

JSON​

Relational Algebra​

Operators​

Higher-Level Database Design Models​

UML​

Classes​

Associations​

Associations Classes​

Subclasses​

Composition and Aggregation​

Entity Relationship Model​

SQL​

Basic SQL Statements​

Table Management​

WHERE Condition​

Order​

Group​

Aggregation​

Limit​

Distinct​

Set Operations​

Join​

Index​

View​

Export and Import​

Relational Design​

Decomposition​

Functional Dependencies​

BCNF​

Multi Valued Dependencies​

4NF​

Normalized Design​

Indexes​

Transactions​

Transaction Standard​

ACID Properties​

Isolation Level​

Integrity Constraints​

Triggers (DBMS Level Constraints)​

Views​

Modifications on Views​

Implements Modification with Triggers​

SQL Standard - Updatable Views​

MySQL​

MySQL Setup​

MySQL Infrastructure​

MongoDB​

MongoDB Setup​

MongoDB Installation​

MongoDB Not Upgrade​

MongoDB Control​

MongoDB Uninstall​

Shell Instruction​

create and drop​

create​

drop​

query​

insert​

information​

database​

collection​

Basic Concepts

Feature

Common Words

Data Format

XML

DTD

XSD

JSON

Relational Algebra

Operators

Higher-Level Database Design Models

UML

Classes

Associations

Associations Classes

Subclasses

Composition and Aggregation

Entity Relationship Model

SQL

Basic SQL Statements

Table Management

WHERE Condition

Order

Group

Aggregation

Limit

Distinct

Set Operations

Join

Index

View

Export and Import

Relational Design

Decomposition

Functional Dependencies

BCNF

Multi Valued Dependencies

4NF

Normalized Design

Indexes

Transactions

Transaction Standard

ACID Properties

Isolation Level

Integrity Constraints

Triggers (DBMS Level Constraints)

Views

Modifications on Views

Implements Modification with Triggers

SQL Standard - Updatable Views

MySQL

MySQL Setup

MySQL Infrastructure

MongoDB

MongoDB Setup

MongoDB Installation

MongoDB Not Upgrade

MongoDB Control

MongoDB Uninstall

Shell Instruction

create and drop

create

drop

query

insert

information

database

collection