Thursday, December 26, 2019

Add space to root partition (using /home space)

The /home partition on one of my servers had more disk space than it needed. Below are the steps to shrink /home and give the space to the root partition (I preferred to delete the logical volume and then recreate it, preserving all content):

If you are not logged in as root, first do the following:
  1. create a temporary user; -m creates the home directory and -d sets its path (outside /home):
    1. useradd -m -d /tmp/temp_usr temp_usr
  2. usermod -aG wheel temp_usr
  3. id temp_usr
  4. passwd temp_usr # set a password for the temporary user
  5. ssh temp_usr@needed_server_ip
  6. sudo su -
Then, as root:
  1. df -h
  2. ll /home/
  3. mkdir /tmp/temp
  4. cp -a /home /tmp/temp/ # -a (archive) copies recursively and preserves permissions, ownership and timestamps
  5. ll /tmp/temp/home/
  6. umount -fl /home # -f forces the unmount (e.g. for an unreachable NFS), -l detaches lazily
  7. lvs # find the volume group name (VolGroupName) and the LV name of /home
  8. lvremove /dev/VolGroupName/lv_home
  9. pvs # how much free space we have
  10. lvextend -L+350G /dev/VolGroupName/lv_root
  11. resize2fs /dev/VolGroupName/lv_root # or xfs_growfs /dev/VolGroupName/lv_root
  12. pvs # how much free space we have
  13. lvcreate -L 50G -n lv_home VolGroupName
  14. pvs
  15. lvs
  16. mkfs.ext4 /dev/mapper/VolGroupName-lv_home # or mkfs.xfs /dev/mapper/VolGroupName-lv_home
  17. df -h
  18. mount -a # mount all file systems listed in /etc/fstab
  19. df -h
  20. cp -a /tmp/temp/home/* /home
  21. ll /home
  22. rm -rf /tmp/temp

Monday, November 18, 2019

Types of UTP cables

(U/UTP) - simple twisted-pair
(U/FTP) - each pair is foiled
(F/UTP, S/UTP, SF/UTP) - all 4 pairs are foiled (F) or shielded (S) together / each pair has no shield of its own, and the pairs are separated by a plastic cross-divider
(F/FTP, S/FTP, SF/FTP) - all 4 pairs are foiled (F) or shielded (S) together / each pair is foiled
U - unshielded
F - foil
S - shield (grid of thin wires)
CAT 5e, 6/6A and 8/8.1 - are mainly F/UTP
CAT 7/7A and 8.2 - are mainly S/FTP

Monday, October 14, 2019

CentOS 7 OpenVPN

yum install epel-release
yum install openvpn

In pfSense, download the client configuration file from VPN > OpenVPN > Client Export > Inline Configurations > Most Clients (for example test.ovpn)

mv test.ovpn /etc/openvpn/test.conf

Add a new user in pfSense under System > User Manager and write down the username and password

openvpn --config /etc/openvpn/test.conf # enter your username and password when prompted

Monday, September 30, 2019

SAP 2

SAP HANA Sizing


SAP HANA Pre OS Installation Sizing:

Sizing is needed for these categories:
  1. memory for static data
  2. memory for objects created during runtime
  3. disk sizing
  4. CPU sizing

RAM
SDF (Source Data Footprint) is the current Oracle DB size without indexes and BLOBs (for Oracle use dba_segments).
RAM size = SDF * 2 / 7 (e.g. RAM size = 300 * 2 / 7 ≈ 85GB, but the HANA minimum recommendation is 128GB of RAM)

Explanation of the formula:
  1. Both static (in-memory store) and dynamic (executing queries and loading data) RAM are sized. "2" comes from assuming RAM dynamic = RAM static.
  2. "7" comes from the average compression factor, database table size : HANA memory = 7 : 1

HDD

Backup storage >= /hana/data + /hana/log
  • OS = 50GB 
  • /hana/data  = 1 * SDF (i.e. Disk data = 1 * 300 = 300GB)
  • /hana/shared = min(1 * RAM, 1TB) (i.e. 85GB)
  • /hana/log (1/2 * 85 = 42.5GB):
    • if RAM Size <= 512GB = 1/2 * RAM
    • if RAM Size > 512GB = 512GB
  • /usr/sap = 50GB
  • Total HDD = 50 + 300 + 85 + 42.5 + 50 = 527.5GB
CPU

Approximation:
0.0625 core/vCPU is needed per 1GB of RAM

core/vCPU = 0.0625 * 128 = 8

Sizing summary
SDF = 300GB
RAM = 128GB
HDD = 527.5GB
core/vCPU = 8
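
The rules of thumb above can be collected into a small calculator. This is only a sketch of the arithmetic in this post, not an official SAP sizing tool: the function name `hana_sizing` and its parameters are made up for illustration, the calculated RAM is truncated to whole GB (85) as in the example, disk sizes use the calculated RAM and the CPU estimate uses the 128GB minimum, exactly as the numbers above do.

```python
def hana_sizing(sdf_gb, min_ram_gb=128, os_gb=50, usr_sap_gb=50):
    """Rule-of-thumb HANA sizing following the formulas in this post (sizes in GB)."""
    # RAM = SDF * 2 / 7, truncated to whole GB as in the worked example (300 -> 85)
    ram_calc_gb = int(sdf_gb * 2 / 7)
    # HANA minimum recommendation (128GB) wins if the calculated value is smaller
    ram_gb = max(ram_calc_gb, min_ram_gb)

    data_gb = 1 * sdf_gb                       # /hana/data = 1 * SDF
    shared_gb = min(ram_calc_gb, 1024)         # /hana/shared = 1 * RAM, capped at 1TB
    log_gb = ram_calc_gb / 2 if ram_calc_gb <= 512 else 512  # /hana/log
    total_disk_gb = os_gb + data_gb + shared_gb + log_gb + usr_sap_gb

    cores = 0.0625 * ram_gb                    # 0.0625 core/vCPU per 1GB of RAM

    return {
        "ram_calc_gb": ram_calc_gb,
        "ram_gb": ram_gb,
        "data_gb": data_gb,
        "shared_gb": shared_gb,
        "log_gb": log_gb,
        "total_disk_gb": total_disk_gb,
        "cores": cores,
    }


summary = hana_sizing(300)
print(summary)
```

For SDF = 300GB this reproduces the summary above: 128GB RAM, 527.5GB of disk and 8 cores.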

SAP 1

What is SAP

SAP ERP (Enterprise Resource Planning) software is the central application that covers Sales and Distribution (SAP SD), Production Planning (SAP PP), Financials (SAP FI), Human Capital Management (SAP HCM) and many more.
SAP ERP is a transactional system: in a productive business environment data is written to its database frequently.

Analytical operations are usually performed in SAP BW or HANA models.

SAP BW (Business Warehouse) - data warehouses (DW) are applications on a separate database that take data from source systems, cleanse it with an ETL (Extraction, Transformation, Loading) process, apply business logic and store it optimally for faster access from reports. The ETL process gives quicker access to the KPIs of the company (Key Performance Indicators, which are measured across time frames and used to make forecasts and decisions).

Before SAP BW on HANA arrived, one of the biggest drawbacks of SAP BW was that there were too many layers of redundant data from pre-cleansing, cleansing and multiple layers of transformation. All this added latency (the time data takes to get from the source to the reporting layer) made it a bulky truck which always got the job done but was never meant for a street race. Then came SAP HANA, SAP's new revolutionary database, and with it the idea to take the BW application off the traditional databases and put it on top of HANA.

The successor of BW on HANA is now BW/4HANA.

SAP flow is as follows:
1) ERP
2) Cleansing and homogenization of data
3) BI
4) from BI to Business Strategy
5) ERP

Calculating ASA Throughput

  1. copy the output of the command below to the asaThroughput text file:
    1. show traffic | begin Aggregated Traffic on Physical Interface
  2. delete the traffic statistics of all interfaces with the word "Internal" in their name
  3. sum the byte counters in all interface statistics (lines with 0 packets are skipped):
    1. bytes=$(grep -E "bytes$" asaThroughput | \
       grep -vE "^[[:space:]]*0 packets" | \
       awk '{s=s+$3} END {print s}')
  4. take the first occurrence of the number of seconds the statistics cover:
    1. seconds=$(grep -m 1 secs asaThroughput | \
       awk '{print $3}' | \
       cut -d\. -f 1)
  5. get the final result (ASA throughput in Mbps):
    1. echo $((bytes/seconds*8/1024/1024))
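
The same calculation can be sketched in Python. The text in `SAMPLE_OUTPUT` is a made-up fragment that only imitates the layout the shell pipeline relies on (a "... secs" line, then "N packets M bytes" lines); the function name `asa_throughput_mbps` is also just an illustration. It uses the same integer arithmetic as the shell version.

```python
import re

# Hypothetical fragment in the shape the grep/awk pipeline expects
SAMPLE_OUTPUT = """\
outside:
        received (in 100.000 secs):
                1000000 packets 1310720000 bytes
                10000 pkts/sec  13107200 bytes/sec
        transmitted (in 100.000 secs):
                1000000 packets 1310720000 bytes
                10000 pkts/sec  13107200 bytes/sec
"""


def asa_throughput_mbps(text):
    """Sum all 'N packets M bytes' counters and divide by the first 'secs' value."""
    total_bytes = 0
    seconds = None
    for line in text.splitlines():
        fields = line.split()
        # counter lines end with the word "bytes"; the byte count is the 3rd field
        if line.rstrip().endswith("bytes") and len(fields) >= 3:
            total_bytes += int(fields[2])
            continue
        # first occurrence of "<number> secs", keep only the whole seconds
        match = re.search(r"([\d.]+)\s+secs", line)
        if seconds is None and match:
            seconds = int(float(match.group(1)))
    if not seconds:
        raise ValueError("no 'secs' line found")
    # same order of integer operations as: $((bytes/seconds*8/1024/1024))
    return total_bytes // seconds * 8 // 1024 // 1024


print(asa_throughput_mbps(SAMPLE_OUTPUT))
```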

Friday, September 27, 2019

DSL chip codes/abbreviations

ALCB Alcatel
ANDV Analog Devices
BDCM Broadcom
GSPN Globespan
IKNS Ikanos
IFTN Infineon
META Metanoia
STMI STMicroelectronics
TSTC Texas Instruments

Monday, August 19, 2019

Writing special symbols (Mathematics, Statistics) in HTML


x-bar = x̄ = x&#772; or x&#x0304; (hex)
x-hat = x̂ = x&#770; or x&#x0302; (hex)
x-arrow = x⃗ = x&#8407;
degree = ° = &#176;
left ceiling = ⌈ = &#8968;
right ceiling = ⌉ = &#8969;
left floor = ⌊ = &#8970;
right floor = ⌋ = &#8971;
real-numbers set = ℝ = &#8477;
greek uppercase alpha = Α = &#913;
function = ƒ = &#402;
Sum = ∑ = &#8721;
Hadamard product = ⊙ = &#8857;
dot = ⋅ = &#8901;
not equal = ≠ = &#8800;
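
A quick way to double-check such codes is to decode them programmatically. The sketch below uses Python's standard `html.unescape`; the dictionary name `entities` and the selection of symbols are just for illustration.

```python
import html

# name -> (HTML markup from the list above, expected Unicode text)
entities = {
    "x-bar": ("x&#772;", "x\u0304"),
    "x-bar (hex)": ("x&#x0304;", "x\u0304"),
    "x-hat": ("x&#770;", "x\u0302"),
    "x-arrow": ("x&#8407;", "x\u20d7"),
    "degree": ("&#176;", "\u00b0"),
    "left ceiling": ("&#8968;", "\u2308"),
    "right floor": ("&#8971;", "\u230b"),
    "real-numbers set": ("&#8477;", "\u211d"),
    "Sum": ("&#8721;", "\u2211"),
    "Hadamard product": ("&#8857;", "\u2299"),
    "not equal": ("&#8800;", "\u2260"),
}

# decode every numeric character reference and compare with the expectation
decoded = {name: html.unescape(markup) for name, (markup, _) in entities.items()}
mismatches = [name for name, (_, expected) in entities.items()
              if decoded[name] != expected]

for name, text in decoded.items():
    print(f"{name}: {text}")
print("mismatches:", mismatches)
```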

Linear Algebra 4. Matrix multiplication.

When you apply one LT and then another LT (example: a 90° clockwise rotation and then a shear (shift)), the overall effect is another LT which is the composition of the two LTs. This single LT captures the overall effect of applying both LTs.
Applying several LTs to one vector is like using several functions and using the output of one function as the input of the next:
ƒ(g(x)) where:

  1. g is the first LT with input "x"
  2. ƒ is the second LT with input from the previous LT
So, the same as with functions, we apply LTs from right to left.
The composition of two LTs is the multiplication / product / dot product of the two LTs, i.e. the product of two matrices:
$\mathbf{C} = \mathbf{A}\mathbf{B}$ where:

  1. A must have the same number of columns as B has rows, or mathematically:
    1. $\mathbf{A}_{l \times m}$ and $\mathbf{B}_{m \times n}$
    2. an easy way to check whether the dot product is possible:
      1. example: $\mathbf{A}_{2 \times 3}$ and $\mathbf{B}_{3 \times 4}$
      2. write the dimensions of the matrices one after the other with an "=" sign between them:
        1. 2 x 3 = 3 x 4; as you see 3 = 3, so we can dot-product these matrices
      3. example: $\mathbf{A}_{2 \times 2}$ and $\mathbf{B}_{3 \times 3}$
        1. 2 x 2 = 3 x 3; as you see 2 = 3 is not true, so we can't multiply these matrices
  2. $C_{i,j} = \sum_{k=1}^{m} A_{i,k} B_{k,j}$
    this means: for k from 1 to m multiply each $A_{i,k}$ by $B_{k,j}$ and sum the products, for every i = {1,2,...,l} and j = {1,2,...,n}
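For example, if we have two matrices and want to find their dot-product:
$\mathbf{A} = \begin{bmatrix} 0 & 2 \\ 1 & 0 \end{bmatrix}$, $\mathbf{B} = \begin{bmatrix} 1 & -2 \\ 1 & 0 \end{bmatrix}$
For $\mathbf{A}_{2 \times 2}$ and $\mathbf{B}_{2 \times 2}$ find m: 2 x 2 = 2 x 2, 2 = 2, so m = 2. It shows how many products of elements of A and B we sum for each element of C.

Change the general formula for this particular case:
$C_{i,j} = \sum_{k=1}^{2} A_{i,k} B_{k,j}$
Then:
$C_{1,1} = \sum_{k=1}^{2} A_{1,k} B_{k,1} = A_{1,1} B_{1,1} + A_{1,2} B_{2,1} = 0 \cdot 1 + 2 \cdot 1 = 2$
$C_{1,2} = A_{1,1} B_{1,2} + A_{1,2} B_{2,2} = 0 \cdot (-2) + 2 \cdot 0 = 0$
$C_{2,1} = A_{2,1} B_{1,1} + A_{2,2} B_{2,1} = 1 \cdot 1 + 0 \cdot 1 = 1$
$C_{2,2} = A_{2,1} B_{1,2} + A_{2,2} B_{2,2} = 1 \cdot (-2) + 0 \cdot 0 = -2$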

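The simplest way to calculate a matrix dot product is to approach it as a series of matrix-vector products:
First we find where î of the right matrix (its first column) lands after applying the left matrix (LT):
$\begin{bmatrix} 0 & 2 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$
Secondly we find where ĵ of the right matrix (its second column) lands after applying the left matrix (LT):
$\begin{bmatrix} 0 & 2 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} -2 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ -2 \end{bmatrix}$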
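
Both ways of computing the product can be checked with a few lines of plain Python. This is a minimal sketch using nested lists; the helper names `matmul` and `matvec` are made up here, and the matrices are the 2 x 2 example matrices from this post.

```python
def matmul(A, B):
    """C[i][j] = sum over k of A[i][k] * B[k][j]."""
    m = len(B)  # rows of B must equal columns of A
    if any(len(row) != m for row in A):
        raise ValueError("A must have as many columns as B has rows")
    return [[sum(A[i][k] * B[k][j] for k in range(m))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def matvec(A, v):
    """Apply the LT described by matrix A to vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


A = [[0, 2],
     [1, 0]]
B = [[1, -2],
     [1, 0]]

# element-by-element formula
C = matmul(A, B)

# column view: apply A to each column of B (the transformed i-hat and j-hat)
columns_of_C = [matvec(A, [row[j] for row in B]) for j in range(len(B[0]))]
# put the resulting columns back side by side
C_from_columns = [list(row) for row in zip(*columns_of_C)]

print(C)
print(columns_of_C)
```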

Matrix product properties:
A(B + C) = AB + AC distributive property
A(BC) = (AB)C associative property
But AB ≠ BA in general (because a matrix is an LT, and an LT is like a function, so we apply right to left and the order matters)

There is also an element-wise matrix product (the Hadamard product). It is defined only for matrices of the same shape:
$\mathbf{C} = \mathbf{A} \odot \mathbf{B}$ where $C_{i,j} = A_{i,j} B_{i,j}$
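
The properties above can be verified numerically on small matrices. A minimal sketch: A and B are the example matrices from this post, while the third matrix `C3` and the helper names are arbitrary choices made for the illustration.

```python
def matmul(A, B):
    """Matrix product: C[i][j] = sum over k of A[i][k] * B[k][j]."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def add(A, B):
    """Element-wise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def hadamard(A, B):
    """Element-wise (Hadamard) product of two matrices of the same shape."""
    return [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


A = [[0, 2], [1, 0]]
B = [[1, -2], [1, 0]]
C3 = [[1, 1], [0, 1]]  # arbitrary third matrix for the checks

distributive = matmul(A, add(B, C3)) == add(matmul(A, B), matmul(A, C3))
associative = matmul(A, matmul(B, C3)) == matmul(matmul(A, B), C3)

AB = matmul(A, B)
BA = matmul(B, A)
elementwise = hadamard(A, B)

print(distributive, associative)
print(AB, BA)
print(elementwise)
```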

In 3-D the basis vectors are î, ĵ and k̂, and a vector is their linear combination:

v⃗ = xî + yĵ + zk̂

These materials were used while preparing this blog-post:
  1. https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab
  2. https://www.deeplearningbook.org/
  3. NBGtLA by https://minireference.com/

Linear Algebra 3. Linear transformations and matrices, matrix operations.

A linear transformation (LT) is like a function that transforms (changes) a vector: ƒ(x) => L(v⃗)
So linear transformation takes some input vector and produces some output vector.
A transformation is linear if:

  1. all lines (of the coordinate system grid) remain straight after the transformation and do not become curved (horizontal, vertical and diagonal lines). In other words, grid lines remain parallel and evenly spaced
  2. the origin remains fixed in place
Example of an LT: a 90° clockwise rotation about the origin. How can we describe an LT numerically? We have an input vector with coordinates $[x_{in}, y_{in}]$ and an output vector with coordinates $[x_{out}, y_{out}]$. We know that each vector is just a linear combination of the basis/unit vectors, so we can rewrite the coordinates like this:

  1. $[x_{in}, y_{in}] = x_{in}\hat{\imath} + y_{in}\hat{\jmath}$
  2. the linear combination remains the same even after applying the LT, so we just use the transformed versions of î and ĵ => LT(î) and LT(ĵ)
  3. $[x_{out}, y_{out}] = x_{in} LT(\hat{\imath}) + y_{in} LT(\hat{\jmath})$
Example of 90° clockwise rotation LT:

  1. Take a sheet of squared paper and draw the two unit vectors; for convenience, each with a length of 2 squares.
  2. If we make a 90° clockwise rotation LT then:
    1. we move î 90° clockwise - now î points down the y axis and the LT(î) coordinates (in terms of the old grid, before the transformation) are [0, -1].
    2. we move ĵ 90° clockwise - now ĵ lies on the x axis and the LT(ĵ) coordinates (in terms of the old grid, before the transformation) are [1, 0].
  3. if we have some vector v with coordinates [3,2]:
    1. LT(v) = 3LT(î) + 2LT(ĵ) = 3[0, -1] + 2[1,0] = [0, -3] + [2,0] = [2, -3] in terms of the grid before the transformation
We can describe an LT of 2D space (the Cartesian plane) with 4 numbers: 2 for the î coordinates and 2 for the ĵ coordinates. We can package these coordinates in a two-by-two grid of numbers, an array of numbers, or in terms of LA a matrix. The matrix will have 2 columns and 2 rows:

  1. columns - 1st is î coordinates and 2nd is  ĵ coordinates
  2. rows - 1st is x axis coordinates of î and ĵ , and 2nd - y axis coordinates of î and ĵ
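$\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} 3 \\ 2 \end{bmatrix} = 3 \begin{bmatrix} 0 \\ -1 \end{bmatrix} + 2 \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \cdot 3 + 1 \cdot 2 \\ -1 \cdot 3 + 0 \cdot 2 \end{bmatrix} = \begin{bmatrix} 2 \\ -3 \end{bmatrix}$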

Above we rewrote our linear combination as matrix-vector multiplication.
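
The "columns are the transformed basis vectors" idea can be sketched in a few lines of Python. The function name `apply_lt` is made up for this illustration; the matrix is the 90° clockwise rotation from the example above.

```python
def apply_lt(matrix, v):
    """Apply a 2x2 LT to v as the linear combination x*LT(i-hat) + y*LT(j-hat)."""
    lt_i_hat = [matrix[0][0], matrix[1][0]]  # 1st column: where i-hat lands
    lt_j_hat = [matrix[0][1], matrix[1][1]]  # 2nd column: where j-hat lands
    x, y = v
    return [x * lt_i_hat[0] + y * lt_j_hat[0],
            x * lt_i_hat[1] + y * lt_j_hat[1]]


# 90 degree clockwise rotation: i-hat -> [0, -1], j-hat -> [1, 0]
rotate_cw = [[0, 1],
             [-1, 0]]

rotated = apply_lt(rotate_cw, [3, 2])
print(rotated)
```

Rotating the result three more times should bring the vector back to [3, 2], which is an easy sanity check for a rotation matrix.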

By convention we denote a matrix in bold upper-case, like $\mathbf{A}$, and the elements of a matrix in non-bold upper-case with subscripts, like $A_{1,1}$.

$\mathbf{A}_{m \times n}$ is a matrix with a height of m (rows) and a width of n (columns)
$A_{1,1}$ is the element at the intersection of row 1 and column 1; using the LT matrix from the example above, $A_{1,1} = 0$
To denote a real-valued matrix $\mathbf{A}_{m \times n}$: $\mathbf{A} \in \mathbb{R}^{m \times n}$
The colon symbol ":" represents "all", all rows or all columns:
  1. All numbers/elements of the matrix in column i: $\mathbf{A}_{:,i}$
    1. $\mathbf{A}_{:,1}$ equals the set {0, -1}, the 1st column of A
  2. All numbers/elements of the matrix in row i: $\mathbf{A}_{i,:}$
    1. $\mathbf{A}_{2,:}$ equals the set {-1, 0}, the 2nd row of A
If we use more than 2 axes (2 axes is 2D), then we call such an array a tensor.

We can add matrices of the same shape by adding their corresponding elements/numbers:
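$\mathbf{C} = \mathbf{A} + \mathbf{B}$ where $C_{i,j} = A_{i,j} + B_{i,j}$

$\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} + \begin{bmatrix} -3 & 8 \\ -1 & -5 \end{bmatrix} = \begin{bmatrix} 0 + (-3) & 1 + 8 \\ -1 + (-1) & 0 + (-5) \end{bmatrix} = \begin{bmatrix} -3 & 9 \\ -2 & -5 \end{bmatrix}$

To add a scalar to a matrix or to multiply a matrix by a scalar, we perform the addition or multiplication on each element of the matrix:
$\mathbf{D} = a\mathbf{B} + c$ where $D_{i,j} = aB_{i,j} + c$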
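
A short sketch of these element-wise operations and of the ":" notation, using nested Python lists. The helper names and the scalars a = 2, c = 1 are arbitrary choices for the illustration; A and B are the matrices from the addition example.

```python
def add_matrices(A, B):
    """C[i][j] = A[i][j] + B[i][j]; both matrices must have the same shape."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must have the same shape")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scale_and_shift(a, B, c):
    """D[i][j] = a * B[i][j] + c."""
    return [[a * b + c for b in row] for row in B]


A = [[0, 1],
     [-1, 0]]
B = [[-3, 8],
     [-1, -5]]

C = add_matrices(A, B)
D = scale_and_shift(2, B, 1)

# the ":" notation: all rows of column 1, and all columns of row 2 (1-based)
first_column = [row[0] for row in A]   # A[:, 1]
second_row = A[1]                      # A[2, :]

print(C, D, first_column, second_row)
```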



 These materials were used while preparing this blog-post:
  1. https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab
  2. https://www.deeplearningbook.org/
  3. NBGtLA by https://minireference.com/

Tuesday, August 6, 2019

Linear Algebra 2. Unit vectors, linear combinations, basis.

Each coordinate of a vector is a scalar that stretches or squishes a unit vector. Unit vectors start (as every vector does) at the origin, are orthogonal (perpendicular) to each other and have a length of one unit along the corresponding axis. The unit can be anything you want: 1 centimeter, 1 meter, 1 millimeter etc. So:
  1. unit vector on x axis is î (i-hat) with coordinates [1,0] meaning 1 of x, 0 of y
  2. unit vector on y axis is ĵ (j-hat) with coordinates [0,1] meaning 0 of x, 1 of y
  3. so first we write x coordinate, then y, then (if any) z etc.
So each and every vector is a sum of scaled unit vectors. We use vector-scalar multiplication and then vector addition. Thus we make a linear combination of î and ĵ (the result is again a vector, an arrow; here 3 and 2 are the scalars):
[3,2] = 3î + 2 ĵ = 3[1,0] + 2[0,1] = [3,0] + [0,2] = [3,2]

î and ĵ are also called the basis vectors of the x-y coordinate system.

We can choose different basis vectors (not unit vectors) and get a completely new coordinate system. So when we describe vectors numerically, the description depends on the choice of basis vectors.
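
A small sketch of how the same pair of scalars describes different arrows under different bases. The function name `linear_combination` and the second basis `other_basis` are made up for the illustration.

```python
def linear_combination(scalars, basis):
    """Sum of basis vectors, each scaled by the matching scalar."""
    dims = len(basis[0])
    return [sum(s * b[d] for s, b in zip(scalars, basis)) for d in range(dims)]


standard_basis = [[1, 0], [0, 1]]   # i-hat and j-hat
other_basis = [[2, 0], [1, 1]]      # hypothetical non-unit basis vectors

in_standard = linear_combination([3, 2], standard_basis)
in_other = linear_combination([3, 2], other_basis)

print(in_standard)  # 3*[1,0] + 2*[0,1]
print(in_other)     # 3*[2,0] + 2*[1,1]
```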



These materials were used while preparing this blog-post:
  1. https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab
  2. https://www.deeplearningbook.org/
  3. NBGtLA by https://minireference.com/

Monday, August 5, 2019

Docker firewalling

Docker containers are not host services. They rely on a virtual network in your host, and the host acts as a gateway for this network. So container traffic is routed traffic and it passes through the FORWARD chain.
In fact the Docker daemon creates several iptables chains to set up container connectivity, and the DOCKER chain controls access to the Docker containers. Traffic from the FORWARD chain is passed on to the DOCKER chain. You should not modify the rules Docker adds to your iptables policies. For manually added rules you must use the DOCKER-USER chain. Rules from the DOCKER-USER chain are evaluated before the DOCKER chain rules.

To restrict access to a container which uses the docker bridge network (-I inserts the rule at the first position in the rules list):
add rule: iptables -I DOCKER-USER ruleHere -j [ACCEPT|DROP]
remove rule: iptables -D DOCKER-USER ruleNumberHere

For example:
list all rules in DOCKER-USER chain:
iptables -L DOCKER-USER
or more verbose with numeric ports:
iptables -L DOCKER-USER -vn
deny access to all containers from IP address 10.10.10.11:
iptables -I DOCKER-USER -s 10.10.10.11 -j DROP
deny access to the containers TCP port 5000 (this port is container port, not host port of the port-mapping):
iptables -I DOCKER-USER -p tcp -m tcp --dport 5000 -j DROP

macvlan driver

Everything below (till the end of the blog-post) can be used for any container, not just for those using the macvlan driver.

With network namespaces, you can have different and separate instances of network interfaces and routing tables that operate independently of each other.
By default the only namespace on a Linux machine is the "default" or "global" namespace (physical interfaces exist here).

From the docker-host:


Make directory for network namespaces linking (done on the container host only once):
mkdir -p /var/run/netns

Find PID of the container:
CPID=$(docker inspect --format='{{ .State.Pid }}' containerName)

Create the link:
LINK="/var/run/netns/$CPID"
ln -s "/proc/$CPID/ns/net" "$LINK"

All container related proc entries are under:
/proc/$CPID/ns/net

Drop all packets in the network namespace of the found container PID:
ip netns exec $CPID iptables -I INPUT -j DROP
ip netns exec $CPID iptables -I OUTPUT -j DROP

Allow only incoming and outgoing ICMP packets:
ip netns exec $CPID iptables -I INPUT -p icmp -j ACCEPT
ip netns exec $CPID iptables -I OUTPUT -p icmp -j ACCEPT

Viewing all container iptables rules:
ip netns exec $CPID iptables -L

Remove the link when it is no longer needed:
rm -f $LINK

From the container itself:


To use iptables inside container itself, you must run container with NET_ADMIN privilege
docker run --cap-add=NET_ADMIN --name='ctr0' --hostname='ctr0' -it centos /bin/bash

From the container bash:
yum install net-tools
yum install iptables

Now you can restrict all access but ICMP:
iptables -I INPUT -j DROP
iptables -I OUTPUT -j DROP
iptables -I INPUT -p icmp  -j ACCEPT
iptables -I OUTPUT -p icmp  -j ACCEPT
iptables -L

Linear Algebra 1. What is vector and scalar.

There are 3 views on vectors:

  1. physics view - arrows pointing in space, having length, direction and also you can move it all around - it is still the same vector
  2. computer science - ordered lists of numbers (order matters) and dimension describes length of that list
  3. mathematics - generalize both views: a vector can be anything where there is sensible notion of adding 2 vectors and multiplying a vector by a number: v⃗+w⃗ and 2v⃗
Geometrically, a vector is an arrow inside a coordinate system, and its coordinates describe the move from the origin ([0,0] in the Cartesian coordinate system) to the tip of the vector.

Vector addition is like encoding the endpoint of a whole route as a group of vectors, one starting at each turn, each of them encoding the direction and length of that part of the route:

  1. whole way: go 1 to the right and 2 up, then 3 to the right and 1 down:
    1. here we have 2 parts of the whole way:
      1. 1 to the right and 2 up - we'll encode that v⃗  [1,2]
      2. 3 to the right and 1 down - we'll encode that w⃗ [3,-1]
  2. so we have 2 vectors - v⃗  [1,2] and w⃗ [3,-1], 
  3. then v⃗+w⃗ = [1+3 , 2 + (-1)] = [4,1]
  4. for better understanding:
    1. take a piece of squared paper
    2. draw the whole way using notebook squares to measure steps
    3. draw v and w vectors on the Cartesian plane
Multiplication by a number means stretching or squishing the vector, or changing its direction:
if v⃗ is [1,2], then 2v⃗ = 2[1,2] = [2*1 , 2*2] = [2, 4]. This is also called scaling, and the numbers used to scale (stretch, squish, change the direction) are called scalars. A scalar is just a single number.

We can identify each individual number in a vector by its index: v⃗ = [1,3,5,7,9,2], $v_3 = 5$

By convention we can show vector in bold lowercase or in non-bold lowercase with the arrow above (v or v⃗) and vector elements are non-bold lowercase with subscript.

If we want to index a set of elements of a vector, then we define a set containing the indices and write this set as a subscript:

  1. x is [2,3,4,6,1,8,4] and we need the 1st, 4th and 5th elements ($x_1, x_4, x_5$)
  2. define the set S = {1,4,5}
  3. $x_S$
$x_{-1}$ means all elements but $x_1$
$x_{-S}$ means all elements but $x_1, x_4, x_5$
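
The operations from this post can be tried out with plain Python lists. A minimal sketch; the helper names are made up, and the indices are 1-based to match the notation above (Python itself counts from 0).

```python
def add(v, w):
    """Vector addition: add corresponding coordinates."""
    return [a + b for a, b in zip(v, w)]


def scale(c, v):
    """Multiply every coordinate of v by the scalar c."""
    return [c * a for a in v]


def select(x, S):
    """x_S: elements of x at the 1-based indices in S."""
    return [x[i - 1] for i in sorted(S)]


def exclude(x, S):
    """x_{-S}: all elements of x except those at the 1-based indices in S."""
    return [e for i, e in enumerate(x, start=1) if i not in S]


v, w = [1, 2], [3, -1]
x = [2, 3, 4, 6, 1, 8, 4]
S = {1, 4, 5}

print(add(v, w))        # v + w
print(scale(2, v))      # 2v
print(select(x, S))     # x_S
print(exclude(x, {1}))  # x_{-1}
print(exclude(x, S))    # x_{-S}
```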

These materials were used while preparing this blog-post:

Friday, August 2, 2019

Entropy

Entropy is a measure of uncertainty. High entropy means the data has high variance and thus contains a lot of information and/or noise. For instance, a constant function where f(x) = 4 for all x has no entropy: it is easily predictable, carries little information, has no noise and can be represented very briefly. Similarly, f(x) = ~4 has some entropy, while f(x) = random_number has very high entropy due to noise.

Information entropy is a concept from information theory. It tells how much information there is in an event. In general, the more certain or deterministic the event is, the less information it will contain. Put differently, the information carried by an event equals the uncertainty (entropy) that observing it removes. The concept of information entropy was created by mathematician Claude Shannon.

Generally speaking, information entropy is the average amount of information conveyed (sent, transported) by an event, when considering all possible outcomes (results).

Example:
we have 3 bags:

  • 1st with 4 red balls
  • 2nd with 3 red and 1 green balls
  • 3rd with 2 red and 2 green balls
Entropy and prior knowledge are opposites: the more we know in advance about the outcome, the lower the entropy. The more possible arrangements of the balls we have, the more entropy we get. So if we speak about the probability of the color when one ball is taken from the bag:

  • 1st bag has 100% probability of red color, so this bag has the least entropy
  • 2nd bag has 75% probability of red and 25% probability of green color, has medium entropy
  • 3rd bag has 50% probability of red and 50% probability of green color, has the greatest entropy
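
The ordering of the three bags can be confirmed with Shannon's formula, H = Σ p · log2(1/p), measured in bits. The formula itself is not spelled out in this post, so treat the sketch below as an illustration; the function names are made up.

```python
from math import log2


def entropy(probabilities):
    """Shannon entropy in bits: sum of p * log2(1/p) over outcomes with p > 0."""
    return sum(p * log2(1 / p) for p in probabilities if p > 0)


def bag_entropy(counts):
    """Entropy of the color of one ball drawn from a bag, given ball counts."""
    total = sum(counts.values())
    return entropy(c / total for c in counts.values())


bag1 = bag_entropy({"red": 4})
bag2 = bag_entropy({"red": 3, "green": 1})
bag3 = bag_entropy({"red": 2, "green": 2})

print(bag1, round(bag2, 4), bag3)
```

The values grow from the first bag to the third, matching the least / medium / greatest ranking above.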


Thursday, August 1, 2019

Tabular Data

Tabular data is the opposite of relational data, such as an SQL database. In tabular data everything is arranged in columns and rows. Every row has the same number of columns (a lacking or missing value is substituted by "N/A"; empty values, like the SQL NULL value, are not allowed in a tabular data structure). The first line of tabular data is most of the time a header describing the content of each column. The most used format for tabular data in data science is CSV (Comma-Separated Values). Columns are separated by a delimiter character (a tab, a comma, ...) which sets each column apart from its two neighbors.
It is best to think of tabular data as being "organized by row", where each row corresponds to a unique identifier such as the time a measurement was made (as opposed to SQL, where keys are used as the unique identifier). For example, you can store a phone-book as tabular data where each row shows a person's Name, Surname and Phone Number. To find relations between rows in tabular data you first need to load all the data into memory, and only after that can you find relations between rows (example: find all persons with numbers starting with +994, which is the code of Azerbaijan). If this phone-book is turned into a relational structure, the single phone-book table is normalized step by step:
  1. tabular data:
    • name;surname;address;zip;phone-number
    • name1;surname1;addressX;zipA;phone1,phone2
    • name2;surname2;addressY;zipB;phone1
    • name3;surname3;addressZ;zipA;phone1,phone2
  2. First Normal Form (1NF): no repeating groups ("phone" is a group: two columns like "phone1" and "phone2", or one column "phone" holding "phone1,phone2", are not allowed by 1NF). 1NF adds redundant/repeated values to the data:
    • name;surname;address;zipCode;phoneNumber
    • name1;surname1;addressX;zipA;phone1
    • name1;surname1;addressX;zipA;phone2
    • name2;surname2;addressY;zipB;phone1
    • name3;surname3;addressZ;zipA;phone1
    • name3;surname3;addressZ;zipA;phone2
  3. Second Normal Form (2NF): 1NF + all the non-key columns depend on the table's primary key, and the table serves a single purpose (each column must depend on the primary key and serve to describe what the primary key identifies; if not, move that column into another table). If we add a primary key rowID, this key uniquely identifies each row, but the person data does not describe what that key identifies, so we move all person-related data to another table. The main idea of 2NF is to reduce the amount of redundant/repeated data.
      1. We use a table to store all person-related data (name, surname, address, zip code):
        • personID;name;surname;address;zipCode
        • 100;name1;surname1;addressX;zipA
        • 200;name2;surname2;addressY;zipB
        • 300;name3;surname3;addressZ;zipA
      2. Now our phone-numbers table looks as follows (we must add rowID to uniquely identify each row), and it is in 2NF:
        • rowID;personID;phoneNumber
        • 1;100;phone1
        • 2;100;phone2
        • 3;200;phone1
        • 4;300;phone1
        • 5;300;phone2
  4. Third Normal Form (3NF): 2NF + the table contains only columns that are non-transitively dependent on the primary key. Non-transitively dependent means not dependent through another column. Dependency example: age depends on birth date. Transitive dependency example: we have 3 columns, PK, BMI (Body Mass Index) and oWtf (Over Weight True-False); PK helps to find BMI and oWtf, but oWtf also depends on BMI (BMI>25 is overweight), so oWtf relies on PK through BMI. So all columns in the table depend only on the primary key (2NF) and not on other columns:
    1. Our phone-number table is 3NF:
      • rowID;personID;phoneNumber
      • 1;100;phone1
      • 2;100;phone2
      • 3;200;phone1
      • 4;300;phone1
      • 5;300;phone2
    2. But our person table is only 2NF (each column is related to the primary key (PK)) and not 3NF: we can use PK to find the address of a person and also to find the zip code of the person, but at the same time the address depends on the zip code. This is a transitive dependency:
      1. Move address and zip code to a separate table:
        • addrID;address;zipCode
        • 111;addressX;zipA
        • 222;addressY;zipB
        • 333;addressZ;zipA
      2. Now our person table will be:
        • personID;name;surname;addrID
        • 100;name1;surname1;111
        • 200;name2;surname2;222
        • 300;name3;surname3;333

So we can say that relational data structures:

  1. are the same as tabular ones but with normalization applied, so that one table of tabular data becomes several relational tables (relations)
  2. allow NULL values, while tabular data doesn't
  3. can be queried selectively, while to query tabular data you need to load all the data into RAM

Monday, July 29, 2019

CAP Theorem

CAP theorem, also known as Brewer's theorem, states that it is impossible for a distributed data store to simultaneously provide more than two out of the following three guarantees:
  1. Consistency: the data on every non-failing node in the distributed system is the same. This means that updates must be applied across the distributed system before further reads are allowed.
  2. Availability: the term is used with two different meanings:
    1. Availability of a real service: can be measured as a ratio, expressed as a percentage, between the working and non-working time of the service
    2. Availability in the context of the CAP theorem: for a distributed system to be continuously available, every request received by a non-failing node in the system must result in a response. So data must be replicated between the nodes of the system, and a server is not allowed to ignore client requests.
  3. Partition tolerance: the system continues to operate even if any one part of the system is lost or fails. Partition tolerance doesn't require every node to still be available to handle requests. It just means that partitions may occur. If you deploy on a typical IP network, partitions will occur; partition tolerance in these environments is not optional. So only a total network failure can cause a system to respond incorrectly.
So in practice every distributed system that uses a network must provide P, and thus we have two possible types of systems: AP or CP. For systems not using a network, we have the CA, AP and CP models.
Conventional databases assume no partitioning - clusters were assumed to be small and local (CA).
NoSQL systems may sacrifice consistency. 

AP or CP:

  1. On systems that allow reads before updating all the nodes, we will get high Availability
  2. On systems that lock all the nodes before allowing reads, we will get Consistency

Description of the CAP theorem:
  1. Setup:
    1. we have a distributed system consisting of 2 servers, S1 and S2
    2. S1 and S2 are interconnected
    3. a client, C, connects to both S1 and S2
    4. C can query either of these servers (S1 or S2)
    5. S1 and S2 keep track of a variable v with the initial value 0 (v=0)
    6. a write is done from C to S1 or S2 (write request and write response) and a read is done from C to S1 or S2 (read request and read response)
  2. Consistency:
    1. consistent system:
      1. C write-request S1 => v=1
      2. S1 write => v=1
      3. S1 update S2
      4. S2 update => v=1
      5. S1 write-response C => v=1 (sent only after S2 has been updated)
    2. inconsistent system:
      1. C write-request S1 => v=1
      2. S1 write => v=1
      3. S1 write-response C => v=1 
      4. S2 is not updated and v on S2 is still v=0
  3. Partition:
    1. When a partition occurs, S1 and S2 are no longer interconnected
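
The setup above can be turned into a toy simulation. This is only a sketch: the class `Server`, the `mode` argument and the way a partition is modeled (a `link_up` flag) are assumptions made for the illustration; in particular, the "CP" and "AP" behaviors during a partition show the trade-off the theorem describes and are not part of the step list above.

```python
class Server:
    """One node of the two-node system; v starts at 0."""

    def __init__(self, name):
        self.name = name
        self.v = 0
        self.peer = None
        self.link_up = True   # False while the nodes are partitioned

    def write(self, value, mode):
        """Handle a client write request.

        mode "CP": refuse the write if the peer cannot be updated.
        mode "AP": always accept the write, replicate only if possible.
        """
        if mode == "CP" and not self.link_up:
            return "error"            # consistent, but not available
        self.v = value
        if self.link_up:
            self.peer.v = value       # update the other node before responding
        return "ack"

    def read(self):
        return self.v


def make_cluster():
    s1, s2 = Server("S1"), Server("S2")
    s1.peer, s2.peer = s2, s1
    return s1, s2


def partition(s1, s2):
    s1.link_up = s2.link_up = False


# No partition: the write reaches both nodes, reads are consistent
s1, s2 = make_cluster()
healthy = (s1.write(1, "CP"), s1.read(), s2.read())

# Partition, CP behavior: the write is refused, both nodes still hold v=0
s1, s2 = make_cluster()
partition(s1, s2)
cp_case = (s1.write(1, "CP"), s1.read(), s2.read())

# Partition, AP behavior: the write is accepted, S2 keeps the stale v=0
s1, s2 = make_cluster()
partition(s1, s2)
ap_case = (s1.write(1, "AP"), s1.read(), s2.read())

print(healthy, cp_case, ap_case)
```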

Thursday, July 25, 2019

DB Basics, Cross, Natural, Inner, Outer, Theta Join

In this blog-post I'll try to go from formal notions in Relational Algebra to the practical SQL using the same queries as in https://it-tuff.blogspot.com/2019/07/relational-algebra-db-basics-select.html.

Prerequisites for practical learning:
  1. install a mysql or mariadb server
  2. An RA Relation is a table in SQL, and tables live in a database:
    1. CREATE DATABASE Test;
    2. USE Test;
    3. SHOW DATABASES;
  3. An RA key is a PRIMARY KEY in SQL, an RA Attribute is a column in SQL and an RA Tuple is a row in SQL. To fill a table we must first create its schema:
    1. Data types:
      1. VARCHAR - stores alphabetic or mixed alpha-numeric data
      2. INTEGER - stores whole numbers from about -2 billion to about +2 billion
      3. DECIMAL - stores whole and non-whole numbers; you must specify the length of the number and also the length of the fractional part: DECIMAL(10,4) is a number of 10 digits with 4 of them after the decimal point
      4. after the data type you specify, in parentheses, the maximum expected length of that data
    2. CREATE TABLE College (cName VARCHAR(255), PRIMARY KEY (cName) , state VARCHAR(10), enrollment INTEGER);
    3. SHOW TABLES;
    4. CREATE TABLE Student (sID INTEGER, PRIMARY KEY(sID), sName VARCHAR(255), GPA DECIMAL(4,2), sizeHS INTEGER); # HS = High School
    5. SHOW TABLES;
    6. CREATE TABLE Apply (sID INTEGER, PRIMARY KEY(sID), cName VARCHAR(255), major VARCHAR(255), decision VARCHAR(20));
    7. SHOW TABLES;
  4. Now fill tables with test data:
    1. INSERT INTO College (cName, state, enrollment) VALUES ("Amridge", "AL", 749),  ("Berkeley", "CA", 42159), ("Stanford", "CA", 43797), ("Wyoming", "WY", 2024), ("Harcum", "PA", 1425);
    2. INSERT INTO Student (sID, sName, GPA, sizeHS) VALUES (1001, "Nita Millwood", 3.2, 900), (1002, "Vincenzo Lyons", 3.8, 750), (1003, "Zachery Lefebvre", 2.9, 1500), (1004, "Wilbert Chan", 3.6, 1620), (1005, "Mirna Hamann", 3.9, 1000), (1006, "Delta Shutt", 2.5, 1300), (1007, "Ryan Lacefield", 3.1, 1460);
    3. INSERT INTO Apply (sID, cName, major, decision) VALUES (1001, "Amridge", "BA", "accept"), (1002, "Berkeley", "CS", "accept"), (1003, "Houston", "CE" ,"reject"), (1004, "Berkeley", "CS", "reject"), (1005, "Stanford", "CS", "accept");
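
If no MySQL/MariaDB server is at hand, the same test database can be sketched with Python's built-in `sqlite3` module. This is an adaptation, not the setup used in this post: SQLite expects the PRIMARY KEY either on the column or after all columns, so the key is declared on the column here, and the rows are inserted with parameters instead of double-quoted strings. The two queries at the end are the ones built step by step in the "Practicing SQL" part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
cur = conn.cursor()

cur.executescript("""
CREATE TABLE College (cName VARCHAR(255) PRIMARY KEY, state VARCHAR(10), enrollment INTEGER);
CREATE TABLE Student (sID INTEGER PRIMARY KEY, sName VARCHAR(255), GPA DECIMAL(4,2), sizeHS INTEGER);
CREATE TABLE Apply (sID INTEGER PRIMARY KEY, cName VARCHAR(255), major VARCHAR(255), decision VARCHAR(20));
""")

cur.executemany("INSERT INTO College VALUES (?, ?, ?)", [
    ("Amridge", "AL", 749), ("Berkeley", "CA", 42159), ("Stanford", "CA", 43797),
    ("Wyoming", "WY", 2024), ("Harcum", "PA", 1425),
])
cur.executemany("INSERT INTO Student VALUES (?, ?, ?, ?)", [
    (1001, "Nita Millwood", 3.2, 900), (1002, "Vincenzo Lyons", 3.8, 750),
    (1003, "Zachery Lefebvre", 2.9, 1500), (1004, "Wilbert Chan", 3.6, 1620),
    (1005, "Mirna Hamann", 3.9, 1000), (1006, "Delta Shutt", 2.5, 1300),
    (1007, "Ryan Lacefield", 3.1, 1460),
])
cur.executemany("INSERT INTO Apply VALUES (?, ?, ?, ?)", [
    (1001, "Amridge", "BA", "accept"), (1002, "Berkeley", "CS", "accept"),
    (1003, "Houston", "CE", "reject"), (1004, "Berkeley", "CS", "reject"),
    (1005, "Stanford", "CS", "accept"),
])
conn.commit()

# ID and name of students with GPA > 3.7
high_gpa = cur.execute(
    "SELECT sID, sName FROM Student WHERE GPA > 3.7 ORDER BY sID"
).fetchall()

# Names and GPAs of students with sizeHS > 1000 who applied to CS and were rejected
rejected_cs = cur.execute(
    "SELECT sName, GPA FROM Student CROSS JOIN Apply "
    "WHERE Student.sID = Apply.sID AND sizeHS > 1000 "
    "AND major = 'CS' AND decision = 'reject'"
).fetchall()

print(high_gpa)
print(rejected_cs)
```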

Practicing SQL:
  1. In SQL the RA Select and Project operators are combined into one operator, SELECT:
    1. right after SELECT we write the Projection part (* means all columns/attributes)
    2. after the Projection part we write FROM and then the table/relation name
    3. after the table name we write WHERE with the needed column/attribute conditions - this is the condition of the Selection
    4. RA ^ (logical and) is AND in SQL
    5. students with GPA>3.7:
      1. SELECT * FROM Student WHERE GPA > 3.7;
    6. applications to Stanford for the CS major:
      1. SELECT * FROM Apply WHERE cName="Stanford" AND major="CS";
    7. ID and name of students with GPA>3.7:
      1. SELECT sID,sName FROM Student WHERE GPA > 3.7;
  2. RA Cross-Product is CROSS JOIN in SQL. In MySQL CROSS JOIN and INNER JOIN are the same; in Oracle you can't specify an ON clause for CROSS JOIN (only WHERE is allowed), while Oracle INNER JOIN allows an ON clause. A theta join is written here as a join using only a WHERE condition, without ON or USING:
    1. Names and GPA's of students with sizeHS>1000 who applied to CS and were rejected: 
      1. To deeply understand this we'll compose this query step by step:
      2. First we'll find all students:
        1. SELECT * FROM Student ;
      3. Now we need to find applications of all students (cross-product):
        1. SELECT * FROM Student CROSS JOIN Apply ;
      4. Previous query must be filtered by the condition Student.sID=Apply.sID:
        1. SELECT * FROM Student CROSS JOIN Apply WHERE Student.sID=Apply.sID ;
      5. Add sizeHS > 1000 condition:
        1. SELECT * FROM Student CROSS JOIN Apply WHERE Student.sID=Apply.sID AND sizeHS>1000;
      6. Add two other conditions - major="CS" and decision="reject":
        1. SELECT * FROM Student CROSS JOIN Apply WHERE Student.sID=Apply.sID AND sizeHS>1000 AND major="CS" AND decision="reject" ;
      7. Now make projection to select only sName and GPA:
        1. SELECT sName, GPA FROM Student CROSS JOIN Apply WHERE Student.sID=Apply.sID AND sizeHS>1000 AND major="CS" AND decision="reject" ;
  3. RA Union in SQL is UNION - this operator is used to make composition of the results of two (or more)  select statements:
    1. List of college and student names:
      1. SELECT cName FROM College 
      2. UNION 
      3. SELECT sName FROM Student;
  4. RA Rename operator is AS in SQL:
    1. List of college and student names under the name Names:
      1. SELECT cName AS Names FROM College
      2. UNION
      3. SELECT sName FROM Student;
    2. for disambiguation in self-joins (when relation/table is joined with itself):
      1. pairs of colleges in same state (we name 1st call of College table C1, and the second - C2):
      2. Only renaming tables:
        1. SELECT * 
        2. FROM College AS C1 
        3. CROSS JOIN College AS C2 
        4. WHERE 
        5. C1.state=C2.state AND
        6. C1.cName != C2.cName;
      3. Renaming tables and columns:
        1. SELECT C1.cName AS C1, C2.cName AS C2, C1.State 
        2. FROM College AS C1 
        3. CROSS JOIN College AS C2 
        4. WHERE C1.state=C2.state AND
        5. C1.cName != C2.cName;
  5. Natural join operator performs cross-product operator and then enforces equality on all of the attributes with the same name (as in above cross-join example: Student.sID=Apply.sID) also natural join eliminates one copy of duplicate attributes:
    1. Names and GPA's of students with sizeHS>1000 who applied to CS and were rejected:
      1. SELECT sName, GPA 
      2. FROM Student 
      3. NATURAL JOIN Apply 
      4. WHERE sizeHS>1000 AND 
      5. major="CS" AND 
      6. decision="reject";
    2. The same with column and table renaming:
      1. SELECT St.sName, St.GPA 
      2. FROM Student AS St 
      3. NATURAL JOIN Apply AS Ap 
      4. WHERE St.sizeHS>1000 AND 
      5. Ap.major="CS" AND 
      6. Ap.decision="reject";
    3. Names and GPA's of students with sizeHS>1000 who applied to CS and were rejected by colleges with an enrollment greater than 20000:
      1. Using table rename and two select statements:
        1. SELECT S.sName, S.GPA 
        2. FROM Student AS S 
        3. NATURAL JOIN
        4.  (SELECT * 
        5. FROM Apply AS A 
        6. NATURAL JOIN College AS C  
        7. WHERE C.enrollment>20000 AND 
        8. A.major="CS" AND
        9.  A.decision="reject") AS A 
        10. WHERE S.sizeHS>1000;
      2. Using several natural joins in one Select:
        1. SELECT sName, GPA
        2. FROM Student
        3. NATURAL JOIN Apply
        4. NATURAL JOIN College
        5. WHERE sizeHS>1000 AND
        6. major="CS" AND
        7. decision="reject" AND
        8. enrollment>20000;
  6. RA Difference operator can be simulated with LEFT JOIN in MySQL. A left join keeps every row of the left table and attaches the matching rows of the right table; when there is no match, the right-side columns are filled with NULL. You must state which columns are used for matching:
    1. IDs of students who didn't apply anywhere:
    2. We can use ON Student.sID = Apply.sID:
      1. SELECT Student.sID 
      2. FROM Student
      3. LEFT JOIN Apply
      4. ON Student.sID=Apply.sID
      5. WHERE Apply.sID IS NULL;
    3. Also "ON Student.sID=Apply.sID" = USING(sID) - when both columns have the same name:
      1. SELECT Student.sID 
      2. FROM Student
      3. LEFT JOIN Apply
      4. USING(sID)
      5. WHERE Apply.sID IS NULL;
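  7. MySQL RIGHT JOIN works like LEFT JOIN; the difference is that RIGHT JOIN treats the right relation as the main one, while LEFT JOIN treats the left relation as the main one.
  8. FULL JOIN returns the INNER JOIN rows plus the unmatched rows that LEFT JOIN and RIGHT JOIN would add (MySQL has no FULL JOIN; it can be simulated with a UNION of a LEFT JOIN and a RIGHT JOIN).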
  9. Intersection operator can be simulated in MySQL using join and DISTINCT (show only unique values):
    1. Names that are both college name and student name:
      1. SELECT DISTINCT(sName) 
      2. FROM Student
      3. INNER JOIN College
      4. ON sName=cName;
  10. Inner and Outer joins:
    1. Inner join shows only the data that is present in both the left and the right relations (using ON or USING)
    2. Outer joins use one relation as the main one and complete it with data from the other relation; all missing data is filled with NULLs (LEFT and RIGHT joins are LEFT OUTER JOIN and RIGHT OUTER JOIN)
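The difference, intersection and natural-join queries above can be tried without a MySQL server. Below is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL and a trimmed copy of the sample rows. The student named "Stanford" (sID 1008) is a hypothetical row, added only so that the intersection query has something to return.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE College (cName TEXT, state TEXT, enrollment INTEGER);
CREATE TABLE Student (sID INTEGER, sName TEXT, GPA REAL, sizeHS INTEGER);
CREATE TABLE Apply (sID INTEGER, cName TEXT, major TEXT, decision TEXT);
INSERT INTO College VALUES ('Berkeley', 'CA', 42159), ('Stanford', 'CA', 43797);
INSERT INTO Student VALUES (1004, 'Wilbert Chan', 3.6, 1620),
                           (1006, 'Delta Shutt', 2.5, 1300),
                           (1008, 'Stanford', 3.0, 800);  -- hypothetical row
INSERT INTO Apply VALUES (1004, 'Berkeley', 'CS', 'reject');
""")

# Difference via LEFT JOIN: students with no matching row in Apply.
not_applied = [row[0] for row in cur.execute(
    "SELECT Student.sID FROM Student "
    "LEFT JOIN Apply ON Student.sID = Apply.sID "
    "WHERE Apply.sID IS NULL ORDER BY Student.sID")]

# Intersection via INNER JOIN + DISTINCT: names that are both a student and a college.
both = [row[0] for row in cur.execute(
    "SELECT DISTINCT sName FROM Student INNER JOIN College ON sName = cName")]

# Natural join: equality is enforced on the shared column sID.
rejected = cur.execute(
    "SELECT sName, GPA FROM Student NATURAL JOIN Apply "
    "WHERE sizeHS > 1000 AND major = 'CS' AND decision = 'reject'").fetchall()

print(not_applied, both, rejected)
```

SQLite and MySQL differ in details (string quoting, collation), so treat this only as a quick way to check the logic of a query.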


DHCP over Relay on Docker

DHCP (Dynamic Host Configuration Protocol) lets us assign addresses to the hosts on a network dynamically. When a host is configured to get its IP address dynamically, it broadcasts a DHCP Discover message on the network, searching for a DHCP server. The DHCP server has to be in the same broadcast domain as the clients, since routers do not forward broadcast packets.
For a Docker container this means that we would have to connect the container to every subnet of our company network. But we want to use just one interface on the container (in this post I'll use macvlan). The problem is:
When a DHCP client wants an IP address, it sends a DHCP Discover message, which is a broadcast. As the router/gateway/firewall does not forward broadcast packets, this message never reaches the DHCP server (our Docker container).
To solve this issue we'll use a DHCP Relay Agent. This feature is activated on a network device that has interfaces in all subnets of the company network:

  1. this device (router/gateway/firewall) forwards DHCP messages to the DHCP Server, and when the DHCP Server responds, this device forwards the replies to the Client.
  2. DHCP Relay Agent fills in the giaddr (gateway IP address) field of the DHCP packet. This field contains the IP address of the DHCP Relay Agent interface that received the client's message, and it also helps the DHCP Server identify the pool from which to select an IP address.
  3. After identifying the pool, DHCP Server replies with a DHCP Offer message addressed to the DHCP Relay Agent, which forwards it to the DHCP Client.
  4. DHCP Client replies with a DHCP Request message
  5. this message is also forwarded to the DHCP Server by DHCP Relay Agent
  6. DHCP Server replies with DHCP Ack
  7. this message is forwarded to the DHCP Client by DHCP Relay Agent
  8. finally DHCP Client is assigned an IP address
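The role of giaddr in step 2 can be sketched in a few lines. This is a simplified illustration, not how dhcpd is implemented: the POOLS table and the pool_for_giaddr function are hypothetical names, and the subnets and range are taken from the dhcpd configuration shown later in this post.

```python
import ipaddress

# Hypothetical pool table: one entry per subnet declared on the DHCP server.
POOLS = {
    ipaddress.ip_network("10.10.6.0/24"): ("10.10.6.2", "10.10.6.3"),
    ipaddress.ip_network("172.16.3.0/24"): None,  # empty declaration, nothing to lease
}

def pool_for_giaddr(giaddr):
    """Return (subnet, range) for the subnet that contains the relay's giaddr."""
    addr = ipaddress.ip_address(giaddr)
    for subnet, lease_range in POOLS.items():
        if addr in subnet:
            return subnet, lease_range
    return None  # no declared subnet matches, so the request cannot be served

# The relay interface 10.10.6.1 selects the 10.10.6.0/24 pool.
print(pool_for_giaddr("10.10.6.1"))
```

The point is only that the server never sees the client's broadcast domain directly; the relay's interface address is what ties the request to a subnet declaration.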

If you want to use Cisco ISR as Relay Agent:

  1. Setup interface which will be used to interconnect DHCP Relay Agent and DHCP Server:
    1. conf term
    2. int fa0/1 # DHCP Server facing interface
    3. ip address 172.16.3.4 255.255.255.0
  2. Setup interface which will use DHCP Relay Agent and enable IP-helper (DHCP Server IP address) on that interface - all DHCP messages will be forwarded to that IP address:
    1. int fa 0/0
    2. ip address 10.10.6.1 255.255.255.0
    3. ip helper-address 172.16.3.249
    4. do wr
  3. Check configuration:
    1. show ip int fa0/0
  4. Also we need to configure static route on the DHCP Server if DHCP Relay Agent is not default gateway for the DHCP Server:
    1. ip route add 10.10.6.0/24 via 172.16.3.4 # this setup is not persistent; to make it persistent, create a route file for the needed interface
Because macvlan is used for the Docker container, you need to enable IP forwarding on the Docker host:
echo 1 > /proc/sys/net/ipv4/ip_forward . This setting is not persistent; to make it persistent:


  1. sudo vi /etc/sysctl.conf and add net.ipv4.ip_forward = 1
  2. sudo sysctl -p



If you want to use CentOS 7 as Relay Agent:
  1. Setup interface which needs to use DHCP Relay Agent:
    1. vi ifcfg-eth0
      1. IPADDR=10.10.6.1 
      2. PREFIX=24
    2. vi ifcfg-eth1  # DHCP Server facing interface
      1. IPADDR=172.16.3.4
      2. PREFIX=24
    3. yum install dhcp # dhcp-relay is part of dhcp package
    4. cp /usr/lib/systemd/system/dhcrelay.service /etc/systemd/system
    5. vi /etc/systemd/system/dhcrelay.service
      1. under [Service]
      2. append IP address of the DHCP server to the ExecStart after --no-pid:
        1. ExecStart=/usr/sbin/dhcrelay -d --no-pid 172.16.3.249
        2. Also you can choose interfaces to activate DHCP Relay on them (by default all interfaces are used). You must use separate "-i" option for each additional interface:
          1. ExecStart=/usr/sbin/dhcrelay -d --no-pid 172.16.3.249 -i eth1 -i eth2.20
    6. systemctl --system daemon-reload
    7. systemctl start dhcrelay
    8. systemctl enable dhcrelay
    9. systemctl status dhcrelay
    10. Also we need to configure a static route on the DHCP Server if the DHCP Relay Agent is not the default gateway for the DHCP Server:
      1. ip route add 10.10.6.0/24 via 172.16.3.4 # this setup is not persistent; to make it persistent, create a route file for the needed interface
    If you want to use CentOS 6 as Relay Agent:
    1. Setup interface which needs to use DHCP Relay Agent:
      1. vi ifcfg-eth0
        1. IPADDR=10.10.6.1 
        2. NETMASK=255.255.255.0
      2. vi ifcfg-eth1  # DHCP Server facing interface
        1. IPADDR=172.16.3.4
        2. NETMASK=255.255.255.0
      3. yum install dhcp # dhcp-relay is part of dhcp package
      4. vi /etc/sysconfig/dhcrelay
        1. INTERFACES="eth1 eth2.20" # which interfaces must use DHCP Relay Agent
        2. DHCPSERVERS="172.16.3.249" # DHCP server IP address
      5. service dhcrelay start
      6. chkconfig dhcrelay on
      7. service dhcrelay status
      8. Also we need to configure a static route on the DHCP Server if the DHCP Relay Agent is not the default gateway for the DHCP Server:
        1. ip route add 10.10.6.0/24 via 172.16.3.4 # this setup is not persistent; to make it persistent, create a route file for the needed interface

      Interface with DHCP relay must use static IP address (no DHCP is allowed).

      dhcp.conf
      # this server is the primary and authoritative server on that network
      authoritative;
      # dhcpd listens *only* on interfaces for which it finds subnet declaration in dhcpd.conf
      # empty declaration for local IP subnet to start listening on eth0 interface
      subnet 172.16.3.0 netmask 255.255.255.0 { }

      subnet 10.10.6.0 netmask 255.255.255.0 {
              range 10.10.6.2 10.10.6.3;
              option routers 10.10.6.1;
              #option domain-name-servers 8.8.8.8, 8.8.4.4;
          }

      to kill process on container:
      top > k > PID > Enter

      dhcpd -cf dhcp.conf

        Tuesday, July 23, 2019

        Docker Networking

        Normally, Docker creates a new network namespace for each container we run. As we attach the container to a network, we define an endpoint that connects the container network namespace with the actual network. This way, we have one container per network namespace. Docker provides an additional way to define the network namespace in which a container runs. When creating a new container, we can specify that it should be attached to or maybe we should say included in the network namespace of an existing container. With this technique, we can run multiple containers in a single network namespace.

        When you install docker it creates 3 networks automatically:
        1. bridge
          1. docker-host NIC goes to promiscuous mode (allows all L2 frames without checking the destination MAC; in other words, MAC filtering is disabled)
          2. actually docker bridge is a switch inside docker host, this switch interconnects docker-host and docker-container
          3. network used by default when you run a container
          4. containers in this network can communicate with each other
          5. containers are assigned IP addresses from the 172.17.0.0/16 subnet
          6. to be reachable from the outside world you must use port mapping on the docker-host IP
          7. Overview:
            1. new network namespace created for container
            2. docker0 bridge is automatically created and attached to the docker-host NIC (docker-host namespace)
            3. veth (Virtual Ethernet) interface:
              1. automatically created
              2. attached to the docker0 bridge
              3. attached to the container NIC
              4. veth interface is like media/cable connecting docker0-bridge/switch port to the container NIC
        2. none
          1. to use container without network: --network=none
          2. container is not attached to any network and cannot communicate with any other container
        3. host
          1. to use host network: --network=host
          2. in this case container uses the same IP as docker-host uses
          3. ports are shared between docker-host and all containers connected to the "host" network
          4. container has direct access to the docker-host's NIC
        To create custom network:
        docker network create \
             --driver bridge \
             --subnet 192.168.190.0/24 \
             custom_isolated_network
        List all docker networks:
        docker network ls
        To view bridges only:
        brctl show


        Other types of networks supported by docker:
        1. macvlan (requires at least kernel 3.9 on docker-host) - docker-host NIC uses unicast filtering, so L2 frames with an unknown destination MAC are discarded (the exception is passthru mode, which uses promiscuous mode)
          1. this type allows you to assign several IP addresses to the same NIC.
          2. MAC-VLAN allows to configure subinterfaces (slave devices) of a parent (master) device
          3. each subinterface will have its own randomly generated MAC and consequently its own IP address
          4. subinterfaces cannot interact directly with parent interface
          5. to communicate with parent interface - assign macvlan subinterface to the docker-host
          6. macvlan subinterfaces are for example mac0@eth0 (this notation clearly identifies subinterface's parent)
          7. The macvlan is a trivial bridge that doesn’t need to do learning as it knows every mac address it can receive, so it doesn’t need to implement learning or stp. Which makes it simple stupid and fast.
          8. Each sub-interface can be in one of 4 modes that affect possible traffic flows (these are macvlan modes and not all of them are available in the macvlan docker driver - currently docker supports only macvlan-bridge mode):
            1. Private - traffic goes only from subinterfaces to the out, subinterfaces on the same parent cannot communicate with each-other. This is not bridge.
            2. VEPA (Virtual Ethernet Port Aggregator) - this mode needs a VEPA-compatible switch. Subinterfaces of one parent can communicate with each other with the help of the VEPA hardware switch, which returns all frames where both source and destination are local to the macvlan interface
            3. bridge - all subinterfaces on a parent interface are interconnected with a simple bridge. Frames from one subinterface to the other delivered directly (through bridge) and not sent out. All MAC addresses are known so macvlan-bridge doesn't need STP and MAC learning
            4. passthru - allows a single VM to be connected directly to the physical interface. The advantage of this mode is that VM is then able to change MAC address and other interface parameters.
          9. docker network create --driver macvlan --subnet=10.0.0.0/24 --gateway=10.0.0.1  --opt parent=eth0 macvlanNetworkName
            1. gateway - external (not related to the docker-host) gateway
            2. parent - docker-host physical interface
            3. docker-host eth0 can be for example 10.0.0.2
          10. also you can use macvlan with VLAN interfaces. In this case subinterfaces are using different parent interfaces (ex. eth0.10 and eth0.20) and can communicate with each other only over gateway:
            1. create VLAN interface eth0.10 and eth0.20
            2. docker network create --driver macvlan --subnet=10.0.10.0/24 --gateway=10.0.10.1  --opt parent=eth0.10 macvlan10
            3. docker network create --driver macvlan --subnet=10.0.20.0/24 --gateway=10.0.20.1  --opt parent=eth0.20  macvlan20
            4. docker run --name='container0' --hostname='container0' --net=macvlan10 --ip=10.0.10.2 --detach=true centos
          11. To add additional IP to a container:
            1. docker network connect --ip=10.0.20.3 macvlan20 container0
          12. How to connect from macvlan subinterface to the host:
            1. The --aux-address option prevents Docker from assigning the 192.168.1.223 address to a container, and the --ip-range option tells Docker IPAM to allocate IP addresses from the given sub-range: 
              1. docker network create -d macvlan -o parent=eno1 --subnet 192.168.1.0/24 --gateway 192.168.1.1 --ip-range 192.168.1.192/27 --aux-address 'host=192.168.1.223' mynet 
            2. Next, we create a new macvlan interface on the host. You can call it whatever you want: 
              1. ip link add mynet-aux link eno1 type macvlan mode bridge
            3. Now we need to configure the interface with the address we reserved and bring it up: 
              1. ip addr add 192.168.1.223/32 dev mynet-aux 
              2. ip link set mynet-aux up
            4. The last thing we need to do is to tell our host to use that interface when communicating with the containers. This is relatively easy because we have restricted our containers to a particular CIDR subset of the local network; we just add a route to that range like this: 
              1. ip route add 192.168.1.192/27 dev mynet-aux 
            5. With that route in place, your host will automatically use this mynet-aux interface when communicating with containers on the mynet network.
            6. above NIC based configs are not persistent and will be lost after reboot, so add all related config to the appropriate configuration files (NIC and route)
        2. ipvlan is similar to the macvlan but uses the same MAC for all endpoints (docker containers). It's useful in situations when switch where docker-host is connected restricts maximum number of MAC addresses per physical port. ipvlan requires at least kernel 4.1 on docker host 
          An IPAM (IP Address Management) driver lets you delegate IP lease management to an external component. This way you can coordinate IP use with other virtual or bare metal servers in your datacenter.
          Docker controls the IP address assignment for network and endpoint interfaces via the IPAM driver(s). Libnetwork has a default, built-in IPAM driver and allows third party IPAM drivers to be dynamically plugged. On network creation, the user can specify which IPAM driver libnetwork needs to use for the network’s IP address management. For the time being, there is no IPAM driver that would communicate with external DHCP server, so you need to rely on Docker’s default IPAM driver for container IP address and settings configuration. Containers use host’s DNS settings by default, so there is no need to configure DNS servers.
          IPAM driver ensures the container got an IPv4 and an IPv6 address from the subnets configured for the macvlan network.
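The address plan from the macvlan host-access example above (--subnet, --ip-range, --aux-address) can be checked with a few lines of Python. This is a minimal sketch using only the standard ipaddress module; the variable names are illustrative, and nothing here talks to Docker.

```python
import ipaddress

subnet = ipaddress.ip_network("192.168.1.0/24")      # --subnet
ip_range = ipaddress.ip_network("192.168.1.192/27")  # --ip-range
host_aux = ipaddress.ip_address("192.168.1.223")     # --aux-address 'host=...'

# First and last address of the range Docker IPAM allocates from.
first, last = ip_range[0], ip_range[-1]
print(first, last, ip_range.num_addresses)  # 192.168.1.192 192.168.1.223 32

# The range must sit inside the subnet, and the reserved host address
# must fall inside the range, otherwise reserving it would be pointless.
print(ip_range.subnet_of(subnet), host_aux in ip_range)  # True True
```

Because the reserved address is inside the /27, the single host route "ip route add 192.168.1.192/27 dev mynet-aux" covers every address a container on this network can receive.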

          If you use Hyper-V:
          Macvlan uses a unique MAC address per Ethernet interface. By default, Hyper-V only allows traffic whose MAC address is bound to the virtual switch port, so we need to "Enable MAC address spoofing" to prevent the virtual switch from dropping the macvlan traffic.

          Docker Images & Dockerfile

          1. creating new image - can be done when you can't find needed container image on the docker-hub.
          2. Dockerfile:
            1. text file written in a specific format a docker can understand
            2. Every line starts with instruction (FROM, RUN, COPY etc.) followed by argument
            3. Each instruction instructs docker to do a specific action
              1. First line starts with a base OS or another image: FROM centos
              2. Then you install needed dependencies, for example:
                1. RUN yum update -y && yum install -y python python-pip
                2. RUN pip install flask flask-mysql
              3. Copy source files from docker-host to the docker-image:
                1. COPY . /opt/source-code
              4. Command to run when image is run as a container:
                1. ENTRYPOINT FLASK_APP=/opt/source-code/app.py flask run
            4. When building docker image, every line of Dockerfile creates layer of the docker image:
              1. For the above example layers are:
                1. layer 1: Base CentOS layer
                2. layer 2: changes in yum packages
                3. layer 3: changes in pip packages
                4. layer 4: source code
                5. layer 5: update entry-point with "flask" command
              2. docker build:
                1. docker build -t nameOfTheImage dockerFileDirectoryName
                2. docker build -t nameOfTheImage .
                3. docker build -f Dockerfile -t nameOfTheImage . # -f names the Dockerfile explicitly; the last argument is the build context directory, not the Dockerfile
              3. to view build process history: docker history imageName
              4. the layered build process helps with debugging and, in case of failure, lets the build restart from the needed layer (this is done automatically using the docker cache). The same is true when you add steps to the Dockerfile: the rebuild uses the cache, so only the affected layers are rebuilt
            5. CMD vs ENTRYPOINT:
              1. CMD defines the command and its parameters (if any) which will run when the container starts:
                1. CMD ["mysqld"] or CMD mysqld
                2. CMD ["sleep", "5"] or CMD sleep 5
              2. ENTRYPOINT is like CMD, but in exec form (JSON array) any input appended to "docker run" is added to the end of the command as parameters (the shell form, ENTRYPOINT sleep, ignores such input):
                1. ENTRYPOINT ["sleep"]
                2. docker run centos-sleeper 10 # 10 will be appended as parameter to the "sleep" command
                3. when you append some input to "docker run" and the image uses only CMD, the input is not appended to the CMD command - the entire appended input replaces CMD and is run as the command
              3. To use some value as default value for the ENTRYPOINT (both in exec form):
                1. ENTRYPOINT ["sleep"]
                2. CMD ["10"]
                3. the two lines above give "sleep 10" by default, so CMD is appended to the ENTRYPOINT as parameter
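The way CMD, ENTRYPOINT and the "docker run" arguments combine can be modelled as a small function. This is only a sketch of the rule, assuming both instructions are written in exec (JSON array) form; container_argv is a hypothetical helper and not part of Docker.

```python
def container_argv(entrypoint, cmd, run_args):
    """Model of the exec-form rule: arguments passed to `docker run`
    replace CMD, and the result is appended to ENTRYPOINT."""
    return list(entrypoint) + list(run_args if run_args else cmd)

# ENTRYPOINT ["sleep"] + CMD ["10"], no extra arguments: the default is used.
print(container_argv(["sleep"], ["10"], []))        # ['sleep', '10']

# docker run centos-sleeper 20: the argument overrides CMD.
print(container_argv(["sleep"], ["10"], ["20"]))    # ['sleep', '20']

# Image with only CMD ["sleep", "5"]: the whole command is replaced.
print(container_argv([], ["sleep", "5"], ["echo", "hi"]))  # ['echo', 'hi']
```

The shell form is not covered by this model, since a shell-form ENTRYPOINT does not receive CMD or "docker run" arguments.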

          DHCP on Docker

          DHCP (Dynamic Host Configuration Protocol) lets us assign addresses to the hosts on a network dynamically. When a host is configured to get its IP address dynamically, it broadcasts a DHCP Discover message on the network, searching for a DHCP server. The DHCP server has to be in the same broadcast domain as the clients, since routers do not forward broadcast packets.
          1. create macvlan network:
            1. docker network create -d macvlan -o parent=enp3s0 --subnet 172.16.3.0/24 --gateway 172.16.3.4  --aux-address 'host=172.16.3.250' mynet
          2. add macvlan-aux to the docker-host (to ping directly from docker-host) - including ip link, route etc:
            1. ip link add mynet-aux link enp3s0 type macvlan mode bridge
            2. ip addr add 172.16.3.250/32 dev mynet-aux 
          3. run container with macvlan driver (assign static IP) and run /bin/bash:
            1. docker run --name='ctr0' --hostname='ctr0' --net=mynet --ip=172.16.3.249 -it centos /bin/bash
          4. Ping container IP from the Docker-host:
            1. ping 172.16.3.249
          5. on container (all of these can be done with docker-file):
            1. yum install net-tools -y
            2. yum install dhcp -y
            3. 1.1.6.1 is DHCP relay IP address
            4. dhcpd listens *only* on interfaces for which it finds subnet declaration in dhcpd.conf
          vi dhcp.conf:
          # this server is primary and thus the authoritative server on that network
          authoritative;
          subnet 172.16.3.0 netmask 255.255.255.0 {
                     range 172.16.3.1 172.16.3.3;
                     option routers 172.16.3.4;
                     option domain-name-servers 172.16.3.6;
          }     

            Run dhcp service with specified file: dhcpd -cf dhcp.conf

            to kill process on container:
            top > k > PID > Enter

            Tuesday, July 16, 2019

            YAML

            YAML Ain't Markup Language:
            1. YAML uses indentation to distinguish layers (same indentation - same layer) - use spaces, because tabs are not allowed.
            2. A YAML document starts with three dashes: ---
            3. A YAML dictionary is a set of key-value pairs written in one of two forms:
              1. colon separated key (like Python dictionary):
                1. key: value
              2. indentation separated key:
                1. key:
                2.       value
              3. Also one key can contain nested dictionary:
                1. Method 1:
                  1. first_level_key:
                  2.    second_level_key_under_the_first_level: second_level_value
                2. Method 2:
                  1. first_level_key: {second_level_key_under_the_first_level: second_level_value}
            4. YAML uses dashes at the same indentation level to represent a list of items (use lists when you want a key to have more than one value and the values are not keys themselves):
              1. Method 1:
                1. this_is_a_list:
                2.  - element_1
                3.  - element_2
              2. Method 2:
                1. this_is_a_list: [element_1, element_2]

            Relational Algebra

            RA is a formal language that forms conventions used in implemented languages like SQL.
            RA operates on relations and produces relations as result.
            Query (expression) on set of relations produces relation as result. 
            Key is an attribute or a set of attributes whose value is guaranteed to be unique.
            Tuple  - row of a relation.
            Attribute  - column of a relation.
            Relation - table consisting of attribute (column) and tuples (rows).
            Schema  - relation header (attribute names)
            We'll use simple college admission database with three relations (keys are in bold):
            1. 1st relation: College(schema: cName, state, enrollment)
            2. 2nd relation: Student(schema: sID, sName, GPA, sizeHS) # sizeHS = size of High School a student attended
            3. 3rd relation: Apply(schema: sID, cName, major, decision)
            The simplest query in RA is just the relation name: for example, Student is a valid expression (query) in RA, and it returns a copy of the Student relation.
            Use operators to filter, slice, combine relations:
            1. Select - picks certain rows out of a relation - σ (sigma):
              1. Students with GPA > 3.7 (GPA - Grade Point Average)
                1. σ GPA > 3.7 Student
              2. Students with GPA > 3.7 and sizeHS < 1000 (caret ^ is logical and operator)
                1. σ GPA > 3.7  ^ sizeHS < 1000 Student
              3. Application for Stanford for CS major
                1. σ cName="Stanford" ^ major="CS" Apply
              4. Select operator general case:
                1. σ condition Relation
            2. Project operator - picks certain columns:
              1. Apply relation with sID and decision columns only
                1. П sID,decision Apply
              2. Project operator general case:
                1. П col1,col2,col3... Relation
            3. To Select and Project at the same time:
              1. ID and name of students with GPA>3.7:
                1. П sID,sName (σ GPA>3.7 Student)
            4. Duplicates:
              1. SQL is based on multisets/bags, so duplicates are also shown
              2. RA is based on sets, so duplicates are eliminated automatically
            5. Cross-product operator (a.k.a. cross-join) - combine two relations (a.k.a. Cartesian product) horizontally:
              1. Student x Apply - as a result we get a big relation which has eight (8) attributes
              2. as a convention, when the cross-product yields attributes with the same name from both relations, we prefix their names with the name of the relation they came from: Student.sID / Apply.sID
              3. cross-product gives us a relation where every row of the Student relation is paired with every row of the Apply relation
              4. Names and GPA's of students with sizeHS>1000 who applied to CS and were rejected:
                1. П sName, GPA (σ Student.sID=Apply.sID ^ sizeHS>1000 ^ major="CS" ^ decision="reject" (Student x Apply))
            6. Difference operator A\B or A-B returns the elements of A that are not in B:
              1. if A = {a, b, c} and B = {b, c, d} then A - B = {a}
              2. IDs of students who didn't apply anywhere:
                1. (П sID Student) - (П sID Apply)
              3. IDs and names of students who didn't apply anywhere (we can't just add sName to the projection of the Student relation, because the Apply relation doesn't have this attribute and the difference operator can't be used on relations with different attributes):
                1. П sID,sName ( ( (П sID Student) - (П sID Apply) ) ⋈ Student )
            7. Union operator (denoted by ∪) returns the set of all elements that are in either of the two sets. It is one of the fundamental operations through which sets can be combined and related to each other. Union combines vertically:
              1. A ∪ B means all members of A and also all members of B that are not in A (RA is set based and removes duplicates):
                1. if A = {a, b, c} and B = {b, c, d} then:
                  1. all members of A are {a, b, c} 
                  2. all members of B that are not in A: B\A = {d}
                  3. combine both {a, b, c} and {d} => {a, b, c, d}
                  4. so: A ∪ B = {a, b, c, d}
              2. List of college and student names:
                1. (П cName College) ∪ (П sName Student)
            8. Natural join operator performs cross-product operator and then enforces equality on all of the attributes with the same name (as in above example: Student.sID=Apply.sID) also natural join eliminates one copy of duplicate attributes:
              1. Names and GPA's of students with sizeHS>1000 who applied to CS and were rejected:
                1. П sName, GPA (σ sizeHS>1000 ^ major="CS" ^ decision="reject"(Student ⋈ Apply))
              2. Names and GPA's of students with sizeHS>1000 who applied to CS and were rejected by colleges with an enrollment greater than 20000:
                1. П sName, GPA (σ sizeHS>1000 ^ major="CS" ^ decision="reject" ^ enrollment>20000(Student ⋈ (Apply ⋈ College)))
              3. Relation between ⋈ and x:
                1. E1 ⋈ E2 => П schema(E1) U schema(E2) (σ E1.a1= E2.a1 ^ E1.a2= E2.a2 ^ ... (E1 x E2))
            9. Theta Join - operator takes two expressions/relations and combines them with bow tie looking operator (⋈) but with a subscript theta (θ) which means select condition (any select condition you want):
              1. E1 ⋈θ E2 = σθ (E1 x E2)
              2. Theta join is the basic operation implemented in RDBMS so the term "join" often means theta-join
            10. Intersection operator -  A ∩ B of two sets A and B is the set that contains all elements of A that also belong to B (or equivalently, all elements of B that also belong to A), but no other elements:
              1. if A = {1, 2, 3} and B = {2, 3, 4} then A ∩ B = {2, 3}
              2. Names that are both college name and student name:
                1. (П sName Student) ∩ (П cName College)
              3. Expressing intersection via difference
                1. A ∩ B => A - ( A - B ) 
              4. Expressing intersection via natural join:
                1. A ∩ B => A ⋈ B
            11. Rename operator uses ρ (rho); it reassigns the schema in the result of an expression (relation). Above (in 7.2.1 and 10.2.1) we used set operators on relations with different attribute names (different schemas); in practice RA doesn't allow that, so we need to use the rename operator:
              1. General form:
                1. ρ R(A1,A2,...,An) (E) # name the result of expression E as R, with attributes A1 to An
              2. Unify schemas for set operators:
                1. List of college and student names (7.2.1):
                  1. ρ C1(name) (П cName College) ∪ ρ C2(name) (П sName Student)
                2. Names that are both college name and student name (10.2.1):
                  1. ρ C1(name) (П sName Student) ∩ ρ C2(name) (П cName College)
              3. for disambiguation in self-joins:
                1. pairs of colleges in same state:
                  1. With cross-join
                    1. σ s1=s2 ^ n1!=n2 ( ρ C1(n1,s1,e1) College x ρ C2(n2,s2,e2) College )
                  2. With natural-join:
                    1. σ n1!=n2 ( ρ C1(n1,s,e1) College ⋈ ρ C2(n2,s,e2) College )
            12. Select, Project, Cross-join, union, difference, rename are RA basic operators
            13. Natural join, theta join, intersection are not RA basic operators: they can be expressed with the basic operators, so they are actually abbreviations
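
The identities above can be checked on small data. Below is a minimal Python sketch, assuming relations are represented as lists of dicts (attribute name to value). The helper names `natural_join` and `as_set`, and all sample tuples, are hypothetical and only serve the illustration; this is not how an RDBMS implements joins.

```python
from itertools import product


def natural_join(r1, r2):
    """Natural join built from the basic operators: cross product,
    select on equality of same-named attributes, then keep one copy of them."""
    result = []
    for t1, t2 in product(r1, r2):                # cross product E1 x E2
        shared = t1.keys() & t2.keys()            # attributes with the same name
        if all(t1[a] == t2[a] for a in shared):   # select: E1.a = E2.a for each shared a
            result.append({**t1, **t2})           # merging dicts keeps a single copy
    return result


def as_set(relation):
    """View a relation as a set of tuples so set operators can be applied."""
    return {frozenset(t.items()) for t in relation}


# Hypothetical sample data for Student and Apply.
student = [{"sID": 1, "sName": "Amy"}, {"sID": 2, "sName": "Bob"}]
apply_ = [
    {"sID": 1, "cName": "Stanford", "major": "CS"},
    {"sID": 1, "cName": "MIT", "major": "EE"},
]
joined = natural_join(student, apply_)

# Two relations with the same schema (already renamed to a single attribute "name").
A = [{"name": "Irene"}, {"name": "Stanford"}]
B = [{"name": "Stanford"}, {"name": "MIT"}]

via_difference = as_set(A) - (as_set(A) - as_set(B))   # A - (A - B)
via_join = as_set(natural_join(A, B))                  # A ⋈ B
assert via_difference == via_join == (as_set(A) & as_set(B))
```

When both relations have exactly the same schema, every attribute is shared, so the natural join keeps only tuples present in both relations, which is why it coincides with intersection here.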

            These materials were used to write this synopsis-post:

            Wednesday, July 10, 2019

            Hash Functions, Binary Tree, O(n)

            Hash Function

            A hash function is any function that can be used to map data of arbitrary size to data of fixed size (N). An example of a simple hash function is h(x) = x mod N (mod is the modulus, i.e. the remainder of division: 5 / 5 = 1 with no remainder, so 5 mod 5 = 0; 6 / 5 = 1 with remainder 1, so 6 mod 5 = 1). The values returned by a hash function are called hash values, hash codes, digests, or simply hashes. Hash functions are often used in combination with a hash table (which consists of a hash function h and an array, also called a table, of size N), a common data structure used in computer software for rapid data lookup. Hashing is used for indexing and locating items in databases because it is easier to find the shorter hash value than the longer string. Hashing is also used in cryptography. A hash function is also known as a hashing algorithm or message digest function. No sorting and no searching are required: once you compute the hash function, you know where to store the data and where to find it. Cryptographic hash functions are one-way: they are designed so that they cannot feasibly be reversed. The main idea of a hash table is to store key-element pairs (k, e) at index h(k).
            Example:
            1. phone book with 5 numbers in it (N = 5)
            2. h(name) = (length of name) mod 5
            3. This function is OK if all names in the phone book have different lengths (more precisely, different lengths modulo 5); if some are the same, a collision occurs (collision - when two different inputs to the hash function are mapped to the same hash value):
              1. h(John) = 4 mod 5 = 4
              2. h(Jack) = 4 mod 5 = 4
              3. So h(John) = h(Jack)
              4. Because it produces many collisions, this is an example of a bad hash function
              5. Collisions are possible in every hash function, but a hash function must be designed to minimize the chance of collision. To do so, hash functions produce hash values that are long enough to make collisions unlikely, yet short enough to be computed quickly.
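
The phone book example can be sketched in Python as follows. The hash function `h` is the one from the example; storing colliding entries in a per-slot list (separate chaining) is an assumption added here as one common way to handle collisions, and the phone numbers are made up.

```python
N = 5  # table size, as in the phone book example


def h(name):
    """Toy hash function from the example: length of the name modulo N."""
    return len(name) % N


# One list ("bucket") per slot, so colliding keys can share a slot.
# Separate chaining is an assumption of this sketch, not part of the example above.
table = [[] for _ in range(N)]


def put(name, phone):
    """Store (name, phone) at index h(name), replacing an existing entry for name."""
    bucket = table[h(name)]
    for i, (key, _) in enumerate(bucket):
        if key == name:
            bucket[i] = (name, phone)
            return
    bucket.append((name, phone))


def get(name):
    """Look only in the bucket at index h(name); return None if name is absent."""
    for key, phone in table[h(name)]:
        if key == name:
            return phone
    return None


put("John", "555-0100")   # hypothetical numbers
put("Jack", "555-0101")
put("Alice", "555-0102")
```

John and Jack both have four letters, so they land in the same slot, and a lookup has to compare keys inside that bucket. The more collisions a hash function produces, the longer the buckets get and the closer a lookup comes to a plain linear search.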

            Binary Tree

            In computer science, a binary tree is a tree data structure in which each node has at most two children, which are referred to as the left child and the right child. The topmost node is called the root; it is at level 0 (L-0), i.e. at depth 0. Each child node in a binary tree defines a sub-tree of which it is the root.
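
A minimal Python sketch of this structure, assuming a hypothetical `Node` class and two helper functions (`levels` and `height`) that are not part of the text above. Height is counted here in edges, which is one common convention.

```python
from collections import deque


class Node:
    """A binary tree node: a value plus at most two children."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def levels(root):
    """Return node values grouped by level; the root is at level 0."""
    result = []
    queue = deque([(root, 0)]) if root is not None else deque()
    while queue:
        node, level = queue.popleft()
        if level == len(result):
            result.append([])          # first node seen on this level
        result[level].append(node.value)
        for child in (node.left, node.right):
            if child is not None:
                queue.append((child, level + 1))
    return result


def height(node):
    """Number of edges on the longest downward path; -1 for an empty tree."""
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


# Root 1 has children 2 and 3; node 2 has a single left child 4.
tree = Node(1, Node(2, Node(4)), Node(3))
```

Each child is itself the root of a sub-tree, which is why `height` can simply call itself on `node.left` and `node.right`.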

            Big O notation

            In computer science, big O notation is used to classify algorithms according to how their running time or space requirements grow as the input size grows. The actual form is O(f(n)), meaning: as the parameter n (the amount of input to the algorithm) increases, the running time of the algorithm will increase no faster than some constant multiplied by f(n). How to find the big-O of some expression (as an example: 3n^2 + n^5 + 4n + 5 + 2^n + log8(n)):
            1. omit constants and constant multipliers (3n^2+ n^5 + 4n + 5 + 2^n + log8(n) => n^2+ n^5 + n + 2^n + log8(n))
            2. n^a grows faster than n^b for a > b. In other words if you have n^3 - omit n^2 (n^2+ n^5 + n + 2^n + log8(n) => n^5 + 2^n + log8(n) )
            3. any polynomial grows faster than any logarithm, so n or even sqrt(n), grows faster than log3(n)  ( n^5 + 2^n + log8(n) => n^5 + 2^n )
            4. any exponential grows faster than any polynomial, so 3^n grows faster than n^5 ( n^5 + 2^n  => 2^n )
            5. So O(3n^2 + n^5 + 4n + 5 + 2^n + log8(n)) = O(2^n)
            6. None of the above means that nobody cares about constants: in practice, making an algorithm twice as fast can be very hard and very valuable, but it is much more reasonable to find the approximate growth rate first
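
The simplification steps above can be checked numerically. The sketch below evaluates each term of the example expression for a few values of n and reports which term is the largest. The names `TERMS` and `dominant_term` are hypothetical, and this is only an illustration, not a proof of the growth rates.

```python
import math

# The individual terms of 3n^2 + n^5 + 4n + 5 + 2^n + log8(n).
TERMS = {
    "3n^2": lambda n: 3 * n**2,
    "n^5": lambda n: n**5,
    "4n": lambda n: 4 * n,
    "5": lambda n: 5,
    "2^n": lambda n: 2**n,
    "log8(n)": lambda n: math.log(n, 8),
}


def dominant_term(n):
    """Return the name of the term with the largest value for this n."""
    return max(TERMS, key=lambda name: TERMS[name](n))


for n in (10, 22, 23, 50):
    print(n, dominant_term(n))
```

For small inputs the polynomial n^5 is still the largest term; from n = 23 onward 2^n overtakes it and stays ahead. Big O describes this behaviour for large n, which is why the result is O(2^n).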

            Monday, July 8, 2019

            Draft: Relax-and-Recover

            Backup to USB and restore from USB

            sudo yum install git syslinux syslinux-extlinux kernel-devel
            git clone https://github.com/rear/rear.git
            cd rear/
            insert the USB stick into the computer being backed up
            lsblk # to find name of the USB flash card
            umount /dev/sdb1 # umount if USB flash is automatically mounted
            sudo usr/sbin/rear format /dev/sdb
            type 'Yes' to format USB flash
            rear will format that flash as REAR-000
            edit rear configuration:
            vi etc/rear/local.conf
            ### write the rescue initramfs to USB and update the USB bootloader
            OUTPUT=USB
            ### create a backup using the internal NETFS method, using 'tar'
            BACKUP=NETFS
            ### write both rescue image and backup to the device labeled REAR-000
            BACKUP_URL=usb:///dev/disk/by-label/REAR-000
            Create the rescue image (it contains no OS backup and is used to restore the OS in case of failure) with verbose output:
            sudo usr/sbin/rear -v mkrescue
            Now reboot your system and try to boot from the USB device. If it boots, the rescue image is OK and you can back up the OS data along with creating the rescue media:
            sudo usr/sbin/rear -v mkbackup




            Monday, June 10, 2019

            Access CentOS from the other machine (via VNC)

            On target machine

            yum groupinstall "GNOME Desktop"
            Set default target to GUI:
            systemctl set-default graphical.target
            yum install tigervnc-server xorg-x11-fonts-Type1
            Choose a VNC display to use for accessing this particular computer (here we'll use display 5, which corresponds to port 5905)
            cp /lib/systemd/system/vncserver@.service /etc/systemd/system/vncserver@:5.service
            Add user (if you need one) whose Desktop will be accessed:
            adduser admin
            passwd admin
            Access file:
            vi /etc/systemd/system/vncserver@:5.service
            Then:
            1. Go to [Service]
            2. ExecStart - here, replace <USER> with admin
            3. PIDFile - here, replace <USER> with admin
            firewall-cmd --permanent --zone=public --add-port=5905/tcp 
            firewall-cmd --reload

            su - admin
            Execute vncserver command to assign VNC password for needed user (admin in our case)

            systemctl daemon-reload 
            systemctl start vncserver@:5.service
            systemctl enable vncserver@:5.service
            systemctl status vncserver@:5.service
            Check that vncserver is listening on addresses other than localhost:
            lsof -i -P | grep -i "listen" | grep Xvnc

            If some kind of problem appears do below:
            Remove not needed VNC display sessions (if any):
            rm -f /tmp/.X11-unix/*
            rm -f /tmp/.X*-lock
            systemctl restart vncserver@:5.service
            ps aux | grep vnc

            On client machine

            yum install tigervnc
            Open TigerVNC Viewer (Applications > Internet) or vncviewer (from bash)
            Specify serverNameOrIp:5905
            Specify VNC password
            then login using needed login and password (admin in our case)
            Use F8 to make full-screen and to change other settings.
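
The display-to-port rule used in this post (display 5 corresponds to TCP port 5905) follows the usual VNC convention of port 5900 plus the display number. A tiny Python sketch of that rule, with hypothetical helper names and a placeholder host name:

```python
VNC_BASE_PORT = 5900  # conventional base port for VNC displays


def vnc_port(display):
    """TCP port on which VNC display number `display` listens."""
    return VNC_BASE_PORT + display


def vnc_endpoint(host, display):
    """Build the host:port string to type into the viewer."""
    return f"{host}:{vnc_port(display)}"


endpoint = vnc_endpoint("serverNameOrIp", 5)  # placeholder host from this post
```

The same number has to be used in three places: the unit file name (vncserver@:5.service), the firewall rule (5905/tcp) and the address given to the viewer.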



            Tuesday, May 7, 2019

            Copying multiple files with scp

            You can copy all files in a directory, but the problem arises when you need to copy only several files from a huge directory. Below is the method I use for that:
            1. save the list of names of the needed files with their paths into a text file (ex: scpList.txt) - full path is needed!
            2. assign this list to the variable: scpList=$(echo \"; for file in $(cat scpList.txt); do printf '%s ' "$file"; done; echo \")
            3. print the variable and manually copy its content: echo $scpList
            4. mkdir ~/Desktop/scpFiles
            5. scp root@srvIP:insertCopiedVariableContentHere ~/Desktop/scpFiles
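
As an alternative to copying the variable content by hand, the command can be assembled in a script. The Python sketch below is a variation on the steps above, not the same method: instead of one quoted list it passes one remote source argument per file, which scp also accepts. The function names, the sample paths and the `root@srvIP` placeholder are assumptions for illustration.

```python
from pathlib import Path


def read_paths(list_file):
    """Read one full remote path per line, skipping empty lines."""
    lines = Path(list_file).read_text().splitlines()
    return [line.strip() for line in lines if line.strip()]


def build_scp_command(paths, remote, dest_dir):
    """Return the scp argument list: one 'remote:path' source per file, then the destination."""
    sources = [f"{remote}:{path}" for path in paths]
    return ["scp", *sources, dest_dir]


# Hypothetical paths; in practice they would come from read_paths("scpList.txt").
cmd = build_scp_command(
    ["/var/log/app/a.log", "/var/log/app/b.log"],
    "root@srvIP",
    "scpFiles",
)
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. No shell is involved in that case, so a destination such as ~/Desktop/scpFiles should first be expanded with `os.path.expanduser`.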