Wednesday, July 10, 2019

Hash Functions, Binary Tree, O(n)

Hash Function

A hash function is any function that can be used to map data of arbitrary size to data of fixed size (N).  Example of simple hash function is h(x) = x mod N (mod is modulus or remainder of division, for example 5 / 5 = 1 with no reminder => mod=0, 6 / 5 = 1 and 1 in remainder => mod = 1)The values returned by a hash function are called hash values, hash codes, digests, or simply hashes. Hash functions are often used in combination with a hash table (consists of hash function h and array, also called table, of size N), a common data structure used in computer software for rapid data lookup. Hashing is done for indexing and locating items in databases because it is easier to find the shorter hash value than the longer string. Hashing is also used in encryption.This term is also known as a hashing algorithm or message digest function. No sorting and no searching required. When you compute the hash function you know where to store the data and you know where to find the data. Hash functions are just one-way they cannot be reversed. The main idea is to store key-element pairs (k, e) as index h(k):
Example:
  1. phone book with 5 numbers in it (N = 5)
  2. h (name) = (lenght of name) mod 5
  3. This function is ok if all names in phonebook have different lengths, if some lengths are the same, then collision is occurred (collision - when pairs of input to hash function are mapped to the same hash value): 
    1. h(John) = 3 mod 5 = 3
    2. h(Jack) = 3 mod 5
    3. So h(John) = h(Jack)
    4. so because of many collision this is example of the bad hash-function
    5. actually collision can be in every hash function but hash function must be designed in the way minimizing collision possibility. To do so hash functions produce long enough hash-values and this values are hold smaller enough to be computed quickly.

Binary Tree

In computer science, a binary tree is a tree data structure in which each node (узел) has at most two children, which are referred to as the left child and the right child. Topmost node called root and this is L-0 (level zero) and height of 0. Each child node in binary tree defines a sub-tree, the root of which it is.

Big O notation

In computer science, big O notation is used to classify algorithms according to how their running time or space requirements grow as the input size grows.Actual formula is O(f(n)) meaning: with an increase in the parameter n (amount of input to the algorithm) the running time of the algorithm will increase no faster than some constant multiplied by f(n). How to find big-O of some operation (as Example 3n^2+ n^5 + 4n + 5 + 2^n + log8(n) ):
  1. omit constants and constant multipliers (3n^2+ n^5 + 4n + 5 + 2^n + log8(n) => n^2+ n^5 + n + 2^n + log8(n))
  2. n^a grows faster than n^b for a > b. In other words if you have n^3 - omit n^2 (n^2+ n^5 + n + 2^n + log8(n) => n^5 + 2^n + log8(n) )
  3. any polynomial grows faster than any logarithm, so n or even sqrt(n), grows faster than log3(n)  ( n^5 + 2^n + log8(n) => n^5 + 2^n )
  4. any exponential grows faster than any polynomial, so 3^n grows faster than n^5 ( n^5 + 2^n  => 2^n )
  5. So O(3n^2+ n^5 + 4n + 5 + 2^n + log8(n)) = 2^n
  6. All of the above doesn't mean that nobody cares constant - in practice speeding-up algorithm twice can be very hard but efficient, but it's much more reasonable to find approximate values first

No comments:

Post a Comment