2) Outline:
• Introduction
• Hashing algorithm
• Qualities of a good hash function
• Digital signature
• Cryptographic hash function
• Hash table
• Methods of hash function: the division-remainder method, folding, radix transformation, digit rearrangement
• Applications of hash functions: error correction and detection, identification and verification, audio identification
• Reference
3) Introduction:
Origins of the term: The term "hash" comes by way of analogy with its standard meaning in the physical world, to "chop and mix." Knuth notes that Hans Peter Luhn of IBM appears to have been the first to use the concept.
What is hashing? Hashing is the transformation of a string of characters into a usually shorter, fixed-length value or key that represents the original string.
Main use: Hashing is used to index and retrieve items in a database because it is faster to find the item using the shorter hashed key than to find it using the original value. Hashing is also used in many encryption algorithms.
4) Hashing Algorithm: A hash function is a mathematical function that takes in some arbitrary value and produces a hash value based on the given input. The hashing algorithm is called the hash function. In addition to faster data retrieval, hashing is also used in creating and verifying digital signatures. A hash function is a reproducible method of turning some kind of data into a (relatively) small number that may serve as a digital "fingerprint" of the data. The algorithm "chops and mixes" (i.e., substitutes or transposes) the data to create such fingerprints, called hash values. These are commonly used as indices into hash tables or hash files. Cryptographic hash functions are used for various purposes in information security applications.
[Figure: input and output of a hash function]
5) Qualities of a good hash function
• Produces a fixed-length key for variable-length input
• Has a very large key space, which supports the next point
• Avoids collisions (ideally no two different pieces of input give the same key value; because the key has a fixed length, collisions cannot be excluded entirely, only made very unlikely)
6) Digital signature
The message is transformed with the hash function into a message digest, and the sender signs that digest with a private key. The message and the signature are then sent to the receiver. Using the same hash function as the sender, the receiver derives a message digest from the received message and compares it with the digest recovered from the signature. They should be the same.
A digital signature provides:
• Authentication: proof of identity of the parties to an electronic transaction
• Integrity: assurance that the contents of a message have not been tampered with or modified
• Non-repudiation: proof of agreement to the terms of the transaction and prevention of denial of commitment
7) Digital signature creation and verification
• Digital signature creation uses a hash result derived from and unique to both the signed message and a given private key. For the hash result to be secure, there must be only a negligible possibility that the same digital signature could be created by the combination of any other message or private key.
• Digital signature verification is the process of checking the digital signature by reference to the original message and a given public key, thereby determining whether the digital signature was created for that same message using the private key that corresponds to the referenced public key.
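The creation and verification steps above can be sketched with a toy RSA key. This is only an illustration under stated assumptions: the key values (P, Q, E, D) and the function names are chosen for this example and do not come from the slides, the key is far too small to be secure, and real systems use a vetted cryptographic library with padding.

```python
import hashlib

# Toy RSA key (assumed values for illustration only; NOT secure).
P, Q = 61, 53
N = P * Q      # public modulus, 3233
E = 17         # public exponent
D = 2753       # private exponent: (E * D) % ((P - 1) * (Q - 1)) == 1


def digest(message: bytes) -> int:
    """Hash the message, then reduce the result into the toy key's range."""
    h = hashlib.sha256(message).digest()
    return int.from_bytes(h, "big") % N


def sign(message: bytes) -> int:
    """Creation: apply the private key to the hash result of the message."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Verification: recover the hash with the public key and compare it
    with a hash freshly computed from the received message."""
    return pow(signature, E, N) == digest(message)
```

For example, verify(b"pay 10", sign(b"pay 10")) returns True, while a signature that has been altered in transit no longer matches the message.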
8) Cryptographic hash function
In cryptography, a cryptographic hash function is a hash function with certain additional security properties that make it suitable for use as a primitive in various information security applications, such as authentication and message integrity. A hash function takes a string of any length as input and produces a fixed-length string as output, sometimes termed a message digest.
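A short sketch of the fixed-length property, using SHA-256 from Python's standard hashlib module. SHA-256 is chosen here only as an example of a cryptographic hash function; the slides do not name one at this point, and the helper name message_digest is an assumption.

```python
import hashlib


def message_digest(data: bytes) -> str:
    """Return the SHA-256 message digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


# Inputs of very different lengths give digests of the same length.
short_digest = message_digest(b"a")
long_digest = message_digest(b"a" * 1_000_000)
print(len(short_digest), len(long_digest))  # 64 64
```

The digest is always 64 hexadecimal characters (256 bits), whether the input is one byte or one million bytes.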
10) Hash table
Hash tables, a major application for hash functions, enable fast lookup of a data record given its key. A hash table, or hash map, is a data structure that associates keys with values. The primary operation it supports efficiently is a lookup: given a key (e.g. a person's name), find the corresponding value (e.g. that person's telephone number). It works by transforming the key with a hash function into a hash, a number that is used as an index into an array to find the location of the desired value.
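The lookup described above can be sketched as a small hash table. This is a minimal illustration, not a production design: the class name HashTable, the fixed table size, and the use of separate chaining (a list per array slot) to handle collisions are all assumptions made for the example.

```python
class HashTable:
    """Minimal hash table using separate chaining (illustrative only)."""

    def __init__(self, size: int = 8) -> None:
        # One list ("bucket") per array slot; colliding keys share a bucket.
        self.buckets = [[] for _ in range(size)]

    def _index(self, key) -> int:
        # Transform the key into a hash, then into an array index.
        return hash(key) % len(self.buckets)

    def put(self, key, value) -> None:
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))

    def get(self, key):
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        raise KeyError(key)
```

A phone book would then be filled with put("Alice", "555-0100") and queried with get("Alice"). Only one bucket is searched, not the whole table.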
11) Methods of hash function
The division-remainder method: The number of items in the table is estimated. That number is then used as a divisor into each original value or key to extract a quotient and a remainder. The remainder is the hashed value. (Since this method is liable to produce a number of collisions, any search mechanism would have to be able to recognize a collision and offer an alternate search mechanism.)
Folding: This method divides the original value (digits in this case) into several parts, adds the parts together, and then uses the last four digits (or some other arbitrary number of digits that will work) as the hashed value or key.
Radix transformation: Where the value or key is numeric, the number base (or radix) can be changed, resulting in a different sequence of digits. (For example, a decimal-numbered key could be transformed into a hexadecimal-numbered key.) High-order digits could be discarded to fit a hash value of uniform length.
Digit rearrangement: This is simply taking part of the original value or key, such as the digits in positions 3 through 6, reversing their order, and then using that sequence of digits as the hash value or key.
12) Applications of hash functions
Error correction and detection: Using a hash function to detect errors in transmission is straightforward. The hash function is computed for the data at the sender, and the value of this hash is sent with the data. The hash function is computed again at the receiving end, and if the hash values do not match, an error has occurred at some point during the transmission.
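The four methods of slide 11 (division-remainder, folding, radix transformation, digit rearrangement) can be sketched as follows. The function names, the default part length of four digits, and the sample key are assumptions made for the illustration. Each function is one plausible reading of the description, not a standard definition.

```python
def division_remainder(key: int, table_size: int) -> int:
    """Use the remainder of key / table_size as the hashed value."""
    return key % table_size


def folding(key: int, part_len: int = 4, keep: int = 4) -> int:
    """Split the digits into parts, add them, keep the last `keep` digits."""
    digits = str(key)
    parts = [int(digits[i:i + part_len]) for i in range(0, len(digits), part_len)]
    return sum(parts) % 10 ** keep


def radix_transformation(key: int, keep: int = 4) -> str:
    """Rewrite the decimal key in hexadecimal, discarding high-order digits."""
    return format(key, "x")[-keep:]


def digit_rearrangement(key: int, start: int = 3, end: int = 6) -> str:
    """Take the digits in positions start..end (1-based) and reverse them."""
    return str(key)[start - 1:end][::-1]


key = 123456789
print(division_remainder(key, 97))  # 39
print(folding(key))                 # 1234 + 5678 + 9 = 6921
print(radix_transformation(key))    # 0x75bcd15 -> "cd15"
print(digit_rearrangement(key))     # "3456" reversed -> "6543"
```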
Identification and verification: Cryptographic-grade hash functions are commonly used as integrity check values to identify files and/or verify their integrity. Some hash algorithms, notably MD5, are no longer recommended for new applications and may not provide the desired level of security. However, they may still be useful as an error-checking mechanism where purposeful data tampering isn't a primary concern.
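Both error detection in transmission and file integrity checks follow the same pattern: compute the hash at the source, send or publish it with the data, and recompute it at the destination. A minimal sketch, assuming SHA-256 as the hash and hypothetical helper names send and receive:

```python
import hashlib


def send(data: bytes) -> tuple:
    """Sender side: return the data together with its hash value."""
    return data, hashlib.sha256(data).hexdigest()


def receive(data: bytes, expected_hash: str) -> bool:
    """Receiver side: recompute the hash and compare with the one received."""
    return hashlib.sha256(data).hexdigest() == expected_hash


payload, checksum = send(b"hello world")
print(receive(payload, checksum))         # True: data arrived intact
print(receive(b"hellp world", checksum))  # False: data was altered
```

A mismatch only shows that something changed. It does not say where the change is or how to repair it.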
Audio identification: For audio identification, such as finding out whether an MP3 file matches one of a list of known items, one could use a conventional hash function such as MD5, but this would be very sensitive to highly likely perturbations such as time-shifting, CD read errors, different compression algorithms or implementations, or changes in volume. Using something like MD5 is useful as a first pass to find exactly identical files, but another, more advanced algorithm is required to find all items that would nonetheless be interpreted as identical by a human listener.
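The sensitivity of a conventional hash to a change in volume can be illustrated with a few made-up samples. The function toy_fingerprint below is a hypothetical stand-in invented for this sketch, not a real audio identification algorithm: it only records whether the signal rises or falls between neighbouring samples, which happens to survive a volume change.

```python
import hashlib
import struct


def exact_hash(samples: list) -> str:
    """MD5 over raw 16-bit samples: matches only bit-identical audio."""
    raw = struct.pack(f"<{len(samples)}h", *samples)
    return hashlib.md5(raw).hexdigest()


def toy_fingerprint(samples: list) -> str:
    """Hypothetical fingerprint: '1' where the signal rises, '0' otherwise."""
    return "".join("1" if b > a else "0" for a, b in zip(samples, samples[1:]))


original = [0, 100, 300, 200, -150, -50, 400]  # made-up sample values
louder = [2 * s for s in original]             # same audio at double volume

print(exact_hash(original) == exact_hash(louder))            # False
print(toy_fingerprint(original), toy_fingerprint(louder))    # 110011 110011
```

MD5 treats the two sample lists as unrelated, while the toy fingerprint is the same for both.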