Intelligent Dictionary Based Encryption And Compression Algorithm


Intelligent Dictionary Based Encryption And Compression Algorithm (For Network Security and High Speed Text Data Transfer)

Hanly Y Nadackal, R7A 29
Guided by: Sindhu M P
16/09/’09

Current Scenario:
  • Computers nowadays are part of a network.
  • The data residing inside a system is very important.
  • But this data is vulnerable to exploits.
  • Of course, the user can secure data within a computer.
  • However, data sent over a network is vulnerable to external attacks.

So, Why Network Security?
  • Confidentiality: only the sender and the intended receiver should “understand” the message contents.
  • Authentication: sender and receiver want to confirm each other's identity.
  • Message integrity: sender and receiver want to ensure that the message is not altered (in transit, or afterwards) without detection.

Security now in practice:
  • Firewall: designed to block unauthorized access while permitting authorized communications.
  • Security features built into the application, transport, network, and link layers.

Friends and enemies: Alice, Bob, Trudy
  • Well known in the network security world.
  • Bob and Alice want to communicate “securely”.
  • Trudy (the intruder) may intercept, delete, or add messages.

[Figure: Alice (secure sender) sends data and control messages over a channel to Bob (secure receiver); Trudy sits between them.]

Who might Bob, Alice be?
  • … well, real-life Bobs and Alices!
  • Web browser/server for electronic transactions (e.g., on-line purchases)
  • On-line banking client/server
  • DNS servers
  • Routers exchanging routing table updates

There are bad guys (and girls) out there!
Q: What can a “bad guy” do?
A: A lot!
  • Eavesdrop: intercept messages.
  • Actively insert messages into a connection.
  • Impersonation: fake (spoof) the source address in a packet (or any field in a packet).
  • Hijacking: “take over” an ongoing connection by removing the sender or receiver and inserting himself in place.
  • Denial of Service (DoS): prevent the service from being used by others (e.g., by overloading resources).

That’s the reason why we need the network to be secure.

Why is Compression of data essential?
  • There has been an unprecedented explosion in the amount of digital data transmitted via the Internet.
  • It was estimated that by the year 2004 the National Service Provider backbone would carry traffic of around 30,000 Gbps, and that growth would continue at 100% every year.
  • With this trend expected to continue, it makes sense to pursue research on algorithms that use the available network bandwidth most effectively by maximally compressing data.

What do Compression algorithms do?
  • Compression algorithms reduce the redundancy in data to decrease the storage required for that data.
  • Compression offers an attractive approach to reducing communication costs by using the available bandwidth effectively.

So, what if the encoding and the compression are done in a single algorithm?

  • Earlier, this was achieved by using “STAR ENCODING”.

Star Encoding: Here each word has a star-encoded equivalent, in which as many letters possible are replaced by the '*’ character.  For example, a commonly used word such ‘the’ might be replaced by the string t**.  The star-encoding transform simply replaces ‘every occurrence’ of the word ‘the’ in the input file with t**.

Star Encoding continued….
  • Ideally, the most common words will have the highest percentage of '*' characters in their encoding.
  • If done properly, this means that the transformed file will have a huge number of '*' characters.
  • This ought to make the transformed file more compressible than the original plain text.
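The idea on the two slides above can be sketched in a few lines of Python. The slides do not say how star patterns are assigned, so the rule used here is an assumption: words are visited from most to least frequent, and each one receives the first unused pattern of its own length that keeps the fewest original letters. The function names are hypothetical, and the input is assumed to contain no '*' characters.

```python
import re
from collections import Counter
from itertools import combinations

WORD = re.compile(r"[A-Za-z]+")  # assumption: a word is a run of ASCII letters


def first_free_pattern(word, used):
    """Return the pattern with the fewest kept letters that is not yet taken."""
    n = len(word)
    for kept in range(n + 1):
        for positions in combinations(range(n), kept):
            pattern = "".join(
                ch if i in positions else "*" for i, ch in enumerate(word)
            )
            if pattern not in used:
                return pattern
    # Not reachable: keeping every letter yields the word itself, which is free.
    raise AssertionError("no free pattern")


def build_star_dictionary(text):
    """Map each word to a unique star pattern, most frequent words first."""
    counts = Counter(WORD.findall(text))
    codes, used = {}, set()
    for word, _ in counts.most_common():
        codes[word] = first_free_pattern(word, used)
        used.add(codes[word])
    return codes


def star_encode(text, codes):
    # Replace every occurrence of each word with its star pattern.
    return WORD.sub(lambda m: codes[m.group()], text)


def star_decode(encoded, codes):
    reverse = {pattern: word for word, pattern in codes.items()}
    return re.sub(r"[A-Za-z*]+", lambda m: reverse[m.group()], encoded)


sample = "the cat and the dog and the bird"
codes = build_star_dictionary(sample)
encoded = star_encode(sample, codes)
```

Because every pattern has the same length as its word, the encoded text has exactly as many characters as the input; any gain has to come from a compressor applied afterwards.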

Star Encoding continued….
  • The existing star encoding does not provide any compression by itself; it only puts the input text into a more compressible format for a later-stage compressor.
  • The star encoding is very weak and vulnerable to attacks.

Star Encoding continued…. Consider the following text taken from Romeo and Juliet: “But soft, what light through yonder window breaks? It is the East, and Iuliet is the Sunne, Arise faire Sun and kill the enuious Moone, Who is already sicke and pale with griefe, That thou her Maid art far more faire then she...”

Star Encoding continued…. Running this text through the star-encoder yields: “B** *of*, **a* **g** *****g* ***d*r ***do* b*e***? It *s *** E**t, **d ***i** *s *** *u**e, A***e **i** *un **d k*** *** e****** M****, *ho *s a****** **c*e **d **le ***h ****fe, ***t ***u *e* *ai* *r* f*r **r* **i** ***n s**…”

Star Encoding continued….
  • You can clearly see that the encoded data has exactly the same number of characters, but is dominated by stars.
  • It certainly looks more compressible, but at the same time it does not offer any serious challenge to the hacker!
  • That is, the star encoding provides a compressible form of output while compromising the security of the data transfer.

Does any algorithm give both “Security” and “Compression” at the same time? Yes! The algorithm is:
  • “An Intelligent Dictionary Based Encoding and Compression Algorithm”
  • It offers higher compression ratios and better security against all possible kinds of attack during transmission.

IDBE Algorithm: Consider the following paragraph: “Our philosophy of compression is to trasfom the txt into som intermedate form which can be compresed with bettr efficency and which xploits the natural redundancy of the language in making this tranformation.”
  • The above block is written with a lot of spelling mistakes, but most people will have no problem reading it.

IDBE Algorithm continued….
  • This is because our visual perception system recognizes each word by an approximate signature pattern, and we have a dictionary in our brain that associates each misspelled word with the corresponding correct word.
  • This is the core concept in the development of this algorithm.

IDBE Algorithm continued…. The algorithm is developed in two steps:
  I. Make an intelligent dictionary.
  II. Encode the input text data using the dictionary.

IDBE Algorithm continued…. Dictionary Making Algorithm:
  1. Extract all words from the input files.
  2. If a word is already in the table, increment its number of occurrences by 1; otherwise add it to the table and set its number of occurrences to 1.
  3. Sort the table by frequency of occurrence in descending order.
  4. Assign codes using the following method:
     i). Give the first 218 words the ASCII characters 33 to 250 as their codes.
     ii). Give each of the remaining words one permutation of two of the ASCII characters (in the range 33 - 250), taken in order. If any words remain, give each of them one permutation of three of the ASCII characters and, if required, of four characters, and so on.
  5. Create a new table containing only the words and their codes. Store this table as the Dictionary in a file.
  6. Stop.
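The dictionary-making steps can be sketched in Python as follows. This is an illustrative reading of the slide, not the authors' implementation: the tokenization rule (a word is a run of ASCII letters), the use of `itertools.permutations` for "permutation ... taken in order", and the names `code_stream` and `build_dictionary` are all assumptions. Character codes 33 to 250 are represented as single characters of a Python string.

```python
import re
from collections import Counter
from itertools import count, islice, permutations

# The 218 code symbols named on the slide: character codes 33..250 inclusive.
CODE_ALPHABET = [chr(c) for c in range(33, 251)]


def code_stream():
    """Yield codes in order: all 1-symbol codes, then 2-symbol ones, and so on.

    Assumption: "permutation" is taken literally, so a symbol is never
    repeated inside one code.
    """
    for length in count(1):
        for combo in permutations(CODE_ALPHABET, length):
            yield "".join(combo)


def build_dictionary(text):
    """Steps 1-5: count words, sort by frequency, pair each word with a code."""
    # Assumption: a word is a run of ASCII letters.
    frequencies = Counter(re.findall(r"[A-Za-z]+", text))
    # most_common() sorts by descending count (ties keep first-seen order).
    ranked = [word for word, _ in frequencies.most_common()]
    return dict(zip(ranked, code_stream()))


first_codes = list(islice(code_stream(), 220))
dictionary = build_dictionary("the cat the dog the cat")
```

Under this reading, one- and two-symbol codes together cover 218 + 218 × 217 = 47,524 words before three-symbol codes are needed.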

IDBE Algorithm continued…. Encoding Algorithm:
  A. Read the dictionary and store all words and their codes in a table.
  B. While inp is not empty:
     1. Read the characters from inp and form tokens.
     2. If the token is longer than 1 character, then:
        i). Search for the token in the table.
        ii). If it is not found, write the token as such into the output file.
        iii). Else (that is, if the token is found in the dictionary):
              a. Find the length of the code for the word (this is used when decoding is to be done).
              b. Write the actual code into the output file.
              c. Read the next token and neglect it if it is a space. If it is any other character, go back to B.
     3. Else (that is, if it is a 1-character token):
        i). Write the 1-character token.
        ii). If the token is one of the ASCII characters 251 - 255, write the character once more to show that it is part of the text and not a marker.
  C. Stop.
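A rough Python sketch of the encoding loop, with a matching decoder, is given below. Several details are not stated on the slides and are assumptions made only so that the sketch round-trips: each code is prefixed with a length marker `chr(250 + len(code))` (so codes of up to four symbols are supported), `chr(255)` is written after a code when no space follows the word, and a word is a run of ASCII letters. The names `encode`, `decode` and `NO_SPACE` are hypothetical.

```python
import re

NO_SPACE = chr(255)  # assumption: marks "no space followed this coded word"
TOKEN = re.compile(r"[A-Za-z]+|.", re.S)  # a word, or any single character


def is_marker(ch):
    return 251 <= ord(ch) <= 255


def encode(text, codes):
    tokens = TOKEN.findall(text)
    out, i = [], 0
    while i < len(tokens):
        tok = tokens[i]
        i += 1
        if len(tok) > 1:
            code = codes.get(tok)
            if code is None:
                out.append(tok)  # not in the dictionary: write as such
                continue
            out.append(chr(250 + len(code)) + code)  # length marker + code
            if i < len(tokens) and tokens[i] == " ":
                i += 1  # the space after a coded word is implied
            else:
                out.append(NO_SPACE)
        else:
            # 1-character token; marker characters are written twice
            out.append(tok * 2 if is_marker(tok) else tok)
    return "".join(out)


def decode(data, codes):
    words = {code: word for word, code in codes.items()}
    out, i, n = [], 0, len(data)
    while i < n:
        ch = data[i]
        if is_marker(ch) and i + 1 < n and data[i + 1] == ch:
            out.append(ch)  # doubled marker character: literal text
            i += 2
        elif 251 <= ord(ch) <= 254:
            length = ord(ch) - 250
            out.append(words[data[i + 1:i + 1 + length]])
            i += 1 + length
            # Literal chr(255) characters come in pairs, so an odd run
            # means the first one is the NO_SPACE marker.
            run = 0
            while i + run < n and data[i + run] == NO_SPACE:
                run += 1
            if run % 2 == 1:
                i += 1
            else:
                out.append(" ")
        else:
            out.append(ch)
            i += 1
    return "".join(out)


codes = {"the": "!", "and": '"', "earth": "#"}  # tiny hand-made dictionary
encoded = encode("the earth.", codes)
```

Dropping the space after each coded word is where part of the size reduction comes from; the cost is the extra marker whenever punctuation follows a word directly.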

IDBE Algorithm continued…. Consider the following data taken from “THE BIBLE”: “In the beginning God created the heaven and the earth. And the earth was without form, and void; and darkness was upon the face of the deep. And the Spirit of God moved upon the face of the waters. And God said, Let there be light: and there was light. And God saw the light, that it was good: and God divided the light from the darkness. And God called the light Day, and the darkness he called Night. And the evening and the morning were the first day.…”

IDBE Algorithm continued…. Running the text through the Intelligent Dictionary Based Encoder (IDBE) yields the following text: “û©û!ü%;ûNü’.û!ü".û"û!û.ÿ. û*û!û.û5ü"8ü"}ÿ, û"ü2Óÿ; û"ü%Lû5ûYû!ü"nû#û!ü&.ÿ.û*û!ü%Ìû#ûNü&ÇûYû!ü"nû#û! ü#Éÿ.û*ûNûAÿ, ü"¿û]û.ü".ÿ: û"û]û5ü".ÿ.û*ûNü"Qû!ü".ÿ, û’û1û5û²ÿ: û"ûNü(Rû!ü".û;û!ü%Lÿ….”
  • It is clear from the above sample output that the encoded text provides better compression and a stiff challenge to the hacker!

Performance analysis:
  • Bits Per Character (BPC) and conversion time are compared for the two cases, i.e., Star encoding and our proposed Intelligent Dictionary Based Encoding (IDBE).
  • The results are shown below:

[Figure: BPC (Bits Per Character) and conversion time comparison of the transform with *Encoding and with IDBE for the above data]
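For reference, BPC is the size of the compressed output in bits divided by the number of characters in the original text. A small Python sketch of that measurement follows; the use of `zlib` as the back-end compressor and the function names are assumptions, since the slides do not name the compressor behind the reported figures. The transformed text is assumed to contain only characters with codes up to 255.

```python
import zlib


def bits_per_character(original, compressed):
    """BPC = compressed size in bits / number of characters in the original."""
    if not original:
        raise ValueError("original text must not be empty")
    return 8 * len(compressed) / len(original)


def bpc_with_zlib(original, transformed):
    """Compress the transformed text and report BPC against the original.

    Assumption: zlib stands in for whatever back-end compressor is used.
    """
    payload = zlib.compress(transformed.encode("latin-1"), 9)
    return bits_per_character(original, payload)


# Untransformed baseline: the text is passed through unchanged.
baseline = bpc_with_zlib("a" * 1000, "a" * 1000)
```

To compare two transforms, pass the same original text together with each transformed version and compare the two BPC values; lower is better.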

Conclusion:
  • In an ideal channel, the reduction in transmission time is directly proportional to the amount of compression.
  • In a typical Internet scenario with fluctuating bandwidth, congestion, and packet-switching protocols, this does not hold true.
  • Even so, the results show excellent improvement in text data compression and added levels of security over the existing methods.


Questions?
