Captcha Presentation

  • April 2020

By BHARATH B S 4VV05CS009

Agenda

  • Definition
  • Background
  • Types
  • Applications
  • Constructing CAPTCHAs
  • Breaking CAPTCHAs
  • Issues with CAPTCHAs
  • Conclusion

Intro

  • CAPTCHA: Completely Automated Public Turing test to tell Computers and Humans Apart
  • Invented at CMU by Luis von Ahn, Manuel Blum, et al.
  • A program that poses a challenge-response test to separate humans from computer programs
  • Generic CAPTCHAs distort letters and numbers
  • The distorted characters are presented to the user
  • The user has to recognize the distorted letters
  • If the entered letters are correct, the user is inferred to be human and is allowed access
  • Humans can read the distorted, noisy text
  • Current OCR software cannot read it reliably

Background

  • Why was CAPTCHA needed?
      • Sabotage of online polls
      • Spam emails
      • Abuse of free online accounts
      • Tampering with rankings on recommendation systems (like eBay, Amazon)
  • AltaVista first used a crude CAPTCHA on its sites
      • Resulted in a 95% reduction in spam
  • Yahoo partnered with CMU to counter these threats in its Messenger chat service
  • Luis von Ahn and Manuel Blum of CMU trademarked CAPTCHA in 2000
  • What is a Turing test?
      • Proposed by Alan Turing
      • Tests a machine’s level of intelligence
      • A human judge asks questions of two participants, one of which is a machine; the judge doesn’t know which is which
      • If the judge can’t tell which is the machine, the machine passes the test
  • CAPTCHA employs a reverse Turing test: judge = CAPTCHA program, participant = user
      • If the user passes the CAPTCHA, the user is taken to be human
      • If the user fails, the user is taken to be a machine

Types of CAPTCHAs

Text based

Simple, normal language questions:

  • What is the sum of three and thirty-five?
  • If today is Saturday, what is the day after tomorrow?
  • Which of mango, table, water is a fruit?
  • Very effective, but needs a large question bank
  • Cognitively challenged users find it hard
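As a rough illustration of how such a question bank could be generated rather than written by hand, here is a minimal Python sketch. The names `NUMBER_WORDS`, `make_sum_question`, and `check_answer` are hypothetical, and the 0 to 20 range is an assumption made only to keep the example short.

```python
import random

# Spelled-out numbers 0..20; a real question bank would need far more variety.
NUMBER_WORDS = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen", "twenty",
]

def make_sum_question(rng=random):
    """Return (question_text, expected_answer) for a spelled-out addition."""
    a = rng.randrange(len(NUMBER_WORDS))
    b = rng.randrange(len(NUMBER_WORDS))
    question = f"What is the sum of {NUMBER_WORDS[a]} and {NUMBER_WORDS[b]}?"
    return question, a + b

def check_answer(expected, user_input):
    """Accept the reply only if it parses as the expected integer."""
    try:
        return int(user_input.strip()) == expected
    except ValueError:
        return False
```

Spelling the numbers out is what keeps the question in "normal language"; a bot that only extracts digits from the page finds nothing to add.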

Gimpy:

  • Designed by Yahoo and CMU
  • Picks 10 random words from a dictionary, distorts them, and fills the image with noise
  • The user has to recognize at least 3 words
  • If the user is correct, access is granted
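The acceptance rule described for Gimpy (at least 3 of the 10 displayed words) could be sketched as follows. This is not Gimpy's actual code: `pick_words` and `gimpy_passes` are hypothetical names, and the distortion and rendering of the words into an image is left out entirely.

```python
import random

def pick_words(dictionary, count=10, rng=random):
    """Choose the distinct words that would be distorted and rendered."""
    return rng.sample(dictionary, count)

def gimpy_passes(shown_words, typed_words, required=3):
    """Pass if at least `required` distinct shown words were typed correctly."""
    shown = {w.lower() for w in shown_words}
    typed = {w.strip().lower() for w in typed_words}
    return len(shown & typed) >= required
```

Comparing sets means that typing the same word three times counts only once, which is an assumption about how repeated answers should be treated.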

EZ-Gimpy:

  • A modified version of Gimpy
  • Yahoo used this version in Messenger
  • Has only 1 random string of characters
  • Not a dictionary word, so not prone to dictionary attack
  • Not a good implementation; already broken by OCR

MSN’s Passport service CAPTCHAs:

  • Provided for Microsoft’s MSN services
  • Use 8 characters
  • Warping is used to distort them
  • Very strong implementation; hasn’t been broken
  • It is segmentation-resistant

Graphic based CAPTCHAs

BONGO:

  • Named after M. M. Bongard, a pattern recognition expert
  • The user has to solve a pattern recognition problem
  • The user has to identify the characteristic that distinguishes two sets of figures
  • The user then has to say which set a given figure belongs to

PIX:

  • Uses a large database of labelled images
  • It shows a set of images; the user has to recognize the feature they have in common
  • E.g., pick the common characteristic among four pictures; the answer is “Aeroplane”

Audio CAPTCHAs:

  • Consist of a downloadable audio clip
  • The user listens and enters the spoken word
  • Help visually disabled users
  • Example: Google’s audio-enabled CAPTCHA
  • Not popular

Applications

  • Protect online polls
  • Prevent Web registration abuse; protect passwords from brute-force attacks
  • Prevent comment spam and spam emails
  • E-ticketing: prevent scalping

Verify digitized books: reCAPTCHA

  • Used in the Google Books project
  • Two words are shown; the program knows the first word
  • If the user enters the first word correctly, the program assumes that the second, unknown word has also been entered correctly
  • The second word becomes “known”
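The two-word check described above could be sketched roughly as follows in Python. This is an illustration, not reCAPTCHA's real implementation: `submit`, `resolved`, and the `votes` store are hypothetical names, and the `min_votes` agreement threshold is an added assumption (the behaviour described in the slide corresponds to `min_votes=1`).

```python
from collections import Counter, defaultdict

# unknown-word id -> tally of transcriptions typed by users who got the
# known (control) word right
votes = defaultdict(Counter)

def submit(control_answer, typed_control, unknown_id, typed_unknown):
    """Return True if the user passes; on a pass, record their reading."""
    if typed_control.strip().lower() != control_answer.lower():
        return False  # control word wrong: reject and ignore the other word
    votes[unknown_id][typed_unknown.strip().lower()] += 1
    return True

def resolved(unknown_id, min_votes=3):
    """Return the transcription once `min_votes` users agree, else None."""
    if not votes[unknown_id]:
        return None
    word, count = votes[unknown_id].most_common(1)[0]
    return word if count >= min_votes else None
```

Only the control word decides whether the user passes; the unknown word is never graded, it is only collected.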

Help advance AI knowledge

  • CAPTCHAs are called hard AI problems
  • A win-win scenario:
      • If a CAPTCHA is broken by a bot, a hard AI problem is solved
      • If it is not yet broken, the current implementation is able to withstand attacks
  • Thus AI knowledge is advanced if CAPTCHAs are broken

Constructing CAPTCHAs

Things to keep in mind:

  • Don’t store the CAPTCHA solution in the Web page’s metadata
  • A CAPTCHA is no good if it doesn't distort
  • Need a large database of different CAPTCHA questions
  • Avoid repetition of questions

CAPTCHA logic:

  • Generate the question
  • Persist the correct answer
  • Present the question to the user
  • Evaluate the answer; if incorrect, start again and generate a different CAPTCHA
  • If correct, allow the user access
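The five steps of the CAPTCHA logic can be sketched in Python as below. This is a minimal sketch under stated assumptions: the in-memory `_pending` dictionary stands in for server-side session storage, the function names are hypothetical, and the distorted-image rendering is omitted.

```python
import secrets
import string

# Stand-in for server-side storage: challenge id -> correct answer.
_pending = {}

def generate_challenge(length=6):
    """Generate a question and persist its answer on the server side."""
    answer = "".join(secrets.choice(string.ascii_uppercase) for _ in range(length))
    challenge_id = secrets.token_hex(8)
    _pending[challenge_id] = answer
    # A real system would render `answer` as a distorted image before
    # presenting it; only the id and that image would reach the browser.
    return challenge_id, answer

def evaluate(challenge_id, user_input):
    """Check the reply; each challenge is removed after a single attempt."""
    answer = _pending.pop(challenge_id, None)
    return answer is not None and user_input.strip().upper() == answer

def handle_attempt(challenge_id, user_input):
    """Allow access on success, otherwise issue a different CAPTCHA."""
    if evaluate(challenge_id, user_input):
        return True, None
    return False, generate_challenge()
```

Removing the stored answer on every attempt is one way to get the "start again with a different CAPTCHA" behaviour: a failed challenge can never be retried.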

Embeddable CAPTCHAs:

  • Freely available; just embed the code into the Web page’s HTML, e.g., from www.recaptcha.net
  • No maintenance

Custom CAPTCHAs:

  • Fit the theme of the page
  • Better protected from spammers
  • Can be written in any language: Perl, .NET, ASP, JavaScript

Guidelines:

  • Accessibility
  • Image security
  • Script security
  • Security after widespread adoption
  • Custom implementation or a general CAPTCHA?

Breaking CAPTCHAs

Cracking CAPTCHAs through programs:

  • Convert the CAPTCHA to greyscale
  • Detect patterns in the image corresponding to characters
  • Or, read that user’s session files to learn the CAPTCHA word
  • Solution: only store a hash of the CAPTCHA word in session files
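Storing only a hash of the CAPTCHA word might look like the following Python sketch. The `session` dictionary, its key names, and the two function names are assumptions made for illustration; the salted SHA-256 scheme is one possible choice, not something the slides specify.

```python
import hashlib
import hmac
import secrets

def store_solution(session, solution):
    """Keep only a salted hash of the CAPTCHA word in the session data."""
    salt = secrets.token_bytes(16)
    digest = hashlib.sha256(salt + solution.lower().encode("utf-8")).hexdigest()
    session["captcha_salt"] = salt.hex()
    session["captcha_hash"] = digest

def verify_solution(session, user_input):
    """Hash the user's reply the same way and compare in constant time."""
    salt = bytes.fromhex(session["captcha_salt"])
    reply = user_input.strip().lower().encode("utf-8")
    candidate = hashlib.sha256(salt + reply).hexdigest()
    return hmac.compare_digest(candidate, session["captcha_hash"])
```

Because CAPTCHA words are short, an attacker who can read the session file could still try every possible word against the hash. Keying the hash with a secret held only by the server (for example with Python's `hmac` module) would likely be a more robust variant.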

Greg Mori and Jitendra Malik have broken text CAPTCHAs, e.g., EZ-Gimpy. To break this CAPTCHA:

  • Segmentation: locate possible letters in the image
  • Construct a graph of consistent letters
  • Find plausible words from the graph and use scores to rank them, e.g., roll = 11.94, profit = 9.42 (better match)
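The final ranking step can be illustrated with a few lines of Python using the two scores quoted above. The sketch assumes that a lower score means a better match, which is inferred from "profit" (9.42) being marked as the better match; `rank_candidates` is a hypothetical helper, and the segmentation and graph construction steps are not shown.

```python
def rank_candidates(scores):
    """Sort candidate words by matching score, best (lowest) first."""
    return sorted(scores.items(), key=lambda item: item[1])

# Scores taken from the example above; lower is assumed to be better.
candidates = {"roll": 11.94, "profit": 9.42}
best_word, best_score = rank_candidates(candidates)[0]
```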

Social engineering to break CAPTCHAs:

  • A spammer encounters a CAPTCHA
  • That CAPTCHA is copied to another site
  • Humans are baited, e.g., with free MP3s
  • To get those MP3s, users are told to solve the copied CAPTCHA
  • The solution is routed to the spammer
  • Solution: fix a time-to-live period for a question
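A time-to-live check could be sketched in Python as follows. The 120-second value, the `store` dictionary, and the function names are assumptions; the slides do not give a concrete lifetime.

```python
import time

TTL_SECONDS = 120  # assumed lifetime; tune to how long a human needs

def issue(store, challenge_id, answer, now=None):
    """Record the answer together with the moment it stops being valid."""
    now = time.time() if now is None else now
    store[challenge_id] = (answer, now + TTL_SECONDS)

def check(store, challenge_id, user_input, now=None):
    """Reject unknown, already-used, or expired challenges."""
    now = time.time() if now is None else now
    answer, expires_at = store.pop(challenge_id, (None, 0.0))
    if answer is None or now > expires_at:
        return False
    return user_input.strip().lower() == answer.lower()
```

The idea is that relaying a CAPTCHA to a bait site and waiting for a visitor to solve it takes time, so a short lifetime makes a relayed solution more likely to arrive after the question has expired.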

CAPTCHA cracking as a business:

  • Firms offer CAPTCHA cracking services in exchange for money

Issues with CAPTCHAs

Usability issues:

  • W3C guidelines call for the Web to be accessible to all people
  • Some CAPTCHAs are inaccessible to visually impaired or cognitively challenged people

Compatibility issues:

  • JavaScript may need to be enabled in browsers
  • Some may need the Adobe Flash plugin

Summary

  • CAPTCHAs are an effective way to counter bots and reduce spam
  • They serve a dual purpose: they also help advance AI knowledge
  • Applications are varied, from stopping bots to character recognition and pattern matching
  • Some issues with current implementations represent challenges for future improvements
