Genetic Programming: Principles And Applications

  • Uploaded by: Behnam Dezfouli
  • 0
  • 0
  • October 2019
  • PDF

This document was uploaded by user and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this DMCA report form. Report DMCA


Overview

Download & View Genetic Programming: Principles And Applications as PDF for free.

More details

  • Words: 3,750
  • Pages: 52
Genetic Programming: Principles and Applications Behnam Dezfouli Spring 2007

‫‪Introduction‬‬ ‫„‬

‫ﻫﺎي‬ ‫ﺑﺮﻧﺎﻣﻪ ﺎ‬ ‫ژﻧﺘﻴﮓ ااﺳﺖ ﻛﻛﻪ ﺑﺮ روي ﺟﻤﻌﻴﺘﻲ ااز ﺎ‬ ‫اﻟﮕﻮرﻳﺘﻢ ﮓ‬ ‫ﻫﻤﺎن اﻟﮕ‬ ‫ﻞ ﺎ‬ ‫در ااﺻﻞ‬ ‫ﻛﺎﻣﭙﻴﻮﺗﺮي اﻋﻤﺎل ﺷﺪه اﺳﺖ‬ ‫ازاي‬ ‫ﻧﻈﺮ راا ﺑﻪﻪ ازا‬ ‫ﻫﺎي ﻣﻮرد ﻧﻈ‬ ‫ﺧﺮوﺟﻲ ﺎ‬ ‫ﺑﺘﻮاﻧﺪ ﺧ‬ ‫ﺖ ﻛﻪ ﺘ اﻧﺪ‬ ‫ﺑﺮﻧﺎﻣﻪﻪ ااﺳﺖ‬ ‫ﻚ ﻧﺎ‬ ‫اﻳﺠﺎد ﻳﻚ‬ ‫ﺪف ا ﺎ‬ ‫ﻫﺪف‬ ‫ورودﻳﻬﺎي ﺧﺎص ﺗﻮﻟﻴﺪ ﻛﻨﺪ‬ ‫ﺑﺮﻧﺎﻣﻪ ﻫﺎي ﻣﻤﻜﻦ‬ ‫ﺑﺮﻧﺎﻣﻪ ددر ﻓﻀﺎي ﺑ ﻧﺎﻣﻪ‬ ‫ﺗﺮﻳﻦ ﺑ ﻧﺎﻣﻪ‬ ‫ﻛﺎرا ﺗ ﻳﻦ‬ ‫ﺟﺴﺘﺠﻮي ﻛﺎ ا‬ ‫ﺟ ﺘﺠﻮي‬ ‫ﻓﻀﺎي ﺟﺴﺘﺠﻮ ﻫﻤﺎن ‪ symbolic expression‬ﻫﺎي ‪ LISP‬اﺳﺖ‬

‫„‬

‫‪S‬‬ ‫‪S-expressions:‬‬ ‫‪i‬‬ ‫‪T‬‬ ‫‪Terminals‬‬ ‫‪i l and‬‬ ‫‪dF‬‬ ‫‪Functions‬‬ ‫‪ti‬‬

‫„‬ ‫„‬ ‫„‬

‫‪Introduction‬‬ ‫‪ 9‬ﻣﺸﺨﺼﻪ ي اﺻﻠﻲ ‪:Genetic programming‬‬ ‫ﺳﻠﺴﻠﻪ ﻣﺮاﺗﺒﻲ )‪ (Hierarchical‬ﺑﻮدن ﺑﺮﻧﺎﻣﻪ ﻫﺎ‬ ‫ƒ‬ ‫ƒ‬ ‫ƒ‬

‫‪Representation as key issue‬‬ ‫ﻧﺎﻣﻨﺎﺳﺐ‬ ‫ﻫﺎي ﻧﺎ ﺎ‬ ‫ﺑﺮﻧﺎﻣﻪ ﺎ‬ ‫اﻳﺠﺎد ﻧﺎ‬ ‫ﻓﻀﺎي ﺟﺴﺘﺠﻮ و ا ﺎ‬ ‫اﻧﺪازه و ﺷﻜﻞ‪ :‬ﻣﺤﺪودﻳﺖ ﻓﻀﺎ‬ ‫ﻣﺤﺪود ﻛﻛﺮدن اﻧ ا‬ ‫آزادي و دﻳﻨﺎﻣﻴﻚ ﺑﻮدن ﺳﺎﺧﺘﺎرﻫﺎ‪ :‬ﺗﻮﻟﻴﺪ ﺟﻮاب ﺧﻮب ﺑﺮاي ﻣﺴﺌﻠﻪ‬

‫ﻣﻌﺎﻳﺐ ﻧﻤﺎﻳﺶ رﺷﺘﻪ اي‪String-base representation :‬‬ ‫ƒ ذات ﺑﺮﻧﺎﻣﻪ ﻫﺎ و روال ﻫﺎ ﺳﻠﺴﻠﻪ ﻣﺮاﺗﺒﻲ اﺳﺖ‬ ‫اب ﻧﻧﻤﻲ ﺎﻧﺪ‬ ‫رﺳﺎﻧﺪ‬ ‫ﺟﻮاب‬ ‫ﻣﺎ راا ﺑﻪﻪ ﺟ‬ ‫رﺷﺘﻪ اي ﺎ‬ ‫ﻧﻤﺎﻳﺶ ﺷﺘﻪ‬ ‫ﺘﻔﺎده از ﻧ ﺎ ﺶ‬ ‫ƒ ااﺳﺘﻔﺎده‬ ‫ƒ ﻧﻤﺎﻳﺶ رﺷﺘﻪ اي ﻗﺎدر ﺑﻪ ﻧﻤﺎﻳﺶ ﺳﺎﺧﺘﺎر ﻫﺎي ﺗﻜﺮار و ﺑﺎزﮔﺸﺘﻲ ﻧﻤﻲ ﺑﺎﺷﺪ‬ ‫ƒ ﻗﺎﺑﻠﻴﺖ ﺗﻐﻴﻴﺮ دﻳﻨﺎﻣﻴﻚ ﻧﺪارد‪ .‬ﺗﻌﻴﻴﻦ اﻧﺪازه اوﻟﻴﻪ = ﻣﺤﺪودﻳﺖ زﻳﺎد در ﻣﻴﺰان ﭘﻴﭽﻴﺪﮔﻲ‬ ‫ﻣﺤﺎﺳﺒﺎﺗﻲ‬

‫‪Introduction‬‬ ‫ﻣﻘﺎﻳﺴﻪ ﺑﺎ اﻟﮕﻮرﻳﺘﻤﻬﺎي ژﻧﺘﻴﻚ‪:‬‬ ‫ﺳﺎﺧﺘﺎرﻫﺎي ﭘﻴﭽﻴﺪه ارا ﺑﻬﻴﻨﻪ ﻛﻛﻨﺪ‬ ‫ƒ ﻗﺎﻗﺎدر ااﺳﺖ ﺎﺧ ﺎ ﺎ‬ ‫ƒ ﺑﺮﻧﺎﻣﻪ ﺳﺎزي ژﻧﺘﻴﻜﻲ ﺑﺮاي ﻣﺤﺪوده وﺳﻴﻌﺘﺮي از ﻣﺴﺎﺋﻞ‬ ‫اﻟﮕﻮرﻳﺘﻢ ﺑﻪ ﻋﻨﻮان ﻳﻚ ﺳﻴﺴﺘﻢ‪:‬‬ ‫درﺟﻤﻌﻴﺖ ﻓﻌﻠﻲ‬ ‫ﻲ‬ ‫ﻣﻮﺟﻮد ر‬ ‫اﻓﺮاد ﻮ ﻮ‬ ‫ﻢ= ﺮ‬ ‫ƒ ووﺿﻌﻴﺖ ﺳﻴﺴﺘﻢ‬ ‫ƒ ورودي ﺳﻴﺴﺘﻢ = ﺗﻨﺎﺳﺐ اﻓﺮاد‬

‫ﺳﺎﺧﺘﺎرﻫﺎي ﺑﻜﺎر رﻓﺘﻪ در اﻟﮕﻮرﻳﺘﻢ‬ Hierarchically structured computer programs whose size, shape, and complexity can dynamically change during the process.

.‫ﻛﺎر ﻧﻤﻲ آآﻳﻨﺪ‬ ‫ژﻧﺘﻴﻚ ﺑﻪ ﻛﺎ‬ ‫ﺳﺎزي ﻚ‬ ‫ﺑﺮﻧﺎﻣﻪ ﺎ‬ ‫ﺧﻄﻲ در ﺎ‬ ‫ﻳﻚ ﺑﻌﺪي ﺧﻄ‬ ‫ﺳﺎﺧﺘﺎرﻫﺎي ﻚ‬ ‫ﺎﺧ ﺎ ﺎ‬

„

F = {f1, {f1 f2, f2 ... , fn} T = {a1, a2, ... , am}.

„ „

‫اﻟﮕﻮرﻳﺘﻢ‬ ‫ﻓﺘﻪ ددر اﻟﮕﻮ ﻳﺘ‬ ‫ﺑﻜﺎر رﻓﺘﻪ‬ ‫ﻫﺎي ﺑﻜﺎ‬ ‫ﺳﺎﺧﺘﺎرﻫﺎي‬ ‫ﺳﺎﺧﺘﺎ‬ „

The functions may be: standard arithmetic operations (such as addition, subtraction, multiplication, and division), standard mathematical functions (such as SIN EXP etc.), SIN,EXP, etc ) Boolean operations, operations domain-specific domain specific functions, logical operators such as If-Then-Else, and iterative operators such as Do-Until, etc

„

The terminals may be: variable atomic arguments, such as the state variables of a system; constant atomic arguments, t such h as 0 and d 1; 1 and, d iin some cases, may be other atomic entities such as functions with no arguments.

‫اﻟﮕﻮرﻳﺘﻢ‬ ‫ﻓﺘﻪ ددر اﻟﮕﻮ ﻳﺘ‬ ‫ﺑﻜﺎر رﻓﺘﻪ‬ ‫ﻫﺎي ﺑﻜﺎ‬ ‫ﺳﺎﺧﺘﺎرﻫﺎي‬ ‫ﺳﺎﺧﺘﺎ‬ ‫„‬

‫„‬

‫‪.1‬‬ ‫‪.2‬‬ ‫‪.3‬‬ ‫‪3‬‬ ‫‪.4‬‬ ‫‪.5‬‬ ‫‪.6‬‬

‫ﻫﺮ زﺑﺎﻧﻲ ﺑﺮاي ﺑﺮﻧﺎﻣﻪ ﻧﻮﻳﺴﻲ ژﻧﺘﻴﻚ ﻣﻲ ﺗﻮاﻧﺪ ﺑﻜﺎر ﺑﺮود‪.‬‬ ‫ﻣﺎ از ‪ LISP‬اﺳﺘﻔﺎده ﻣﻲ ﻛﻨﻴﻢ‪ ،‬زﻳﺮا‪:‬‬ ‫ﺑﺮﻧﺎﻣﻪ ﻫﺎ و داده ﻫﺎ داراي ﻳﻚ ﻓﺮم ﻣﻲ ﺑﺎﺷﻨﺪ‪ .‬دﺳﺘﻜﺎري ‪ GP‬ﻣﺎﻧﻨﺪ داده ﻫﺎ‪.‬‬ ‫دﺳﺘﺮﺳﻲ آﺳﺎن ﺑﻪ درﺧﺖ ‪ parse‬ﻓﺮاﻫﻢ اﺳﺖ‪ S-expression .‬ﻣﻌﺎدل درﺧﺖ ﭘﺎرس اﺳﺖ‪.‬‬ ‫دارﻧﺪ‪.‬‬ ‫ﻣﺘﻐﻴﺮ دا ﻧﺪ‬ ‫ﺳﺎﺧﺘﺎرﻫﺎﻳﻲ ﻛﻪ اﻧﺪازه و ﺷﻜﻞ ﻣﺘﻐ‬ ‫دﺳﺘﻜﺎري آﺳﺎن ﺳﺎﺧﺘﺎ ﻫﺎﻳ‬ ‫دﺳﺘﻜﺎ ي‬ ‫دﺳﺘﻜﺎري آﺳﺎن ﺳﺎﺧﺘﺎرﻫﺎي ﺳﻠﺴﻠﻪ ﻣﺮاﺗﺒﻲ‪.‬‬ ‫ﻗﺎﺑﻠﻴﺖ ﺑﺎزدﺧﻮﻟﻲ ﺑﻮدن )‪.(Reentrant‬‬ ‫وﺟﻮد ﻣﺤﻴﻂ ﻧﺮم اﻓﺰاري ﻗﻮي‪.‬‬

‫ﺳﺎﺧﺘﺎرﻫﺎي ﺑﻜﺎر رﻓﺘﻪ‪LISP S-expression :‬‬

‫ﺳﺎﺧﺘﺎرﻫﺎي ﺑﻜﺎر رﻓﺘﻪ در اﻟﮕﻮرﻳﺘﻢ‬ ‫„ ﻣﺜﺎل ‪:1‬‬ ‫))‪OR (AND (NOT D0) (NOT D1)) (AND D0 D1‬‬ ‫}‪F = {AND, OR, NOT‬‬ ‫}‪T = {D0, D1‬‬

‫ﻣﺤﻴﻂ‪ :‬ﭼﻬﺎر ﺗﺮﻛﻴﺐ ﻣﻤﻜﻦ از ﻣﻘﺎدﻳﺮ ‪D0 , D1‬‬

‫ﺳﺎﺧﺘﺎرﻫﺎي ﺑﻜﺎر رﻓﺘﻪ در اﻟﮕﻮرﻳﺘﻢ‬ ‫ ﻣﺠﻤﻮﻋﻪ ﺗﻮاﺑﻊ و‬،‫ﺑﺮاي ﺣﺼﻮل ﻳﻚ ﻧﻤﺎﻳﺶ درﺳﺖ ﻛﻪ ﻗﺎدر ﺑﻪ ﺣﻞ ﻣﺴﺌﻠﻪ ﺑﺎﺷﺪ‬ :‫ﺗﺮﻣﻴﻨﺎﻟﻬﺎ ﺑﺎﻳﺪ دو ﻣﺸﺨﺼﻪ داﺷﺘﻪ ﺑﺎﺷﻨﺪ‬ „

Closure property of the function set and terminal set: each function of the function set must be able to process all possible argument values generated by other functions or the terminal set set.

„

Sufficiency property of the function and terminal set: the problem bl must b be solvable l bl using i the h ((proposed) d) selection l i off functions and terminals.

‫‪Search Space‬‬ ‫ﺳﺎﺧﺘﺎرﻫﺎي اوﻟﻴﻪ‬ ‫„‬ ‫„‬

‫اﺳﺘﻔﺎده از ﻣﺠﻤﻮﻋﻪ ﺗﺮﻛﻴﺐ ﺗﺮﻣﻴﻨﺎﻟﻬﺎ و ﺗﻮاﺑﻊ و اﻧﺘﺨﺎب اﺗﻔﺎﻗﻲ آﻧﻬﺎ و ﺳﺎﺧﺖ‬ ‫ﺳﺎﺧﺘﺎرﻫﺎي اوﻟﻴﻪ‪.‬‬ ‫ﺗﻮزﻳﻊ اﺣﺘﻤﺎل و ﺗﻌﺪاد آآرﮔﻮﻣﺎﻧﻬﺎي ﺗﻮاﺑﻊ‪ ،‬ﻋﻤﻖ درﺧﺖ را ﺗﻌﻴﻴﻦ ﻣﻲ ﻛﻨﺪ‪.‬‬

‫ﺗﻨﺎﺳﺐ‬ ‫ﺗﺎﺑﻊ ﺗﻨﺎ‬ ‫ﺗﺎ‬ ‫ƒ‬ ‫ƒ‬ ‫ƒ‬

‫ﻫﺮ ﻓﺮد ﻳﻚ ﻣﻘﺪار ‪ fitness‬ﺑﺎ ﺗﻮﺟﻪ ﺑﻪ ﻣﺤﻴﻂ دارد‬ ‫ﺗﻮﺟﻪ ﺑﻪ ﻣﺤﻴﻂ ﻣﺮﺑﻮط ﺑﻪ ﻣﺜﺎل ‪1‬‬ ‫ﺑﺮاي اﻛﺜﺮ ﻣﻮارد ‪ raw fitness‬ﻫﻤﺎن ﺗﻔﺎوت ﻣﻴﺎن ‪ S-expression‬و ﻧﻘﻄﻪ واﻗﻌﻲ اﺳﺖ‬

Search Space :‫در ﺻﻮرﺗﻲ ﻛﻪ ﻋﺒﺎرت از ﻧﻮع ﻋﺪد ﺻﺤﻴﺢ ﻳﺎ ﺣﻘﻴﻘﻲ ﺑﺎﺷﺪ‬ The raw fitness r(i,t) of an individual LISP S-expression i in the population p p of size M at any yg generational time step p t is:

„

„

S(i,j) is the value returned by S-expression i for environmental case j (of N environmental cases) and where C(j) is the correct value for environmental case jj. Less is better ‫ ﻣﺠﻤﻮع ﻓﻮاﺻﻞ = ﺗﻌﺪاد اﺷﺘﺒﺎﻫﺎت‬،‫ داﺷﺘﻪ ﺑﺎﺷﻴﻢ‬boolean b l ‫در ﺻﻮرﺗﻲ ﻛﻪ ﻳﻚ ﻋﺒﺎرت ﺑﺎ ﻣﻘﺎدﻳﺮ‬

„

Search Space Adjusted fitness measure

„ „

Between 0 and 1 Closer to 1 is better

Normalized fitness „ „ „ „

Between 0 and 1 Closer to 1 is better sum of the normalized fitness values is 1 =p proportional p to fitness = fitness p proportionate p

Operations in GP „

Primary operations: …

Reproduction … Crossover (recombination)

FITNESS PROPORTIONATE REPRODUCTION OPERATION Asexual operation - ‫ ﻳﻚ ﻋﻤﻞ ﻏﻴﺮ ﺟﻨﺴﻲ‬- ‫ƒ ﻓﻘﻂ ﺑﺮ روي ﻳﻚ واﻟﺪ ﻋﻤﻞ ﻣﻲ ﻛﻨﺪ‬ ‫ƒ ﺗﻮﻟﻴﺪ ﻳﻚ ﻓﺮزﻧﺪ – اﻧﺘﻘﺎل ﺧﻮدش ﺑﻪ ﻧﺴﻞ ﺑﻌﺪي‬ CROSSOVER (RECOMBINATION) OPERATION ‫ﺗﻮﻟﻴﺪ ﺣﺪاﻗﻞ ﻳﻚ ﻓﺮزﻧﺪ‬ Sexual operation - ‫ﻳﻚ ﻋﻤﻞ ﺟﻨﺴﻲ‬

Operations in GP :2 ‫ل‬ ‫„ ﻣﺜﺎل‬

CROSSOVER

Operations

Control Parameters „

1. Population size: A larger population allows for a greater exploration of the problem space at each generation and increases the chance of evolving a solution. In general, the more complex a problem the greater the population size needed.

„

2. Maximum number of generations: The evolutionary process needs to be given time; the greater the maximum number of generations the greater the chance of evolving a solution. further evolution of a population does not guarantee a solution will be found found—it it may be better to start again with a different initial population.

„

3. Probability of crossover: [Koza, 3 [Koza 1992] does not change this value from 0.90—90% of the population undergoes crossover.

„

4. Probability of reproduction: Koza Koza’s s work this value stays constant, at 0.10—10% of the population undergoes reproduction.

Five Major Preparatory Steps for GP

Flowchart for Genetic Programming g g

Experimental Results GP ‫ ﺑﺎ اﺳﺘﻔﺎده از‬machine learning ‫ﺑﺮرﺳﻲ ﺑﺮﺧﻲ آزﻣﺎﻳﺸﺎت در زﻣﻴﻨﻪ‬ „

The algorithm Th l ith is i probabilistic: b bili ti we almost l t never gett the precisely same result twice.

„

machine learning: The ability of a machine to improve its performance based on previous results.

„

‫‪MACHINE LEARNING OF A FUNCTION‬‬ ‫„‬

‫ﺗﻮﺳﻌﻪ ﻳﻚ ﺗﺮﻛﻴﺐ از ﺗﻮاﺑﻊ‪ ،‬ﻛﻪ ﺑﺘﻮاﻧﺪ ﻣﻘﺪار درﺳﺖ ﻳﻚ ﺗﺎﺑﻊ را ﭘﺲ از ﻣﺸﺎﻫﺪه ﺗﻨﻬﺎ‬ ‫ﻳﻚ ﺗﻌﺪاد ﻧﺴﺒﺘﺎً ﻛﻮﭼﻚ از ﻧﻤﻮﻧﻪ ﻫﺎي ﺧﺎص ﻣﻘﺪار آن ﺗﺎﺑﻊ ﻛﻪ واﺑﺴﺘﻪ ﺑﻪ ﺗﺮﻛﻴﺐ‬ ‫ﺑﺮﮔﺮداﻧﺪ‪.‬‬ ‫آرﮔﻮﻣﺎﻧﻬﺎ ﺑﺎﺷﺪ ﺑ ﮔ داﻧﺪ‬ ‫ﺧﺎﺻﻲ از آ ﮔﻮﻣﺎﻧﻬﺎ‬ ‫ﺧﺎﺻ‬

‫„ ﻣﺜﺎﻟﻲ ﺑﺮاي اﻳﻦ ﻣﻮرد‪:‬‬ ‫‪BOOLEAN 11-MULTIPLEXER FUNCTION‬‬ ‫ﭘﺮدازﻳﻢ‪.‬‬ ‫ﻣﻮرد ﻣﻣﻲ ﭘ دازﻳﻢ‬ ‫ﺑﺮرﺳﻲ اﻳﻦ ﻣﻮ د‬ ‫ددر اداﻣﻪ ﺑﻪ ﺑ ﺳ‬

‫‪BOOLEAN 11‬‬ ‫‪11--MULTIPLEXER FUNCTION‬‬ ‫‪input to the Boolean multiplexer function consists of k‬‬ ‫‪"address" bits ai and 2^k "data" bits di.‬‬

‫„‬

‫)‪for the 11-multiplexer (where k = 3‬‬

‫„‬

‫ﻧﻤﻮﻧﻪ‬ ‫ﻳﻦ ﻮ‬ ‫ﻲ اﻳﻦ‬ ‫ﻃﺮﻓﻲ‬ ‫ﻲ ﺑﺑﺎﺷﺪ و ازز ﺮ‬ ‫ﺠﺶ ﻣﻲ‬ ‫ﺑﻞ ﺳﻨﺠﺶ‬ ‫ﺠﻮ ﻗﺎﺑﻞ‬ ‫ﺟﺴﺘﺠﻮ‬ ‫يﺟ‬ ‫ل‪ :‬ﻓﻀﺎي‬ ‫ﻳﻦ ﻣﺜﺎل‬ ‫ﻣﺰﻳﺖ ﻛﺎرر ﺑﺮ روي اﻳﻦ‬ ‫ﺰﻳ‬ ‫در ﺷﺒﻜﻪ ﻫﺎي ﻋﺼﺒﻲ ﺑﻪ ﻋﻨﻮان ﻳﻚ ﺗﺎﺑﻊ ﺗﺴﺖ در ﻧﻈﺮ ﮔﺮﻓﺘﻪ ﺷﺪه اﺳﺖ‪.‬‬

BOOLEAN 11 11--MULTIPLEXER FUNCTION

k+2

.‫ﻣﺮﺣﻠﻪ اول اﻧﺘﺨﺎب ﻣﺠﻤﻮﻋﻪ ﺗﻮاﺑﻊ و ﺗﺮﻣﻴﻨﺎﻟﻬﺎ اﺳﺖ‬ :‫ﻣﺠﻤﻮﻋﻪ ﺗﻮاﺑﻊ ﻣﻮرد ﻧﻴﺎز ﺑﺮاي اﻧﻮاع ﻋﺒﺎرات ﺑﻮﻟﻲ‬

„

F = {AND, OR, NOT, IF}. T = {{A0,, A1,, A2,, D0,, D1,, ... , D7}. }

„

k

Number of arguments

K=3

2

2

k + 2k

Number of possible Boolean Functions

2

211

=2

2048

„

„

BOOLEAN 11 11--MULTIPLEXER FUNCTION .‫ ﺗﺮﻛﻴﺐ آرﮔﻮﻣﺎﻧﻬﺎ را ﺑﻪ ﻋﻨﻮان ﻣﺤﻴﻂ اﺳﺘﻔﺎده ﻣﻲ ﻛﻨﻴﻢ‬2048 ‫ﺑﺮاي اﻳﻦ ﻣﺴﺌﻠﻪ ﻛﻞ‬ .‫ﮔﻴﺮﻳﻢ‬ ‫ﻧﻈﺮ ﻣﻲ ﮔ ﻳ‬ ‫ ددر ﻧﻈ‬4000 ‫ﺑﺮاﺑﺮ ﺑﺎ‬ ‫ﻞ راا ﺑ اﺑ‬ ‫ﺖ ﻫﻫﺮ ﻧﻧﺴﻞ‬ ‫ﺟﻤﻌﻴﺖ‬ ‫ﺟ ﻌ‬

„ „

„

Raw fitness of a LISP S-expression p is the sum of the distances (taken over all the environmental cases) between the point returned by the S-expression for a given set of arguments g g and the correct p point.

„

The raw fitness of an S-expression can range over 2049 different values between 0 and 2048. 2048

„

The number of hits is simply p y 2048 minus the raw fitness (mismatches).

BOOLEAN 11 11--MULTIPLEXER FUNCTION „

The initial random population includes a variety of highly unfit individuals Some individual S individuals. S-expressions expressions involve logical contradictions such as (AND A0 (NOT A0)). Others involve inefficiencies such as (OR D7 D7). Some are passive and merely pass an input through as the output, such as (NOT (NOT A1)). Some of the initial random individuals base their decision on precisely the wrong arguments (i.e. data bits), such as (IF D0 A0 A2).

„

Most of the initial random individuals are partially "blind" in that they do not incorporate all 11 arguments that are known to be necessary for a solution to the problem.

„

Some S-expressions are just nonsense, such as (IF (IF (IF D2 D2 D2) D2) D2) D2).

‫‪BOOLEAN 11‬‬ ‫‪11--MULTIPLEXER FUNCTION‬‬ ‫„ ﺑﺎ وﺟﻮد ﺗﻨﺎﺳﺐ ﺑﺴﻴﺎر ﭘﺎﻳﻴﻦ اﻓﺮاد ﺟﻤﻌﻴﺖ اوﻟﻴﻪ‪ ،‬در ﻣﻴﺎن ‪ 4000‬ﻧﻔﺮ‪ 25 ،‬ﻧﻔﺮ ﺑﺎ ﻣﻴﺰان‬ ‫‪ hit‬ﺑﺮاﺑﺮ ‪ 1280‬وﺟﻮد دارﻧﺪ‪.‬‬ ‫„ ﻣﻴﺎﻧﮕﻴﻦ ‪ raw fitness‬ﻧﺴﻞ اول = ‪985.4‬‬ ‫ﺳﺎزﻳﻢ‪.‬‬ ‫اﺳﺘﻔﺎده ااز ‪ crossover‬و ‪ reproduction‬ﺟﻤﻌﻴﺖ ﺑﻌﺪي ارا ﻣﻲ ﺎ‬ ‫„ ﺎﺑﺎ ا ﻔﺎ‬ ‫„ ‪ Raw fitness‬در ﻧﺴﻞ دوم = ‪891‬‬ ‫اﺳﺖ‪..‬‬ ‫„ ﻣﺸﺎﻫﺪه ﻣﻲ ﺷﻮد ﻛﻪ ﺑﻬﺒﻮد ﺗﻨﺎﺳﺐ از ﻧﺴﻞ دوم ﺷﺮوع ﺷﺪه اﺳﺖ‬ ‫„ در ‪ 9‬ﻧﺴﻞ ﺑﻌﺪي دارﻳﻢ‪:‬‬ ‫‪ 845‬و ‪ 823‬و ‪ 763‬و ‪ 731‬و ‪ 651‬و ‪ 558‬و ‪ 459‬و ‪382‬‬ ‫ﺟﻮاب ﻣﺴﺌﻠﻪ‬ ‫در ﻧﺴﻞ ‪ 9‬ام ﻓﺮدي ﺑﺎ ﺗﻨﺎﺳﺐ ‪ 2048‬ﺑﺮﺧﻮرد ﺣﺎﺻﻞ ﻣﻲ ﺷﻮد‬

BOOLEAN 11 11--MULTIPLEXER FUNCTION :‫ ام ﺑﻪ دﺳﺖ ﻣﻲ آﻳﺪ‬9 ‫آﻧﭽﻪ در ﻧﺴﻞ‬

„

‫‪BOOLEAN 11‬‬ ‫‪11--MULTIPLEXER FUNCTION‬‬ ‫ﺣﺪ اﻛﺜﺮ ﺗﻌﺪاد اﺟﺪاد ﻓﺮدي ﻛﻪ در ﻧﺴﻞ ‪ h‬ام ﻣﻲ ﺑﺎﺷﺪ‪:‬‬

‫‪h +1‬‬

‫‪2‬‬

‫ﺟﻮاب ﺣﺎﺻﻞ ﺷﺪه در ﻧﺴﻞ ‪ 9‬ام‪ ،‬ﺣﺎﺻﻞ ﻋﻤﻞ ‪ crossover‬ﺑﺮ روي ‪ 2‬ﻓﺮد ﻣﻲ ﺑﺎﺷﺪ‪.‬‬ ‫ﭘﺪر از ﻧﺴﻞ ‪ 8‬ام و در ﻣﻴﺎن ‪ 4000‬ﻋﻀﻮ داراي رﺗﺒﻪ ‪ 58‬از ﻟﺤﺎظ ﺗﻨﺎﺳﺐ ﻣﻴﺒﺎﺷﺪ و ﻧﺮخ ﺑﺮﺧﻮرد آن ‪ 1792‬ﻣﻲ ﺑﺎﺷﺪ‪.‬‬ ‫ﻣﺎدر از ﻧﺴﻞ ‪ 8‬ام و داراي رﺗﺒﻪ ‪ 1‬از ﻟﺤﺎظ ﺗﻨﺎﺳﺐ در ﻣﻴﺎن ‪ 4000‬ﻋﻀﻮ ﻣﻲ ﺑﺎﺷﺪ‪ .‬ﻧﺮخ ﺑﺮﺧﻮرد آن ‪ 1920‬اﺳﺖ‪.‬‬

‫ﻣﺎدر‬ ‫ﺎد‬

‫ﭘﺪر‬ ‫ﭘﺪ‬

‫‪BOOLEAN 11‬‬ ‫‪11--MULTIPLEXER FUNCTION‬‬ ‫„‬

‫در اﻳﻦ ﺣﺎﻟﺖ ﺗﻌﺪاد ﻋﺒﺎرات ﺑﺮرﺳﻲ ﺷﺪه در ﻣﻴﺎن ‪ 10‬ﻧﺴﻞ ﻋﺒﺎرت اﺳﺖ از‪:‬‬

‫„‬

‫‪40000 * 2048 = 81920000 = 81.92E+06‬‬ ‫اﻳﻦ ﻣﻘﺪار ﻧﺴﺒﺖ ﺑﻪ ﻛﻞ ﺗﻌﺪاد ﺗﻮاﺑﻊ ‪ 10 616‬ﺑﺴﻴﺎر ﻛﻤﺘﺮ اﺳﺖ‪.‬‬

‫„‬

‫‪Simple Symbolic Regression‬‬ ‫„‬

‫ﻫﺪف‪ :‬ﭘﻴﺪا ﻛﺮدن ﻳﻚ ﻋﺒﺎرت رﻳﺎﺿﻲ ﺑﺮاي )‪ sin(x‬اﺳﺖ‪.‬‬

‫‪0 ≤ x ≤ 180‬‬

‫ﻣﻲ ﺧﻮاﻫﻴﻢ ﺗﻮﺳﻂ ‪ symbolic regression‬ﻳﻚ ﺗﻘﺮﻳﺐ ﺑﺮاي ﺗﺎﺑﻊ روﺑﺮو ﺑﺪﺳﺖ آورﻳﻢ‪:‬‬

‫‪ x‬ﺑﻴﻦ ‪ 0‬ﺗﺎ ‪ 9‬رادﻳﺎن اﺳﺖ و ﻟﺬا ﻋﺒﺎرت داﺧﻞ ﭘﺮاﺗﺰ ﺑﻴﻦ ‪ 0‬ﺗﺎ ‪ 172‬درﺟﻪ ﺗﻐﻴﻴﺮ ﺧﻮاﻫﺪ ﻛﺮد‪.‬‬ ‫ﺗﺎﺑﻊ ﺗﻨﺎﺳﺐ را ﺑﻪ ﺻﻮرت روﺑﺮو در ﻧﻈﺮ ﻣﻲ ﮔﻴﺮﻳﻢ ﻛﻪ ﺗﻔﺎﺿﻞ ﻣﻘﺪار ﺗﻘﺮﻳﺒﻲ ﺑﺎ ﻣﻘﺪار ﭘﻴﺶ ﻓﺮض ﻣﻲ ﺑﺎﺷﺪ‪.‬‬

‫ﺣﺪاﻛﺜﺮ ﻣﻴﺰان درون ﻗﺪر ﻣﻄﻠﻖ ‪ 20‬ﺧﻮاﻫﺪ ﺑﻮد و ﻟﺬا ﻣﻘﺪار ‪ f‬ﺑﻴﻦ ‪ 0‬ﺗﺎ ‪ 200‬ﻣﺘﻐﻴﺮ ﺧﻮاﻫﺪ ﺑﻮد‪.‬‬ ‫ﻫﺮﭼﻪ ﺑﻪ ‪ 200‬ﻧﺰدﻳﻜﺘﺮ ﺑﺎﺷﺪ ﻳﻌﻨﻲ ﺗﻨﺎﺳﺐ ﺑﺎﻻﺗﺮ اﺳﺖ‪.‬‬ ‫ادازه ﺟﻤﻌﻴﺖ = ‪ 100‬و ﺗﻌﺪاد ﻧﺴﻠﻬﺎ = ‪30‬‬

‫‪Simple Symbolic Regression‬‬ ‫„‬

‫ﺑﻬﺘﺮﻳﻦ ‪ S-expression‬ﻫﺎي زﻳﺮ در ﻃﻮل ‪ 30‬ﻧﺴﻞ ﺗﻮﻟﻴﺪ ﺷﺪه اﻧﺪ‪:‬‬

‫‪% the‬‬ ‫‪th protected‬‬ ‫‪t t d division‬‬ ‫‪di i i operator‬‬ ‫‪t‬‬

‫ﻣﺮوري ﺑﺮ ﻧﺘﺎﻳﺞ ﺑﻪ دﺳﺖ آﻣﺪه‬

‫ﻣﻴﺰان ﭘﻴﭽﻴﺪﮔﻲ ﺑﺮ اﺳﺎس ﺗﻌﺪاد ﺗﺮﻣﻴﻨﺎﻟﻬﺎ و ﺗﻮاﺑﻊ اﺳﺘﻔﺎده ﺷﺪه در ﻫﺮ ﻋﺒﺎرت اﺳﺖ‪.‬‬

‫‪Simple Symbolic Regression‬‬ ‫„‬

‫ﺑﺎ ﺳﺎده ﺗﺮ ﻛﺮدن ﻧﺘﺎﻳﺞ ﺑﻪ دﺳﺖ آﻣﺪه‪:‬‬

‫‪Simple Symbolic Regression‬‬ ‫„‬

‫ﻫﻤﺎﻧﻄﻮري ﻛﻪ از ‪ E1‬ﺗﺎ ‪ E4‬دﻳﺪه ﻣﻲ ﺷﻮد‪ (*x7) , (*xx) ،‬ﺑﻠﻮﻛﻬﺎي ﺳﺎزﻧﺪه ﺑﺮاي ﺗﻨﺎﺳﺐ ﺑﺎﻻ ﻣﻲ‬ ‫ﺑﺎﻻﺗﺮ ﺑﺴﺎزﻧﺪ‪.‬‬ ‫درﺧﺘﻬﺎﻳﻲ ﺑﺎ ﺗﻨﺎﺳﺐ ﺑﺎﻻﺗ‬ ‫ﺑﺎﺷﻨﺪ و اﻳﻦ ﺟﻤﻼت ددر ﻃﻮل ﻧﺴﻠﻬﺎي ﻣﺨﺘﻠﻒ ﻧﮕﻪ داﺷﺘﻪ ﺷﺪه اﻧﺪ ﺗﺎ د ﺧﺘﻬﺎﻳ‬ ‫در ﻧﺴﻞ ‪ 11‬ام )‪ (E3‬اﻳﻦ زﻳﺮ ﻋﺒﺎرت ﺑﺎ ﺟﻤﻠﻪ ))‪ (*x(%94‬ﺗﺮﻛﻴﺐ ﺷﺪه ﺗﺎ ‪ E4‬را ﭘﺪﻳﺪ آورد‪.‬‬

‫„‬

‫‪ (sufficiency‬را ﺗﻮﺳﻂ ﭘﻴﺎده ﺳﺎزي‬ ‫‪( ffi i‬‬ ‫)‪property‬‬ ‫اﻫﻤﻴﺖ اﻧﺘﺨﺎب ﻣﺠﻤﻮﻋﻪ درﺳﺖ ﺗﻮاﺑﻊ ) ‪t‬‬ ‫ﻛﺮدن ﺗﺎﺑﻊ )‪ Sin(x‬ﺑﺎ ﺳﻪ ﻣﺠﻤﻮﻋﻪ ﺗﻮاﺑﻊ ﻧﺸﺎن داده اﻳﻢ‪.‬‬ ‫اﺟﺮا را ﺗﺎ زﻣﺎﻧﻲ اداﻣﻪ داده اﻳﻢ ﻛﻪ ﻣﻴﺰان ﺣﺪاﻗﻠﻲ ﺑﺮاي ﺗﻨﺎﺳﺐ ﺣﺎﺻﻞ ﺷﻮد‪.‬‬

‫„‬

‫„‬

‫‪Applying genetic programming‬‬ ‫‪technique in classification trees‬‬ ‫„‬ ‫„‬ ‫„‬ ‫„‬ ‫„‬

‫ﺑﺮاي ﺣﻞ ﻣﺴﺎﺋﻞ ﻃﺒﻘﻪ ﺑﻨﺪي ﻛﺮدن ﺑﻜﺎر ﻣﻲ روﻧﺪ‪.‬‬ ‫ﻣﺎ از ‪ GP‬ﺑﺮاي ﺟﺴﺘﺠﻮي ﻳﻚ درﺧﺖ ﻃﺒﻘﻪ ﺑﻨﺪي اﺳﺘﻔﺎده ﻣﻲ ﻛﻨﻴﻢ‪.‬‬ ‫ﻧﻤﻮد و ددر‬ ‫ﺗﺒﺪﻳﻞ ﻧ د‬ ‫ﻗﻮاﻧﻴﻦ ﺗ ﺪ ﻞ‬ ‫ﻣﺠﻤﻮﻋﻪ ااي از ﻗ اﻧ‬ ‫ﻪ‬ ‫ﺗﻮان ﺑﻪﻪ‬ ‫ﺷﺪه ارا ﻣﻲ ﺗ ا‬ ‫ﺣﺎﺻﻞ ﺷﺪ‬ ‫ﺧﺖ ﺎ ﻞ‬ ‫ددرﺧﺖ‬ ‫‪ knowledge base‬ﻗﺮار داد‪.‬‬ ‫دو ﻋﻤﻠﮕﺮ ژﻧﺘﻴﻜﻲ ﺟﺪﻳﺪ را ﻣﻌﺮﻓﻲ ﺧﻮاﻫﻴﻢ ﻛﺮد‪.‬‬ ‫ﻧﺘﺎﻳﺞ را ﺑﺮاي ﻳﻚ ﻣﺴﺌﻠﻪ ﻛﺎرت اﻋﺘﺒﺎري ﻧﺸﺎن ﺧﻮاﻫﻴﻢ داد‪.‬‬ ‫‪Classification for:‬‬ ‫‪Knowledge discovery‬‬ ‫‪… Business decision-making‬‬ ‫…‬

‫„‬

Applying genetic programming technique in classification trees „

Two main tasks „ finding classifiers: by adopting appropriate learning techniques on training data sets. „ classifying l if i unknown k d data t knowledge ‫ ﻧﻘﺶ ﻣﻬﻤﻲ در‬،‫ﺳﺎﺧﺖ ﻳﻚ ﻣﺘﺪ ﻛﺎرا ﺑﺮاي ﺗﻮﻟﻴﺪ ﺧﻮدﻛﺎر ﻗﻮاﻧﻴﻦ ﻃﺒﻘﻪ ﺑﻨﺪي‬ .‫ دارد‬managementt

„

:‫در اﻳﻦ روش ﻣﺎ از ﻋﻤﻠﮕﺮﻫﺎي زﻳﺮ اﺳﺘﻔﺎده ﻣﻲ ﻛﻨﻴﻢ‬

„

Crossover Mutation Elimination Merge

… … … …

Applying genetic programming technique in classification trees :‫روش ﺑﻜﺎر رﻓﺘﻪ را در ﺷﻜﻞ روﺑﺮو ﻧﺸﺎن داده اﻳﻢ‬

Applying genetic programming technique in classification trees „

The proposed algorithm: … Input: a set of instances to be classified. … Output: an appropriate classification tree.

Applying genetic programming technique in classification trees „

Step 1: Create an initial population of n randomly generated classification trees.

„

Step 2: Evaluate the fitness value of each classification tree by an evaluation function and the set of instances.

„

Step 3: Select suitable classification trees according to the evaluation results to perform the following genetic operations.

„

Step 3.1: Perform crossover operations on parent trees to generate offspring trees.

„

S Step 3 2 Perform 3.2: P f mutation i operations i on parent trees to generate offspring ff i trees.

„

Step 3.3: Perform elimination operations on offspring trees to remove redundancy.

„

St 3.4: Step 3 4 Perform P f merge operations ti on offspring ff i trees t to t remove subsumption. b ti

„

Step 5: If the termination criterion is not satisfied, then go to Step 3; otherwise, do the next step.

„

Step 6: Select the best classification tree with the highest fitness value from the population as the final classification tree.

‫‪Applying genetic programming‬‬ ‫‪technique in classification trees‬‬ ‫„‬ ‫„‬

‫ﭘﺲ از ﻣﺮﺣﻠﻪ ﺷﺸﻢ‪ ،‬درﺧﺖ ﻃﺒﻘﻪ ﺑﻨﺪي ﻧﻬﺎﻳﻲ را ﺑﻪ ﻣﺠﻤﻮﻋﻪ ﻗﻮاﻧﻴﻦ ﺗﺒﺪﻳﻞ ﻣﻲ ﻛﻨﻴﻢ و در‬ ‫‪ knowledge base‬ﻗﺮار ﻣﻲ دﻫﻴﻢ‪.‬‬ ‫ﻋﻤﻠﮕﺮﻫﺎي ‪ merge‬و ‪ elimination‬ﺑﺮاي ﻛﺎﻫﺶ ﭘﻴﭽﻴﺪﮔﻲ درﺧﺖ ﺑﻜﺎر ﻣﻲ روﻧﺪ‪ .‬وﻟﻲ زﻣﺎن اﺟﺮا‬ ‫اﻓﺰاﻳﺶ ﻣﻲ ﺎﻳﺎﺑﺪ‪.‬‬ ‫ا ا‬

‫ﺟﺰﺋﻴﺎت اﻟﮕﻮرﻳﺘﻢ‬ ‫ﺗﻮاﺑﻊ اﺳﺘﻔﺎده ﺷﺪه‪:‬‬

Applying genetic programming technique in classification trees :‫„ ﻣﺜﺎل‬ .‫ﻛﺎرﺑﺮدي ﻛﻪ ﻣﺜﺎل ﻣﻲ زﻧﻴﻢ در ﻣﻮرد درﺧﻮاﺳﺖ ﺑﺮاي ﻛﺎرت اﻋﺘﺒﺎري اﺳﺖ‬ „

Two classes {disapproving, approving} represented as {R0, R1}, are to be distinguished by the three features {Job type, Education, Salary}.

„

the feature of Job type has four possible values {laborer, low-level manager, middle-level manager, high-level manager} denoted {J0, J1, J2, J3} the feature of Education has four p possible values {{High g school,, College, g , Master, Ph.D} denoted {E0, E1, E2, E3} the feature of Salary has four possible values {< 200 thousand (NT dollars), 200−490 thousand, 500−800 thousand, >800 thousand} denoted {S0, S1, S2 S3}. S2, S3}

„ „

Applying genetic programming technique in classification trees :‫ داراي دو ﻗﺎﻧﻮن زﻳﺮ ﺑﺎﺷﺪ‬RSi ‫ ﻓﺮض ﻛﻨﻴﺪ ﻛﻪ ﻣﺠﻤﻮﻋﻪ ﻗﻮاﻧﻴﻦ‬،‫ﺑﺮاي ﻧﺸﺎن دادن ﻧﻤﺎﻳﺶ درﺧﺘﻲ‬ „ „

„ „

„

Rule1: If (Job type =middle-level middle level manager) and (Education = Master) and (Salary = 500 − 800 thousand) then Class is approving. Rule2: If (Job type = laborer) and (Education = College) and (Salary = <200 thousand) then Class is disapproving. Rule1: If J2 and E2 and S2 then R1; Rule2: If J0 and E1 and S0 then R0.

‫‪Applying genetic programming‬‬ ‫‪technique in classification trees‬‬ ‫„‬

‫ﺗﺎﺑﻊ ﺗﻨﺎﺳﺐ‪:‬‬ ‫… ﻧﻴﺎز ﺑﻪ ﻳﻚ ﺗﺎﺑﻊ ﺗﻨﺎﺳﺐ ﻣﻨﺎﺳﺐ ﺑﺮاي ﻛﺎرﻛﺮد ﺻﺤﻴﺢ اﻟﮕﻮرﻳﺘﻢ دارﻳﻢ‪.‬‬ ‫ﺗﺄﻳﻴﺪﺪ ﺻﻼﺣ ﺖ‬ ‫ﺻﻼﺣﻴﺖ‬ ‫اي ﺗﺄﻳ‬ ‫آﻣﻮزﺷﻲ راا ﺑﺑﺮاي‬ ‫ﻢ و ﻣﺠﻤﻮﻋﻪ اي از ﻧﻤﻮﻧﻪ ﻫﺎي آﻣﻮزﺷ‬ ‫ﻛﻨﻴﻢ‬ ‫ﻳﻒ ﻣﻣﻲ ﻛﻨ‬ ‫ﺗﻌﺮﻳﻒ‬ ‫… ﻳﻚ ﺗﺎﺑﻊ ﺗﻨﺎﺳﺐ راا ﺗﻌ‬ ‫ﻛﺮدن درﺧﺘﻬﺎي ﻃﺒﻘﻪ ﺑﻨﺪي ﺑﻜﺎر ﻣﻲ ﺑﺮﻳﻢ‪.‬‬ ‫… ﻧﻤﻮﻧﻪ ﻫﺎي آﻣﻮزﺷﻲ ﻣﻮاردي ﻫﺴﺘﻨﺪ ﻛﻪ در ﻛﺎرﺑﺮد ﻧﻤﻮﻧﻪ ﺑﻜﺎر ﻣﻴﺮوﻧﺪ و ﻣﻲ ﺗﻮان آﻧﻬﺎ را از ﻣﺤﻴﻂ‬ ‫ﻧﻤﻮد‪.‬‬ ‫وري ﻮ‬ ‫ﻊ آوري‬ ‫ﺟﻤﻊ‬ ‫ﻲﺟ‬ ‫وواﻗﻌﻲ‬

‫‪Applying genetic programming‬‬ ‫‪technique in classification trees‬‬ ‫„‬

‫ﺑﺎ اﺳﺘﻔﺎده از دو ﻓﺮﻣﻮل ﻗﺒﻠﻲ‪ ،‬ﺗﺎﺑﻊ ﺗﻨﺎﺳﺐ را ﺑﻪ ﺻﻮرت زﻳﺮ ﺗﻌﺮﻳﻒ ﻣﻲ ﻛﻨﻴﻢ‪.‬‬

‫ﻋﻤﻠﮕﺮﻫﺎي ژﻧﺘﻴﻜﻲ اﺳﺘﻔﺎده ﺷﺪه ﻋﺒﺎرﺗﻨﺪ از‪:‬‬ ‫‪crossover , mutation‬‬ ‫ƒ دو ﻋﻤﻠﮕﺮ اﺳﺎﺳﻲ‪t ti :‬‬ ‫ƒ دو ﻋﻤﻠﮕﺮ ﺟﺪﻳﺪ‪merge , elimination :‬‬

Applying genetic programming technique in classification trees „

Crossover …

(1) Select a crossover point in one of the parents at random.

…

(2) If the chosen point occurs at a rule boundary (“IF” node), the crossover point in the other parent tree must also be at a rule boundary Otherwise, boundary. Otherwise the point may be within a rule rule. The crossover point for the other parent must also be within a rule.

…

(3) Exchange the two chosen parent subtrees,which consist of the entire set of nodes below the crossover point, to generate two new offspring trees.

Applying genetic programming technique in classification trees „

Crossover

Applying genetic programming technique in classification trees „

Mutation …

The mutation operation is desired to help the evolving process escape from local optimum.

Applying genetic programming technique in classification trees „

Elimination .‫ﺑﺮاي ﺣﺬف اﻓﺰوﻧﮕﻲ در زﻣﺎﻧﻲ ﻛﻪ دو ﻳﺎ ﭼﻨﺪ ﻗﺎﻧﻮن ﻣﺸﺎﺑﻪ در ﻳﻚ درﺧﺖ وﺟﻮد داﺷﺘﻪ ﺑﺎﺷﺪ‬

„

Let the classification tree to be processed be CTr with k rules. The steps for the elimination operation are shown as follows. „

(1) Set i = 1 and j = i + 1, where i and j are used to represent the numbers of the current two selected rules for checking.

„

(2) Compare Ruleri with Ruler j for their feature values and classes.

„

(3) If the two rules have the same values, they are redundant. Remove the deeper redundant rule from the classification tree CTr .

„

(4) Set j = j + 1. (5) If j > k, then do the next step; otherwise, go to step (2).

„

(6) Set j = i + 2 and i = i + 1.

„

(7) If i = k, then exit the searching process; otherwise, go to step (2).

Applying genetic programming technique in classification trees „

Elimination

Applying genetic programming technique in classification trees „

Merge .‫ﺑﺮاي ﻛﺎﻫﺶ ﭘﻴﭽﻴﺪﮔﻲ درﺧﺖ اﺳﺘﻔﺎده ﻣﻲ ﺷﻮد‬

„

Let the classification tree to be processed be CTs with q rules. The steps for the merge operation are shown as follows. „

(1) Set i = 1 and j = i + 1, where i and j are used to represent the numbers of the current two selected rules for checking.

„

(2) C Compare Rule R l sii with ith Rule R l sjj for f th their i feature f t values l and d classes. l

„

(3) If the two rules have the same classes and the feature values of a rule is a subset of those in the other one, remove the rule with more features from the classification tree CTs .

„

(4) Set j = j + 1.

„

(5) If j > q, then do the next step; otherwise, go to step (2).

„

(6) Set j = i + 2 and i = i + 1. 1

„

(7) If i = q, then exit the searching process; otherwise, go to step (2).

Applying genetic programming technique in classification trees „

Merge

‫‪Applying genetic programming‬‬ ‫‪technique in classification trees‬‬ ‫„‬

‫ﻧﺘﺎﻳﺞ ﻋﻤﻠﻲ‬

‫„‬

‫در اﻳﻦ ﺑﺨﺶ از ﻣﺠﻤﻮﻋﻪ اي از داده ﻫﺎ ﺑﺮاي ﻛﺎرﺑﺮدي ﻛﻪ در آن ﻛﺎرت اﻋﺘﺒﺎري درﺧﻮاﺳﺖ ﺷﺪه‪ ،‬اﺳﺘﻔﺎده ﺷﺪه اﺳﺖ‪.‬‬ ‫آﻣﺪه‪.‬‬ ‫ﺖآ ﺪ‬ ‫ﺑﻪ ددﺳﺖ‬ ‫ﺗﺎﻳﻮان ﻪ‬ ‫ﺑﺎﻧﻜﻲ ددر ﺗﺎ ا‬ ‫ﻧﻤﻮﻧﻪ از ﺎﻧﻜ‬ ‫ﻣﻮرد ﻧ ﻧﻪ‬ ‫‪ 687‬د‬ ‫اﻳﻦ ﻣﻮارد ﺷﺎﻣﻞ ‪ feature 14‬و دو ﻛﻼس ‪ approve , disapprove‬ﻣﻲ ﺑﺎﺷﺪ‪.‬‬ ‫ﻫﺪف ﭘﻴﺪا ﻛﺮدن ﻳﻚ درﺧﺖ ﻃﺒﻘﻪ ﻣﻨﺎﺳﺐ ﺑﻮده ﻛﻪ ﺑﺘﻮاﻧﺪ ﺑﻪ ﻣﺠﻤﻮﻋﻪ اي از ﻗﻮاﻧﻴﻦ ﺗﺒﺪﻳﻞ ﺑﺸﻮد ﻛﻪ ﺗﻮﺳﻂ آﻧﻬﺎ ﺑﺘﻮاﻧﻴﻢ‬ ‫ﻛﻨﻴﻢ‪.‬‬ ‫ﻦ ﻛﻨ ﻢ‬ ‫ﻣﻌﻴﻦ‬ ‫ده ﻣﻌ‬ ‫ي ﻛﻛﺮده‬ ‫اﻋﺘﺒﺎري‬ ‫ت اﻋﺘ ﺎ‬ ‫ﻛﺎرت‬ ‫ﺧﻮاﺳﺖ ﻛﺎ‬ ‫دي ﻛﻪ ددرﺧﻮاﺳﺖ‬ ‫اي ﻓﻓﺮدي‬ ‫ﻳﻜﻲ از دو ﻛﻼس راا ﺑﺑﺮاي‬ ‫ﻳﻜ‬ ‫اﻳﻦ ‪ 687‬ﻣﻮرد را اﺑﺘﺪا ﺑﻪ دو دﺳﺘﻪ ﺗﻘﺴﻴﻢ ﻛﺮده اﻳﻢ‪ ،‬دﺳﺘﻪ اول ﺑﺮاي ﭘﺮورش درﺧﺘﻬﺎ و دوﻣﻲ ﺑﺮاي ﺗﺴﺖ ﻛﺮدن‪.‬‬ ‫از دﺳﺘﻪ اول ﺑﺮاي ارزﻳﺎﺑﻲ ﻣﻘﺪار ﺗﻨﺎﺳﺐ درﺧﺘﻬﺎ اﺳﺘﻔﺎده ﺷﺪه اﺳﺖ‪.‬‬ ‫ﻧﻬﺎﻳﻲ اﺳﺘﻔﺎده ﺷﺪه‪.‬‬ ‫ﻧﺘﻴﺠﻪ ﻧﻬﺎﻳ‬ ‫ﻛﺮدن ﻧﺘ ﺠﻪ‬ ‫ﺗﺴﺖ ﻛ دن‬ ‫ﺑﺮاي ﺗ ﺖ‬ ‫از دﺳﺘﻪ دوم ﺑ اي‬ ‫‪ %70‬ﻣﻮارد ﺑﺮاي ارزﻳﺎﺑﻲ ﺗﻨﺎﺳﺐ و ‪ %30‬از ﻣﻮارد ﺑﺮاي ﺗﺴﺖ اﺳﺘﻔﺎده ﺷﺪه اﺳﺖ‪.‬‬ ‫ﭘﺲ از ‪ 500‬ﻧﺴﻞ‪ ،‬روش ﭘﻴﺸﻨﻬﺎد ﺷﺪه دﻗﺘﻲ ﺑﺮاﺑﺮ ﺑﺎ ﺟﺪول زﻳﺮ را ﺑﻪ دﺳﺖ آورده اﺳﺖ‪.‬‬

‫„‬ ‫„‬ ‫„‬ ‫„‬ ‫„‬ ‫„‬ ‫„‬ ‫„‬

‫‪Applying genetic programming‬‬ ‫‪technique in classification trees‬‬

‫ﻣﺸﺎﻫﺪه ﻣﻲ ﺷﻮد ﻛﻪ روش اراﺋﻪ ﺷﺪه‪ ،‬ﻛﺎراﻳﻲ ﺑﺎﻻﺗﺮ از روش ‪ koza‬و روش ‪ quinlan‬دارد‪.‬‬ ‫ﭘﺲ ﺗﻮاﻧﺴﺘﻴﻢ ﻣﺘﺪ ﻛﺎراﻣﺪي ﺑﺮاي ﺗﻮﻟﻴﺪ ﻳﻚ درﺧﺖ ﻃﺒﻘﻪ ﺑﻨﺪي ﻣﻨﺎﺳﺐ ﭘﺒﺪا ﻛﻨﻴﻢ‪.‬‬ ‫اﺳﺘﻔﺎده از ﻋﻤﻠﮕﺮﻫﺎي ‪ merge , elimination‬ﻣﻮﺟﺐ ﺷﺪ ﻛﻪ رواﺑﻂ ﺑﻴﻦ ﻗﻮاﻧﻴﻦ را ﻧﻴﺰ در ﻧﻈﺮ ﺑﮕﻴﺮﻳﻢ‪ .‬اﻣﺎ اﻳﻦ ﺑﻪ ﻗﻴﻤﺖ‬ ‫زﻣﺎن ﭘﺮدازش ﺑﻴﺸﺘﺮ اﺳﺖ‪.‬‬ ‫در آﻳﻨﺪه ﻣﻲ ﺗﻮان ﺑﺮ روي ﺗﺎﺑﻊ ﺗﻨﺎﺳﺐ ﺑﻬﺘﺮي ﻛﺎر ﻛﺮد ﺗﺎ دﻗﺖ ﺑﺎﻻﺗﺮي را ﺑﻪ دﺳﺖ آورﻳﻢ‪.‬‬

R f References „

[1] John R. Koza, “GENETIC PROGRAMMING: A PARADIGM FOR GENETICALLY BREEDING POPULATIONS OF COMPUTER PROGRAMS TO SOLVE PROBLEMS”, Computer Science Department, Stanford University, Margaret Jacks Hall Stanford, CA 94305

„

[2] S. Settea,, L. Boullart, “Genetic programming: principles and applications” ,Elsevier, Engineering Applications of Artificial Intelligence 14 (2001) 727–736

„

[3] Chan-Sheng Ch Sh K Kuo, T Tzung-Pei, P i Ch Chuen-Lung L Chen Ch Hong, H “Applying “A l genetic programming technique in classification trees” Springer-Verlag 2007

„

[[4]] Matthew Walker,, “Introduction to Genetic Programming”, g g , October 7,, 2001

„

[5] “Executional Steps of Genetic Programming”, http://www.geneticprogramming.com/gpanimatedtutorial.html

Related Documents


More Documents from "Mihai Oltean"