DECISION TREE AND INSTANCE-BASED LEARNING FOR LABEL RANKING
Weiwei Cheng, Jens Hühn and Eyke Hüllermeier
Knowledge Engineering & Bioinformatics Laboratory
Department of Mathematics and Computer Science
Marburg University, Germany

Label Ranking

[Figure: example. Customers, described by attributes such as age and income, each rank a set of cars:]

customer 1      MINI ≻ Toyota ≻ BMW ≻ Volvo
customer 2      BMW ≻ MINI ≻ Toyota
customer 3      Volvo ≻ BMW ≻ Toyota ≻ MINI
customer 4      Toyota ≻ BMW
new customer    ???

Given:
– a set of training instances { xk | k = 1 … m } ⊆ X
– a set of labels L = { λ1, λ2, …, λn }
– for each training instance xk: a set of pairwise preferences of the form λi ≻xk λj

Find:
A ranking function (a mapping X → Ω) that maps each x ∈ X to a ranking ≻x of L (a permutation πx) and generalizes well in terms of a loss function on rankings (e.g. Kendall's tau, sketched below).

Existing Methods:
– Ranking by pairwise comparison (Fürnkranz and Hüllermeier, ECML 2003)
– Constraint classification (Har-Peled, Roth and Zimak, NIPS 2003)
– Log-linear models for label ranking (Dekel, Manning and Singer, NIPS 2003)

These methods
• essentially reduce label ranking to classification,
• are efficient but may come with a loss of information,
• may have an improper bias and lack flexibility,
• or may produce models that are not easily interpretable.
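
The loss function on rankings used in the experiments is Kendall's tau (see Main Conclusions from Experiments). A minimal Python sketch, not from the poster, computing it for two complete rankings over the same label set:

    from itertools import combinations

    def kendall_tau(pi, sigma):
        # Rankings are lists of labels, most preferred first; both must be
        # complete rankings over the same label set.
        pos_pi = {label: rank for rank, label in enumerate(pi)}
        pos_sigma = {label: rank for rank, label in enumerate(sigma)}
        concordant = discordant = 0
        for a, b in combinations(pi, 2):
            # A pair (a, b) is concordant if both rankings order it the same way.
            if (pos_pi[a] - pos_pi[b]) * (pos_sigma[a] - pos_sigma[b]) > 0:
                concordant += 1
            else:
                discordant += 1
        return (concordant - discordant) / (concordant + discordant)

    # Complete reversal of customer 1's ranking gives tau = -1.0.
    print(kendall_tau(["MINI", "Toyota", "BMW", "Volvo"],
                      ["Volvo", "BMW", "Toyota", "MINI"]))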

Label Ranking Tree

[Figure: an example label ranking tree. Inner nodes test attributes (IF age ≤ 35, IF sex = f, IF lives in Montreal, IF a PhD student, income), with true/false branches; each leaf predicts a ranking, e.g. B ≻ T ≻ M, M ≻ T ≻ B, B ≻ M ≻ T, M ≻ B ≻ T, T ≻ M ≻ B, M ≻ T, T ≻ M.]

Major modifications compared with regression trees:

1. Split criterion: seeking a split of T (set of rankings) into T+ and T− that maximizes

   ( |T+| · θ+ + |T−| · θ− ) / |T| ,

   where θ+ and θ− are the spreads of the Mallows models (see Local Learning Approach) fitted to T+ and T− (see the sketch after this list).

2. Stopping criterion: the tree is pure OR the number of examples in a node is too small.
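
A minimal sketch of the split criterion, assuming a hypothetical helper fit_spread that returns the MLE spread θ of a Mallows model fitted to a set of rankings:

    def split_goodness(T_plus, T_minus, fit_spread):
        # Size-weighted average of the spreads fitted to the two subsets:
        # (|T+| * theta+ + |T-| * theta-) / |T|.  A larger spread theta means
        # the rankings in a subset agree more strongly, so the tree seeks the
        # split that maximizes this value.
        theta_plus = fit_spread(T_plus)    # hypothetical MLE helper
        theta_minus = fit_spread(T_minus)
        n_plus, n_minus = len(T_plus), len(T_minus)
        return (n_plus * theta_plus + n_minus * theta_minus) / (n_plus + n_minus)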

Local Learning Approach

– Target function is estimated (on demand) in a local way.
– Core part is to estimate a locally constant model.
– Use probabilistic models for rankings, considering nearby preferences as a representative sample (see the sketch below).
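
A minimal sketch of the instance-based (local) prediction step; the neighborhood size k, the Euclidean distance on features, and the brute-force search over permutations are illustrative assumptions, not the poster's exact procedure:

    import math
    from itertools import combinations, permutations

    def kendall_distance(pi, sigma):
        # Number of label pairs on which the two rankings disagree.
        pos_pi = {label: rank for rank, label in enumerate(pi)}
        pos_sigma = {label: rank for rank, label in enumerate(sigma)}
        return sum(1 for a, b in combinations(pi, 2)
                   if (pos_pi[a] - pos_pi[b]) * (pos_sigma[a] - pos_sigma[b]) < 0)

    def iblr_predict(x, training_data, k=5):
        # training_data: list of (feature_vector, ranking) pairs; rankings are
        # complete rankings over the same label set, most preferred first.
        def dist(u, v):
            return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
        neighbors = sorted(training_data, key=lambda d: dist(x, d[0]))[:k]
        labels = neighbors[0][1]
        # Locally constant model: the ranking minimizing the sum of Kendall
        # distances to the neighbors' rankings (the MLE center of a Mallows
        # model when all observed rankings are complete).  Brute force over
        # all permutations -- illustrative only, feasible for small n.
        return list(min(permutations(labels),
                        key=lambda c: sum(kendall_distance(c, r)
                                          for _, r in neighbors)))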

Mallows Model

P(σ | θ, π) = exp(−θ d(π, σ)) / φ(θ, π)

with center π ∈ Ω, spread θ > 0, and distance d on Ω.
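
A minimal sketch evaluating this probability, assuming d is the Kendall distance, for which φ(θ, π) depends only on θ and has the closed form ∏_{j=1}^{n} (1 − exp(−jθ)) / (1 − exp(−θ)) appearing in the denominator of the MLE formula below:

    import math
    from itertools import combinations

    def kendall_distance(pi, sigma):
        pos_pi = {label: rank for rank, label in enumerate(pi)}
        pos_sigma = {label: rank for rank, label in enumerate(sigma)}
        return sum(1 for a, b in combinations(pi, 2)
                   if (pos_pi[a] - pos_pi[b]) * (pos_sigma[a] - pos_sigma[b]) < 0)

    def mallows_probability(sigma, pi, theta):
        # P(sigma | theta, pi) = exp(-theta * d(pi, sigma)) / phi(theta), where
        # phi(theta) = prod_{j=1..n} (1 - exp(-j*theta)) / (1 - exp(-theta))
        # is the normalization constant for the Kendall distance.
        n = len(pi)
        phi = math.prod((1 - math.exp(-j * theta)) / (1 - math.exp(-theta))
                        for j in range(1, n + 1))
        return math.exp(-theta * kendall_distance(pi, sigma)) / phi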

Maximum likelihood estimation (MLE) based on observed (incomplete) rankings σ = {σ1, …, σk}:

P(σ | θ, π) = ∏_{i=1}^{k} P(E(σi) | θ, π)
            = ∏_{i=1}^{k} Σ_{σ∈E(σi)} P(σ | θ, π)
            = [ ∏_{i=1}^{k} Σ_{σ∈E(σi)} exp(−θ d(σ, π)) ] / [ ∏_{j=1}^{n} (1 − exp(−jθ)) / (1 − exp(−θ)) ]^k ,

where E(σi) denotes the set of consistent extensions of σi:

Observation σ    Extensions E(σ)
a ≻ b            a ≻ b ≻ c,   a ≻ c ≻ b,   c ≻ a ≻ b

Approximation of the MLE (π̂, θ̂) = arg max_{π,θ} P(σ | θ, π) with an EM-like estimation procedure.
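
A minimal sketch of E(σ) by brute-force enumeration (feasible only for small label sets), reproducing the table above:

    from itertools import permutations

    def consistent_extensions(sigma, labels):
        # All complete rankings of `labels` in which the labels of the
        # incomplete ranking `sigma` appear in the same relative order.
        def consistent(full):
            ranks = [full.index(label) for label in sigma]
            return ranks == sorted(ranks)
        return [list(p) for p in permutations(labels) if consistent(p)]

    print(consistent_extensions(["a", "b"], ["a", "b", "c"]))
    # [['a', 'b', 'c'], ['a', 'c', 'b'], ['c', 'a', 'b']]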

Main Conclusions from Experiments

[Figure: ranking performance (Kendall's tau, roughly 0.55 to 0.8) as a function of the probability of a missing label (0 to 0.7), comparing Label Ranking Tree (LRT), Instance-Based Label Ranking (IBLR), and Constraint Classification (CC).]

– Local learning is more flexible and can exploit more preference information.
– Given enough data, IBLR is significantly better than LRT and CC in terms of predictive accuracy (Kendall's tau).
– The size of LRT is smaller than expected.

ACKNOWLEDGEMENTS: Weiwei Cheng and Jens Hühn were financially supported by the ICML 2009 Student Scholarship Program. Poster presented at ICML 2009, 26th International Conference on Machine Learning, Montreal, Canada, June 2009.
