Semantic Web Content Analysis - χρήστος ζιγκόλης

  • Uploaded by: chris zlatis
  • October 2019

Semantic Web Content Analysis: A Study in Proximity-Based Collaborative Clustering

Contents…

  • (1) Semantic Web
  • (2) Proximity Based Collaborative Clustering
  • (3) Experimental Studies
  • (4) Semantic Web in Web Intelligence

Semantic Web

Semantic Web – Definition
The Semantic Web “is an evolving extension of the World Wide Web in which the semantics of information and services on the web is defined, making it possible for the web to understand and satisfy the requests of people and machines to use the web content.”
The ultimate goal is to create a global medium for information exchange in which data is available for processing by both humans and machines.

Semantic Web – Architecture
  • URI = a string that identifies a web resource
  • XML = a user-defined syntax for web resources [carries no semantics]
  • RDF = represents information about web resources and the relations between them
  • OWL = extends and enhances RDF with more features
  • Logic / Proof = rules over the semantic relations between data from the lower levels, from which conclusions are extracted
  • Trust = information reliability tests, digital signatures, etc.

Semantic Web – RDF document
[RDF file example; the markup did not survive extraction. Remaining values: Bob Dylan, USA, Columbia, 10.90, 1985, ….]

Semantic Web – Triple Form

Subject – Predicate – Object
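
As a rough illustration of the triple form, the values that survive from the RDF slide can be written as subject, predicate, object tuples and grouped into a small graph. The subject identifier `ex:cd1` and the predicate names (`ex:artist`, `ex:country`, and so on) are assumptions made for this sketch; only the object values come from the slide.

```python
from collections import defaultdict

# Hypothetical subject and predicate names; only the object values
# (Bob Dylan, USA, Columbia, 10.90, 1985) appear on the slide.
SUBJECT = "ex:cd1"
triples = [
    (SUBJECT, "ex:artist", "Bob Dylan"),
    (SUBJECT, "ex:country", "USA"),
    (SUBJECT, "ex:company", "Columbia"),
    (SUBJECT, "ex:price", "10.90"),
    (SUBJECT, "ex:year", "1985"),
]


def to_graph(triples):
    """Group triples by subject: each subject maps to its outgoing edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


graph = to_graph(triples)
for predicate, obj in graph[SUBJECT]:
    # Each line is one labelled edge of the graph representation.
    print(f"{SUBJECT} --{predicate}--> {obj}")
```

Each printed line corresponds to one labelled edge, which is the same information a graph drawing of the triples would carry.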

Semantic Web – Graph Representation

Proximity Based Collaborative Clustering

Proximity Based Collaborative Clustering
Collaborative Clustering …in keywords
  • Several data sets (same objects, different features)
  • Each data set is processed separately
  • Collaboration at the level of the results (information granules)

Proximity measure

$$\mathrm{Prox}_{i,j}[ii] = \sum_{k=1}^{C} \min(u_{ki}, u_{kj}) \quad \rightarrow \quad N \times N \text{ matrix}$$

This mechanism allows us to use a different number of clusters in the processing of each data set.

Proximity Based Collaborative Clustering(2)
The Algorithm Process

DATA SETS: [ {X1, C1}, {X2, C2}, …, {Xp, Cp} ]

1) Compute
   X = [X1|X2|…|Xp]
   U = fcm(X, max(C1, C2, …, Cp))
   Prox(U)
2) For each { X[ii], C[ii] }
   U[ii] = fcm(X[ii], C[ii])
   Prox(U[ii])
3) Repeat
   Optimization of index V → min
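
Steps 1 and 2 can be sketched as follows. The `fcm` function here is a minimal textbook fuzzy c-means written for this example, not the implementation used in the study. The random count data and the cluster counts `[4, 3]` are assumptions standing in for the real data sets.

```python
import numpy as np


def fcm(X, c, m=2.0, iters=100, seed=0):
    """Minimal fuzzy c-means. X is N x d; returns a c x N partition matrix."""
    rng = np.random.default_rng(seed)
    U = rng.random((c, X.shape[0]))
    U /= U.sum(axis=0)  # memberships of each object sum to 1
    for _ in range(iters):
        W = U ** m
        V = (W @ X) / W.sum(axis=1, keepdims=True)  # c x d prototypes
        # Squared Euclidean distance of every object to every prototype (c x N).
        d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)
        inv = np.maximum(d2, 1e-12) ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=0)  # standard FCM membership update
    return U


def prox(U):
    """N x N proximity matrix: entry (i, j) = sum_k min(U[k, i], U[k, j])."""
    return np.minimum(U[:, :, None], U[:, None, :]).sum(axis=0)


# Synthetic stand-in data (assumption): two feature spaces over the same 70 objects.
rng = np.random.default_rng(1)
datasets = [rng.poisson(2.0, (70, 12)).astype(float),
            rng.poisson(2.0, (70, 10)).astype(float)]
cluster_counts = [4, 3]  # hypothetical C1, C2

# Step 1: cluster the concatenated data with the largest cluster count.
X = np.hstack(datasets)
U = fcm(X, max(cluster_counts))
prox_global = prox(U)

# Step 2: cluster each data set on its own, with its own cluster count.
U_local = [fcm(Xi, ci) for Xi, ci in zip(datasets, cluster_counts)]
prox_local = [prox(Ui) for Ui in U_local]
```

All proximity matrices are 70 x 70 even though the partition matrices have different numbers of rows, which is what lets step 3 compare them.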

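Proximity Based Collaborative Clustering(3)

$$V = \lVert \mathrm{Prox}(U) - \mathrm{Prox}(U[1]) \rVert + \lVert \mathrm{Prox}(U) - \mathrm{Prox}(U[2]) \rVert + \dots + \lVert \mathrm{Prox}(U) - \mathrm{Prox}(U[p]) \rVert$$

where ‖·‖ is a distance between proximity matrices. We require that Prox(U) is made as close as possible to the matrices Prox(U[1]), Prox(U[2]), …, Prox(U[p]).

The optimization of V is carried out using a standard gradient-based mechanism:

$$u_{ij}(\text{iteration} + 1) = u_{ij}(\text{iteration}) - \frac{\alpha}{N} \frac{\partial V}{\partial u_{ij}}$$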
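
One gradient step on V can be sketched as below. The slides do not say which matrix distance is used, so this sketch assumes the squared Frobenius norm. It also uses a subgradient of the `min` operator, and it clips and renormalizes U after the step so that it stays a valid partition matrix; those choices and all function names are assumptions of the example.

```python
import numpy as np


def prox(U):
    """N x N proximity matrix: entry (i, j) = sum_k min(U[k, i], U[k, j])."""
    return np.minimum(U[:, :, None], U[:, None, :]).sum(axis=0)


def v_index(U, targets):
    """V as a sum of squared Frobenius distances (assumed norm)."""
    P = prox(U)
    return sum(np.sum((P - T) ** 2) for T in targets)


def v_gradient(U, targets):
    """Subgradient of V with respect to every entry of U (same shape as U)."""
    P = prox(U)
    D = sum(2.0 * (P - T) for T in targets)  # dV/dProx, symmetric N x N
    less = (U[:, :, None] < U[:, None, :]).astype(float)
    tie = (U[:, :, None] == U[:, None, :]).astype(float)
    W = less + 0.5 * tie  # W[k, a, j]: d min(u_ka, u_kj) / d u_ka
    # Factor 2: Prox is symmetric, so u_ka enters both Prox[a, j] and Prox[j, a].
    return 2.0 * np.einsum("aj,kaj->ka", D, W)


def gradient_step(U, targets, alpha=0.05):
    """u <- u - (alpha / N) dV/du, then project back to a partition matrix."""
    N = U.shape[1]
    U_new = U - (alpha / N) * v_gradient(U, targets)
    U_new = np.clip(U_new, 1e-12, None)
    return U_new / U_new.sum(axis=0)


def random_partition(c, n, rng):
    U = rng.random((c, n))
    return U / U.sum(axis=0)


# Toy setting (made-up): 5 objects, global C = 3, local partitions with C = 2 and 4.
rng = np.random.default_rng(0)
U = random_partition(3, 5, rng)
targets = [prox(random_partition(2, 5, rng)), prox(random_partition(4, 5, rng))]
U_next = gradient_step(U, targets)
print(v_index(U, targets), v_index(U_next, targets))
```

Repeating `gradient_step` until V stops changing corresponds to step 3 of the algorithm; the learning rate `alpha` would need tuning for real data.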

Experimental Studies

Experimental Studies
Data Formulation
70 Semantic Web Documents (SWDs) in RDF syntax, grouped according to their main topic:
  • 1-16 : documents with information about phone devices
  • 17-34 : personal homepages
  • 35-51 : people and information about their workplace
  • 52-70 : semantic web area

Experimental Studies
Data Formulation (cont…)
Two Feature Spaces
  • Semantic : a parser extracts the most relevant metadata (12 features)
  • Content-Based : a parser elicits the most meaningful words, which represent the values assumed by the metadata and are enclosed by the meta-tags (10 features)

Experimental Studies(2)
Data Formulation (cont…)
To start the clustering, these two feature spaces have to be expressed as two data matrices.
  • “Semantic” Data Matrix : Rows = 70 SWDs and Columns = 12 semantic features
  • “Content-Based” Data Matrix : Rows = 70 SWDs and Columns = 10 content-based features

Each entry of these matrices represents the number of occurrences of the corresponding feature in the corresponding document.
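
A data matrix of this kind can be sketched with plain substring counting. The two toy documents are invented for the example, the three feature names are taken from the metadata features used in the study, and substring counting is only a stand-in for the parser, whose details the slides do not give.

```python
import numpy as np


def count_matrix(documents, features):
    """Rows = documents, columns = features, entries = occurrence counts."""
    return np.array([[doc.count(f) for f in features] for doc in documents])


# Three of the metadata features; the documents are made-up toy strings.
features = ["foaf:Person", "foaf:knows", "dc:title"]
documents = [
    "foaf:Person foaf:knows foaf:Person",
    "dc:title foaf:Person",
]

M = count_matrix(documents, features)
print(M)
```

With the 70 documents and the full feature lists, the same function would produce the 70 x 12 and 70 x 10 matrices described above.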

Experimental Studies(3)
Metadata-Features
  (1) airport:Airport
  (2) contact:nearestAirport
  (3) foaf:Person
  (4) foaf:knows
  (5) dc:title
  (6) foaf:Document
  (7) prf:NetworkCharacteristic
  (8) prf:HardwarePlatform
  (9) foaf:homepage
  (10) foaf:interest
  (11) prf:CcppAccept
  (12) foaf:Project

Content-based Features
  (1) 2004-XX-XX
  (2) semantic web
  (3) web
  (4) network
  (5) internet
  (6) paper/document
  (7) project
  (8) UTF-8
  (9) technology
  (10) information

[Figure: distribution of the membership grades of certain docs in a particular cluster]

Comparison Issues
  • A unique data set with the 70 SWDs and the 22 features (semantic and content-based): X[70x22] → Standard FCM with C = 4
  • Two different data sets and Proximity Based Collaborative Clustering: X[70x12] and Y[70x10] and a global structure U

Comparison Issues “Proximity-Based VS Standard FCM”
1. The distributions of documents in Clusters 1 and 2 are similar.
2. In the prototype of Cluster 2, the representative features (project, information) have higher values in Proximity-Based than in FCM: 1.40 > 0.54 and 2.35 > 0.79.
3. FCM weakness → unable to discriminate the remaining documents in Clusters 3 and 4; the membership values are close to each other.
   - FCM: docs in range [35-60] → similar membership distribution for each component (Cluster 3, [< 0.44]), and docs in range [48-70] → same effect (Cluster 4, [max = 0.44])
   - Proximity-Based → (Cluster 3, [35-60], [> 0.44, max = 0.72]) and (Cluster 4, [48-70], [> 0.44])

Comparison Issues(2) “Proximity-Based VS Standard FCM”
4. Documents in range [61-70] (the contribution of metadata clustering)
   - Proximity-Based → Cluster 2
   - FCM → do NOT appear in Cluster 2

Proximity-Based collaborative clustering better reflects the partitioning realized in the individual clustering.

Prototypes, Proximity-Based and FCM

[Figure: Proximity-Based prototypes and FCM prototypes]

[Figure: membership distributions per cluster, (1) to (4). Standard FCM panel annotated “values < 0.44”; Proximity-Based panel annotated “values > 0.44”]

Semantic Web in Web Intelligence

Semantic Web in Web Intelligence

Web Until Now
  • Data : “refers to a collection of natural phenomena descriptors, including the results of experience, observation or experiment, or a set of premises.”
  • Information : “is the interpretation of the results that come from data processing”

Semantic Web
  • Knowledge : “well, there is more than one definition”

“We have to extract the hidden knowledge from the web and build an extension. Make the “new” web understandable not only for humans but also for machines”

Semantic Web in Web Intelligence(2)
  • Web Intelligence : “exploits Artificial Intelligence (AI) and advanced Information Technology (IT) on the Web and Internet”
  • Semantic Web needs standards for both syntactic and semantic content → Ontology is a solution
  • Ontologies will enable Web-based knowledge processing, sharing, and reuse between applications. They will also play a major role in supporting information exchange processes.

Semantic Web in Web Intelligence(3)
The roles of ontologies for Web intelligence :
  • communication between Web communities
  • agent communication based on semantics
  • knowledge-based Web retrieval
  • understanding Web contents in a semantic way
  • web community discovery (implicitly-defined community)

Conclusions
Many types of algorithms from the field of Computational Intelligence have been used for problem solving. The web is expanding at great speed and calls for new techniques for organizing information and extracting knowledge from it. So why should we not apply classical algorithms from the field of Computational Intelligence to solve some of the existing problems?

Welcome to the world of Web Intelligence.

Thank you!
