YAN QI
Arizona State University, P.O. Box 878809, Tempe, AZ 85287-8809
Office Phone: (480) 727-3611
Email: [email protected]
http://www.public.asu.edu/~yqi04
OBJECTIVE
To obtain a full-time research and/or development position in information technology, especially database systems, information retrieval and processing.
EDUCATION
• Ph.D. in Computer Science. Arizona State University, Tempe, AZ. Expected, 2009.
  – Thesis: “An Efficient Query Execution Engine for Supporting Exploration and Expert Feedback in Resolving Conflicts during Integration of Metadata”
  – Specialization: Data integration, Information retrieval
  – GPA: 3.8/4.0
• M.S. in Computer Science. University of Science and Technology of China, Hefei, China, 2004.
  – Major: Computer Science
  – Thesis: “Small Sample Neural Network”
  – Specialization: Neural computing, Evolutionary computing
  – GPA: 3.6/4.0
• B.S. in Computer Science. University of Science and Technology of China, Hefei, China, 2001.
  – Major: Computer Science
  – Thesis: “Emotion-based Image Retrieval Engine”
  – Specialization: Image processing and retrieval
  – GPA: 3.7/4.0
ACADEMIC EXPERIENCE
Research Assistant – CSE Department of ASU, Aug, 2004 - Present
• Research and development on a rank-aware SPARQL query engine. The RDF model in Jena is extended to associate confidence values with statements, and the SPARQL query engine in ARQ is modified to efficiently retrieve top-K ranked results.
• Research and development on the tDAR (the Digital Archaeological Record) project. Technologies involved include Java, Struts2, Hibernate, Spring, JSP, and JavaScript; the research relates to data integration, ontology analysis, and database management systems. (tDAR: www.tdar.org)
• Research and development on taxonomy-assisted table summarization; the goal is to implement a system (AlphaSum) that obtains OLAP-like navigable summaries from large RDB tables. Technologies involved include C++ and SQL. (AlphaSum: code.google.com/p/alpha-sum/)
• Research and development on data integration systems; the goal is to capture value and structural conflicts in the integrated data and assist users in resolving conflicts through an interactive, feedback-based process. Research funded by an NSF grant on scientific data integration. Technologies involved include Java GUI programming and graph algorithm design and implementation. (FICSR: code.google.com/p/ficsr/ and K-Shortest Paths: code.google.com/p/k-shortest-paths/)
• Research and development on rank-aware twig-query processing on integrated data and metadata with conflicts: introduced a sum-max monotonicity property of twig queries in terms of ranked structural joins, and leveraged this property to develop a self-punctuating, horizon-based ranked join (HR-Join) operator for ranked twig-query execution on data graphs. The implementation uses multithreading in C++. (HR-Join: code.google.com/p/hr-join/)
• Research and development on an algorithm (CUTS) for topic segmentation of text streams; the goal is to track topic development patterns in a given text stream and identify topic boundaries and topic development tendencies in the stream. CUTS is implemented in a mix of Java and Matlab (CUTS: code.google.com/p/cuts/) and has been embedded in the MAISON system (MAISON: maison.asu.edu).
• Research and development on real-time message filtering systems; the goal is to efficiently filter and match XML messages against local data, leveraging a novel cluster-domain matching scheme.
Selected Course Projects – CSE Department of ASU, Aug, 2004 - May, 2006
• Extending PRIX for Similarity-based XML Query (graduate course on Multimedia and Web Databases): implemented in Java, in collaboration with 3 graduate students. (tinyurl.com/exPRIX)
• GoogleME (graduate course on Semantic Web Mining): uses semantic data mining techniques to provide an enhanced search mechanism based on Google’s search results. Developed in collaboration with 2 graduate students. (tinyurl.com/qyGoogleMe)
• MPS (graduate course on Distributed and Multiprocessor Operating Systems): a message passing system for distributed computing.
Research Assistant – Computer Science Department of University of Science and Technology of China, Feb, 2002 - May, 2004
• Research and development of an interactive genetic algorithm for image retrieval. The research focused on learning user preferences to improve retrieval performance.
• Research and development of a neural network model that computes the force on bridge cables in order to assess the health of a cable-stayed bridge.
Teaching Assistant – Computer Science Department of University of Science and Technology of China, Sep, 2001 - Jan, 2002
• Teaching assistant for the course ‘UML’. Tasks included teaching recitation classes, giving quizzes, and holding weekly office hours.
INDUSTRIAL EXPERIENCE
Research Intern, NEC Labs America, Cupertino, CA
• May, 2008 to Aug, 2008
  – Research and development on the extraction and matching of taxonomies in facet-based deep web.
• June, 2007 to Sep, 2007
  – Research and development on the OLAP operations over heterogeneous data sources.
PUBLICATIONS
Feedback based data integration, exploration and query processing (also see 5,6,7,8,9)
1. K. Selçuk Candan, Huiping Cao, Yan Qi, Maria Luisa Sapino. System Support for Exploration and Expert Feedback in Resolving Conflicts during Integration of Metadata. The International Journal on Very Large Data Bases, Springer Berlin, August 07, 2008.
Topic segmentation of text streams (also see 10,11)
2. Syed Toufeeq Ahmed, K. Selçuk Candan, Sangwoo Han, Yan Qi. Topic Development Pattern Analysis based Adaptation of Information Spaces to Facilitate Accesses to Digital Libraries by Users who are Blind. Accepted by Special issue on Adaptive Hypermedia of the New Review in Hypermedia and Multimedia.
Database summarization with the help of taxonomies
3. K. Selçuk Candan, Huiping Cao, Yan Qi, Maria Luisa Sapino. AlphaSum: Size-Constrained Table Summarization using Value Lattices. 12th International Conference on Extending Database Technology (EDBT), Saint Petersburg, Russia, March 24-26, 2009.
4. K. Selçuk Candan, Huiping Cao, Yan Qi, Maria Luisa Sapino. Table Summarization with the Help of Domain Lattices. Poster. In CIKM ’08: Proceedings of the 2008 ACM CIKM International Conference on Information and Knowledge Management, Napa, USA, 2008.
Feedback based data integration, exploration and query processing (also see 1)
5. Yan Qi, K. Selçuk Candan, Junichi Tatemura, Songting Chen, Fenglin Liao. Supporting OLAP Operations over Imperfectly Integrated Taxonomies. In SIGMOD ’08: Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, pages 875-888, Canada, 2008.
6. Yan Qi, K. Selçuk Candan, Maria Luisa Sapino. Sum-Max Monotonic Ranked Joins for Evaluating Top-K Twig Queries on Weighted Data Graphs. In VLDB ’07: Proceedings of the 33rd International Conference on Very Large Data Bases, pages 507-518, Austria, 2007.
7. Yan Qi, K. Selçuk Candan, Maria Luisa Sapino. FICSR: Feedback-based InConSistency Resolution and Query Processing on Misaligned Data Sources. In SIGMOD ’07: Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data, pages 151-162, USA, 2007.
8. Yan Qi, K. Selçuk Candan, Maria Luisa Sapino, Keith Kintigh. Integrating and Querying Taxonomies with QUEST in the Presence of Conflicts. Demo. In SIGMOD ’07: Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data, pages 1153-1155, USA, 2007.
9. Yan Qi, K. Selçuk Candan, Maria Luisa Sapino, Keith Kintigh. QUEST: QUery-driven Exploration of Semistructured Data with ConflicTs and Partial Knowledge. In CleanDB ’06: Proceedings of the First International VLDB Workshop on Clean Databases, 2006.
Topic segmentation of text streams (also see 2)
10. Qing Li, K. Selçuk Candan, Yan Qi. Extracting Relevant Snippets for Web Navigation. In AAAI ’08: Proceedings of the 23rd AAAI Conference on Artificial Intelligence, pages 1195-1200, USA, 2008.
11. Yan Qi, K. Selçuk Candan. CUTS: CUrvature-based development pattern analysis and segmentation for blogs and other Text Streams. In HYPERTEXT ’06: Proceedings of the 17th ACM Conference on Hypertext and Hypermedia, pages 1-10, Odense, Denmark, 2006.
Research and development of real-time message filtering systems
12. K. Selçuk Candan, Mehmet E. Donderler, Yan Qi, Jaikannan Ramamoorthy, Jong W. Kim. FMware: Middleware for Efficient Filtering and Matching of XML Messages with Local Data. In Middleware ’06: ACM/IFIP/USENIX 7th International Middleware Conference, pages 301-321, Melbourne, Australia, 2006.
TECHNICAL SKILLS
• Languages: C++, Java, Matlab, Visual C++, C, C#, PHP, JSP, JavaScript
• Web technologies: Struts 2, Spring, Hibernate
• Operating Systems: Linux, Windows
• Databases: MySQL, PostgreSQL, Microsoft Access, Berkeley DB, Jena
• Graphics: OpenGL
AWARD
CSE Outstanding Ph.D. Student Award, 2009 (only one awarded each year)