CMPS-290H: XML Databases


Course Information

Lectures: Monday and Wednesday, 5:00-6:45PM, BE 165

Instructor: Alkis Polyzotis
 Phone: x9-1304
 Office: BE-359B
 Office Hours: Monday, 1:00-3:00PM

Course Description

XML has rapidly evolved from a mark-up language into a de facto standard for data exchange over the Internet. The increasing volume of available XML data foreshadows the development of XML Database Management Systems, which will allow users and applications to query large stores of semi-structured data in a declarative fashion.

The course will cover research topics in XML Databases, with an emphasis on systems. Among the topics covered will be: using relational databases for XML query processing; evaluation techniques for XML queries; XML query optimization; statistics for XML databases; information retrieval techniques for XML; and overall system architecture for native XML databases. The covered material will be based on papers published in well-known database forums.

Students will be expected to present papers in class, write paper reviews, and complete a project. The project can be either an in-depth survey of the existing literature on a specific topic, or an implementation project with a strong research flavor. A list of suggested topics can be found here.

Enrolled students will be expected to have a basic background in database systems (the official prerequisite is CMPS180 or equivalent). Interested students are strongly encouraged to contact the instructor.

Grading

Paper Reviews: 30%
Paper Presentations: 30%
Project: 40%

Syllabus

  • Warm-up
    • Serge Abiteboul, "Querying semi-structured data", ICDT 1997.
    • XML, XPath, XQuery
    • Jennifer Widom, "Data Management for XML: Research Directions", IEEE Data Eng. Bull. 22(3), 1999.
  • Shredding XML to Relations
    • Alin Deutsch, Mary F. Fernandez, Dan Suciu, "Storing Semistructured Data with STORED", SIGMOD 1999.
    • Jayavel Shanmugasundaram, Kristin Tufte, Chun Zhang, Gang He, David J. DeWitt, Jeffrey F. Naughton, "Relational Databases for Querying XML Documents: Limitations and Opportunities", VLDB 1999.
    • Igor Tatarinov, Stratis Viglas, Kevin S. Beyer, Jayavel Shanmugasundaram, Eugene J. Shekita, Chun Zhang, "Storing and querying ordered XML using a relational database system", SIGMOD 2002.
  • XML Database Systems
    • J. McHugh, S. Abiteboul, R. Goldman, D. Quass, and J. Widom. "Lore: A Database Management System for Semistructured Data", SIGMOD Record, 26(3), September 1997.
    • Jeffrey F. Naughton, David J. DeWitt, David Maier, Ashraf Aboulnaga, Jianjun Chen, Leonidas Galanis, Jaewoo Kang, Rajasekar Krishnamurthy, Qiong Luo, Naveen Prakash, Ravishankar Ramamurthy, Jayavel Shanmugasundaram, Feng Tian, Kristin Tufte, Stratis Viglas, Yuan Wang, Chun Zhang, Bruce Jackson, Anurag Gupta, Rushan Chen, "The Niagara Internet Query System", IEEE Data Eng. Bull. 24(2): 27-33 (2001).
    • S. Abiteboul, S. Cluet, G. Ferran, M.-C. Rousset, "The Xyleme project", Computer Networks, Volume 39, Issue 3, June 2002.
    • H. V. Jagadish, Shurug Al-Khalifa, Adriane Chapman, Laks V. S. Lakshmanan, Andrew Nierman, Stelios Paparizos, Jignesh M. Patel, Divesh Srivastava, Nuwee Wiwatwattana, Yuqing Wu, Cong Yu, "TIMBER: A native XML database", VLDB Journal, 11(4), 2002.
  • XPath Evaluation
    • Q. Li, B. Moon, "Indexing and Querying XML Data for Regular Path Expressions", VLDB 2001.
    • Georg Gottlob, Christoph Koch, Reinhard Pichler, "Efficient Algorithms for Processing XPath Queries", VLDB 2002.
  • Tree Pattern Evaluation
    • Shurug Al-Khalifa, H. V. Jagadish, Jignesh M. Patel, Yuqing Wu, Nick Koudas, Divesh Srivastava, "Structural Joins: A Primitive for Efficient XML Query Pattern Matching", ICDE 2002.
    • Nicolas Bruno, Nick Koudas, Divesh Srivastava, "Holistic twig joins: optimal XML pattern matching", SIGMOD 2002.
  • XML Indexing
    • Raghav Kaushik, Philip Bohannon, Jeffrey F. Naughton, Henry F. Korth, "Covering indexes for branching path queries", SIGMOD 2002.
    • Haixun Wang, Sanghyun Park, Wei Fan, Philip S. Yu, "ViST: a dynamic index method for querying XML data by tree structures", SIGMOD 2003.
  • XML Query Algebras
    • Vassilis Christophides, Sophie Cluet, Guido Moerkotte, "Evaluating Queries with Generalized Path Expressions", SIGMOD 1996.
    • H. V. Jagadish, Laks V. S. Lakshmanan, Divesh Srivastava, Keith Thompson, "TAX: A Tree Algebra for XML", DBPL 2001.
  • XML Query Optimization
    • Jason McHugh, Jennifer Widom, "Query Optimization for XML", VLDB 1999.
    • Yuqing Wu, Jignesh M. Patel, H. V. Jagadish, "Structural Join Order Selection for XML Query Optimization", ICDE 2003.
    • Alan Halverson, Josef Burger, Leonidas Galanis, Ameet Kini, Rajasekar Krishnamurthy, Ajith Nagaraja Rao, Feng Tian, Stratis Viglas, Yuan Wang, Jeffrey F. Naughton, David J. DeWitt, "Mixed Mode XML Query Processing", VLDB 2003.
  • XML Statistics
    • Juliana Freire, Jayant R. Haritsa, Maya Ramanath, Prasan Roy, Jerome Simeon, "StatiX: making XML count", SIGMOD 2002.
    • N. Polyzotis, M. Garofalakis, "Statistical Synopses for Graph-Structured XML Databases", SIGMOD 2002.
  • Other Directions
    • Sara Cohen, Jonathan Mamou, Yaron Kanza, Yehoshua Sagiv, "XSEarch: A Semantic Search Engine for XML", VLDB 2003.
    • Ashish Kumar Gupta, Dan Suciu, "Stream processing of XPath queries with predicates", SIGMOD 2003.