Skip to main content

Databases: Semistructured Data

This course is one of five self-paced courses on the topic of Databases, originating as one of Stanford's three inaugural massive open online courses released in the fall of 2011. The original "Databases" courses are now all available on edx.org.

Part of the Databases series, this is a standalone course; learners seeking to develop an understanding of topics in this course do not need to take other Databases courses. This course covers the JSON and XML standards for semistructured data, along with query languages and schema declaration features for XML.

  • The XML Data section of this course introduces the XML model for semistructured and self-describing data, including DTDs and some features of XML Schema.
  • The JSON Data section of this course introduces the JSON model for human-readable structured or semistructured data.
  • The XPath and XQuery section of this course covers the XPath language for processing XML data, along with many features of the more advanced XQuery language.
  • The XSLT section of this course provides a general introduction to the XSLT rule-based language for querying and transforming XML data.
Databases: Semistructured Data

There is one session available:

After a course session ends, it will be archived.
Starts Oct 15
Ends Dec 31
Estimated 2 weeks
8–10 hours per week
Self-paced
Progress at your own speed
Free
Optional upgrade available

About this course

Skip About this course

About the Database Series of Courses

"Databases" was one of Stanford's three inaugural massive open online courses in the fall of 2011. It has been offered in synchronous and then in self-paced versions on a variety of platforms continuously since 2011. The material is now being offered as a set of five self-paced courses, which can be taken in a variety of ways to learn about different aspects of databases.

Relational Databases and SQL is the most popular course in the Databases series. It is applicable to learners seeking to gain a strong understanding of relational databases, and to master SQL, the long-accepted standard query language for relational database systems. Additional courses focus on advanced concepts in relational databases and SQL, formal foundations and database design methodologies, and semistructured data.

All of the courses are based around video lectures and demos. Many of them include quizzes between video segments to check understanding, in-depth standalone quizzes, and/or a variety of automatically-checked interactive exercises. Each course also includes an unmoderated discussion forum and pointers to readings and resources. The courses are described briefly below. Taught by Professor Jennifer Widom, the overall curriculum draws from Stanford's popular longstanding Databases course.

Why Learn About Databases

Databases are incredibly prevalent -- they underlie technology used by most people every day if not every hour. Databases reside behind a huge number of websites; they're a crucial component of telecommunications systems, banking systems, video games, and just about any other software system or electronic device that maintains some amount of persistent information. In addition to persistence, database systems provide a number of other properties that make them exceptionally useful and convenient: reliability, efficiency, scalability, concurrency control, data abstractions, and high-level query languages. Databases are so ubiquitous and important that computer science graduates frequently cite their database class as the one most useful to them in their industry or graduate-school careers.

At a glance

  • Language: English
  • Video Transcript: English

What you'll learn

Skip What you'll learn

Stanford's online offering in Databases is now available as a set of five self-paced courses:

Databases: Relational Databases and SQL

  • Introduction to the relational model and concepts in relational databases and relational database management systems
  • Comprehensive coverage of SQL, the long-accepted standard query language for relational database management systems

Databases: Advanced Topics in SQL (prerequisite: Relational Databases and SQL)

  • Creating indexes for increased query performance
  • Using transactions for concurrency control and failure recovery
  • Database constraints: key, referential integrity, and "check" constraints
  • Database triggers
  • How views are created, used, and updated in relational databases
  • Authorization in relational databases

Databases: OLAP and Recursion

  • Star schemas, the data cube concept, and On-Line Analytical Processing (OLAP) features in relational databases including the Cube and Rollup operators
  • The SQL standard for queries over recursively-defined relations

Databases: Modeling and Theory

  • Relational algebra – the algebraic query language that provides the formal foundations of SQL
  • Dependency theory and normal forms in relational databases as the basis of schema design
  • The data-modeling component of the Unified Modeling Language (UML), how UML diagrams are translated to relations

Databases: Semistructured Data

  • The XML model for semistructured and self-describing data, including DTDs and some features of XML Schema
  • The JSON model for human-readable structured or semistructured data
  • The XPath language for processing XML data, and many features of the more advanced XQuery language
  • An introduction to the XSLT rule-based language for querying and transforming XML data

About the instructors

Frequently Asked Questions

Skip Frequently Asked Questions

How long will it take to go through the course material?

All courses in the Databases series are self-paced and include videos, quizzes, and/or exercises. The courses vary considerably in length and complexity, and some students work faster than others, so we're not able to predict an individual time commitment.

What background do I need?

The series of courses does not assume prior knowledge of any specific topics, however a solid computer science foundation -- a reasonable amount of programming, as well as knowledge of basic computer science theory -- will make the material more accessible.

Do I need to buy a textbook?

Detailed lecture notes are provided. Having a textbook in addition to the notes is not necessary, but you might want to purchase one for reference, to reinforce the core material, and as a source of additional exercises. Suggested textbooks and readings are listed as part of the materials.

Interested in this course for your business or team?

Train your employees in the most in-demand topics, with edX for Business.