A Database Schema is a set of two files containing certain pieces of meta-information regarding the tables that comprise a NoSQL database. Such two files can take any names, but I will refer to them as the file schema and the field schema. Such schema files are just like any other NoSQL table, and can be manipulated with all the standard NoSQL commands and operators. In the explanations that follow I will make use of an hypothetical "school management" database. Our school has courses, and each course is followed by a certain number of students. We will therefore have three tables in the database:
course
table, containing the list of available courses:Course Credit Room Day Time Teacher ------ ------ ----- ---------- ----- ------- chem-1 2 HA-18 2003-10-12 14:00 1 cs-101 3 DB-1 2003-10-11 10:00 2 econ-1 1 RB-8 2003-10-10 09:00 2 his-10 3 HA-18 2003-11-12 15:00 3
student
table, that lists the students in our school:Student First Last Year Phone ------- ------- ---------- ---- ------- 1 Thomas Jefferson 1 9876 2 George Washington 1 87654 3 Ronald Reagan 1 7654210 4 William Clinton 1 6542109
course_student
table, that tells us which students are enrolled into which courses:Course Student ------ ------- chem-1 1 chem-1 2 chem-1 3 cs-101 2 cs-101 4 econ-1 1 econ-1 2 econ-1 3 his-10 1
And here's what a file schema table for our "school management" database will look like, as displayed by the NoSQL justify
command:
Table Path Edit -------------- ---------------- ---------------------- course ./course course_student ./course.student ./course.student-edits student ./student
Such a schema shows a typical many-to-many database relation (one student can take many courses and one course involves many students). Here's what the table columns mean:
Table
field must be unique across the whole database.Table
entry. In the sample schema above, relative paths are used, and to make clear that they are paths they were prepended by ./
, which is optional for those files that are located in the current working directory, like in this particular example.The second Database Schema file is the field schema. It contains the list of all column names that comprise a database, associated with the table names they belong to. The table names are those listed in the file schema previously explained. Here is the field schema table of our sample school management database:
Column Table Flags Match Parent Insert Delete ------- -------------- ----- ------------ ------ --------- -------- Course course K Restrict Course course_student k course Dependent Credit course Day course h ^[1-3][0-9]$ First student Last student Phone student Room course Student course_student k student Dependent Student student Kb Restrict Teacher course Time course h ^[0-9:]+$ Year student
As you can see, fields that occur in multiple tables are listed multiple times, one time per table, and thus the fieldlist table key is the concatenation of Column
and Table
values.
The Database Schema files can contain additional, application-specific columns beside the basic ones explained above (i.e. to add annotations or anything else); such columns will not be read by NoSQL schema-related operators but their presence will be tolerated.
The Flags
field in the fieldlist table provides further properties concerning each field, namely:
Time
and Day
are flagged as hidden, meaning that we do not want them to appare on output.table._x.Column1.Column2...
). At most 9 multicolumn index files are supported for each table, but each index can be on as many columns as desired.
Each Flags
field may contain multiple flags, such as kh
, Kh
, etc.
The Match
field in the fieldlist table provides a crude method of datatype enforcement. The field is optional and may contain any valid AWK regular expression that the content of the associated table column must match to be considered valid. If any flags in the Flags
column imply any form of data-typing they take precedence over the Match
field. For instance, if one of the flags is N
then NoSQL operators that read the database schema table MUST check for the field not to be null first, and then apply the pattern-matching specified by Match
, if any. In case of conflicts between Flags
and Match
the former MUST take over and the second MUST be ignored by the relevant operators. Likewise, if N
is not listed in Flags
the field content MUST be considered valid if either it is null or it matches the specified pattern.
The Parent
field indicates that the column is a foreign-key of the specified table.
The Insert
field, which also applies only to foreign-keys, specifies whether the presence of the associated primary key is mandatory or not when a new value is inserted into the column being described. If the field contains the keyword "Dependent" then the associated primary key MUST exist, or an insert declaation will trigger an exception in the 'constraint' operator.
The Delete
field, which applies only to primary keys, tells what to do if the primary key being described is deleted. If the field contains the keyword "Restrict" then the primary key may not be deleted until there are foreign keys referring back to it from other tables.
The Parent
, Insert
and Delete
schema fields are used by the 'constraint' NoSQL operator. See contraint's help text for more information.
A copy of the sample database used in the above examples can be downloaded from here.