OttoQL is a universal query language for tables and documents, which was first implemented for XML. It has a very simple syntax. The operations are generally applied sequentially to all corresponding tuples or subtuples. In the following program an XML document is given as a table:

BMI example in OttoQL
<pre>
<<L(NAME, LENGTH, L(AGE, WEIGHT))::
Klaus    1.68 18 61
              30 65
              56 80
Rolf     1.78 40 72
Kathi    1.70 18 55
              40 70
Valerie  1.00  3 16
Viktoria 1.61 13 51
Bert     1.72 18 66
              30 70 >>
mit NAME: AGE>20                          # with: selection
ext BMI:=(WEIGHT div LENGTH**2)           # ext: introduction of a new column
gib BMIAVG,M(AGE,BMIAVG,B(BMI,NAME)) &&   # give me; && connects two lines to a logical unit
    BMIAVG:=avg(BMI)
round 2
</pre>

Even without tags it is visible that the tuple of Klaus ends with 80 and that Klaus has 3 subtuples. In this structure, for example, AGE is subordinated to NAME. In the gib-part this hierarchy is inverted simply by giving the scheme, or more generally the DTD (Document Type Definition), of the desired XML document. Here M (German: Menge) abbreviates set, B bag and L list.

In the above example, however, a selection is applied first. Instead of mit, ohne (without) can also be used. The above selection discards all tuples that have no AGE entry greater than 20. These are Valerie and Viktoria. The first subtuple of Klaus still remains in the table, because NAME: expresses that only complete tuples are selected, not subtuples. To omit subtuples, NAME has to be replaced by AGE or WEIGHT. Each of the following two conditions selects in both lists: <code>mit NAME, AGE: AGE>20</code> and <code>mit AGE>20</code>.

An ext-part extends the table by a new column (extension). Column names from different levels can be used here without introducing variables. The Body-Mass-Index column is introduced to the right of WEIGHT. Note that BMI values are computed not only for the length 1.68 and the weight 61, but also for 1.68 and the weight in the second row (65).
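The effect of the selection and the extension can be approximated in ordinary Python. The following is only a sketch of the semantics described above, not OttoQL's implementation; the function names `select_tuples`, `select_both` and `extend_bmi` are hypothetical, and the reading of `mit NAME, AGE: AGE>20` as "drop non-matching tuples and non-matching subtuples" is an assumption.

```python
# Nested table L(NAME, LENGTH, L(AGE, WEIGHT)) as plain Python data.
people = [
    ("Klaus", 1.68, [(18, 61), (30, 65), (56, 80)]),
    ("Rolf", 1.78, [(40, 72)]),
    ("Kathi", 1.70, [(18, 55), (40, 70)]),
    ("Valerie", 1.00, [(3, 16)]),
    ("Viktoria", 1.61, [(13, 51)]),
    ("Bert", 1.72, [(18, 66), (30, 70)]),
]

def select_tuples(table, min_age):
    """Analogue of `mit NAME: AGE>20` (assumed reading): keep a complete
    tuple if at least one of its subtuples has AGE > min_age."""
    return [row for row in table if any(age > min_age for age, _ in row[2])]

def select_both(table, min_age):
    """Analogue of `mit NAME, AGE: AGE>20` (assumed reading): additionally
    drop the subtuples that do not satisfy the condition."""
    result = []
    for name, length, subs in table:
        kept = [(age, weight) for age, weight in subs if age > min_age]
        if kept:
            result.append((name, length, kept))
    return result

def extend_bmi(table):
    """Analogue of `ext BMI:=(WEIGHT div LENGTH**2)`: add a BMI column to
    every subtuple, combining LENGTH (outer level) with WEIGHT (inner level)."""
    return [
        (name, length, [(age, weight, weight / length**2) for age, weight in subs])
        for name, length, subs in table
    ]

selected = select_tuples(people, 20)
selected_both = select_both(people, 20)
extended = extend_bmi(selected)
```

In this sketch `selected` still contains all three subtuples of Klaus, whereas `selected_both` keeps only the two with an AGE above 20.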
Besides restructuring within non-recursive DTDs, a gib-part can also realize the following tasks:
* sorting (M, B) by the first fields of the collections (M-, B-: descending)
* aggregation (simultaneously horizontal and vertical)
* elimination of duplicates (M, M-)
* joins and unions
* projections
* group-by and nest
* unnest
* tagging

The last operation (round) rounds all numbers that occur in the result of the gib-part to 2 digits after the decimal point. Binary operations are written infix in OttoQL. The above program thus realizes the following query: For all persons who have an AGE entry greater than 20, find the overall average BMI, the average BMI per age level, and the BMI of each person at each age. Sort by AGE and, within an AGE group, by BMI. The result as a table:
<pre>
<<BMIAVG,M(AGE, BMIAVG, B(BMI, NAME))::
23.12 18 20.98 19.03 Kathi
               21.61 Klaus
               22.31 Bert
      30 23.34 23.03 Klaus
               23.66 Bert
      40 23.47 22.72 Rolf
               24.22 Kathi
      56 28.34 28.34 Klaus>>
</pre>

Independence of the data structure

The operations of OttoQL need a DTD, because the system has to be able to recognize what is a collection and what is a tuple. Nevertheless, the important operations of OttoQL are largely independent of the DTD. The above BMI example also works if the given table is flat (L(NAME, LENGTH, AGE, WEIGHT)) or inversely structured (M(WEIGHT, L(NAME, LENGTH, AGE))). This property is important if OttoQL is to be used by search engines.

Development

The basic ideas of the most important operations of OttoQL were already presented in earlier publications and were later extended. These publications, however, do not yet contain a generalization to XML. Andreas Hauptmann, Martin Schnabel and Dmitri Schamschurko made great contributions to the present implementation. The algebraic background of OttoQL can be found in the paper by Reichel.
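As a rough illustration of what the gib-part of the BMI example computes, the restructuring into BMIAVG, M(AGE, BMIAVG, B(BMI, NAME)) can be sketched in Python. This is not how OttoQL is implemented; the function name `gib_bmi` and the representation of sets and bags as sorted Python lists are assumptions made for this sketch.

```python
from collections import defaultdict

# Input table L(NAME, LENGTH, L(AGE, WEIGHT)) from the BMI example.
people = [
    ("Klaus", 1.68, [(18, 61), (30, 65), (56, 80)]),
    ("Rolf", 1.78, [(40, 72)]),
    ("Kathi", 1.70, [(18, 55), (40, 70)]),
    ("Valerie", 1.00, [(3, 16)]),
    ("Viktoria", 1.61, [(13, 51)]),
    ("Bert", 1.72, [(18, 66), (30, 70)]),
]

def gib_bmi(table, min_age=20, digits=2):
    # Selection: keep complete tuples with at least one AGE > min_age.
    kept = [row for row in table if any(age > min_age for age, _ in row[2])]

    # Extension and unnesting: one flat (NAME, AGE, BMI) row per subtuple.
    flat = [
        (name, age, weight / length**2)
        for name, length, subs in kept
        for age, weight in subs
    ]

    # Outer aggregate: average over all BMI values.
    overall = sum(bmi for _, _, bmi in flat) / len(flat)

    # Invert the hierarchy: group the (BMI, NAME) pairs under each AGE.
    groups = defaultdict(list)
    for name, age, bmi in flat:
        groups[age].append((bmi, name))

    # Sort by AGE, and within an AGE group by BMI; rounding is applied last.
    by_age = [
        (
            age,
            round(sum(bmi for bmi, _ in members) / len(members), digits),
            [(round(bmi, digits), name) for bmi, name in sorted(members)],
        )
        for age, members in sorted(groups.items())
    ]
    return round(overall, digits), by_age

bmi_avg, by_age = gib_bmi(people)
```

Because the function first flattens its input to (NAME, AGE, BMI) rows, only that flattening step would have to change for a flat or inversely structured input table. The exact rounding behaviour of OttoQL is not modelled, so the last digit of an average may differ from the result table.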