Onyx Version 2: Normalization Rules

Changelog

Date Description
16-Dec-2006 First import of rules


Normalization Basics

In Onyx there are certain constructs of the language that are factored out of the AST representation, or represented differently. The process of factoring is called normalization. In the version of Onyx used in this class, normalization rules apply exclusively to flwr expressions. The XQuery standard defines a wide range of normalization rules, which are described in the XQuery formal semantics document. These are used to simplify the semantic specification of the language, since XQuery provides redundant constructs (diffedrent ways of saying the same thing). to

Flwr Normalization

There are no flwr nodes in the Onyx AST. They are instead replaced with an appropriate ForExpression or LetExpression node, corresponding to the highest for or let node within the flwr expression. The first child of a for or let expression defines what value(s) its variable is bound to and second defines the resulting value. Thus the second child is actually the nested expression within it.

It should be noted that the statement:

let $i := 5, $j := 6 return $i*$j

is equivalent to

        let $i := 5
            let $j := 7
                return $i*$j

The same is true for commas in for expressions. The comma is simply syntactic sugar or short-hand for writing onyx code. When being evaluated such an expression is evaluated as two separate expression.

When a flwr expression is normalized there are no such things as where or return expressions. A return expression is simply normalized to the node within it. For example return 5 is normalized to simply 5.

Here is the parse tree corresponding to the above expression:
<?xml version="1.0" encoding="UTF-8"?>
<onyx.ast.Query>
   <onyx.ast.ExprList>
      <onyx.ast.LetExpression name="$i">
         <onyx.ast.Constant datatype="onyx.types.Integer">5
         <onyx.ast.LetExpression name="$j">
            <onyx.ast.Constant datatype="onyx.types.Integer">6
            <onyx.ast.Operator name="op:numeric-multiply">
               <onyx.ast.Variable>$i
               <onyx.ast.Variable>$j
            </onyx.ast.Operator>
         </onyx.ast.LetExpression>
      </onyx.ast.LetExpression>
   </onyx.ast.ExprList>
</onyx.ast.Query>

A where expression is normalized to an ifThenElse expression. For example where true return 5 is normalized to: if (true) then 5 else (). This ifThenElse expression is semantically equivalent in onyx to the where expression. The advantage of normalizing a where to an ifThenElse expression is to reuse an existing node type rather than having to define a new one specialized for where only.

When normalizing where expressions we run into the problem that the interpreter loses knowledge that it is actually evaluating a where expression. This can cause problems when reporting errors relating to the where. Thus to give the interpreter correct knowledge a new attribute is introduced to the ifThenElse AST node called normalized. The value of this attribute is the name of the expression it normalized, which in this case is where. Thus the interpreter can check this attribute when evaluating the expression so it reports the proper errors.

The following examples give a flwr expression, it's normalized expression equivalent (in onyx like syntax) and a normalized AST representation of the expression.

Flwr Normalization Examples

Example 1

Expression:

for $i in (1,2,3) return 5

Normalized Expression Equivalent (not legal onyx):

for $i in (1,2,3)
    5

Normalized AST:

<onyx.ast.ForExpression name="$i">
    <onyx.ast.ExprList>
        <onyx.ast.Constant datatype="onyx.types.Integer">1</onyx.ast.Constant>
        <onyx.ast.Constant datatype="onyx.types.Integer">2</onyx.ast.Constant>
        <onyx.ast.Constant datatype="onyx.types.Integer">3</onyx.ast.Constant>
    </onyx.ast.ExprList>
    <onyx.ast.Constant datatype="onyx.types.Integer">5</onyx.ast.Constant>
</onyx.ast.ForExpression>
			 

Example 2:

Expression:

for $i in (1,2,3), $j in (4,5,6) where $i > 2 return true

Normalized Expression Equivalent (not legal onyx):

for $i in (1,2,3)
    for $j in (4,5,6)
        if ($i > 2) then true else ()

Normalized AST:

<onyx.ast.ForExpression name="$i">
    <onyx.ast.ExprList>
        <onyx.ast.Constant datatype="onyx.types.Integer">1</onyx.ast.Constant>
        <onyx.ast.Constant datatype="onyx.types.Integer">2</onyx.ast.Constant>
        <onyx.ast.Constant datatype="onyx.types.Integer">3</onyx.ast.Constant>
    </onyx.ast.ExprList>
    <onyx.ast.ForExpression name="$j">
        <onyx.ast.ExprList>
            <onyx.ast.Constant datatype="onyx.types.Integer">4</onyx.ast.Constant>
            <onyx.ast.Constant datatype="onyx.types.Integer">5</onyx.ast.Constant>
            <onyx.ast.Constant datatype="onyx.types.Integer">6</onyx.ast.Constant>
        </onyx.ast.ExprList>
        <onyx.ast.IfThenElseExpr normalized="where">
            <onyx.ast.Operator name="op:greater-than">
                <onyx.ast.Variable>$i</onyx.ast.Variable>
                <onyx.ast.Constant datatype="onyx.types.Integer">2</onyx.ast.Constant>
            </onyx.ast.Operator>
            <onyx.ast.Constant datatype="onyx.types.Boolean">true</onyx.ast.Constant>
            <onyx.ast.ExprList/>
        </onyx.ast.IfThenElseExpr>
    </onyx.ast.ForExpression>
</onyx.ast.ForExpression>
            

Example 3:

Expression:

for $i in (1,2,3), $j in (4,5,6) let $k := 3 for $l in (7,8,9) where $k < $j return 5

Normalized Expression Equivalent (not legal onyx):

for $i in (1,2,3)
    for $j in (4,5,6)
        let $k := 3
            for $l in (7,8,9)
               if  ($k < $j) then 5 else ()

Normalized AST:

<onyx.ast.ForExpression name="$i">
    <onyx.ast.ExprList>
        <onyx.ast.Constant datatype="onyx.types.Integer">1</onyx.ast.Constant>
        <onyx.ast.Constant datatype="onyx.types.Integer">2</onyx.ast.Constant>
        <onyx.ast.Constant datatype="onyx.types.Integer">3</onyx.ast.Constant>
    </onyx.ast.ExprList>
    <onyx.ast.ForExpression name="$j">
        <onyx.ast.ExprList>
            <onyx.ast.Constant datatype="onyx.types.Integer">4</onyx.ast.Constant>
            <onyx.ast.Constant datatype="onyx.types.Integer">5</onyx.ast.Constant>
            <onyx.ast.Constant datatype="onyx.types.Integer">6</onyx.ast.Constant>
        </onyx.ast.ExprList>
        <onyx.ast.LetExpression name="$k">
            <onyx.ast.Constant datatype="onyx.types.Integer">3</onyx.ast.Constant>
            <onyx.ast.ForExpression name="$l">
                <onyx.ast.ExprList>
                    <onyx.ast.Constant datatype="onyx.types.Integer">7</onyx.ast.Constant>
                    <onyx.ast.Constant datatype="onyx.types.Integer">8</onyx.ast.Constant>
                    <onyx.ast.Constant datatype="onyx.types.Integer">9</onyx.ast.Constant>
                </onyx.ast.ExprList>
                <onyx.ast.IfThenElseExpr normalized="where">
                    <onyx.ast.Operator name="op:less-than">
                        <onyx.ast.Variable>$k</onyx.ast.Variable>
                        <onyx.ast.Variable>$j</onyx.ast.Variable>
                    </onyx.ast.Operator>
                    <onyx.ast.Constant datatype="onyx.types.Integer">5</onyx.ast.Constant>
                    <onyx.ast.ExprList/>
                </onyx.ast.IfThenElseExpr>
            </onyx.ast.ForExpression>
        </onyx.ast.LetExpression>
    </onyx.ast.ForExpression>
</onyx.ast.ForExpression>