CHAPTER 13: WHERE Clause


A WHERE clause is used to restrict the records affected by a SELECT, UPDATE, or DELETE command to those satisfying conditions in the WHERE clause. When there is no WHERE clause, all records in the table(s) are affected.

A WHERE clause consists of boolean expressions; that is, expressions that evaluate to true or false. A complex WHERE clause may be built up using boolean expressions that are joined using AND and OR. Precedence in evaluating the parts of a complex WHERE clause can be controlled using parentheses.

A WHERE clause can restrict records based on the values of attributes by comparing expressions, one or both of which may involve attributes. The comparison operators are:

Table 13-1: WHERE Clause Operators

Operator Description
= Equality
!= or <> or ~= Inequality (not equal to)
! Not (negation operator)
> Greater than
>= Greater than or equal to
< Less than
<= Less than or equal to
IN Inclusion in a list of values
RANGE ... Range between a lower an upper value. Range values are in inclusive by default but may be designated to be exclusive.
BETWEEN ... Same as range
LIKE Case-sensitive pattern matching
MATCH Case-insensitive pattern matching
SMATCH Case-sensitive pattern matching

Note that ~ may be used as a synonym for !.

Syntax

WHERE |b_expr
|(b_expr)
|b_expr OR b_expr
|b_expr AND b_expr
|NOT b_expr
|
|
|
|
|

where:

b_expr is a boolean expression. Details on the boolean expression is explained below.

Notes

  1. Boolean expression, b_expr is:
    |expr [
    |
    |IS
    |[IS] NOT
    |] condition
    |
    |
    |
    |NULL expr
    |EXISTS subquery
    |
    |

    expr is an expression. For the definition of expr see the chapter on Expressions.

    condition is one of the following:
    |
    |
    |
    |
    |
    |
    |
    | =
    |!=
    |<>
    | >
    |>=
    | <
    |<=
    |
    |
    |
    |
    |
    |
    |
    |exp
    |Subquery
    |
    |
    |
    |
    |
    |
    |
    |
    |
    |
    |
    |
    IN (expr {, expr})
    IN subquery
    LIKE like_pattern [ESCAPE atom]
    |
    |
    |
    |
    |
    |
    |
    |MATCH
    |!MATCH
    |SMATCH
    |!SMATCH
    | pattern
    |
    |
    |
    |
    |
    |
    |
    |
    |
    [ | =
    |!=
    |] NULL
    |
    |
    |
    |
    |
    |
    BETWEEN |expr
    |subquery
    | [
    |
    |EXCLUSIVE
    |INCLUSIVE
    |] AND
    |
    |expr
    |subquery
    |]
    |
    |EXCLUSIVE
    |INCLUSIVE
    |]
    |
    |
    |
    |
    |
    RANGE |expr
    |subquery
    | [
    |
    |EXCLUSIVE
    |INCLUSIVE
    |] TO
    |
    |expr
    |subquery
    |]
    |
    |EXCLUSIVE
    |INCLUSIVE
    |]
    |
    |
    |

    subquery is a Query Language command of the form:

    [ |ANY
    |ALL
    |] (select_command)
    |

    The like_pattern represents a character string. By default, like_pattern may incorporate the following special characters:

Table 13-2: LIKE Pattern Matching Characters

Character Description
_ (underline) Matches any single character in the position.
% Matches zero or more occurrences of any character in the position.

The atom is a single character that, when preceding the characters _ and %, force Empress to interpret them literally instead of as special characters.

In a subquery, the select_command must select one and only one item, either an attribute or aggregate function over attributes. It may not use grouping, having clauses or print headers. The enclosing parentheses are required. For further details, see the section on Subqueries in the Empress SQL: User's Guide.

BETWEEN ... AND is an alternative to RANGE ... TO. Note that the values must be given in increasing order for any records to meet the condition. MATCH is a dual-case pattern match, while SMATCH is single-case. The ! preceding a MATCH keyword selects all records which do not match. LIKE performs a case sensitive pattern match.

  • A simple boolean expression in a WHERE clause is typically an expression followed by a condition, such as:

       expr = VALUE
    
    

    This boolean expression may have a result of true, false or null. A null result arises if the expression cannot be evaluated for some reason; for instance, if it involves an attribute with a null value in a computation, if a computation fails, or if there is a conversion error.

    Since a boolean expression may have one of three results, a three-valued logic is required to evaluate complex boolean expressions. The logic gives the rules for evaluating expressions constructed from boolean expressions joined with AND and OR conjunctions. The truth tables for this logic are given below.

    Table 13-3: AND conjunction

    T N F
    T T N F
    N N N F
    F F F F

    Table 13-4: OR conjunction

    T N F
    T T T T
    N T N N
    F T N F

    If two boolean expressions are joined by an AND, both expressions must be true in order for the whole boolean expression to be true.

    If two boolean expressions are joined by an OR, either one or both expressions must be true in order for the whole boolean expression to be true.

  • Note that NOT is not equivalent to !. ! applies to an operator, but NOT applies to an expression. When NOT is used in front of an operator its effect is as if it were in front of the expression.

    For example,

       ... WHERE attr NOT = ANY SELECT ...
    
    

    is equivalent to

       ... WHERE NOT (attr = ANY SELECT ...)
    
    

    is equivalent to

       ... WHERE attr != ALL SELECT ...
    
    

  • A pattern match is an operation that compares attribute values against some specified pattern of characters. If the attribute value matches the pattern, then that record satisfies the condition.

    Pattern matching also uses special characters which allow a match to be made on subgroups of characters within an attribute. This enables phrases to be identified within an attribute value. The special characters are:

    Table 13-5: Pattern Matching Characters

    Character Description
    ? Matches any character in the position.
    * Matches zero or more occurrences of any character.
    [...] Matches any of a set of characters in the position.
    Example: [abc] matches a or b or c.
    {...} Matches zero or more occurrences of a fixed length pattern.
    Example: {[a-z]} matches any string of lower case letters.
    [.-.] Matches a range of characters in the position.
    Example: [a-cf-i] matches a, b, c, f, g, h, i.
    [^...] Matches anything but a set or range of characters in the position.
    Example: [^123] matches anything except 1, 2 or 3; [^a-d] matches anything except a, b, c, or d.
    ...|... Requires a match on either side of the bar.
    Example: ab|cd requires a match on ab or cd.
    ...&... Requires a match on both sides of the ampersand.
    Example: [a-z]&[^x] matches any letter except x.
    \ Placed before any of the special characters ?, *, |, &, {, }, [, ], and \ causes that character to be interpreted as an ordinary character rather than a special one.

    Note that the character used to expand variables (defined in MSQLVARCHARS; the dollar $ character by default) has no pattern matching significance and retains its variable expansion function in strings. To represent this character literally anywhere in a string or pattern it must be doubled to prevent its variable expansion function. That is, $abc causes Empress to attempt to replace a variable abc with its value, whereas $$abc evaluates to the string $abc.

    Examples of Pattern Match:

    Example Description
    *ho* This pattern can match any attribute value that consists of zero or more characters, followed by the letters ho, followed by zero or more characters. This pattern would match attribute values such as ho, who, whom, Aho, hoho, Alphonse, etc. If the pattern has been preceded by SMATCH (single case match), then the pattern match is done using the lower case letters ho. If the pattern has been preceded by MATCH, then the match is done using ho, hO, Ho and HO (all upper and lower case combinations.) A ! before MATCH or SMATCH picks those that do not match.
    12|13 This matches the attribute value 12 or 13. Either one or both gives a match.
    R*&*r This matches any attribute value that begins with an R and ends with an r, such as Roger. (This could also be expressed as R*r.)
    *Jun*&*1990* This matches only attribute values containing Jun and 1990, whether they are written as 12 June 1990, Jun 12 1990, 1990 12 June, 1990 June 12, on the 12th day of June in 1990, or some other form. (This is ideal for selecting free-form dates entered as CHAR rather than DATE.)
    f?r This matches any three-letter word starting with f and ending with r: fur, for, fir or far.
    f[uo]r This matches three-letter words starting with f, ending with r, and having u or o as the middle letter fur and for.
    f[^i]r This is like the previous example, except that it allows any letter except i as the middle one. It matches fur, for, and far.
    E_p This matches all strings of three characters that start with E and end with p.
  • Example

    1. The following WHERE clause restricts the records from personnel to those having the name of Jones. Since there is one record for each name in the personnel table, this restriction will have the effect of requesting a single record.

         SELECT * FROM personnel
            WHERE name = 'Jones';
      
      

      It produces the following output:

         number     name     phone        credit_limit
      
          3         Jones    667-2951     $500.00
      
      
    2. The following WHERE clause restricts the records to those having a credit limit greater than or equal to $200.00. Records with credit limit less than $200.00 will not be requested. Note that it is not necessary to include the dollar sign for dollar values when using the Query Language. If the dollar sign is included, it is then necessary to precede it with another dollar sign, so that it will not be mistaken for an Empress variable.

         SELECT * FROM personnel
            WHERE credit_limit >= 200.00;
      
      

      This produces:

         number     name          phone        credit_limit
         
         10         Kilroy        426-9681     $500.00
         5          Mosca         544-2243     $750.00
         17         Wladislaw     723-6073     $200.00
         3          Jones         667-2951     $500.00
         8          Peterson      978-6060     $250.00
      
      
    3. The following WHERE clause restricts the records to those with a name which is not Kilroy and a credit limit of $500.00. Since an AND is used to connect the two conditions, both conditions in parentheses must be true for the record to be requested. <> may be substituted for !=.

         SELECT * FROM personnel
            WHERE name <> 'Kilroy'
            AND credit_limit = '$$500';
      
      

      It produces:

         number     name      phone        credit_limit
      
          3         Jones     667-2951     $500.00
      
      

      Note the double $ to prevent interpretation of the credit limit as a variable.

    4. The following WHERE clause restricts the records from the loans table to those having a date of 16 February 1990 or the name Wladislaw. Since an OR is used to connect the two conditions, either one or the other or both conditions must be true for the record to be requested.

         SELECT * FROM loans
            WHERE date = '16 February 1990'
            OR name = 'Wladislaw';
      
      

      and produces:

         name            date                 amount
      
         Kilroy          16 February 1990     $250.00
         Wladislaw       12 May 1990           $25.00
         Wladislaw       27 February 1990      $55.00
         Wladislaw       25 June 1990          $75.00
      
      
    5. When it is necessary to select records where an attribute may have a number of different values, one way to accomplish this is to construct a complex WHERE clause using OR. For example, to find all loans by Jones, Kilroy, and Mosca, you could use the following to achieve the desired result:

         SELECT * FROM loans
            WHERE name = 'Jones' OR
            name = 'Kilroy' OR name = 'Mosca';
      
      

      There is, however, an alternative way to do this, using a constant list. In this case, the command above becomes:

         SELECT * FROM loans
            WHERE name IN
            ('Jones', 'Kilroy', 'Mosca');
      
      

      which saves considerable typing for a long list. If there is an index on the attribute name, the format should be used to utilize the indexed attribute.

      Similarly, all cases not in a list may be found. To find loans by everyone except Jones, Kilroy, and Mosca, use:

         SELECT * FROM loans
            WHERE name NOT IN
            ('Jones', 'Kilroy', 'Mosca');
      
      
    6. The following would be used to find all employees who do not have a phone number entered. The keyword NULL indicates data not entered in the record.

         SELECT FROM personnel
            WHERE phone = NULL;
      
      

      For this table, this command produces the following output:

         number     name     phone          credit_limit
      
      

      There are no records with null values for phone, so only the attribute headings are printed. If preferred, it is possible to use the keywords IS and IS NOT instead of = and != when finding null or not-null values. Thus, the command:

         SELECT FROM personnel
            WHERE phone IS NULL;
      
      

      has exactly the same effect as the command above, while:

         SELECT FROM personnel
            WHERE phone IS NOT NULL;
      
      

      finds all the employees who do have phone numbers recorded.

    7. The following would be used to find loans made before May 1990 by Jones or after April 1990 by Wladislaw. Conditions in parentheses are evaluated first. The parentheses are used to group conditions so that each group of name and date is requested as a unit and evaluated first.

      Since an OR is used to connect two conditions in parentheses, either the first or the second or both conditions must be true for the record to be requested. Note that for this particular case it is impossible for both conditions in parentheses to be true at the same time for the same record.

         SELECT FROM loans
            WHERE (name = 'Jones' AND date < '1 May 1990')
            OR (name = 'Wladislaw' AND date > '30 April 1990');
      
      

      It produces:

         name               date                 amount
      
         Jones               7 February 1990     $33.95
         Jones               3 April 1990        $25.00
         Wladislaw          12 May 1990          $25.00
         Wladislaw          25 June 1990         $75.00
      
      

      Parentheses may be included in a WHERE clause simply to emphasize the logic, if so desired.

    8. The following WHERE clause restricts the records to those with loans larger than half the employee's credit limit.

         SELECT name, date, amount
            FROM loans, personnel
            WHERE amount > 0.5 * credit_limit
            AND loans.name = personnel.name;
      
      

      This produces:

         name               date                  amount
      
         Jones              12 August 1990        $300.00
         Scarlatti           9 September 1990     $150.00
         Scarlatti          14 October 1990       $125.00
      
      
    9. You could print the name, phone, and credit limit of all employees who currently have loans with the command:

         SELECT name, phone, credit_limit
            FROM personnel WHERE exists
            (SELECT name FROM loans
             WHERE loans.name = personnel.name);
      
      

      which produces:

         name          phone           credit_limit
         
         Jones         667-2951        $500.00
         Kilroy        426-9681        $500.00
         Mosca         544-2243        $750.00
         Peterson      978-6060        $250.00
         Scarlatti     961-7363        $100.00
         Wladislaw     723-6073        $200.00
      
      

      The following WHERE clause finds all loans less than the average loan amount over all loans, by using a nested select statement to find the average.

         SELECT FROM loans
            WHERE amount < (SELECT AVG amount FROM loans);
      
      

      It produces:

         name            date                amount
         
         Jones            7 February 1990    $33.95
         Wladislaw       27 February 1990    $55.00
         Jones            3 April 1990       $25.00
         Wladislaw       12 May 1990         $25.00
         Peterson         6 June 1990        $50.00
         Wladislaw       25 June 1990        $75.00
      
      

      This example prints the names and credit limits of all employees with single loans exceeding $200 outstanding.

         SELECT name, credit_limit FROM personnel
            WHERE name= ANY
            (SELECT name FROM loans WHERE amount > 200);
      
      
         name         credit_limit
      
         Kilroy       $500.00
         Jones        $500.00
      
      
    10. The following WHERE clause restricts the records to those having a pattern of characters in phone that matches the pattern of the number 9, followed by zero or more of any other characters. This could be used to find all employees at a given local phone exchange. In this case, it will find the records with the phone numbers 978-6060 and 961-7363.

         SELECT * FROM personnel
            WHERE phone MATCH '9*';
      
      

      It produces:

         number    name        phone       credit_limit
      
         8        Peterson     978-6060    $250.004
                  Scarlatti    961-7363    $100.00
      
      

      This command may also be expressed as:

         SELECT * FROM personnel
            WHERE phone LIKE '9%';
      
      

      The following WHERE clause restricts records to those having a pattern of characters in date that matches the pattern of zero or more characters, a blank, the letters May followed by a blank, followed by zero or more characters. This would find all the loans which were made during May in any year.

         SELECT * FROM loans
            WHERE date MATCH '* May *';
      
      

      It produces:

         name            date           amount
      
         Mosca            4 May 1990    $200.00
         Wladislaw       12 May 1990     $25.00
      
      

      Since the only letters in a date are in the month, a more terse way to look for all May entries is to look for the letters May preceded and followed by anything:

         SELECT * FROM loans
            WHERE date MATCH '*May*';
      
      

      To find all loans not made in May, the appropriate command is:

         SELECT * FROM loans WHERE date !MATCH '*May*';
      
      

      or

         SELECT * FROM loans WHERE date NOT LIKE '%May%';
      
      

      which produces:

         name               date                 amount
         
         Mosca               2 February 1990     $150.00
         Jones               7 February 1990      $33.95
         Kilroy             16 February 1990     $250.00
         Wladislaw          27 February 1990      $55.00
         Jones               3 April 1990         $25.00
         Peterson            6 June 1990          $50.00
         Wladislaw          25 June 1990          $75.00
         Jones              12 August 1990       $300.00
         Scarlatti           9 September 1990    $150.00
         Scarlatti          14 October 1990      $125.00
      
      

      The effect of SMATCH may be shown by a selection from the auto parts table. The command:

         SELECT supplier, phone, ATTR 'part name', price
            FROM 'auto parts'
            WHERE ATTR 'part name' SMATCH '*chrome*';
      
      

      produces:

         supplier           phone        part name              price
      
         AAA Automotive     922-8624     Fender--chrome          $67.95
      
      

      with just a single record, whereas the command:

         SELECT supplier, phone, ATTR 'part name', price
            FROM 'auto parts'
            WHERE ATTR 'part name' MATCH '*chrome*';
      
      

      produces the following with two records:

         supplier             phone        part name              price
      
         AAA Automotive       922-8624     Fender--chrome        $67.95
         Auto Wholesalers     922-6219     Chrome paint           $1.99
      
      

      Note that the complex attribute name part name must be preceded by the keyword ATTR to indicate that it is an attribute, not an expression.

    11. To find values lying within a given range, the keyword RANGE should be used in preference to a complex WHERE clause such as:

         WHERE ATTRIBUTE first_constant AND ATTRIBUTE second_constant
      
      

      The range must be given from a lower to a higher value for any record to meet the condition. Selecting a range from a higher to a lower value does not produce an error, but no records will be selected. The records with values equal to the upper and lower range limits are included by default. These values can be excluded from the WHERE clause by specifying EXCLUSIVE in the WHERE condition.

      Note that BETWEEN ... AND may be used instead of RANGE ... TO.

      The following would be used to find all loans between $100 and $200, inclusive:

         SELECT name, date, amount FROM loans
            WHERE amount RANGE 100 to 200;
      
      

      It produces:

         name               date                 amount
      
         Mosca               2 February 1990     $150.00
         Mosca               4 May 1990          $200.00
         Scarlatti           9 September 1990    $150.00
         Scarlatti          14 October 1990      $125.00
      
      

      By contrast, the command:

         SELECT name, date, amount FROM loans
            WHERE amount RANGE 100 EXCLUSIVE TO 200 EXCLUSIVE;
      
      

      produces only:

         name               date                amount
      
         Mosca              2 February 1990      $150.00
         Scarlatti          9 September 1990     $150.00
         Scarlatti          9 September 1990     $125.00
      
      

      The following finds all loans over $200.00 but less than or equal to $300.00:

         SELECT name, date, amount FROM loans
            WHERE amount RANGE 200 EXCLUSIVE TO 300;
      
      

      It produces:

         name            date                 amount
      
         Kilroy          16 February 1990     $250.00
         Jones           12 August 1990       $300.00
      
      
    12. Use BETWEEN ... AND to specify a range of acceptable values. BETWEEN ... AND may be used as a synonym for RANGE ... TO. A BETWEEN ... AND can also involve a complex boolean expression, but is then evaluated more slowly.

         SELECT * FROM table WHERE
            attr BETWEEN value AND value;
      
      

      is equivalent to:

         SELECT * FROM table WHERE
            attr RANGE value TO value;
      
      

      Thus, the example in the previous section which finds credit limits between $100 and $200 can also be expressed as:

         SELECT name, date, amount FROM loans
            WHERE amount BETWEEN 100 EXCLUSIVE AND 200 EXCLUSIVE;
      
      

      and produces exactly the same output:

         name                date                 amount
      
         Mosca               2 February 1990      $150.00
         Scarlatti           9 September 1990     $150.00
         Scarlatti           9 September 1990     $125.00
      
      

      As with RANGE ... TO, the values must be given in increasing order for any record to meet the condition.