Feature highlight: The Where query builder.

Posted 6 years ago.

It often happens you can not know in advance what is the condition of your query going to be. Take by example the search form of an electronic part store web site. Just for the resistor component, the possible search criteria are:

  • value in ohms from 0.1 to 10 million with more than 20 possible values in every power of ten.
  • precision (0.1%, 1%, 5% and 10%)
  • power dissipation from 0.125 to 100W
  • type (carbon, metal, wire wound, cement, dissipated)

A real shop would have more criteria than this but this is a good start since it already makes several thousand possible choices.

Let's start with a simplistic (and unrealistic) approach that would make users to be able to select at most only one value of each field. The returned data is then an array of values formatted as follow:

[ "value" => "2200", "precision" => "10", "power_dissipation" => "1", "type" => "metal" ]

Needless to say these values MUST be validated and cleaned before being used for our database.

Of course, less or even none of the values are possible. How to handle this search and the case no criteria is entered ? This is exactly what the Where class stands for.

<?php // /Store/Catalog/ResistorMap.php
// ...

    public function search(Array $criteria)
    {
        $where = new \Pomm\Query\Where();

        foreach ($criteria as $field => $value)
        {
            $where->andWhere(sprintf("%s = ?", $field), array($value));
        }

        return $this->query($where); // SELECT ... WHERE field1 = ? AND field2 = ? AND ...
    }

This simple example shows how handy the Where class is. It handles the values for us and formats the where condition by itself. The issued query for the fields given in the array above would be :

SELECT 
  component_reference,
  name,
  value,
  precision,
  power,
  type
FROM
  catalog.resistor
WHERE
    value = 2200
  AND
    precision = 10
  AND
    power_dissipation = 1
  AND
    type = 'metal'
;

What if no criteria is passed to the search method ? The Where instance would have no criteria nor data and would issue the following query:

SELECT ... WHERE true;

The Postgresql's planner will instantly understand this where clause to be always true and discard it with no performance loss.

What if we allow our user to select multiple values for each criteria ? The array returned by the form now looks like:

[ "value" => ["1000", "1200", "1500", "1800", "2000"], "precision" => [ "5", "1", "0.1" ], "power_dissipation" => [ "1" ], "type" => [ "metal", "wire wound" ] ]

This means we want all resistors with values of 1000 OR 1200 OR ... AND precision equals to 5 OR 1 OR 0.1 AND power_dissipation = 1 AND ...

The search method becomes:

<?php // /Store/Catalog/ResistorMap.php
// ...

    public function search(Array $criteria)
    {
        $where = new \Pomm\Query\Where();

        foreach ($criteria as $field => $values)
        {
            $sub_where = new \Pomm\Where();

            foreach ($values as $value)
            {
                $sub_where->orWhere(sprintf("%s = ?", $field), array($value));
            }

            $where->andWhere($sub_where);
        }

        return $this->query($where); // SELECT ... WHERE (field1 = ? OR field1 = ?) AND (field2 = ? OR ...
    }

Note how the clause builder handles the logical precedence using parenthesis. Although this query is correct from a logical point of view, it creates a complex query the planner may not optimize well, it would be easier and better if using the IN SQL operator:

<?php // /Store/Catalog/ResistorMap.php
// ...

    public function search(Array $criteria)
    {
        $where = new \Pomm\Query\Where();

        foreach ($criteria as $field => $values)
        {
            $where->andWhere(\Pomm\Where()::createIn($field, $values));
        }

        return $this->query($where); // SELECT ... WHERE field1 IN (?, ?, ?) AND field2 IN (?, ?)
    }

The good news is that the where class manages the variable number of values to escape.

This method is probably good enough for the majority of catalog search but, knowing our business makes us able to do the things a bit better. We know by example a user would not look for all resistors of either 1000 or 2000 ohms but resistors between 1000 and 2000 ohms. On the same principle, a user would not look for resistors able to dissipate either 1W or 10W but all resistors between 1 and 10W. A simple javascript widget would make users able to select ranges of values sending only the first and the last value of the range. The form values would then be formatted like this:

[ "value" => ["1000", "2000"], "power_dissipation" => ["1", "10"] ]
<?php // /Store/Catalog/ResistorMap.php
// ...

    public function search(Array $criteria)
    {
        $where = new \Pomm\Query\Where();

        foreach ($criteria as $field => $values)
        {
            $where->andWhere(sprintf("%s BETWEEN ? AND ?", $field), $values);
        }

        return $this->query($where); // SELECT ... WHERE field1 BETWEEN ? AND ? AND field2 BETWEEN...
    }

A mix of this solution with the IN is required to handle the resistor type which cannot be set in a range. Talking about ranges, Postgresql 9.2 users would like to take advantage of the range type with operator as powerful as "overlap" or "adjacent".