Opened 7 years ago

Closed 7 years ago

#4354 closed defect (fixed)

[PATCH] OGR SQL 'COUNT([DISTINCT] xx)' not conformant with other SQL implementations w.r.t NULL values

Reported by: Even Rouault Owned by: warmerdam
Priority: normal Milestone: 1.9.0
Component: OGR_SF Version: unspecified
Severity: normal Keywords:

Description (last modified by Even Rouault)

Currently, OGR SQL considers COUNT(a_column) as a synonym of COUNT(*). However the standard mandates that COUNT(a_column) only count non-NULL values, whereas COUNT(*) count all rows (after the effect of WHERE clause of course). So COUNT(a_column) should not be considered as equivalent to COUNT(*)

The same applies for COUNT(DISTINCT a_column): it should only count unique non-NULL values.

References :

What is a bit annoying here is that ogr_sql.html says that : "As a special case, the COUNT() operator can be given a "*" argument instead of a field name which is a short form for count all the records though it would get the same result as giving it any of the column names [...]"

If we fix the current behaviour to be conformant with the standard, it can break assumptions made by users. But currently, there's an inconsistency if you look at the result of -sql "select sum(foo), count(foo), avg(foo) from bar" when there are null values

Attachments (1)

ticket_4354.patch (7.4 KB) - added by Even Rouault 7 years ago.

Download all attachments as: .zip

Change History (4)

Changed 7 years ago by Even Rouault

Attachment: ticket_4354.patch added

comment:1 Changed 7 years ago by Even Rouault

Description: modified (diff)

comment:2 Changed 7 years ago by Even Rouault

Description: modified (diff)

comment:3 Changed 7 years ago by Even Rouault

Milestone: 1.9.0
Resolution: fixed
Status: newclosed

r23436 /trunk/ (4 files in 3 dirs): [backward-compatibility] OGR SQL: exclude null-values from COUNT(column_name) or COUNT(DISTINCT colum_name) (#4354)

Note: See TracTickets for help on using tickets.