Posted By: Anonymous
It’s easy to find duplicates with one field:
SELECT name, COUNT(email) FROM users GROUP BY email HAVING COUNT(email) > 1
So if we have a table
ID NAME EMAIL 1 John [email protected] 2 Sam [email protected] 3 Tom [email protected] 4 Bob [email protected] 5 Tom [email protected]
This query will give us John, Sam, Tom, Tom because they all have the same
However, what I want is to get duplicates with the same
That is, I want to get “Tom”, “Tom”.
The reason I need this: I made a mistake, and allowed to insert duplicate
SELECT name, email, COUNT(*) FROM users GROUP BY name, email HAVING COUNT(*) > 1
Simply group on both of the columns.
Note: the older ANSI standard is to have all non-aggregated columns in the GROUP BY but this has changed with the idea of “functional dependency”:
In relational database theory, a functional dependency is a constraint between two sets of attributes in a relation from a database. In other words, functional dependency is a constraint that describes the relationship between attributes in a relation.
Support is not consistent:
- Recent PostgreSQL supports it.
- SQL Server (as at SQL Server 2017) still requires all non-aggregated columns in the GROUP BY.
- MySQL is unpredictable and you need
- GROUP BY lname ORDER BY showing wrong results;
- Which is the least expensive aggregate function in the absence of ANY() (see comments in accepted answer).
- Oracle isn’t mainstream enough (warning: humour, I don’t know about Oracle).