Posted By: Anonymous
There are plenty of similar questions to be found on here but I don’t think that any answer the question adequately.
I’ll continue from the current most popular question and use their example if that’s alright.
The task in this instance is to get the latest post for each author in the database.
The example query produces unusable results as its not always the latest post that is returned.
SELECT wp_posts.* FROM wp_posts WHERE wp_posts.post_status='publish' AND wp_posts.post_type='post' GROUP BY wp_posts.post_author ORDER BY wp_posts.post_date DESC
The current accepted answer is
SELECT wp_posts.* FROM wp_posts WHERE wp_posts.post_status='publish' AND wp_posts.post_type='post' GROUP BY wp_posts.post_author HAVING wp_posts.post_date = MAX(wp_posts.post_date) <- ONLY THE LAST POST FOR EACH AUTHOR ORDER BY wp_posts.post_date DESC
Unfortunately this answer is plain and simple wrong and in many cases produces less stable results than the orginal query.
My best solution is to use a subquery of the form
SELECT wp_posts.* FROM ( SELECT * FROM wp_posts ORDER BY wp_posts.post_date DESC ) AS wp_posts WHERE wp_posts.post_status='publish' AND wp_posts.post_type='post' GROUP BY wp_posts.post_author
My question is a simple one then:
Is there anyway to order rows before grouping without resorting to a subquery?
Edit: This question was a continuation from another question and the specifics of my situation are slightly different. You can (and should) assume that there is also a wp_posts.id that is a unique identifier for that particular post.
ORDER BY in a subquery is not the best solution to this problem.
The best solution to get the
max(post_date) by author is to use a subquery to return the max date and then join that to your table on both the
post_author and the max date.
The solution should be:
SELECT p1.* FROM wp_posts p1 INNER JOIN ( SELECT max(post_date) MaxPostDate, post_author FROM wp_posts WHERE post_status='publish' AND post_type='post' GROUP BY post_author ) p2 ON p1.post_author = p2.post_author AND p1.post_date = p2.MaxPostDate WHERE p1.post_status='publish' AND p1.post_type='post' order by p1.post_date desc
If you have the following sample data:
CREATE TABLE wp_posts (`id` int, `title` varchar(6), `post_date` datetime, `post_author` varchar(3)) ; INSERT INTO wp_posts (`id`, `title`, `post_date`, `post_author`) VALUES (1, 'Title1', '2013-01-01 00:00:00', 'Jim'), (2, 'Title2', '2013-02-01 00:00:00', 'Jim') ;
The subquery is going to return the max date and author of:
MaxPostDate | Author 2/1/2013 | Jim
Then since you are joining that back to the table, on both values you will return the full details of that post.
See SQL Fiddle with Demo.
To expand on my comments about using a subquery to accurate return this data.
MySQL does not force you to
GROUP BY every column that you include in the
SELECT list. As a result, if you only
GROUP BY one column but return 10 columns in total, there is no guarantee that the other column values which belong to the
post_author that is returned. If the column is not in a
GROUP BY MySQL chooses what value should be returned.
Using the subquery with the aggregate function will guarantee that the correct author and post is returned every time.
As a side note, while MySQL allows you to use an
ORDER BY in a subquery and allows you to apply a
GROUP BY to not every column in the
SELECT list this behavior is not allowed in other databases including SQL Server.