
I am working on a complex MySQL query that performs a UNION of SELECT statements on tables that each join to a users table. For the UNION to work, each SELECT statement must return the same number of fields. My task is to replace a user's name with "Deactivated" when the user is flagged as deactivated.

Simplified, the tables are:

users
id|first|last|uname|deactivated
    
posts
id|post|userid
    
comments
id|comment|postid|userid

The (simplified) query that I am starting with is:

SELECT 
  p.post, 
  COALESCE(u.first, u.last, u.uname) as person 
FROM 
  posts `p` 
  LEFT JOIN users `u` ON p.userid = u.id 
WHERE 
  post LIKE '%$querystring%' 
GROUP BY 
  p.id 
UNION 
SELECT 
  c.comment, 
  COALESCE(u.first, u.last, u.uname) as person 
FROM 
  comments `c` 
  LEFT JOIN users `u` ON c.userid = u.id 
  LEFT JOIN posts `p` ON p.id = c.postid 
WHERE 
  comment LIKE '%$querystring%' 
GROUP BY 
  c.id

One option would be to pull the deactivated field with each query and then apply the logic after the results are returned, but it would be nicer to do the logic in the query if possible.

It seems like there ought to be a way to make the value of person depend on whether the deactivated field in the users table is true, but my knowledge of SQL is not up to figuring this out.

How can I change the value returned for person based on the value in the deactivated field in users?

  • note that union defaults to union distinct, so your query will go to extra work (and remove duplicates between the post/person rows and comment/person rows). you almost always want union all instead. Commented May 26 at 17:23
  • you are left joining users, presumably because posts/comments may not have users or may have users that were deleted. "Deactivated" will not show for these (because u.deactivated will be null), is that what you want? Commented May 26 at 17:38
  • @ysth Although I saw the LEFT JOIN and imagined it was intentional, I didn't even think of its repercussions on 'Deactivated' :-\ although I tried to answer as completely as possible. Would you mind creating a dedicated answer with your finding which is worth an upvote? In the meantime, I'll add it as a note to my answer. Commented May 27 at 9:24

3 Answers


You can use the IF function:

IF(u.deactivated, 'Deactivated', COALESCE(u.first,u.last,u.uname)) as person
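
For context, here is a rough sketch of how that expression might slot into both branches of the question's query. It assumes the question's simplified tables and that deactivated is a 0/1 column. The literal 'hello' is only a stand-in for the search term, and the GROUP BY clauses are left out on the assumption that id is a primary key.

```sql
-- Sketch only: 'hello' stands in for the search term,
-- and users.deactivated is assumed to hold 0 or 1.
SELECT
  p.post,
  IF(u.deactivated, 'Deactivated', COALESCE(u.first, u.last, u.uname)) AS person
FROM posts p
LEFT JOIN users u ON p.userid = u.id
WHERE p.post LIKE '%hello%'
UNION
SELECT
  c.comment,
  IF(u.deactivated, 'Deactivated', COALESCE(u.first, u.last, u.uname)) AS person
FROM comments c
LEFT JOIN users u ON c.userid = u.id
WHERE c.comment LIKE '%hello%';
```

Because both branches still return exactly two columns, the UNION keeps working unchanged.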

As mentioned, either IF or CASE can be used.

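SELECT 
  p.post,
  CASE 
    WHEN u.deactivated = 1 THEN 'Deactivated'
    ELSE COALESCE(u.first, u.last, u.uname)
  END AS person
FROM 
  posts `p` 
  LEFT JOIN users `u` ON p.userid = u.id 
-- WHERE clause as in the question; use the same CASE in the comments branch of the UNION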


The most portable way of doing it is with a CASE:

CASE WHEN u.deactivated THEN 'Deactivated' ELSE COALESCE(u.first, u.last, u.uname) END

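(or even, as noted by @ysth, CASE WHEN COALESCE(u.deactivated, TRUE) THEN … if you also want "Deactivated" shown when the LEFT JOIN finds no users row at all, for example because the user was deleted instead of just deactivated; in that case u.deactivated is NULL)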

I've demoed it as the first query of a fiddle.
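
To make the difference between the two CASE variants concrete, here is a small self-contained sketch. The column types and sample rows are made up for illustration and are not taken from the question. Post 12 points at a user id that does not exist in users, so the LEFT JOIN yields NULL for every u column on that row.

```sql
-- Hypothetical schema and data; types and rows are assumptions.
CREATE TABLE users (
  id INT PRIMARY KEY,
  `first` VARCHAR(50),
  `last` VARCHAR(50),
  uname VARCHAR(50),
  deactivated TINYINT(1) NOT NULL DEFAULT 0
);
CREATE TABLE posts (id INT PRIMARY KEY, post TEXT, userid INT);

INSERT INTO users VALUES (1, 'Ann', 'Lee', 'ann', 0),
                         (2, 'Bob', 'Ray', 'bob', 1);
INSERT INTO posts VALUES (10, 'hello from Ann', 1),
                         (11, 'hello from Bob', 2),
                         (12, 'hello from nobody', 99);  -- no such user

SELECT
  p.id,
  -- plain variant: a NULL flag is not true, so the ELSE branch runs
  CASE WHEN u.deactivated THEN 'Deactivated'
       ELSE COALESCE(u.first, u.last, u.uname) END AS plain_case,
  -- NULL-safe variant: a missing user is treated as deactivated
  CASE WHEN COALESCE(u.deactivated, TRUE) THEN 'Deactivated'
       ELSE COALESCE(u.first, u.last, u.uname) END AS null_safe_case
FROM posts p
LEFT JOIN users u ON p.userid = u.id
ORDER BY p.id;
-- id 10: plain_case = 'Ann',         null_safe_case = 'Ann'
-- id 11: plain_case = 'Deactivated', null_safe_case = 'Deactivated'
-- id 12: plain_case = NULL,          null_safe_case = 'Deactivated'
```

Which of the two you want depends on whether a missing user should read as "Deactivated" or stay empty.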

Additionally, some tips:

  • use UNION ALL, not UNION, if you're sure the two result sets cannot overlap (here, obviously, no row is both a post and a comment).
    UNION makes the RDBMS ensure there are no duplicates between rows from both parts of the UNION (and even between rows within each part), so it behaves like a DISTINCT and costs the server extra de-duplication work.
    (It could even inadvertently remove rows: if two posts, from 2000 and 2020, by the same author have the same contents, or a post and a comment do, or even posts by different authors who both end up as "Deactivated", UNION will output only 1 row instead of both. Of course this is theoretical, because in reality I suppose you output their timestamp too, which will make them distinct entries that the UNION does not merge, but keep it in mind for both performance and correctness.)
  • do not GROUP BY x.id: it would be useful if you had multiple rows for each id, but here, if id is your primary key (and given there's at most 1 post per comment, and 1 user per post), you'll always have at most 1 row per id, so the GROUP BY is useless.
  • if you've got complex processing to apply to some fields, do it only once on the UNIONed rows, instead of once per side of the UNION: you'll gain in consistency (copy-paste-modify-on-one-side-only-forgetting-to-apply-on-the-other is a frequent source of errors);
    for example, wrap your UNION ALL in a Common Table Expression, then compute your rendered name there.
    You could even do the text search this way (returning all posts and comments unfiltered, then keeping only those matching $querystring); however, from a performance point of view it's better to apply filters as early as possible.

I've applied it to create the second query of the fiddle.
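
As a rough sketch of how these tips could combine, assuming MySQL 8.0 or later (for CTE support) and the question's simplified tables: the CTE name hits, the columns kind and body, and the literal 'hello' are all illustrative names of mine, not something from the question.

```sql
-- Sketch only: "hits", "kind", "body" and 'hello' are illustrative.
WITH hits AS (
  SELECT 'post' AS kind, p.post AS body, p.userid
  FROM posts p
  WHERE p.post LIKE '%hello%'        -- filter early, inside each branch
  UNION ALL                          -- post and comment rows never collide
  SELECT 'comment' AS kind, c.comment AS body, c.userid
  FROM comments c
  WHERE c.comment LIKE '%hello%'
)
SELECT
  h.kind,
  h.body,
  -- the name logic is written once, for both posts and comments
  CASE WHEN u.deactivated THEN 'Deactivated'
       ELSE COALESCE(u.first, u.last, u.uname) END AS person
FROM hits h
LEFT JOIN users u ON u.id = h.userid;
```

The users join and the CASE now appear in a single place, so a later change to the naming rule cannot be applied to one side of the UNION and forgotten on the other.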

One last tip: wrapping it with PHP

As you're calling your SQL from PHP, beware of SQL injection:

do not write $querystring as part of the SQL, but use a placeholder (starting with :), e.g. if you're using PDO:

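// Distinct placeholder names: reusing one named placeholder only works when PDO emulates prepares.
$statement = $pdo->prepare("SELECT … post LIKE :like_post … comment LIKE :like_comment GROUP BY c.id");
$like = "%$querystring%";
$statement->execute([ 'like_post' => $like, 'like_comment' => $like ]);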

PDO will then ensure that every special character in $querystring is treated as part of the text to search for, and not as SQL.

If you don't, in the best case some searches will fail (for example, try to search for a comment containing "It's OK": the ' will create an SQL syntax error);
in the worst case someone will use it to break your database (do not try it, but imagine what $querystring = "'; DROP TABLE users; SELECT 1 WHERE '' = '"; would do).
