0

Could someone kindly help me with the following:

I have a query that returns two columns: one comes straight from a table (columnA), while the other is generated by a subquery (columnB). Sorting (i.e. ORDER BY) on columnB is much slower than the same sort on columnA (50+ times slower). Is there a way to speed up the sort on columnB so that it comes close to the speed of sorting on columnA?

Note: Engine is Postgres

Update: Query looks similar to:

select columnA, array_to_string(array(select ... from tableB where ...), '%') as columnB
from tableA
where ... 
order by columnA

Any advice is much appreciated.

Update #2: Solved it by doing the sort in a separate query and then feeding the results to the main query (using a WHERE clause to select only a subset of the rows instead of the whole set, which gave me the performance I needed). Thanks to everybody who replied.
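
The update does not show the final query, so the following is only one possible reading of it, sketched for Postgres. The schema (tableA.id, tableB.a_id, tableB.val), the sample rows, and the LIMIT of 2 are all assumptions made up for illustration. The idea is that the inner query sorts and limits the cheap rows first, and the expensive string is then built only for the rows that survive the WHERE clause.

```sql
-- Hypothetical schema and data, for illustration only.
CREATE TEMP TABLE tableA (id int PRIMARY KEY, columnA text);
CREATE TEMP TABLE tableB (a_id int, val text);
INSERT INTO tableA VALUES (1, 'x'), (2, 'y'), (3, 'z');
INSERT INTO tableB VALUES (1, 'b'), (1, 'a'), (2, 'c'), (3, 'd');

-- The subquery in the WHERE clause sorts and limits first;
-- array_to_string() then runs only for the selected rows.
SELECT a.columnA,
       array_to_string(
           array(SELECT b.val FROM tableB AS b WHERE b.a_id = a.id),
           '%') AS columnB
FROM tableA AS a
WHERE a.id IN (SELECT id FROM tableA ORDER BY columnA LIMIT 2)
ORDER BY a.columnA;
```

This only helps when the sort key can be computed without building the concatenated string; if the sort really has to be on columnB itself, the string still has to be generated for every row.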

8
  • Some extra info would be useful, such as the schema of the table and the engine (sqlite, sqlserver, postgres, mysql). Commented May 14, 2013 at 21:38
  • Please post the query. I presume you're working with it in another window, is it that hard to copy and paste it for us? Why do people always forget that? Commented May 14, 2013 at 21:39
  • Well, it's hard for us to help you without seeing your query (or at least a simplified version of it). You could have a situation where the subquery could be replaced by a JOIN. Commented May 14, 2013 at 21:39
  • Thanks for the responses! The engine is postgres. The query is fairly large and includes a number of joins. I am looking for more general approaches to speeding up the query rather than help with a specific query (I apologize if this is inconvenient). Commented May 14, 2013 at 21:42
  • I wouldn't be trying to format the output in SQL at all. That's not what SQL is for. Tell us more about what you are trying to achieve here and what the actual sorting criteria are (since without ORDER BY in the subquery, you are currently just getting an arbitrary order). Commented May 14, 2013 at 22:03

3 Answers

2

In your query

select columnA, array_to_string(array(select ... from tableB where ...), '%') as columnB
from tableA
where ... 
order by columnA

operations on columnB can't take advantage of an index. Not only that, the sort will have to deal with values as wide as many concatenated rows.

Your best bet is to reconsider why you need this sorted, because the sort order of the expression array_to_string(...) is arbitrary. It's arbitrary, because you say you're not sorting within the SELECT statement that's an argument to array().


I am using array_to_string to capture a number of values that I need to process later. Do you see an alternative?

A SELECT statement will capture any number of values.

If you need "to further process" some values in sorted order, you're probably better off returning the results of a SELECT...ORDER BY statement without using any array functions. That way, your application code can process the values in order just by walking the result set. You won't have to parse values out of a "%" delimited string.
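
A minimal sketch of what that could look like, assuming a hypothetical schema where tableB.a_id points at tableA.id and tableB.val holds the values being collected (none of these names come from the question):

```sql
-- Hypothetical schema and data, for illustration only.
CREATE TEMP TABLE tableA (id int PRIMARY KEY, columnA text);
CREATE TEMP TABLE tableB (a_id int, val text);
INSERT INTO tableA VALUES (1, 'x'), (2, 'y');
INSERT INTO tableB VALUES (1, 'b'), (1, 'a'), (2, 'c');

-- One row per (tableA row, tableB value), in a well-defined order.
-- The application walks the result set instead of splitting a string.
SELECT a.columnA, b.val
FROM tableA AS a
JOIN tableB AS b ON b.a_id = a.id
ORDER BY a.columnA, b.val;
```

Note that the inner join drops tableA rows with no matching tableB rows; a LEFT JOIN would keep them, with val returned as NULL.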


8 Comments

You are correctly pointing out my next problem (with array_to_string). However, I first have to fix the speed before I decide how to solve the sort order. Thanks for the response.
You can't possibly fix the speed without reconsidering your use of array_to_string().
@Bo. The solution to your speed problem is to sort before you generate the string (so you can tailor your indexing accordingly), and to do that you need to know what to sort on.
@BrankoDimitrijevic: How would you do that? I am not really sure how that would help.
@MikeSherrill'Catcall': Mike, I am using array_to_string to capture a number of values that I need to process later. Do you see an alternative?
2

You could put the unsorted data into a temp table and then index columnB. Then run a simple SELECT with the ORDER BY on the now-indexed column. No guarantees this will be faster, but it is something to try.
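
Roughly, and only as a sketch: the schema (tableA.id, tableB.a_id, tableB.val), the temp table name, and the index name below are made up for illustration, since the real query was not posted.

```sql
-- Hypothetical schema and data, for illustration only.
CREATE TEMP TABLE tableA (id int PRIMARY KEY, columnA text);
CREATE TEMP TABLE tableB (a_id int, val text);
INSERT INTO tableA VALUES (1, 'x'), (2, 'y');
INSERT INTO tableB VALUES (1, 'b'), (1, 'a'), (2, 'c');

-- Step 1: materialize the unsorted result once.
CREATE TEMP TABLE unsorted_result AS
SELECT a.columnA,
       array_to_string(
           array(SELECT b.val FROM tableB AS b
                 WHERE b.a_id = a.id ORDER BY b.val),
           '%') AS columnB
FROM tableA AS a;

-- Step 2: index the computed column and refresh planner statistics.
CREATE INDEX unsorted_result_columnb_idx ON unsorted_result (columnB);
ANALYZE unsorted_result;

-- Step 3: the sort can now use the index.
SELECT columnA, columnB
FROM unsorted_result
ORDER BY columnB;
```

Building the table and the index has its own cost, so this tends to pay off only if the sorted result is read more than once. As far as I know, B-tree index entries also have a size limit, so very long concatenated strings may fail to index.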

Comments

1

Since your "ColumnB" is a computed value, there is no index which could be used to speed up the sort. ColumnA probably is already sorted, so it's fast. There is nothing you can do to speed up the sorting of these computed values, except to pre-calculate them and put them in a table. This is a big reason why data warehouses typically don't work against the live data, but export daily roll-ups instead.
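
One way to do the pre-calculation, assuming PostgreSQL 9.3 or later, is a materialized view rather than a hand-maintained table. This is a swapped-in technique, not something the question mentions, and the schema and object names below are hypothetical. Regular tables are used here because, as far as I know, a materialized view cannot be built on temporary tables.

```sql
-- Hypothetical schema and data, for illustration only.
CREATE TABLE tableA (id int PRIMARY KEY, columnA text);
CREATE TABLE tableB (a_id int, val text);
INSERT INTO tableA VALUES (1, 'x'), (2, 'y');
INSERT INTO tableB VALUES (1, 'b'), (1, 'a'), (2, 'c');

-- Pre-calculate columnB once and store the result.
CREATE MATERIALIZED VIEW tablea_rollup AS
SELECT a.id,
       a.columnA,
       array_to_string(
           array(SELECT b.val FROM tableB AS b
                 WHERE b.a_id = a.id ORDER BY b.val),
           '%') AS columnB
FROM tableA AS a;

-- The stored column can be indexed like any table column.
CREATE INDEX tablea_rollup_columnb_idx ON tablea_rollup (columnB);

-- Reads sort on the stored, indexed value.
SELECT columnA, columnB
FROM tablea_rollup
ORDER BY columnB;

-- Re-run the underlying query when the source data changes,
-- e.g. from a scheduled job.
REFRESH MATERIALIZED VIEW tablea_rollup;
```

The trade-off is the same as with the daily roll-ups: reads are fast, but the data is only as fresh as the last refresh.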

2 Comments

Thank you for your answer. My situation is a bit different, and I have updated the original post. Sorry for any inconvenience. I am not doing any sorting inside the subquery.
I was afraid that was the answer :). Maybe someone else has more advice.
