I have a table 'auctions' with 60k records. It has a tsvector column that contains full-text search vectors like the one below:

auctions.tsvector_content_tsearch
107658 | '-75':75 '-83':81 '0.265':49 '0.50':140 '1':62 '1000':61 '1080':38 '16':39 '160':91 '170':86 '1920':36 '1920x1080':65,69 '2':154 '219':129 '23':3,20 '23.0':31 '236v3lsb':6,23 '24':164 '24.75':134 '250':58 '3.190':117 '30':80 '426':127 '5':54 '5.0':99 '56':74 '566':125 '9':40 'black':45 'cal':32 'cd/m2':59 'compatible':158 'czarny':46 'czas':51 'czuwać':139 'częstotliwość':71,77 'd':110,114 'd-sub':109 'dodatkowy':146,152 'dvi':7,24,113 'dvi-d':112 'ekran':30 'energia':131,137 'energy':97 'epeat':100 'ergonomics':104 'full':41 'g':120 'gs':106 'gwarancja':153,161 'hd':42 'hz':76 'informacja':151 'jasność':56 'kabel':147,149 'kensington':156 'kg':116,118 'khz':82 'kolor':43 'kontrast':60 'kąt':83,88 'lcd':2,15,19 'led':4,21 'lina':28 'lock':157 'maksymalny':64 'matryca':34,53,57 'miejsce':165 'miesiąc':163 'mm':50 'monitor':1,14,18,96 'mś':55 'nazwać':12 'norma':93 'obudowa':44 'odchylać':72,78 'ogólny':17 'okres':159 'opis':16 'optymalny':68 'philips':5,11,22 'piksel':66,70 'pionowy':73,85 'plamka':48 'pobór':130,136 'poziom':90 'poziomy':79,90 'producent':10 'przekątna':29 'przeć':133 'reakcja':52 'rodzina':25 'rohs':102 'rozdzielczość':63,67 'rękojmia':160 'serwis':167 'serwisować':166 'silver':101 'specyfikacja':8 'spełniać':94 'star':98 'stopień':87,92 'sub':111 'techniczny':9 'tryb':138 'tryba':138 'tuv':103,105 'typ':13,33 'v':27 'v-line':26 'vga':150 'waga':115 'wbudować':142 'widzenia':84,89 'widzenie':84,89 'widzieć':84,89 'wielkość':47 'wuxga':35 'wymiar':119 'wyposażenie':145 'wyposażyć':145 'x':37,121,123,126,128 'zasilacz':143 'zasilać':148 'zewn':108 'zewnętrzny':168 'złączać':107 'złącze':107 'łat':155 'ś':122

The auctions table has an index on that column:

    "auctions_tsvector_content_tsearch_idx" gin (tsvector_content_tsearch)

When I search for matching vectors, the query takes about 4000-5000 ms, which is too long.
Is there any way to speed things up here?

EXPLAIN SELECT auctions.id FROM auctions WHERE (auctions.tsvector_content_tsearch @@ to_tsquery('polish', 'lcd'));


           QUERY PLAN                          
--------------------------------------------------------------
 Seq Scan on auctions  (cost=0.00..6598.02 rows=7762 width=4)
   Filter: (tsvector_content_tsearch @@ '''lcd'''::tsquery)
(2 rows)

EDIT

OK, I think I found the problem: the Polish dictionary. Using the standard Postgres dictionary fixes the long query time. Thanks for the tips.
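One way to check whether the text search configuration is the slow part is to time the query-parsing step on its own, without touching the table. This is only a sketch: it assumes the session runs in psql (for the `\timing` meta-command) and that a configuration named `polish` is installed, as in the query above; `simple` is a built-in configuration used here purely for comparison.

```sql
-- psql meta-command: print the elapsed time of each statement
\timing on

-- Parse the search term only; the auctions table is not read here.
SELECT to_tsquery('polish', 'lcd');

-- Same term with the built-in 'simple' configuration, for comparison.
SELECT to_tsquery('simple', 'lcd');
```

If the first statement is much slower than the second, especially on its first run in a fresh session, that may point to the dictionary rather than the index.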

1 Answer

Apparently, the planner estimated that a sequential scan would be faster than using the index. Try the following:

  • SET enable_seqscan=off (useful for testing, but do not use it in production)
  • raise the statistics target for the column
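
The two suggestions above could be tried roughly as follows. This is a sketch, not a tuned recipe: the table, column and query are taken from the question, while the statistics target of 1000 is an assumed value chosen only for illustration.

```sql
-- Diagnostic only: discourage sequential scans for this session to see
-- whether the planner can use the GIN index and how fast that plan is.
SET enable_seqscan = off;

EXPLAIN ANALYZE
SELECT auctions.id
FROM auctions
WHERE auctions.tsvector_content_tsearch @@ to_tsquery('polish', 'lcd');

-- Restore the default planner behaviour for the session.
RESET enable_seqscan;

-- Collect more detailed statistics for the tsvector column
-- (1000 is an assumed value), then refresh them.
ALTER TABLE auctions
    ALTER COLUMN tsvector_content_tsearch SET STATISTICS 1000;
ANALYZE auctions;
```

After the ANALYZE, rerunning the EXPLAIN with enable_seqscan at its default shows whether the planner now picks the index on its own.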

That behaviour sometimes occurs with GIN indices. Check this thread on the PostgreSQL mailing list. You can also consult the official PostgreSQL documentation about this issue.

2 Comments

Turning enable_seqscan off is a workaround and useful during testing, but not for production. Fix the real problem, like better stats.
Completely agree. The purpose of turning enable_seqscan off was to see the execution plan and the query speed improvement with indices; in a production environment that is not an option.
