I have two tables with point data. Table 1 has no null points.
In Table 2, about half of the points are null.

The query:

SELECT
  *
FROM
  Table1 
INNER JOIN
  Table2 
ON
  Table1.Point.STBuffer(2.5).STIntersects(Table2.Point) = 1
WHERE
  Table1.Point IS NOT NULL
  AND Table2.Point IS NOT NULL

Takes over 8 hours to complete.

If I copy the data to a temporary table like this:

INSERT INTO TempTable SELECT * FROM Table2 WHERE Point IS NOT NULL

The same query takes about 40 seconds.

If I add some null data back in:

INSERT INTO TempTable SELECT TOP 10000 * FROM Table2 WHERE Point IS NULL

It goes back to taking forever.

What is happening?

  • Have you put a spatial index on Table1.Point and Table2.Point? This usually is not as relevant as you might think, but I thought I would ask. Commented Oct 9, 2015 at 15:04
  • Yes, these indexes are identical and present on each table and the problem persists if I include WITH(Index(SIndex_Table1_Point)): CREATE SPATIAL INDEX SIndex_Table1_Point ON Table1(Point) USING GEOGRAPHY_GRID WITH ( GRIDS = (HIGH, HIGH, HIGH, HIGH), CELLS_PER_OBJECT = 1 ) Commented Oct 9, 2015 at 15:07
  • Additionally, the nullable value may be having an adverse effect, as it still resides in the spatial index as a record and bloats it. Can you move the point out into a separate linked table, with a record existing only for rows that have a value? Commented Oct 9, 2015 at 15:25
  • @hcaelxxam "I see your point" No puns intended! :-) Commented Oct 9, 2015 at 15:26
  • @JonBellamy, I tried using GEOGRAPHY::STPointFromText('POINT EMPTY', 4326) but that had the same issue. The only working fix is using POINT(0,0). Commented Oct 9, 2015 at 17:56

1 Answer

Spatial indexes handle NULLs badly (and, as the OP's own trials show, EMPTY instances too).

The best solution would be to store the spatial data in a separate linked table which only contains a record for each non-null and non-empty spatial instance.
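A minimal sketch of that restructuring, assuming Table2 has an integer primary key column named Id and that Point is a geography column (the table name Table2Point and the constraint and index names are hypothetical):

```sql
-- Hypothetical side table: one row per Table2 row that has a usable point.
-- Assumes Table2 has an INT primary key column named Id.
CREATE TABLE Table2Point (
    Table2Id INT NOT NULL
        CONSTRAINT PK_Table2Point PRIMARY KEY CLUSTERED, -- a spatial index needs a clustered primary key
    Point GEOGRAPHY NOT NULL
);

-- Copy only the non-null, non-empty instances.
INSERT INTO Table2Point (Table2Id, Point)
SELECT Id, Point
FROM Table2
WHERE Point IS NOT NULL
  AND Point.STIsEmpty() = 0;

-- Same index settings as the ones quoted in the comments.
CREATE SPATIAL INDEX SIndex_Table2Point_Point
ON Table2Point (Point)
USING GEOGRAPHY_GRID
WITH (GRIDS = (HIGH, HIGH, HIGH, HIGH), CELLS_PER_OBJECT = 1);

-- The spatial join now only sees rows that carry a real point;
-- the remaining Table2 columns are picked up through the key.
SELECT *
FROM Table1
INNER JOIN Table2Point
    ON Table1.Point.STBuffer(2.5).STIntersects(Table2Point.Point) = 1
INNER JOIN Table2
    ON Table2.Id = Table2Point.Table2Id;
```

This mirrors the TempTable experiment from the question, except that the filtered copy holds only the key and the point rather than every column.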

One workaround that works for the OP is to set all NULLs to POINT(0,0). For applications with global coverage this could produce incorrect results, because real points near that location would match the placeholders, so the separate-table approach is preferable if you are able to restructure the data.
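As a rough sketch, the placeholder workaround could look like this, assuming the points use SRID 4326 (the SRID that appears in the OP's comment) and that Point is a geography column:

```sql
-- Replace NULL points with a placeholder at latitude 0, longitude 0.
-- Assumption: SRID 4326, as in the OP's STPointFromText test.
UPDATE Table2
SET Point = geography::Point(0, 0, 4326)
WHERE Point IS NULL;
```

After this update a placeholder can no longer be told apart from a genuine point at (0,0), so it only suits data that never comes near that location.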
