
I have a table (Table2) containing some areas (polygons) that are stored as geography data type. The table contains 1529 rows. In another table (Table1), I have approx. 22000 rows, each having an X/Y from which I create Points that are stored in a Geography column.

I need to make a spatial join to find out which area each point belongs to. I have created a spatial index on both tables, but I think the query is still too slow. Right now, it takes about 72 seconds to run the join, which looks like this:

SELECT ...
FROM [DatabaseA].dbo.Table1 t1 
INNER JOIN [DatabaseB].dbo.Table2 t2 ON t1.Geo.STIntersects(t2.Geo) = 1
WHERE t2.ObjectTypeId = 1 AND t2.CompanyId = 3

Please note that the two tables are in different databases but on the same server.

Before creating the spatial indexes, the query was much slower, and I can see that the index is being used. However, creating the index on Table2 does not affect performance; only the index on Table1 gives better performance. Both indexes use HIGH grid levels.

When I look at the execution plan, I notice a Filter part that takes 71% of the time:

CASE WHEN [Expr1015]>(2) THEN CASE WHEN [Expr1016]=[Expr1017] THEN (1) ELSE (0) END ELSE [DatabaseA].[dbo].[Table1].[Geo] as [t].[Geo].STIntersects([DatabaseB].[dbo].[Table2].[Geo] as [g].[Geo]) END=(1)

So, my question is:

Should this query be taking so long? Should I use other grid sizes? What does that filter expression mean?

Does anybody have a tip for optimizing this?

  • Very similar to many questions, see my response to this one here: stackoverflow.com/questions/7655408/sql-spatial-join/… Commented Aug 29, 2012 at 8:27
  • SQL Server 2008 or 2012? Commented Aug 29, 2012 at 9:16
  • @CatchingMonkey: As I wrote, the index is already being used, so adding a hint does not improve performance. Commented Aug 29, 2012 at 9:32
  • You tried it then? I have found consistently that it may say it's using the index, but when the hint is added, performance improves dramatically. Commented Aug 29, 2012 at 9:41
  • Yes, I tried it, but unfortunately with no luck Commented Aug 29, 2012 at 10:16

2 Answers


I had a similar problem: 2,000 points and 85,000 polygons, and I needed to match each point with its containing polygons. Originally, that query took 8 hours.

SELECT Item.Name, Polygons.Name
FROM dbo.Geofence AS Polygons 
JOIN dbo.ItemLocation AS Points 
ON Polygons.GeoFence.STIntersects(Points.GeoLocation) = 1

The issue was that the points table only had a nonclustered index. Adding a clustered index reduced the time to 12 seconds. (A spatial index also requires the table to have a clustered primary key, so this is a prerequisite for the next step anyway.)
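A minimal sketch of that fix, assuming the points table has an integer key column named ItemLocationId (a hypothetical name; substitute your own):

```sql
-- Make the table's key a clustered primary key. A spatial index
-- can only be created on a table with a clustered primary key.
-- ItemLocationId is a hypothetical column name for this example.
ALTER TABLE dbo.ItemLocation
    ADD CONSTRAINT PK_ItemLocation PRIMARY KEY CLUSTERED (ItemLocationId);
```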

Adding a spatial index (code below) reduced the time to 1 second. I also added one to the points table.

CREATE SPATIAL INDEX [SpatialIndex-Polygons] ON dbo.Geofence
(
    [GeoFence]
) USING GEOGRAPHY_GRID
WITH (GRIDS = (LEVEL_1 = MEDIUM, LEVEL_2 = HIGH, LEVEL_3 = HIGH, LEVEL_4 = MEDIUM),
      CELLS_PER_OBJECT = 16, PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
      SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF,
      ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
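If the optimizer still ignores the spatial index (as discussed in the comments on the question), you can force it with a table hint. A sketch, assuming the index name from the DDL above:

```sql
-- Force the spatial index with a table hint on the polygons table.
SELECT Item.Name, Polygons.Name
FROM dbo.Geofence AS Polygons WITH (INDEX([SpatialIndex-Polygons]))
JOIN dbo.ItemLocation AS Points
  ON Polygons.GeoFence.STIntersects(Points.GeoLocation) = 1;
```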


In short, you're calling a function with the parameter t2.Geo, and that function must be evaluated for every row in t2 where t2.ObjectTypeId = 1 and t2.CompanyId = 3. Creating an index on table t2 doesn't really help because the query can't use those precomputed index values: it must still run t1.Geo.STIntersects(t2.Geo) against all the qualifying values in t2, which have virtually no relationship to the precomputed index values.

If speed is your goal and you have the storage, you could create a third table which has the result of every combination of t1.Geo.STIntersects(t2.Geo) precomputed. Then you could join t1 and t2 to the precomputed values in the third table which should be able to produce almost instantaneous query results (for source tables with 1,529 and 22,000 records).
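A sketch of that third table, using the asker's table names; the Id key columns are assumptions, so adjust them to the real schema:

```sql
-- Hypothetical key columns Table1Id / Table2Id; adjust to your schema.
CREATE TABLE dbo.PointAreaMatch (
    Table1Id INT NOT NULL,
    Table2Id INT NOT NULL,
    PRIMARY KEY (Table1Id, Table2Id)
);

-- One-off population: run the expensive spatial join once and store it.
INSERT INTO dbo.PointAreaMatch (Table1Id, Table2Id)
SELECT t1.Id, t2.Id
FROM [DatabaseA].dbo.Table1 t1
JOIN [DatabaseB].dbo.Table2 t2
  ON t1.Geo.STIntersects(t2.Geo) = 1;
```

Subsequent lookups then join t1 and t2 through plain integer keys instead of evaluating STIntersects per row.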

If the data in t1 and t2 is relatively static, you could periodically re-run a query that refreshes the third table. If it changes frequently, the table could be maintained automatically via triggers on inserts, updates, and deletes to t1 and t2, or you could wrap those operations in stored procedures that also update the precomputed table.
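For example, an insert trigger on Table1 could keep the third table current as new points arrive. A sketch, assuming the precomputed table is called dbo.PointAreaMatch and both tables have an Id key column (hypothetical names):

```sql
CREATE TRIGGER trg_Table1_Insert ON [DatabaseA].dbo.Table1
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    -- Match only the newly inserted points against the polygons,
    -- so the expensive spatial predicate runs on a few rows at a time.
    INSERT INTO dbo.PointAreaMatch (Table1Id, Table2Id)
    SELECT i.Id, t2.Id
    FROM inserted i
    JOIN [DatabaseB].dbo.Table2 t2
      ON i.Geo.STIntersects(t2.Geo) = 1;
END;
```

Corresponding triggers for updates and deletes would remove or re-derive the affected rows.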

1 Comment

The more I work with large spatial datasets (10K+ records), the more I see the need for computed tables that store whatever relationships and attributes I want to find. Spatial indexing lets you do a lot in a short amount of time, but it's just not practical for an OLTP system with users waiting on query results. At least not when you have a lot of data, which is often the case with spatial. Determine the intersects, nearest-neighbor results, etc., store them in a table, and refresh as often as needed.
