1

I am trying to run two sql queries using sqlalchemy engine in python 3.7. However, I am having trouble joining results columns from two queries. Is there an efficient way to perform this for MSSQL?

Following is the table that is being queried

timestamp           startX  startY  Number
2019-05-13-10:31    695     384     0
2019-05-13-10:32    3914    256     25ZLH3300MEPACC16x25
2019-05-13-10:32    3911    442     25ZLH3300MEPACC16x25
2019-05-13-10:32    3904    2109    25ZLH3300MEPACC16x25
2019-05-13-10:32    3910    627     25ZLH3300MEPACC16x25
2019-05-13-10:32    3904    1445    25ZLH3300MEPACC16x25

and I need to get this as an output

timestamp           startX  startY  Number                 Quantity
2019-05-13-10:31    695     384     0                      1
2019-05-13-10:32    3914    256     25ZLH3300MEPACC16x25   5

First query returns unique records based on Number as follows

SELECT * FROM 
    (SELECT 
       [timestamp]
      ,[startX]
      ,[startY]
      ,[Number]
  ,ROW_NUMBER() OVER(Partition by [Table].Number, 
                                    [Table].Number,  
                                    type order by [timestamp] DESC) rownumber 
                                    FROM [Table]) a WHERE rownumber = 1

Second query returns count of duplicate records as Quantity column with a Number column.

SELECT [Table].Number, count(*) AS 'Quantity'
FROM   [Table]
GROUP  BY [TABLE].Number
HAVING count(*) >= 1

I would like to join results from query #1 and column quantity from query #2 based on Number as primary key.

connection = engine.connect()
            connection.execute(""" Query """)
11
  • See Why should I provide an MCVE for what seems to me to be a very simple SQL query? Commented May 14, 2019 at 14:40
  • @RaymondNijland I just added the table and the output required to satisfy that requirement Commented May 14, 2019 at 14:49
  • Why not just use SELECT * FROM (query1) AS q1 INNER JOIN (query2) AS q2 ON q1.number = q2.number? Commented May 14, 2019 at 14:53
  • you can't get the same results always on every run with this example data as SQL tables/resultsets are by ANSI/ISO SQL standard definition orderless.. ORDER BY timestamp DESC would still give non deterministic (random) results because the timestamps are not unique.. Do you have a column with IDENTITY in your table we need to that to get pure deterministic (fixed) results by adding that also in the ORDER BY Commented May 14, 2019 at 14:56
  • 1
    @Far - That's not caused by the INNER JOIN, that's caused by having type in your PARTITION BY; in other words, the first query already includes the duplicates that you don't want, so fix the first query. Commented May 14, 2019 at 15:05

2 Answers 2

3

You could just do it in a single query, one example would be:

SELECT
  *
FROM
(
    SELECT 
        [timestamp]
       ,[startX]
       ,[startY]
       ,[Number]
       ,ROW_NUMBER()
            OVER (PARTITION BY [Table].Number, 
                               [Table].type
                      ORDER BY [timestamp] DESC
                 )
                   AS rownumber 
       ,COUNT(*)
            OVER (PARTITION BY [Table].Number
                 )
                   AS Quantity
    FROM
      [Table]
)
    a
WHERE
        rownumber = 1
    AND quantity  > 1
Sign up to request clarification or add additional context in comments.

2 Comments

@RaymondNijland - Indeed, but the op wanted to know how to join the queries, not how to fix or validate either query ;)
very true i removed the comment under this answer because the topicsstarter said the order wasn't important.. +1 by the way the query is valid..
0

You could try using pandas and join on DataFrames:

import pandas as pd
import sqlalchemy

engine = slqalchemy.create_enging(my_sql_settings)

df_unique_records = pd.read_sql(sql=my_query_1, conn=engine, index_col=Number)
df_duplicate_counts = pd.read_sql(sql=my_query_2, conn=engine, index_col=Number)

df = df_unique_records.merge(df_duplicate_counts, left_index=True, right_index=True)

1 Comment

Thanks, but I wanted to know if I could have sql server handle this.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.