
I wrote an HQL query that works fine in Hive, but converting it into a dynamic Spark SQL query in Scala throws a ParseException:

    val l="1900-01-01 00:00:00.000001"
    val use_database="dev_lkr_send"

    val dfMet = spark.sql(s"""select 
    maxxx.cd_anomalie,
    maxxx.cd_famille,
    maxxx.libelle AS LIB_ANOMALIE,
    maxxx.MAJ_DATE AS DT_MAJ,
    maxxx.classification,
    maxxx.nb_rejeux AS NB_REJEUX,
    case when maxxx.indic_cd_erreur = 'O' then 1 else 0 end AS TOP_INDIC_CD_ERREUR,
    case when maxxx.invalidation_coordonnee = 'O' then 1 else 0 end AS TOP_COORDONNEE_INVALIDE,
    case when maxxx.typ_mvt = 'S' then 1 else 0 end AS TOP_SUPP,
    case when maxxx.typ_mvt = 'S' then to_date(substr(maxxx.dt_capt, 1, 19)) else null end AS DT_SUPP,
    minnn.typ_mvt,
    maxxx.typ_mvt,
    case when minnn.typ_mvt = 'C' then 'C' else 'M' end as TYP_MVT
from 
  (select s.cd_anomalie, s.cd_famille, s.libelle, s.maj_date, s.classification, s.nb_rejeux, s.dt_capt, s.typ_mvt from ${use_database}.pz_send_param_ano as s
    join
    (select cd_anomalie, min(dt_capt) as dtmin from ${use_database}.pz_send_param_ano where '"""+l+"""' <dtcapt group by cd_anomalie) as minn
    on s.cd_anomalie=minn.cd_anomalie and s.dt_capt=minn.dtmin) as minnn
join
    (select s.cd_anomalie, s.cd_famille, s.libelle, s.maj_date, s.classification, s.nb_rejeux, s.dt_capt, s.typ_mvt, s.indic_cd_erreur, s.invalidation_coordonnee from ${use_database}.pz_send_param_ano as s
    join
    (select cd_anomalie, max(dt_capt) as dtmax from ${use_database}.pz_send_param_ano group by cd_anomalie) as maxx
    on s.cd_anomalie=maxx.cd_anomalie and s.dt_capt=maxx.dtmax) as maxxx
on minnn.cd_anomalie=maxxx.cd_anomalie""")

This is the full Exception log:

org.apache.spark.sql.catalyst.parser.ParseException: mismatched input 'from' expecting {<EOF>, 'WHERE', 'GROUP', 'ORDER', 'HAVING', 'LIMIT', 'LATERAL', 'WINDOW', 'UNION', 'EXCEPT', 'MINUS', 'INTERSECT', 'SORT', 'CLUSTER', 'DISTRIBUTE'}(line 15, pos 0)

1 Answer


Try wrapping the select queries on lines 16 and 21 of the query with an alias.

Example:

val dfMet = spark.sql(s"""select 
    maxxx.cd_anomalie,
    maxxx.cd_famille,
    maxxx.libelle AS LIB_ANOMALIE,
    maxxx.MAJ_DATE AS DT_MAJ,
    maxxx.classification,
    maxxx.nb_rejeux AS NB_REJEUX,
    case when maxxx.indic_cd_erreur = 'O' then 1 else 0 end AS TOP_INDIC_CD_ERREUR,
    case when maxxx.invalidation_coordonnee = 'O' then 1 else 0 end AS TOP_COORDONNEE_INVALIDE,
    case when maxxx.typ_mvt = 'S' then 1 else 0 end AS TOP_SUPP,
    case when maxxx.typ_mvt = 'S' then to_date(substr(maxxx.dt_capt, 1, 19)) else null end AS DT_SUPP,
    minnn.typ_mvt,
    maxxx.typ_mvt,
    case when minnn.typ_mvt = 'C' then 'C' else 'M' end as TYP_MVT
from 
  (select s.cd_anomalie, s.cd_famille, s.libelle, s.maj_date, s.classification, s.nb_rejeux, s.dt_capt, s.typ_mvt from ${use_database}.pz_send_param_ano as s)s
    join
    (select cd_anomalie, min(dt_capt) as dtmin from ${use_database}.pz_send_param_ano where '"""+l+"""' <dtcapt group by cd_anomalie) as minn
    on s.cd_anomalie=minn.cd_anomalie and s.dt_capt=minn.dtmin) as minnn
join
    (select s.cd_anomalie, s.cd_famille, s.libelle, s.maj_date, s.classification, s.nb_rejeux, s.dt_capt, s.typ_mvt, s.indic_cd_erreur, s.invalidation_coordonnee from ${use_database}.pz_send_param_ano as s)s
    join
    (select cd_anomalie, max(dt_capt) as dtmax from ${use_database}.pz_send_param_ano group by cd_anomalie) as maxx
    on s.cd_anomalie=maxx.cd_anomalie and s.dt_capt=maxx.dtmax) as maxxx
on minnn.cd_anomalie=maxxx.cd_anomalie""") 
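
Unrelated to the parse error itself: because the query string is closed right before the value of `l` and spliced back together with `+`, the second string literal is no longer an s-interpolator, so the `${use_database}` occurrences after that point reach Spark as literal text instead of being substituted. A minimal sketch (simplified to just the inner "min" subquery, with names taken from the question) that keeps everything in one interpolated string:

    // Minimal sketch, not the exact fix for the full query above: build the statement
    // as a single s-interpolated string so Scala substitutes both values itself,
    // instead of closing the interpolator and concatenating with +.
    val l = "1900-01-01 00:00:00.000001"
    val use_database = "dev_lkr_send"

    // Only the inner "min" subquery is shown, to illustrate the pattern.
    val dfMin = spark.sql(s"""
      select cd_anomalie, min(dt_capt) as dtmin
      from ${use_database}.pz_send_param_ano
      where dt_capt > '$l'
      group by cd_anomalie
    """)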
