I have a dataset and I need to extract data from a column based on index position.

The SERVICE_NAME column contains "ISPFSDPartnerPubSub/4_2/ProxyServices/InboundAndOutbound/AP/InboundPartnerCommunicationsAPLPPS". I need to extract the parts at index 4 and index 5 (zero-based, after splitting on "/") as 'colX' and 'colY'.

How can I achieve this?

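import org.apache.spark.sql.functions.monotonically_increasing_id

val log = spark.read.format("csv")
      .option("inferSchema", "true")
      .option("header", "true")
      .option("sep", ",")
      .option("quote", "\"")
      .option("multiLine", "true")
      .load("OSB.csv").cache()
// Add a row id starting at 1 (values are unique and increasing, not necessarily consecutive)
val logs = log.withColumn("Id", monotonically_increasing_id() + 1)
// Register the DataFrame as a view so that spark.sql can refer to it as "logs"
logs.createOrReplaceTempView("logs")
val df = spark.sql("select SERVICE_NAME, _raw from logs")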

Expected output:

colX: AP
colY: InboundPartnerCommunicationsAPLPPS

1 Answer

Update: to select parts of a string by index, code like this can be used:

val df = Seq("ISPFSDPartnerPubSub/4_2/ProxyServices/InboundAndOutbound/AP/InboundPartnerCommunicationsAPLPPS").toDF("SERVICE_NAME")
val result =
  df
    .withColumn("splitted", split($"SERVICE_NAME", "/"))
    .select(
      $"splitted".getItem(4).alias("colX"),
      $"splitted".getItem(5).alias("colY")
    )

result.show(false)

Output:

+----+----------------------------------+
|colX|colY                              |
+----+----------------------------------+
|AP  |InboundPartnerCommunicationsAPLPPS|
+----+----------------------------------+
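To see which index maps to which part, the same split can be tried on a plain Scala string outside Spark. This is only a sketch for illustration: the variable names `serviceName` and `parts` are made up here, and it assumes that Spark's `split` followed by `getItem` uses the same zero-based positions as `String.split`. Both take a regular expression as the separator, and "/" has no special meaning in a Java regex, so it does not need escaping.

```scala
// Hypothetical stand-in for one value of the SERVICE_NAME column.
val serviceName =
  "ISPFSDPartnerPubSub/4_2/ProxyServices/InboundAndOutbound/AP/InboundPartnerCommunicationsAPLPPS"

// "/" is not a regex metacharacter, so it can be passed to split as-is.
val parts: Array[String] = serviceName.split("/")

// Print every part together with its zero-based index.
parts.zipWithIndex.foreach { case (part, i) => println(s"$i -> $part") }

// Index 4 and index 5 are the fifth and sixth parts of the path.
val colX = parts(4)
val colY = parts(5)
```

The string splits into six parts (indexes 0 to 5), so 4 and 5 address the last two.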

Solution for columns by index: two columns can be selected by their column indexes and renamed like this:

df.select(
  col(df.columns(4)).alias("colX"),
  col(df.columns(5)).alias("colY"))
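Note that `df.columns` is an ordinary `Array[String]`, so indexing past its end throws `ArrayIndexOutOfBoundsException`. The sketch below shows one way to check an index safely before using it. The two-element array is a hypothetical stand-in for `df.columns` on a DataFrame that has only the SERVICE_NAME and _raw columns.

```scala
// Hypothetical stand-in for df.columns on a two-column DataFrame.
val columns: Array[String] = Array("SERVICE_NAME", "_raw")

// lift returns None instead of throwing when the index is out of range.
val fifthColumn: Option[String] = columns.lift(4)
val firstColumn: Option[String] = columns.lift(0)

fifthColumn match {
  case Some(name) => println(s"Column at index 4: $name")
  case None       => println(s"No column at index 4; only ${columns.length} columns exist")
}
```

With fewer than six columns, `df.columns(4)` and `df.columns(5)` cannot work, which is when splitting the string value itself is the approach to use.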

4 Comments

In df.columns, should it be the actual column name, which is SERVICE_NAME8?
Yes. In the answer, "df" is the DataFrame and "df.columns" returns an array of the DataFrame's column names: spark.apache.org/docs/2.3.0/api/java/org/apache/spark/sql/…
I get java.lang.ArrayIndexOutOfBoundsException: 4 when I use your logic. Also, how can I escape '/' in my string?
It looks like only parts of the string are required; please see the updated answer.
