1

Got Input df1 and df2

df1:

Subcategory_Desc    Segment_Desc    Flow            Side        Row_no
APPLE               APPLE LOOSE     Apple Kanzi     Front       Row 1
APPLE               APPLE LOOSE     Apple Jazz      Front       Row 1
CITRUS              ORANGES LOOSE   Orange Navel    Front       Row 1
PEAR                PEARS LOOSE     Lemon           Right End   Row 1
AVOCADOS            AVOCADOS LOOSE  Avocado         Back        Row 1
TROPICAL FRUIT      KIWI FRUIT      Kiwi Gold       Back        Row 1
TROPICAL FRUIT      KIWI FRUIT      Kiwi Green      Left End    Row 1

df2:

Subcategory_Desc    Segment_Desc    Flow
TROPICAL FRUIT      KIWI FRUIT      5pk Kids Kiwi
APPLE               APPLE LOOSE     Apple GoldenDel
AVOCADOS            AVOCADOS LOOSE  Avocado Tray

Scenario: Dataframe df2 rows should be inserted to dataframe df1 considering below condition:

  1. Check for the similar Subcategory_Desc and Segment_Desc of df2 in df1 and insert that df2 row at the end of that particular Side(Front/Back). As given in expected Output.
  2. Need to consider Row_no column as well, because original dataset holds n number of Row_no, here have given Row 1 alone for sample data.

Expected Output:

Subcategory_Desc    Segment_Desc    Flow            Side        Row_no
APPLE               APPLE LOOSE     Apple Kanzi     Front       Row 1
APPLE               APPLE LOOSE     Apple Jazz      Front       Row 1
CITRUS              ORANGES LOOSE   Orange Navel    Front       Row 1
APPLE               APPLE LOOSE     Apple GoldenDel Front       Row 1
PEAR                PEARS LOOSE     Lemon           Right End   Row 1
AVOCADOS            AVOCADOS LOOSE  Avocado         Back        Row 1
TROPICAL FRUIT      KIWI FRUIT      Kiwi Gold       Back        Row 1
TROPICAL FRUIT      KIWI FRUIT      5pk Kids Kiwi   Back        Row 1
AVOCADOS            AVOCADOS LOOSE  Avocado Tray    Back        Row 1
TROPICAL FRUIT      KIWI FRUIT      Kiwi Green      Left End    Row 1

Not sure what simple logic can be used for this purpose.

1
  • Any ideas welcomed! Commented Jan 8, 2022 at 10:55

1 Answer 1

1

So, given the following dataframes:

import pandas as pd

df1 = pd.DataFrame(
    {
        "Subcategory_Desc": {
            0: "APPLE",
            1: "APPLE",
            2: "CITRUS",
            3: "PEAR",
            4: "AVOCADOS",
            5: "TROPICAL FRUIT",
            6: "TROPICAL FRUIT",
        },
        "Segment_Desc": {
            0: "APPLE LOOSE",
            1: "APPLE LOOSE",
            2: "ORANGES LOOSE",
            3: "PEARS LOOSE",
            4: "AVOCADOS LOOSE",
            5: "KIWI FRUIT",
            6: "KIWI FRUIT",
        },
        "Flow": {
            0: "Apple Kanzi",
            1: "Apple Jazz",
            2: "Orange Navel",
            3: "Lemon",
            4: "Avocado",
            5: "Kiwi Gold",
            6: "Kiwi Green",
        },
        "Side": {
            0: "Front",
            1: "Front",
            2: "Front",
            3: "Right_End",
            4: "Back",
            5: "Back",
            6: "Left_End",
        },
        "Row_no": {
            0: "Row 1",
            1: "Row 1",
            2: "Row 1",
            3: "Row 1",
            4: "Row 1",
            5: "Row 1",
            6: "Row 1",
        },
    }
)

df2 = pd.DataFrame(
    {
        "Subcategory_Desc": {0: "TROPICAL FRUIT", 1: "APPLE", 2: "AVOCADOS"},
        "Segment_Desc": {0: "KIWI FRUIT", 1: "APPLE LOOSE", 2: "AVOCADOS LOOSE"},
        "Flow": {0: "5pk Kids Kiwi", 1: "Apple GoldenDel", 2: "Avocado Tray"},
    }
)

You could try this:

# Initialize new column
df2["idx"] = ""

# Find indice of first match in df1
for _, row2 in df2.iterrows():
    for i, row1 in df1.iterrows():
        if i + 1 >= df1.shape[0]:
            break
        if (
            row1["Subcategory_Desc"] == row2["Subcategory_Desc"]
            and row1["Segment_Desc"] == row2["Segment_Desc"]
        ):
            row2["idx"] = i

df2 = df2.sort_values(by="idx").reset_index(drop=True)

# Starting from previous indice, find insertion indice in df1
for i, idx in enumerate(df2["idx"]):
    side_of_idx = df1.loc[idx, "Side"]
    df2.loc[i, "pos"] = df1.index[df1["Side"] == side_of_idx].to_list()[-1] + 1
positions = df2["pos"].astype("int").to_list()

# Clean up df2
df2 = df2.drop(columns=["idx", "pos"])
df2["Side"] = df2["Row_no"] = ""

# Iterate on df1 to insert new rows
for i, pos in enumerate(positions):

    # Fill missing values
    df2.loc[i, "Side"] = df1.loc[pos - 1, "Side"]
    df2.loc[i, "Row_no"] = df1.loc[pos, "Row_no"]

    # Insert row
    df1 = pd.concat(
        [df1.iloc[:pos], pd.DataFrame([df2.iloc[i]]), df1.iloc[pos:]], ignore_index=True
    ).reset_index(drop=True)

    # Increment next position since df1 has changed
    if i < len(positions) - 1:
        positions[i + 1] += 1

And so:

print(df1)
# Outputs
  Subcategory_Desc    Segment_Desc             Flow       Side Row_no
0            APPLE     APPLE LOOSE      Apple Kanzi      Front  Row 1
1            APPLE     APPLE LOOSE       Apple Jazz      Front  Row 1
2           CITRUS   ORANGES LOOSE     Orange Navel      Front  Row 1
3            APPLE     APPLE LOOSE  Apple GoldenDel      Front  Row 1
4             PEAR     PEARS LOOSE            Lemon  Right_End  Row 1
5         AVOCADOS  AVOCADOS LOOSE          Avocado       Back  Row 1
6   TROPICAL FRUIT      KIWI FRUIT        Kiwi Gold       Back  Row 1
7   TROPICAL FRUIT      KIWI FRUIT    5pk Kids Kiwi       Back  Row 1
8         AVOCADOS  AVOCADOS LOOSE     Avocado Tray       Back  Row 1
9   TROPICAL FRUIT      KIWI FRUIT       Kiwi Green   Left_End  Row 1
Sign up to request clarification or add additional context in comments.

1 Comment

Thank You for your input @Laurent. A little change worked perfect for me. Here is the change on incrementing next position. if i < len(positions) - 1: positions[i + 1] += i + 1

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.