I am looking at building lists of lists within a dictionary from an Excel spreadsheet.

My spreadsheet looks like this:

source_item_id target_item_id find_string replace_string
source_id1 target_id1 abcd1 efgh1
source_id1 target_id1 ijkl1 mnop1
source_id1 target_id2 abcd2 efgh2
source_id1 target_id2 ijkl2 mnop2
source_id2 target_id3 qrst uvwx
source_id2 target_id3 yzab cdef
source_id2 target_id4 ghij klmn
source_id2 target_id4 opqr stuv

My output dictionary should look like this:

{
  "source_id1": [{
      "target_id1": [{
          "find_string": "abcd1",
          "replace_string": "efgh1"
      },
      {
          "find_string": "ijkl1",
          "replace_string": "mnop1"
      }]
  },
  {
      "target_id2": [{
          "find_string": "abcd2",
          "replace_string": "efgh2"
      },
      {
          "find_string": "ijkl2",
          "replace_string": "mnop2"
      }]
  }],
  "source_id2": [{
      "target_id3": [{
          "find_string": "qrst",
          "replace_string": "uvwx"
      },
      {
          "find_string": "yzab",
          "replace_string": "cdef"
      }]
  },
  {
      "target_id4": [{
          "find_string": "ghij",
          "replace_string": "klmn"
      },
      {
          "find_string": "opqr",
          "replace_string": "stuv"
      }]
  }]
}

With the following code I only get the last values in each of the lists:

import xlrd
xls_path = r"C:\data\ItemContent.xlsx"
book = xlrd.open_workbook(xls_path)
sheet_find_replace = book.sheet_by_index(1)
find_replace_dict = dict() 
for line in range(1, sheet_find_replace.nrows):
    source_item_id = sheet_find_replace.cell(line, 0).value
    target_item_id = sheet_find_replace.cell(line, 1).value
    find_string = sheet_find_replace.cell(line, 2).value
    replace_sting = sheet_find_replace.cell(line, 3).value
    find_replace_list = [{"find_string": find_string, "replace_sting": replace_sting}]
    find_replace_dict[source_item_id] = [target_item_id]
    find_replace_dict[source_item_id].append(find_replace_list)
print(find_replace_dict)

--> result

{
    "source_id1": ["target_id2", [{
        "find_string": "ijkl2",
        "replace_sting": "mnop2"
      }
    ]],
    "source_id2": ["target_id4", [{
        "find_string": "opqr",
        "replace_sting": "stuv"
      }
    ]]
}
  • Out of curiosity - this is a bit orthogonal - is there a reason that source_idx points to a list of dictionaries, each with one key only (target_idx)? It might feel more natural to have that be a dictionary instead of a list, with a key-value relationship. Commented Oct 5, 2021 at 17:38
  • The errors are most likely in the last three lines of the for loop; I'm pretty sure you are over-writing dictionary values here. I'd advise running this through a debugger and checking the output at each step. Worst-case: use print statements to print the values of find_replace_list and find_replace_dict. Commented Oct 5, 2021 at 17:43
  • Are you using an older version of Python? xlrd.open_workbook seems to fail in Python 3.9. Commented Oct 5, 2021 at 17:57
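
The overwriting mentioned in the comments can be reproduced without the spreadsheet. The following is a minimal sketch, assuming two made-up rows that share a source id (the names rows, overwritten and kept are hypothetical); it contrasts plain assignment with appending to an existing list:

```python
# Two hypothetical rows that share the same source id
rows = [("source_id1", "target_id1"), ("source_id1", "target_id2")]

# Plain assignment replaces whatever was stored for the key on every pass
overwritten = {}
for source, target in rows:
    overwritten[source] = [target]

# setdefault creates the list once and appends to it on later passes
kept = {}
for source, target in rows:
    kept.setdefault(source, []).append(target)

print(overwritten)
print(kept)
```

Only the last row survives in the first dictionary, which matches the symptom in the question.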

1 Answer

Your problem is complicated by the fact that each source id maps to a list of single-key dictionaries. You can still follow a simple pattern: parse each line into the relevant items, then use those to find the list you should append to, or create a new list if none exists yet:

from typing import List, Tuple

def process_line(line: int) -> Tuple[str, str, dict]:
    # Read one row from sheet_find_replace (the module-level sheet opened below)
    source_item_id = sheet_find_replace.cell(line, 0).value
    target_item_id = sheet_find_replace.cell(line, 1).value
    find_string = sheet_find_replace.cell(line, 2).value
    replace_string = sheet_find_replace.cell(line, 3).value
    return source_item_id, target_item_id, {
        "find_string": find_string,
        "replace_string": replace_string
    }

def find_target(target: str, ls: List[dict]) -> int:
    # Find the index of the single-key dictionary whose key is the target id
    for i, entry in enumerate(ls):
        if target in entry:
            return i
    return -1  # Marker for "not found"

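import xlrd
# Note: xlrd 2.0 and later read only .xls files; for .xlsx use an older xlrd or another reader such as openpyxl
xls_path = r"C:\data\ItemContent.xlsx"
book = xlrd.open_workbook(xls_path)
sheet_find_replace = book.sheet_by_index(1)
result_dict = dict()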
for line in range(1, sheet_find_replace.nrows):
    source, target, replacer = process_line(line)
    # You can check here that the above three are correct
    source_list = result_dict.get(source, [])  # Leverage the default value of the get function
    target_idx = find_target(target, source_list)
    target_dict = source_list[target_idx] if target_idx >= 0 else {}
    replace_list = target_dict.get(target, [])
    replace_list.append(replacer)
    
    target_dict[target] = replace_list
    if target_idx >= 0:
        source_list[target_idx] = target_dict
    else:
        source_list.append(target_dict)

    result_dict[source] = source_list

print(result_dict)

Note that if each source id pointed to a dictionary rather than a list, this could be radically simplified: there would be no need to search the list for an already-existing entry and then awkwardly replace or append as needed. If you can change this constraint (remember, you can always convert a dictionary to a list downstream), I would consider doing that.
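
As a rough illustration of that simplification, here is a minimal sketch that assumes the rows have already been read into plain tuples. The names rows, build_nested and to_list_shape are hypothetical and not part of the code above; the rows are copied from the sample spreadsheet in the question:

```python
from collections import defaultdict

# Hypothetical stand-in for the spreadsheet rows, header already skipped
rows = [
    ("source_id1", "target_id1", "abcd1", "efgh1"),
    ("source_id1", "target_id1", "ijkl1", "mnop1"),
    ("source_id1", "target_id2", "abcd2", "efgh2"),
    ("source_id1", "target_id2", "ijkl2", "mnop2"),
    ("source_id2", "target_id3", "qrst", "uvwx"),
    ("source_id2", "target_id3", "yzab", "cdef"),
    ("source_id2", "target_id4", "ghij", "klmn"),
    ("source_id2", "target_id4", "opqr", "stuv"),
]

def build_nested(rows):
    """Group rows as {source_id: {target_id: [find/replace dicts]}}."""
    nested = defaultdict(lambda: defaultdict(list))
    for source, target, find_string, replace_string in rows:
        nested[source][target].append(
            {"find_string": find_string, "replace_string": replace_string}
        )
    # Convert the defaultdicts back to plain dicts for printing or serializing
    return {source: dict(targets) for source, targets in nested.items()}

def to_list_shape(nested):
    """Convert to the list-of-single-key-dicts layout from the question."""
    return {
        source: [{target: pairs} for target, pairs in targets.items()]
        for source, targets in nested.items()
    }

nested = build_nested(rows)
as_lists = to_list_shape(nested)
print(as_lists)
```

With a dictionary at each level there is nothing to search: the nested defaultdict creates missing entries on first access, and the conversion to the original list layout happens once at the end.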

4 Comments

Thanks for your reply. I am getting Unresolved reference 'Tuple' and Unresolved reference 'List' for the process_line and find_target functions respectively.
In looking at how I need to use this dictionary, making the source_id a dictionary instead of a list is also possible, especially if it is going to make it more simple. Thanks once again.
@Genspec You might need to add from typing import Tuple, List. Or you can drop the type hints.
Thanks @Nathaniel, this is now working. I had to make a minor change in find_target: for i in ls: if ls[len(ls)-1].get(target): return len(ls)-1
