Background
For some background, I'm trying to create a tool that converts worksheets into API calls using Python 3.5.
To convert the table cells into the schema needed for the API call, I've started using JavaScript-like syntax for the spreadsheet headers, e.g.:
Worksheet Header (string)
dict.list[0].id
Python Dictionary
{
    "dict": {
        "list": [
            {"id": "my cell value"}
        ]
    }
}
It's also possible that the header schema could have nested arrays/dicts:
one.two[0].three[0].four.five[0].six
And I also need to append to the object after it has been created as I go through each header.
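To make the header syntax concrete, here is a minimal sketch of splitting a header into path tokens, where a string means a dictionary key and an integer means a list index. The name parse_header and the two regular expressions are assumptions made for illustration; they are not part of the code in this question.

```python
import re

# One dotted segment: a name followed by zero or more [n] indexes.
_SEGMENT = re.compile(r"^([^\[\]]+)((?:\[\d+\])*)$")
_INDEX = re.compile(r"\[(\d+)\]")


def parse_header(header):
    """Split 'a.b[0].c' into ['a', 'b', 0, 'c'] (str = dict key, int = list index)."""
    tokens = []
    for segment in header.split("."):
        match = _SEGMENT.match(segment)
        if match is None:
            raise ValueError("Unsupported header segment: {!r}".format(segment))
        tokens.append(match.group(1))
        # Each [n] after the name becomes an integer token.
        tokens.extend(int(i) for i in _INDEX.findall(match.group(2)))
    return tokens


print(parse_header("dict.list[0].id"))
# ['dict', 'list', 0, 'id']
```

Keeping the index as a separate integer token means the later tree-building step does not have to re-parse the key with a regex at every level.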
What I've tried
add_branch
Based on https://stackoverflow.com/a/47276490/2903486, I am able to set up nested dictionaries using headers like one.two.three.four, and I'm able to append to the existing dictionary as I go through the rows. However, I've been unable to add support for arrays:
def add_branch(tree, vector, value):
    key = vector[0]
    tree[key] = value \
        if len(vector) == 1 \
        else add_branch(tree[key] if key in tree else {},
                        vector[1:],
                        value)
    return tree

# Excerpt from the body of a function, hence the return at the end.
file = Worksheet(filePath, sheet).readRow()
rowList = []

for row in file:
    rowObj = {}
    for colName, rowValue in row.items():
        rowObj.update(add_branch(rowObj, colName.split("."), rowValue))
    rowList.append(rowObj)
return rowList
My own version of add_branch
import json
import re


def branch(tree, vector, value):
    """
    Used to convert JS style notation (e.g dict.another.array[0].id) to a python object
    Originally based on https://stackoverflow.com/a/47276490/2903486
    """

    # Convert Boolean
    if isinstance(value, str):
        value = value.strip()

        if value.lower() in ['true', 'false']:
            value = True if value.lower() == "true" else False

    # Convert JSON (non-JSON strings and non-string values are left as-is)
    try:
        value = json.loads(value)
    except (TypeError, ValueError):
        pass

    key = vector[0]
    # Look for an array index such as [0] in the key
    arr = re.search(r'\[([0-9]+)\]', key)

    if arr:
        arr = arr.group(0)
        key = key.replace(arr, '')
        arr = arr.replace('[', '').replace(']', '')

        newArray = False
        if key not in tree:
            tree[key] = []
            tree[key].append(value \
                             if len(vector) == 1 \
                             else branch({} if key in tree else {},
                                         vector[1:],
                                         value))
        else:
            isInArray = False
            for x in tree[key]:
                if x.get(vector[1:][0], False):
                    isInArray = x[vector[1:][0]]

            if isInArray:
                tree[key].append(value \
                                 if len(vector) == 1 \
                                 else branch({} if key in tree else {},
                                             vector[1:],
                                             value))
            else:
                tree[key].append(value \
                                 if len(vector) == 1 \
                                 else branch({} if key in tree else {},
                                             vector[1:],
                                             value))

        # A single comma-separated leaf value becomes a list
        if len(vector) == 1 and len(tree[key]) == 1:
            tree[key] = value.split(",")
    else:
        tree[key] = value \
            if len(vector) == 1 \
            else branch(tree[key] if key in tree else {},
                        vector[1:],
                        value)
    return tree
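As an aside, the value conversion at the top of branch could live in its own helper so that it runs once per cell instead of at every level of the recursion. The sketch below is only an illustration; the name coerce_cell is hypothetical, and it assumes that non-string cells should pass through unchanged.

```python
import json


def coerce_cell(value):
    """Turn a worksheet cell into a Python value; non-strings pass through."""
    if not isinstance(value, str):
        return value
    text = value.strip()
    # Accept true/false in any letter case, as branch does.
    if text.lower() in ("true", "false"):
        return text.lower() == "true"
    try:
        return json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return text     # not valid JSON, keep the stripped string


print(coerce_cell(" TRUE "))     # True
print(coerce_cell("42"))         # 42
print(coerce_cell("1,2,3"))      # 1,2,3
```

A comma-separated cell such as "1,2,3" is not valid JSON, so it stays a string and can still be split into a list later.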
What still needs help
My branch solution now works pretty well after a few additions, but I'm wondering whether I'm doing something wrong or messy here, or whether there's a better way to handle editing nested arrays (my attempt starts in the if isInArray section of the code).
I'd expect these two headers to both edit the same nested dictionary, but instead I end up creating a duplicate dictionary in the first array:
# Excerpt from the body of a function, hence the return at the end.
file = [{
    "one.array[0].dict.arrOne[0]": "1,2,3",
    "one.array[0].dict.arrTwo[0]": "4,5,6"
}]
rowList = []

for row in file:
    rowObj = {}
    for colName, rowValue in row.items():
        rowObj.update(branch(rowObj, colName.split("."), rowValue))
    rowList.append(rowObj)
return rowList
Outputs:
[
    {
        "one": {
            "array": [
                {
                    "dict": {
                        "arrOne": [
                            "1",
                            "2",
                            "3"
                        ]
                    }
                },
                {
                    "dict": {
                        "arrTwo": [
                            "4",
                            "5",
                            "6"
                        ]
                    }
                }
            ]
        }
    }
]
Instead of:
[
    {
        "one": {
            "array": [
                {
                    "dict": {
                        "arrOne": [
                            "1",
                            "2",
                            "3"
                        ],
                        "arrTwo": [
                            "4",
                            "5",
                            "6"
                        ]
                    }
                }
            ]
        }
    }
]
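For illustration, here is one possible way to get the merged result above: use the [n] index to reuse an existing list element instead of always appending. This is a rough sketch under stated assumptions, not a drop-in replacement for branch. The names parse_header, set_path and row_to_object are hypothetical, and the comma-splitting rule only approximates what branch does for a trailing index.

```python
import json
import re

_SEGMENT = re.compile(r"^([^\[\]]+)((?:\[\d+\])*)$")
_INDEX = re.compile(r"\[(\d+)\]")


def parse_header(header):
    """Split 'a.b[0].c' into ['a', 'b', 0, 'c'] (str = dict key, int = list index)."""
    tokens = []
    for segment in header.split("."):
        match = _SEGMENT.match(segment)
        if match is None:
            raise ValueError("Unsupported header segment: {!r}".format(segment))
        tokens.append(match.group(1))
        tokens.extend(int(i) for i in _INDEX.findall(match.group(2)))
    return tokens


def set_path(tree, tokens, value):
    """Write value at the path in tokens, reusing containers that already exist."""
    node = tree
    for token, nxt in zip(tokens, tokens[1:]):
        # The kind of container needed here depends on the next token.
        empty = [] if isinstance(nxt, int) else {}
        if isinstance(token, int):
            while len(node) <= token:  # pad the list up to the index
                node.append(None)
            if node[token] is None:
                node[token] = empty
            node = node[token]
        else:
            node = node.setdefault(token, empty)
    last = tokens[-1]
    if isinstance(last, int):
        while len(node) <= last:
            node.append(None)
    node[last] = value
    return tree


def row_to_object(row):
    obj = {}
    for header, cell in row.items():
        tokens = parse_header(header)
        if isinstance(tokens[-1], int) and isinstance(cell, str):
            # Assumption: a trailing [n] with a string cell means
            # "this cell is a whole comma-separated list".
            tokens, cell = tokens[:-1], cell.split(",")
        set_path(obj, tokens, cell)
    return obj


row = {
    "one.array[0].dict.arrOne[0]": "1,2,3",
    "one.array[0].dict.arrTwo[0]": "4,5,6",
}
print(json.dumps(row_to_object(row), sort_keys=True))
# {"one": {"array": [{"dict": {"arrOne": ["1", "2", "3"], "arrTwo": ["4", "5", "6"]}}]}}
```

The sketch does not handle conflicting headers, for example one header treating a key as a dictionary and another treating the same key as a list.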
If list looks like [{'id': 1}, {'id': 2}], how do you know which id you are referring to?
Either list[0].id, or by checking whether the indicator is an int, like list.1.id. I was hoping it could be something like the latter, but haven't figured that out entirely.