I need to format the multi-line string shown below in Python. I have tried several approaches, but none of them worked well.

AMAZON
IPHONE: 700
SAMSUNG: 600

=============

WALMART
IPHONE: 699

===========

ALIBABA
SONY: 500

The data above lists online stores, the mobile phone brands each one sells, and their prices. I need to add these to a database, so each record should look like this:

-------------------
AMAZON | IPHONE | 700
-------------------
AMAZON | SAMSUNG | 600
-------------------
WALMART | IPHONE | 699
-------------------
ALIBABA | SONY | 500
-------------------

I need to parse the text above into this form and store it in a database table.

What have I tried? I split the text into lines and tried to build a dictionary (much like JSON), but that did not work: it only captured one line. If there is an easier approach, please share it.

  • Is the "=" character actually present in the string? Commented Nov 15, 2020 at 16:41
  • @prajwal Yes, it is present in the string. Commented Nov 15, 2020 at 16:45

2 Answers

I made some assumptions:

  • the vendor name always comes before its products
  • vendor entries are separated by a line containing at least ===
  • empty lines can be ignored

Working code:

# named "text" rather than "str" so the built-in str type is not shadowed
text = """
AMAZON
IPHONE: 700
SAMSUNG: 600

=============

WALMART
IPHONE: 699

===========

ALIBABA
SONY: 500
"""

new_entry = True
print("-------------------")
for line in text.split("\n"):
    # assuming the first non-empty line of each entry is the vendor name
    if not line.strip():
        continue
    elif new_entry:
        vendor = line.strip()
        new_entry = False
    elif "===" in line:
        new_entry = True
    else:
        # split only on the first colon: "IPHONE: 700" -> ["IPHONE", " 700"]
        product = line.split(":", 1)
        print("{} | {} | {}".format(vendor, product[0].strip(), product[1].strip()))
        print("-------------------")

Output is:

-------------------
AMAZON | IPHONE | 700
-------------------
AMAZON | SAMSUNG | 600
-------------------
WALMART | IPHONE | 699
-------------------
ALIBABA | SONY | 500
-------------------

Alternative approach: the vendor name could also be detected as any text line that is not a separator and contains no colon.
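
A minimal sketch of that alternative, assuming every product line contains a colon, every separator line consists only of "=" characters, and every price is a whole number. The function name `parse_prices` and the variable `sample` are made up for this illustration. It returns tuples instead of printing, which is usually easier to pass on to a database later.

```python
def parse_prices(text):
    """Return a list of (vendor, product, price) tuples parsed from text."""
    rows = []
    vendor = None  # stays None if a product line appears before any vendor line
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and separator lines made only of "=" characters.
        if not line or set(line) == {"="}:
            continue
        if ":" in line:
            # Product line such as "IPHONE: 700".
            product, price = line.split(":", 1)
            rows.append((vendor, product.strip(), int(price)))
        else:
            # Any other non-empty line is treated as a vendor name.
            vendor = line
    return rows


sample = """
AMAZON
IPHONE: 700
SAMSUNG: 600

=============

WALMART
IPHONE: 699

===========

ALIBABA
SONY: 500
"""

for vendor, product, price in parse_prices(sample):
    print("{} | {} | {}".format(vendor, product, price))
```

With this variant no `new_entry` flag is needed, because the presence of a colon alone decides whether a line is a product or a vendor.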

The answer submitted by @scito is adequate, but I am adding mine just in case. You can use regular expressions; the following is a working example:

strng = """
AMAZON
IPHONE: 700
SAMSUNG: 600

=============

WALMART
IPHONE: 699

===========

ALIBABA
SONY: 500

======
"""

import re

multistrng = strng.split("\n")  # split the text into individual lines

market_re = re.compile(r"([a-zA-Z]+)")  # regex to find a marketplace name

phone_re = re.compile(r"([a-zA-Z]+):\s(\d+)")  # regex to find a phone and its price

js = []  # one sub-list per marketplace: [name, (phone, price), ...]

for line in multistrng:
    phone = phone_re.findall(line)  # non-empty if the line contains a phone and its price
    if phone:
        js[-1].append(phone[0])  # add the phone to the most recently found marketplace
        continue
    market = market_re.findall(line)
    if market:  # the line contains a marketplace name
        js.append([market[0]])
    # empty lines and "=====" separators match neither regex and are ignored

# the data is now structured; you can print it or add it to the database

for market in js:
    for product in market[1:]:
        print("---------------------")
        print("{} | {} | {}".format(market[0], product[0], product[1]))

print("---------------------")

Output:

---------------------
AMAZON | IPHONE | 700
---------------------
AMAZON | SAMSUNG | 600
---------------------
WALMART | IPHONE | 699
---------------------
ALIBABA | SONY | 500
---------------------

The data is stored in the js list. In each sub-list, the first element is the marketplace and the remaining elements are the products for that marketplace:

[['AMAZON', ('IPHONE', '700'), ('SAMSUNG', '600')], ['WALMART', ('IPHONE', '699')], ['ALIBABA', ('SONY', '500')]]
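
Since the question does not say which database is used, here is a hedged sketch that stores this structure with Python's built-in `sqlite3` module. The table name `prices`, its column names, and the helpers `flatten` and `store` are assumptions made for this example, and prices are assumed to be whole numbers.

```python
import sqlite3

# Same structure as the js list shown above.
js = [
    ["AMAZON", ("IPHONE", "700"), ("SAMSUNG", "600")],
    ["WALMART", ("IPHONE", "699")],
    ["ALIBABA", ("SONY", "500")],
]


def flatten(markets):
    """Turn [market, (product, price), ...] sub-lists into flat row tuples."""
    return [
        (market[0], product, int(price))
        for market in markets
        for product, price in market[1:]
    ]


def store(rows, db_path=":memory:"):
    """Insert the rows into a hypothetical 'prices' table and return the connection."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS prices "
        "(store TEXT, product TEXT, price INTEGER)"
    )
    # Parameterized inserts avoid building SQL strings by hand.
    conn.executemany("INSERT INTO prices VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


conn = store(flatten(js))
for row in conn.execute("SELECT store, product, price FROM prices"):
    print(" | ".join(str(value) for value in row))
conn.close()
```

Passing a file name instead of the default `":memory:"` keeps the data on disk. For another database, the same `executemany` pattern generally applies, although the placeholder style can differ between drivers.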
