Personally I'd parse the data from sitemap_bp.csv into a dictionary first, then use that dictionary to populate the new file.

    import csv
    import re

    with open('sitemap_bp.csv', 'r') as csvinput, \
         open('mobilesitemap-browse.csv', 'r') as csvinput2, \
         open('output.csv', 'w') as csvoutput:
        writer = csv.writer(csvoutput, lineterminator='\n')
        sitemap = csvinput  # no reason to pipe this through csv.reader
        mobilesitemap = csv.reader(csvinput2)
        # three underscore-separated groups of digits
        item_number = re.compile(r"\d{5}_\d{7}_\d{7}")
        item_number_mapping = {item_number.search(line).group(): line.strip()
                               for line in sitemap if item_number.search(line)}
        # makes a dictionary {item_number: full_url, ...} for each item in sitemap
        # alternate to the above, consider:
        # item_number_mapping = {}
        # for line in sitemap:
        #     line = line.strip()
        #     match = item_number.search(line)
        #     if match:
        #         item_number_mapping[match.group()] = match.string
        all = [row + [item_number_mapping[row[1]]] for row in mobilesitemap]
        writer.writerows(all)

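For anyone who wants to try the dictionary approach without the real files, here is a minimal in-memory sketch. The sample URLs, the `build_mapping` and `join_rows` helper names, and the item-number format (three underscore-separated digit groups) are my assumptions, not taken from your data. Unlike the list comprehension above, it skips rows whose item number is missing from the mapping rather than raising KeyError.

```python
import csv
import io
import re

# Hypothetical sample data standing in for the two CSV files.
SITEMAP_TEXT = (
    "url\n"
    "http://example.com/p/12345_1234567_1234567\n"
    "http://example.com/p/54321_7654321_7654321\n"
    "http://example.com/about\n"
)
MOBILE_TEXT = (
    "m.example.com/a,12345_1234567_1234567\n"
    "m.example.com/b,54321_7654321_7654321\n"
)

# Assumed item-number format: 5 digits, 7 digits, 7 digits.
ITEM_NUMBER = re.compile(r"\d{5}_\d{7}_\d{7}")


def build_mapping(lines):
    """Map each item number found in a line to the stripped line itself."""
    mapping = {}
    for line in lines:
        match = ITEM_NUMBER.search(line)
        if match:
            mapping[match.group()] = line.strip()
    return mapping


def join_rows(mobile_rows, mapping):
    """Append the matching sitemap URL to every mobile row that has one."""
    return [row + [mapping[row[1]]] for row in mobile_rows if row[1] in mapping]


# The sitemap is read exactly once; every lookup after that is a dict access.
mapping = build_mapping(io.StringIO(SITEMAP_TEXT))
joined = join_rows(csv.reader(io.StringIO(MOBILE_TEXT)), mapping)

output = io.StringIO()
csv.writer(output, lineterminator="\n").writerows(joined)
print(output.getvalue())
```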
My guess is that after the first time through your outer for loop, it tries to iterate through sitemap again but can't since the file is already exhausted. The minimal change for that would be:

    for mobilerow in mobilesitemap:
        csvinput.seek(0)  # seek to the start of the file object
        next(sitemap)  # skip the header row
        for row in sitemap:
            #print row[0]
            if mobilerow[1] in row[0]:
                #print row, mobilerow[1]
                all.append((row[0], mobilerow[1]))
            else:
                all.append(row)

But the obvious reason not to do this is that it iterates through your sitemap_bp.csv file once per row in mobilesitemap-browse.csv, rather than just once like my code.
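If the exhaustion part is unclear, this small sketch shows the behaviour in isolation. It uses io.StringIO as a stand-in for an open file (my assumption, so it runs without touching disk); a real file object opened in text mode behaves the same way for this purpose.

```python
import io

# Hypothetical stand-in for the open sitemap file.
sitemap = io.StringIO("header\nrow1\nrow2\n")

# The first pass consumes every line.
first_pass = [line.strip() for line in sitemap]

# The second pass finds nothing: the file position is already at the end.
second_pass = [line.strip() for line in sitemap]

sitemap.seek(0)  # rewind to the start of the file object
next(sitemap)    # skip the header row
after_seek = [line.strip() for line in sitemap]

print(first_pass)
print(second_pass)
print(after_seek)
```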
EDIT per question in comments
If you need to get a list of those URLs in sitemap_bp.csv that don't correspond with mobilesitemap-browse.csv, you're probably best-served by making a set for all the items you see as you see them, then using set operations to get the unseen items. This takes a little tinkering, but...

    # instead of all = [row + [item number ...
    seen = set()
    all = []
    for row in mobilesitemap:
        item_no = row[1]
        if item_no in item_number_mapping:
            all.append(row + [item_number_mapping[item_no]])
            seen.add(item_no)
    # after this for loop, `all` matches the list comp version, except that
    # rows with no entry in item_number_mapping are skipped instead of
    # raising KeyError
    unmatched_items = [item_number_mapping[item_num] for item_num in
                       set(item_number_mapping.keys()) - seen]

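Here is the same idea as a self-contained sketch you can run on its own. The mapping contents and the `mobile_rows` and `matched` names are made up for illustration. In Python 3 the keys view supports set subtraction directly, so the explicit set(...) wrapper is optional there.

```python
# Hypothetical mapping of item number -> full sitemap URL.
item_number_mapping = {
    "11111_1111111_1111111": "http://example.com/p/11111_1111111_1111111",
    "22222_2222222_2222222": "http://example.com/p/22222_2222222_2222222",
    "33333_3333333_3333333": "http://example.com/p/33333_3333333_3333333",
}

# Hypothetical rows from the mobile sitemap: [mobile_url, item_number].
mobile_rows = [
    ["m.example.com/a", "11111_1111111_1111111"],
    ["m.example.com/x", "99999_9999999_9999999"],  # not in the sitemap
]

seen = set()
matched = []
for row in mobile_rows:
    item_no = row[1]
    if item_no in item_number_mapping:
        matched.append(row + [item_number_mapping[item_no]])
        seen.add(item_no)

# Keys that were never seen are the sitemap URLs with no mobile counterpart.
# sorted() only makes the output order deterministic.
unmatched_items = sorted(
    item_number_mapping[item_no]
    for item_no in item_number_mapping.keys() - seen
)

print(matched)
print(unmatched_items)
```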
with expressions. You can chain them with commas, e.g.

    with open('file1.txt') as file1, open('file2.txt') as file2, ...

    \d{4,}_\d{4,}_\d{4,}_\d{4,}_\d{4,}|\d{4,}_\d{4,}_\d{4,}_\d{4,}|\d{4,}_\d{4,}_\d{4,}|\d{4,}

to capture new types.

    r"\d{4,}(?:_\d{4,})*"
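A quick way to compare the long alternation with the shorter repeated-group pattern is to run both over a few strings. The sample URLs below are invented. As far as I can tell the two agree on one, three, four and five digit groups, but they differ on exactly two groups (the alternation has no two-group branch, so it falls back to the first group only) and on more than five.

```python
import re

# The long alternation, split across lines for readability.
LONG = re.compile(
    r"\d{4,}_\d{4,}_\d{4,}_\d{4,}_\d{4,}"
    r"|\d{4,}_\d{4,}_\d{4,}_\d{4,}"
    r"|\d{4,}_\d{4,}_\d{4,}"
    r"|\d{4,}"
)
# The shorter form: one digit group, then any number of "_group" repeats.
SHORT = re.compile(r"\d{4,}(?:_\d{4,})*")

# Hypothetical URLs covering one, two, three and four digit groups.
samples = [
    "http://example.com/p/12345",
    "http://example.com/p/1234_5678",
    "http://example.com/p/12345_1234567_1234567",
    "http://example.com/p/1234_5678_9012_3456",
]

results = {
    url: (LONG.search(url).group(), SHORT.search(url).group())
    for url in samples
}

for url, (long_match, short_match) in results.items():
    print(url, long_match, short_match, long_match == short_match)
```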