Replace all the values in a certain column with certain values using csv reader Python

Question

This question continues from my previous question. Thanks to many people, I was able to modify my code as below.

import csv
with open("SURFACE2", "rb") as infile, open("output.txt", "wb") as outfile:
    reader = csv.reader(infile, delimiter=" ")
    writer = csv.writer(outfile, delimiter=" ")
    for row in reader:
        row[18] = "999"                  

        writer.writerow(row)

I just changed the delimiter from "\t" to " ". With the previous delimiter the code only worked up to row[0]; with " " it works up to row[18].

15.20000           120.60000 98327      get data information here.  SURFACE DATA FROM ??????????? SOURCE    FM-12 SYNOP                                                                                155.00000         1         0         0         0         0         T         F         F   -888888   -888888      20020601030000 100820.00000

In the data line above, row[18] falls somewhere between 15.20000 and 120.60000.

I am not sure what happens between these two values. Maybe the delimiter changes? Visually, though, I can't notice any difference. Is there any way to tell whether the delimiter changed and, if so, do you have any idea how to handle multiple delimiters in one piece of code?

Any idea or help would be really appreciated.

Thank you, Isaac

The results from repr(next(infile)):

'            15.20000           120.60000 98327      get data information here.  SURFACE DATA FROM ??????????? SOURCE    FM-12 SYNOP                                                                                155.00000         1         0         0         0         0         T         F         F   -888888   -888888      20020601030000 100820.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0\n'
'  99070.00000      0    155.00000      0    303.20001      0    297.79999      0      3.00000      0    140.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0\n'
'-777777.00000      0-777777.00000      0      1.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0\n'
'      1      0      0\n'
'            55.10000            -3.60000 03154      get data information here.  SURFACE DATA FROM ??????????? SOURCE    FM-12 SYNOP                                                                                 16.00000         1         0         0         0         0         T         F         F   -888888   -888888      20020601030000-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0\n'
'-888888.00000      0     16.00000      0    281.20001      0    279.89999      0      0.00000      0      0.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0\n'
'-777777.00000      0-777777.00000      0      1.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0\n'
'      1      0      0\n'

As you can see, the first four lines should actually be one line. For some reason, the full line seems to be divided into 4 parts. Do you have any idea? Thank you, Isaac

Can you clarify what it means when you say "the code can now work until row[18]"?
– Andrew Magee, Commented Feb 27, 2015 at 3:50
I don't understand your question - what is the problem you are facing?
– Burhan Khalid, Commented Feb 27, 2015 at 3:50
Ok so maybe there are exactly 19 fields in the row (row[18] being the last one)?
– Andrew Magee, Commented Feb 27, 2015 at 3:56
There must be a row that just doesn't have that many columns. In your loop you can say print(len(row)) to see how many columns there are in each row.
– Andrew Magee, Commented Feb 27, 2015 at 4:00
Right but the elements in the list row are fields, not characters.
– Andrew Magee, Commented Feb 27, 2015 at 4:02

Community · Accepted Answer · 2017-05-23 11:56:55Z

2

N.B. The file format is discussed on page 19 of this document. This more-or-less agrees with the sample data.

EDIT

OK, after considering the various comments and additional answers, and reading the original question, it would seem that the file in question is not a CSV file. It is weather observation data formatted as "little_r", which uses fixed width fields padded with spaces. There is not much info available so I'm guessing, but each group of 4 lines seems to comprise a single observation. From your previous question it seems that you want to update the 3rd column in the first line? The other 3 lines would be skipped. Then update the 3rd column in the first line of the next set of 4 lines, etc., etc.

An example from the OP:

            15.20000           120.60000 98327      get data information here.  SURFACE DATA FROM ??????????? SOURCE    FM-12 SYNOP                                                                                155.00000         1         0         0         0         0         T         F         F   -888888   -888888      20020601030000 100820.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0
  99070.00000      0    155.00000      0    303.20001      0    297.79999      0      3.00000      0    140.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0
-777777.00000      0-777777.00000      0      1.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0-888888.00000      0
      1      0      0

The first 2 columns of the first line are (I'm guessing) the latitude and longitude for the observations. I have no idea what the 3rd column 98327 is, but this is the column that the OP wants to update (based on previous question).
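To see why csv.reader with delimiter=" " appears to stop at row[18], it can help to split two of the sample lines and look at the result. The sketch below rebuilds the start of the first line and the short fourth line from the repr output in the question; the padding is an assumption read off that output, not taken from a format specification. Every single space counts as a separator, so runs of spaces turn into empty fields.

```python
import csv

# Start of the first line: two 20-character numeric fields, then the
# station id (padding assumed from the repr output in the question).
header = "15.20000".rjust(20) + "120.60000".rjust(20) + " 98327"
# The short fourth line of each group: three 7-character integers (assumed).
tail = "1".rjust(7) + "0".rjust(7) + "0".rjust(7)

# csv.reader accepts any iterable of strings, so a list works for a quick check.
header_row, tail_row = csv.reader([header, tail], delimiter=" ")

# Positions of the two numbers once every space has produced a field.
print(header_row.index("15.20000"), header_row.index("120.60000"))
print(repr(header_row[18]))   # an empty field between the two numbers
print(len(tail_row))          # number of fields in the short line
```

Under these assumptions the short line yields only 19 fields (indices 0 to 18), so row[19] raises an IndexError on it, while row[18] on the first line is just one of the empty fields between the latitude and longitude.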

It's not a CSV file, so don't process it as one. Instead, because there are fixed width fields, we know the offset and width of the field that needs to be updated. Based on the sample data the 3rd column occupies the 5 characters at zero-based positions 41-45 (the slice line[41:46]). So, to update the data and write to a new file:

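offset_col_3 = 41   # zero-based index where the 3rd column starts
length_col_3 = 5    # width of the 3rd column in characters

with open('SURFACE2') as infile, open('output.txt', 'w') as outfile:
    for line_no, line in enumerate(infile):
        if line_no % 4 == 0:    # first line of each group of 4 lines
            # keep the text before and after the field; right-align the new value in it
            line = '{}{:>{width}}{}'.format(line[:offset_col_3], 999,
                                            line[offset_col_3+length_col_3:],
                                            width=length_col_3)
        outfile.write(line)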
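The same slicing can be wrapped in a small helper, which makes it easy to check the offset on a single line before rewriting the whole file. This is only a sketch: replace_field is a hypothetical name, and the sample line is rebuilt from the repr output in the question rather than read from SURFACE2.

```python
def replace_field(line, offset, width, value):
    """Return line with the fixed-width field at [offset, offset + width) replaced."""
    text = str(value)
    if len(text) > width:
        raise ValueError("value does not fit in the field")
    # Right-align the new value so the line keeps its original length.
    return line[:offset] + text.rjust(width) + line[offset + width:]


# Start of the first sample line (padding assumed from the question's repr output).
sample = ("15.20000".rjust(20) + "120.60000".rjust(20)
          + " 98327      get data information here.\n")

updated = replace_field(sample, 41, 5, 999)
print(repr(sample[41:46]), "->", repr(updated[41:46]))
```

If the first printed value is not the old station id, the offset is wrong for that file and should be adjusted before any output is written.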

Original answer

Try reading up to line 20 (row[19]) of the file (assuming no header line in the CSV file, otherwise line 21) and inspecting it in Python:

with open("SURFACE2") as infile:
    for i in range(20):
        print(repr(next(infile)))

The last line displayed will be line 20, i.e. row[19]. If, for example, tabs are delimiters then you might see \t in between the columns of data. Compare the previous line to the last line to see if there is a difference in the delimiter used.

If you find that your CSV file is mixing delimiters, then you might have to split the fields manually.
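If the delimiters do turn out to be mixed (say, tabs in some places and runs of spaces in others), one way to split manually is str.split() with no argument, which treats any run of whitespace as a single separator. A minimal sketch on made-up lines:

```python
# Made-up line mixing a tab and runs of spaces as separators.
mixed = "15.2\t120.6   98327 FM-12\n"
mixed_fields = mixed.split()
print(mixed_fields)

# Two values with no whitespace between them stay glued together.
fused = "      0-888888.00000      0\n"
fused_fields = fused.split()
print(fused_fields)
```

The second line shows the limit of this approach: values that touch each other, such as 0-888888.00000 in the sample data, cannot be separated by whitespace splitting, which is one reason slicing by fixed widths suits this file better.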

edited May 23, 2017 at 11:56

CommunityBot

answered Feb 27, 2015 at 4:01

mhawke

7 Comments

Andrew Magee Over a year ago

He doesn't seem to be talking about row 18 but rather column 18 in a particular row.

mhawke Over a year ago

@AndrewMagee. Oh, well if rows and columns are being confused then it's difficult to know what is being asked.

mhawke Over a year ago

@Isaac : no problem. This answer is still useful for you to inspect the data.

Schopenhauer Over a year ago

Well since my post was deleted by the moderator I assume that means we do not need no one else around here. That is what I have been in the USMC for the past 10 years, where we can handle things like that as gentlemen in a secluded location. I do this in good faith but your answer is helpful mine is not. Sorry kid but the Tyranny of the Majority has spoken. I do not live from programming, that is why I can live for it, by pure interest, but that is not helpful around here, got it, enjoy.

Schopenhauer Over a year ago

I spent near 10 hours editing and adding information to that post, but that annoys people and take some valuable resources that we can spend on your answers. Again thank you assholes, enjoy.

Andrew Magee · Accepted Answer · 2015-02-27 12:54:37Z

1

The csv module is not the right tool to use when you have fixed-width fields in your file. What you need to do is explicitly use the field lengths to split up the lines. For example:

# This would be your whole file
data = "\n".join([
    "abc  def gh i",
    "jk   lm  n  o",
    "p    q   r  s",
])
field_widths = [5, 4, 3, 1]

def fields(line, field_widths):
    pos = 0
    for length in field_widths:
        yield line[pos:pos + length].strip()
        pos += length

for line in data.split("\n"):
    print(list(fields(line, field_widths)))

will give you:

['abc', 'def', 'gh', 'i']
['jk', 'lm', 'n', 'o']
['p', 'q', 'r', 's']
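Applied to the file in the question, the same idea copes with the values that run into each other. The sketch below assumes, judging only by the sample output, that the second line of each group is a sequence of 13-character values each followed by a 7-character flag; those widths are a guess and not taken from the format document.

```python
def fields(line, field_widths):
    """Yield the stripped text of each fixed-width field in line."""
    pos = 0
    for length in field_widths:
        yield line[pos:pos + length].strip()
        pos += length


# Part of a data line rebuilt with the assumed widths (13 for values, 7 for flags).
line = ("99070.00000".rjust(13) + "0".rjust(7)
        + "155.00000".rjust(13) + "0".rjust(7)
        + "-888888.00000".rjust(13) + "0".rjust(7))

parsed = list(fields(line, [13, 7] * 3))
print(parsed)
```

The last pair comes out as separate entries even though the line contains 0-888888.00000 with no space in between, which splitting on a delimiter cannot do.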

answered Feb 27, 2015 at 12:54

Andrew Magee

1 Comment

Isaac Over a year ago

thank you, I will try your solution and let you know the results.
