3

I need to split each line of text into an array using a loop. The problem is that there's no obvious delimiter to use given the formatting of the text file (which I can't change):

Adam Rippon      New York, NY    77.58144.6163.6780.94
Brandon Mroz     Broadmoor, CO   70.57138.1266.8471.28
Stephen Carriere Boston, MA      64.42138.8368.2770.56
Grant Hochstein  New York, NY    64.62133.8867.4468.44
Keegan Messing   Alaska, AK      61.15136.3071.0266.28
Timothy Dolensky Atlanta, AL     61.76123.0861.3063.78
Max Aaron        Broadmoor, CO   86.95173.4979.4893.51
Jeremy Abbott    Detroit, MI     99.86174.4193.4280.99
Jason Brown      Skokie Value,IL 87.47182.6193.3489.27
Joshua Farris    Broadmoor, CO   78.37169.6987.1783.52
Richard Dornbush All Year, CA    92.04144.3465.8278.52
Douglas Razzano  Coyotes, AZ     75.18157.2580.6976.56
Ross Miner       Boston, MA      71.94152.8772.5380.34
Sean Rabbit      Glacier, CA     60.58122.7656.9066.86
Lukas Kaugars    Broadmoor, CO   64.57114.7550.4766.28
Philip Warren    All Year, CA    55.80113.2457.0258.22
Daniel Raad      Southwest FL    52.98108.0358.6151.42
Scott Dyer       Brooklyn, OH    55.78100.9744.3357.64
Robert PrzepioskiRochester, NY   47.00100.3449.2651.08

Ideally I would like each name to be in [0] (or first name in [0] and last name in [1]), each location to be in [2] (or split across two indexes for city and state), and each score to be in its own index. Each person has four separate numbers. For example, Adam Rippon's scores are 77.58, 144.61, 63.67 and 80.94.

I can't split by spaces because some of the cities have a space in their name (New York would be split into New and York in two different array elements, while Broadmoor would be in one element). I can't split cities by commas because Southwest FL has no comma. I also can't split the numbers by decimal point because the resulting numbers would be wrong. So is there an easy way to go about doing this? Perhaps a way to split the numbers by the number of decimal places?

3
  • How would you handle 77.58144.61? As 77.58 and 144.61? As 77.581 and 44.61? Or can we assume there are always 2 digits after the decimal point? Also, for the last line, how would you separate the last name from the city? Commented Jul 13, 2015 at 20:12
  • That is exactly the issue I'm struggling with. I don't know how to get those numbers separately given that none of them are separated by spaces in the text file. Each separate number does have two digits after the decimal point, so I wasn't sure if I can somehow split the numbers using that. Commented Jul 13, 2015 at 20:21
  • As the answers stated, since it's a fixed-width line format, you won't have trouble getting the first name, last name, city and state. And if you know that each number has exactly two digits after the decimal point, you can work out where each one ends, so that problem goes away too. Commented Jul 13, 2015 at 20:24

6 Answers

7

It looks like there is a fixed size for each column. So in your case, column 1 is 17 characters long, the second column is 16 characters long and the last one is 21 characters long.

Now you can simply iterate through the lines and make use of the substring() method. Something like...

String firstColumn = line.substring(0, 17).trim();
String secondColumn = line.substring(17, 33).trim();
String thirdColumn = line.substring(33, line.length()).trim();

To extract the numbers, we could use a regular expression that searches for all numbers with two decimal places.

Pattern pattern = Pattern.compile("(\\d+\\.[0-9]{2})");

Matcher matcher = pattern.matcher(thirdColumn);

while(matcher.find())
{
    System.out.println(matcher.group());
}

So in this case 47.00100.3449.2651.08 will output

47.00
100.34
49.26
51.08
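
Putting the two steps together, here is a minimal sketch of how a whole line could end up in one array, as the question asks. The class name LineSplitter and the method split are made up for illustration, and the column widths (17 and 16) are assumed from the sample data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: combines the substring() and regex steps shown above.
class LineSplitter {

    // A score is one or more digits, a dot, then exactly two digits.
    private static final Pattern SCORE = Pattern.compile("\\d+\\.\\d{2}");

    // Returns name, location, then each score as its own element.
    static String[] split(String line) {
        List<String> fields = new ArrayList<>();
        fields.add(line.substring(0, 17).trim());   // name column (assumed 17 wide)
        fields.add(line.substring(17, 33).trim());  // location column (assumed 16 wide)

        // Everything after index 33 is the run of concatenated scores.
        Matcher matcher = SCORE.matcher(line.substring(33));
        while (matcher.find()) {
            fields.add(matcher.group());
        }
        return fields.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String line = "Robert PrzepioskiRochester, NY   47.00100.3449.2651.08";
        for (String field : split(line)) {
            System.out.println(field);
        }
    }
}
```

This only works if every score really has exactly two digits after the decimal point; otherwise the regex would cut the numbers in the wrong place.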

4 Comments

So split by column size you mean? How do I go about doing that?
Ahhh, thanks a lot. So simple and should have been obvious.
How do you get each decimal out?
@Shar1er80 You could use regular expressions, e.g. \d+\.[0-9]{2}, so it will find every number that has two digits after the decimal point. Edited my post! @sam
1

It looks like each column has a fixed size (number of characters). As you already said you cannot split by tabs or spaces because of the last line where there is no tab or space between name and city.

I propose to read one line and then cut the String into fields with line.substring(startIndex,endIndex). For example, line.substring(0,17) for the name (the name column is 17 characters wide in the sample, and the end index is exclusive). Then you can split this name into first and last name by using the space as the delimiter.
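
As a rough sketch of that second step, assuming the name column is 17 characters wide as in the sample data and that the first space separates first and last name. NameSplitter and splitName are names invented for this example.

```java
// Hypothetical helper: splits the fixed-width name field into first and last name.
class NameSplitter {

    // Returns {firstName, lastName}; lastName is empty if the field has no space.
    static String[] splitName(String nameField) {
        String trimmed = nameField.trim();       // drop the column padding
        int space = trimmed.indexOf(' ');
        if (space < 0) {
            return new String[] { trimmed, "" };
        }
        return new String[] {
            trimmed.substring(0, space),
            trimmed.substring(space + 1).trim()
        };
    }

    public static void main(String[] args) {
        String line = "Adam Rippon      New York, NY    77.58144.6163.6780.94";
        String[] name = splitName(line.substring(0, 17));
        System.out.println(name[0]); // Adam
        System.out.println(name[1]); // Rippon
    }
}
```

Trimming before splitting matters here, because the padding spaces at the end of the column would otherwise be treated as delimiters too.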

0

Assuming the fields are fixed width, which is what it appears to be, you can do substring operations to get each field and then parse accordingly. Something like:

String name = line.substring(0,x);
String city_state = line.substring(x, y);
String num1 = line.substring(y,z);

Etc. where the x, y and z are the column breaks.

0

This seems to be the good old fixed-position file format. It was highly popular in the days of punch card readers.

So basically, you read this file line by line, and then:

String name = line.substring(0,17).trim();
String location = line.substring(17,33).trim();

String[] scores = new String[4];
scores[0] = line.substring(33,38);
scores[1] = line.substring(38,44);
scores[2] = line.substring(44,49);
scores[3] = line.substring(49,54);

You can then go on and split the name by space, the location by comma (','), convert the scores into numbers and so on.
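
A possible sketch of those follow-up steps. Note that it does not split the location on the comma: the sample contains "Southwest FL" (no comma) and "Skokie Value,IL" (no space after the comma), so this version assumes instead that the state is always the last two characters of the field. FieldParser, splitLocation and toNumbers are hypothetical names.

```java
import java.math.BigDecimal;

// Hypothetical helper for post-processing the extracted fields.
class FieldParser {

    // Splits "New York, NY" or "Southwest FL" into {city, state}.
    // Assumption: the state is always the last two characters of the trimmed field.
    static String[] splitLocation(String location) {
        String trimmed = location.trim();
        String state = trimmed.substring(trimmed.length() - 2);
        String city = trimmed.substring(0, trimmed.length() - 2)
                             .replace(",", "")
                             .trim();
        return new String[] { city, state };
    }

    // Converts score strings such as "77.58" into exact decimal values.
    static BigDecimal[] toNumbers(String[] scores) {
        BigDecimal[] result = new BigDecimal[scores.length];
        for (int i = 0; i < scores.length; i++) {
            result[i] = new BigDecimal(scores[i].trim());
        }
        return result;
    }

    public static void main(String[] args) {
        String[] location = splitLocation("Skokie Value,IL ");
        System.out.println(location[0]); // Skokie Value
        System.out.println(location[1]); // IL

        BigDecimal[] numbers = toNumbers(new String[] { "77.58", "144.61", "63.67", "80.94" });
        System.out.println(numbers[0].add(numbers[1])); // 222.19
    }
}
```

BigDecimal is used here so the scores keep their exact two decimal places; Double.parseDouble would also work if small rounding differences are acceptable.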

If you want to make all of the above more general, you can prepare a list of indexes, and create the array based on those indexes:

int[] fieldIndexes = { 0, 17, 33, 38, 44, 49, 54 };
String[] values = new String[fieldIndexes.length - 1];

And then in your read loop (again I assume you read the line into line):

for ( int i = 1; i < fieldIndexes.length; i++ ) {

     values[i-1] = line.substring(fieldIndexes[i-1],fieldIndexes[i]).trim();

}

And then proceed to work with the values array.

Of course, make sure each line you read has the appropriate number of characters etc. so as to avoid out-of-bounds problems.
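
One way such a check could look, as a sketch only. It rejects lines that are shorter than the last column boundary instead of letting substring() throw. SafeFieldSplitter is a made-up name, and the boundaries are the ones derived from the sample data.

```java
// Hypothetical wrapper around the index-based split with a length check.
class SafeFieldSplitter {

    // Column boundaries taken from the sample data; adjust for other files.
    private static final int[] FIELD_INDEXES = { 0, 17, 33, 38, 44, 49, 54 };

    static String[] split(String line) {
        int expected = FIELD_INDEXES[FIELD_INDEXES.length - 1];
        if (line.length() < expected) {
            // Fail with a clear message rather than a StringIndexOutOfBoundsException.
            throw new IllegalArgumentException(
                "Expected at least " + expected + " characters but got " + line.length());
        }
        String[] values = new String[FIELD_INDEXES.length - 1];
        for (int i = 1; i < FIELD_INDEXES.length; i++) {
            values[i - 1] = line.substring(FIELD_INDEXES[i - 1], FIELD_INDEXES[i]).trim();
        }
        return values;
    }

    public static void main(String[] args) {
        String line = "Max Aaron        Broadmoor, CO   86.95173.4979.4893.51";
        for (String value : split(line)) {
            System.out.println(value);
        }
    }
}
```

Whether a short line should throw, be skipped or be padded with spaces depends on how much you trust the file; throwing is just the simplest choice for a sketch.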

0

Read line by line, then in each line, substring by the corresponding limits. e.g.:

private static String[] split(String line) {
    return new String[] {
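        // substring's end index is exclusive, so each field ends where the next begins
        line.substring(0, 17).trim(),   // name
        line.substring(17, 33).trim(),  // location
        line.substring(33, 38).trim(),  // score 1
        line.substring(38, 44).trim(),  // score 2
        line.substring(44, 49).trim(),  // score 3
        line.substring(49, 54).trim(),  // score 4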
    };
}

0

Why don't you split by index? The scores (called coordinates in the code below) are the tricky part, but if you always have two digits after the decimal point then this example can help.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;


public class Split {

    public static void main(String[] args) throws IOException {

        List<Person> lst = new ArrayList<Split.Person>();

        BufferedReader br = new BufferedReader(new FileReader("c:\\test\\file.txt"));

        try {
            String line = null;

            while ((line = br.readLine()) != null) {

                Person p = new Person();

                String[] name = line.substring(0,17).split(" ");
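                // The city may contain spaces and the comma may be missing,
                // so take the state from the last two characters instead of splitting.
                String location = line.substring(17,33).trim();
                String state = location.substring(location.length() - 2);
                String city = location.substring(0, location.length() - 2);

                p.setName(name[0].trim());
                p.setLastname(name[1].trim());
                p.setCity(city.replace(",","").trim());
                p.setState(state);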

                String[] coordinates = new String[4];
                String coor = line.substring(33);

                String first = coor.substring(0, coor.indexOf(".") + 3);

                coor = coor.substring(first.length());

                String second = coor.substring(0, coor.indexOf(".") + 3);

                coor = coor.substring(second.length());

                String third = coor.substring(0, coor.indexOf(".") + 3);

                coor = coor.substring(third.length());

                String fourth = coor.substring(0, coor.indexOf(".") + 3);

                coordinates[0] = first;
                coordinates[1] = second;
                coordinates[2] = third;
                coordinates[3] = fourth;

                p.setCoordinates(coordinates);

                lst.add(p);
            }

        } finally {
            br.close();
        }

        for(Person p : lst){
            System.out.println(p.getName());
            System.out.println(p.getLastname());
            System.out.println(p.getCity());
            System.out.println(p.getState());
            for(String s : p.getCoordinates()){
                System.out.println(s);
            }

            System.out.println();
        }
    }

    public static class Person {

        public Person(){}

        private String name;
        private String lastname;
        private String city;
        private String state;
        private String[] coordinates;
        public String getName() {
            return name;
        }
        public void setName(String name) {
            this.name = name;
        }
        public String getLastname() {
            return lastname;
        }
        public void setLastname(String lastname) {
            this.lastname = lastname;
        }
        public String getCity() {
            return city;
        }
        public void setCity(String city) {
            this.city = city;
        }
        public String getState() {
            return state;
        }
        public void setState(String state) {
            this.state = state;
        }
        public String[] getCoordinates() {
            return coordinates;
        }
        public void setCoordinates(String[] coordinates) {
            this.coordinates = coordinates;
        }
    }

}
