1

After verifing this post in stackoverflow I am using indexOf() method to parse values from file. Below is the format of my file-

10/05/2005 10:02;AM;[email protected];student=student1 std=X marks=87 rollnumber=102
10/05/2005 10:05;AM;[email protected];student=student2 std=IX rollnumber=26
10/05/2005 10:15;PM;[email protected];student=student3 std=VII marks=87 attandance=5 rollnumber=12
10/05/2005 10:32;AM;[email protected];student=student4 std=V marks=87 rollnumber=69

Note:The domain name i.e. xyz.com in email is not going to be changed in anywhere.
Below is the code snippet i am using currently-

            FileInputStream fis = new FileInputStream(file);
            BufferedReader br = new BufferedReader(new InputStreamReader(
                    fis));

            String line = "";
            while ((line = br.readLine()) != null) {

                int index = -1;

                if ((index = line.indexOf("xyz.com")) != -1) {
                    int inStudent = line.indexOf("student=", index);
                    int spaceExistsinStudent = -1;
                    int studentIndex = -1;

                    if ((spaceExistsinStudent = line.indexOf("student=\"", inStudent)) != -1)
                        studentIndex = line.indexOf(" ", inStudent);
                    else
                        studentIndex = line.indexOf("\" ", spaceExistsinStudent);

                    int inSTD = line.indexOf("std=", studentIndex);
                    int spaceExistsinSTD = -1;
                    int stdIndex = -1;

                    if ((spaceExistsinSTD = line.indexOf("std=\"", inSTD)) != -1)
                        stdIndex = line.indexOf(" ", inSTD);
                    else
                        stdIndex = line.indexOf("\" ", spaceExistsinSTD);

                    String studentName = line.substring(inStudent + 9, studentIndex);
                    String stdName = line.substring(inSTD + 4, stdIndex);

There is no need to paste the entire code.
Well, using the above implementation, i am able to work, but is this effective solution as performace is considered? Any better way for achiveing the same....
Thank you in advance.

2
  • 1
    Even if the domain name isn't going to change in your approximation of the future, you should write your code to not embed it in a fixed string in the middle of your logic. At least make it a member variable, so it can be changed without tons of wasted effort in the future. Or better yet, make it a parameter to the method, and fake the configuration subsystem until later (when you do need it). Commented Sep 10, 2012 at 16:08
  • Yeah, i will keep it in mind... Commented Sep 10, 2012 at 16:56

3 Answers 3

2

Instead of indexOf(), i would suggest StringTokenizer. basically you can split you String based on some separator (eg: ;)..

Example inside your while loop

        StringTokenizer st = new StringTokenizer(line,";");


        st.nextToken(); //Date
        st.nextToken(); //AM
        String email = st.nextToken();
        String values = st.nextToken();

        StringTokenizer st2 = new StringTokenizer(values," ");


        while (st2.hasMoreElements()) {
            String token = (String) st2.nextElement();
            if(token.startsWith("student=")){
                System.out.println(token.substring("student=".length()));
            }else if(token.startsWith("std=")){
                System.out.println(token.substring("std=".length()));
            }

        }
Sign up to request clarification or add additional context in comments.

4 Comments

You seems right. However currently i am only worried about the performance. What do you think about your implementation in terms of execution time?
The bottleneck on performance will almost certainly not be the parsing code - it will be the disk IO. But yes, StringTokenizer is fast, and this implementation should in theory be faster than using indexOf, as it only scans the input line once.
@JoeK: Let me run both the version in my machine and trigger the execution time
@JoeK: Sadly, StringTokenizer is taking 1.5 times more time rather than indexOf. The reason why i go for indeOf is just because of it's better performance compare to StringTokenizer as i said in my question earlier.
1

You don't need to use indexOf for everything. If you want to look at one character, you can use charAt() e.g. where you are checking for a '"'

I would use a method which extracts the value for a field to simplify the code.

1 Comment

Can you tell me in more detail?
1

As I stated in an earlier comment, I am surprised that the parsing is the bottleneck here. But if you wish to know other ways that you could do this, and just try them out and see which is the fastest, here are two more ideas that haven't been posted- using .split:

String[] arr1 = line.split(";");
String dateTime = arr1[0];
String ampm = arr1[1];
String email = arr1[2];
String[] arr2 = arr1[3].split(" ");
String student, std, marks, rollnumber;
student = std = marks = rollnumber = null;
for (String str : arr2) {
    String value = str.substring(str.indexOf("=") + 1);
    switch(str.charAt(2)) {
    case 'u': student = value; break;
    case 'd': std = value; break;
    case 'r': marks = value; break;
    case 'l': rollnumber = value; break;
    }
}

Or using a regex:

private static final Pattern PATTERN = Pattern.compile("([^;]+);([^;]+);([^;]+);student=([^ ]+) std=([^ ]+) marks=([^ ]+) rollnumber=([^ ]+)");

Matcher m = PATTERN.matcher(line);
m.find();
String dateTime = m.group(1);
String ampm = m.group(2);
String email = m.group(3);
String student = m.group(4);
String std = m.group(5);
String marks = m.group(6);
String rollnumber = m.group(7);

1 Comment

Check this.. I referred this link before starting my dev. process.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.