10

Write a java program to read input from a file, and then sort the characters within each word. Once you have done that, sort all the resulting words in ascending order and finally followed by the sum of numeric values in the file.

  • Remove the special characters and stop words while processing the data
  • Measure the time taken to execute the code

Lets Say the content of file is: Sachin Tendulkar scored 18111 ODI runs and 14692 Test runs.

Output:achins adeklnrtu adn cdeors dio estt nrsu nrsu 32803

Time Taken: 3 milliseconds

My Code takes 15milliseconds to execute.....

please suggest me any fast way to solve this problem...........

Code:

import java.io.BufferedReader;
import java.io.FileReader;
import java.util.*;

public class Sorting {

    public static void main(String[] ags)throws Exception
    {
        long st=System.currentTimeMillis();
        int v=0;
        List ls=new ArrayList();
        //To read data from file
        BufferedReader in=new BufferedReader(
                 new FileReader("D:\\Bhive\\File.txt"));
        String read=in.readLine().toLowerCase();
        //Spliting the string based on spaces
        String[] sp=read.replaceAll("\\.","").split(" ");
        for(int i=0;i<sp.length;i++)
        {
            //Check for the array if it matches number
            if(sp[i].matches("(\\d+)"))
                //Adding the numbers
                v+=Integer.parseInt(sp[i]);
            else
            {
                //sorting the characters
                char[] c=sp[i].toCharArray();
                Arrays.sort(c);
                String r=new String(c);
                //Adding the resulting word into list
                ls.add(r);
            }
        }
        //Sorting the resulting words in ascending order
        Collections.sort(ls);
        //Appending the number in the end of the list
        ls.add(v);
        //Displaying the string using Iteartor
        Iterator it=ls.iterator();
        while(it.hasNext())
            System.out.print(it.next()+" ");
        long time=System.currentTimeMillis()-st;
        System.out.println("\n Time Taken:"+time);
    }
}
5
  • when I run the above code in my PC it takes only 2 ms.achins adeklnrtu adn cdeors dio estt nrsu nrsu 32803 Time Taken:2 Commented Jun 21, 2012 at 4:53
  • 2
    Does your file contain only one line? Commented Jun 21, 2012 at 5:03
  • you can try without using arraylist, directly append new String(c) + " "; to String variable. Commented Jun 21, 2012 at 5:16
  • 1
    Create the List after the splitting is done. At that point, you know the size and can provide the capacity. Perhaps instead of calling System.out.print every time, you could create the resulting string in memory (using StringBuilder) or creating a BufferedWriter first. But for your small input I am not sure all this would be worth the effort... Commented Jun 21, 2012 at 6:25
  • whats wrong in this? there is unnecessary iteration process in your code.. Commented Jun 21, 2012 at 6:46

5 Answers 5

5

Use indexOf() to extract words from your string instead of split(" "). It improves performance.

See this thread: Performance of StringTokenizer class vs. split method in Java

Also, try to increase the size of the output, copy-paste the line Sachin Tendulkar scored 18111 ODI runs and 14692 Test runs. 50,000 times in the text file and measure the performance. That way, you will be able to see considerable time difference when you try different optimizations.

EDIT

Tested this code (used .indexOf())

        long st = System.currentTimeMillis();
        int v = 0;
        List ls = new ArrayList();
        // To read data from file
        BufferedReader in = new BufferedReader(new FileReader("D:\\File.txt"));
        String read = in.readLine().toLowerCase();
        read.replaceAll("\\.", "");
        int pos = 0, end;
        while ((end = read.indexOf(' ', pos)) >= 0) {
            String curString = read.substring(pos,end);
            pos = end + 1;
        // Check for the array if it matches number
            try {
                // Adding the numbers
                v += Integer.parseInt(curString);
            }
            catch (NumberFormatException e) {
                // sorting the characters
                char[] c = curString.toCharArray();
                Arrays.sort(c);
                String r = new String(c);
                // Adding the resulting word into TreeSet
                ls.add(r);
            }
        }
        //sorting the list
        Collections.sort(ls);
        //adding the number
        list.add(v);
        // Displaying the string using Iteartor 
        Iterator<String> it = ls.iterator();
        while (it.hasNext()) {
            System.out.print(it.next() + " ");
        }
        long time = System.currentTimeMillis() - st;
        System.out.println("\n Time Taken: " + time + " ms");

Performance using 1 line in file
Your code: 3 ms
My code: 2 ms

Performance using 50K lines in file
Your code: 45 ms
My code: 32 ms

As you see, the difference is significant when the input size increases. Please test it on your machine and share results.

Sign up to request clarification or add additional context in comments.

9 Comments

The advice to go for a tree set rather than a sort on an arraylist is ludicrous.
Every time its trying to parse a string and I see that's costly. IMO, its better to check curString.charAtIndex[0]>65 in order to differentiate string from a number.
@sans481 Did you check or are you just talking out of your ass? parseInt should fail very fast for words, though not as fast as your test. See grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/…
@nes1983: Agreed, I based my results on duplicating the input string 50K times and running the test. But in that case, tree set does not add the additional duplicate elements. I am not sure how I made that blunder. This can be done only if the elements are unique and even in that case ArrayList would perform faster even after having an sort overhead. Thanks for the tip. Edited.
@nes1983 so which you think is better? parseInt? If so, why?
|
3

The only thing I see: the following line is needlessly expensive:

   System.out.print(it.next()+" ");

That's because print is inefficient, due to all the flushing going on. Instead, construct the entire string using a string builder, and then reduce to one call of print.

Comments

1

I removed the list and read it using Arrays only, In my machine the code to 6 msec with your code, by using Arrays only it taking 4 to 5 msec. Run this code in your machine and let me know the time.

import java.io.BufferedReader;

import java.io.FileReader;

import java.util.*;

public class Sorting {
public static void main(String[] ags)throws Exception
{
    long st=System.currentTimeMillis();
    int v=0;
    //To read data from file
    BufferedReader in=new BufferedReader(new FileReader("File.txt"));
    String read=in.readLine().toLowerCase();
    //Spliting the string based on spaces
    String[] sp=read.replaceAll("\\.","").split(" ");
    int j=0;
    for(int i=0;i<sp.length;i++)
    {
        //Check for the array if it matches number
        if(sp[i].matches("(\\d+)"))
            //Adding the numbers
            v+=Integer.parseInt(sp[i]);
        else
        {
            //sorting the characters
            char[] c=sp[i].toCharArray();
            Arrays.sort(c);
            read=new String(c);
            sp[j]= read;
            j++;
        }
    }
    //Sorting the resulting words in ascending order
    Arrays.sort(sp);
    //Appending the number in the end of the list
    //Displaying the string using Iteartor
    for(int i=0;i<j; i++)
        System.out.print(sp[i]+" ");
        System.out.print(v);
    st=System.currentTimeMillis()-st;
    System.out.println("\n Time Taken:"+st);
}

}

1 Comment

Bah. I wouldn't trust any recruiter that preferred the array over the array list. I can see how it's faster, but some things you just don't do for performance :). And the gain is probably really tiny, perhaps non-existent after Hotspot is done optimizing.
1

I ran the same code using a PriorityQueue instead of a List. Also, as nes1983 suggested, building the output string first, instead of printing every word individually helps reduce the runtime.

My runtime after these modifications was definitely reduced.

Comments

0

I have modified the code like this further by including @Teja logic as well and resulted in 1 millisecond from 2 millisescond:

long st=System.currentTimeMillis();
     BufferedReader in=new BufferedReader(new InputStreamReader(new FileInputStream("D:\\Bhive\\File.txt")));
     String read= in.readLine().toLowerCase();
     String[] sp=read.replaceAll("\\.","").split(" ");
     int v=0;
     int len = sp.length;
     int j=0;
     for(int i=0;i<len;i++)
     {
            if(isNum(sp[i]))
             v+=Integer.parseInt(sp[i]);
             else
            {
              char[] c=sp[i].toCharArray();
              Arrays.sort(c);
              String r=new String(c);
              sp[j] = r;
              j++;
             }
      }
        Arrays.sort(sp, 0, len);
        long time=System.currentTimeMillis()-st;
        System.out.println("\n Time Taken:"+time);
        for(int i=0;i<j; i++)
        System.out.print(sp[i]+" ");
        System.out.print(v); 

Wrote small utility to perform for checking a string contains number instead of regular expression:

private static boolean isNum(String cs){
     char [] s = cs.toCharArray();
     for(char c : s)
     {
      if(Character.isDigit(c))
       {
         return true;
       }
     }
     return false;
 }

Calcluate time before calling System.out operation as this one is blocking operation.

2 Comments

I think System.out is part of the process...well it should not be ideally, but you should not assume until explicitly stated.
Pls explain what part of the process. I think System.out is also wanted to be considered as part of the performance.Is that what you mean? However, if there is a System.out. it is a blocking operation.So to bench mark , you cannot include System.out.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.