4

I am trying to read the contents of a text file, remove everything other than letters, and then convert the result into an array of Strings so that I can process the words individually. I read and clean the file like this:

String temp1= IOUtils.toString(FIS,"UTF-8");
String temp2=temp1.replaceAll("[,.!;:\\r\\n]"," ");

And then to tokenize the string, I do this:

String[] tempStringArray = temp2.split(" ");

The problem is that the resulting array contains empty Strings at various indices. They appear wherever the text file had a line break, more than one whitespace character in a row, a replaced punctuation mark, and so on.
I want these empty Strings removed from my String array, or prevented from entering it in the first place.
How can this be done?

3 Answers

5

Split on runs of whitespace instead of on a single space: String[] tempStringArray = temp2.split("\\s+");
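One caveat worth illustrating: split("\\s+") drops trailing empty Strings, but if the text starts with whitespace the first element is still an empty String. A minimal sketch of one way around that, assuming it is acceptable to trim the text first (the class and the tokenize helper are hypothetical names, not part of the original code):

```java
import java.util.Arrays;

class WhitespaceSplitDemo {
    // Hypothetical helper: trims first so that leading whitespace
    // cannot produce an empty first token.
    static String[] tokenize(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            // "".split(...) would return a one-element array containing "",
            // so return an empty array explicitly.
            return new String[0];
        }
        return trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        String sample = "  Hello   world \n again ";
        // Without trimming, the leading whitespace yields an empty first element.
        System.out.println(Arrays.toString(sample.split("\\s+")));
        // With trimming, only the words remain.
        System.out.println(Arrays.toString(tokenize(sample)));
    }
}
```

If the input never begins with whitespace, the plain split("\\s+") call is enough.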


3 Comments

But this leaves us with an empty "" in the array!
@nayandhabarde, is this true? The solution works well, at least for the question above. Maybe your case is a bit different. Can you paste your string?
Try requesting the HTML of this page: truecaller.blog/2018/01/22/life-as-an-android-engineer. The response ends with -->\n
2

In your example, whenever two or more characters from your character set [,.!;:\r\n] occur in a row (or one occurs next to a space), each of them is replaced by its own space, which leaves several consecutive spaces. split(" ") splits at every single space, so it puts an empty String in the array for each gap between adjacent spaces.

The argument to split() is a regex, so you can pass a pattern that matches a whole run of whitespace, which works a lot better for your example.

Try replacing temp2.split(" ") with temp2.split("\\s+"). This matches one or more whitespace characters in a row and treats each run as a single delimiter, so the text is tokenised around the gaps without producing empty Strings between words.
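To make the difference concrete, here is a small sketch that runs both variants on the same input. The sample text, class name, and method names are made up for illustration; the replaceAll pattern is the one from the question.

```java
import java.util.Arrays;

class SplitComparisonDemo {
    // Same cleaning step as in the question: each listed character becomes one space.
    static String clean(String text) {
        return text.replaceAll("[,.!;:\\r\\n]", " ");
    }

    // Original approach: every single space is a delimiter.
    static String[] splitOnSingleSpace(String text) {
        return clean(text).split(" ");
    }

    // Suggested approach: a whole run of whitespace is one delimiter.
    static String[] splitOnWhitespaceRuns(String text) {
        return clean(text).split("\\s+");
    }

    public static void main(String[] args) {
        String sample = "Hello, world!\r\nBye.";
        // Contains empty Strings where several spaces are adjacent.
        System.out.println(Arrays.toString(splitOnSingleSpace(sample)));
        // Contains only the three words.
        System.out.println(Arrays.toString(splitOnWhitespaceRuns(sample)));
    }
}
```

For this sample, the cleaned text has two spaces after "Hello" and three after "world", so the single-space split returns six elements, three of which are empty.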


2

While the answers of Daniel Arthur and Young Millie are correct, you can combine the two steps into one by splitting directly on the characters you want to discard:

String[] tempStringArray = temp1.split("[,.!;:\\s]+");
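If the text can begin with one of those delimiter characters, the one-step split still returns an empty first element. A minimal sketch that filters any empty Strings out afterwards, assuming Java 8 or later for streams (the class and method names are hypothetical):

```java
import java.util.Arrays;

class DelimiterSplitDemo {
    // Splits on runs of punctuation/whitespace, then drops any empty tokens
    // (for example the one produced by a leading delimiter).
    static String[] tokenize(String text) {
        return Arrays.stream(text.split("[,.!;:\\s]+"))
                .filter(token -> !token.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String sample = ", Hello, world!\nBye.";
        // The raw split starts with an empty String because of the leading ", ".
        System.out.println(Arrays.toString(sample.split("[,.!;:\\s]+")));
        // After filtering, only the words remain.
        System.out.println(Arrays.toString(tokenize(sample)));
    }
}
```

The filter is a general safety net: it also returns an empty array for input that is empty or consists only of delimiters.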

1 Comment

Works as well. Thanks for the answer.
