In my view I have an upload form:

<input type="file" name="file" value="search file" /><br />

In my controller I read it like this:

def file = request.getFile('file')
def f = file.getInputStream()
def input = f.getText()

So I now have a String called input containing the content of the file.

I want it in UTF-8. How can I do that?

Edit:

My problem is that the file to be uploaded is encoded in "Windows-1252", and German characters like äöü come out wrong in the string called "input". If I convert the file to UTF-8 with Notepad++ and then upload it, it works. But I can't do that every time.
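
For illustration, a minimal sketch of the symptom (not the controller code; the sample string and variable names are assumptions): bytes written as Windows-1252 but decoded as UTF-8 lose the umlauts, while decoding with the file's actual charset preserves them.

```groovy
// Sample text with German umlauts, written with Unicode escapes so the
// script does not depend on the encoding of its own source file
String original = '\u00e4\u00f6\u00fc'

// What a Windows-1252 file contains on disk: one byte per character
byte[] raw = original.getBytes('windows-1252')

// Decoding those bytes as UTF-8 is what garbles the text:
// the bytes are not valid UTF-8, so replacement characters appear
String wrong = new String(raw, 'UTF-8')
assert wrong != original
assert wrong.contains('\uFFFD')

// Decoding with the charset the bytes were written in gives the text back
String right = new String(raw, 'windows-1252')
assert right == original
```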

Edit2:

def file = request.getFile('file')                      //get file from view
File tmpfile = new File('C:/tmp/tmpfile.txt')           //create temporary file
file.transferTo(tmpfile)                                //copy into tmpfile
CharsetToolkit toolkit = new CharsetToolkit(tmpfile)    //toolkit with tmpfile
def charset = toolkit.getCharset()                      //save charset in a variable
def input = tmpfile.getText(charset)                    //get text with right charset

I tried this with several different documents, but the variable charset is always UTF_8.

2 Answers

3

You can use getText(String charset)

def input = f.getText('UTF-8')
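
Note that the charset passed to getText should be the one the bytes are currently in, not the one you want to end up with. A minimal sketch, assuming the upload is Windows-1252 and using a ByteArrayInputStream as a stand-in for the uploaded file's stream:

```groovy
// Stand-in for the uploaded file: 'Grüße' encoded as Windows-1252
byte[] uploaded = 'Gr\u00fc\u00dfe'.getBytes('windows-1252')
InputStream f = new ByteArrayInputStream(uploaded)

// Decode with the charset the file was written in
String input = f.getText('windows-1252')
assert input == 'Gr\u00fc\u00dfe'

// Re-encode as UTF-8 only when bytes are needed again,
// for example when writing the text back to disk
byte[] utf8 = input.getBytes('UTF-8')
assert utf8.length == 7   // the two non-ASCII characters take two bytes each
```

Once decoded correctly, the String itself has no "UTF-8 or not" property; the target encoding only matters when the text is turned back into bytes.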

11 Comments

Or you might be able to wrap the InputStream: def input = new BufferedReader(new InputStreamReader(file.getInputStream(), "UTF-8")).text
That doesn't work for my problem. I edited the description.
In that case, follow tim_yates' approach. @user2244536
@user2244536, use f.getText('Windows-1252') if the file is in Windows-1252 encoding.
Use CharsetToolkit to guess the charset of a file. @user2244536
1

I found a solution:

I used a Java library called jUniversalChardet and wrote the following method:

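import org.mozilla.universalchardet.UniversalDetector

// Returns the detected charset name, or null if the detector cannot decide
String getEncoding(InputStream inputstream) {
    byte[] buf = new byte[4096]
    UniversalDetector detector = new UniversalDetector(null)

    // Feed the detector until it is confident or the stream is exhausted
    int nread
    while ((nread = inputstream.read(buf)) > 0 && !detector.isDone()) {
        detector.handleData(buf, 0, nread)
    }
    detector.dataEnd()

    return detector.getDetectedCharset()
}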

In my code I now have the following:

def file = request.getFile('file')
def f = file.getInputStream()
def encoding = getEncoding(file.getInputStream())
def input = f.getText(encoding)

And it works :)
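
One caveat: getDetectedCharset() returns null when the detector cannot decide, and passing null on as a charset name will fail. A hedged sketch of a fallback, written without the library so it runs standalone. The helper name decodeUpload and the Windows-1252 default are assumptions, not part of jUniversalChardet:

```groovy
import java.nio.ByteBuffer
import java.nio.charset.CharacterCodingException
import java.nio.charset.Charset
import java.nio.charset.CodingErrorAction

// Hypothetical helper: 'detected' is whatever the detector returned (may be null)
String decodeUpload(byte[] data, String detected, String fallback = 'windows-1252') {
    if (detected) {
        return new String(data, detected)
    }
    try {
        // Strict decoder: throws on invalid UTF-8 instead of
        // silently inserting replacement characters
        def decoder = Charset.forName('UTF-8').newDecoder()
        decoder.onMalformedInput(CodingErrorAction.REPORT)
        decoder.onUnmappableCharacter(CodingErrorAction.REPORT)
        return decoder.decode(ByteBuffer.wrap(data)).toString()
    } catch (CharacterCodingException ignored) {
        // Not valid UTF-8, so assume the legacy single-byte charset
        return new String(data, fallback)
    }
}

byte[] legacy = '\u00e4\u00f6\u00fc'.getBytes('windows-1252')
byte[] modern = '\u00e4\u00f6\u00fc'.getBytes('UTF-8')
assert decodeUpload(legacy, null) == '\u00e4\u00f6\u00fc'
assert decodeUpload(modern, null) == '\u00e4\u00f6\u00fc'
```

In the controller this would be called with the upload's bytes (for example file.bytes) and the result of getEncoding, so the upload is read once into memory rather than streamed twice.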
