I want to automate text language detection in LibreOffice Calc.

I have only four languages, and each has its own character set. Languages are rarely, if ever, mixed within a cell.

Languages are: English, Hebrew, Arabic, Russian.

As depicted in the picture below: [image]

I want to write a formula in column C that indicates the language of the text in the corresponding cell in column A.

I failed to identify any style indicator I can use.

I looked around and found a solution for Microsoft Office VBA.

I hope I do not need to write a macro using this API function getStringType(...)

Thanks.

1 Answer

Assuming all the text in a given cell is using the same script and that all text starts with a letter, testing the first character should be enough. This can be done with:

=UNICODE(A2)

If the number returned is between 65 and 90 (A to Z) or between 97 and 122 (a to z), the text is in English. The values 91 to 96 in between are punctuation, not letters. This would need to be extended if you need to include characters with diacritical marks (e.g., é, à, ñ, ø).

The same can be done for the other alphabets. A Unicode character list or code chart gives the range for each script, and you can pick whichever reference best suits your purpose.
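
Putting the first-character test together for all four languages, a rough sketch of the formula for C2 could look like the one below. The numeric ranges are my assumption, not part of the answer above: they are the decimal bounds of the main Unicode blocks for Hebrew (U+0590 to U+05FF, 1424 to 1535), Arabic (U+0600 to U+06FF, 1536 to 1791) and Cyrillic (U+0400 to U+04FF, 1024 to 1279). The returned labels, including "Unknown", are placeholders you can rename.

```
=IF(A2="";"";IF(AND(UNICODE(A2)>=1424;UNICODE(A2)<=1535);"Hebrew";IF(AND(UNICODE(A2)>=1536;UNICODE(A2)<=1791);"Arabic";IF(AND(UNICODE(A2)>=1024;UNICODE(A2)<=1279);"Russian";IF(OR(AND(UNICODE(A2)>=65;UNICODE(A2)<=90);AND(UNICODE(A2)>=97;UNICODE(A2)<=122));"English";"Unknown")))))
```

The leading A2="" check avoids an error on empty cells, and the formula can be filled down column C. Because it looks only at the first character, a cell that starts with a digit, a space or punctuation will come out as "Unknown". Cyrillic is reported as Russian only because Russian is the sole Cyrillic language in this sheet.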
