I have a string input that represents a formula like:
BMI = ( Weight / ( Height * Height ) ) * 703
I want to extract all legal variables into a String[].
Legal variables follow almost the same rules as Java identifiers, except that only alphanumeric characters are allowed:
- Any alphabetic character, upper or lower case, optionally followed by a digit
- Any word/text
- Any word/text followed by a digit
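Read as a pattern, these rules seem to amount to "a letter, then any number of letters or digits". Here is a minimal sketch of that check for a single token. It assumes digits may appear anywhere after the first character (not only as one trailing digit), and `VariableRule` and `isLegalVariable` are hypothetical names used only for illustration:

```java
import java.util.regex.Pattern;

class VariableRule {
    // Assumption: one leading letter, then zero or more letters or digits.
    private static final Pattern LEGAL = Pattern.compile("[A-Za-z][A-Za-z0-9]*");

    // Returns true only if the whole token satisfies the rule.
    static boolean isLegalVariable(String token) {
        return token != null && LEGAL.matcher(token).matches();
    }
}
```

Under this reading, "BMI" and "Height2" would count as variables, while "703" and "2x" would not.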
Therefore I expect the output to look like this:
BMI
Weight
Height
This is my current attempt:
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/* Helper method: find all variables in an expression.
 * Variables are defined as alphabetical characters a to z, or any word; variables cannot have numbers at the beginning.
 * Uses the regex pattern "[A-Za-z0-9\\s]".
 */
public static List<String> variablesArray(String expression)
{
    List<String> varList = null;
    StringBuilder sb = null;
    if (expression != null)
    {
        sb = new StringBuilder();
        // list that will contain encountered words, numbers, and white space
        varList = new ArrayList<String>();
        Pattern p = Pattern.compile("[A-Za-z0-9\\s]");
        Matcher m = p.matcher(expression);
        // while matches are found
        while (m.find())
        {
            // append each matched character (letter, digit, or white space)
            sb.append(m.group());
        } // end while
        // split the collected text on white space
        String[] splitExpression = sb.toString().split("\\s");
        for (int i = 0; i < splitExpression.length; i++)
        {
            varList.add(splitExpression[i]);
        }
    }
    return varList;
}
The result is not what I expected. I get extra empty lines, "Height" appears twice, and the number should not be there at all:
BMI
Weight
Height
Height
703
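Each symptom seems to trace back to the pattern itself. The character class matches one character at a time and includes white space, so dropping "=", "(" and "/" leaves runs of consecutive spaces, and split("\\s") turns those into empty strings. Digits are part of the class, so 703 survives, and nothing removes the repeated "Height". Here is a minimal sketch of an alternative that matches whole tokens and de-duplicates them. It assumes a variable is a letter followed by any number of letters or digits, and `VariableExtractor` and `extractVariables` are hypothetical names, not part of the code above:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class VariableExtractor {
    // Assumption: a variable is a letter followed by letters or digits.
    // The negative lookbehind stops the tail of a token such as "2x"
    // from being picked up as the variable "x".
    private static final Pattern VARIABLE =
            Pattern.compile("(?<![A-Za-z0-9])[A-Za-z][A-Za-z0-9]*");

    static List<String> extractVariables(String expression) {
        // LinkedHashSet drops duplicates while keeping first-seen order.
        Set<String> seen = new LinkedHashSet<>();
        if (expression != null) {
            Matcher m = VARIABLE.matcher(expression);
            while (m.find()) {
                seen.add(m.group());
            }
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        String formula = "BMI = ( Weight / ( Height * Height ) ) * 703";
        for (String name : extractVariables(formula)) {
            System.out.println(name);
        }
    }
}
```

If a String[] is needed rather than a List, calling toArray(new String[0]) on the returned list should do it.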