EDIT: Edited to clarify what I'm having trouble with. I'm not getting the right results because my code counts duplicates. I HAVE to use regex; I am allowed to use a tokenizer, but I did not.

What I am trying to do: there are 5 input files, and I need to count how many "USER DEFINED VARIABLES" each one contains. Please ignore the messy code; I'm just learning Java.

On each line I replace the following with a space: everything between ( and ), all non-word characters, and keywords or known names such as int, main etc. I also remove any digit that has a space in front of it. Then I turn every remaining space into a newline and trim the result.
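
To illustrate these steps, here is a minimal sketch that applies the same replacements to a single line from the 3rd input file. The class name `CleanupDemo` and the helper `clean` are made up for this example; the regexes are the ones used in the posted code.

```java
import java.util.Arrays;

class CleanupDemo {
    // Applies the same replaceAll steps as the read loop in the posted code.
    static String clean(String line) {
        line = line.replaceAll("\\(([^\\)]+)\\)", " ");  // drop non-empty "( ... )" groups
        line = line.replaceAll("[^\\w]", " ");           // non-word characters become spaces
        line = line.replaceAll("\\bint\\b|\\breturn\\b|\\bmain\\b|\\bprintf\\b|\\bif\\b|\\belse\\b|\\bwhile\\b", " ");
        line = line.replaceAll(" \\d", "");              // remove a digit preceded by a space
        line = line.replaceAll(" ", "\n");               // one token per line
        return line.trim();
    }

    public static void main(String[] args) {
        String cleaned = clean("    fah = celTofah(cel);");
        // Split on runs of whitespace to list the tokens that survive the cleanup
        System.out.println(Arrays.toString(cleaned.split("\\s+"))); // prints [fah, celTofah]
    }
}
```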

This leaves me with a list of strings, which I then match with my regex. At this point, how do I make my count include only unique identifiers?


EXAMPLE: For the 3rd input file, attached beneath the code, I am getting "distinct/unique identifiers: 10" in my output file, when it should be "distinct/unique identifiers: 3".

And for the 5th input file, also attached, I should get "distinct/unique identifiers: 3", but I currently get "distinct/unique identifiers: 6".

I cannot use Set, Map etc.

Any help is great! Thanks.

import java.util.*;
import java.util.regex.*;
import java.io.*;

public class A1_123456789 {

public static void main(String[] args) throws IOException {
    if (args.length < 1) {
        System.out.println("Wrong number of arguments");
        System.exit(1);
    }

    for (int i = 0; i < args.length; i++) {

        FileReader jk = new FileReader(args[i]);
        BufferedReader ij = new BufferedReader(jk);
        FileWriter fw = null;
        BufferedWriter bw = null;

        String regex = "\\b(\\w+)(\\s+\\1\\b)+";

        Pattern p = Pattern.compile("[_a-zA-Z][_a-zA-Z0-9]{0,30}");

        String line;
        int count = 0;

        while ((line = ij.readLine()) != null) {
           line = line.replaceAll("\\(([^\\)]+)\\)", " " );
           line = line.replaceAll("[^\\w]", " ");
           line = line.replaceAll("\\bint\\b|\\breturn\\b|\\bmain\\b|\\bprintf\\b|\\bif\\b|\\belse\\b|\\bwhile\\b", " ");
           line = line.replaceAll(" \\d", "");
           line = line.replaceAll(" ", "\n");
           line = line.trim();

            Matcher m = p.matcher(line);

            while (m.find()) {
                count++;
            }
        }

        try {
            String s1 = args[i];
            String s2 = s1.replaceAll("input","output");
            fw = new FileWriter(s2);
            bw = new BufferedWriter(fw);
            bw.write("distinct/unique identifiers: " + count);

        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                if (bw != null) {
                    bw.close();
                }

                if (fw != null) {
                    fw.close();
                }

            } catch (IOException ex) {
                ex.printStackTrace();
            }
        }
    }
}
}

//This is the 3rd input file below.

int celTofah(int cel)
{
    int fah;
    fah = 1.8*cel+32;
    return fah;
}

int main()
{
    int cel, fah;
    cel = 25;
    fah = celTofah(cel);
    printf("Fah: %d", fah);
    return 0;
}

//This is the 5th input file below.

int func2(int i)
{
    while(i<10)
    {
        printf("%d\t%d\n", i, i*i);
        i++;
    }
}

int func1()
{
    int i = 0;
    func2(i);
}

int main()
{
    func1();
    return 0;
}
  • I'm not seeing a clear problem statement here, just a large code dump with little explanation. Please edit your question and tell us what is going on here. Commented Feb 2, 2018 at 6:57
  • Can you please show us, for a single line, what your code outputs versus what you expect? E.g., int func2(int i) should perhaps give func2 i, but what are you getting? Have you used an online tool to debug the regex? (BTW, I'd probably approach this problem by scanning for tokens rather than using regexes.) Commented Feb 2, 2018 at 6:58
  • BTW, I don't see any code to count unique entries, but then again, assuming no Set etc., it's just "If match not in list/array then add match to list/array". Commented Feb 2, 2018 at 7:05
  • I have edited for clarity. @KenY-N, that is my problem. How do I count unique entries with my code? Will I need to convert my string to an ArrayList, add only unique entries, and then match the regex against the ArrayList? Commented Feb 2, 2018 at 7:10
  • What's the precise problem declaration? Note that defined variable and declared variable are not the same. Identifier is a third thing. Commented Feb 2, 2018 at 7:54

2 Answers


Try this

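        // Declare this list once, before the read loop, so it accumulates across lines
        LinkedList<String> dtaa = new LinkedList<>();
        // Spaces were already turned into newlines, so split on any whitespace
        String[] parts = line.split("\\s+");
        for (int ii = 0; ii < parts.length; ii++) {
            // Skip empty tokens and anything already recorded
            if (!parts[ii].isEmpty() && !dtaa.contains(parts[ii]))
                dtaa.add(parts[ii]);
        }

        count = dtaa.size();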

instead of

 Matcher m = p.matcher(line);

        while (m.find()) {
            count++;
        }
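
If LinkedList also falls under the "no Set, Map etc." restriction, the same membership check can be written against a plain array while still using Pattern and Matcher. This is only a sketch: the class `ArrayUniqueDemo` and the helpers `contains` and `countUnique` are hypothetical names, and the identifier pattern is copied from the question.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ArrayUniqueDemo {
    // Returns true if value already occurs in the first 'used' slots of seen.
    static boolean contains(String[] seen, int used, String value) {
        for (int i = 0; i < used; i++) {
            if (seen[i].equals(value)) {
                return true;
            }
        }
        return false;
    }

    // Counts distinct identifier-like tokens in text using only an array.
    static int countUnique(String text) {
        Pattern p = Pattern.compile("[_a-zA-Z][_a-zA-Z0-9]{0,30}");
        Matcher m = p.matcher(text);
        String[] seen = new String[16];
        int used = 0;
        while (m.find()) {
            String token = m.group();
            if (!contains(seen, used, token)) {
                if (used == seen.length) {
                    // Array is full: double its capacity before storing the new token
                    seen = Arrays.copyOf(seen, used * 2);
                }
                seen[used++] = token;
            }
        }
        return used;
    }

    public static void main(String[] args) {
        System.out.println(countUnique("fah cel fah celTofah cel")); // prints 3
    }
}
```

In the real program the `seen` array and the `used` counter would have to live outside the read loop, so that tokens from earlier lines are remembered.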

1 Comment

I must use Pattern and Matcher to solve this, how can I incorporate that?

Amal Dev has suggested a correct implementation, but given the OP wants to keep Matcher, we have:

// Previous code to here

// Linked list of unique entries
LinkedList<String> uniqueMatches = new LinkedList<>();

// Existing code
while ((line = ij.readLine()) != null) {
    line = line.replaceAll("\\(([^\\)]+)\\)", " " );
    line = line.replaceAll("[^\\w]", " ");
    line = line.replaceAll("\\bint\\b|\\breturn\\b|\\bmain\\b|\\bprintf\\b|\\bif\\b|\\belse\\b|\\bwhile\\b", " ");
    line = line.replaceAll(" \\d", "");
    line = line.replaceAll(" ", "\n");
    line = line.trim();

    Matcher m = p.matcher(line);

    while (m.find()) {
        // New code - get this match
        String thisMatch = m.group();
        // If we haven't seen this string before, add it to the list
        if(!uniqueMatches.contains(thisMatch))
            uniqueMatches.add(thisMatch);
    }
}

// Now see how many unique strings we have collected
count = uniqueMatches.size();

Note I haven't compiled this, but hopefully it works as is...
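
For anyone who wants to try the idea without the file handling, here is a self-contained sketch of the same loop that reads from a string instead of a file. The class name `UniqueIdentifierDemo` and the method `countUnique` are my own placeholders; the cleanup regexes and the identifier pattern are copied from the question, and the sample text is the 5th input file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UniqueIdentifierDemo {
    // Counts distinct identifiers in the given source text.
    static int countUnique(String source) throws IOException {
        Pattern p = Pattern.compile("[_a-zA-Z][_a-zA-Z0-9]{0,30}");
        // Declared outside the read loop so it remembers matches from every line
        LinkedList<String> uniqueMatches = new LinkedList<>();
        BufferedReader in = new BufferedReader(new StringReader(source));
        String line;
        while ((line = in.readLine()) != null) {
            // Same cleanup steps as in the question
            line = line.replaceAll("\\(([^\\)]+)\\)", " ");
            line = line.replaceAll("[^\\w]", " ");
            line = line.replaceAll("\\bint\\b|\\breturn\\b|\\bmain\\b|\\bprintf\\b|\\bif\\b|\\belse\\b|\\bwhile\\b", " ");
            line = line.replaceAll(" \\d", "");
            line = line.replaceAll(" ", "\n");
            line = line.trim();

            Matcher m = p.matcher(line);
            while (m.find()) {
                String thisMatch = m.group();
                // Only record strings we have not seen before
                if (!uniqueMatches.contains(thisMatch)) {
                    uniqueMatches.add(thisMatch);
                }
            }
        }
        return uniqueMatches.size();
    }

    public static void main(String[] args) throws IOException {
        String source =
              "int func2(int i)\n"
            + "{\n"
            + "    while(i<10)\n"
            + "    {\n"
            + "        printf(\"%d\\t%d\\n\", i, i*i);\n"
            + "        i++;\n"
            + "    }\n"
            + "}\n"
            + "\n"
            + "int func1()\n"
            + "{\n"
            + "    int i = 0;\n"
            + "    func2(i);\n"
            + "}\n"
            + "\n"
            + "int main()\n"
            + "{\n"
            + "    func1();\n"
            + "    return 0;\n"
            + "}\n";
        System.out.println("distinct/unique identifiers: " + countUnique(source));
        // prints "distinct/unique identifiers: 3" (func2, i, func1)
    }
}
```

One caveat: the ` \\d` cleanup step removes only the first digit after a space, so a line such as fah = 1.8*cel+32; leaves the token cel2 behind, and that token would be counted as an extra identifier.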

2 Comments

Amazing!! Thank you. Just a question: I am not sure how "m.group()" works in "String thisMatch = m.group();"?
There's the official documentation, and there's a StackOverflow answer, but basically group() gives you back the substring that the most recent successful find() matched against your regular expression.
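
As a small illustration of that behaviour, the sketch below prints each match together with its position. `GroupDemo` and `firstMatch` are hypothetical names used only for this example; `find()`, `group()`, `start()` and `end()` are the standard `java.util.regex.Matcher` methods.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
    static final Pattern IDENTIFIER = Pattern.compile("[_a-zA-Z][_a-zA-Z0-9]{0,30}");

    // Returns the first identifier-like match in text, or null if there is none.
    static String firstMatch(String text) {
        Matcher m = IDENTIFIER.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        Matcher m = IDENTIFIER.matcher("cel = 25; fah = 77;");
        // Each successful find() advances to the next match;
        // group() then returns the text of that match.
        while (m.find()) {
            System.out.println(m.group() + " at " + m.start() + "-" + m.end());
        }
        // prints:
        // cel at 0-3
        // fah at 10-13
    }
}
```

Calling group() before any successful find() throws an IllegalStateException, which is why it sits inside the while (m.find()) loop.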
