14

Note: Little long question. I'm going to give a bounty for best answer.

What I'm trying to do is querying on Object. Here is the details. I have a file called employee.txt. So I parsed it and kept in the list

public static List<Employee> employeeList = new LinkedList<>();

Then here is my logic to query.

Take the query from user, and then parse it. The below is the logic to query through the list.

For ex: here is the query

select * from Employee where id > 10

My codes for that

String valueToCompare = split[5];  //10
EmployeeCriteria criteria = new EmployeeCriteria(
        isId, isName, isSalary, expression,
        valueToCompare);
result = EmployeeData.findAll(
        EmployeeData.employeeList, criteria);

Here is the findAll method

public static List<Employee> findAll(List<Employee> coll,
            ISearch<Employee> chk) {
        List<Employee> l = new LinkedList<Employee>();
        for (Employee obj : coll) {
            if (chk.search(new Employee(obj)))
                l.add(obj);
        }
        return l;
    }

And here is my search method

/**
     * Based on the type provided and for given expression it check against the
     * given value
     */

    @Override
    public boolean search(Employee obj) {
        if (expression.equals(EQUAL)) {
            if (isId()) {
                if (obj.getId() == Long.parseLong(valueToCompare)) {
                    return true;
                }
            } else if (isName()) {
                if (obj.getName().equals(valueToCompare)) {
                    return true;
                }
            } else if (isSalary()) {
                if (obj.getSalary() == Long.parseLong(valueToCompare)) {
                    return true;
                }
            } else {
                System.err.println(UserMessage.INVALIDCOLUMN_NAME);
            }

        } else if (expression.equals(NOT_EQUAL)) {
            if (isId()) {
                if (!(obj.getId() == Long.parseLong(valueToCompare))) {
                    return true;
                }
            } else if (isName()) {
                if (!(obj.getName().equals(valueToCompare))) {
                    return true;
                }
            } else if (isSalary()) {
                if (!(obj.getSalary() == Long.parseLong(valueToCompare))) {
                    return true;
                }
            } else {
                System.err.println(UserMessage.INVALIDCOLUMN_NAME);
            }

        } else if (expression.equals(GREATER)) {
            if (isId()) {
                if ((obj.getId() > Long.parseLong(valueToCompare))) {
                    return true;
                }
            } else if (isSalary()) {
                if ((obj.getSalary() > Long.parseLong(valueToCompare))) {
                    return true;
                }
            } else {
                System.err.println(UserMessage.INVALIDCOLUMN_NAME);
            }

        } else if (expression.equals(LESSER)) {
            if (isId()) {
                if ((obj.getId() < Long.parseLong(valueToCompare))) {
                    return true;
                }
            } else if (isSalary()) {
                if ((obj.getSalary() < Long.parseLong(valueToCompare))) {
                    return true;
                }
            } else {
                System.err.println(UserMessage.INVALID_IDENTIFIER);
            }

        }

        return false;
    }

Let me know if you want to see any other codes.

I just want to know,

In the first place LinkedList is correct data structure to use ? Is this performs well ?? Any enhancements to perform well ?

Any better way to achieve this ?

here are few example queries:

select * where ID > 100
select * where Name != Ramesh
select * where Salary < 500000
select Name order by Name
select ID

Thanks for any help. Bounty will be offered after 2 days. I can't do that now.

Note2: This is a test to check my data manage skills and I cannot use any database.

4
  • I think this might be better suited for codereview.stackexchange.com, but here is my 2 cents: You should parse your expression into classes implementing interfaces, e.g. a Value interface with 2 classes Field and Constant, and an Operator interface with classes like Equal, NotEqual, LessThan, and so on... Adding support for And, Or, Plus, and more would then be easier later in your life. Commented Sep 24, 2015 at 3:13
  • requirement seems smiler what joSQL library does. LinkedList to be best fitted if you wish to keep all data in memory as this provides easy delete operation compare to ArrayList. This solution can be challenged with number of records, For better performance of memory, load meta data information & key fields of Employee like index(ID,Name, Salary) not the whole object. Commented Sep 24, 2015 at 8:30
  • @SubhrajyotiMajumder Saw the source code of JoSQL and it's using reflection. And I'm limited to use no third part tools. Commented Sep 24, 2015 at 8:33
  • @SubhrajyotiMajumder And if I use Id+Name combination as a key, when user asks for select * where name=suresh, I need to iterate over all key set. That's where I stopped right now. Commented Sep 24, 2015 at 8:36

3 Answers 3

4

No, this does not perform well at all. You are searching through N employees every time. So if you have 1 million employees, you will search all 1 million employees before returning the correct employee. Even worst, if it doesn't exist, you will have to search exhaustibly before you can know if it exists.

Is this for use in production? If so then just use SQLite or some other simple database. You want to write once and read multiple times with indexes. I cannot stress enough that what you are writing will have bugs and instead you should use something that was already tested.

Assuming this is not for production and you are just having fun, then you want to emulate what databases do in real life. They create indexes. Indexes are usually best described as Map<String, List<Employee>>.

The idea is that initially reading the data from disk is expensive. But you read it once. For each dimension, Name, Salary, ID, etc... you want create separate indexes.

So let's say you were creating an index of all employees by ID. You would want to do something like:

Map<String, Employee> employeesById = new HashMap<>();

for(Employee e : employees) { 
  employeesById.put(e.getId(), e);
}

The above assumes that employee ids are unique. If they are not then you would need create a List<Employee>. For example for index by name:

Map<String,List<Employee>> employeesByName = new HashMap<>();

for(Employee e : employees) { 
  employeesByName.get(e.getName()).add(e); // Make sure to create the array if it doesn't exist
}

So now, for reading, let's say you get SELECT * FROM employees where id = 123; you can simply just return employeesById.get("123").

This solution is O(1). As the file gets bigger, you will not have any performance loss. This is probably the fastest solution.

Sign up to request clarification or add additional context in comments.

5 Comments

I didn't go in to a lot of detail about != and < but basically you need to cover each use case. For != you can just find all and then remove map.get("Ramesh") and for < you should use a sorted list. This answer would be too long if I had to cover every use case.
For != you can just find all and then remove map.get("Ramesh") Again this mean N iterations right ?
No, you can do new HashMap(mapByName).remove("Ramesh"). It wouldn't be N operations but it would be M operations where M is size of unique names.
Then the question of duplicates arise :(
There is no question of duplicates? It works just fine. You just have to use the right data structure. It will be much more performant than what you have today.
2

To add onto Amir's answer, you can also use a TreeMap<> instead of a HashMap<>. Databases normally don't create indexes as hash maps but as balanced binary trees, which is what Java's TreeMap<> class is.

http://docs.oracle.com/javase/8/docs/api/java/util/TreeMap.html

HashMap<> has O(1) lookup on keys, but is exact match on key only. TreeMap<> has O(log n) lookup on keys, but allows ranged lookups higherKey, lowerKey) and map partitioning on key range (subMap).

The problem of duplicates can either be solved by not allowing them (unique key) or storing values in a list and linear search on the subset (non-unique key).

1 Comment

I didn't want to get in details of TreeMap but you explained well. With all honesty, he will have to have many different data structures for different needs.
1

Assuming you are not interested in any of the in-memory database solution and using Java 8, You should convert all your conditions to predicate and apply them on the stream . This will take advantage of Java 8's parallelism feature http://blog.takipi.com/new-parallelism-apis-in-java-8-behind-the-glitz-and-glamour/

So in short your code can be much cleaner and faster

 public static Predicate<Employee> isSalaryGreaterThan(Integer salary) {
        return p -> p.getSalary() > salary;
    }

Predicate predicate1 =   isSalaryGreaterThan(50000)
    employeeList.stream().filter(predicate1).filter(predicate2)...collect(Collectors.<Employee>toList())

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.