Python: DFS Optimised Algorithm?

Question

I just implemented DFS on Python; however, I don't think this is the most optimised code possible due to the for loop on the third to last line. I know that this DFS works; however, is it optimised? The graph URL is attached below, as well as the code.

graphx=[[1, 1, 0, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 1, 0], [0, 0, 1, 1, 0, 0, 0, 0], [0, 0, 1, 1, 1, 0, 1, 0], [0, 0, 0, 1, 1, 1, 0, 0], [0, 0, 0, 0, 1, 1, 1, 1], [0, 1, 0, 1, 0, 1, 1, 0], [0, 0, 0, 0, 0, 1, 0, 1]]
visited=[False]*len(graphx[0])
stack=[]

def dfs(graphx, a):
    stack.append(a)
    while len(stack)!=0:
        v=stack.pop()
        if visited[v]==False:
            visited[v]=True
            print(v)
            for w in range(len(graphx[0])):
                if graphx[v][w]!=0:
                    stack.append(w)

Graph: https://i.sstatic.net/UcHqo.png

Somebody said this was O(n^2) time, which is very bad. How to optimise?

Edit: Is this BFS corect, based on an answer for the original DFS?

def bfs(G, start):
    """Perform bfs on adjacency list of graph
    G: Adjacency list representation of graph.
    start: Index of start node.
    """
    visited = [False] * len(G)
    queue = []
    queue.append(start)
    while queue:
        v = queue.pop()
        if not visited[v]:
            visited[v] = True
            print(v)
            for neighbor in G[v]:
                queue.insert(0, neighbor)

lightalchemist · Accepted Answer · 2014-07-25 02:51:04Z

The key problem with your algorithm is that you are representing your graph as a adjacency matrix. For V number of nodes, this will lead to O(V^2) runtime as you have to visit all V^2 entries of the matrix.

Representing the graph using an adjacency list is more efficient (runtime and memory-wise). Runtime is O(E) where E is number of edges.

# Run time is O(E) and not O(V^2)
# It visits each node and edge exactly once.
def dfs(G, start):
    """Perform dfs on adjacency list of graph
    G: Adjacency list representation of graph.
    start: Index of start node.
    """
    visited = [False] * len(G)
    stack = []
    stack.append(start)
    while stack:
        v = stack.pop()
        if not visited[v]:
            visited[v] = True
            print(v)
            for neighbor in G[v]:
                stack.append(neighbor)

# This is your code. Takes O(V^2) because you are visiting every entry in
# the adjacency matrix, which has V^2 entries.
def dfs_adjacency_mat(G, start):
    visited=[False]*len(G[0])
    stack=[]
    stack.append(start)
    while len(stack)!=0:
        v=stack.pop()
        if visited[v]==False:
            visited[v]=True
            print(v)
            for w in range(len(G[0])):
                if G[v][w]!=0:
                    stack.append(w)

def main():
    # Represent graph as adjacency list, not adjacency matrix
    G = [[1],       # Node 0 (A) has node 1 (B) as neighbor
        [0, 6],    # Node 1 (B) has node 0 (A) and 6 (G) as neighbor
        [3],
        [2, 4, 6],
        [3, 5],
        [4, 6, 7],
        [1, 3, 5],
        [5]]
    print("Using adjacency list")
    dfs(G, 0)  # Start dfs at node 0 (i.e., node A)

    print('-' * 50)

    print("Using adjacency matrix")
    # Adjacency matrix
    graphx = [[1, 1, 0, 0, 0, 0, 0, 0],
              [1, 1, 0, 0, 0, 0, 1, 0],
              [0, 0, 1, 1, 0, 0, 0, 0],
              [0, 0, 1, 1, 1, 0, 1, 0],
              [0, 0, 0, 1, 1, 1, 0, 0],
              [0, 0, 0, 0, 1, 1, 1, 1],
              [0, 1, 0, 1, 0, 1, 1, 0],
              [0, 0, 0, 0, 0, 1, 0, 1]]
    dfs_adjacency_mat(graphx, 0)


if __name__ == '__main__':
    main()

Output:

Using adjacency list
0
1
6
5
7
4
3
2
--------------------------------------------------
Using adjacency matrix
0
1
6
5
7
4
3
2

Great, thanks! Can you check my similar code for BFS (attached in the OP)?
DFS is supposed to have running time O(E), though... can you somehow improve?

Ulrich Eckhardt · Accepted Answer · 2014-07-22 19:30:04Z

1

Your question is rather vague and open-ended. Optimizing code can go so far as to implement it in a different language, so you will have to be a bit more precise about what you actually want to know. Suggesting that you rewrite the whole thing in C is probably not the answer you are looking for.

Anyhow, since you are using an adjacency matrix, locating neighbours is linear over the number of nodes. If you had an adjacency list, you would be able to achieve that linear over the number of edges instead. If the graph is sparse, that is an important difference.

Some Python-specific notes:

Why the 0s and 1s? Shouldn't that be True and False?
Whenever you do "i in range(len(s))", consider "e in s" or "i, e in enumerate(s)".
Don't compare "x==False" or "x==True", but simply use "not x" or "x".

answered Jul 22, 2014 at 19:30

Ulrich Eckhardt

17.7k5 gold badges31 silver badges63 bronze badges

5 Comments

qq121 Over a year ago

0s and 1s are used in the adjacency matrix; should I replace them with true or false? Regarding your 2nd point, what does enumerate do? Is it just like range? I need the indices to check the adjancency matrix.

Ulrich Eckhardt Over a year ago

If the 0s and 1s represent boolean values, the answer is yes. Concerning enumerate, fire up an interactive Python session and type help(enumerate) or look at the online docs.

qq121 Over a year ago

Can you help with rewriting the for loop in that way? Thanks!

qq121 Over a year ago

Not [stack.append(w) for w in enumereate(len(graphx[0])) if graphx[v][w]!=0]?

Ulrich Eckhardt Over a year ago

No, try again. It seems you didn't understand enumerate() yet.

Collectives™ on Stack Overflow

Python: DFS Optimised Algorithm?

2 Answers 2

2 Comments

5 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

2 Comments

5 Comments

Your Answer

Sign up or log in

Post as a guest

Related