I have a text file (links.txt) in the following format:

www.independent.co.uk www.bbc.co.uk www.theguardian.com www.telegraph.co.uk 
www.dailymail.co.uk en.wikipedia.org www.huffingtonpost.co.uk www.bbc.co.uk 
www.newsnow.co.uk www.express.co.uk 

I have another text file (keys.txt) in the following format:

www.independent.co.uk www.bbc.co.uk www.theguardian.com

I want to compare the two files and print the URLs that appear in both.

I tried using the urltools package in Python, but couldn't get it to work for multiple URLs.

  • Please post what you have tried. Commented Aug 3, 2018 at 17:34
  • Those two files seem to be in the same "format", i.e. a space-separated list of domains. What is the difference between them? Commented Aug 3, 2018 at 17:41
  • Yes, they are in the same format. I want to see whether the URLs in keys.txt exist in links.txt. If a URL exists there, it should be printed. Commented Aug 3, 2018 at 17:43
  • I used urltools.compare, which compares two URLs and tells whether they match. Commented Aug 3, 2018 at 17:45

2 Answers

How about this:

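# Read the whitespace-separated URLs from each file.
with open('links.txt', 'r') as links:
    links_set = set(links.read().split())

with open('keys.txt', 'r') as keys:
    keys_split = keys.read().split()

# Compare whole URLs; a substring test on the raw text would
# also match partial names such as "bbc.co.uk".
for url in keys_split:
    if url in links_set:
        print(url)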

Just make sure that links.txt and keys.txt are in the current working directory and everything should work fine. I'm assuming your URLs will always be space-delimited.
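
The same comparison can also be wrapped in a small function that matches whole URLs and reports each one only once. This is a minimal sketch, assuming both files hold whitespace-separated domains as in the question; the helper names `read_tokens` and `common_urls` are hypothetical and chosen only for illustration.

```python
from pathlib import Path


def read_tokens(path):
    """Return the whitespace-separated tokens of a file, in file order."""
    return Path(path).read_text().split()


def common_urls(keys_path, links_path):
    """Return URLs from keys_path that also occur in links_path.

    Order follows the keys file; repeated keys are reported once.
    """
    links = set(read_tokens(links_path))  # set gives exact, fast lookups
    seen = set()
    result = []
    for url in read_tokens(keys_path):
        if url in links and url not in seen:
            seen.add(url)
            result.append(url)
    return result


if __name__ == "__main__":
    # Assumes links.txt and keys.txt are in the current working directory.
    for url in common_urls("keys.txt", "links.txt"):
        print(url)
```

With the sample files from the question, this prints the three URLs listed in keys.txt, since all of them appear in links.txt. Note that the comparison is an exact string match, so "bbc.co.uk" and "www.bbc.co.uk" are treated as different URLs.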

2 Comments

Yes, this is what I was looking for. Thanks! @agillgilla
No problem. Please mark it as the accepted answer if it answered your question @clawstack.

To print only the URLs from keys.txt that do not appear in links.txt, rather than the common ones, change the condition to not in. Here is the complete code:

# Read the whitespace-separated URLs from each file.
with open('links.txt', 'r') as links:
    links_set = set(links.read().split())

with open('keys.txt', 'r') as keys:
    keys_split = keys.read().split()

# Print each key that is missing from links.txt (whole-URL match).
for url in keys_split:
    if url not in links_set:
        print(url)
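
The "not in" variant can also be expressed as a set difference. The sketch below is illustrative only: the function name `missing_urls` is hypothetical, it works on strings already read from the files, and `www.example.org` is a made-up entry added so the result is non-empty.

```python
def missing_urls(keys_text, links_text):
    """Return the sorted URLs present in keys_text but absent from links_text."""
    # Splitting on whitespace and using sets compares whole URLs only.
    return sorted(set(keys_text.split()) - set(links_text.split()))


if __name__ == "__main__":
    # www.example.org is an invented entry, not part of the question's data.
    keys_text = "www.independent.co.uk www.bbc.co.uk www.example.org"
    links_text = "www.independent.co.uk www.bbc.co.uk www.theguardian.com"
    print(missing_urls(keys_text, links_text))  # prints ['www.example.org']
```

Sets discard the original order and any duplicates, so this form suits cases where only membership matters.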
