How to get a Python script properly working to take input from .txt files and return the number of positives

Question

I have a Python script that is supposed to take a directory full of .txt files and determine whether each .txt file returns positive or negative for matching certain text statements inside the file itself, such as "known infection source". However, my script doesn't work and returns the following error message. Any help would be greatly appreciated!

Sample JSON file text

{
    "detected_referrer_samples": [
        {
            "positives": 1,
            "sha256": "325f928105efb4c227be1a83fb3d0634ec5903bdfce2c3580ad113fc0f15373c",
            "total": 52
        },
        {
            "positives": 20,
            "sha256": "48d85943ea9cdd1e480d73556e94d8438c1b2a8a30238dff2c52dd7f5c047435",
            "total": 53
        }
    ],
    "detected_urls": [],
    "domain_siblings": [],
    "resolutions": [],
    "response_code": 1,
    "verbose_msg": "Domain found in dataset",
    "whois": null
}

Error

Traceback (most recent call last):
  File "vt_reporter1.py", line 35, in <module>
    print(vt_result_check(path))
  File "vt_reporter1.py", line 20, in vt_result_check
    vt_result |= any(sample['positives'] > 0 for sample_type in sample_types
  File "vt_reporter1.py", line 21, in <genexpr>
    for sample in vt_data.get(sample_type, []))
AttributeError: 'list' object has no attribute 'get'

Code

import os
import json
import csv

path=r'./output/'
csvpath='C:/Users/bwerner/Documents'

def vt_result_check(path):
    vt_result = False
    for filename in os.listdir(path):
        with open(path + filename, 'r') as vt_result_file:
            vt_data = json.load(vt_result_file)

        # Look for any positive detected referrer samples
        # Look for any positive detected communicating samples
        # Look for any positive detected downloaded samples
        # Look for any positive detected URLs
        sample_types = ('detected_referrer_samples', 'detected_communicating_samples',
                        'detected_downloaded_samples', 'detected_urls')
        vt_result |= any(sample['positives'] > 0 for sample_type in sample_types
                                                 for sample in vt_data.get(sample_type, []))

        # Look for a Dr. Web category of known infection source
        vt_result |= vt_data.get('Dr.Web category') == "known infection source"

        # Look for a Forcepoint ThreatSeeker category of elevated exposure
        # Look for a Forcepoint ThreatSeeker category of phishing and other frauds
        # Look for a Forcepoint ThreatSeeker category of suspicious content
        threats = ("elevated exposure", "phishing and other frauds", "suspicious content")
        vt_result |= vt_data.get('Forcepoint ThreatSeeker category') in threats

    return vt_result

if __name__ == '__main__':
    print(vt_result_check(path))
    with open(csvpath, 'w') as csvfile:
        writer.writerow([vt_result_check(path)])

It looks like your input file is being read in as a list, not a dict.
– berkelem, Commented Aug 14, 2018 at 18:48
@berkelem Thank you for your comment! How do I change my input file to be read as a dict?
– bedford, Commented Aug 14, 2018 at 18:51
You'd have to inspect it to see what format the data is in. It is possible you have a list containing a dict, like in this question: stackoverflow.com/questions/25613565/…
– berkelem, Commented Aug 14, 2018 at 19:30

ab123 · Accepted Answer · 2018-08-14 18:50:42Z

0

The error tells you what is going wrong: you cannot call get() on a list. In Python, get() is a method of dictionaries, and lists do not have it. Instead of calling get(), access the list's elements by index or slice and your program should work. For example:

for sample in my_list[10:11]:

which loops over a slice containing only the 11th element of the list (index 10).
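
As a small illustration of indexing versus slicing, here is a sketch using made-up names (`items` and `wrapped` are not from the question). The list-wrapping-a-dict shape is only an assumption about what one of the real files might contain.

```python
# Hypothetical data, not taken from the question's files.
items = list(range(20))

eleventh = items[10]              # indexing returns the element itself
one_element_slice = items[10:11]  # slicing returns a new list with that element

print(eleventh)           # 10
print(one_element_slice)  # [10]

# Assumed shape: a JSON file whose top level is a list holding one dict.
wrapped = [{"response_code": 1}]

# The list has no get(), but the dict stored inside it does.
print(wrapped[0].get("response_code"))  # 1
```

If a file really has this shape, indexing into the list first and then calling get() on the resulting dict avoids the AttributeError.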


5 Comments

bedford Over a year ago

Thanks for your help! Where in my code do I use your code?

bedford Over a year ago

Thanks for your help! Where in my code do I use your code?

ab123 Over a year ago

Wherever you were getting the error with the get() function, replace that call with accessing elements by index. Sorry for the late response. Please mark as correct if it works!

bedford Over a year ago

Thanks for your response! I've tried to fix the error with your suggestion but now I got a new error. Below is my new error. I would greatly appreciate your help!

  File "vt_reporter.py", line 35, in <module>
    print(vt_result_check(path))
  File "vt_reporter.py", line 24, in vt_result_check
    vt_result |= vt_data.get('Dr.Web category') == "known infection source"
AttributeError: 'list' object has no attribute 'get'

ab123 Over a year ago

This error is actually the same as before because you haven't implemented the solution everywhere. Make sure that you replace every instance of the function get() with the new solution. Please mark as correct if it works!

Pushkar Chintaluri · Accepted Answer · 2018-08-14 21:08:18Z

0

Can you post the contents of the file, or some text that represents the content of the file that is being read in?

Here's some feedback based on what is seen in the code you posted:

vt_result_file contains valid JSON, since json.load() did not raise a parsing error. Whatever it contains is being read into Python as a list. We can determine this from the error that you are receiving. Look at the last line of the error:

AttributeError: 'list' object has no attribute 'get'

It says that you are trying to access the "get" attribute on a "list". Looking at your code, we can see that you are calling "get" on "vt_data" three times:
- Once as
```
vt_data.get(sample_type, [])
```
- Another time as
```
vt_data.get('Dr.Web category')
```
- And finally as
```
vt_data.get('Forcepoint ThreatSeeker category')
```
Per the error message, your variable vt_data is a list and not a dictionary.

So you need to ask yourself:

Were you expecting vt_result_file to contain a dictionary? If so, open the file and examine what is contained there, and turn it into a dictionary.

Unfortunately, without seeing the contents of this file, it is hard to suggest what you need to change to fix this error.
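
In the meantime, here is a minimal sketch of one way to cope with files whose top level might be either a dictionary or a list of dictionaries. The helper names `as_records` and `has_positive_sample` are hypothetical. Treating non-dict list elements as ignorable is an assumption, since the real file layout is not known.

```python
SAMPLE_TYPES = ('detected_referrer_samples', 'detected_communicating_samples',
                'detected_downloaded_samples', 'detected_urls')


def as_records(vt_data):
    """Normalize parsed JSON into a list of dicts (hypothetical helper)."""
    if isinstance(vt_data, dict):
        return [vt_data]
    if isinstance(vt_data, list):
        # Assumption: only dict elements are reports; anything else is skipped.
        return [item for item in vt_data if isinstance(item, dict)]
    return []


def has_positive_sample(record):
    """Return True if any sample in the record has positives > 0."""
    return any(sample.get('positives', 0) > 0
               for sample_type in SAMPLE_TYPES
               for sample in record.get(sample_type, []))


# Usage with made-up data shaped like the sample in the question,
# but wrapped in a list to reproduce the suspected layout.
wrapped = [{"detected_referrer_samples": [{"positives": 1, "total": 52}]}]
print(any(has_positive_sample(record) for record in as_records(wrapped)))
```

With this approach the get() calls always run on a dictionary, whichever top-level shape a given file has.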


3 Comments

bedford Over a year ago

Thank you for your help! I've posted the contents of one of the files I'm trying to parse. If you have a chance to look at it I would greatly appreciate your help!

Pushkar Chintaluri Over a year ago

Trying out your code with your sample input does work for me. You may want to print out the filename you are accessing in each iteration and check if you are trying to read a file which is not formatted as given here.
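
A rough sketch of that per-file check might look like the following. The function name `describe_json_files` is made up for illustration, and the sketch assumes every regular file in the directory is meant to be JSON.

```python
import json
import os


def describe_json_files(path):
    """Return (filename, top-level type name) pairs for each file in path."""
    report = []
    for filename in sorted(os.listdir(path)):
        file_path = os.path.join(path, filename)
        if not os.path.isfile(file_path):
            continue  # skip subdirectories
        try:
            with open(file_path, 'r') as vt_result_file:
                vt_data = json.load(vt_result_file)
        except ValueError:
            # json.JSONDecodeError is a subclass of ValueError
            report.append((filename, 'invalid JSON'))
        else:
            report.append((filename, type(vt_data).__name__))
    return report
```

Printing the result of `describe_json_files('./output/')` should show which files load as `list` rather than `dict`, which are the ones that trigger the AttributeError.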

bedford Over a year ago

Do you mind sharing your code with how you inputted the sample file I gave since maybe I'm incorrectly reading in the directory.
