I receive CSV files in a Gmail inbox and fetch them with the Gmail API. Once I have those CSV files, I don't want to store them on my local machine; I want to upload them directly to my GCP bucket, but I can't get a file path for the CSV attachments.

 upload_to_bucket(file_name, attachment_content,bucket_name)
  File "/Users/Emails/mypython", line 91, in upload_to_bucket
    blob.upload_from_filename(file_path)
  File "/Users/.local/share/virtualenvs/Emails-wLEQ9xGC/lib/python3.10/site-packages/google/cloud/storage/blob.py", line 2704, in upload_from_filename
    content_type = self._get_content_type(content_type, filename=filename)
  File "/Users/.local/share/virtualenvs/Emails-wLEQ9xGC/lib/python3.10/site-packages/google/cloud/storage/blob.py", line 1674, in _get_content_type
    content_type, _ = mimetypes.guess_type(filename)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/mimetypes.py", line 307, in guess_type
    return _db.guess_type(url, strict)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/mimetypes.py", line 123, in guess_type
    scheme, url = urllib.parse._splittype(url)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/parse.py", line 1039, in _splittype
    match = _typeprog.match(url)
TypeError: cannot use a string pattern on a bytes-like object

I have already tested downloading these CSV files from the inbox to my local machine. Now I have added the upload_to_bucket function, but I tried to get the file_path from the attachments and couldn't find it.

import os
import base64
from typing import List
import time
from datetime import datetime
from Google import Create_Service
from google.cloud import storage

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = r'mystorageaccont.json'

storage_client = storage.Client()

bucket_name = 'mybucketname'

def upload_to_bucket(blob_name, file_path, bucket_name):
    bucket = storage_client.get_bucket(bucket_name)
    blob = bucket.blob(blob_name)
    blob.upload_from_filename(file_path)
    return blob

def search_emails(query_string: str, label_ids: List=None):
    try:
        message_list_response = service.users().messages().list(
            userId='me',
            labelIds=label_ids,
            q=query_string
        ).execute()

        message_items = message_list_response.get('messages')
        next_page_token = message_list_response.get('nextPageToken')
        
        while next_page_token:
            message_list_response = service.users().messages().list(
                userId='me',
                labelIds=label_ids,
                q=query_string,
                pageToken=next_page_token
            ).execute()

            message_items.extend(message_list_response.get('messages'))
            next_page_token = message_list_response.get('nextPageToken')
        return message_items
    except Exception as e:
        # NoEmailFound is not defined in this snippet; it is assumed to be a custom exception class
        raise NoEmailFound('No emails returned') from e
    
def get_file_data(message_id, attachment_id, file_name, new_Location):
    response = service.users().messages().attachments().get(
        userId='me',
        messageId=message_id,
        id=attachment_id
    ).execute()

    file_data = base64.urlsafe_b64decode(response.get('data').encode('UTF-8'))
    return file_data
    
def get_message_detail(message_id, msg_format='metadata', metadata_headers: List=None):
    message_detail = service.users().messages().get(
        userId='me',
        id=message_id,
        format=msg_format,
        metadataHeaders=metadata_headers
    ).execute()
    return message_detail

def save_file_data(email_messages):
    for email_message in email_messages: 
        messageDetail = get_message_detail(email_message['id'], msg_format='full', metadata_headers=['parts'])  
        headers=messageDetail["payload"]["headers"]
        messageDetailPayload = messageDetail.get('payload') 
        if 'parts' in messageDetailPayload: 
            for msgPayload in messageDetailPayload['parts']: 
                file_name = msgPayload['filename'] 
                filetype = ".csv"
                if file_name.find(filetype) != -1:
                    body = msgPayload['body'] 
                    if 'attachmentId' in body: 
                        attachment_id = body['attachmentId'] 
                        attachment_content = get_file_data(email_message['id'], attachment_id, file_name, save_location) 
                        upload_to_bucket(file_name, attachment_content,bucket_name)

if __name__ == '__main__': 
    CLIENT_FILE = 'mycredentialsforconnectgmail.json' 
    API_NAME = 'gmail' 
    API_VERSION = 'v1' 
    SCOPES = ['https://mail.google.com/'] 
    service = Create_Service(CLIENT_FILE, API_NAME, API_VERSION, SCOPES) 
    query_string = 'has:attachment' 
    email_messages = search_emails(query_string) 
    save_file_data(email_messages)


                 

I also made a small app that uploads files from my local machine, and it works, but now the file content is held in a variable fetched from the inbox rather than stored at a path.

  • Post the full traceback; it's impossible to tell where the error is originating from. Commented Jun 10, 2022 at 15:59
  • Hi Peter, I added it all. Basically, my upload_to_bucket function works if I use local files, but I am getting the CSV files from an email inbox and need to send them to a GCP bucket. How do I find the file_path for these CSV files? Commented Jun 10, 2022 at 16:16

1 Answer

Your error says "cannot use a string pattern on a bytes-like object".

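The Gmail API returns the attachment data as a base64url-encoded string, and base64.urlsafe_b64decode accepts a string directly, so you don't need to encode it first. Change the code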

file_data = base64.urlsafe_b64decode(response.get('data').encode('UTF-8'))

to

file_data = base64.urlsafe_b64decode(response.get('data'))
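
Note that the traceback ends inside upload_from_filename, which expects a path on disk, while attachment_content holds the decoded CSV bytes. Below is a minimal sketch of uploading those bytes directly, assuming the google-cloud-storage method Blob.upload_from_string. The helper names decode_attachment and upload_bytes_to_bucket are hypothetical and not part of the original code, and the bucket object is passed in so the functions need nothing beyond the standard library:

```python
import base64


def decode_attachment(data):
    """Decode the base64url 'data' field of a Gmail attachment response into bytes.

    base64.urlsafe_b64decode accepts either an ASCII str or a bytes-like object.
    """
    return base64.urlsafe_b64decode(data)


def upload_bytes_to_bucket(bucket, blob_name, content, content_type="text/csv"):
    """Upload in-memory content to a bucket without writing a local file.

    `bucket` is assumed to be a google.cloud.storage Bucket, for example the
    result of storage_client.get_bucket(bucket_name).
    """
    blob = bucket.blob(blob_name)
    # upload_from_string takes str or bytes, so no file path is needed
    blob.upload_from_string(content, content_type=content_type)
    return blob
```

With the question's code, this would be called as upload_bytes_to_bucket(storage_client.get_bucket(bucket_name), file_name, attachment_content), so no temporary file is involved.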


4 Comments

Hi, I got the same error. I think it is because of what I am sending as file_path in upload_to_bucket(file_name, attachment_content, bucket_name): attachment_content is not a real path. I couldn't find the path of the CSV files picked up from the email inbox.
Provide a sample of what you get as the path.
If I print(attachment_content), it shows me the data from the CSV; it doesn't show a real path.
I don't believe print statements will show up in logs in production. Try using logging.info(), but you first have to set your log level to INFO, e.g. logging.getLogger().setLevel(logging.INFO).
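
As a rough illustration of the logging suggestion, here is a small sketch that logs the type and size of the attachment payload instead of printing its contents. The logger name "attachments" and the helper log_attachment are hypothetical, chosen only for this example:

```python
import logging

# Configure the root handler once; records below INFO are dropped
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("attachments")  # hypothetical logger name
logger.setLevel(logging.INFO)


def log_attachment(file_name, content):
    """Log what kind of object the attachment payload is and how large it is."""
    logger.info(
        "attachment %s: type=%s, size=%d",
        file_name,
        type(content).__name__,
        len(content),
    )
```

Calling log_attachment(file_name, attachment_content) in the question's loop would show that the payload is a bytes object rather than a path string.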
