I need to migrate tables from MS Access to Postgres. I'd like to use pyodbc for this, as it lets me connect to the Access database from Python and query the data.

The problem is that I'm not sure how to programmatically create a table with the same schema, other than by building a SQL statement with string formatting. pyodbc can list all of the fields, field types and field lengths, so I can build a long SQL statement with all of the relevant information. But how can I do this for a bunch of tables? Would I need to build SQL string statements for each table?

import pyodbc

access_conn_str = (
    r'DRIVER={Microsoft Access Driver (*.mdb, *.accdb)}; '
    r'DBQ=C:\Users\bob\access_database.accdb;'
)
access_conn = pyodbc.connect(access_conn_str)
access_cursor = access_conn.cursor()

postgres_conn_str = (
    "DRIVER={PostgreSQL Unicode};"
    "DATABASE=access_database;"
    "UID=user;"
    "PWD=password;"
    "SERVER=localhost;"
    "PORT=5433;"
)
postgres_conn = pyodbc.connect(postgres_conn_str)
postgres_cursor = postgres_conn.cursor()

table_dict = {}
row_dict = {}

for row in access_cursor.columns(table='table1'):
    row_dict[row.column_name] = [row.type_name, row.column_size]

table_dict['table1'] = row_dict

for table, values in table_dict.items():
    print(f"Creating table for {table}")

    access_cursor.execute(f'SELECT * FROM {table}')
    result = access_cursor.fetchall()

    postgres_cursor.execute(f'''CREATE TABLE {table} (Do I just put a bunch of string formatting in here?);''')
    postgres_cursor.executemany(f'INSERT INTO {table} (Do I just put a bunch of string formatting) VALUES (string formatting?)', result)

postgres_conn.commit()

As you can see, with pyodbc I'm not exactly sure how to build the SQL statements. I know I could build a long string by hand, but if I were doing a bunch of different tables, with different fields etc. that would not be realistic. Is there a better, easier way to create the table and insert rows based off of the schema of the Access database?

  • Have you looked around for existing tools that might take care of the grunt work for you? Something like this, perhaps? Commented Mar 13, 2021 at 15:13
  • @GordThompson No I haven't looked into other tools. What you suggested looks pretty neat, I'll check it out. Commented Mar 13, 2021 at 16:35
  • MDB-tools. Example: mdb-schema -T some_table some_db.mdb postgres Commented Mar 13, 2021 at 16:39
  • @AdrianKlaver I gave MDB-tools a shot. When I export the tables with the command mdb-schema access_database.accdb postgres | tr 'A-Z' 'a-z' | psql -d postgres_database -U postgres -W -h 192.168.0.242 -p 5433, it creates all of the tables, but at the end I get two errors: ERROR: relation "msysnavpanegroups" does not exist and ERROR: relation "msysnavpanegrouptoobjects" does not exist. I also get errors when trying to load the data (it's looping and saying column does not exist). Do you have any idea why this may be? Commented Mar 13, 2021 at 19:12
  • Well, I would first direct the output of mdb-schema to a file in order to verify it. Any time you migrate from one system to another there will be mismatches. Is there a msysnavpanegroups table in the Access database? Is it in the output produced by mdb-schema? Which column does not exist, and what is the exact error? Add the answers to the above to your question. Commented Mar 13, 2021 at 20:08
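The two relations named in the errors above look like Access system tables, whose names begin with MSys. Whether they are the actual cause is not confirmed in this thread, but a small sketch of leaving such tables out of an export list could look like this. It assumes the table names have already been collected into a Python list, and `user_tables` is a hypothetical helper name.

```python
def user_tables(table_names):
    """Return only the names that do not look like Access system tables.

    Access system tables are prefixed with 'MSys'; the comparison is
    case-insensitive because the names may have been lowercased already.
    """
    return [name for name in table_names if not name.lower().startswith("msys")]
```

The filtered list can then be used in place of the full table list when exporting.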

1 Answer

I ultimately ended up using a combination of pyodbc and pywin32. pywin32 is "basically a very thin wrapper of python that allows us to interact with COM objects and automate Windows applications with python" (quoted from second link below).

I was able to programmatically interact with Access and export the tables directly to Postgres with DoCmd.TransferDatabase.

https://learn.microsoft.com/en-us/office/vba/api/access.docmd.transferdatabase
https://pbpython.com/windows-com.html

import win32com.client
import pyodbc
import logging
from pathlib import Path

# access_database_location, pg_user, pg_pwd and pg_port are assumed to be defined elsewhere
conn_str = (
    r'DRIVER={Microsoft Access Driver (*.mdb, *.accdb)}; '
    rf'DBQ={access_database_location};'
)
conn = pyodbc.connect(conn_str)
cursor = conn.cursor()

a = win32com.client.Dispatch("Access.Application")
a.OpenCurrentDatabase(access_database_location)

table_list = []

for table_info in cursor.tables(tableType='TABLE'):
    table_list.append(table_info.table_name)

# Access enumeration values expected by DoCmd.TransferDatabase
acExport = 1
acTable = 0

# The target Postgres database is named after the Access file
db_name = Path(access_database_location).stem.lower()

for table in table_list:
    logging.info(f"Exporting: {table}")

    a.DoCmd.TransferDatabase(
        acExport,
        "ODBC Database",
        "ODBC;DRIVER={PostgreSQL Unicode};"
        f"DATABASE={db_name};"
        f"UID={pg_user};"
        f"PWD={pg_pwd};"
        "SERVER=localhost;"
        f"PORT={pg_port};",
        acTable,
        table,
        f"{table.lower()}_export_from_access",
    )

    logging.info(f"Finished Export of Table: {table}")
    logging.info("Creating empty table in EGDB based off of this")

This approach seems to be working for me. I like how the creation of the table/fields as well as insertion of data is all handled automatically (which was the original problem I was having with pyodbc).
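The connection string passed to DoCmd.TransferDatabase above is assembled inline from several variables. As a minimal sketch, it could be factored into a small helper so it can be inspected or logged before the export runs. `build_pg_odbc_connect` is a hypothetical name, and the default server and driver simply mirror the values used in the code above.

```python
def build_pg_odbc_connect(db_name, user, password, port,
                          server="localhost", driver="PostgreSQL Unicode"):
    """Build the 'ODBC;...' connect string for an ODBC export target."""
    parts = [
        ("DRIVER", "{" + driver + "}"),
        ("DATABASE", db_name),
        ("UID", user),
        ("PWD", password),
        ("SERVER", server),
        ("PORT", port),
    ]
    # Each key=value pair is terminated by a semicolon, as in the original string
    return "ODBC;" + "".join(f"{key}={value};" for key, value in parts)
```

The returned string would take the place of the third argument to TransferDatabase.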

If anyone has better approaches I'm open to suggestions.
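For comparison, the pure-pyodbc route from the question can be sketched by generating the statements from the column metadata instead of writing them by hand. This is only a sketch under stated assumptions: the type mapping `ACCESS_TO_PG` is an assumption and should be checked against the type names the Access driver actually reports, unknown types fall back to text, and `quote_ident`, `pg_type`, `build_create_table` and `build_insert` are hypothetical helper names.

```python
# Assumed mapping from Access ODBC type names to PostgreSQL types;
# verify it against what cursor.columns() reports for your database.
ACCESS_TO_PG = {
    "COUNTER": "serial",
    "INTEGER": "integer",
    "SMALLINT": "smallint",
    "BYTE": "smallint",
    "REAL": "real",
    "DOUBLE": "double precision",
    "CURRENCY": "numeric(19,4)",
    "BIT": "boolean",
    "DATETIME": "timestamp",
    "LONGCHAR": "text",
}


def quote_ident(name):
    """Double-quote an identifier so names with spaces or capitals survive."""
    return '"' + name.replace('"', '""') + '"'


def pg_type(type_name, column_size):
    """Translate one Access type name into a PostgreSQL column type."""
    key = type_name.upper()
    if key == "VARCHAR":
        return f"varchar({column_size})"
    return ACCESS_TO_PG.get(key, "text")  # fall back to text when unsure


def build_create_table(table, columns):
    """columns is a list of (column_name, type_name, column_size) tuples."""
    col_defs = ", ".join(
        f"{quote_ident(name)} {pg_type(type_name, size)}"
        for name, type_name, size in columns
    )
    return f"CREATE TABLE {quote_ident(table)} ({col_defs});"


def build_insert(table, columns):
    """Build a parameterized INSERT using pyodbc's '?' placeholders."""
    names = ", ".join(quote_ident(name) for name, _, _ in columns)
    placeholders = ", ".join("?" for _ in columns)
    return f"INSERT INTO {quote_ident(table)} ({names}) VALUES ({placeholders});"
```

The column tuples could come from `[(r.column_name, r.type_name, r.column_size) for r in access_cursor.columns(table=table)]`. The CREATE statement would go to `postgres_cursor.execute` and the INSERT statement, together with the fetched rows, to `postgres_cursor.executemany`. Note that quoted identifiers keep their case in Postgres, so later queries must quote them the same way.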
