Best way to access data in Python (with code)

Question

I've been trying for a long time to find the best solution for this, but every option seems to have a downside.

For my project, the main code (in Python) needs to access data as fast as possible. The data consists of 3 key parameters (h1, h2, q), each of which can take every value in a certain range, and 2 parameters (n, p) that are calculated from the 3 keys.
As it takes way too long to calculate them in the main program, I want to calculate all possible combinations beforehand and look them up from my main program.

My first solution was to build a triply nested dictionary whose innermost values are arrays containing the 2 calculated values. With this I have two problems:

1.) I didn't really find a nice solution to access the data from my main method (it's way too big to just copy-paste it in the code as a variable)

2.) The values are already sorted, so a linear search takes way too much time, which I see as the downside of the dictionary solution here. (I thought I might overcome this by splitting it up into several smaller dictionaries and determining which one to search with a simple if-condition.)

My second idea was to use SQLite3 to make it more professional, but SQL doesn't seem to be any faster; it's actually a bit slower.

Is there an approach that makes use of the already sorted data? I would be super happy if someone had an idea on what to do here, as I've already spent quite some time trying to come up with a solution.

My desired usage would be to access the two calculated parameters with something like:

x = n(h1 = 0.6, h2 = 0.7, q = 45)
y = p(h1 = 1.2, h2 = 0.9, q = 37)

The data (in the form of a dictionary) would be something like this:

dic = {0.6: {0.6: {25: [0.721015194906449, 38.5765797217578], 30: [0.894997480537817, 41.4758084379593], 35: [1.09507774740190, 44.4203224961566], 40: [1.32615185231568, 47.4085304722462], 45: [1.59288184615660, 50.4387051155152]}, 0.8: {25: [0.699928591599505, 38.1952586298542], 30: [0.870503808316267, 41.0984549056602], 35: [1.06723254390729, 44.0475150243940], 40: [1.29499245972316, 47.0407802928774], 45: [1.55841136584623, 50.0764537801238]}, 1.0: {25: [0.678904933323108, 37.8080339255472], 30: [0.846078073485272, 40.7153829050255], 35: [1.03946048887268, 43.6691963099343], 40: [1.26390856173115, 46.6677433191870], 45: [1.52401375488507, 49.7091525400001]}}, 0.8: {0.6: {25: [0.699928591599505, 38.1952586298542], 30: [0.870503808316267, 41.0984549056602], 35: [1.06723254390729, 44.0475150243940], 40: [1.29499245972316, 47.0407802928774], 45: [1.55841136584623, 50.0764537801238]}, 0.8: {25: [0.678904933323108, 37.8080339255472], 30: [0.846078073485272, 40.7153829050255], 35: [1.03946048887268, 43.6691963099343], 40: [1.26390856173115, 46.6677433191870], 45: [1.52401375488507, 49.7091525400001]}, 1.0: {25: [0.657945584308118, 37.4146226119758], 30: [0.821721975385228, 40.3263243022160], 35: [1.01176326951241, 43.2851144574096], 40: [1.23290137939785, 46.2891847659982], 45: [1.48968930734260, 49.3365841236722]}}, 1.0: {0.6: {25: [0.678904933323108, 37.8080339255472], 30: [0.846078073485272, 40.7153829050255], 35: [1.03946048887268, 43.6691963099343], 40: [1.26390856173115, 46.6677433191870], 45: [1.52401375488507, 49.7091525400001]}, 0.8: {25: [0.657945584308118, 37.4146226119758], 30: [0.821721975385228, 40.3263243022160], 35: [1.01176326951241, 43.2851144574096], 40: [1.23290137939785, 46.2891847659982], 45: [1.48968930734260, 49.3365841236722]}, 1.0: {25: [0.637052017659749, 37.0147183330305], 30: [0.797437318766716, 39.9309893301663], 35: [0.984142627694414, 42.8949977775162], 40: [1.20197209243436, 45.9048519379388], 45: [1.45543814550046, 48.9585152177077]}}}

with the dictionary syntax being {h1:{h2:{q:[p,n]}}}

Could you elaborate on the problem you had with using a dictionary? Where does a linear search happen?
– Kemp, Commented Mar 9, 2021 at 13:26
My thought was that when accessing the dictionary, it would not jump straight to e.g. h1 = 0.9 but would first go through all of the keys, since dictionary keys can also be nominal data like "hair color: brown". So I felt like the dictionary solution doesn't make use of the ordered nature of my data.
– rhea, Commented Mar 9, 2021 at 14:07
Dictionaries hash their keys, so they can search for keys much more efficiently than a linear search. It may be worth profiling the performance and seeing if it's ok for you, as the dictionary search is likely to be on par with the fastest search you can do in other structures. If you wanted to flatten it somewhat then you could use a tuple of (h1, h2, q) as the key so you don't have nested dictionaries to search multiple times. Again, worth profiling.
– Kemp, Commented Mar 9, 2021 at 14:17
It depends how you're generating it, but it could be worth looking into something like saving the dictionary to a file as JSON and then loading it from the file before using it. That won't work out-of-the-box with a tuple as the dictionary key, but it will work with the nested dictionaries. If you're going the tuple route then you could do some work to, for example, convert the keys to strings and back on save and load. Alternatively, you could save the flattened dictionary to a CSV file, with the key values in the first three columns and then a column per value in the list.
– Kemp, Commented Mar 9, 2021 at 15:40

Kemp · Accepted Answer · 2021-03-09 17:00:28Z

Dictionary search

Dictionaries hash their keys, so they can search for keys much more efficiently than a linear search. It may be worth profiling the performance and seeing if it's acceptable for you as-is, since a dictionary lookup is likely to be on par with the fastest search you can do in other structures. If you wanted to flatten it somewhat then you could use a tuple of (h1, h2, q) as the key so you don't have nested dictionaries to search multiple times. Again, worth profiling.
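As a rough sketch of the flattening and profiling suggested above: the `flatten` helper below is a hypothetical name, and the sample values are rounded from the question's data to keep the example short. It assumes the nested layout {h1: {h2: {q: [value1, value2]}}} and uses the standard `timeit` module to time both access patterns.

```python
import timeit

# Small sample in the question's nested layout: {h1: {h2: {q: [v1, v2]}}}
# (values rounded from the question's data for brevity)
nested = {
    0.6: {0.6: {25: [0.721, 38.577], 30: [0.895, 41.476]},
          0.8: {25: [0.700, 38.195], 30: [0.871, 41.098]}},
    0.8: {0.6: {25: [0.700, 38.195], 30: [0.871, 41.098]},
          0.8: {25: [0.679, 37.808], 30: [0.846, 40.715]}},
}


def flatten(nested_dict):
    """Turn {h1: {h2: {q: values}}} into {(h1, h2, q): values}."""
    return {
        (h1, h2, q): values
        for h1, by_h2 in nested_dict.items()
        for h2, by_q in by_h2.items()
        for q, values in by_q.items()
    }


flat = flatten(nested)

# Both lookups are hash-based; neither scans the keys linearly.
assert flat[(0.6, 0.8, 30)] == nested[0.6][0.8][30]

# Profile both access patterns rather than guessing which is faster.
t_nested = timeit.timeit(lambda: nested[0.6][0.8][30], number=100_000)
t_flat = timeit.timeit(lambda: flat[(0.6, 0.8, 30)], number=100_000)
print(f"nested: {t_nested:.4f}s  flat: {t_flat:.4f}s")
```

The timings depend on the machine and Python version, so it is worth running this against the real data before choosing between the two layouts.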

Saving and loading data

It depends how you're generating the data, but it could be worth looking into something like saving the dictionary to a file as JSON and then loading it from the file before using it. That won't work out-of-the-box with a tuple as the dictionary key, because JSON object keys must be strings, but it will work with the nested dictionaries (their keys are converted to strings when saved). If you're going the tuple route then you could do some work to, for example, convert the keys to strings and back on save and load.
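One possible way to do the string conversion for tuple keys is sketched below. The helper names `save_flat` and `load_flat` and the "h1,h2,q" key format are assumptions made for illustration, as is the choice to parse h1 and h2 as floats and q as an int.

```python
import json
import os
import tempfile


def save_flat(dic, filename):
    """Write a {(h1, h2, q): values} dictionary as JSON with string keys."""
    # JSON object keys must be strings, so join each tuple into "h1,h2,q".
    as_strings = {
        ",".join(repr(part) for part in key): values
        for key, values in dic.items()
    }
    with open(filename, "w") as f:
        json.dump(as_strings, f)


def load_flat(filename):
    """Read the file back and rebuild the tuple keys."""
    with open(filename) as f:
        as_strings = json.load(f)
    result = {}
    for key, values in as_strings.items():
        h1, h2, q = key.split(",")
        # Assumed types: h1 and h2 are floats, q is an integer.
        result[(float(h1), float(h2), int(q))] = values
    return result


dic = {(0.6, 0.8, 30): [0.870503808316267, 41.0984549056602]}

# A temporary directory keeps the sketch from leaving files behind.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "params.json")
    save_flat(dic, path)
    restored = load_flat(path)

print(restored)
```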

Alternatively, you could save the flattened dictionary to a CSV file, as in the following example. Note that because it's a CSV file you can generate it any way you want, not just via Python, and also edit it by hand if you need to.

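# Save

import csv

# Flattened dictionary: (h1, h2, q) -> [value1, value2]
dic = {
    (1, 2, 3): [4, 5]
}

# newline='' lets the csv module control line endings
with open('data.csv', 'w', newline='') as f:
    writer = csv.writer(f)

    for k, v in dic.items():
        # One row per entry: the three key parts, then the values
        writer.writerow([*k, *v])


# Load

import csv

dic = {}

with open('data.csv', 'r', newline='') as f:
    reader = csv.reader(f)

    for row in reader:
        # The example data is all integers; use float(x) for fractional data
        row = [int(x) for x in row]
        # First three columns form the key, the rest are the values
        dic[tuple(row[:3])] = row[3:]

print(dic)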

Having a class to handle it all

You can wrap all this up in a class such as the following:

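import csv
from typing import Dict, List, Tuple


class Parameters:
    def __init__(self) -> None:
        # Maps (h1, h2, q) to the list of calculated values
        self.params: Dict[Tuple[int, int, int], List[int]] = {}

    def load(self, filename: str) -> None:
        with open(filename, 'r', newline='') as f:
            reader = csv.reader(f)

            for row in reader:
                # The example data is all integers; use float(x) for fractional data
                row = [int(x) for x in row]
                self.params[tuple(row[:3])] = row[3:]

    def get_parameters(self, h1: int, h2: int, q: int) -> List[int]:
        # Raises KeyError if the combination was not precalculated
        return self.params[(h1, h2, q)]


p = Parameters()
p.load('data.csv')
print(p.get_parameters(1, 2, 3))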
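To get closer to the n(h1=..., h2=..., q=...) style of access asked for in the question, the class could be adapted roughly as follows. This is only a sketch: the `FloatParameters` name is hypothetical, the column order h1, h2, q, p, n is an assumption based on the {h1:{h2:{q:[p,n]}}} layout given in the question, and an in-memory file stands in for data.csv.

```python
import csv
import io
from typing import Dict, Tuple

Key = Tuple[float, float, int]


class FloatParameters:
    """Hypothetical variant of Parameters for float keys and values."""

    def __init__(self) -> None:
        self.params: Dict[Key, Tuple[float, float]] = {}

    def load(self, f) -> None:
        # Each row is assumed to be: h1, h2, q, p, n
        for h1, h2, q, p, n in csv.reader(f):
            self.params[(float(h1), float(h2), int(q))] = (float(p), float(n))

    def p(self, h1: float, h2: float, q: int) -> float:
        return self.params[(h1, h2, q)][0]

    def n(self, h1: float, h2: float, q: int) -> float:
        return self.params[(h1, h2, q)][1]


# An in-memory file stands in for data.csv so the sketch is self-contained.
sample = io.StringIO(
    "0.6,0.6,45,1.59288184615660,50.4387051155152\n"
    "1.0,0.8,35,1.01176326951241,43.2851144574096\n"
)
table = FloatParameters()
table.load(sample)

x = table.n(h1=0.6, h2=0.6, q=45)
y = table.p(h1=1.0, h2=0.8, q=35)
print(x, y)
```

Lookups with float keys need an exact match: a literal such as 0.6 works, but a computed value such as 0.1 * 6 does not equal 0.6 in floating point, so it may be worth rounding keys consistently on both save and lookup.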
