I have a database table that stores imported information. For simplicity, it's something like:
CREATE TABLE `data_import` (
`id` INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
`amount` DECIMAL(12,2) NULL DEFAULT NULL,
`payee` VARCHAR(50) NULL DEFAULT NULL,
`posted` TINYINT(1) NOT NULL DEFAULT 0,
PRIMARY KEY (`id`),
INDEX `payee` (`payee`)
)
I also have a table that stores import rules:
CREATE TABLE `import_rules` (
`id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
`search` VARCHAR(50) NULL DEFAULT NULL,
PRIMARY KEY (`id`),
INDEX `search` (`search`)
)
The idea is that for each imported transaction, the query needs to try to find a single matching rule. The match is done on the data_import.payee and import_rules.search fields. Because these are both varchar fields, I have indexed them in the hope of making the query faster.
This is what I have come up with so far. It seems to work fine, albeit slower than I had hoped.
SELECT i.id, i.payee, i.amount, i.posted, r.id, r.search
FROM data_import i
LEFT JOIN import_rules r ON REPLACE(i.payee, ' ', '') = REPLACE(r.search, ' ', '')
One thing the above query does not cater for is that if data_import.posted = 1, I don't need to find a rule for that row. Is it possible to stop the query from joining on such rows? Similarly, if the payee is NULL, it shouldn't try to join either.
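To make the intent concrete, here is an untested sketch of the kind of query this describes, assuming the extra conditions can simply go in the ON clause. With a LEFT JOIN, rows that fail those conditions should still be returned, just with NULL in the rule columns. The rule_id alias is only there for illustration and is not part of the original query.

```sql
-- Sketch only: conditions in the ON clause limit which rows are matched
-- against import_rules; the LEFT JOIN still returns every data_import row.
SELECT i.id, i.payee, i.amount, i.posted,
       r.id AS rule_id,          -- hypothetical alias to avoid two "id" columns
       r.search
FROM data_import i
LEFT JOIN import_rules r
       ON i.posted = 0           -- skip rule lookup for posted rows
      AND i.payee IS NOT NULL    -- skip rule lookup when there is no payee
      AND REPLACE(i.payee, ' ', '') = REPLACE(r.search, ' ', '');
```

Putting the same conditions in a WHERE clause instead would drop the posted and NULL-payee rows from the result entirely, which is not what is wanted here.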
Are there any other ways that I can optimise this? I realise that doing text joins is not ideal, but I'm not sure if there are any better methods.
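One direction that might be worth considering, as a sketch only: because both sides of the join are wrapped in REPLACE(), the existing indexes on payee and search probably cannot be used for the comparison. If the server supports stored generated columns (this assumes MySQL 5.7 or later), the space-stripped values could be precomputed and indexed. The column and index names payee_norm and search_norm are made up for this example.

```sql
-- Assumption: MySQL 5.7+ (generated columns). payee_norm / search_norm are
-- hypothetical names, not part of the original schema.
ALTER TABLE data_import
  ADD COLUMN payee_norm VARCHAR(50)
    GENERATED ALWAYS AS (REPLACE(payee, ' ', '')) STORED,
  ADD INDEX payee_norm (payee_norm);

ALTER TABLE import_rules
  ADD COLUMN search_norm VARCHAR(50)
    GENERATED ALWAYS AS (REPLACE(search, ' ', '')) STORED,
  ADD INDEX search_norm (search_norm);

-- The join then compares plain indexed columns instead of expressions.
-- payee_norm is NULL whenever payee is NULL, so those rows never match a rule.
SELECT i.id, i.payee, i.amount, i.posted, r.id AS rule_id, r.search
FROM data_import i
LEFT JOIN import_rules r
       ON i.posted = 0
      AND i.payee_norm = r.search_norm;
```

Whether this actually helps would need to be checked with EXPLAIN against the real data.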