  1. Parsing the .bc3 file uploaded by the user.

    Everything is working as expected.

  2. Instantiating the Concept model.

    I save the instances in concept_instances = [c1, c2, c3... cn].

  3. Inserting Concept instances into the database.

    In order to speed up the load I use the bulk_create(concept_instances) method.

  4. Instantiating the Deco model.

    I save the instances in deco_instances = [d1, d2, d3... dn]. To do that, however, I need to retrieve each Concept object from the database, because the parent_concept and concept fields reference it.

  5. Inserting Deco instances into the database.

    As before, to speed up the load I use the bulk_create(deco_instances) method.
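
Step 4 is the only step that goes back to the database for every object. A minimal sketch of one way around that, assuming every child code in the decomposition data also appears among the parsed concepts: index the instances by code once and resolve parents and children from that dictionary. `ConceptRow`, `DecoRow` and `build_decos` are stand-ins invented for this illustration, not the Django models of this post.

```python
from dataclasses import dataclass


@dataclass
class ConceptRow:
    """Stand-in for a Concept instance (illustration only)."""
    code: str


@dataclass
class DecoRow:
    """Stand-in for a Deco instance (illustration only)."""
    parent_concept: ConceptRow
    concept: ConceptRow
    factor: str
    efficiency: str


def build_decos(concept_rows, decos):
    """Resolve parent and child codes through one in-memory index."""
    by_code = {row.code: row for row in concept_rows}
    rows = []
    for parent_code, children in decos.items():
        parent = by_code[parent_code]
        for child_code, factor, efficiency in children:
            if child_code == '':
                continue  # skip entries that carry no child code
            rows.append(DecoRow(parent, by_code[child_code],
                                factor or '0.000', efficiency or '0.000'))
    return rows


concept_rows = [ConceptRow('PER02'), ConceptRow('Qexcav'), ConceptRow('Qzanj')]
decos = {'PER02': [('Qexcav', '1', '231.13'), ('Qzanj', '1', ''), ('', '', '')]}
deco_rows = build_decos(concept_rows, decos)
```

In the real code the index would presumably be filled from a single query over the freshly inserted concepts, so the number of queries no longer grows with the number of children.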

Added primary_key=True to the code field.
oubiga
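from decimal import Decimal

from django.db import models
# Assumption: the post does not show which translation helper is bound to _.
from django.utils.translation import gettext_lazy as _


class Concept(models.Model):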
    code = models.CharField(_('code'), max_length=20, primary_key=True)
    root = models.BooleanField(_('is it root'), default=False)
    chapter = models.BooleanField(_('is it chapter'), default=False)
    parent = models.BooleanField(_('is it parent'), default=False)
    unit = models.CharField(_('unit'), blank=True, max_length=3)
    summary = models.CharField(_('summary'), blank=True, max_length=100)
    price = models.DecimalField(_('price'), max_digits=12, decimal_places=3,
                                null=True, blank=True)
    date = models.DateField(_('creation date'), null=True, blank=True)
    concept_type = models.CharField(_('concept type'), max_length=3, blank=True)

    def __str__(self):
        return '%s: %s' % (self.code, self.summary)


class Deco(models.Model):
    parent_concept = models.ForeignKey(Concept, null=True, blank=True,
                                       related_name='decos')
    concept = models.ForeignKey(Concept, null=True, blank=True)
    factor = models.DecimalField(max_digits=12, decimal_places=3,
                                 default=Decimal('0.000'))
    efficiency = models.DecimalField(max_digits=12, decimal_places=3,
                                     default=Decimal('0.000'))

    def __str__(self):
        return '%s: %s' % (self.parent_concept, self.concept)
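
Both numeric Deco fields are DecimalField columns with three decimal places and a Decimal('0.000') default. A minimal sketch, assuming the values parsed from the file arrive as strings that may be empty, of converting them without a detour through float. `to_decimal` and `THREE_PLACES` are names made up for this example.

```python
from decimal import Decimal

# Quantization target matching decimal_places=3 on the model fields.
THREE_PLACES = Decimal('0.001')


def to_decimal(raw, default=Decimal('0.000')):
    """Convert a raw string to a Decimal with three decimal places."""
    raw = raw.strip()
    if raw == '':
        return default  # empty field: fall back to the model default
    return Decimal(raw).quantize(THREE_PLACES)


factor = to_decimal('1')
efficiency = to_decimal('231.13')
missing = to_decimal('')
```

Keeping the values as Decimal from the start means what reaches the field is exactly what was in the file, padded to three places.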
Added bc3parser.py module

bc3parser.py

#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""Parse bc3 files and insert all the data into the database."""

import re

from enkendas.models import Version, Concept, Deco, Text

from .utils import optional_codes, parse_dates

# regex stuff
# parsers stuff

concepts = {}
decos = {}

# decos = {'PER02': [('Qexcav', '1', '231.13'), ('Qzanj', '1', '34.5'),
#                    ('Qexcav2', '1', '19.07'), ('Qrelltras', '1', '19.07')],
# ...
#          'Qexcav': [('MMMT.3c', '1', '0.045'), ('O01OA070', '1', '0.054'),
#                     ('M07CB030', '1', '0.036'), ('%0300', '1', '0.03')]}

def dispatch_record(record):
    """
    Dispatch every record.

    Check the first character of the record and send it to the proper function.
    """
    if record.startswith('D'):
        parse_decomp(record)
    elif record.startswith('V'):
        parse_version(record)
    elif record.startswith('C'):
        parse_concept(record)
    elif record.startswith('T'):
        parse_text(record)
    else:
        pass

def parse_file(file):
    """
    Parse the whole file.

    file is a generator returned by file.chunks(chunk_size=80000) in views.py.
    """
    # Text left over from the previous chunk: a record cut by the boundary.
    pending = ''
    # Iterate over the chunks of the file sent by the user.
    for byte_string in file:
        string = pending + byte_string.decode(encoding='ISO-8859-1')
        # List of records. The last piece may be incomplete, so it is
        # carried over and completed by the next chunk.
        dirty_strings_list = string.split('~')
        pending = dirty_strings_list.pop()

        for dirty_string in dirty_strings_list:
            record = dirty_string.strip()
            # TODO: I didn't create a regex for 'M' and 'E' records yet.
            if record == '' or record[0] in 'ME':
                continue
            dispatch_record(record)

    # Dispatch whatever is left after the last chunk.
    record = pending.strip()
    if record != '' and record[0] not in 'ME':
        dispatch_record(record)

    concept_instances = []
    for key_code, data in concepts.items():
        code = key_code
        root = chapter = parent = False
        if len(key_code) > 2 and key_code[-2:] == '##':
            root = True
            code = key_code[:-2]
        elif len(key_code) > 1 and key_code[-1:] == '#':
            chapter = True
            code = key_code[:-1]
        if code in decos:
            parent = True
        concept = Concept(code=code, root=root, chapter=chapter, parent=parent,
                          unit=data['unit'], summary=data['summary'],
                          price=data['price'], date=data['date'],
                          concept_type=data['concept_type'])
        concept_instances.append(concept)

    Concept.objects.bulk_create(concept_instances)

    deco_instances = []
    # Cache of child Concept objects, keyed by code, to avoid extra queries.
    cobjs_storage = {}
    for concept in Concept.objects.all():
        if not concept.parent:
            continue

        # Each parent is visited once, so its entry is consumed here.
        dec = decos.pop(concept.code)
        for child, factor, efficiency in dec:
            if child == '':
                continue
            if factor == '':
                factor = '0.000'
            if efficiency == '':
                efficiency = '0.000'
            if child not in cobjs_storage:
                cobjs_storage[child] = Concept.objects.get(code=child)
            deco = Deco(parent_concept=concept, concept=cobjs_storage[child],
                        factor=float(factor), efficiency=float(efficiency))
            deco_instances.append(deco)

    Deco.objects.bulk_create(deco_instances)
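
The suffix handling in the concept loop of parse_file can be pulled into a small pure function, which makes the rule easy to check on its own: a trailing `##` marks the root, a trailing `#` marks a chapter, and anything else is a plain concept. This is a sketch only; `classify_code` is a hypothetical helper and the sample keys are made up.

```python
def classify_code(key_code):
    """Return (code, root, chapter) for a raw concept key."""
    if len(key_code) > 2 and key_code.endswith('##'):
        return key_code[:-2], True, False
    if len(key_code) > 1 and key_code.endswith('#'):
        return key_code[:-1], False, True
    return key_code, False, False


root_key = classify_code('PER02##')
chapter_key = classify_code('C01#')
plain_key = classify_code('Qexcav')
lone_hash = classify_code('#')
```

The length checks mirror the original conditions, so a key consisting of a single `#` is left untouched.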

