Best practices for manipulating database result sets in Python?

I am writing a simple Python web application that consists of several pages of business data formatted for the iPhone. I'm comfortable programming Python, but I'm not very familiar with Python "idiom," especially regarding classes and objects. Python's object oriented design differs somewhat from other languages I've worked with. So, even though my application is working, I'm curious whether there is a better way to accomplish my goals.

Specifics: How does one typically implement the request-transform-render database workflow in Python? Currently, I am using pyodbc to fetch data, copying the results into attributes on an object, performing some calculations and merges using a list of these objects, then rendering the output from the list of objects. (Sample code below, SQL queries redacted.) Is this sane? Is there a better way? Are there any specific "gotchas" I've stumbled into in my relative ignorance of Python? I'm particularly concerned about how I've implemented the list of rows using the empty "Record" class.

import sys

class Record(object):
    pass

def calculate_pnl(records, node_prices):
    for record in records:
        try:
            # fill RT and DA prices from the hash retrieved above
            if hasattr(record, 'sink') and record.sink:
                record.da = node_prices[record.sink][0] - node_prices[record.id][0]
                record.rt = node_prices[record.sink][1] - node_prices[record.id][1]
            else:
                record.da = node_prices[record.id][0]
                record.rt = node_prices[record.id][1]

            # calculate dependent values: RT-DA and PNL
            record.rtda = record.rt - record.da
            record.pnl = record.rtda * record.mw
        except Exception:
            # report the failure (e.g. a missing node price) and keep going
            print(sys.exc_info())

def map_rows(cursor, mappings, callback=None):
    records = []
    for row in cursor:
        record = Record()
        for field, attr in mappings.items():
            setattr(record, attr, getattr(row, field, None))
        if not callback or callback(record):
            records.append(record)

    return records

def get_positions(cursor):
    # get the latest position time
    cursor.execute("SELECT latest data time")
    time = cursor.fetchone().time
    hour = eelib.util.get_hour_ending(time)

    # fetch the current positions
    cursor.execute("SELECT stuff FROM atable", (hour,))

    # read the rows
    nodes = {}
    def record_callback(record):
        if abs(record.mw) > 0:
            if record.id: nodes[record.id] = None
            return True
        else:
            return False
    records = util.map_rows(cursor, {
        'id': 'id',
        'name': 'name',
        'mw': 'mw'
    }, record_callback)

    # query prices
    for node_id in nodes:
        # RT price
        row = cursor.execute("SELECT price WHERE ? ? ?", (node_id, time, time)).fetchone()
        rt5 = row.lmp if row else None

        # DA price
        row = cursor.execute("SELECT price WHERE ? ? ?", (node_id, hour, hour)).fetchone()
        da = row.da_lmp if row else None

        # update the hash value
        nodes[node_id] = (da, rt5)

    # calculate the position pricing
    calculate_pnl(records, nodes)

    # sort
    records.sort(key=lambda r: r.name)

    # return the records
    return records


Asked by: Miranda814 | Posted: 28-01-2022






Answer 1

The empty Record class and the free-floating function that (generally) applies to an individual Record are a hint that you haven't designed your class properly.

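class Record(object):
    """Assuming rtda and pnl must exist."""
    def __init__(self):
        self.da = 0
        self.rt = 0
        self.rtda = 0  # or whatever
        self.pnl = None
        self.sink = None  # Not clear what this is

    def setPnl(self, node_prices):
        # self.id and self.mw are assumed to be set elsewhere (e.g. by map_rows)
        # fill RT and DA prices from the hash retrieved above
        da, rt = node_prices[self.id]
        if self.sink:
            sink_da, sink_rt = node_prices[self.sink]
            da, rt = sink_da - da, sink_rt - rt
        self.da, self.rt = da, rt
        # calculate dependent values: RT-DA and PNL
        self.rtda = self.rt - self.da
        self.pnl = self.rtda * self.mw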

Now, your calculate_pnl( records, node_prices ) is simpler and uses the object properly.

def calculate_pnl( records, node_prices ):
    for record in records:
        record.setPnl( node_prices )

The point isn't to trivially refactor the code in small ways.

The point is this: A Class Encapsulates Responsibility.

Yes, an empty-looking class is usually a problem. It means the responsibilities are scattered somewhere else.

A similar analysis holds for the collection of records. This is more than a simple list, since the collection -- as a whole -- has operations it performs.
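A minimal sketch of that idea, not the asker's actual code: a hypothetical Positions class wraps the list and owns the whole-collection operations (dropping zero positions, collecting node ids, pricing, sorting). The names Positions, add, node_ids, price_all and sorted_by_name are assumptions made for illustration, and Record here is a simplified stand-in with only the fields the example needs.

```python
class Record(object):
    """Simplified stand-in for one position row (hypothetical fields)."""

    def __init__(self, id, name, mw, sink=None):
        self.id = id
        self.name = name
        self.mw = mw
        self.sink = sink
        self.da = self.rt = self.rtda = self.pnl = None

    def setPnl(self, node_prices):
        # node_prices maps node id -> (da_price, rt_price)
        da, rt = node_prices[self.id]
        if self.sink:
            sink_da, sink_rt = node_prices[self.sink]
            da, rt = sink_da - da, sink_rt - rt
        self.da, self.rt = da, rt
        self.rtda = rt - da
        self.pnl = self.rtda * self.mw


class Positions(object):
    """Collection that owns the operations applied to all records."""

    def __init__(self):
        self._records = []

    def add(self, record):
        # Skip flat positions, as record_callback does in the question.
        if abs(record.mw) > 0:
            self._records.append(record)

    def node_ids(self):
        return set(r.id for r in self._records if r.id)

    def price_all(self, node_prices):
        for record in self._records:
            record.setPnl(node_prices)

    def sorted_by_name(self):
        return sorted(self._records, key=lambda r: r.name)


positions = Positions()
positions.add(Record("N2", "west", 10.0))
positions.add(Record("N1", "east", -5.0))
positions.add(Record("N3", "flat", 0.0))  # dropped: zero position
positions.price_all({"N1": (30.0, 32.0), "N2": (25.0, 28.0)})
names = [r.name for r in positions.sorted_by_name()]
```

With this shape, a function like get_positions would only fetch rows and prices and hand them to the collection; the filtering and pricing rules live in one place.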

The "Request-Transform-Render" description isn't quite right. You have a Model (the Record class). Instances of the Model get built (possibly because of a Request). The Model objects are responsible for their own state transformations and updates. Perhaps they get displayed (or rendered) by some object that examines their state.
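As a rough illustration of that last point, rendering can live in a separate function that only reads each object's state. The render_rows name, the plain-text layout and the sample values are assumptions, since the question does not show its rendering code. Note one substitution: the derived values here are computed properties rather than being filled in by a setPnl method, which is just one possible way to keep the calculation inside the model.

```python
class Record(object):
    """Model object; rtda and pnl are derived from its own state."""

    def __init__(self, name, mw, da, rt):
        self.name = name
        self.mw = mw
        self.da = da
        self.rt = rt

    @property
    def rtda(self):
        return self.rt - self.da

    @property
    def pnl(self):
        return self.rtda * self.mw


def render_rows(records):
    """Build one text line per record by examining its state."""
    template = "%(name)-8s mw=%(mw)6.1f rtda=%(rtda)6.2f pnl=%(pnl)8.2f"
    lines = []
    for r in records:
        lines.append(template % {
            "name": r.name, "mw": r.mw, "rtda": r.rtda, "pnl": r.pnl,
        })
    return "\n".join(lines)


output = render_rows([Record("east", -5.0, 30.0, 32.0)])
```

The renderer never computes business values itself; swapping the text layout for an HTML template would not touch the Record class.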

It's that "Transform" step that often violates good design by scattering responsibility all over the place. "Transform" is a hold-over from non-object design, where responsibility was a nebulous concept.

Answered by: Ada468 | Posted: 01-03-2022



Answer 2

Have you considered using an ORM? SQLAlchemy is pretty good, and Elixir makes it beautiful. It can really reduce the amount of boilerplate code needed to deal with databases. Also, a lot of the gotchas mentioned have already shown up, and the SQLAlchemy developers have dealt with them.

Answered by: Lydia683 | Posted: 01-03-2022



Answer 3

Depending on how much you want to do with the data, you may not need to populate an intermediate object. The cursor's header data structure (cursor.description in DB-API modules such as pyodbc) will let you get the column names; a bit of introspection will let you make a dictionary with col-name:value pairs for the row. You can pass the dictionary to the % operator. The docs for the odbc module will explain how to get at the column metadata.

This snippet of code shows the application of the % operator in this manner.

>>> a={'col1': 'foo', 'col2': 'bar', 'col3': 'wibble'}
>>> 'Col1=%(col1)s, Col2=%(col2)s, Col3=%(col3)s' % a
'Col1=foo, Col2=bar, Col3=wibble'
>>> 
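In DB-API modules the column metadata is exposed as cursor.description, a sequence whose entries start with the column name. A minimal sketch of the dictionary approach follows; it uses the standard library's sqlite3 with an in-memory table purely so the example is self-contained, and the positions table, its columns and the rows_as_dicts helper are made up for illustration.

```python
import sqlite3


def rows_as_dicts(cursor):
    """Yield each row of an executed cursor as a {column_name: value} dict."""
    columns = [col[0] for col in cursor.description]
    for row in cursor:
        yield dict(zip(columns, row))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE positions (id TEXT, name TEXT, mw REAL)")
conn.executemany(
    "INSERT INTO positions VALUES (?, ?, ?)",
    [("N1", "east", -5.0), ("N2", "west", 10.0)],
)

cursor = conn.execute("SELECT id, name, mw FROM positions ORDER BY name")
# Each row dict goes straight to the % operator, with no intermediate object.
lines = ["%(name)s (%(id)s): %(mw).1f MW" % row for row in rows_as_dicts(cursor)]
conn.close()
```

The same helper should work with any cursor whose rows are iterable in column order, but calculations such as the PNL would then have to operate on dictionary keys instead of attributes.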

Answered by: Miller399 | Posted: 01-03-2022



Answer 4

Using an ORM for an iPhone app might be a bad idea because of performance issues: you want your code to be as fast as possible, so you can't avoid some boilerplate code. If you are considering an ORM, besides SQLAlchemy I'd recommend Storm.

Answered by: Darcy675 | Posted: 01-03-2022


