Large Sqlite database search

How is it possible to implement an efficient large Sqlite db search (more than 90000 entries)?

I'm using Python and SQLObject ORM:

    import re
    ...

    def search1():
        cr = re.compile(ur'foo')
        for item in Item.select():
            if cr.search(item.name) or cr.search(item.skim):
                print item.name

This function runs in more than 30 seconds. How should I make it run faster?

UPD: The test:

    for item in Item.select():
        pass

... takes almost the same time as my initial function (0:00:33.093141 to 0:00:33.322414). So the regexps eat no time.

A Sqlite3 shell query:

    select '' from item where name like '%foo%';

runs in about a second. So the main time consumption happens due to the inefficient ORM's data retrieval from db. I guess SQLObject grabs entire rows here, while Sqlite touches only necessary fields.


Asked by: Charlie239 | Posted: 30-11-2021






Answer 1

The best way would be to rework your logic to do the selection in the database instead of in your python program.

Instead of doing Item.select(), you should rework it to do Item.select("""name LIKE ....

If you do this, and make sure you have the name and skim columns indexed, it will return very quickly. 90000 entries is not a large database.

Answered by: Cherry273 | Posted: 01-01-2022



Answer 2

30 seconds to fetch 90,000 rows might not be all that bad.

Have you benchmarked the time required to do the following?

    for item in Item.select():
        pass

Just to see if the time is DB time, network time or application time?

If your SQLite DB is physically very large, you could be looking at -- simply -- a lot of physical I/O to read all that database stuff in.

Answered by: Roland269 | Posted: 01-01-2022



Answer 3

If you really need to use a regular expression, there's not really anything you can do to speed that up tremendously.

The best thing would be to write an sqlite function that performs the comparison for you in the db engine, instead of Python.

You could also switch to a db server like postgresql that has support for SIMILAR.

http://www.postgresql.org/docs/8.3/static/functions-matching.html

Answered by: Daniel537 | Posted: 01-01-2022



Answer 4

I would definitely take a suggestion of Reed to pass the filter to the SQL (forget the index part though).

I do not think that selecting only specified fields or all fields make a difference (unless you do have a lot of large fields). I would bet that SQLObject creates/instanciates 80K objects and puts them into a Session/UnitOfWork for tracking. This could definitely take some time.

Also if you do not need objects in your session, there must be a way to select just what the fields you need using custom-query creation so that no Item objects are created, but only tuples.

Answered by: Ada569 | Posted: 01-01-2022



Answer 5

Initially doing regex via Python was considered for y_serial, but that was dropped in favor of SQLite's GLOB (which is far faster). GLOB is similar to LIKE except that it's syntax is more conventional: * instead of %, ? instead of _ .

See the Endnotes at http://yserial.sourceforge.net/ for more details.

Answered by: Lily157 | Posted: 01-01-2022



Answer 6

Given your example and expanding on Reed's answer your code could look a bit like the following:

import re
import sqlalchemy.sql.expression as expr

...

def search1():
    searchStr = ur'foo'
    whereClause = expr.or_(itemsTable.c.nameColumn.contains(searchStr), itemsTable.c.skimColumn.contains(searchStr))
    for item in Items.select().where(whereClause):
        print item.name

which translates to

SELECT * FROM items WHERE name LIKE '%foo%' or skim LIKE '%foo%'

This will have the database do all the filtering work for you instead of fetching all 90000 records and doing possibly two regex operations on each record.

You can find some info here on the .contains() method here.

As well as the SQLAlchemy SQL Expression Language Tutorial here.

Of course the example above assumes variable names for your itemsTable and the column it has (nameColumn and skimColumn).

Answered by: Carlos334 | Posted: 01-01-2022



Similar questions

python - Is there any nosql flat file database just as sqlite?

Closed. This question does not meet Stack Overflow guid...


With Python, SQLite and HTML, how do i read from a database and print to page?

I am a beginner in SQLite, so please bear with me if what i'm about to ask seems silly. I currently have a database called "Status", with two columns "stamp" and "messages". "messages" contains a text, and "stamp" contains a time stamp. I would like to read the last 10 entries of this database, and display them to a HTML page. The code I am using to read from database is: @cherrypy.e...


With Python and SQLite, how do i return the last n item inside the database?

I have a database containing a list of status updates and time stamp. by executing the following python script import sqlite3 as lite import sys con = lite.connect('Status.db') with con: cur = con.cursor() cur.execute("SELECT * FROM Status") print "The Status database now contains:" for row in cur: print row results i...


python - With SQLite, how do i read from a list into a database?

I am running a server with cherrypy and python script. Currently, there is a web page containing data of a list of people, which i need to get. The format of the web page is as follow: www.url1.com, firstName_1, lastName_1 www.url2.com, firstName_2, lastName_2 www.url3.com, firstName_3, lastName_3 I want to be able to break it up into 3 columns and store the information inside a database....


database - Python, Sqlite not saving results on the file

I have this code in Python: conn = sqlite3.connect("people.db") cursor = conn.cursor() sql = 'create table if not exists people (id integer, name VARCHAR(255))' cursor.execute(sql) conn.commit() sql = 'insert into people VALUES (3, "test")' cursor.execute(sql) conn.commit() sql = 'insert into people VALUES (5, "test")' cursor.execute(sql) conn.commit() print 'Printing all inserted' cursor.execute("...


Python sqlite copy table from one database to another

I'm using python 2.7 with the builtin sqlite3 module on Windows XP. The code looks like the following: #!/usr/bin/env python2 import sqlite3 import sys def open_db(nam): conn = sqlite3.connect(sys.argv[1]) # Let rows returned be of dict/tuple type conn.row_factory = sqlite3.Row print "Openned database %s as %r" % (nam, conn) return conn def copy_table(table, src, dest): print "Co...


python sqlite - database loaded in to memory (RAM)?

Closed. This question needs to be more focused. It ...


python - flask sqlite cannot delete rows from database

I cant seem to delete rows from sqlite database when using flask @app.route('/deletePath',methods=['POST']) def deletePath(): if 'toDelPathIndex' in request.form: pathIndex = request.form['toDelPathIndex'] if not isInt(pathIndex): errorMessage="Given Path Index "+pathIndex+" is not valid" return render_template('addPath.html',error=...


python - How to write something to sqlite database?

I always got this error when I trying to run my python script: c.execute("INSERT INTO test (word) VALUES (?)", (word)) sqlite3.ProgrammingError: Incorrect number of bindings supplied. The current statement uses 1, and there are 4 supplied. Any idea what is the problem? import itertools, time, sqlite3 def main(): try: print ("Still running, pl...


database - Python SQLite ToDo script

I'm trying to make a script that asks the user to input a task, the task is then stored in a SQLite database. I am having a problem getting my delete function to work. Also when I input a new task I have to encapsulate in quotes in order to make it run. import sqlite3 import sys con = sqlite3.connect('taskmgr.db') def create_db(): with con: c.execute("DROP TABLE IF EXISTS ToDo") c.ex...


python - Getting data to SQLite database in Django

I'm new to Python and Django and I have ran into problems, I couldnt find answer to. I'm using Django 1.7 with SQLite3 and Python 3.4. I'm trying to make a sports league table, by getting data from a third party website and posting it to my own. So far I can get the needed data from the web. I have set up the tables to my SQLite db and can write the data, which I manually enter, to my page with django-tables2 app. The pro...


python - Every time i try to index my SQLite database it gets locked

I have a fairly large SQLite database populated database, where the table I am trying to index have approximately 170m records. In retrospect I should have created the index before I populated it, but that is too late now. Whenever I run my indexing code, which is supplied under, everything goes fine for the first 10 minutes before it gets locked. I am not doing anything with the database while this script runs. ...


python - SQLite 3 Database with Django

I am stuck on Django Documentation tutorial on displaying SQLite table. " If you’re interested, run the command-line client for your database and type \dt (PostgreSQL), SHOW TABLES; (MySQL), or .schema (SQLite) to display the tables Django created. " I have created a project named mysite. Location : C:\Python34\Scripts\mysite Inside mysite, there are mysite fol...


python - Data entry in sqlite database

l m a fresh user and l m not good at coding yet. l created a database with 3 columns: (date, station, pcp ). l have 41_years_data set and by using a for loop, l would like to insert these data set in database. I could not do it correctly. When l run the code, it gives "c.execute("INSERT INTO pcp VALUES(kdm,station,klm)") sqlite3.OperationalError: no such column: kdm" ...


Python Flask - Add POST data to sqlite database

I am sending data from a python script to the server using POST, I want to then store that data in an SQLite database. My current code returns a 500 internal server error to the client. Here is the client script: import requests from random import randint def WindSpeed(): #Creates makeshift WindSpeed data to send to server return randint(0,20) def Temp(): #Creates makeshift Temp data ...


python - Why I cannot store an image in SQLite database in Flask

I spent 2 days trying to figure out how I can store an image in my SQLite database after I receive the image from a from in HTML page using Flask in python. I did everythin I tried to convert it to binary using sqlite3.Binary(image) and then insert it to the database but I did not work. this is a recourse I found ...


python - Database SQLite INSERT

This question already has answers here:


C# vs Python query against SQLite database

I have a SQLite database that has a single table with 18 million rows and 24 columns. The schema is along the lines of, Date (VARCHAR) CompanyName (VARCHAR) Amount (REAL) AggCode (VARCHAR) Level1 ... Level20 (VARCHAR) I am querying the table two ways - first with a Python script, and then with a C# function that is exposed to Excel with ExcelDNA (ultimately I'd prefer to use Excel to run t...


python - How to print data from SQLite database in Flask

I'm having problems printing contents of SQLite database into a Flask web page. The output of the code below does not return the records I have fetched from the database into the Flask web page which is expecting it, and will display into a table. The code I have so far: flask.py from flask import Flask from flask import render_template import sqlite3 conn = sqlit...


database - Python, SQLITE 3, not such column

I have a problem with some code I wrote: def login(username, password): c.execute("SELECT * FROM masterPasswords WHERE passw='{}' AND user='{}'".format(password, username)) if passw == password and user == username: return True else: return False This is the error log: Traceback (most recent call last): File "C:/Users/fraze/Documents/GitHub/Pas...


python - Sometimes can't delete an Oracle database row using Django

I have a unit test which contains the following line of code Site.objects.get(name="UnitTest").delete() and this has worked just fine until now. However, that statement is currently hanging. It'll sit there forever trying to execute the delete. If I just say print Site.objects.get(name="UnitTest") then it works, so I know that it can retrieve the site. ...


python - How do I test a django database schema?

I want to write tests that can show whether or not the database is in sync with my models.py file. Actually I have already written them, only to find out that django creates a new database each time the tests are run based on the models.py file. Is there any way I can make the models.py test use the existing database schema? The one that's in mysql/postgresql, and not the one that's in /myapp/models.py ?


python - Using user input to find information in a Mysql database

I need to design a program using python that will ask the user for a barcode. Then, using this barcode, it will search a mysql to find its corresponding product. I am a bit stuck on how to get started. Does anyone have any tips for me?


python - Import XML into SQL database

I'm working with a 20 gig XML file that I would like to import into a SQL database (preferably MySQL, since that is what I am familiar with). This seems like it would be a common task, but after Googling around a bit I haven't been able to figure out how to do it. What is the best way to do this? I know this ability is built into MySQL 6.0, but that is not an option right now because it is an alpha development rel...


python - Using 'old' database with django

I'm using a hand built (Postgres) database with Django. With "inspectdb" I was able to automatically create a model for it. The problem is that some tables have multiple primary keys (for many-to-many relations) and they are not accessible via Django. What's the best way to access these tables?


database - python orm

This is a newbie theory question - I'm just starting to use Python and looking into Django and orm. Question: If I develop my objects and through additional development modify the base object structures, inheritance, etc. - would Django's ORM solution modify the database automatically OR do I need to perform a conversion (if the app is live)? So, I start with a basic Phone app Object person: name, address, city, s...


abap - Query SAP database from Python?

Closed. This question needs details or clarity. It ...


A database for python 3?

I'm coding a small piece of server software for the personal use of several users. Not hundreds, not thousands, but perhaps 3-10 at a time. Since it's a threaded server, SQLite doesn't work. It complains about threads like this: ProgrammingError: SQLite objects created in a thread can only be used in that same thread.The object was created in thread id 140735085562848 and this is thread id 43...


python - URLs stored in database for Django site

I've produced a few Django sites but up until now I have been mapping individual views and URLs in urls.py. Now I've tried to create a small custom CMS but I'm having trouble with the URLs. I have a database table (SQLite3) which contains code for the pages like a column for header, one for right menu, one for content.... so on, so on. I also have a column for the URL. How do I get Django to call the information in...


python - Using Django database layer outside of Django?

I've got a nice database I've created in Django, and I'd like to interface with through some python scripts outside of my website stuff, so I'm curious if it's possible to use the Django database API outside of a Django site, and if so does anyone have any info on how it can be done? Google hasn't yielded many hits for this.


database - Problem opening berkeley db in python

I have problems opening a berkeley db in python using bdtables. As bdtables is used by the library I am using to access the database, I need it to work. The problem seems to be that the db environment I am trying to open (I got a copy of the database to open), is version 4.4 while libdb is version 4.6. I get the following error using bsddb.dbtables.bsdTableDB([dbname],[folder]): (-30972, "DB_VERSION...


database - Store simple user settings in Python

I am programming a website in which users will have a number of settings, such as their choice of colour scheme, etc. I'm happy to store these as plain text files, and security is not an issue. The way I currently see it is: there is a dictionary, where all the keys are users and the values are dictionaries with the users' settings in them. For example, userdb["bob"]["colour_scheme"] would have the value "b...


python - Sometimes can't delete an Oracle database row using Django

I have a unit test which contains the following line of code Site.objects.get(name="UnitTest").delete() and this has worked just fine until now. However, that statement is currently hanging. It'll sit there forever trying to execute the delete. If I just say print Site.objects.get(name="UnitTest") then it works, so I know that it can retrieve the site. ...


python - How do I test a django database schema?

I want to write tests that can show whether or not the database is in sync with my models.py file. Actually I have already written them, only to find out that django creates a new database each time the tests are run based on the models.py file. Is there any way I can make the models.py test use the existing database schema? The one that's in mysql/postgresql, and not the one that's in /myapp/models.py ?


python - Using user input to find information in a Mysql database

I need to design a program using python that will ask the user for a barcode. Then, using this barcode, it will search a mysql to find its corresponding product. I am a bit stuck on how to get started. Does anyone have any tips for me?


python - Given an rpm package name, query the yum database for updates

I was imagining a 3-line Python script to do this but the yum Python API is impenetrable. Is this even possible? Is writing a wrapper for 'yum list package-name' the only way to do this?


python - Import XML into SQL database

I'm working with a 20 gig XML file that I would like to import into a SQL database (preferably MySQL, since that is what I am familiar with). This seems like it would be a common task, but after Googling around a bit I haven't been able to figure out how to do it. What is the best way to do this? I know this ability is built into MySQL 6.0, but that is not an option right now because it is an alpha development rel...


python - Using 'old' database with django

I'm using a hand built (Postgres) database with Django. With "inspectdb" I was able to automatically create a model for it. The problem is that some tables have multiple primary keys (for many-to-many relations) and they are not accessible via Django. What's the best way to access these tables?


python - How do I notify a process of an SQLite database change done in a different process?

Let's say I have two or more processes dealing with an SQLite database - a "player" process and many "editor" processes. The "player" process reads the database and updates a view - in my case it would be a waveform being mixed to the soundcard depending on events stored in the database. An "editor" process is any editor for that database: it changes the database constantly. Now I want the player t...


database - python orm

This is a newbie theory question - I'm just starting to use Python and looking into Django and orm. Question: If I develop my objects and through additional development modify the base object structures, inheritance, etc. - would Django's ORM solution modify the database automatically OR do I need to perform a conversion (if the app is live)? So, I start with a basic Phone app Object person: name, address, city, s...






Still can't find your answer? Check out these communities...



PySlackers | Full Stack Python | NHS Python | Pythonist Cafe | Hacker Earth | Discord Python



top