How do I determine the appropriate check interval?

I'm just starting to work on a tornado application that is having some CPU issues. The CPU time will monotonically grow as time goes by, maxing out the CPU at 100%. The system is currently designed to not block the main thread. If it needs to do something that blocks and asynchronous drivers aren't available, it will spawn another thread to do the blocking operation.

Thus we have the main thread being almost totally CPU-bound and a bunch of other threads that are almost totally IO-bound. From what I've read, this seems to be the perfect way to run into problems with the GIL. Plus, my profiling shows that we're spending a lot of time waiting on signals (which I'm assuming is what __semwait_signal is doing), which is consistent with the effects the GIL would have in my limited understanding.

If I use sys.setcheckinterval to set the check interval to 300, the CPU growth slows down significantly. What I'm trying to determine is whether I should increase the check interval, leave it at 300, or be scared with upping it. After all, I notice that CPU performance gets better, but I'm a bit concerned that this will negatively impact the system's responsiveness.

Of course, the correct answer is probably that we need to rethink our architecture to take the GIL into account. But that isn't something that can be done immediately. So how do I determine the appropriate course of action to take in the short-term?

Asked by: Ryan970 | Posted: 30-11-2021

Answer 1

The first thing I would check for would be to ensure that you're properly exiting threads. It's very hard to figure out what's going on with just your description to go from, but you use the word "monotonically," which implies that CPU use is tied to time rather than to load.

You may very well be running into threading limits of Python, but it should vary up and down with load (number of active threads,) and CPU usage (context switching costs) should reduce as those threads exit. Is there some reason for a thread, once created, to live forever? If that's the case, prioritize that rearchitecture. Otherwise, short term would be to figure out why CPU usage is tied to time and not load. It implies that each new thread has a permanent, irreversible cost in your system - meaning it never exits.

Answered by: Kelsey813 | Posted: 01-01-2022

Similar questions

python - How to determine appropriate strftime format from a date string?

The dateutil parser does a great job of correctly guessing the date and time from a wide variety of sources. We are processing files in which each file uses only one date/time format, but the format varies between files. Profiling shows a lot of time being used by dateutil.parser.parse. Since it only needs to be determined once per file, implementing something that isn't guessing the for...

grammar - How does the python determine the appropriate metaclass?

I maybe find the answer in official doc, but I still can not get it. Can you give me a more detailed or straight forward explanation in the second and third case? I will appreciate it if you could give me some examples! The appropriate metaclass for a class definition is deter...

python - Django voting system: Determine rank in appropriate way

I am trying to create a simple voting system but the problem is while displaying the rank of participants. For example if there are 20 participants and 10 have same votes then all of them are ranked same. All the participants have date_joined. So i want to re rank the participants having same number of votes based on date_joined. Any suggestions? Below is the model for participants: class participants...

python - Which of these scripting languages is more appropriate for pen-testing?

Closed. This question is opinion-based. It is not c...

Help me find an appropriate ruby/python parser generator

The first parser generator I've worked with was Parse::RecDescent, and the guides/tutorials available for it were great, but the most useful feature it has was it's debugging tools, specifically the tracing capabilities ( activated by setting $RD_TRACE to 1 ). I am looking for a parser generator that can help you debug it's rules. The thing is, it has to be written in python or in ruby, and have a verbose mo...

Would python be an appropriate choice for a video library for home use software

I am thinking of creating a video library software which keep track of all my videos and keep track of videos that I already haven't watched and stats like this. The stats will be specific to each user using the software. My question is, is python appropriate to create this software or do I need something like c++.

Is Python appropriate for algorithms focused on scientific computing?

arrays - Most appropriate data structure (Python)

I'm new to Python and have what is probably a very basic question about the 'best' way to store data in my code. Any advice much appreciated! I have a long .csv file in the following format: Scenario,Year,Month,Value 1,1961,1,0.5 1,1961,2,0.7 1,1961,3,0.2 etc. My scenario values run from 1 to 100, year goes from 1961 to 1990 and month goes from 1 to 12. My file therefore has 100*29...

python - Numpy time based vector operations where state of preceding elements matters - are for loops appropriate?

What do numpy arrays provide when performing time based calculations where state matters. In other words, where what has occurred in earlier or later in a sequence is important. Consider the following time based vectors, TIME = np.array([0., 10., 20., 30., 40., 50., 60., 70., 80., 90.]) FLOW = np.array([100., 75., 60., 20.0, 60.0, 50.0, 20.0, 30.0, 20.0, 10.0]) TEMP = np.array([300., 310...

python extend or append a list when appropriate

Is there a simple way to append a list if X is a string, but extend it if X is a list? I know I can simply test if an object is a string or list, but I was wondering if there is a quicker way than this?

python - Finding appropriate cut-off values

I try to implement Hampel tanh estimators to normalize highly asymmetric data. In order to do this, I need to perform the following calculation: Given x - a sorted list of numbers and...

python - What is an appropriate way to datamine the total number of results of a keyword search?

newbie programmer and lurker here, hoping for some sensible advice. :) Using a combination of Python, BeautifulSoup, and the Bing API, I was able to find what I wanted with the following code: import urllib2 from BeautifulSoup import BeautifulStoneSoup Appid = #My Appid query = #My query soup = BeautifulStoneSoup(urllib2.urlopen("" + Appid + "&query=" ...

python - Defining appropriate number of processes

I have a python code treating a lot of apache logs (decompress, parse, crunching numbers, regexping etc). One parent process which takes a list of files (up to few millions), and sends a list of files to parse to workers, using multiprocess pool. I wonder, if there is any guidelines / benchmarks / advices which can help me to estimate ideal number of child process ? Ie. having one process per core...

Still can't find your answer? Check out these communities...

PySlackers | Full Stack Python | NHS Python | Pythonist Cafe | Hacker Earth | Discord Python