# How to Calculate Centroid in Python

I'm a beginner at Python coding and I'm working with structural coordinates. I have a PDB structure that contains xyz coordinate information (the last three columns):

``````ATOM      1  N   SER A   1      27.130   7.770  34.390
ATOM      2  1H  SER A   1      27.990   7.760  34.930
ATOM      3  2H  SER A   1      27.160   6.960  33.790
ATOM      4  3H  SER A   1      27.170   8.580  33.790
ATOM      5  CA  SER A   1      25.940   7.780  35.250
ATOM      6  CB  SER A   1      25.980   9.090  36.020
ATOM      7  OG  SER A   1      26.740  10.100  35.320
ATOM      8  HG  SER A   1      26.750  10.940  35.860
ATOM      9  C   SER A   1      24.640   7.790  34.460
ATOM     10  O   SER A   1      24.530   8.510  33.500
ATOM     11  N   CYS A   2      23.590   7.070  34.760
ATOM     12  H   CYS A   2      23.590   6.550  35.610
ATOM     13  CA  CYS A   2      22.420   7.010  33.900
ATOM     14  CB  CYS A   2      21.620   5.760  34.270
ATOM     15  SG  CYS A   2      22.480   4.210  33.970
ATOM     16  C   CYS A   2      21.590   8.220  34.040
ATOM     17  O   CYS A   2      21.370   8.690  35.160
``````
• I have 1000 atoms in my structure.
• I have two queries.

How can I calculate the centroid of the structure from the xyz coordinates?
From the centroid, I want to draw a sphere of radius 20 cm.

I tried this:

``````from __future__ import division
import math as mean
import numpy as nx
from string import*

infile = open('file.pdb', 'r')           #open my file
infile.close()

text = []
for i in text1:
if i != '':
text.append(i)

for j in text:
x1 = eval(replace(j[30:38], ' ', ''))         #extract x-coordinate
y1 = eval(replace(j[38:46], ' ', ''))         #extract y-coordinate
z1 = eval(replace(j[46:54], ' ', ''))         #extract z-coordinate

idcord = []
idcord.append(x1); idcord.append(y1); idcord.append(z1)

centroid = nx.mean(idcord)
print centroid
``````

It gives the centroid of each atom (xyz), but I need a single central point for the whole structure. How can I get that?

Asked by: Walter354 | Posted: 30-11-2021

First of all, an easier way to read your file is with numpy's `genfromtxt` function. You don't need to import string, and you don't need to loop through all the lines and append text or count the characters.

``````from __future__ import division
import numpy as nx

data = nx.genfromtxt('file.pdb')
``````

Then, the last three columns can be accessed as:

``````data[:, -3:]
``````

Where the first `:` means "all rows", and `-3:` means from the third-to-last column to the last column.

So, you can average them as such:

``````nx.mean(data[:,-3:], axis=0)
``````

where the `axis=0` argument tells `nx.mean` to take the average along the first (`0th`) axis. It looks like this:

``````In : data[:,-3:]
Out:
array([[ 27.13,   7.77,  34.39],
[ 27.99,   7.76,  34.93],
[ 27.16,   6.96,  33.79],
[ 27.17,   8.58,  33.79],
[ 25.94,   7.78,  35.25],
[ 25.98,   9.09,  36.02],
[ 26.74,  10.1 ,  35.32],
[ 26.75,  10.94,  35.86],
[ 24.64,   7.79,  34.46],
[ 24.53,   8.51,  33.5 ],
[ 23.59,   7.07,  34.76],
[ 23.59,   6.55,  35.61],
[ 22.42,   7.01,  33.9 ],
[ 21.62,   5.76,  34.27],
[ 22.48,   4.21,  33.97],
[ 21.59,   8.22,  34.04],
[ 21.37,   8.69,  35.16]])

In : nx.mean(data[:,-3:], axis=0)
Out: array([ 24.74647059,   7.81117647,  34.64823529])
``````
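To make the role of the `axis` argument concrete, here is a minimal sketch on a made-up two-atom array (the values and the variable names `xyz`, `per_axis`, `per_atom`, and `overall` are illustrative assumptions, not taken from the PDB file):

```python
import numpy as nx

# Two hypothetical atoms, one (x, y, z) row each.
xyz = nx.array([[1.0, 2.0, 3.0],
                [3.0, 4.0, 5.0]])

per_axis = nx.mean(xyz, axis=0)  # one mean per column: the centroid, shape (3,)
per_atom = nx.mean(xyz, axis=1)  # one mean per row: averages x, y, z of each atom
overall = nx.mean(xyz)           # no axis: a single number over all values

print(per_axis)
print(per_atom)
print(overall)
```

Only the `axis=0` form gives one x, one y, and one z value for the whole structure. The other two forms mix the coordinates together, which is what happens when the mean is taken over a single atom's `[x, y, z]` list.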

Some other things:

1) Remove this line: `import math as mean`. It imports the entire `math` module and binds it to the name `mean`. You probably intended something like `from math import mean`, but the `math` module has no `mean` function, so that import would fail. In your code you end up using the `mean` function from the `numpy` (`nx`) module anyway, so the `math` import is never used.

2) Your loop is not indented, which means either you pasted it incorrectly into StackOverflow or your actual code is incorrectly indented. Possibly, this is what your code actually looks like:

``````for j in text:
    x1 = eval(replace(j[30:38], ' ', ''))         #extract x-coordinate
    y1 = eval(replace(j[38:46], ' ', ''))         #extract y-coordinate
    z1 = eval(replace(j[46:54], ' ', ''))         #extract z-coordinate

    idcord = []
    idcord.append(x1); idcord.append(y1); idcord.append(z1)

    centroid = nx.mean(idcord)
    print centroid
``````

But the problem is that `idcord` is reset to an empty list on every pass through the loop, so it only ever holds one atom's x, y, and z, and a new "centroid" (the average of those three numbers) is calculated for each atom. You don't even need the loop at all if you import the data file all at once as above. In fact, your entire code can be:

``````from __future__ import division
import numpy as nx

data = nx.genfromtxt('file.pdb')
centroid = nx.mean(data[:,-3:], axis=0)
print(centroid)
``````
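If whitespace-based parsing trips over a fuller PDB file (for example, lines that are not atom records, or adjacent fields that touch), a fixed-column reader is one alternative. The sketch below reuses the slices from the question (`[30:38]`, `[38:46]`, `[46:54]`), which correspond to the x, y, z fields of the PDB format. The function name `read_pdb_coords` is hypothetical, and keeping only `ATOM`/`HETATM` records is an assumption about what should count toward the centroid.

```python
import numpy as nx

def read_pdb_coords(lines):
    """Return an (N, 3) array of x, y, z values from ATOM/HETATM records.

    `lines` is any iterable of strings, e.g. an open file object.
    """
    coords = []
    for line in lines:
        # Skip anything that is not an atom record (TER, END, REMARK, ...).
        if line.startswith(("ATOM", "HETATM")):
            x = float(line[30:38])  # float() tolerates the padding spaces
            y = float(line[38:46])
            z = float(line[46:54])
            coords.append([x, y, z])
    return nx.array(coords)
```

It could be used as `coords = read_pdb_coords(open('file.pdb'))` followed by `coords.mean(axis=0)`. Using `float` on the fixed-width slice also avoids the `eval` and `replace` calls in the original code.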

Answered by: Emily235 | Posted: 01-01-2022

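Try this:

``````import numpy as nx
X = nx.random.rand(10, 3)        # generate some random points, one (x, y, z) row each
centroid = nx.mean(X, axis=0)    # average each column to get a single (x, y, z) centroid
print(centroid)
``````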
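For the second part of the question (a sphere of radius 20 around the centroid), one possible reading is to find which atoms fall inside that sphere. The sketch below assumes that interpretation and assumes the radius is given in the same units as the coordinates (PDB coordinates are in angstroms). The function name `atoms_within` is hypothetical.

```python
import numpy as nx

def atoms_within(coords, radius):
    """Return the centroid and a boolean mask of atoms within `radius` of it."""
    coords = nx.asarray(coords, dtype=float)
    centroid = coords.mean(axis=0)
    # Euclidean distance of every atom from the centroid.
    dist = nx.sqrt(((coords - centroid) ** 2).sum(axis=1))
    return centroid, dist <= radius

# Three made-up points on the x axis; the third is far from the other two.
pts = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [100.0, 0.0, 0.0]]
centroid, inside = atoms_within(pts, 40.0)
print(centroid)
print(inside)
```

The mask can then be used to select rows, for example `nx.asarray(pts)[inside]`. Actually drawing the sphere would need a plotting or molecular-viewer package, which is outside the scope of this sketch.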

Answered by: Emily905 | Posted: 01-01-2022

