Python Regex - How to Get Positions and Values of Matches

How can I get the start and end positions of all matches using the re module? For example, given the pattern r'[a-z]' and the string 'a1b2c3d4', I'd want to get the position of each letter it finds. Ideally, I'd like to get the text of the match back too.


Asked by: Connie924 | Posted: 24-09-2021






Answer 1

import re
p = re.compile("[a-z]")
for m in p.finditer('a1b2c3d4'):
    print(m.start(), m.group())
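
Since the question also asks for the end position, here is a minimal sketch that collects start, end, and matched text together. The helper name find_matches is my own, not part of the re module.

```python
import re

def find_matches(pattern, text):
    """Return (start, end, matched_text) for every non-overlapping match."""
    return [(m.start(), m.end(), m.group()) for m in re.finditer(pattern, text)]

results = find_matches(r'[a-z]', 'a1b2c3d4')
print(results)
# [(0, 1, 'a'), (2, 3, 'b'), (4, 5, 'c'), (6, 7, 'd')]
```

Note that end is the index just past the match, so text[start:end] gives the matched text.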

Answered by: Julia528 | Posted: 25-10-2021



Answer 2

Taken from the Regular Expression HOWTO:

span() returns both start and end indexes in a single tuple. Since the match() method only checks if the RE matches at the start of a string, start() will always be zero. However, the search() method of compiled patterns scans through the string, so the match may not start at zero in that case.

>>> import re
>>> p = re.compile('[a-z]+')
>>> print(p.match('::: message'))
None
>>> m = p.search('::: message'); print(m)
<re.Match object; span=(4, 11), match='message'>
>>> m.group()
'message'
>>> m.span()
(4, 11)
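
As a side note, a rough sketch of how the two methods compare on the same input. It also uses the optional pos argument of a compiled pattern's match() method, which anchors the attempt at a given index rather than at zero. The variable names here are my own.

```python
import re

p = re.compile('[a-z]+')
text = '::: message'

at_start = p.match(text)      # None: the text does not begin with a letter
anywhere = p.search(text)     # scans forward and finds 'message'
from_pos = p.match(text, 4)   # match() attempted at index 4 instead of 0

print(at_start)               # None
print(anywhere.span())        # (4, 11)
print(from_pos.span())        # (4, 11)
```

The span is reported relative to the whole string in both cases, not relative to pos.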

Combine that with:

The finditer() method (available since Python 2.2) returns a sequence of match object instances as an iterator.

>>> p = re.compile( ... )
>>> iterator = p.finditer('12 drummers drumming, 11 ... 10 ...')
>>> iterator
<callable_iterator object at 0x401833ac>
>>> for match in iterator:
...     print(match.span())
...
(0, 2)
(22, 24)
(29, 31)

You should be able to do something along these lines:

import re
for match in re.finditer(r'[a-z]', 'a1b2c3d4'):
    print(match.span())

Answered by: Grace366 | Posted: 25-10-2021



Answer 3

For Python 3.x

from re import finditer
for match in finditer("pattern", "string"):
    print(match.span(), match.group())

This prints one line per match: a (start, end) tuple followed by the matched text. Note that end is the index just past the last matched character, not the index of the last character itself.
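
To illustrate how the span lines up with the string, here is a small sketch that slices the original text with each span. The pattern and sample text are just example values, borrowed from the question.

```python
import re

text = 'a1b2c3d4'
spans = []
for match in re.finditer(r'[a-z]\d', text):
    start, end = match.span()
    # end is exclusive, so slicing with the span reproduces the matched text
    assert text[start:end] == match.group()
    spans.append((start, end))
    print(start, end, text[start:end])

# prints:
# 0 2 a1
# 2 4 b2
# 4 6 c3
# 6 8 d4
```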

Answered by: Kimberly402 | Posted: 25-10-2021



Answer 4

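Note that span() and group() accept a group index when the regex has multiple capture groups; index 0 refers to the whole match:

import re

regex_with_3_groups = r"([a-z])([0-9]+)([A-Z])"
string = "a12B c345D"  # sample input, assumed for illustration
for match in re.finditer(regex_with_3_groups, string):
    # 0 is the whole match; 1 to 3 are the capture groups
    for idx in range(0, 4):
        print(match.span(idx), match.group(idx))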
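
One thing to watch for: if a group is optional and did not take part in the match, span() returns (-1, -1) and group() returns None for it. A rough sketch, assuming a made-up pattern with named groups (the group names and the helper group_spans are mine):

```python
import re

# Hypothetical pattern: a letter, then optional digits, then an optional uppercase letter
pattern = re.compile(r'(?P<letter>[a-z])(?P<digits>[0-9]+)?(?P<upper>[A-Z])?')

def group_spans(match):
    """Map group index -> (span, text) for the whole match (0) and each capture group."""
    return {i: (match.span(i), match.group(i)) for i in range(match.re.groups + 1)}

m = pattern.search('x42')
info = group_spans(m)
print(info)
# {0: ((0, 3), 'x42'), 1: ((0, 1), 'x'), 2: ((1, 3), '42'), 3: ((-1, -1), None)}
print(m.span('digits'))  # (1, 3), groups can also be looked up by name
```

Using match.re.groups avoids hard-coding the number of groups in the loop.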

Answered by: Edward199 | Posted: 25-10-2021







