How do I write a regular expression for the following pattern in Python?

How do I look for the following pattern using a regular expression in Python, for these two cases?

I am looking for str2 after the "=" sign.

  • Case 1: str1=str2
  • Case 2: str1 = str2

Please note there can be a space or none on either side of the "=" sign.

Mine is like this, but it only works for one of the cases!

m=re.search(r'(?<=str\s\=\s)\w+','str = str2')

returns str2

Help!

Gath


Asked by: Wilson384 | Posted: 06-12-2021






Answer 1

If you really only have such simple strings to parse, you don't need a regular expression. You can just partition on = and strip (or even lstrip) the last element of the resulting tuple:

>>> case = 'str = str2'
>>> case.partition('=')[2].lstrip()
'str2'

It'll be much faster than a regular expression. Just to show how fast, I've run a simple test:

>>> import timeit
>>> timeit.timeit("'str1 = str2 '.partition('=')[2].strip()")
0.49051564213846177
>>> timeit.timeit("'str1 = str2 '.split('=')[1].strip()")
0.97673281637025866
>>> timeit.timeit('import re')
0.65663786250422618
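
A minimal sketch of the partition approach wrapped in a function, in case the input might not contain an "=" at all. The name value_after_equals is made up for this example and is not from the answer above:

```python
def value_after_equals(line):
    """Return the text after the first '=', or None if there is no '='."""
    # partition() always returns a 3-tuple; sep is '' when '=' is absent.
    _key, sep, value = line.partition('=')
    if not sep:
        return None
    return value.strip()

print(value_after_equals('str1=str2'))
print(value_after_equals('str1 = str2'))
print(value_after_equals('no separator here'))
```

Both of the question's cases give 'str2' here, and a line without "=" gives None instead of an empty string.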

Answered by: Maddie121 | Posted: 07-01-2022



Answer 2

re.search(r'=\s*(.*)', 'str = str2').group(1)

or if you just want a single word:

re.search(r'=\s*(\w+)', 'str = str2').group(1)

Extended to a specific initial string:

re.search(r'\bstr\s*=\s*(\w+)', 'str=str2').group(1)

\b = word boundary, so it won't match "somestr=foo".

It would be quicker to go through all the options once, instead of searching for single options one at a time:

option_str = "a=b, c=d, g=h"
options = dict(re.findall(r'(\w+)\s*=\s*(\w+)', option_str))
options['c']  # -> 'd'
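
Note that re.search() returns None when nothing matches, so calling .group(1) directly raises AttributeError in that case. A small sketch of the word-boundary pattern with that guard added; find_value is a hypothetical helper name, and it assumes the key starts with a word character so that \b behaves as expected:

```python
import re

def find_value(key, text):
    """Return the word assigned to `key` in `text`, or None if absent."""
    # re.escape() makes the key match literally; \b keeps 'str' from
    # matching inside a longer name such as 'somestr'.
    pattern = re.compile(r'\b' + re.escape(key) + r'\s*=\s*(\w+)')
    match = pattern.search(text)
    return match.group(1) if match else None

print(find_value('str', 'str=str2'))
print(find_value('str', 'str = str2'))
print(find_value('str', 'somestr = foo'))
```

The first two calls give 'str2'; the last gives None because of the word boundary.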

Answered by: Chloe658 | Posted: 07-01-2022



Answer 3

If your data is fixed then you can do this without using regex. Just split it on '='. For example:

>>> case1 = "str1=str2"
>>> case2 = "str1 = str2"

>>> str2 = case1.split('=')[1].strip()
>>> str2 = case2.split('=')[1].strip()

This YOURCASE.split('=')[1].strip() statement will work for both cases, with or without spaces around the '='.
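
One caveat, shown as a small sketch: if the value itself can contain an '=', a plain split('=') cuts it short, and passing maxsplit=1 avoids that. The sample line below is invented for illustration:

```python
line = "url = http://example.com/?a=b"

# Without maxsplit, the value is cut at its own '=' sign.
naive = line.split('=')[1].strip()

# maxsplit=1 splits only on the first '=' and keeps the rest intact.
safe = line.split('=', 1)[1].strip()

print(naive)
print(safe)
```

Here naive is 'http://example.com/?a' while safe keeps the full 'http://example.com/?a=b'.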

Answered by: Dainton199 | Posted: 07-01-2022



Answer 4

Simply use the split function.

Answered by: Edgar602 | Posted: 07-01-2022



Answer 5

Expanding on @batbrat's answer, and the other suggestions, you can use re.split() to separate the input string. The pattern can use \s (whitespace) or an explicit space.

>>> import re
>>> c1="str1=str2"
>>> c2="str1 = str2"
>>> re.split(' ?= ?',c1)
['str1', 'str2']
>>> re.split(' ?= ?',c2)
['str1', 'str2']
>>> re.split(r'\s?=\s?',c1)
['str1', 'str2']
>>> re.split(r'\s?=\s?',c2)
['str1', 'str2']
>>> 
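
A hedged variation on the same idea: with \s* instead of \s?, any amount of whitespace around the "=" is consumed, and maxsplit=1 stops after the first "=". The third sample string is made up to show the difference:

```python
import re

cases = ("str1=str2", "str1 = str2", "str1   =\tstr2")

# \s* swallows zero or more whitespace characters on each side of '='.
results = [re.split(r'\s*=\s*', case, maxsplit=1) for case in cases]

for key, value in results:
    print(key, value)
```

With \s? the third string would leave stray spaces on 'str1', since only one whitespace character is consumed on each side.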

Answered by: Sienna700 | Posted: 07-01-2022



Answer 6

I think a regex is overkill if you only want to deal with the above two cases. Here's what I'd do:

>>> case1 = "str1=str2"
>>> case2 = "str1 = str2"
>>> case2.split()
['str1', '=', 'str2']
>>> ''.join(case2.split())
'str1=str2'
>>> case1[5:]
'str2'
>>> ''.join(case2.split())[5:]
'str2'
>>> 

Assumption

I assume you are looking for the specific token 'str1'. I also assume that str1 can be assigned different values. Something like what you'd have in a configuration file => propertyName = value.
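
Under that assumption, the fixed slice [5:] only works while the name is exactly four characters long. A rough sketch that looks the name up instead; get_property and the sample config text are hypothetical, made up for this example:

```python
def get_property(text, name):
    """Return the value assigned to `name`, or None if it is not present."""
    for line in text.splitlines():
        key, sep, value = line.partition('=')
        # Compare the stripped key so 'str1=...' and 'str1 = ...' both match.
        if sep and key.strip() == name:
            return value.strip()
    return None

config = "str1 = str2\nlongerName=value\n"
print(get_property(config, 'str1'))
print(get_property(config, 'longerName'))
print(get_property(config, 'missing'))
```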

This is just my opinion.

I knew that other ways were possible! SilentGhost gives a nice (better!) alternative.

Hope this helps.

Answered by: Anna126 | Posted: 07-01-2022



Answer 7

Two cases:

  • (case 1) if there is a single space before the '=', then there must also be a single space after the '='

    m=re.search(r'(?<=\S)(?:\s=\s|=)(\w+)','str = str2')
    print(m.group(1))
    
  • (case 2) otherwise, the space on each side is optional independently of the other

    m=re.search(r'(?<=\S)\s?=\s?(\w+)','str = str2')
    print(m.group(1))
    

In the first case, you could also use a named backreference (the "(?P=…" construct) to require the second space only when the first one is present, but that still wouldn't work inside a positive lookbehind assertion, since the subexpression wouldn't have a constant length.
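
To illustrate the constant-length point: Python's re module rejects a lookbehind whose width can vary, which is why the question's pattern cannot simply be given optional spaces. A small sketch, using a capture group as the alternative:

```python
import re

# A variable-width lookbehind is rejected when the pattern is compiled.
lookbehind_error = None
try:
    re.compile(r'(?<=str\s*=\s*)\w+')
except re.error as exc:
    lookbehind_error = exc
print('rejected:', lookbehind_error)

# A capture group has no such restriction and handles both cases.
pattern = re.compile(r'str\s*=\s*(\w+)')
values = [pattern.search(case).group(1) for case in ('str=str2', 'str = str2')]
print(values)
```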

Answered by: Brad630 | Posted: 07-01-2022



Answer 8

Related idea: I find a graphical regular expression tool helpful when trying to figure out the correct pattern: http://kodos.sf.net.

Answered by: Leonardo231 | Posted: 07-01-2022


