A pythonic way to insert a space before capital letters

I've got a file whose format I'm altering via a Python script. I have several camel-cased strings in this file where I just want to insert a single space before each capital letter, so "WordWordWord" becomes "Word Word Word".

My limited regex experience just stalled out on me - can someone think of a decent regex to do this, or (better yet) is there a more pythonic way to do this that I'm missing?

Asked by: Sarah260 | Posted: 01-10-2021

Answer 1

You could try:

>>> import re
>>> re.sub(r"(\w)([A-Z])", r"\1 \2", "WordWordWord")
'Word Word Word'

Answered by: Jared575 | Posted: 02-11-2021

Answer 2

If there are consecutive capitals, then Greg's result might not be what you are looking for, since the \w consumes the character in front of the capital letter to be replaced.

>>> re.sub(r"(\w)([A-Z])", r"\1 \2", "WordWordWWWWWWWord")
'Word Word WW WW WW Word'

A look-behind would solve this:

>>> re.sub(r"(?<=\w)([A-Z])", r" \1", "WordWordWWWWWWWord")
'Word Word W W W W W W Word'

Answered by: Alina846 | Posted: 02-11-2021
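
The difference described in Answer 2 comes down to match consumption: matches found by re.sub cannot overlap, so a capital consumed as the \w of one match cannot also be the capital of the next. A minimal sketch that lists what each pattern actually matches (the sample string is made up for illustration):

```python
import re

text = "ABCd"  # hypothetical sample with consecutive capitals

# Capturing version: each match consumes two characters, so once "AB"
# has been matched, "B" is no longer available to pair with "C".
consuming = [m.group() for m in re.finditer(r"(\w)([A-Z])", text)]

# Look-behind version: only the capital itself is consumed, so every
# capital that follows a word character is matched.
lookbehind = [m.group() for m in re.finditer(r"(?<=\w)([A-Z])", text)]

print(consuming)   # ['AB']
print(lookbehind)  # ['B', 'C']
```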

Answer 3

Perhaps shorter:

>>> re.sub(r"\B([A-Z])", r" \1", "DoIThinkThisIsABetterAnswer?")
'Do I Think This Is A Better Answer?'

Answered by: Tara253 | Posted: 02-11-2021

Answer 4

Have a look at my answer on .NET - How can you split a “caps” delimited string into an array?

Edit: Maybe better to include it here.

re.sub(r'([a-z](?=[A-Z])|[A-Z](?=[A-Z][a-z]))', r'\1 ', text)

For example:

"SimpleHTTPServer" => "Simple HTTP Server" (calling .split() on the result gives ["Simple", "HTTP", "Server"])

Answered by: Carlos791 | Posted: 02-11-2021
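
As a rough sketch, the pattern from Answer 4 could be wrapped in a small helper. The function name and the module-level constant are assumptions made for this example, not part of the original answer:

```python
import re

# Pattern from Answer 4: split after a lowercase letter that precedes a
# capital, or after a capital that precedes "capital + lowercase".
_CAMEL_BOUNDARY = re.compile(r'([a-z](?=[A-Z])|[A-Z](?=[A-Z][a-z]))')

def space_out_camel_case(text):  # hypothetical helper name
    """Insert a space at each camel-case boundary, keeping acronyms intact."""
    return _CAMEL_BOUNDARY.sub(r'\1 ', text)

print(space_out_camel_case("SimpleHTTPServer"))  # Simple HTTP Server
print(space_out_camel_case("WordWordWord"))      # Word Word Word
```

Calling .split() on the returned string yields the individual words as a list.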

Answer 5

Maybe you would be interested in a one-liner implementation that doesn't use a regexp:

''.join(' ' + char if char.isupper() else char.strip() for char in text).strip()

Answered by: Marcus667 | Posted: 02-11-2021

Answer 6

With regexes you can do this:

re.sub(r'([A-Z])', r' \1', text)

Of course, that only works for ASCII characters, and it also puts a space before a leading capital. If you want to handle Unicode, it's a whole new can of worms :-)

Answered by: Alfred931 | Posted: 02-11-2021
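
For the Unicode case raised in Answer 6, one possible approach (a sketch assuming Python 3, where \w is Unicode-aware for str patterns) is to let the regex find candidate characters and leave the case test to str.isupper(). The helper name below is made up for illustration:

```python
import re

def space_before_capitals(text):  # hypothetical helper name
    # \B\w matches every word character that is not at the start of a word.
    def add_space(match):
        ch = match.group()
        # str.isupper() follows Unicode case rules, unlike the class [A-Z].
        return ' ' + ch if ch.isupper() else ch
    return re.sub(r'\B\w', add_space, text)

print(space_before_capitals("ПриветМир"))     # Привет Мир
print(space_before_capitals("WordWordWord"))  # Word Word Word
```

Because the match must follow another word character, a capital at the start of the string does not get a leading space.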

Answer 7

If you have acronyms, you probably do not want spaces between their letters. This two-stage regex keeps acronyms intact (and also treats punctuation and other non-uppercase characters as something to add a space after):

import re

re_outer = re.compile(r'([^A-Z ])([A-Z])')
re_inner = re.compile(r'(?<!^)([A-Z])([^A-Z])')
re_outer.sub(r'\1 \2', re_inner.sub(r' \1\2', 'DaveIsAFKRightNow!Cool'))

The output will be: 'Dave Is AFK Right Now! Cool'

Answered by: Cherry599 | Posted: 02-11-2021
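
To see how the regex answers differ on input with an acronym and punctuation, here is a small side-by-side sketch. It reuses the patterns from Answers 3, 4 and 7 unchanged; only the variable names are new:

```python
import re

sample = 'DaveIsAFKRightNow!Cool'

# Answer 3: space before every capital that is not at a word boundary.
simple = re.sub(r'\B([A-Z])', r' \1', sample)

# Answer 4: look-aheads that keep runs of capitals together.
lookahead = re.sub(r'([a-z](?=[A-Z])|[A-Z](?=[A-Z][a-z]))', r'\1 ', sample)

# Answer 7: two passes, which also add a space after punctuation.
re_outer = re.compile(r'([^A-Z ])([A-Z])')
re_inner = re.compile(r'(?<!^)([A-Z])([^A-Z])')
two_stage = re_outer.sub(r'\1 \2', re_inner.sub(r' \1\2', sample))

print(simple)     # Dave Is A F K Right Now!Cool
print(lookahead)  # Dave Is AFK Right Now!Cool
print(two_stage)  # Dave Is AFK Right Now! Cool
```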

Answer 8

I agree that the regex solution is the easiest, but I wouldn't say it's the most pythonic.

How about:

text = 'WordWordWord'
new_text = ''

for i, letter in enumerate(text):
    if i and letter.isupper():
        new_text += ' '

    new_text += letter

Answered by: Audrey676 | Posted: 02-11-2021

Answer 9

I think regexes are the way to go here, but just to give a pure Python version without (hopefully) any of the problems ΤΖΩΤΖΙΟΥ has pointed out:

def splitCaps(s):
    result = []
    # Pad with a space so the last character is also seen as `ch`.
    for ch, nxt in window(s + " ", 2):
        result.append(ch)
        if nxt.isupper() and not ch.isspace():
            result.append(' ')
    return ''.join(result)

window() is a utility function I use to operate on a sliding window of items, defined as:

import collections, itertools

def window(it, winsize, step=1):
    it = iter(it)  # Ensure we have an iterator
    l = collections.deque(itertools.islice(it, winsize))
    while True:  # Continue until the input is exhausted.
        yield tuple(l)
        for i in range(step):
            try:
                l.append(next(it))
            except StopIteration:
                return  # Input exhausted: end the generator
            l.popleft()

Answered by: Audrey677 | Posted: 02-11-2021

Answer 10

Adding to this old thread: I wanted to try another option for one of my own requirements. re.sub() is of course the cool solution, but here is a one-liner for when the re module isn't (or shouldn't be) imported.

st = 'ThisIsTextStringToSplitWithSpace'
print(''.join([' ' + s if s.isupper() else s for s in st]).lstrip())

Answered by: Jack500 | Posted: 02-11-2021
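
The one-liners in Answer 5 and Answer 10 look alike but behave differently when the input already contains spaces: Answer 5 calls strip() on every non-capital character, which drops existing whitespace, while Answer 10 keeps it. A quick sketch, with wrapper names and sample input made up for illustration:

```python
def answer5(text):  # hypothetical wrapper around the Answer 5 one-liner
    return ''.join(' ' + char if char.isupper() else char.strip() for char in text).strip()

def answer10(text):  # hypothetical wrapper around the Answer 10 one-liner
    return ''.join([' ' + s if s.isupper() else s for s in text]).lstrip()

sample = 'Hello WorldFoo'  # made-up input that already contains a space
print(repr(answer5(sample)))   # 'Hello World Foo'
print(repr(answer10(sample)))  # 'Hello  World Foo' (two spaces after Hello)
```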
