Warning, aspirin required.
This is quite tricky because you can have anything before |BS|, including another |BS|. And covering your rear with something such as [^\|]\|BS\| isn't general enough, because it prevents backspacing over a literal |. And a regexp has no simple way to express something like "not this string"...

So you have to attack at the other end: use a regexp that matches any character followed by a whole sequence of consecutive |BS|. Due to the greedy way things are matched, the match will always include the whole sequence of consecutive |BS|, so your initial character cannot itself be part of a |BS|.

Then look at the fine print in the specs of re.sub(): it looks for non-overlapping occurrences of the pattern, so the search for the next match starts after the end of the current match... which is after the end of the sequence of |BS|. So in a sequence of |BS| you will only process one per call to sub().

So in practice, we look for a character followed by a |BS| followed by zero or more other |BS|
(captured in a group) and replace that by just that captured group:

    import re

    # One character, one |BS|, then zero or more further |BS| captured in group 1
    pattern = re.compile(r'.\|BS\|((\|BS\|)*)')

    def noBS(s):
        print('------------')
        previous = ''
        # Repeat until a pass no longer changes the string
        while s != previous:
            previous = s
            s = re.sub(pattern, r'\1', s)
            print(s)  # this shows that the two sequences of |BS| are processed in parallel
        return s

    print(noBS("it |BS||BS||BS|this is one|BS||BS||BS|an example"))
    print(noBS("it |BS||BS||BS| |BS|this is one|BS||BS||BS|an example"))
    print(noBS("it |BS||BS||BS| |BS|this is o n e|BS||BS||BS||BS||BS||BS|an example"))
    # The first 'BS|' gets backspaced over due to missing leading '|'...
    print(noBS("it BS||BS||BS||BS||BS||BS||BS|this is o n e|BS||BS||BS||BS||BS||BS|an example"))

Output for the last one:

    it BS|BS||BS||BS||BS||BS|this is o n |BS||BS||BS||BS||BS|an example
    it B|BS||BS||BS||BS|this is o n |BS||BS||BS||BS|an example
    it |BS||BS||BS|this is o n|BS||BS||BS|an example
    it|BS||BS|this is o |BS||BS|an example
    i|BS|this is o|BS|an example
    this is an example
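To see the "one |BS| per run per call" behaviour in isolation, here is a minimal sketch using the same pattern. The helper name count_passes is made up for this illustration; it relies on re.subn(), which returns the new string together with the number of replacements made, so the loop can stop when a pass replaces nothing.

```python
import re

# Same pattern as above: one character, one |BS|, then any further |BS| (group 1).
pattern = re.compile(r'.\|BS\|((\|BS\|)*)')

def count_passes(s):
    """Hypothetical helper: substitute until nothing matches.

    Returns (final string, number of passes that changed something).
    """
    passes = 0
    while True:
        s, n = pattern.subn(r'\1', s)  # n = replacements made in this pass
        if n == 0:
            return s, passes
        passes += 1

# A single call only consumes one |BS| of the run of three:
print(pattern.sub(r'\1', "abc|BS||BS||BS|d"))  # -> ab|BS||BS|d
# So a run of three needs three passes:
print(count_passes("abc|BS||BS||BS|d"))        # -> ('d', 3)
```

The number of passes is driven by the longest run of consecutive |BS|, not by how many runs there are, since separate runs are handled in the same pass.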
Unfortunately, I don't think you can avoid an explicit iteration.
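For comparison only, and as a different technique from the regexp above: if the iteration has to be explicit anyway, a single left-to-right scan with a list used as a stack also works. This is a rough sketch, not a drop-in replacement; the name no_bs_stack is invented here, and it assumes that a |BS| with nothing left to erase is simply dropped, which is not what the regexp version does with such malformed input.

```python
BS = "|BS|"  # the marker used in this thread

def no_bs_stack(s, marker=BS):
    """Hypothetical alternative: one scan, push characters, pop one per marker."""
    out = []
    i = 0
    while i < len(s):
        if s.startswith(marker, i):
            if out:        # assumption: a marker with nothing before it is dropped
                out.pop()  # erase the most recent surviving character
            i += len(marker)
        else:
            out.append(s[i])
            i += 1
    return "".join(out)

print(no_bs_stack("it |BS||BS||BS|this is one|BS||BS||BS|an example"))
# -> this is an example
```

On the well-formed examples above it gives the same result as the regexp loop, but it reads each character only once.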
Unless noted otherwise, code in my posts should be understood as "coding suggestions", and its use may require more neurones than the two necessary for Ctrl-C/Ctrl-V.
Your one-stop place for all your GIMP needs: gimp-forum.net