
How to match “anything up until this sequence of characters” in a regular expression?

March 13, 2021 by Code Error
Posted By: Anonymous

Take this regular expression: /^[^abc]/. This will match any single character at the beginning of a string, except a, b, or c.

If you add a * after it – /^[^abc]*/ – the regular expression will continue to add each subsequent character to the result, until it meets either an a, or b, or c.

For example, with the source string "qwerty qwerty whatever abc hello", the expression will match up to "qwerty qwerty wh".
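The behavior described above can be reproduced with a short sketch. It assumes Python's built-in re module as the regex flavor, since the question does not name one; the variable names are illustrative only.

```python
import re

source = "qwerty qwerty whatever abc hello"

# The negated class [^abc] rejects any single 'a', 'b', or 'c',
# so the match stops at the first 'a', which is inside "whatever".
match = re.match(r"^[^abc]*", source)
prefix = match.group(0)
print(repr(prefix))
```

The match ends at the first excluded character, not at the sequence "abc", which is why a character class alone cannot answer the question.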

But what if I wanted the matching string to be "qwerty qwerty whatever "?

…In other words, how can I match everything up to (but not including) the exact sequence "abc"?

Solution

You didn’t specify which flavor of regex you’re using, but this will
work in any flavor that supports lazy quantifiers and lookahead, which
includes most of the popular ones (Perl, PCRE, Python, Java, .NET,
JavaScript).

/.+?(?=abc)/
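As a quick check, here is a minimal sketch of this pattern applied to the string from the question. It assumes Python's re module as the flavor; the names pattern, result, and no_match are made up for the example.

```python
import re

pattern = re.compile(r".+?(?=abc)")
source = "qwerty qwerty whatever abc hello"

# search() returns a match object, or None when "abc" never appears.
match = pattern.search(source)
result = match.group(0) if match else None
print(repr(result))

# Without the delimiter in the input there is nothing to match up to.
no_match = pattern.search("no delimiter here")
print(no_match)
```

Note that .+? needs at least one character before the delimiter. If the input may begin with "abc", using .*? instead allows an empty match.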

How it works

The .+? part is the un-greedy (lazy) version of .+ (one or more of
any character). With .+, the engine first matches as much as it can.
Then, if something else follows in the regex, it backtracks one step
at a time until the following part matches. This is the greedy
behavior: match as much as possible while still satisfying the pattern.

When using .+?, instead of matching everything at once and backing off
for other conditions (if any), the engine matches one character at a
time and stops as soon as the subsequent part of the regex (again, if
any) matches. This is the un-greedy behavior: match as few characters
as possible while still satisfying the pattern.

/.+X/  ~ "abcXabcXabcX"        /.+/  ~ "abcXabcXabcX"
          ^^^^^^^^^^^^                  ^^^^^^^^^^^^

/.+?X/ ~ "abcXabcXabcX"        /.+?/ ~ "abcXabcXabcX"
          ^^^^                          ^
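The four cases in the diagram can be verified with a small sketch, again assuming Python's re module; the variable names are chosen for illustration.

```python
import re

text = "abcXabcXabcX"

# Greedy: take everything, then back off only as far as needed.
greedy_x = re.match(r".+X", text).group(0)
greedy = re.match(r".+", text).group(0)

# Lazy: take one character at a time, stop at the first success.
lazy_x = re.match(r".+?X", text).group(0)
lazy = re.match(r".+?", text).group(0)

for name, value in [("greedy_x", greedy_x), ("greedy", greedy),
                    ("lazy_x", lazy_x), ("lazy", lazy)]:
    print(f"{name}: {value!r}")
```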

Following that we have (?={contents}), a zero-width assertion, also
known as a lookaround (here, a lookahead). This grouped construct
tests whether its contents match at the current position, but does not
count them as characters matched (zero width). It only reports whether
there is a match or not (assertion).
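To see the zero-width property in practice, the sketch below compares the lookahead with a pattern that consumes the delimiter. It assumes Python's re module, and the variable names are illustrative.

```python
import re

source = "qwerty qwerty whatever abc hello"

# With the lookahead, "abc" is checked but not consumed.
with_lookahead = re.search(r".+?(?=abc)", source)
consumed = with_lookahead.group(0)
rest = source[with_lookahead.end():]

# Without it, "abc" becomes part of the match.
without_lookahead = re.search(r".+?abc", source).group(0)

print(repr(consumed))
print(repr(rest))
print(repr(without_lookahead))
```

Because the assertion consumes nothing, the text remaining after the match still begins with "abc".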

Thus, in other words, the regex /.+?(?=abc)/ means:

Match as few characters as possible until an “abc” is found,
without including the “abc” in the match.

Answered By: Anonymous

Disclaimer: This content is shared under the Creative Commons license CC BY-SA 3.0. It is generated from the StackExchange Website Network.
