Curiously Chase

Regular Expression

An overview of Regular Expressions (a way to search for patterns in text in a programming language) from my perspective and experience.

Regular Expressions are a superpower.

Regular expressions [1] are a way to give a programming language a pattern to search for in text.

As a programmer, I find myself using them fairly frequently.

As a simple example (using JavaScript), let's say I've built an application that allows someone to input a Slack channel URL and I want to get the Slack workspace and channel ID from the URL.

A Slack channel URL is made up of:

  • a protocol (https)
  • a Slack workspace (something.slack.com)
  • "archives"
  • a channel

An example URL might look like:

https://curious.slack.com/archives/C019PPFA2BV

I want to extract "curious" and "C019PPFA2BV" (and be able to do it regardless of what the values are).

This is where regular expressions shine.

A regular expression to match this pattern would look like this:

^https:\/\/([\w\d\-_]+)\.slack\.com\/archives\/([\d\w]+)$

I won't get into the pattern itself, but you can see that it looks similar to the URL pattern we want to match (you can see "slack.com" and "archives").

There are also two sets of parentheses (). These are called capture groups. A capture group captures whatever content matches that part of the regular expression and returns it as a group.

Run against the example URL, this regular expression will find one Match and two Groups:

  • Match is the whole URL because it matches the whole regular expression.
  • Group 1 is "curious" because it matched the first capture group.
  • Group 2 is "C019PPFA2BV" because it matched the second capture group.
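
To make the Match and Groups concrete, here is a minimal sketch in JavaScript that runs the pattern above against the example URL. The variable names (slackChannelPattern, url, result) are my own and only there for illustration.

```javascript
// The pattern from above, written as a JavaScript regex literal.
const slackChannelPattern = /^https:\/\/([\w\d\-_]+)\.slack\.com\/archives\/([\d\w]+)$/;

const url = "https://curious.slack.com/archives/C019PPFA2BV";

// String.prototype.match returns an array on success, or null when nothing matches.
const result = url.match(slackChannelPattern);

if (result !== null) {
  // Index 0 is the whole match; indexes 1 and 2 are the two capture groups.
  console.log(result[0]); // https://curious.slack.com/archives/C019PPFA2BV
  console.log(result[1]); // curious
  console.log(result[2]); // C019PPFA2BV
}
```

Because the pattern has no "g" flag, match returns the full match followed by each capture group in order, which lines up with the Match, Group 1, and Group 2 described above.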

As you can imagine, this means a single pattern can handle any URL of this shape, whatever the workspace or channel happens to be, and pull out exactly the values you want.
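
As a rough sketch of that idea, the pattern can be wrapped in a small helper that accepts any Slack channel URL. The function name parseSlackChannelUrl and the group names workspace and channelId are hypothetical, not part of any Slack API. I've also shortened the character classes to [\w-] and \w, which should match the same characters as the original pattern because \w already covers letters, digits, and underscores.

```javascript
// Hypothetical helper: returns { workspace, channelId }, or null if the URL does not match.
function parseSlackChannelUrl(url) {
  // Named capture groups (?<name>...) let us read values by name instead of by index.
  const pattern = /^https:\/\/(?<workspace>[\w-]+)\.slack\.com\/archives\/(?<channelId>\w+)$/;
  const match = pattern.exec(url);
  if (match === null) {
    return null;
  }
  return {
    workspace: match.groups.workspace,
    channelId: match.groups.channelId,
  };
}

// The same function works regardless of the values in the URL.
console.log(parseSlackChannelUrl("https://curious.slack.com/archives/C019PPFA2BV"));
console.log(parseSlackChannelUrl("https://my-team.slack.com/archives/C123"));
console.log(parseSlackChannelUrl("https://example.com/not-slack"));
```

Returning null for anything that doesn't match keeps the caller honest: it has to handle bad input explicitly rather than working with half-extracted values.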

  • Regex 101 is an excellent tool for writing a regular expression and testing the expression against a string.

Footnotes

  1. Also called "RegEx"
