profsummergig
39d ago
sshine
As part of learning Perl in 1999.

I only used the bare minimum for years.

I also hung out on a #regex IRC channel, so I got exposed to questions and answers by many people.

Later I read up on https://www.regular-expressions.info/ which has a lot of very good explanations.

The #regex IRC channel had an IRC bot with a quiz with 28 levels.

All sensibility ended after level 14 or so. At that point it was just "how deep does the PCRE rabbit-hole go?"

But there was a lot of useful, non-trivial stuff, too. Most specifically: lookaheads/lookbehinds, non-greedy matching, back-references, named capture groups, character classes, and anchors.

When I learned jq, I went much the same way: Started hanging out on #jq IRC channels and started trying to answer jq questions on StackOverflow. Sadly, I got outperformed the first six months, until it finally clicked.

murderfs
I read Mastering Regular Expressions by Jeffrey Friedl 20 years ago when I was in middle school, front to back, and it's probably been the best investment of reading time I've ever made.
pluc
I'm really surprised by how few people learned by trying, and how many instead read whole books or manuals. I had some code or whatnot that needed mass-replacing and used the built-in RegEx find-and-replace (I think it was EditPlus in those days). I first learned how to match the exact string, then extrapolated from there using {}, (), replacing, etc. It's a lot easier to learn when you need to solve a practical, immediate problem.
gnat
"Learned" in University but it wasn't until Jeff Friedl's Perl Conference talks that I really became one with the regex engine. He taught you how to think like the regex engine and thus how regular expressions would be interpreted and thus how to write them. Then I got a master class in RE from Tom Christiansen when we were writing the Perl Cookbook.

Jeff wrote "Mastering Regular Expressions", which grew from that talk. You probably want a copy even though it was first released in 1997. For the mindset of RE, you can't beat it.

Learning REs is a roll through:

   * how matching happens (advancing, matching, backtracking)
   * using * ? and {} to match repetitions
   * greediness and stinginess within the RE
   * character classes, both [manual] and escapes like \s \W etc
   * anchors and "what a line is"
   * grouping and backreferences
   * accessing groups outside the RE
   * substitution and accessing backrefs in substitutions
   * find ALL the matches
   * complex parsing (just don't, it's rare not to regret it)
and then it's an absolutely epic deep-dive into the minutiae of what line starts and ends might be, Unicode and regex, code to be executed from within the regex engine, using code to BUILD regex and worrying about when escaping happens or doesn't, denial of service regex, etc. that will take you through ASCII, various Unix tool chains over time, and a bunch of other fun stuff.
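A few of the stops on that list can be sketched in a handful of lines. This is only an illustration using Python's `re` module (the thread itself is mostly about Perl); the sample strings are made up for the example.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Greedy: .* grabs as much as it can, then backtracks to the LAST ">".
greedy = re.search(r"<.*>", text).group()

# Stingy (non-greedy): .*? gives up as soon as it reaches the FIRST ">".
stingy = re.search(r"<.*?>", text).group()

# Grouping plus a backreference: \1 must repeat whatever group 1 matched.
doubled = re.search(r"\b(\w+) \1\b", "it was was a typo").group(1)

# Accessing groups outside the RE.
year, month, day = re.search(r"(\d{4})-(\d{2})-(\d{2})", "date: 2001-02-03").groups()

# Substitution, with backrefs used in the replacement text.
swapped = re.sub(r"(\w+), (\w+)", r"\2 \1", "Friedl, Jeffrey")

# Find ALL the matches (here: capitalised words).
capitalised = re.findall(r"\b[A-Z]\w*", "Perl Cookbook by Tom and Nat")

print(greedy, stingy, doubled, (year, month, day), swapped, capitalised, sep="\n")
```

Comparing the `greedy` and `stingy` results is probably the quickest way to see the "advancing, matching, backtracking" model in action.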
urbandw311er
I didn’t learn Regex. Storing all that in my head felt unnecessary and always has done. I simply look it up when I need anything more complex than .*
IncandescentGas
Learned regex in the 90's from the Perl documentation, or possibly one of the O'Reilly Perl references. That was a time when printed language references were more convenient than searching the internet. Perl still includes a shell component for accessing its documentation, which was invaluable in those ancient times. Perl's regex documentation is rather fantastic.

`perldoc perlre` from your terminal.

or https://perldoc.perl.org/perlre

A simple way to test a regex you're building is this website, which offers immediate parsing and documentation of your regex, lets you test it against various inputs, and lets you choose which language's regex parser you are targeting.

https://regexr.com/

rtheunissen
Practice, the more you use them the easier they become. I never studied them but knew when to use them, then just tinkered and iterated until the pattern did what I needed it to. After a while you can mostly just write and read them without much tinkering.

Regex101 is an excellent tool.

pjkundert
Besides just abundant trial and error, there's a great Python module called greenery that produces Finite State Machines from any Regular Expression:

https://github.com/qntm/greenery

So, you can observe what kind of state machine is produced from any given Regular Expression. You can also use it to merge and otherwise manipulate state machines, or simplify Regular Expressions.

Quite helpful.

Modified3019
Easy, I learn it every time I unfortunately need to use it, through painful trial and error and searching. Thankfully there are online evaluators now, but now you need to figure out which regex flavor is being used.

Then I forget it, and have unreadable mystery functions lying around that I hope don’t have bugs.

But at least it’s a single line!

Seriously though, my actual need for them is low, so I avoid the things as much as I would avoid inlining assembly.

sinkasapa
That is a hard question because there are so many ways that one can understand regex. I learned how to read and use them using Unix tools like sed, but I think that my path to starting to understand them probably began with papers like "Regular Expression Matching Can Be Simple And Fast" by Russ Cox (https://swtch.com/~rsc/regexp/regexp1.html), well after feeling like I was pretty good at using them.

Then, as an expert in linguistic morphology, I started learning about things like subregular languages, as talked about in works such as Aural Pattern Recognition Experiments and the Subregular Hierarchy, by Rogers and Pullum (https://www.cs.earlham.edu/~jrogers/JoLLI.pdf). And I continue to wonder what the relationship is between these classes of languages and word formation.

RadiozRadioz
Piece by piece, googling "how to do X in regex". But that was slow and didn't have a great foundation.

Then I learned Perl and started learning RegEx properly. Now somehow I've turned into one of those wizards I admired in the Stack Overflow answers section. It wasn't until I had to teach RegEx to a junior that I realized how far I'd come.

nnf
I read a lot on https://www.regular-expressions.info and experimented on https://rubular.com since I was also learning Ruby at the time. https://regexr.com is another good tool that breaks down your regex and matches.

One of the things I remember being difficult at the beginning was the subtle differences between implementations, like `^` meaning "beginning of line" in Ruby (and others) but meaning "beginning of string" in JavaScript (and others).
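To make that difference concrete, here is a small sketch in Python, which sits on the "beginning of string" side by default and switches to "beginning of line" only when asked. The sample text is made up.

```python
import re

text = "first line\nsecond line"

# Default: ^ anchors only at the very start of the string.
default_hits = re.findall(r"^\w+", text)

# With re.MULTILINE, ^ also matches right after each newline.
multiline_hits = re.findall(r"^\w+", text, flags=re.MULTILINE)

# \A always means "start of string", whatever flags are set.
start_only = re.findall(r"\A\w+", text, flags=re.MULTILINE)

print(default_hits, multiline_hits, start_only)
```

When porting a pattern between engines, checking what `^` and `$` mean there (and whether something like `\A` exists) is usually worth a minute.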

If you're just starting out, it'd be helpful to read about how a regex engine evaluates an expression against a string so that you can understand the "order of operations" and how repeating elements are matched.

riffraff
Mastering Regular Expressions by Jeffrey Friedl.

It's been many years but I remember it as both thorough and easy to understand.

Zhyl
For me the biggest hurdle was learning what they were 'for' and that took a long time. The real magic for me was capture groups - I could now suddenly see why you'd have a regex and not just string matching.

Then it was about knowing a situation or a problem when regexes would apply and knowing how to look up the things I needed to solve that problem. Some regex 'phrases' are good for grepping, others for find and replace. Some will help you swap names around, some to reformat phone numbers.

After a while the phrases give way to general understanding and certain things become fluent.

I still only really write short or basic regexes, but I use them all the time in editing text or doing things that are a little bit complicated but actually a short regex just turns it from a hard problem into an easy problem.

userm0d
Start with https://regexone.com/ fun puzzle style interactive tutorial to grasp the basics. After that it's the matter of either using it with your CLI tools or applying it to problems you are working on.
Minor49er
My first jobs were heavily focused on parsing data from HTML and regex was (and still is) the most common solution for the majority of cases

To learn it, I played a lot of regex golf [1]

I also enabled regular expressions in my code editor's Find feature so every search I'd make used regex. Having it enabled in my editor made learning it more immersive and useful, especially when combined with things like find-and-replace. I highly recommend permanently enabling that in your editor as well

Also, challenge your coworkers to see who can make the shortest patterns for a variety of cases and see whose is the most versatile. It's always a fun time

[1] https://alf.nu/RegexGolf

pseudo_meta
Use regex101, it's basically an "IDE" for regex.
wcarss
I used vim. Using console vim and doing regular search and replaces forced me to learn and remember.

Later, grepping logs was a pretty similar application that needed and extended those skills.

flashgordon
Regex the "env specific variant" or regex the concept (as it applies to theory of computation) etc?

The former I can never remember beyond the basics (*, +, ?, |). Even with | I go extra cautious and put in tons of parentheses. If I ever need matching and grouping I resort to rtfm.

Now the latter, that's the more interesting and fun one!! Learnt it in college decades ago but really drilled it in by reimplementing Russ Cox's amazing Thompson NFA blog and breakdown in TypeScript!

ruzig
My job requires me to write regex to parse data. I learnt regex by solving real-world problems, and it works. I picked up a lot of tricks by solving them.
nemoniac
Emacs has two packages called rx and xr. They allow you to write regexes in an sexp syntax and translate between that and conventional regex syntax. Furthermore you can define regex snippets and compose them into new ones. This more than anything can give you a handle on how a regex is composed.

For example

  "\\`\\(?:[^^]\\|\\^\\(?: \\*\\|\\[\\)\\)"
can be written as

  (: bos 
     (| (not (in "^"))
        (: "^"
           (| " *" "["))))
Emacs also has other features to highlight matches and groups to help understand regexes better.

https://www.emacswiki.org/emacs/Rx_Notation https://github.com/mattiase/xr https://www.emacswiki.org/emacs/RegularExpression

Fuzzwah
I learned regex by writing an online poker hand history converter. I used to refer to it as spaghetti php code, but I've come to realise it was just newbie functional php wrapping a stack of regex.

Mid way through my 20 year career I realised that every job I'd had really came down to parsing data and outputting something a company finds value in. It's regex all the way down.

dobladov
Develop useful tools that need complex parsing. I gathered data from websites using regex to make automated tools, for example getting flat offers from my city and making a tool to notify me.

Simply try to parse some complex information like movie strings. As an exercise, you can try to parse these movie names to produce a result like this.

```
{
  "name": "Dawn Of The Planet of The Apes",
  "year": "2014",
  "resolution": "1080p",
  "codec": "h264",
  "source": "web-dl",
  "audio": "AAC5.1",
  "group": "RARBG"
}
```

https://raw.githubusercontent.com/dobladov/video-parser/main...
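One possible take on that exercise, as a rough sketch rather than a reference solution. The input format is an assumption: a dot-separated release name shaped like the sample below, which is invented for illustration and not taken from the linked file. The function name `parse_release` is hypothetical too.

```python
import re

# Assumed layout: Title.Words.YEAR.RESOLUTION.SOURCE.AUDIO.CODEC-GROUP
RELEASE = re.compile(
    r"""
    ^(?P<name>.+?)                    # title, matched lazily up to the year
    \.(?P<year>(?:19|20)\d{2})        # four-digit year
    \.(?P<resolution>\d{3,4}p)        # e.g. 720p or 1080p
    \.(?P<source>[A-Za-z-]+)          # e.g. WEB-DL
    \.(?P<audio>[A-Za-z]+\d\.\d)      # e.g. AAC5.1
    \.(?P<codec>[A-Za-z]\d{3})        # e.g. H264
    -(?P<group>\w+)$                  # release group after the final hyphen
    """,
    re.VERBOSE,
)

def parse_release(raw):
    """Return a dict of fields, or None if the name does not fit the assumed layout."""
    match = RELEASE.match(raw)
    if match is None:
        return None
    info = match.groupdict()
    info["name"] = info["name"].replace(".", " ")
    info["source"] = info["source"].lower()
    info["codec"] = info["codec"].lower()
    return info

sample = "Dawn.Of.The.Planet.of.The.Apes.2014.1080p.WEB-DL.AAC5.1.H264-RARBG"
print(parse_release(sample))
```

Real-world names vary far more than this (missing fields, different ordering, spaces instead of dots), which is exactly what makes it a good exercise.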

sbolt
I spent half a day playing with https://regexone.com and got the fundamentals in place, after that it’s been practice in solving tasks at work. Rubular is awesome if you’re looking to test out a pattern
fbomb
I learned regex incidentally from reading the classic book "Software Tools" (Kernighan & Plauger). It has a chapter which briefly describes the syntax but focuses on an analysis of the code used to parse & process them (in RATFOR).
grendelt
I read through Learning Perl and practiced different regex patterns with simple programs. Programming Perl dove deeper, as did the O'Reilly Mastering Regular Expressions book.

That and practice. I frequently check them with online regex tools to make sure the regex does what I want before I implement them.

illuminant
I was a Perl 5 mage and carried the O'Reilly book in hand for a few months. The impression has meant that every regex in 20 years was effortless, though the Perl is long forgotten.

Perl 5 regex familiarity seems like it futureproofed.

Now I suppose I mostly use JS or Vim which is such a subset.

daelon
Writing textmate grammars for VSCode extensions. I took over as the lead maintainer of the Godot engine VSCode extension, and one of the first things I worked on was adding syntax highlighting support for Godot 4's improvements to GDScript.

Textmate grammars are basically hundreds of nested regex snippets that recursively apply tags to regions of text. This is made worse by the fact that the grammar is written in JSON, so any escapes need to be double escaped, which means you can't easily copy your regex into something like regex101. You sort of just have to suffer until you get good at it.
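The double-escaping problem can be seen, and partly worked around, with a few lines of Python. This is only a sketch: the pattern and the one-key `{"match": ...}` fragment are made up for illustration and are not taken from the Godot grammar.

```python
import json
import re

# A pattern as you would type it into a regex tester.
pattern = r"\bfunc\s+(\w+)\("

# Serialising to JSON doubles every backslash, which is what ends up in the grammar file.
as_json = json.dumps({"match": pattern})
print(as_json)

# Loading the JSON gives back the single-escaped, tester-friendly pattern.
round_tripped = json.loads(as_json)["match"]
name = re.search(round_tripped, "func _ready():").group(1)
print(round_tripped, name)
```

Round-tripping through a JSON parser like this is one way to get a copy-pasteable pattern out of a grammar file without unescaping by hand.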

treetalker
I taught myself through trial and error by using TextSoap (https://textsoap.com/mac/) to make my legal drafting easier.

TextSoap has been around forever and must be the most underrated app on the Mac. It’s amazing — I rank it alongside Keyboard Maestro, if that tells you anything. It’s also available on SetApp. I can’t say enough good things about it.

If you get into it, there is an Alfred workflow that lets you search for and apply cleaners to selected text.

Curiositry
Mostly by building things that needed complex RegEx, and debugging my regular expressions with https://regexr.com/.
geoffpado
The Regex Crossword was the first thing that made it click for me: https://regexcrossword.com/
sowerssix
I learned by having to parse fields from log messages, in order to ingest log sources that aren't supported by the $SIEM at $job. Having said that, I typically learn regex, then forget regex, then learn regex and so on....
dgunay
My first job involved a lot of web scraping and general data munging in PHP, so we accomplished it with a combination of XPath and regexes. Mostly XPath, with regex getting us through any particularly thorny bits of data we couldn't easily XPath out.

I had also done a tiny amount of regex in a college programming course, but really I didn't get "good" at them until I used them on the job.

willio58
I learned it in my "Programming Languages" class in university.

And then about 6 months later I had completely forgotten it.

It's one of those things you need to use regularly to keep it in memory. At least that's the case for me.

I tend to shy away from it these days for a lot of cases (ever try to regex validate an email??) but when I do use it, it's honestly just a process of re-learning for about 15 minutes each time.

sk11001
If you have no experience, go through a tutorial to get the general idea.

Learn the rest "on demand" whenever you need it, it's not something to spend a lot of practice time on. Because if you don't use it a lot, you'll forget most of what you learned anyway, and if you do use it a lot, then you don't need to spend dedicated learning time, you'll get good quite quickly.

andrewstuart
Don't learn it. Regex is not something you need to remember.

Just build the regex you need with ChatGPT along with an online regex tester.

austin-cheney
By using it to solve real practical problems.

The most common uses in JavaScript are in the RegExp test method and the String replace method. The replace method is cool because it can receive a function as the second argument and the argument of that function is the value matched by the RegExp that can be modified and returned.
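For comparison, Python has a close analogue of the function-as-replacement trick: `re.sub` accepts a callable. One difference from the JavaScript version described above is that the callable receives a match object rather than the matched string. A minimal sketch with made-up input:

```python
import re

def double(match):
    # Python hands over a match object; .group() is the matched text.
    return str(int(match.group()) * 2)

result = re.sub(r"\d+", double, "3 apples and 10 pears")
print(result)
```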

r0ckarong
By trying to extract a C (pre 20) header and its comments by hand to generate some AsciiDoc documentation from it. Ended up refining the comment "format" that was already in use, parsing it with a Python script and various regexes to then generate coherent documents.
PrivateButts
Having a series of data extraction/alteration tasks that regex made really easy. Regexr.com is a great playground for figuring it out, but having to use it in a practical way in my day to day for about a year cemented that skill.
jgwil2
Read about them on this page and then built an implementation: https://swtch.com/~rsc/regexp/
anotherhue
We took the academic approach in college: Kleene closures, Chomsky grammars, etc. Once the meaning behind the line noise of symbols was clear, it became fairly easy to write them.

Not sure they ever get easier to read though.

atribecalledqst
I was reading the book "JavaScript, the Definitive Guide" (latest edition is from 2020) and it had a great section on regexes.

Actually that book is also what helped demystify async programming for me.

wojciii
A compiler course as part of CS.

Then perl followed by Python regex when I needed to recognise specific strings.

I didn't use books for this. I remember reading Python docs and howtos.

antoinebalaine
Using vim to edit large dictionary files. I forked a steno dictionary for Plover, and edited its 300k entries in vim. I've been using regex daily since then.
smarri
I built a web scraper that runs in the terminal; it was trial and error to get it working. If I were doing it again now, I'd use ChatGPT to help learn it.
pdntspa
Mainly in a job that forced me to use them (data mining). There are interactive tools that help you visualize what is going on, they are immensely helpful.
kkfx
In my first year at uni, I got curious about other uses of '*' and '?', so I read Mastering Regular Expressions and played around a bit.
iforgotmysocks
When I learned formal language theory in my computing theory class.
cybervegan
First experienced in 1989 on Xenix grep, later more on SunOS (5 I think) in the early 90's, then Linux in the mid 90's.
solardev
It was part of a community college Perl course 20 years ago.

These days, ChatGPT is pretty good at both making and explaining them.

revskill
By understanding combinators and, or, not, any, anyFrom, anyFromTo, zeroOrMore, atLeastOne,....
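The combinator view can be sketched as small functions that each return a regex fragment. The helper names below are hypothetical, loosely mirroring the list above; they are not from any particular library.

```python
import re

def lit(text):
    return re.escape(text)                         # match the text literally

def seq(*parts):
    return "".join(parts)                          # "and": one pattern after another

def either(*parts):
    return "(?:" + "|".join(parts) + ")"           # "or"

def any_from_to(lo, hi):
    return f"[{re.escape(lo)}-{re.escape(hi)}]"    # one character from a range

def zero_or_more(part):
    return f"(?:{part})*"

def at_least_one(part):
    return f"(?:{part})+"

# Compose the pieces: an identifier is a letter or underscore,
# followed by any number of letters, underscores, or digits.
letter = either(any_from_to("a", "z"), lit("_"))
digit = any_from_to("0", "9")
identifier = seq(letter, zero_or_more(either(letter, digit)))
number = at_least_one(digit)

print(identifier)
print(bool(re.fullmatch(identifier, "snake_case1")), bool(re.fullmatch(identifier, "1bad")))
```

Once the pieces have names, the cryptic syntax becomes just one way of writing down the composition.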
djaouen
I learned command line globbing first. Then I just translated my knowledge there into the equivalent regexes.
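Python's standard library happens to ship that translation as `fnmatch.translate`, which makes the correspondence easy to inspect. A small sketch; the glob and file names are made up, and the exact regex string it prints varies between Python versions.

```python
import fnmatch
import re

glob = "report-*.tx?"

# Convert the shell-style glob into a regex string: * becomes ".*", ? becomes ".".
regex = fnmatch.translate(glob)
compiled = re.compile(regex)
print(regex)

accepts_txt = compiled.match("report-2024.txt") is not None
rejects_csv = compiled.match("report-2024.csv") is None
print(accepts_txt, rejects_csv)
```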
bryanlarsen
It was a particularly useful piece of technology for activities during a misbegotten youth.
jeff-hykin
For context, AFAIK, I maintain the largest Textmate (read as "regex document") grammar in the world: the C++ Textmate grammar for VS Code. (Don't mistake that for bragging, it's a literally unfixable dumpster fire.) It has pretty much forced me to regularly use every regex feature, from recursive named backreferences to atomic groups and the combinatorial explosions that lookaheads can cause.

https://regexr.com/ is one of the most amazing interactive resources; I can't recommend it enough. Back in the day I used it to go from beginner to intermediate. And while I never used this next site to learn, https://regex-vis.com/ is a great place to check out. From intermediate to master I've pretty much relied on rexegg.com/ for discovering the advanced stuff and engine differences. After that https://regex101.com/ was helpful for performance analysis. I first learned regex just mucking around in the CLI with some guidance from a programmer friend. Pure trial and error learning.

While I am inclined to say "the only way to learn regex is to use it", after reading the comments I must agree it would've been nice to have examples of pitfalls and misconceptions. There are a lot of them that can take a very, very long time to learn without direct examples. I'd never even heard of Jeff Friedl (not till this post at least). So props to people who can actually read those kinds of books.

dec0dedab0de
First day on the job as tech support at a local dialup ISP/CLEC. My manager gave me a regex cheatsheet, because we had a gigantic multi-file spam filter that we would all need to troubleshoot, and some of us were allowed to edit. It was fun.

Cheat sheets are the way to go though, especially because of the different versions. If you do enough with them, one day the main stuff will just stick. Once you're fairly productive you will realize you missed a feature or trick that is particularly useful for what you've been doing, and after getting mad at yourself for missing it you will add that to your repertoire. Repeat.

Also, don't be afraid to just split/cut and do it in your language of choice instead of regex. Most of the time it doesn't make too much of a difference performance wise. Many times it can be faster and/or more readable. The best approach is often a combination. Nobody likes the wizard that tries to put everything into one regex to rule them all.
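As a small illustration of the split-it-instead approach, here is the same parse done both ways in Python. The input line is invented for the example.

```python
import re

line = "user=alice; shell=/bin/zsh; uid=1001"

# Plain string methods: split into fields, then split each field once on "=".
by_split = dict(field.split("=", 1) for field in line.split("; "))

# Regex equivalent: capture key and value pairs in one pass.
by_regex = dict(re.findall(r"(\w+)=([^;]+)", line))

print(by_split)
print(by_split == by_regex)
```

Which one reads better depends on the input; for tidy delimiter-separated data like this, the split version arguably needs less explaining.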

Regarding versions, I learned with PCRE, have mostly worked with Python, and have hit problems using various other implementations over the years. Though it's never enough of a problem that I can remember what those differences are; I just look it up and move on. Unless it's going to be an ongoing project, in which case I print out a new cheatsheet and hang it up.

jryan49
Slowly, over a long period of time. Stack Overflow, the Javadoc for Pattern, regexr.
JonChesterfield
Muddled through for a decade with a vague sense that nothing worked as I expected it to. Gradually realised there are many variations with different behaviour, obfuscated by dreadful documentation and emergent behaviour in the implementations.

Then wrote a regex engine. It's now extremely obvious how regular expressions work as they're very simple. The spurious divergence in syntax and semantics is still infuriating but at least I know what they're supposed to be desugaring to now. Recommended as a worthwhile exercise.

Regular expressions have "generate a letter of the alphabet" as a primitive. It might be ASCII and use 'b' to generate that letter, or a hex escape like \x62 or similar. The notation varies a bit. Another primitive is "generate the empty string".

Then there are compound operations. One regex or another, one followed by another. Intersection, complement. All the set operations, for the reason that a regular expression is literally a notation for a set of strings.

Some things like "lookahead" are notation for intersection. The match previous construct, \2 or similar, takes you out of regular expressions but works like checking equality on the fly.
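The "lookahead is intersection" reading can be checked on a simple case. A sketch in Python, assuming a lookahead placed at the start of a pattern that must match the whole string; the two sets and the candidate strings are made up for illustration.

```python
import re

has_digit = re.compile(r".*\d.*")      # the set of strings containing a digit
has_lower = re.compile(r".*[a-z].*")   # the set of strings containing a lowercase letter

# Lookahead version: must contain a digit AND match the "has a lowercase letter" pattern.
both = re.compile(r"(?=.*\d).*[a-z].*")

candidates = ["abc1", "abc", "123", "A1b", ""]

# Membership in the intersection, computed by testing each set separately.
in_intersection = [
    s for s in candidates
    if has_digit.fullmatch(s) and has_lower.fullmatch(s)
]

# Membership according to the single lookahead pattern.
in_lookahead = [s for s in candidates if both.fullmatch(s)]

print(in_intersection, in_lookahead)
```

Both lists come out the same here, which is the sense in which the lookahead is just notation for intersecting two sets of strings.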

Finally anchors, $ or ^ etc, are specific to the match problem. It's still find an element of the denoted set but with some extra constraints on where the element can occur.

I'm pretty sure that's all of it. How anchors interact with the set description is a nuisance but seems well formed - I haven't bothered to work through that part yet because I'm mostly interested in string generation, not matching.

erickj
By continuously ignoring that what I really needed was a pushdown automaton.
cmpalmer52
Learned the basics in college. Refined my skills writing Perl utilities.
renegat0x0
Through the pain and misery of incorrectly placed letters.
gabelschlager
Honestly, Regex is not nearly as complicated as most people make it out to be. Read through the rules and then design some simple expressions.
rgovostes
When I was young I used a chat program developed by occasional HN commenter @krazydad. It had a client-side scripting language called IPTSCRAE (https://en.wikipedia.org/wiki/IPTSCRAE), with which I could write commands or eventually chatbots with the incantation

    { ... } CHATSTR "(.*)" GREPSTR IF
Since I was about 11 years old, my brain was plastic enough to take the postfix notation in stride, and regular expression syntax is still second nature to me.
gumby
I just read the grep man page.
rcarmo
Perl
extr
What I do now is I write a comment like this in my IDE:

    # a regex that selects <something>
    regex =
And then copilot/supermaven auto-completes it. If that doesn't work, I ask GPT-4o/Sonnet. If that doesn't work, I assume that whatever I'm asking for is not really a natural fit for regex and I should accomplish my task in a different way.

In general I try not to use regex in production code. IMO it is an obsolete technology at this point. Most people do not know it well and trying to debug it is a nightmare. May I suggest a simple function or loop that is readable?