
Some regex engines match by backtracking, and backtracking is usually what makes a regex blow up in resource cost, since the worst-case running time can grow exponentially with input length: https://www.regular-expressions.info/catastrophic.html


You probably want a regex engine that runs in linear time:

* Google's RE2 https://github.com/google/re2/wiki/WhyRE2

* TRE https://github.com/laurikari/tre/

There is a good series of articles about the problem: https://swtch.com/~rsc/regexp/regexp3.html

I would strongly recommend deploying such a regular expression matcher to avoid problems like this. There are examples in the above article that you can use to test anything in your production deployment that accepts regular expressions to see how well it copes.
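For a quick check, here is a minimal sketch of such a probe. It uses Python's built-in `re` module, which is a backtracking engine. The pattern `(a+)+$` and the input shape are my own choice of a commonly cited pathological case, not taken from the linked articles, and `probe` is a hypothetical helper name. Swap in whatever matcher your deployment actually exposes.

```python
import re
import time

# Assumed pathological case: nested quantifiers plus an input that almost
# matches, which forces a backtracking engine to try every way of splitting
# the run of 'a' characters.
PATTERN = re.compile(r"(a+)+$")


def probe(max_n=18, step=2):
    """Return (n, seconds) pairs for matching PATTERN against 'a' * n + 'b'."""
    results = []
    for n in range(10, max_n + 1, step):
        text = "a" * n + "b"
        start = time.perf_counter()
        matched = PATTERN.match(text)
        elapsed = time.perf_counter() - start
        assert matched is None  # the trailing 'b' prevents any match
        results.append((n, elapsed))
    return results


if __name__ == "__main__":
    for n, seconds in probe():
        print(f"n={n:2d}  {seconds:.4f}s")
```

On a backtracking engine the time should roughly double with each extra character, so keep `max_n` small. On a linear-time engine the same inputs should stay flat.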


Might have been a misdirected reply (although a useful one), but yeah, agreed: linear-time regex engines are generally a much better idea.



