suggested feature: super-linear regex detection #256

davisjam · 2018-04-12T16:42:00Z

I also created an npm module that queries a service I'm hosting. See discussion of behavior here.

For a lighter-weight approach there's safe-regex, but it misses 90% of vulnerable regexes and has about a 90% false-positive rate. I'm working on improvements.

gskinner · 2018-06-05T16:13:21Z

RegExr currently runs all matches asynchronously, and will display an error if a match takes too long. I know that's not exactly what you're suggesting, but it serves a similar purpose, and is much more straightforward for us than relying on a third party service or hosting a computationally expensive module ourselves.

I'm open to a discussion of the value for this if you think I'm missing something.
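
For illustration, the "run the match asynchronously and report an error if it takes too long" idea can be sketched as below. This is not RegExr's code (the thread does not show it); it is a Python analog that assumes a child process is an acceptable stand-in for a background worker. The names `match_with_timeout` and `_CHILD` are hypothetical.

```python
import subprocess
import sys

# Script run in the child process: search argv[2] for the pattern in argv[1].
_CHILD = (
    "import re, sys\n"
    "m = re.search(sys.argv[1], sys.argv[2])\n"
    "print('match' if m else 'no match')\n"
)


def match_with_timeout(pattern: str, text: str, timeout_s: float) -> str:
    """Return 'match', 'no match', or 'timeout'."""
    try:
        done = subprocess.run(
            [sys.executable, "-c", _CHILD, pattern, text],
            capture_output=True,
            text=True,
            timeout=timeout_s,  # the child is killed if it exceeds this
            check=True,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return done.stdout.strip()
```

A guard like this only fires when the supplied text actually makes the match slow; it says nothing about a regex that is tested on harmless text.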

davisjam · 2018-06-05T16:53:53Z

RegExr will only detect a problematic regex if the test input happens to trigger its slow behavior. If I am testing a super-linear regex on non-triggering input, I won't realize it is vulnerable. Since RegExr is a widely used regex service, I think identifying super-linear regexes would be a helpful enhancement.
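
A small experiment makes the point concrete. The sketch below uses Python's backtracking `re` engine and the textbook pattern `(a+)+$` as an assumed example (it is not a regex taken from this thread); the two inputs differ by a single trailing character.

```python
import re
import time

# Nested quantifiers: a textbook exponential-time pattern for backtracking engines.
PATTERN = re.compile(r"(a+)+$")


def seconds_to_search(text: str) -> float:
    """Time one search of `text` with PATTERN."""
    start = time.perf_counter()
    PATTERN.search(text)
    return time.perf_counter() - start


benign = "a" * 20          # matches on the first attempt
trigger = "a" * 20 + "!"   # cannot match, so the engine backtracks through
                           # the many ways of dividing the a's among repetitions

benign_s = seconds_to_search(benign)
trigger_s = seconds_to_search(trigger)
print(f"benign:  {benign_s:.6f} s")
print(f"trigger: {trigger_s:.6f} s")
```

On the benign input the regex looks perfectly healthy, so a timeout-based check would never flag it; each extra `a` in the triggering input roughly doubles the work.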

gskinner · 2018-06-05T17:56:52Z

Can you go into more detail on this:

How would it be hosted? On our server, or yours? What if your server goes down? Can it handle potentially tens of thousands of tests a day?
How do you see it being integrated into the UI? Would it be a Tool I need to specifically run, or would it run on edit like everything else? (Obviously the latter fits our model better, but the former is much less resource intensive.)

davisjam · 2018-06-05T19:46:41Z

Hosting: I have open-sourced the code necessary to answer these queries. The service I am currently hosting (basically a DB, so you don't need to recompute previously computed results) would not scale to tens of thousands of queries, since it is just a desktop in our lab. But it could, for example, be containerized for scaling if some generous sponsor were willing to provide hosting. The queries are independent and the results just get merged into a DB, so this wouldn't be too hard.
UI: I would suggest a Tool that gives best-effort responses during editing, but which can answer queries on demand on a button press. Since the vulnerability of a regex doesn't change over time, once a regex has been checked (expensive) the result can be saved for subsequent lookup (cheap), so during editing we can flag already-known vulnerable regexes.
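
The check-once, look-up-later idea can be sketched as follows. This is a minimal illustration, not the open-sourced service: `VerdictCache` is a hypothetical name, the in-memory dict stands in for the DB, and `analyze` is whatever expensive detector is plugged in.

```python
from typing import Callable, Dict, Optional


class VerdictCache:
    """Remember vulnerability verdicts so each regex is analyzed at most once."""

    def __init__(self, analyze: Callable[[str], bool]) -> None:
        self._analyze = analyze                # expensive detector (assumed)
        self._verdicts: Dict[str, bool] = {}   # stand-in for the DB

    def lookup(self, pattern: str) -> Optional[bool]:
        """Cheap path for use during editing; None means 'not checked yet'."""
        return self._verdicts.get(pattern)

    def check(self, pattern: str) -> bool:
        """Expensive path for an explicit button press; stores the verdict."""
        known = self._verdicts.get(pattern)
        if known is None:
            known = self._analyze(pattern)
            self._verdicts[pattern] = known
        return known
```

During editing the UI would call only `lookup`; `check` runs the detector at most once per distinct regex, after which every later edit that produces the same regex is a dictionary hit.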

Since testing for vulnerability can be expensive, on-demand checks of never-before-seen regexes could be a bottleneck, so I wouldn't recommend running them without an explicit request. Instead, never-before-seen regexes can be tested in the background and the result used the next time the regex is seen. This would be beneficial provided that RegExr is used widely enough that you see the same regex more than once. I imagine this is the case?
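
One way to picture the background-testing flow is a work queue drained by a single worker thread. The sketch below is an assumption-laden illustration: `BackgroundChecker`, `on_edit`, and `wait_until_idle` are hypothetical names, and `analyze` again stands in for the real detector.

```python
import queue
import threading
from typing import Callable, Dict, Optional, Set


class BackgroundChecker:
    """Answer from known verdicts; queue unseen regexes for later analysis."""

    def __init__(self, analyze: Callable[[str], bool]) -> None:
        self._analyze = analyze                  # expensive detector (assumed)
        self._verdicts: Dict[str, bool] = {}
        self._queued: Set[str] = set()           # avoids queueing a regex twice
        self._pending: "queue.Queue[str]" = queue.Queue()
        self._lock = threading.Lock()
        threading.Thread(target=self._worker, daemon=True).start()

    def on_edit(self, pattern: str) -> Optional[bool]:
        """Return a known verdict, or None after queueing an unseen regex."""
        with self._lock:
            if pattern in self._verdicts:
                return self._verdicts[pattern]
            if pattern not in self._queued:
                self._queued.add(pattern)
                self._pending.put(pattern)
        return None

    def _worker(self) -> None:
        while True:
            pattern = self._pending.get()
            verdict = self._analyze(pattern)     # runs off the editing path
            with self._lock:
                self._verdicts[pattern] = verdict
            self._pending.task_done()

    def wait_until_idle(self) -> None:
        """Block until every queued regex has a stored verdict."""
        self._pending.join()
```

The first time a regex appears, `on_edit` returns `None` immediately and the editor is not blocked; the verdict is available the next time the same regex is seen.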

gskinner mentioned this issue Jun 5, 2018

Show time to execute even with no match. #263

Closed

gskinner added the enhancement label Jun 5, 2018

gskinner closed this as completed Jun 5, 2018
