Cache: Design: Server replies early with UNKNOWN rather than running the detectors synchronously. #39
Comments
This behavior is by design. Testing a regex can take worst-case
Aside: This is a great fit for serverless if you know anyone giving that away for free.
If you really want to wait for a real response, here are the options I see:
but this will make my server do extra work.
I'm open to other suggestions but I don't see any good alternatives right now. |
I understand that testing can require a lot of CPU and that this can get costly. I'm looking at this from a usability point of view, though. People use the service to find out whether their regexp is safe or not. If the service returns "UNKNOWN", that is not helpful at all and strongly hurts usability. People will then either have to (a) write a loop that checks again at intervals until the service returns VULNERABLE/SAFE, or (b) find another service or application that can answer the question.
If you do want this service to be usable in practice, I think it's better to solve this issue in the library itself (where you are in control) rather than leave it up to users to write their own while-loops with unknown intervals (where you have no control over the impact on your server). I guess you will have to test which is least expensive: keeping a request open until the test is finished, or polling with a new request at intervals until the test is finished.
Ideally, people could easily run the tests locally on their own computer. That would mean porting the logic from Java to node.js, or maybe embedding the Java application inside |
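For reference, option (a) above could look roughly like the sketch below. It is a hypothetical helper, not part of the library: `testUntilKnown`, its option names, and the default delays are all made up for illustration, and `query` is any function that returns a promise of one of the result strings mentioned in this thread ('VULNERABLE', 'SAFE', 'UNKNOWN'). Exponential backoff is used so that repeated UNKNOWN answers put less load on the server.

```javascript
// Hypothetical polling helper (not the library's API).
// `query(pattern)` must return a Promise resolving to 'VULNERABLE', 'SAFE', or 'UNKNOWN'.
async function testUntilKnown(query, pattern, options = {}) {
  const {
    maxAttempts = 6,        // assumed limit, tune to taste
    initialDelayMs = 5000,  // assumed first wait between polls
    backoffFactor = 2,      // each wait is this many times longer than the last
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;

  let delayMs = initialDelayMs;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await query(pattern);
    if (result !== 'UNKNOWN') {
      return result; // definitive answer (or an error value), stop polling
    }
    if (attempt < maxAttempts) {
      await sleep(delayMs); // wait before asking the server again
      delayMs *= backoffFactor;
    }
  }
  return 'UNKNOWN'; // gave up after maxAttempts polls
}
```

With the real client, `query` would presumably wrap `vulnRegexDetector.test`, assuming it returns a promise of the result string; that detail is not verified here.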
As for hosting costs: you could simply approach Amazon, Heroku, or others, explain your project (making the software world a bit safer), and ask whether they would be interested in hosting this service for free. Some companies are really positive about this sort of initiative and eager to lend a helping hand. |
The intended use case is CI (see my "pseudo eslint plugin") where the code base is being scanned repeatedly. With this approach an answer should be available relatively soon (e.g. within 1-2 hours in the worst case) so across two CI runs you'll get an answer. This isn't the same as immediately, but I wouldn't describe it as unusable nor unhelpful. The client does save results in a local cache once it gets a solid answer, reducing the overall network traffic. If you think this design is unacceptable, I'm happy to discuss a new design and review a PR. I can update the code on the current server as needed.
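To illustrate the caching idea described above (only solid answers are stored, so UNKNOWN is asked again on the next run), here is a minimal sketch. It is not the client's actual implementation: `makeCachingQuery` is a hypothetical name, and an in-memory `Map` stands in for whatever local cache the real client uses.

```javascript
// Hypothetical sketch: cache only definitive answers, never UNKNOWN.
// `query(pattern)` must return a Promise resolving to a result string.
function makeCachingQuery(query, cache = new Map()) {
  return async function cachedQuery(pattern) {
    if (cache.has(pattern)) {
      return cache.get(pattern); // solid answer from an earlier run, no network traffic
    }
    const result = await query(pattern);
    if (result === 'VULNERABLE' || result === 'SAFE') {
      cache.set(pattern, result); // remember solid answers only
    }
    return result; // UNKNOWN is passed through and will be re-queried next time
  };
}
```

Across two CI runs, the first call for a new pattern may return UNKNOWN, and a later call picks up and stores the definitive answer.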
I've made this as easy as I know how. Run The client/server setup is intended to remove the dependency costs and permit easy integration into existing CI setups.
At the moment I don't have the time to pursue this myself. Seems a bit beyond my purview as a researcher. If anyone wants to take the lead on this, I'd be delighted to join in the fun. |
Yes, sorry, this doesn't render the service completely unusable. At my work, alarm bells go off when a CI build fails, so you want to prevent that from happening needlessly. For unit tests, integration tests, and CI, one aspect I find very important is that they run reliably and consistently. If the CI system "sometimes" fails, or fails on a later build because of issues introduced some time ago rather than the changes at hand, that is really bad for confidence in the CI system. I personally would implement a polling mechanism myself then.
Ah, I hadn't seen that yet in the docs (I was looking at the docs on npm, which differ from those on GitHub). That sounds promising, thanks. Will give that a try. |
Thanks! |
Do let me know your thoughts when you've had a chance to look it over. |
ok will do! |
It looks like a microservices approach like AWS Lambda might be a good fit for a "don't reply until you have an answer" design. The expected workload should fit in the free-or-very-cheap tier. Initial funding could come from e.g. the AWS Research Credits program. If I understand the architecture right, each lambda could handle a single request, check a backend DB, and either run the detectors locally or return the DB's stored result. |
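A rough sketch of that per-request flow, under stated assumptions: `makeHandler`, the `request` shape, and the `source` field are hypothetical, `db` is anything with `get`/`set` (a `Map` works for local experiments), and `runDetectors` is a stand-in for the real detectors. This is not AWS Lambda's handler signature, only the control flow.

```javascript
// Hypothetical "don't reply until you have an answer" handler.
// `db` needs get(key) and set(key, value), sync or async.
// `runDetectors(pattern)` stands in for the real (possibly slow) detectors.
function makeHandler(db, runDetectors) {
  return async function handleRequest(request) {
    const { pattern } = request;

    // 1. Return the stored result if some earlier request already computed it.
    const stored = await db.get(pattern);
    if (stored !== undefined) {
      return { pattern, result: stored, source: 'db' };
    }

    // 2. Otherwise run the detectors in this invocation and wait for them.
    const result = await runDetectors(pattern);

    // 3. Store the answer so later requests are served from the DB.
    await db.set(pattern, result);
    return { pattern, result, source: 'detectors' };
  };
}
```

Note that two concurrent requests for the same new pattern would both run the detectors in this sketch; deduplicating them would need extra coordination in the backend.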
I tried to run the
In the end I tried the following, which gives "INTERNAL-ERROR" results:
|
I noticed that when using both vulnRegexDetector.test and vulnRegexDetector.testSync, testing a new regexp always returns UNKNOWN the first few times, and only after a minute or so does it return VULNERABLE or SAFE. It would be nice if the library simply waited until the server knows the result and only then returned it. Right now I cannot reliably use the library in automated unit tests.