ci: removing "old" csv-based scan? #4294

Open
@terriko

Description

Right now, cve-bin-tool gets scanned for CVEs by a lot of different tools for different reasons. Some of that is me testing and learning about similar tools; some of it is for compliance with things like scorecard.

We're currently scanning with our own engine twice:

  • an older scan where we produce requirements.csv files
  • a newer scan using the cve-bin-tool github action

These are actually slightly different scans. The old csv scan includes javascript dependencies that I'm not sure are picked up otherwise (need to check this). The csv files also contain vendor information so that we look up the correct entry (and only the correct entry) in NVD if one is available.
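To illustrate why the vendor information matters, here is a toy sketch of a product lookup with and without a vendor. The index, the vendor and product names, and the CVE ids are all made up for illustration; this is not how cve-bin-tool's database is actually structured.

```python
# Toy index: (vendor, product) -> set of CVE ids.
# Every entry here is hypothetical, not real NVD data.
TOY_INDEX = {
    ("vendor_a", "examplelib"): {"CVE-0000-0001"},
    ("vendor_b", "examplelib"): {"CVE-0000-0002", "CVE-0000-0003"},
}


def lookup(product, vendor=None, index=TOY_INDEX):
    """Return CVE ids for a product, optionally restricted to one vendor."""
    found = set()
    for (idx_vendor, idx_product), cves in index.items():
        if idx_product != product:
            continue
        if vendor is not None and idx_vendor != vendor:
            continue
        found |= cves
    return found


# Name-only lookup pulls in CVEs from every vendor with a matching product name.
by_name_only = lookup("examplelib")
# Vendor-qualified lookup returns only the entry we actually mean.
by_vendor = lookup("examplelib", vendor="vendor_a")

print(sorted(by_name_only))
print(sorted(by_vendor))
```

The name-only lookup is where false positives can creep in when two unrelated projects share a product name.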

Thanks to the great work by @inosmeet this summer integrating purl identifiers into our language lookup and adding the "mismatch" database to help us weed out false positives during lookup, I think we should be close to the situation where the old and new scans will be equivalent, or could be made equivalent with a bit of extra data. So I'd like to propose that someone do the following:

  1. Run both scans and compare results: what's missing in the newer scan? Is it finding anything that the old scan was not?
    • note that the github action is basically running cve-bin-tool . on the cve-bin-tool source directory
    • the csv scan is running something like cve-bin-tool -i requirements.csv on a bunch of .csv files. Off the top of my head there's one in the main directory, one in doc/ and one buried in output_engine/, but you should check test/test_requirements.py to see exactly what it does.
  2. Update things so that everything we want to scan is found by the new scan.
    • this might mean making a javascript lock file that our scanner will pick up on a repo scan, or tweaking the action to scan that file explicitly if it won't be picked up automatically
  3. Make sure any mappings are preserved so we're looking up the correct components
    • This may mean adding some mismatch data or purl2cpe mappings to make sure we're finding the correct components.
  4. Go through the machinery that keeps things up to date and make sure that any new files are kept up to date and old files that won't be used are removed
  5. Remove or adjust any parts of the old scan that are no longer needed (possibly all of test/test_requirements.py if we've done our jobs right?)
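For step 1, a minimal sketch of the comparison might look like the following. It assumes both scans have been exported as CSV and that the reports have vendor, product, version and cve_number columns; those column names are an assumption and should be checked against the real output. The sample rows are made up.

```python
import csv
import io

# Assumed column names; adjust to match the real report headers.
KEY_FIELDS = ("vendor", "product", "version", "cve_number")


def load_findings(csv_text, key_fields=KEY_FIELDS):
    """Parse CSV text into a set of tuples, one per finding."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {tuple(row[field].strip() for field in key_fields) for row in reader}


def compare_scans(old, new):
    """Split findings into those unique to each scan and those shared."""
    return {
        "only_in_old": sorted(old - new),
        "only_in_new": sorted(new - old),
        "in_both": sorted(old & new),
    }


# Hypothetical sample reports, not real scan output.
OLD_CSV = """vendor,product,version,cve_number
vendor_a,examplelib,1.0,CVE-0000-0001
vendor_a,jslib,2.3,CVE-0000-0002
"""
NEW_CSV = """vendor,product,version,cve_number
vendor_a,examplelib,1.0,CVE-0000-0001
vendor_b,examplelib,1.0,CVE-0000-0003
"""

diff = compare_scans(load_findings(OLD_CSV), load_findings(NEW_CSV))
for bucket, findings in diff.items():
    print(bucket, findings)
```

Anything in only_in_old is a candidate for step 2 (something the new scan isn't picking up), and anything in only_in_new that differs just by vendor is a candidate for step 3 (a mapping that needs mismatch or purl2cpe data).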

I don't think this will be particularly hard, but it may require some deep thinking about results.

Metadata

Assignees

No one assigned

    Labels

    CI: Related to our continuous integration service (GitHub Actions)
    hackathon: Issues for folk participating in the Open Ecosystems hackathon
    security: public security-related issues.
