split test suite #428

Open
wants to merge 8 commits into
base: main

Conversation

jkowalleck
Member

@jkowalleck jkowalleck commented Mar 18, 2025

  • have a folder that contains all test suites.
  • have a file for the purl core spec: PURL-SPECIFICATION.json
  • have a file for each purl type: PURL-TYPES.<type>.json
test-suite-data
    ├── PURL-SPECIFICATION.json
    ├── PURL-TYPES.apk.json
    ├── PURL-TYPES.bitnami.json
...
    ├── PURL-TYPES.generic.json
...
    └── PURL-TYPES.swift.json

Note

this PR does not add new tests or alter existing ones. it only refactors their location


fixes #427

TODO / DONE

  • split test suite per type
  • extract ALL the general cases

created via script:

#!/usr/bin/env python3
"""Split the mixed PURL test suite into one JSON file per package type."""

from json import loads as json_loads, dumps as json_dumps
from urllib.request import urlopen
from os.path import join
from typing import Any, List, Dict

TestCaseType = Dict[str, Any]

# Download the current single-file test suite from the main branch.
mixed_test_suite_url = 'https://raw.githubusercontent.com/package-url/purl-spec/refs/heads/main/test-suite-data.json'
with urlopen(mixed_test_suite_url) as mixed_test_suite_res:
    mixed_test_suite_data: List[TestCaseType] = json_loads(mixed_test_suite_res.read())

# Test cases grouped by lower-cased type. 'generic' always gets a file, even if it stays empty.
ts_types: Dict[str, List[TestCaseType]] = {'generic': []}

for tc in mixed_test_suite_data:
    if tc['type'] is None:
        # A case without a type cannot go into a per-type file; print it for manual handling.
        print('none_case:', json_dumps(tc, indent=2), '', sep='\n')
        continue
    tc_type = str(tc['type']).lower()
    ts_types.setdefault(tc_type, []).append(tc)

# Write one file per type. The 'test-suite' directory must already exist.
for tct, tcs in ts_types.items():
    with open(join('test-suite', f'PURL-TYPES.{tct}.json'), 'wt') as tcf:
        tcf.write(json_dumps(tcs, indent=2))

@jkowalleck jkowalleck changed the title from "[WIP] split test suite" to "split test suite" Mar 19, 2025
@jkowalleck jkowalleck requested review from pombredanne, johnmhoran and a team March 19, 2025 09:42
@jkowalleck jkowalleck marked this pull request as ready for review March 19, 2025 09:43
Contributor

@matt-phylum matt-phylum left a comment

The core file is missing tests for canonical encoding of all characters in all the different positions. The spec does not currently define this for all characters, but there should at least be tests for the characters that are defined and for some critical cases where failure to encode and decode properly results in unexpected errors or incorrect values:

  • plus vs space (broken for at least 4 implementations)
  • percent signs (broken for 1 implementation)
  • ampersands in qualifier values (broken for 2 implementations)
  • a control character < 0x10 (broken for 2 implementations)
  • a BMP Unicode character (< U+10000, broken for 2 implementations)
  • a non-BMP Unicode character (>= U+10000, eg 💩, broken for 2 implementations).

Some of the encoding cases used to be covered and no longer are. For example, colons are now only tested for implementations that support pkg:docker and pkg:cpan. Slashes in qualifier values are only tested for implementations that support huggingface or mlflow.
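
As an illustration of why these cases are critical, here is a minimal round-trip sketch. It assumes plain percent-encoding via Python's `urllib.parse.quote` with `safe=""` as a stand-in; the sample values and the `encode`/`decode` helpers are hypothetical and are not the purl spec's canonical encoding rules.

```python
from urllib.parse import quote, unquote, unquote_plus

# Hypothetical sample values, one per critical case listed above.
SAMPLES = {
    "plus": "a+b",
    "space": "a b",
    "percent": "100%",
    "ampersand": "a&b",
    "control": "a\x01b",       # control character below 0x10
    "bmp": "\u00e9",           # below U+10000
    "non_bmp": "\U0001F4A9",   # at or above U+10000
}


def encode(value: str) -> str:
    # safe="" percent-encodes every reserved character, including "/" and "+".
    return quote(value, safe="")


def decode(value: str) -> str:
    # Plain percent-decoding; "+" is left untouched.
    return unquote(value)


# Every sample must survive an encode/decode round trip unchanged.
for name, raw in SAMPLES.items():
    assert decode(encode(raw)) == raw, name

# Form-style decoding treats "+" as a space and corrupts a literal plus sign.
assert unquote_plus("a+b") == "a b"
assert unquote("a+b") == "a+b"
```

The last two assertions show the plus-versus-space hazard: a decoder borrowed from HTML form handling silently changes the value.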

[
{
"description": "invalid subpath - unencoded subpath cannot contain '..'",
"purl": "pkg:GOLANG/google.golang.org/genproto@abcdedf#/googleapis/%2E%2E/api/annotations/",
Contributor

This test is doing three things. GOLANG should probably be written as golang and the subpath should not begin with /.

"description": "invalid encoded colon : between scheme and type",
"purl": "pkg%3Amaven/org.apache.commons/io",
"canonical_purl": null,
"type": "maven",
Contributor

This test doesn't work. I think you got it from elsewhere because there is already a test with this problem. type cannot be "maven" because {"type":"maven","namespace": "org.apache.commons","name":"io"} is a valid PURL and the test will fail because of "is_invalid": true. "type":"pkg:maven" or "type":null would be more accurate.
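
The problem can be sketched with a deliberately simplified builder. `build_purl` is a hypothetical helper that handles only type, namespace, and name and does no encoding; it is not any implementation's real API.

```python
from typing import Optional


def build_purl(type_: str, namespace: Optional[str], name: str) -> str:
    """Simplified builder: type, namespace and name only, no encoding."""
    if not type_ or not name:
        raise ValueError("type and name are required")
    parts = [type_.lower()]
    if namespace:
        parts.append(namespace)
    parts.append(name)
    return "pkg:" + "/".join(parts)


# Components copied from the test case under review.
components = {"type": "maven", "namespace": "org.apache.commons", "name": "io"}

# Building succeeds, so a harness that expects construction from components
# to fail for an "is_invalid": true case would report a test failure.
built = build_purl(components["type"], components["namespace"], components["name"])
print(built)  # pkg:maven/org.apache.commons/io
```

Only the string form "pkg%3Amaven/..." is invalid; the listed components describe a perfectly valid package.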

Member

@matt-phylum Thanks for your comments on this test, which seems to be a variation of the test I added as part of my (merged) scheme PR #361. I had questions at the time re how to properly construct the input values when the PURL is meant to be invalid, and similar questions re the 3 is-invalid-true tests in my (merged) type PR #383. Subsequent discussion led to

Clarifying the core spec's test-construction instructions would be a valuable next step in the test-updating process. (I suspect the test-construction clarification could be done even while additional work on the "Character encoding" section continues.)

"description": "invalid subpath - unencoded subpath cannot contain '..'",
"purl": "pkg:GOLANG/google.golang.org/genproto@abcdedf#/googleapis/%2E%2E/api/annotations/",
"canonical_purl": "pkg:golang/google.golang.org/genproto@abcdedf#googleapis/api/annotations",
"type": "golang",
Contributor

What is the difference between the pkg:generic tests (currently an empty file) and the tests in this file? Should all of the tests in this file be type generic? They're cases where the package being type golang or whatever shouldn't make a difference, but some implementations validate that the package type is a supported type, and it's possible that an implementation doesn't support one of these types or doesn't support it correctly and gets an unexpected test result.

Comment on lines +83 to +94
{
"description": "double slash // after scheme is not significant",
"purl": "pkg://maven/org.apache.commons/io",
"canonical_purl": "pkg:maven/org.apache.commons/io",
"type": "maven",
"namespace": "org.apache.commons",
"name": "io",
"version": null,
"qualifiers": null,
"subpath": null,
"is_invalid": false
},
Contributor

Some of these tests are duplicates of core tests and aren't related to the package type.
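
One way to find such duplicates is to compare the input `purl` strings of two files. This is a rough sketch: `duplicated_purls` is a hypothetical helper, and the second typed case below is made up for illustration.

```python
from typing import Any, Dict, List

TestCase = Dict[str, Any]


def duplicated_purls(core: List[TestCase], typed: List[TestCase]) -> List[str]:
    """Return input purls present in both lists, in the order they appear in `typed`."""
    core_purls = {tc["purl"] for tc in core}
    return [tc["purl"] for tc in typed if tc["purl"] in core_purls]


# The first purl is the case quoted above; the second one is hypothetical.
core_cases = [{"purl": "pkg://maven/org.apache.commons/io"}]
typed_cases = [
    {"purl": "pkg://maven/org.apache.commons/io"},
    {"purl": "pkg:maven/org.example/demo"},
]

print(duplicated_purls(core_cases, typed_cases))
```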

Contributor

@mprpic mprpic left a comment

LGTM, I would file separate issues for the comments that @matt-phylum left in this PR since the change here is only about better organization of the test data.

@jkowalleck
Member Author

jkowalleck commented Mar 19, 2025

> LGTM, I would file separate issues for the comments that @matt-phylum left in this PR since the change here is only about better organization of the test data.

@matt-phylum's comment seems to be out of scope.
this PR is not about adding any new test cases, nor is it about altering existing ones.

@jkowalleck jkowalleck requested a review from a team March 19, 2025 16:03
@jkowalleck
Member Author

jkowalleck commented Mar 27, 2025

@pombredanne @package-url/purl-spec-helpers can we merge this one?

I mean, we could wait until the dozens of open PRs that modify the test suite are eventually merged, but then we would not have a reasonably usable test suite in the meantime.
Or we might wait until we have clarification for certain edge cases in the test suite, which seems to take months.
Neither is desirable, I'd say.

Anyway, I am a huge fan of many small, scoped changes - like this one - so downstream users can review and adopt in small iterations, instead of one big blob of rework.

Let's merge this one, so that downstream users have an immediate benefit, even though all these other PRs are still in triage for months.
(I will then go through each of the open PRs and manually resolve any conflicts with the test-suite)

PS: merging this PR now would make it easier for downstream users to adopt every upcoming change to the test-suite, as they can pick the files they need.
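
A downstream consumer could pick files roughly like this. It is a sketch that assumes every file holds a JSON array of test cases, as the per-type files written by the script in the PR description do; `load_test_cases` is a hypothetical helper name.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterable, List

TestCase = Dict[str, Any]


def load_test_cases(suite_dir: Path, types: Iterable[str]) -> List[TestCase]:
    """Load the core spec cases plus the cases for the requested purl types."""
    files = [suite_dir / "PURL-SPECIFICATION.json"]
    files += [suite_dir / f"PURL-TYPES.{t.lower()}.json" for t in types]
    cases: List[TestCase] = []
    for path in files:
        # Assumption: each file contains a JSON array of test-case objects.
        with path.open(encoding="utf-8") as fh:
            cases.extend(json.load(fh))
    return cases
```

An implementation that only supports Maven and npm would then call `load_test_cases(Path("test-suite-data"), ["maven", "npm"])` and skip every other type file.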

mprpic previously approved these changes Mar 28, 2025
@jkowalleck
Member Author

jkowalleck commented Apr 2, 2025

need to incorporate #416

PS: done via f528a57

},
{
"description": "maven often uses qualifiers",
"purl": "pkg:Maven/org.apache.xmlgraphics/[email protected]?classifier=sources&repositorY_url=repo.spring.io/release",

Isn't this an invalid example, since the solidus / in repo.spring.io/release is not percent encoded?

Contributor

No. The parser is supposed to split on ? before /.
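
The splitting order can be sketched as follows. `split_purl` is a hypothetical helper that does no decoding or validation, and the purl used is a made-up example rather than the one from the test file.

```python
from typing import Optional, Tuple


def split_purl(purl: str) -> Tuple[str, Optional[str], Optional[str]]:
    """Split into (remainder, raw qualifiers, raw subpath) before touching any "/"."""
    subpath: Optional[str] = None
    qualifiers: Optional[str] = None
    if "#" in purl:
        # The subpath is split off first, at the last "#".
        purl, subpath = purl.rsplit("#", 1)
    if "?" in purl:
        # The qualifiers are split off next, at the last "?".
        purl, qualifiers = purl.rsplit("?", 1)
    return purl, qualifiers, subpath


# Hypothetical purl with an unencoded "/" inside a qualifier value.
example = "pkg:maven/org.example/demo@1.0?repository_url=repo.spring.io/release"
remainder, qualifiers, subpath = split_purl(example)

# Only the remainder is split on "/", so the qualifier value stays intact.
segments = remainder.split("/")
print(segments, qualifiers, subpath)
```

Because the qualifier string is removed before the path is split, the slash in the qualifier value is never mistaken for a path separator.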

What about #439? I assume that / must be unencoded when used as a PURL separator and encoded otherwise?

Signed-off-by: Jan Kowalleck <[email protected]>
incorporate package-url#416

Signed-off-by: Jan Kowalleck <[email protected]>
@jkowalleck jkowalleck force-pushed the tests/split-test-suite branch from 598d148 to f528a57 Compare April 2, 2025 23:04
@jkowalleck jkowalleck requested a review from a team April 2, 2025 23:07
Member

@johnmhoran johnmhoran left a comment

Excellent idea @jkowalleck -- LGTM.

@johnmhoran johnmhoran added this to the 1.0-draft milestone Apr 4, 2025
Successfully merging this pull request may close these issues.

split the test-suite
5 participants