Commit 1a65993

Merge pull request #105 from instructlab/lhawthorn-patch-5
Update knowledge-submissions-past-wikipedia.md
2 parents 5e07f93 + a2c98d2 commit 1a65993

1 file changed: +31 -17 lines changed
docs/knowledge-submissions-past-wikipedia.md

@@ -24,27 +24,41 @@ Status:
 - `denied`: Denied by the legal team, and posted on the [avoided list][avoided].
 - `submitted`: Sent to the legal team for review
 - `proposed`: The community would like to propose this as a possible place to take knowledge submissions from.
+- `reviewed - manually verify`: Legal team has reviewed this domain and while much of its source material meets our open licensing criteria, not all of it does. Each submission from this source must be manually verified to actually be under an appropriate content license or definitively in the public domain.
+
+For the purposes of Knowledge submissions to the InstructLab project, data sourced from items in the `approved` category require no further vetting from the Triage and/or other Maintainer teams. Items in the `reviewed - manually verify` category will require vetting before the submission can be accepted.
+
+To ensure that the data you would like to include in your knowledge submission meets the project licensing criteria, please make sure to talk to the Taxonomy maintainer team *before* you begin work on your submission. We would hate for you to do a great deal of work only to be told that the data source you selected would not work for the project. Please make sure you review the [Getting Started with Knowledge Submissions](https://github.com/instructlab/taxonomy?tab=readme-ov-file#getting-started-with-knowledge-contributions) documentation prior to submitting your request.
 
 | Domain name | Status | Notes |
 | :-- | :-- | :-- |
-| <https://en.wikipedia.org/wiki/Main_Page> | approved | |
+| Wikipedia: <https://en.wikipedia.org/wiki/Main_Page> | approved | |
 | Project Gutenberg: <https://www.gutenberg.org/> | approved | Pre-1927 works; public domain under US copyright law |
-| <https://www.congress.gov/> | proposed | |
-| <https://www.whitehouse.gov/> | proposed | |
-| <https://www.senate.gov/> | proposed | |
-| <https://www.irs.gov/> | proposed | |
-| NASA: <https://www.nasa.gov/> | proposed | See guidelines: <https://www.nasa.gov/nasa-brand-center/images-and-media/> |
-| Smithsonian Libraries: <https://library.si.edu/>| proposed | For any material marked \"No Copyright - United States" or "CC0" as described here: <https://library.si.edu/copyright> |
-| European Union (EU): <https://european-union.europa.eu/> | proposed | Specifically documents submitted under "public registrars": <https://european-union.europa.eu/principles-countries-history/principles-and-values/access-information_en> |
-| Internet Archive: <https://archive.org/> | proposed | Pre-1927 works; public domain under US copyright law |
-| Wikisource (library): <https://en.wikisource.org/> | proposed | "free library that anyone can improve" |
-
-### Next steps
-
-1. We have to find the correct legal person to find a way to be the correct point person for this project.
-1. Collect suggested places from the community and add them to the above table
-1. Work with our legal team to get approvals and denials.
-1. Inform the triage team and triagers of the new locations we can or can not accept.
+| Wikisource (library): <https://en.wikisource.org/> | approved | "free library that anyone can improve" |
+| OpenStax textbooks family of publications <https://openstax.org/subjects> | approved | |
+| The Open Organization publications <https://theopenorganization.org/> | approved | |
+| The Scrum Guide <https://scrumguides.org/index.html> | approved | |
+| <https://www.congress.gov/> | reviewed - manually verify | |
+| <https://www.whitehouse.gov/> | reviewed - manually verify | |
+| <https://www.senate.gov/> | reviewed - manually verify | |
+| <https://www.irs.gov/> | reviewed - manually verify| |
+| NASA: <https://www.nasa.gov/> | reviewed - manually verify | See guidelines: <https://www.nasa.gov/nasa-brand-center/images-and-media/> |
+| Smithsonian Libraries: <https://library.si.edu/>| reviewed - manually verify | For any material marked \"No Copyright - United States" or "CC0" as described here: <https://library.si.edu/copyright> |
+| European Union (EU): <https://european-union.europa.eu/> | reviewed - manually verify | Specifically documents submitted under "public registrars": <https://european-union.europa.eu/principles-countries-history/principles-and-values/access-information_en> |
+| Internet Archive: <https://archive.org/> | reviewed - manually verify | Pre-1927 works; public domain under US copyright law |
+| PLOS family of open access journals: <https://plos.org/publish/> | reviewed - manually verify | |
+| Open Practice Library: <https://openpracticelibrary.com/> | reviewed - manually verify | |
+| Cynefin.io wiki: <https://cynefin.io/wiki/Main_Page> | reviewed - manually verify | |
+| The Open Education Project: <https://research.redhat.com/blog/research_project/foundations-in-open-source-education/> | reviewed - manually verify | |
+
+### Process steps
+
+1. Collect suggested places from the community by requesting they submit a pull request against this dev doc.
+1. Work with our legal team to adjudicate. [@lhawthorn](https://github.com/lhawthorn) is currently the owner of this step, but is happy to educate & empower other folks to do this work.
+1. Inform the triage team and triagers of the new locations we can or can not accept. This is currently done via an announcement in the [daily Triager Standup](https://github.com/instructlab/community/blob/main/Collaboration.md#triager-standup) and via a pull request to update the Knowledge Guide in one of the two locations listed below.
+
+- [Approved sources][approved]
+- [Rejected sources][avoided]
 
 [approved]: https://github.com/instructlab/taxonomy/blob/main/docs/KNOWLEDGE_GUIDE.md#accepted-knowledge
 [avoided]: https://github.com/instructlab/taxonomy/blob/main/docs/KNOWLEDGE_GUIDE.md#avoid-these-topics
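
Reviewer note: the vetting rule this change introduces (`approved` sources need no further vetting, `reviewed - manually verify` sources need a manual license check) could be sketched as a simple lookup. This is only an illustration and not part of the commit or of any InstructLab tooling: the `SOURCE_STATUS` dictionary is a hand-copied subset of the table above, and the `vetting_required` helper and its return strings are hypothetical names chosen for this example.

```python
from urllib.parse import urlparse

# Hypothetical, hand-copied subset of the status table in the doc.
# Keys are host names taken from the listed URLs.
SOURCE_STATUS = {
    "en.wikipedia.org": "approved",
    "www.gutenberg.org": "approved",
    "en.wikisource.org": "approved",
    "www.congress.gov": "reviewed - manually verify",
    "www.nasa.gov": "reviewed - manually verify",
    "archive.org": "reviewed - manually verify",
}


def vetting_required(url: str) -> str:
    """Return the vetting outcome for a source URL (illustrative only)."""
    host = urlparse(url).netloc.lower()
    status = SOURCE_STATUS.get(host)
    if status == "approved":
        # Approved domains need no further vetting by triage or maintainers.
        return "no further vetting"
    if status == "reviewed - manually verify":
        # Each submission must be checked for an appropriate license.
        return "manual license verification required"
    # Unlisted domains go through the proposal process (a pull request).
    return "not listed: propose via pull request"


if __name__ == "__main__":
    print(vetting_required("https://en.wikipedia.org/wiki/Main_Page"))
    print(vetting_required("https://www.nasa.gov/missions/"))
    print(vetting_required("https://example.com/article"))
```

Matching on the exact host name is an assumption made to keep the sketch short; entries such as Project Gutenberg also carry notes (for example, pre-1927 works) that a lookup like this does not capture.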