docs/knowledge-submissions-past-wikipedia.md
+31-17
@@ -24,27 +24,41 @@ Status:
24
24
-`denied`: Denied by the legal team, and posted on the [avoided list][avoided].
25
25
-`submitted`: Sent to the legal team for review
26
26
-`proposed`: The community would like to propose this as a possible place to take knowledge submissions from.
27
+
-`reviewed - manually verify`: Legal team has reviewed this domain and while much of its source material meets our open licensing criteria, not all of it does. Each submission from this source must be manually verified to actually be under an appropriate content license or definitively in the public domain.
28
+
29
+
For the purposes of Knowledge submissions to the InstructLab project, data sourced from items in the `approved` category require no further vetting from the Triage and/or other Maintainer teams. Items in the `reviewed - manually verify` category will require vetting before the submission can be accepted.
30
+
31
+
To ensure that the data you would like to include in your knowledge submission meets the project licensing criteria, please make sure to talk to the Taxonomy maintainer team *before* you begin work on your submission. We would hate for you to do a great deal of work only to be told that the data source you selected would not work for the project. Please make sure you review the [Getting Started with Knowledge Submissions](https://github.com/instructlab/taxonomy?tab=readme-ov-file#getting-started-with-knowledge-contributions) documentation prior to submitting your request.
| Project Gutenberg: <https://www.gutenberg.org/>| approved | Pre-1927 works; public domain under US copyright law |
32
-
|<https://www.congress.gov/>| proposed ||
33
-
|<https://www.whitehouse.gov/>| proposed ||
34
-
|<https://www.senate.gov/>| proposed ||
35
-
|<https://www.irs.gov/>| proposed ||
36
-
| NASA: <https://www.nasa.gov/>| proposed | See guidelines: <https://www.nasa.gov/nasa-brand-center/images-and-media/>|
37
-
| Smithsonian Libraries: <https://library.si.edu/>| proposed | For any material marked \"No Copyright - United States" or "CC0" as described here: <https://library.si.edu/copyright>|
38
-
| European Union (EU): <https://european-union.europa.eu/>| proposed | Specifically documents submitted under "public registrars": <https://european-union.europa.eu/principles-countries-history/principles-and-values/access-information_en>|
39
-
| Internet Archive: <https://archive.org/>| proposed | Pre-1927 works; public domain under US copyright law |
40
-
| Wikisource (library): <https://en.wikisource.org/>| proposed | "free library that anyone can improve" |
41
-
42
-
### Next steps
43
-
44
-
1. We have to find the correct legal person to find a way to be the correct point person for this project.
45
-
1. Collect suggested places from the community and add them to the above table
46
-
1. Work with our legal team to get approvals and denials.
47
-
1. Inform the triage team and triagers of the new locations we can or can not accept.
37
+
| Wikisource (library): <https://en.wikisource.org/>| approved | "free library that anyone can improve" |
38
+
| OpenStax textbooks family of publications <https://openstax.org/subjects>| approved ||
39
+
| The Open Organization publications <https://theopenorganization.org/>| approved ||
40
+
| The Scrum Guide <https://scrumguides.org/index.html>| approved ||
| Smithsonian Libraries: <https://library.si.edu/>| reviewed - manually verify | For any material marked "No Copyright - United States" or "CC0" as described here: <https://library.si.edu/copyright>|
47
+
| European Union (EU): <https://european-union.europa.eu/>| reviewed - manually verify | Specifically documents submitted under "public registrars": <https://european-union.europa.eu/principles-countries-history/principles-and-values/access-information_en>|
48
+
| Internet Archive: <https://archive.org/>| reviewed - manually verify | Pre-1927 works; public domain under US copyright law |
49
+
| PLOS family of open access journals: <https://plos.org/publish/>| reviewed - manually verify ||
50
+
| Open Practice Library: <https://openpracticelibrary.com/>| reviewed - manually verify ||
| The Open Education Project: <https://research.redhat.com/blog/research_project/foundations-in-open-source-education/>| reviewed - manually verify ||
53
+
54
+
### Process steps
55
+
56
+
1. Collect suggested places from the community by requesting they submit a pull request against this dev doc.
57
+
1. Work with our legal team to adjudicate. [@lhawthorn](https://github.com/lhawthorn) is currently the owner of this step, but is happy to educate & empower other folks to do this work.
58
+
1. Inform the triage team and triagers of the new locations we can or cannot accept. This is currently done via an announcement in the [daily Triager Standup](https://github.com/instructlab/community/blob/main/Collaboration.md#triager-standup) and via a pull request to update the Knowledge Guide in one of the two locations listed below.
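
As an illustration only, the status categories in this change could be turned into a triage rule along the following lines. This is a minimal sketch, not part of the project tooling: the `Status` enum and `triage_action` function are hypothetical names, and the handling of `submitted` and `proposed` sources (waiting for legal review) is an assumption, since the doc only spells out the vetting rule for `approved` and `reviewed - manually verify`.

```python
from enum import Enum


class Status(Enum):
    """Status values as they appear in the source table (hypothetical enum)."""
    APPROVED = "approved"
    REVIEWED_MANUALLY_VERIFY = "reviewed - manually verify"
    SUBMITTED = "submitted"
    PROPOSED = "proposed"
    DENIED = "denied"


def triage_action(status: Status) -> str:
    """Return the triage action suggested by a source's status."""
    if status is Status.APPROVED:
        # Approved sources require no further vetting.
        return "accept without further vetting"
    if status is Status.REVIEWED_MANUALLY_VERIFY:
        # Each submission must be checked for an appropriate license
        # or public-domain status before it can be accepted.
        return "manually verify license before accepting"
    if status is Status.DENIED:
        # Denied sources are posted on the avoided list.
        return "do not accept"
    # Assumption: submitted/proposed sources have no legal decision yet.
    return "wait for legal review"


if __name__ == "__main__":
    for status in Status:
        print(f"{status.value}: {triage_action(status)}")
```

Looking a status up by its table text, for example `Status("reviewed - manually verify")`, keeps the rule tied to the exact strings used in the table.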