From b01dc70e2aca87b579e49c3330be20165194e9b3 Mon Sep 17 00:00:00 2001 From: mpadge Date: Tue, 11 Mar 2025 13:10:16 +0100 Subject: [PATCH 1/3] rectify linebreaks in pkg_building, sr_policies --- pkg_building.Rmd | 336 ++++++++++++++++++++++++------------ softwarereview_policies.Rmd | 312 +++++++++++++++++---------------- 2 files changed, 379 insertions(+), 269 deletions(-) diff --git a/pkg_building.Rmd b/pkg_building.Rmd index 021146ded..1e373df4c 100644 --- a/pkg_building.Rmd +++ b/pkg_building.Rmd @@ -6,11 +6,15 @@ aliases: # Packaging Guide {#building} ```{block, type="summaryblock"} -rOpenSci accepts packages that meet our guidelines via a streamlined [Software Peer Review process](#whatissoftwarereview). To ensure a consistent style across all of our tools we have written this chapter highlighting our guidelines for package development. Please also read and apply our [chapter about continuous integration (CI)](#ci). Further guidance for after the review process is provided in the third section of this book starting with [a chapter about collaboration](#collaboration). +rOpenSci accepts packages that meet our guidelines via a streamlined [Software Peer Review process](#whatissoftwarereview). +To ensure a consistent style across all of our tools we have written this chapter highlighting our guidelines for package development. +Please also read and apply our [chapter about continuous integration (CI)](#ci). +Further guidance for after the review process is provided in the third section of this book starting with [a chapter about collaboration](#collaboration). -We recommend that package developers read Hadley Wickham and Jenny Bryan's thorough book on package development which is available for [free online](https://r-pkgs.org/). Our guide is partially redundant with other resources but highlights rOpenSci's guidelines. 
+We recommend that package developers read Hadley Wickham and Jenny Bryan's thorough book on package development which is available for [free online](https://r-pkgs.org/). +Our guide is partially redundant with other resources but highlights rOpenSci's guidelines. - To read why submitting a package to rOpenSci is worth the effort to meet guidelines, have a look at [reasons to submit](#whysubmit). + To read why submitting a package to rOpenSci is worth the effort to meet guidelines, have a look at [reasons to submit](#whysubmit). ``` @@ -18,22 +22,29 @@ We recommend that package developers read Hadley Wickham and Jenny Bryan's thoro ### Naming your package {#naming-your-package} -- We strongly recommend short, descriptive names in lower case. If your package deals with one or more commercial services, please make sure the name does not violate branding guidelines. You can check if your package name is available, informative and not offensive by using the [`pak::pkg_name_check()` function](https://pak.r-lib.org/reference/pkg_name_check.html); also use a search engine as you'd thus see if it's offensive in a language other than English. In particular, do *not* choose a package name that's already used on CRAN or Bioconductor. +- We strongly recommend short, descriptive names in lower case. + If your package deals with one or more commercial services, please make sure the name does not violate branding guidelines. + You can check if your package name is available, informative and not offensive by using the [`pak::pkg_name_check()` function](https://pak.r-lib.org/reference/pkg_name_check.html); also use a search engine as you'd thus see if it's offensive in a language other than English. + In particular, do *not* choose a package name that's already used on CRAN or Bioconductor. - There is a trade-off between the advantages of a unique package name and a less original package name. 
- A more unique package name might be easier to track (for you and us to assess package use for instance, less false positives when typing its name in GitHub code search) and search (for users to ask "how to use package blah" in a search engine). - - On the other hand a *too* unique package name might make the package less discoverable (that is to say, to find it by searching "how to do this-thing in R"). It might be an argument for naming your package something very close to its topic such as [geojson](https://github.com/ropensci/geojson)). + - On the other hand a *too* unique package name might make the package less discoverable (that is to say, to find it by searching "how to do this-thing in R"). + It might be an argument for naming your package something very close to its topic such as [geojson](https://github.com/ropensci/geojson)). - Find other interesting aspects of naming your package [in this blog post by Nick Tierney](https://www.njtierney.com/post/2018/06/20/naming-things/), and in case you change your mind, find out [how to rename your package in this other blog post of Nick's](https://www.njtierney.com/post/2017/10/27/change-pkg-name/). ### Creating metadata for your package {#creating-metadata-for-your-package} -We recommend you to use the [`codemetar` package](https://github.com/ropensci/codemetar) for creating and updating a JSON [CodeMeta](https://codemeta.github.io/) metadata file for your package via `codemetar::write_codemeta()`. It will automatically include all useful information, including [GitHub topics](#grooming). CodeMeta uses [Schema.org terms](https://schema.org/) so as it gains popularity the JSON metadata of your package might be used by third-party services, maybe even search engines. +We recommend you to use the [`codemetar` package](https://github.com/ropensci/codemetar) for creating and updating a JSON [CodeMeta](https://codemeta.github.io/) metadata file for your package via `codemetar::write_codemeta()`. 
+It will automatically include all useful information, including [GitHub topics](#grooming). +CodeMeta uses [Schema.org terms](https://schema.org/) so as it gains popularity the JSON metadata of your package might be used by third-party services, maybe even search engines. ## Platforms {#platforms} -- Packages should run on all major platforms (Windows, macOS, Linux). Exceptions may be granted packages that interact with system-specific functions, or wrappers for utilities that only operate on limited platforms, but authors should make every effort for cross-platform compatibility, including system-specific compilation, or containerization of external utilities. +- Packages should run on all major platforms (Windows, macOS, Linux). + Exceptions may be granted packages that interact with system-specific functions, or wrappers for utilities that only operate on limited platforms, but authors should make every effort for cross-platform compatibility, including system-specific compilation, or containerization of external utilities. ## Package API {#package-api} @@ -41,7 +52,10 @@ We recommend you to use the [`codemetar` package](https://github.com/ropensci/co - Functions and arguments naming should be chosen to work together to form a common, logical programming API that is easy to read, and auto-complete. - - Consider an `object_verb()` naming scheme for functions in your package that take a common data type or interact with a common API. `object` refers to the data/API and `verb` the primary action. This scheme helps avoid namespace conflicts with packages that may have similar verbs, and makes code readable and easy to auto-complete. For instance, in **stringi**, functions starting with `stri_` manipulate strings (`stri_join()`, `stri_sort()`, and in **googlesheets** functions starting with `gs_` are calls to the Google Sheets API (`gs_auth()`, `gs_user()`, `gs_download()`). 
+ - Consider an `object_verb()` naming scheme for functions in your package that take a common data type or interact with a common API. + `object` refers to the data/API and `verb` the primary action. + This scheme helps avoid namespace conflicts with packages that may have similar verbs, and makes code readable and easy to auto-complete. + For instance, in **stringi**, functions starting with `stri_` manipulate strings (`stri_join()`, `stri_sort()`, and in **googlesheets** functions starting with `gs_` are calls to the Google Sheets API (`gs_auth()`, `gs_user()`, `gs_download()`). - For functions that manipulate an object/data and return an object/data of the same type, make the object/data the first argument of the function so as to enhance compatibility with the pipe operators (base R's `|>`, magrittr's `%>%`). @@ -51,33 +65,43 @@ We recommend you to use the [`codemetar` package](https://github.com/ropensci/co - Argument naming and order should be consistent across functions that use similar inputs. -- Package functions importing data should not import data to the global environment, but instead must return objects. Assignments to the global environment are to be avoided in general. +- Package functions importing data should not import data to the global environment, but instead must return objects. + Assignments to the global environment are to be avoided in general. ### Console messages {#console-messages} -- Use either the [cli package](https://cli.r-lib.org/), or base R's tools (`message()` and `warning()`) to communicate with the user in your functions. +- Use either the [cli package](https://cli.r-lib.org/), or base R's tools (`message()` and `warning()`) to communicate with the user in your functions. 
-- Highlights of the cli package include: automatic wrapping, respect of the [NO_COLOR convention](https://cli.r-lib.org/articles/cli-config-user.html?q=no#no_color), many [semantic elements](https://cli.r-lib.org/articles/semantic-cli.html), and extensive documentation. Read more in a [blog post](https://blog.r-hub.io/2023/11/30/cliff-notes-about-cli/). +- Highlights of the cli package include: automatic wrapping, respect of the [NO_COLOR convention](https://cli.r-lib.org/articles/cli-config-user.html?q=no#no_color), many [semantic elements](https://cli.r-lib.org/articles/semantic-cli.html), and extensive documentation. + Read more in a [blog post](https://blog.r-hub.io/2023/11/30/cliff-notes-about-cli/). - Please do not use `print()` or `cat()` unless it's for a `print.*()` or `str.*()` methods, as these methods of printing messages are harder for users to suppress. -- Provide a way for users to opt out of verbosity, preferably at the package level: make message creation dependent on an environment variable or option (like ["usethis.quiet"](https://usethis.r-lib.org/reference/ui.html?q=usethis.quiet#silencing-output) in the usethis package), rather than on a function parameter. The control of messages could be on several levels ("none, "inform", "debug") rather than logical (no messages at all / all messages). Control of verbosity is useful for end users but also in tests. More interesting comments can be found in an [issue of the tidyverse design guide](https://github.com/tidyverse/design/issues/42). +- Provide a way for users to opt out of verbosity, preferably at the package level: make message creation dependent on an environment variable or option (like ["usethis.quiet"](https://usethis.r-lib.org/reference/ui.html?q=usethis.quiet#silencing-output) in the usethis package), rather than on a function parameter. + The control of messages could be on several levels ("none, "inform", "debug") rather than logical (no messages at all / all messages). 
+ Control of verbosity is useful for end users but also in tests. + More interesting comments can be found in an [issue of the tidyverse design guide](https://github.com/tidyverse/design/issues/42). ### Interactive/Graphical Interfaces {#interactive-graphical-interfaces} -If providing graphical user interface (GUI) (such as a Shiny app), to facilitate workflow, include a mechanism to automatically reproduce steps taken in the GUI. This could include auto-generation of code to reproduce the same outcomes, output of intermediate values produced in the interactive tool, or simply clear and well-documented mapping between GUI actions and scripted functions. (See also ["Testing"](#testing) below.) +If providing graphical user interface (GUI) (such as a Shiny app), to facilitate workflow, include a mechanism to automatically reproduce steps taken in the GUI. +This could include auto-generation of code to reproduce the same outcomes, output of intermediate values produced in the interactive tool, or simply clear and well-documented mapping between GUI actions and scripted functions. +(See also ["Testing"](#testing) below.) -The [`tabulizer` package](https://github.com/ropensci/tabulizer) e.g. has an interactive workflow to extract tables, but can also only extract coordinates so one can re-run things as a script. Besides, two examples of shiny apps that do code generation are [https://gdancik.shinyapps.io/shinyGEO/](https://gdancik.shinyapps.io/shinyGEO/), and [https://github.com/wallaceEcoMod/wallace/](https://github.com/wallaceEcoMod/wallace/). +The [`tabulizer` package](https://github.com/ropensci/tabulizer) e.g. has an interactive workflow to extract tables, but can also only extract coordinates so one can re-run things as a script. +Besides, two examples of shiny apps that do code generation are [https://gdancik.shinyapps.io/shinyGEO/](https://gdancik.shinyapps.io/shinyGEO/), and [https://github.com/wallaceEcoMod/wallace/](https://github.com/wallaceEcoMod/wallace/). 
-### Input checking +### Input checking {#input-checking} We recommend your package use a consistent method of your choice for [checking inputs](https://blog.r-hub.io/2022/03/10/input-checking/) -- either base R, an R package, or custom helpers. -### Packages wrapping web resources (API clients) +### Packages wrapping web resources (API clients) {#api-clients} If your package accesses a web API or another web resource, -- Make sure requests send an [user agent](https://httr2.r-lib.org/articles/wrapping-apis.html#user-agent), that is, a way to identify what (your package) or who sent the request. The users should be able to override the package's default user agent. Ideally the user agent should be different on continuous integration services, and in development (based on, for instance, the GitHub usernames of the developers). +- Make sure requests send an [user agent](https://httr2.r-lib.org/articles/wrapping-apis.html#user-agent), that is, a way to identify what (your package) or who sent the request. + The users should be able to override the package's default user agent. + Ideally the user agent should be different on continuous integration services, and in development (based on, for instance, the GitHub usernames of the developers). - You might choose different (better) defaults than the API, in which case you should document them. - Your package should help with pagination, by allowing the users to not worry about it at all since your package does all necessary requests. - Your package should help with rate limiting according to the API rules. @@ -88,9 +112,13 @@ For more information refer to the blog post [Why You Should (or Shouldn't) Build ## Code Style {#code-style} -- For more information on how to style your code, name functions, and R scripts inside the `R/` folder, we recommend reading the [code chapter in The R Packages book](https://r-pkgs.org/Code.html). 
We recommend the [`styler` package](https://github.com/r-lib/styler) for automating part of the code styling. We suggest reading the [Tidyverse style guide](https://style.tidyverse.org/). +- For more information on how to style your code, name functions, and R scripts inside the `R/` folder, we recommend reading the [code chapter in The R Packages book](https://r-pkgs.org/Code.html). + We recommend the [`styler` package](https://github.com/r-lib/styler) for automating part of the code styling. + We suggest reading the [Tidyverse style guide](https://style.tidyverse.org/). -- You can choose to use `=` over `<-` as long you are consistent with one choice within your package. We recommend avoiding the use of `->` for assignment within a package. If you do use `<-` throughout your package, and you also use `R6` in that package, you'll be forced to use `=` for assignment within your `R6Class` construction - this is not considered an inconsistency because you can't use `<-` in this case. +- You can choose to use `=` over `<-` as long you are consistent with one choice within your package. + We recommend avoiding the use of `->` for assignment within a package. + If you do use `<-` throughout your package, and you also use `R6` in that package, you'll be forced to use `=` for assignment within your `R6Class` construction - this is not considered an inconsistency because you can't use `<-` in this case. ## CITATION file {#citation-file} @@ -102,7 +130,8 @@ For more information refer to the blog post [Why You Should (or Shouldn't) Build - If one day [**after** review at rOpenSci](#authors-guide) you publish a software publication about your package, add it to the CITATION file. -- Less related to your package itself but to what supports it: if your package wraps a particular resource such as data source or, say, statistical algorithm, remind users of how to cite that resource via e.g. `citHeader()`. 
[Maybe even add the reference for the resource](https://discuss.ropensci.org/t/citation-of-original-article-when-implementing-specific-methods/2312). +- Less related to your package itself but to what supports it: if your package wraps a particular resource such as data source or, say, statistical algorithm, remind users of how to cite that resource via e.g. `citHeader()`. + [Maybe even add the reference for the resource](https://discuss.ropensci.org/t/citation-of-original-article-when-implementing-specific-methods/2312). As an example see [the dynamite CITATION file](https://github.com/ropensci/dynamite/blob/main/inst/CITATION) which refers to the R manual as well as other associated publications. @@ -143,22 +172,27 @@ bibentry( ) ``` -- You could also create and store a `CITATION.cff` thanks to the [cffr package](https://docs.ropensci.org/cffr/). It also provides a [GitHub Action workflow](https://docs.ropensci.org/cffr/reference/cff_gha_update.html) to keep the `CITATION.cff` file up-to-date. +- You could also create and store a `CITATION.cff` thanks to the [cffr package](https://docs.ropensci.org/cffr/). + It also provides a [GitHub Action workflow](https://docs.ropensci.org/cffr/reference/cff_gha_update.html) to keep the `CITATION.cff` file up-to-date. ## README {#readme} -- All packages should have a README file, named `README.md`, in the root of the repository. The README should include, from top to bottom: +- All packages should have a README file, named `README.md`, in the root of the repository. + The README should include, from top to bottom: - The package name. - Badges for continuous integration and test coverage, the badge for rOpenSci peer-review once it has started (see below), a repostatus.org badge, and any other badges (e.g. [R-universe](https://ropensci.org/blog/2021/10/14/runiverse-badges/)). - - Short description of goals of package (what does it do? 
why should a potential user care?), with descriptive links to all vignettes unless the package is small and there's only one vignette repeating the README. Please also ensure the vignettes are rendered and readable, see [the "documentation website" section](#website)). + - Short description of goals of package (what does it do? why should a potential user care?), with descriptive links to all vignettes unless the package is small and there's only one vignette repeating the README. + Please also ensure the vignettes are rendered and readable, see [the "documentation website" section](#website)). - Installation instructions using e.g. the [remotes package](https://remotes.r-lib.org/), [pak package](https://pak.r-lib.org/), or [R-universe](https://ropensci.org/blog/2021/06/22/setup-runiverse/). - Any additional setup required (authentication tokens, etc). - Brief demonstration usage. - If applicable, how the package compares to other similar packages and/or how it relates to other packages. - - Citation information i.e. Direct users to the preferred citation in the README by adding boilerplate text "here's how to cite my package". See e.g. [ecmwfr README](https://github.com/bluegreen-labs/ecmwfr#how-to-cite-this-package-in-your-article). + - Citation information i.e. Direct users to the preferred citation in the README by adding boilerplate text "here's how to cite my package". + See e.g. [ecmwfr README](https://github.com/bluegreen-labs/ecmwfr#how-to-cite-this-package-in-your-article). -If you use another repo status badge such as a [lifecycle](https://www.tidyverse.org/lifecycle/) badge, please also add a [repostatus.org](https://www.repostatus.org/) badge. [Example of a repo README with two repo status badges](https://github.com/ropensci/ijtiff#ijtiff-). +If you use another repo status badge such as a [lifecycle](https://www.tidyverse.org/lifecycle/) badge, please also add a [repostatus.org](https://www.repostatus.org/) badge. 
+[Example of a repo README with two repo status badges](https://github.com/ropensci/ijtiff#ijtiff-). - Once you have submitted a package and it has passed editor checks, add a peer-review badge via @@ -166,26 +200,31 @@ If you use another repo status badge such as a [lifecycle](https://www.tidyverse [![](https://badges.ropensci.org/<issue_id>_status.svg)](https://github.com/ropensci/software-review/issues/<issue_id>) ``` -where issue\_id is the number of the issue in the software-review repository. For instance, the badge for [`rtimicropem`](https://github.com/ropensci/rtimicropem) review uses the number 126 since it's the [review issue number](https://github.com/ropensci/software-review/issues/126). The badge will first indicated "under review" and then "peer-reviewed" once your package has been onboarded (issue labelled "approved" and closed), and will link to the review issue. +where issue\_id is the number of the issue in the software-review repository. +For instance, the badge for [`rtimicropem`](https://github.com/ropensci/rtimicropem) review uses the number 126 since it's the [review issue number](https://github.com/ropensci/software-review/issues/126). +The badge will first indicated "under review" and then "peer-reviewed" once your package has been onboarded (issue labelled "approved" and closed), and will link to the review issue. -- If your README has many badges consider ordering them in an html table to make it easier for newcomers to gather information at a glance. See examples in [`drake` repo](https://github.com/ropensci/drake) and in [`qualtRics` repo](https://github.com/ropensci/qualtRics/). Possible sections are +- If your README has many badges consider ordering them in an html table to make it easier for newcomers to gather information at a glance. + See examples in [`drake` repo](https://github.com/ropensci/drake) and in [`qualtRics` repo](https://github.com/ropensci/qualtRics/). 
+ Possible sections are - Development (CI statuses cf [CI chapter](#ci), Slack channel for discussion, repostatus) - Release/Published ([CRAN version and release date badges from METACRAN](https://www.r-pkg.org/services#badges), [CRAN checks API badge](https://github.com/r-hub/cchecksbadges), Zenodo badge) - Stats/Usage (downloads e.g. [download badges from r-hub/cranlogs](https://github.com/r-hub/cranlogs.app#badges)) The table should be more wide than it is long in order to mask the rest of the README. -- If your package connects to a data source or online service, or wraps other software, consider that your package README may be the first point of entry for users. It should provide enough information for users to understand the nature of the data, service, or software, and provide links to other relevant data and documentation. For instance, - a README should not merely read, "Provides access to GooberDB," but also include, - "..., an online repository of Goober sightings in South America. More - information about GooberDB, and documentation of database structure and metadata - can be found at *link*". +- If your package connects to a data source or online service, or wraps other software, consider that your package README may be the first point of entry for users. + It should provide enough information for users to understand the nature of the data, service, or software, and provide links to other relevant data and documentation. + For instance, a README should not merely read, "Provides access to GooberDB," but also include, "..., an online repository of Goober sightings in South America. + More information about GooberDB, and documentation of database structure and metadata can be found at *link*". -- We recommend not creating `README.md` directly, but from a `README.Rmd` file (an R Markdown file) if you have any demonstration code. The advantage of the `.Rmd` file is you can combine text with code that can be easily updated whenever your package is updated. 
+- We recommend not creating `README.md` directly, but from a `README.Rmd` file (an R Markdown file) if you have any demonstration code. + The advantage of the `.Rmd` file is you can combine text with code that can be easily updated whenever your package is updated. - Consider using `usethis::use_readme_rmd()` to get a template for a `README.Rmd` file and to automatically set up a pre-commit hook to ensure that `README.md` is always newer than `README.Rmd`. -- Extensive examples should be kept for a vignette. If you want to make the vignettes more accessible before installing the package, we suggest [creating a website for your package](#website). +- Extensive examples should be kept for a vignette. + If you want to make the vignettes more accessible before installing the package, we suggest [creating a website for your package](#website). - Add a [code of conduct and contribution guidelines](#friendlyfiles). @@ -197,13 +236,21 @@ where issue\_id is the number of the issue in the software-review repository. Fo - All exported package functions should be fully documented with examples. -- If there is potential overlap or confusion with other packages providing similar functionality or having a similar name, add a note in the README, main vignette and potentially the Description field of DESCRIPTION. Examples in [rtweet README](https://docs.ropensci.org/rtweet/), [rebird README](https://docs.ropensci.org/rebird/#auk-vs-rebird), and the non-rOpensci package [slurmR](https://uscbiostats.github.io/slurmR/index.html#vs). +- If there is potential overlap or confusion with other packages providing similar functionality or having a similar name, add a note in the README, main vignette and potentially the Description field of DESCRIPTION. + Examples in [rtweet README](https://docs.ropensci.org/rtweet/), [rebird README](https://docs.ropensci.org/rebird/#auk-vs-rebird), and the non-rOpensci package [slurmR](https://uscbiostats.github.io/slurmR/index.html#vs). 
-- The package should contain top-level documentation for `?foobar`, (or ``?`foobar-package` `` if there is a naming conflict). Optionally, you can use both `?foobar` and ``?`foobar-package` `` for the package level manual file, using `@aliases` roxygen tag. [`usethis::use_package_doc()`](https://usethis.r-lib.org/reference/use_package_doc.html) adds the template for the top-level documentation. +- The package should contain top-level documentation for `?foobar`, (or ``?`foobar-package` `` if there is a naming conflict). + Optionally, you can use both `?foobar` and ``?`foobar-package` `` for the package level manual file, using `@aliases` roxygen tag. + [`usethis::use_package_doc()`](https://usethis.r-lib.org/reference/use_package_doc.html) adds the template for the top-level documentation. -- The package should contain at least one **HTML** vignette providing a substantial coverage of package functions, illustrating realistic use cases and how functions are intended to interact. If the package is small, the vignette and the README may have very similar content. +- The package should contain at least one **HTML** vignette providing a substantial coverage of package functions, illustrating realistic use cases and how functions are intended to interact. + If the package is small, the vignette and the README may have very similar content. -- As is the case for a README, top-level documentation or vignettes may be the first point of entry for users. If your package connects to a data source or online service, or wraps other software, it should provide enough information for users to understand the nature of the data, service, or software, and provide links to other relevant data and documentation. For instance, a vignette intro or documentation should not merely read, "Provides access to GooberDB," but also include, "..., an online repository of Goober sightings in South America. 
More information about GooberDB, and documentation of database structure and metadata can be found at *link*". Any vignette should outline prerequisite knowledge to be able to understand the vignette upfront. +- As is the case for a README, top-level documentation or vignettes may be the first point of entry for users. + If your package connects to a data source or online service, or wraps other software, it should provide enough information for users to understand the nature of the data, service, or software, and provide links to other relevant data and documentation. + For instance, a vignette intro or documentation should not merely read, "Provides access to GooberDB," but also include, "..., an online repository of Goober sightings in South America. + More information about GooberDB, and documentation of database structure and metadata can be found at *link*". + Any vignette should outline prerequisite knowledge to be able to understand the vignette upfront. The general vignette should present a series of examples progressing in complexity from basic to advanced usage. @@ -215,13 +262,18 @@ The general vignette should present a series of examples progressing in complexi - If your package provides access to a data source, we require that DESCRIPTION contains both (1) A brief identification and/or description of the organisation responsible for issuing data; and (2) The URL linking to public-facing page providing, describing, or enabling data access (which may often differ from URL leading directly to data source). -- Only use package startup messages when necessary (function masking for instance). Avoid package startup messages like "This is foobar 2.4-0" or citation guidance because they can be annoying to the user. Rely on documentation for such guidance. +- Only use package startup messages when necessary (function masking for instance). + Avoid package startup messages like "This is foobar 2.4-0" or citation guidance because they can be annoying to the user. 
+ Rely on documentation for such guidance. - You can choose to have a README section about use cases of your package (other packages, blog posts, etc.), [example](https://github.com/ropensci/vcr#example-packages-using-vcr). ### roxygen2 use {#roxygen-2-use} -- We request all submissions to use [roxygen2](https://roxygen2.r-lib.org/) for documentation. roxygen2 is an R package that compiles `.Rd` files to your `man` folder in your package from tags written above each function. roxygen2 has [support for Markdown syntax](https://roxygen2.r-lib.org/articles/rd-formatting.html). One key advantage of using roxygen2 is that your `NAMESPACE` will always be automatically generated and up to date. +- We request all submissions to use [roxygen2](https://roxygen2.r-lib.org/) for documentation. + roxygen2 is an R package that compiles `.Rd` files to your `man` folder in your package from tags written above each function. + roxygen2 has [support for Markdown syntax](https://roxygen2.r-lib.org/articles/rd-formatting.html). + One key advantage of using roxygen2 is that your `NAMESPACE` will always be automatically generated and up to date. - More information on using roxygen2 documentation is available in the [R packages book](https://r-pkgs.org/man.html) and in [roxygen2 website itself](https://roxygen2.r-lib.org/). @@ -229,7 +281,9 @@ The general vignette should present a series of examples progressing in complexi - All functions should document the type of object returned under the `@return` heading. -- The default value for each parameter should be clearly documented. For example, instead of writing `A logical value determining if ...`, you should write ``A logical value (default `TRUE`) determining if ...``. It is also good practice to indicate the default values directly in your function definition: +- The default value for each parameter should be clearly documented. 
+ For example, instead of writing `A logical value determining if ...`, you should write ``A logical value (default `TRUE`) determining if ...``. + It is also good practice to indicate the default values directly in your function definition: ```{r, eval=FALSE} f <- function(a = TRUE) { @@ -237,42 +291,59 @@ f <- function(a = TRUE) { } ``` -- Documentation should support user navigation by including useful [cross-links](https://roxygen2.r-lib.org/reference/tags-index-crossref.html) between related functions and documenting related functions together in groups or in common help pages. In particular, the `@family` tags, that automatically creates "See also" links and [can help group](https://pkgdown.r-lib.org/reference/build_reference.html) functions together on pkgdown sites, is recommended for this purpose. See [the "manual" section of The R Packages book](https://r-pkgs.org/man.html) and [the "function grouping" section of the present chapter](#function-grouping) for more details. +- Documentation should support user navigation by including useful [cross-links](https://roxygen2.r-lib.org/reference/tags-index-crossref.html) between related functions and documenting related functions together in groups or in common help pages. + In particular, the `@family` tags, that automatically creates "See also" links and [can help group](https://pkgdown.r-lib.org/reference/build_reference.html) functions together on pkgdown sites, is recommended for this purpose. + See [the "manual" section of The R Packages book](https://r-pkgs.org/man.html) and [the "function grouping" section of the present chapter](#function-grouping) for more details. -- You can re-use documentation pieces (e.g. details about authentication, related packages) across the vignettes/README/man pages. Refer to [roxygen2 vignette on documentation reuse](https://roxygen2.r-lib.org/articles/reuse.html). +- You can re-use documentation pieces (e.g. 
details about authentication, related packages) across the vignettes/README/man pages. + Refer to [roxygen2 vignette on documentation reuse](https://roxygen2.r-lib.org/articles/reuse.html). -- For including examples, you can use the classic `@examples` tag (plural "examples") but also the `@example ` tag (singular "example") for storing the example code in a separate R script (ideally under `man/`), and the `@exampleIf` tag for running examples conditionally and avoiding R CMD check failures. Refer to [roxygen2 documentation about examples](https://roxygen2.r-lib.org/articles/rd.html#examples). +- For including examples, you can use the classic `@examples` tag (plural "examples") but also the `@example ` tag (singular "example") for storing the example code in a separate R script (ideally under `man/`), and the `@exampleIf` tag for running examples conditionally and avoiding R CMD check failures. + Refer to [roxygen2 documentation about examples](https://roxygen2.r-lib.org/articles/rd.html#examples). -- Add `#' @noRd` to internal functions. You might be interested in the [devtag experimental package](https://github.com/moodymudskipper/devtag) for getting local manual pages when using `#' @noRd`. +- Add `#' @noRd` to internal functions. + You might be interested in the [devtag experimental package](https://github.com/moodymudskipper/devtag) for getting local manual pages when using `#' @noRd`. -- Starting from roxygen2 version 7.0.0, `R6` classes are officially supported. See the [roxygen2 docs](https://roxygen2.r-lib.org/articles/rd-other.html#r6) for details on how to document `R6` classes. +- Starting from roxygen2 version 7.0.0, `R6` classes are officially supported. + See the [roxygen2 docs](https://roxygen2.r-lib.org/articles/rd-other.html#r6) for details on how to document `R6` classes. ### URLs in documentation {#ur-ls-in-documentation} This subsection is particularly relevant to authors wishing to submit their package to CRAN. 
CRAN will check URLs in your documentation and does not allow redirect status codes such as 301. You can use the [urlchecker](https://github.com/r-lib/urlchecker) package to reproduce these checks and, in particular, replace URLs with the URLs they redirect to. -Others have used the option to escape some URLs (change `<https://ropensci.org/>` to `https://ropensci.org/`, or `\url{https://ropensci.org/}` to `https://ropensci.org/`.), but if you do so, you will need to implement some sort of URL checking yourself to prevent them from getting broken without your noticing. Furthermore, links would not be clickable from local docs. +Others have used the option to escape some URLs (change `<https://ropensci.org/>` to `https://ropensci.org/`, or `\url{https://ropensci.org/}` to `https://ropensci.org/`.), but if you do so, you will need to implement some sort of URL checking yourself to prevent them from getting broken without your noticing. +Furthermore, links would not be clickable from local docs. ## Documentation website {#website} -We recommend creating a documentation website for your package using [`pkgdown`](https://github.com/r-lib/pkgdown). The R packages book features a [chapter on pkgdown](https://r-pkgs.org/website.html), and of course `pkgdown` has [its own documentation website](https://pkgdown.r-lib.org/). +We recommend creating a documentation website for your package using [`pkgdown`](https://github.com/r-lib/pkgdown). +The R packages book features a [chapter on pkgdown](https://r-pkgs.org/website.html), and of course `pkgdown` has [its own documentation website](https://pkgdown.r-lib.org/). There are a few elements we'd like to underline here. ### Automatic deployment of the documentation website {#docsropensci} -You only need to worry about automatic deployment of your website until approval and transfer of your package repo to the ropensci organization; indeed, after that a pkgdown website will be built for your package after each push to the GitHub repo. 
You can find the status of these builds at `https://dev.ropensci.org/job/package_name`, e.g. [for `magick`](https://dev.ropensci.org/job/magick); and the website at `https://docs.ropensci.org/package_name`, e.g. [for `magick`](https://docs.ropensci.org/magick). The website build will use your pkgdown config file if you have one, except for the styling that will use the [`rotemplate` package](https://github.com/ropensci-org/rotemplate/). The resulting website will have a local search bar. Please report bugs, questions and feature requests about the central builds at [https://github.com/ropensci/docs/](https://github.com/ropensci/docs/) and about the template at [https://github.com/ropensci/rotemplate/](https://github.com/ropensci/rotemplate/). +You only need to worry about automatic deployment of your website until approval and transfer of your package repo to the ropensci organization; indeed, after that a pkgdown website will be built for your package after each push to the GitHub repo. +You can find the status of these builds at `https://dev.ropensci.org/job/package_name`, e.g. [for `magick`](https://dev.ropensci.org/job/magick); and the website at `https://docs.ropensci.org/package_name`, e.g. [for `magick`](https://docs.ropensci.org/magick). +The website build will use your pkgdown config file if you have one, except for the styling that will use the [`rotemplate` package](https://github.com/ropensci-org/rotemplate/). +The resulting website will have a local search bar. +Please report bugs, questions and feature requests about the central builds at [https://github.com/ropensci/docs/](https://github.com/ropensci/docs/) and about the template at [https://github.com/ropensci/rotemplate/](https://github.com/ropensci/rotemplate/). *If your package vignettes need credentials (API keys, tokens, etc.) 
to knit, you might want to [precompute them](https://ropensci.org/technotes/2019/12/08/precompute-vignettes/) since credentials cannot be used on the docs server.* -Before submission and before transfer, you could use the [approach documented by `pkgdown`](https://pkgdown.r-lib.org/reference/deploy_site_github.html) or the [`tic` package](https://docs.ropensci.org/tic/) for automatic deployment of the package's website. This would save you the hassle of running (and remembering to run) `pkgdown::build_site()` yourself every time the site needs to be updated. First refer to our [chapter on continuous integration](#ci) if you're not familiar with continuous integration. In any case, do not forget to update all occurrences of the website URL after transfer to the ropensci organization. +Before submission and before transfer, you could use the [approach documented by `pkgdown`](https://pkgdown.r-lib.org/reference/deploy_site_github.html) or the [`tic` package](https://docs.ropensci.org/tic/) for automatic deployment of the package's website. +This would save you the hassle of running (and remembering to run) `pkgdown::build_site()` yourself every time the site needs to be updated. +First refer to our [chapter on continuous integration](#ci) if you're not familiar with continuous integration. +In any case, do not forget to update all occurrences of the website URL after transfer to the ropensci organization. ### Grouping functions in the reference {#function-grouping} When your package has many functions, use grouping in the reference, which you can do more or less automatically. -If you use roxygen2 above version 6.1.1, you should use the `@family` tag in your functions documentation to indicate grouping. This will give you links between functions in the local documentation of the installed package ("See also" section) *and* allow you to use the `pkgdown` `has_concept` function in the config file of your website. 
Non-rOpenSci example courtesy of [`optiRum`](https://github.com/lockedata/optiRum): [family tag](https://github.com/lockedata/optiRum/blob/master/R/APR.R#L17), [`pkgdown` config file](https://github.com/lockedata/optiRum/blob/master/_pkgdown.yml) and [resulting reference section](https://itsalocke.com/optirum/reference/). +If you use roxygen2 above version 6.1.1, you should use the `@family` tag in your functions documentation to indicate grouping. +This will give you links between functions in the local documentation of the installed package ("See also" section) *and* allow you to use the `pkgdown` `has_concept` function in the config file of your website. +Non-rOpenSci example courtesy of [`optiRum`](https://github.com/lockedata/optiRum): [family tag](https://github.com/lockedata/optiRum/blob/master/R/APR.R#L17), [`pkgdown` config file](https://github.com/lockedata/optiRum/blob/master/_pkgdown.yml) and [resulting reference section](https://itsalocke.com/optirum/reference/). To customize the text of the cross-reference title created by roxygen2 (`Other {family}:`), refer to [roxygen2 docs regarding how to provide a `rd_family_title` list in `man/roxygen/meta.R`](https://roxygen2.r-lib.org/articles/rd.html#cross-references). Less automatically, see the example of [`drake` website](https://docs.ropensci.org/drake/) and [associated config file @@ -280,15 +351,17 @@ Less automatically, see the example of [`drake` website](https://docs.ropensci.o ### Branding of authors {#branding-of-authors} -You can make the names of (some) authors clickable by adding their URL, and you can even replace their names with a logo (think rOpenSci... or your organisation/company!). See [`pkgdown` documentation](https://pkgdown.r-lib.org/reference/build_home.html?q=authors#yaml-config-authors). +You can make the names of (some) authors clickable by adding their URL, and you can even replace their names with a logo (think rOpenSci... or your organisation/company!). 
+See [`pkgdown` documentation](https://pkgdown.r-lib.org/reference/build_home.html?q=authors#yaml-config-authors). ### Tweaking the navbar {#tweaking-the-navbar} -You can make your website content easier to browse by tweaking the navbar, refer to [`pkgdown` documentation](https://pkgdown.r-lib.org/articles/pkgdown.html#navigation-bar). In particular, note that if you name the main vignette of your package "pkg-name.Rmd", it'll be accessible from the navbar as a `Get started` link instead of via `Articles > Vignette Title`. +You can make your website content easier to browse by tweaking the navbar, refer to [`pkgdown` documentation](https://pkgdown.r-lib.org/articles/pkgdown.html#navigation-bar). +In particular, note that if you name the main vignette of your package "pkg-name.Rmd", it'll be accessible from the navbar as a `Get started` link instead of via `Articles > Vignette Title`. ### Math rendering {#mathjax} -Please refer to [pkgdown documentation](https://pkgdown.r-lib.org/dev/articles/customise.html#math-rendering). +Please refer to [pkgdown documentation](https://pkgdown.r-lib.org/dev/articles/customise.html#math-rendering). Our template is compatible with this configuration. ### Package logo {#package-logo} @@ -298,22 +371,30 @@ If your package doesn't have any logo, the [rOpenSci docs builder](#docsropensci ## Authorship {#authorship} -The `DESCRIPTION` file of a package should list package authors and contributors to a package, using the `Authors@R` syntax to indicate their roles (author/creator/contributor etc.) if there is more than one author, and using the comment field to indicate the ORCID ID of each author, if they have one (cf [this post](https://ropensci.org/technotes/2018/10/08/orcid/)). See [this section of "Writing R Extensions"](https://cran.rstudio.com/doc/manuals/r-release/R-exts.html#The-DESCRIPTION-file) for details. 
If you feel that your reviewers have made a substantial contribution to the development of your package, you may list them in the `Authors@R` field with a Reviewer contributor type (`"rev"`), like so: +The `DESCRIPTION` file of a package should list package authors and contributors to a package, using the `Authors@R` syntax to indicate their roles (author/creator/contributor etc.) if there is more than one author, and using the comment field to indicate the ORCID ID of each author, if they have one (cf [this post](https://ropensci.org/technotes/2018/10/08/orcid/)). +See [this section of "Writing R Extensions"](https://cran.rstudio.com/doc/manuals/r-release/R-exts.html#The-DESCRIPTION-file) for details. +If you feel that your reviewers have made a substantial contribution to the development of your package, you may list them in the `Authors@R` field with a Reviewer contributor type (`"rev"`), like so: ``` person("Bea", "Hernández", role = "rev", comment = "Bea reviewed the package (v. X.X.XX) for rOpenSci, see "), ``` -Only include reviewers after asking for their consent. Read more in this blog post ["Thanking Your Reviewers: Gratitude through Semantic Metadata"](https://ropensci.org/blog/2018/03/16/thanking-reviewers-in-metadata/). Please do not list editors as contributors. Your participation in and contribution to rOpenSci is thanks enough! +Only include reviewers after asking for their consent. +Read more in this blog post ["Thanking Your Reviewers: Gratitude through Semantic Metadata"](https://ropensci.org/blog/2018/03/16/thanking-reviewers-in-metadata/). +Please do not list editors as contributors. +Your participation in and contribution to rOpenSci is thanks enough! ### Authorship of included code {#authorship-included-code} -Many packages include code from other software. 
Whether entire files or single functions are included from other packages, rOpenSci packages should follow [the CRAN *Repository Policy*](https://cran.r-project.org/web/packages/policies.html): +Many packages include code from other software. +Whether entire files or single functions are included from other packages, rOpenSci packages should follow [the CRAN *Repository Policy*](https://cran.r-project.org/web/packages/policies.html): -> The ownership of copyright and intellectual property rights of all components of the package must be clear and unambiguous (including from the authors specification in the DESCRIPTION file). Where code is copied (or derived) from the work of others (including from R itself), care must be taken that any copyright/license statements are preserved and authorship is not misrepresented. +> The ownership of copyright and intellectual property rights of all components of the package must be clear and unambiguous (including from the authors specification in the DESCRIPTION file). +> Where code is copied (or derived) from the work of others (including from R itself), care must be taken that any copyright/license statements are preserved and authorship is not misrepresented. > -> Preferably, an ‘Authors@R' field would be used with ‘ctb' roles for the authors of such code. Alternatively, the ‘Author' field should list these authors as contributors. +> Preferably, an ‘Authors@R' field would be used with ‘ctb' roles for the authors of such code. +> Alternatively, the ‘Author' field should list these authors as contributors. > > Where copyrights are held by an entity other than the package authors, this should preferably be indicated via ‘cph' roles in the ‘Authors@R' field, or using a ‘Copyright' field (if necessary referring to an inst/COPYRIGHTS file). > @@ -328,26 +409,37 @@ For more explanations around licensing, refer to the [R packages book](https://r - All packages should pass `R CMD check`/`devtools::check()` on all major platforms. 
-- All packages should have a test suite that covers major functionality of the package. The tests should also cover the behavior of the package in case of errors. +- All packages should have a test suite that covers major functionality of the package. + The tests should also cover the behavior of the package in case of errors. -- It is good practice to write unit tests for all functions, and all package code in general, ensuring key functionality is covered. Test coverage below 75% will likely require additional tests or explanation before being sent for review. +- It is good practice to write unit tests for all functions, and all package code in general, ensuring key functionality is covered. + Test coverage below 75% will likely require additional tests or explanation before being sent for review. -- We recommend using [testthat](https://testthat.r-lib.org/) for writing tests. Strive to write tests as you write each new function. This serves the obvious need to have proper testing for the package, but allows you to think about various ways in which a function can fail, and to *defensively* code against those. [More information](https://r-pkgs.org/tests.html). +- We recommend using [testthat](https://testthat.r-lib.org/) for writing tests. + Strive to write tests as you write each new function. + This serves the obvious need to have proper testing for the package, but allows you to think about various ways in which a function can fail, and to *defensively* code against those. + [More information](https://r-pkgs.org/tests.html). -- Tests should be easy to understand. We suggest reading the blog post [*"Why Good Developers Write Bad Unit Tests"*](https://mtlynch.io/good-developers-bad-tests/) by Michael Lynch. +- Tests should be easy to understand. + We suggest reading the blog post [*"Why Good Developers Write Bad Unit Tests"*](https://mtlynch.io/good-developers-bad-tests/) by Michael Lynch. 
- Packages with Shiny apps should use a unit-testing framework such as [`shinytest2`](https://rstudio.github.io/shinytest2/) or [`shinytest`](https://rstudio.github.io/shinytest/articles/shinytest.html) to test that interactive interfaces behave as expected. - For testing your functions creating plots, we suggest using [vdiffr](https://vdiffr.r-lib.org/), an extension of the testthat package that relies on [testthat snapshot tests](https://testthat.r-lib.org/articles/snapshotting.html). -- If your package interacts with web resources (web APIs and other sources of data on the web) you might find the [HTTP testing in R book by Scott Chamberlain and Maëlle Salmon](https://books.ropensci.org/http-testing/) relevant. Packages helping with HTTP testing (corresponding HTTP clients): +- If your package interacts with web resources (web APIs and other sources of data on the web) you might find the [HTTP testing in R book by Scott Chamberlain and Maëlle Salmon](https://books.ropensci.org/http-testing/) relevant. + Packages helping with HTTP testing (corresponding HTTP clients): - [httptest2](https://enpiar.com/httptest2/) ([httr2](https://httr2.r-lib.org/)); - [httptest](https://enpiar.com/r/httptest/) ([httr](https://httr.r-lib.org/)); - [vcr](https://docs.ropensci.org/vcr/) ([httr](https://httr.r-lib.org/), [crul](https://docs.ropensci.org/crul)); - [webfakes](https://webfakes.r-lib.org/) ([httr](https://httr.r-lib.org/), [httr2](https://httr2.r-lib.org/), [crul](https://docs.ropensci.org/crul), [curl](https://jeroen.r-universe.dev/curl#)). -- testthat has a function `skip_on_cran()` that you can use to not run tests on CRAN. We recommend using this on all functions that are API calls since they are quite likely to fail on CRAN. These tests should still run on continuous integration. Note that from testthat 3.1.2 `skip_if_offline()` automatically calls `skip_on_cran()`. 
More info on [CRAN preparedness for API wrappers](https://books.ropensci.org/http-testing/cran-preparedness.html). +- testthat has a function `skip_on_cran()` that you can use to not run tests on CRAN. + We recommend using this on all functions that are API calls since they are quite likely to fail on CRAN. + These tests should still run on continuous integration. + Note that from testthat 3.1.2 `skip_if_offline()` automatically calls `skip_on_cran()`. + More info on [CRAN preparedness for API wrappers](https://books.ropensci.org/http-testing/cran-preparedness.html). - If your package interacts with a database you might find [dittodb](https://docs.ropensci.org/dittodb) useful. @@ -357,36 +449,37 @@ For more explanations around licensing, refer to the [R packages book](https://r ## Examples {#examples} -- Include extensive examples in the documentation. In addition to demonstrating how to use the package, these can act as an easy way to test package functionality before there are proper tests. However, keep in mind we require tests in contributed packages. +- Include extensive examples in the documentation. + In addition to demonstrating how to use the package, these can act as an easy way to test package functionality before there are proper tests. + However, keep in mind we require tests in contributed packages. -- You can run examples with `devtools::run_examples()`. Note that when you run R CMD CHECK or equivalent (e.g., `devtools::check()`) your examples that are not wrapped in `\dontrun{}` or `\donttest{}` are run. Refer to the [summary table](https://roxygen2.r-lib.org/articles/rd.html#functions) in roxygen2 docs. +- You can run examples with `devtools::run_examples()`. + Note that when you run R CMD CHECK or equivalent (e.g., `devtools::check()`) your examples that are not wrapped in `\dontrun{}` or `\donttest{}` are run. + Refer to the [summary table](https://roxygen2.r-lib.org/articles/rd.html#functions) in roxygen2 docs. -- To safe-guard examples (e.g. 
requiring authentication) to be run on CRAN you need to use `\dontrun{}`. However, for a first submission CRAN won't let you have all examples escaped so. In this case you might add some small toy examples, or wrap example code in `try()`. Also refer to the `@exampleIf` tag present, at the time of writing, in roxygen2 development version. +- To safe-guard examples (e.g. requiring authentication) to be run on CRAN you need to use `\dontrun{}`. + However, for a first submission CRAN won't let you have all examples escaped so. + In this case you might add some small toy examples, or wrap example code in `try()`. + Also refer to the `@exampleIf` tag present, at the time of writing, in roxygen2 development version. -- In addition to running examples locally on your own computer, we strongly advise that you run examples on one of the [continuous integration systems](#ci). Again, examples that are not wrapped in `\dontrun{}` or `\donttest{}` will be run, but for those that are you can configure your continuous integration builds to run them via R CMD check arguments `--run-dontrun` and/or `--run-donttest`. +- In addition to running examples locally on your own computer, we strongly advise that you run examples on one of the [continuous integration systems](#ci). + Again, examples that are not wrapped in `\dontrun{}` or `\donttest{}` will be run, but for those that are you can configure your continuous integration builds to run them via R CMD check arguments `--run-dontrun` and/or `--run-donttest`. ## Package dependencies {#pkgdependencies} -- Consider the trade-offs involved in relying on a package as a dependency. On one hand, - using dependencies reduces coding effort, and can build on useful functionality developed by - others, especially if the dependency performs complex tasks, is high-performance, - and/or is well vetted and tested. 
On the other hand, having many dependencies - places a burden on the maintainer to keep up with changes in those packages, at risk - to your package's long-term sustainability. It also - increases installation time and size, primarily a consideration on your and others' development cycle, and in automated build systems. "Heavy" packages - those with many dependencies themselves, and those with large amounts of compiled code - increase this cost. Here are some approaches to reducing - dependencies: +- Consider the trade-offs involved in relying on a package as a dependency. + On one hand, using dependencies reduces coding effort, and can build on useful functionality developed by others, especially if the dependency performs complex tasks, is high-performance, and/or is well vetted and tested. + On the other hand, having many dependencies places a burden on the maintainer to keep up with changes in those packages, at risk to your package's long-term sustainability. + It also increases installation time and size, primarily a consideration on your and others' development cycle, and in automated build systems. + "Heavy" packages - those with many dependencies themselves, and those with large amounts of compiled code - increase this cost. + Here are some approaches to reducing dependencies: - - Small, simple functions from a dependency package may be better copied into - your own package if the dependency if you are using only a few functions - in an otherwise large or heavy dependency. (See [*Authorship* section - above](#authorship-included-code) for how to acknowledge original authors - of copied code.) On the other hand, complex functions with many edge - cases (e.g. parsers) require considerable testing and vetting. + - Small, simple functions from a dependency package may be better copied into your own package if the dependency if you are using only a few functions in an otherwise large or heavy dependency. 
+ (See [*Authorship* section above](#authorship-included-code) for how to acknowledge original authors of copied code.) + On the other hand, complex functions with many edge cases (e.g. parsers) require considerable testing and vetting. - - An common example of this is in returning tidyverse-style "tibbles" from package - functions that provide data. - One can avoid the modestly heavy **tibble** package dependency by returning - a tibble created by modifying a data frame like so: + - An common example of this is in returning tidyverse-style "tibbles" from package functions that provide data. + One can avoid the modestly heavy **tibble** package dependency by returning a tibble created by modifying a data frame like so: ``` class(df) <- c("tbl_df", "tbl", "data.frame") @@ -394,22 +487,22 @@ For more explanations around licensing, refer to the [R packages book](https://r (Note that this approach is [not universally endorsed](https://twitter.com/krlmlr/status/1067856118385381377).) - - Ensure that you are using the package where the function is defined, - rather than one where it is re-exported. For instance many functions in **devtools** can be found in smaller specialty packages such as **sessioninfo**. The `%>%` function - should be imported from **magrittr**, where it is defined, rather than the heavier - **dplyr**, which re-exports it. + - Ensure that you are using the package where the function is defined, rather than one where it is re-exported. + For instance many functions in **devtools** can be found in smaller specialty packages such as **sessioninfo**. + The `%>%` function should be imported from **magrittr**, where it is defined, rather than the heavier **dplyr**, which re-exports it. - - Some dependencies are preferred because they provide easier to interpret - function names and syntax than base R solutions. 
If this is the primary - reason for using a function in a heavy dependency, consider wrapping - the base R approach in a nicely-named internal function in your package. See e.g. the [rlang R script providing functions with a syntax similar to purrr functions](https://github.com/r-lib/rlang/blob/9b50b7a86698332820155c268ad15bc1ed71cc03/R/standalone-purrr.R). + - Some dependencies are preferred because they provide easier to interpret function names and syntax than base R solutions. + If this is the primary reason for using a function in a heavy dependency, consider wrapping the base R approach in a nicely-named internal function in your package. + See e.g. the [rlang R script providing functions with a syntax similar to purrr functions](https://github.com/r-lib/rlang/blob/9b50b7a86698332820155c268ad15bc1ed71cc03/R/standalone-purrr.R). - If dependencies have overlapping functionality, see if you can rely on only one. - More dependency-management tips can be found in the chapter ["Dependencies: Mindset and Background" of the R packages book](https://r-pkgs.org/dependencies-mindset-background.html) and in a [post by Scott Chamberlain](https://recology.info/2018/10/limiting-dependencies/). -- Use `Imports` instead of `Depends` for packages providing functions from other packages. Make sure to list packages used for testing (`testthat`), and documentation (`knitr`, roxygen2) in your `Suggests` section of package dependencies (if you use `usethis` for adding testing infrastructure via [`usethis::use_testthat()`](https://usethis.r-lib.org/reference/use_testthat.html) or a vignette via [usethis::use\_vignette()](https://usethis.r-lib.org/reference/use_vignette.html), the necessary packages will be added to `DESCRIPTION`). If you use any package in the examples or tests of your package, make sure to list it in `Suggests`, if not already listed in `Imports`. +- Use `Imports` instead of `Depends` for packages providing functions from other packages. 
+ Make sure to list packages used for testing (`testthat`), and documentation (`knitr`, roxygen2) in your `Suggests` section of package dependencies (if you use `usethis` for adding testing infrastructure via [`usethis::use_testthat()`](https://usethis.r-lib.org/reference/use_testthat.html) or a vignette via [usethis::use\_vignette()](https://usethis.r-lib.org/reference/use_vignette.html), the necessary packages will be added to `DESCRIPTION`). + If you use any package in the examples or tests of your package, make sure to list it in `Suggests`, if not already listed in `Imports`. - If your (not Bioconductor) package depends on Bioconductor packages, make sure the installation instructions in the README and vignette are clear enough even for an user who is not familiar with the Bioconductor release cycle. @@ -419,11 +512,15 @@ For more explanations around licensing, refer to the [R packages book](https://r - If your package depends on Bioconductor after a certain version, mention it in DESCRIPTION and in the installation instructions. -- Specifying minimum dependencies (e.g. `glue (>= 1.3.0)` instead of just `glue`) should be a conscious choice. If you know for a fact that your package will break below a certain dependency version, specify it explicitly. - But if you don't, then no need to specify a minimum dependency. In that case when a user reports a bug which is explicitly related to an older version of a dependency then address it then. - An example of bad practice would be for a developer to consider the versions of their current state of dependencies to be the minimal version. That would needlessly force everyone to upgrade (causing issues with other packages) when there is no good reason behind that version choice. +- Specifying minimum dependencies (e.g. `glue (>= 1.3.0)` instead of just `glue`) should be a conscious choice. + If you know for a fact that your package will break below a certain dependency version, specify it explicitly. 
+ But if you don't, then no need to specify a minimum dependency. + In that case when a user reports a bug which is explicitly related to an older version of a dependency then address it then. + An example of bad practice would be for a developer to consider the versions of their current state of dependencies to be the minimal version. + That would needlessly force everyone to upgrade (causing issues with other packages) when there is no good reason behind that version choice. -- For most cases where you must expose functions from dependencies to the user, you should import and re-export those individual functions rather than listing them in the `Depends` fields. For instance, if functions in your package produce `raster` objects, you might re-export only printing and plotting functions from the **raster** package. +- For most cases where you must expose functions from dependencies to the user, you should import and re-export those individual functions rather than listing them in the `Depends` fields. + For instance, if functions in your package produce `raster` objects, you might re-export only printing and plotting functions from the **raster** package. - If your package uses a *system* dependency, you should @@ -432,27 +529,37 @@ For more explanations around licensing, refer to the [R packages book](https://r - Check that it is listed by [`sysreqsdb`](https://github.com/r-hub/sysreqsdb#sysreqs) to allow automatic tools to install it, and [submit a contribution](https://github.com/r-hub/sysreqsdb#contributing) if not; - Check for it in a `configure` script ([example](https://github.com/ropensci/magick/blob/c116b2b8505f491db72a139b61cd543b7a2ce873/DESCRIPTION#L19)) and give a helpful error message if it cannot be found ([example](https://github.com/cran/webp/blob/master/configure)). - `configure` scripts can be challenging as they often require hacky solutions - to make diverse system dependencies work across systems. 
Use examples ([more here](https://github.com/search?q=org%3Acran+anticonf&type=Code)) as a starting point but note that it is common to encounter bugs and edge cases and often violate CRAN policies. Do not hesitate to [ask for help on our forum](https://discuss.ropensci.org/). + `configure` scripts can be challenging as they often require hacky solutions to make diverse system dependencies work across systems. + Use examples ([more here](https://github.com/search?q=org%3Acran+anticonf&type=Code)) as a starting point but note that it is common to encounter bugs and edge cases and often violate CRAN policies. + Do not hesitate to [ask for help on our forum](https://discuss.ropensci.org/). ## Recommended scaffolding {#recommended-scaffolding} -- For HTTP requests we recommend using [httr2](https://httr2.r-lib.org), [httr](https://httr.r-lib.org), [curl](https://jeroen.r-universe.dev/curl#), or [crul](http://docs.ropensci.org/crul/) over [RCurl](https://cran.rstudio.com/web/packages/RCurl/). If you like low level clients for HTTP, curl is best, whereas httr2, httr and crul are better for higher level access. +- For HTTP requests we recommend using [httr2](https://httr2.r-lib.org), [httr](https://httr.r-lib.org), [curl](https://jeroen.r-universe.dev/curl#), or [crul](http://docs.ropensci.org/crul/) over [RCurl](https://cran.rstudio.com/web/packages/RCurl/). + If you like low level clients for HTTP, curl is best, whereas httr2, httr and crul are better for higher level access. - For parsing JSON, use [jsonlite](https://github.com/jeroen/jsonlite) instead of [rjson](https://cran.rstudio.com/web/packages/rjson/) or [RJSONIO](https://cran.rstudio.com/web/packages/RJSONIO/). -- For parsing, creating, and manipulating XML, we strongly recommend [xml2](https://cran.rstudio.com/web/packages/xml2/) for most cases. [You can refer to Daniel Nüst's notes about migration from XML to xml2](https://gist.github.com/nuest/3ed3b0057713eb4f4d75d11bb62f2d66). 
+- For parsing, creating, and manipulating XML, we strongly recommend [xml2](https://cran.rstudio.com/web/packages/xml2/) for most cases. + [You can refer to Daniel Nüst's notes about migration from XML to xml2](https://gist.github.com/nuest/3ed3b0057713eb4f4d75d11bb62f2d66). -- For spatial data, the [sp](https://github.com/edzer/sp/) package should be considered deprecated in favor of [sf](https://r-spatial.github.io/sf/), and the packages rgdal, rgdal, and rgdal will be retired by the end of 2023. We recommend use of the spatial suites developed by the [r-spatial](https://github.com/r-spatial) and [rspatial](https://github.com/rspatial) communities. See [this GitHub issue](https://github.com/ropensci/software-review-meta/issues/47) for relevant discussions. +- For spatial data, the [sp](https://github.com/edzer/sp/) package should be considered deprecated in favor of [sf](https://r-spatial.github.io/sf/), and the packages rgdal, maptools, and rgeos were retired at the end of 2023. + We recommend use of the spatial suites developed by the [r-spatial](https://github.com/r-spatial) and [rspatial](https://github.com/rspatial) communities. + See [this GitHub issue](https://github.com/ropensci/software-review-meta/issues/47) for relevant discussions. ## Version Control {#version-control} -- Your package source files have to be under version control, more specifically tracked with [Git](https://happygitwithr.com/). You might find the [gert package](https://docs.ropensci.org/gert/) relevant, as well as some of [usethis Git/GitHub related functionality](https://usethis.r-lib.org/reference/index.html#section-git-and-github); you can however use git as you want. +- Your package source files have to be under version control, more specifically tracked with [Git](https://happygitwithr.com/).
+ You might find the [gert package](https://docs.ropensci.org/gert/) relevant, as well as some of [usethis Git/GitHub related functionality](https://usethis.r-lib.org/reference/index.html#section-git-and-github); you can however use git as you want. -- The default branch name should not be `master`, as this can be offensive to some people. Refer to the [statement of the Git project and the Software Freedom Conservancy](https://sfconservancy.org/news/2020/jun/23/gitbranchname/) for more context. It is general practice to name a default branch `main`, although other names may also be used. See the tidyverse blog post ["Renaming the default branch"](https://www.tidyverse.org/blog/2021/10/renaming-default-branch/) to learn about usethis functionality to help with renaming default branches. +- The default branch name should not be `master`, as this can be offensive to some people. + Refer to the [statement of the Git project and the Software Freedom Conservancy](https://sfconservancy.org/news/2020/jun/23/gitbranchname/) for more context. + It is general practice to name a default branch `main`, although other names may also be used. + See the tidyverse blog post ["Renaming the default branch"](https://www.tidyverse.org/blog/2021/10/renaming-default-branch/) to learn about usethis functionality to help with renaming default branches. -- Make sure to list "scrap" such as `.DS_Store` files in .gitignore. You might find the [`usethis::git_vaccinate()` function](https://usethis.r-lib.org/reference/git_vaccinate.html), and the [gitignore package](https://docs.ropensci.org/gitignore/) relevant. +- Make sure to list "scrap" such as `.DS_Store` files in .gitignore. + You might find the [`usethis::git_vaccinate()` function](https://usethis.r-lib.org/reference/git_vaccinate.html), and the [gitignore package](https://docs.ropensci.org/gitignore/) relevant. - A later section of this book contains some [git workflow tips](#gitflow). 
@@ -462,18 +569,23 @@ This is a collection of CRAN gotchas that are worth avoiding at the outset. - Make sure your package title is in Title Case. - Do not put a period on the end of your title. -- Do not put 'in R' or 'with R' in your title as this is obvious from packages hosted on CRAN. If you would like this information to be displayed on your website nonetheless, check the [`pkgdown` documentation](https://pkgdown.r-lib.org/reference/build_home.html#yaml-config-home) to learn how to override this. +- Do not put 'in R' or 'with R' in your title as this is obvious from packages hosted on CRAN. + If you would like this information to be displayed on your website nonetheless, check the [`pkgdown` documentation](https://pkgdown.r-lib.org/reference/build_home.html#yaml-config-home) to learn how to override this. - Avoid starting the description with the package name or "This package ...". -- Make sure you include links to websites if you wrap a web API, scrape data from a site, etc. in the `Description` field of your `DESCRIPTION` file. URLs should be enclosed in angle brackets, e.g. ``. +- Make sure you include links to websites if you wrap a web API, scrape data from a site, etc. in the `Description` field of your `DESCRIPTION` file. + URLs should be enclosed in angle brackets, e.g. ``. - In both the `Title` and `Description` fields, the names of packages or other external software must be quoted using single quotes (e.g., *'Rcpp' Integration for the 'Armadillo' Templated Linear Algebra Library*). -- Avoid long running tests and examples. Consider `testthat::skip_on_cran` in tests to skip things that take a long time but still test them locally and on [continuous integration](#ci). +- Avoid long running tests and examples. + Consider `testthat::skip_on_cran` in tests to skip things that take a long time but still test them locally and on [continuous integration](#ci). 
- Include top-level files such as `paper.md`, continuous integration configuration files, in your `.Rbuildignore` file. For further gotchas, refer to the collaborative list maintained by ThinkR, ["Prepare for CRAN"](https://github.com/ThinkR-open/prepare-for-cran). ### CRAN checks {#cranchecks} -Once your package is on CRAN, it will be [regularly checked on different platforms](https://blog.r-hub.io/2019/04/25/r-devel-linux-x86-64-debian-clang/#cran-checks-101). Failures of such checks, when not false positives, can lead to the CRAN team's reaching out. You can monitor the state of the CRAN checks via +Once your package is on CRAN, it will be [regularly checked on different platforms](https://blog.r-hub.io/2019/04/25/r-devel-linux-x86-64-debian-clang/#cran-checks-101). +Failures of such checks, when not false positives, can lead to the CRAN team's reaching out. +You can monitor the state of the CRAN checks via - the [`foghorn` package](https://fmichonneau.github.io/foghorn/). diff --git a/softwarereview_policies.Rmd b/softwarereview_policies.Rmd index 3f1cb8bc9..699d43596 100644 --- a/softwarereview_policies.Rmd +++ b/softwarereview_policies.Rmd @@ -8,7 +8,8 @@ aliases: ```{block, type="summaryblock"} This chapter contains the policies of rOpenSci Software Peer Review. -In particular, you'll read our policies regarding software peer review itself: the [review submission process](#review-submission) including our [conflict of interest policies](#coi), and the [aims and scope of the Software Peer Review system](#aims-and-scope). This chapter also features our policies regarding [package ownership and maintenance](#ownership-after-softwarereview). +In particular, you'll read our policies regarding software peer review itself: the [review submission process](#review-submission) including our [conflict of interest policies](#coi), and the [aims and scope of the Software Peer Review system](#aims-and-scope). 
+This chapter also features our policies regarding [package ownership and maintenance](#ownership-after-softwarereview). Last but not least, you'll find the [code of conduct of rOpenSci Software Peer Review](#code-of-conduct). ``` @@ -16,28 +17,31 @@ Last but not least, you'll find the [code of conduct of rOpenSci Software Peer R ## Review process {#policiesreviewprocess} - For a package to be considered for the rOpenSci suite, package authors must initiate a request on the [ropensci/software-review](https://github.com/ropensci/software-review) repository. -- Packages are reviewed for quality, fit, documentation, clarity and the review process is quite similar to a manuscript review (see our [packaging guide](#building) and [reviewing guide](#reviewerguide) for more details). Unlike a manuscript review, this process will be an ongoing conversation. -- Once all major issues and questions, and those addressable with reasonable effort, are resolved, the editor assigned to a package will make a decision (accept, hold, or reject). Rejections are usually done early (before the review process begins, see [the aims and scope section](#aims-and-scope)), but in rare cases a package may also be not onboarded after review \& revision. It is ultimately editor's decision on whether or not to reject the package based on how the reviews are addressed. -- Communication between authors, reviewers and editors will first and foremost take place on GitHub, although you can choose to contact the editor by email or Slack for some issues. When submitting a package, please make sure your GitHub notification settings make it unlikely you will miss a comment. -- The author can choose to have their submission put on hold (editor applies the holding label). The holding status will be revisited every 3 months, and after one year the issue will be closed. 
-- If the author hasn't requested a holding label, but is simply not responding, we should close the issue within one month after the last contact intent. This intent will include a comment tagging the author, but also an email using the email address listed in the DESCRIPTION of the package which is one of the rare cases where the editor will try to contact the author by email. -- If a submission is closed and the author wishes to re-submit, they'll have to start a new submission. If the package is still in scope, the author will have to respond to the initial reviews before the editor starts looking for new reviewers. +- Packages are reviewed for quality, fit, documentation, and clarity, and the review process is quite similar to a manuscript review (see our [packaging guide](#building) and [reviewing guide](#reviewerguide) for more details). + Unlike a manuscript review, this process will be an ongoing conversation. +- Once all major issues and questions, and those addressable with reasonable effort, are resolved, the editor assigned to a package will make a decision (accept, hold, or reject). + Rejections are usually done early (before the review process begins, see [the aims and scope section](#aims-and-scope)), but in rare cases a package may also not be onboarded after review \& revision. + It is ultimately the editor's decision whether to reject the package based on how the reviews are addressed. +- Communication between authors, reviewers and editors will first and foremost take place on GitHub, although you can choose to contact the editor by email or Slack for some issues. + When submitting a package, please make sure your GitHub notification settings make it unlikely you will miss a comment. +- The author can choose to have their submission put on hold (editor applies the holding label). + The holding status will be revisited every 3 months, and after one year the issue will be closed.
+- If the author hasn't requested a holding label, but is simply not responding, we should close the issue within one month after the last contact attempt. + This attempt will include a comment tagging the author, but also an email using the email address listed in the DESCRIPTION of the package which is one of the rare cases where the editor will try to contact the author by email. +- If a submission is closed and the author wishes to re-submit, they'll have to start a new submission. + If the package is still in scope, the author will have to respond to the initial reviews before the editor starts looking for new reviewers. ### Publishing in other Venues {#publishing-in-other-venues} -- We strongly suggest submitting your package for review *before* publishing - on CRAN or submitting a software paper describing the package to a journal. - Review feedback may result in major improvements and updates to your package, - including renaming and breaking changes to functions. We do not consider - previous publication on CRAN or in other venues sufficient reason to - not adopt reviewer or editor recommendations. -- Do not submit your package for review while it or an associated manuscript - is also under review at another venue, as this may result on conflicting - requests for changes. +- We strongly suggest submitting your package for review *before* publishing on CRAN or submitting a software paper describing the package to a journal. + Review feedback may result in major improvements and updates to your package, including renaming and breaking changes to functions. + We do not consider previous publication on CRAN or in other venues sufficient reason to not adopt reviewer or editor recommendations. +- Do not submit your package for review while it or an associated manuscript is also under review at another venue, as this may result in conflicting requests for changes.
### Conflict of interest for reviewers/editors {#coi} -Following criteria are meant to be a guide for what constitutes a conflict of interest for an editor or reviewer. The potential editor or reviewer has a conflict of interest if: +The following criteria are meant to be a guide for what constitutes a conflict of interest for an editor or reviewer. +The potential editor or reviewer has a conflict of interest if: - The potential reviewer/editor are from the same institution or institutional component (e.g., department) as any author with a major role. - The potential reviewer/editor has been a collaborator or has had other professional relationships with at least one person on the package who has a major role within in the past three years. @@ -50,49 +54,84 @@ In the case where none of the [associate editors](#associateditors) can serve as ## Aims and Scope {#aims-and-scope} -rOpenSci aims to support packages that enable reproducible research and managing the data lifecycle for scientists. Packages submitted to rOpenSci should fit into one or more of the categories outlined either below. Statistical software may also be submitted for peer review, for which we have a separate [set of guidelines and standards](https://stats-devguide.ropensci.org/index.html). The categories below are for general, and not statistical, software, while the remainder of this chapter applies to both kinds of software. If you are unsure whether your package fits into one of the general or statistical categories, please open an issue as a pre-submission inquiry ([**Examples**](https://github.com/ropensci/software-review/issues?q=is%3Aissue+label%3A0%2Fpresubmission)). +rOpenSci aims to support packages that enable reproducible research and managing the data lifecycle for scientists. +Packages submitted to rOpenSci should fit into one or more of the categories outlined below.
+Statistical software may also be submitted for peer review, for which we have a separate [set of guidelines and standards](https://stats-devguide.ropensci.org/index.html). +The categories below are for general, and not statistical, software, while the remainder of this chapter applies to both kinds of software. +If you are unsure whether your package fits into one of the general or statistical categories, please open an issue as a pre-submission inquiry ([**Examples**](https://github.com/ropensci/software-review/issues?q=is%3Aissue+label%3A0%2Fpresubmission)). -As this is a living document, these categories may change through time and not all previously onboarded packages would be in-scope today. For instance, data visualization packages are no longer in-scope. While we strive to be consistent, we evaluate packages on a case-by-case basis and may make exceptions. +As this is a living document, these categories may change through time and not all previously onboarded packages would be in-scope today. +For instance, data visualization packages are no longer in-scope. +While we strive to be consistent, we evaluate packages on a case-by-case basis and may make exceptions. -Note that not all rOpenSci projects and packages are in-scope or go through peer review. Projects developed by [staff](https://ropensci.org/about/#team) or at conferences may be experimental, exploratory, address core infrastructure priorities and thus not fall into these categories. Look for the peer-review badge - see below - to identify peer-reviewed packages in the rOpenSci repository. +Note that not all rOpenSci projects and packages are in-scope or go through peer review. +Projects developed by [staff](https://ropensci.org/about/#team) or at conferences may be experimental, exploratory, address core infrastructure priorities and thus not fall into these categories. +Look for the peer-review badge - see below - to identify peer-reviewed packages in the rOpenSci repository. 
![example of a green peer-reviewed badge](images/status.png) ### Package categories {#package-categories} -- **data retrieval**: Packages for accessing and downloading data from online sources with scientific applications. Our definition of scientific applications is broad, including data storage services, journals, and other remote servers, as many data sources may be of interest to researchers. However, retrieval packages should be focused on data *sources* / *topics*, rather than *services*. For example a general client for Amazon Web Services data storage would not be in-scope. (Examples: [**rotl**](https://github.com/ropensci/software-review/issues/17), - [**gutenbergr**](https://github.com/ropensci/software-review/issues/41)) +- **data retrieval**: Packages for accessing and downloading data from online sources with scientific applications. + Our definition of scientific applications is broad, including data storage services, journals, and other remote servers, as many data sources may be of interest to researchers. + However, retrieval packages should be focused on data *sources* / *topics*, rather than *services*. + For example a general client for Amazon Web Services data storage would not be in-scope. + (Examples: [**rotl**](https://github.com/ropensci/software-review/issues/17), [**gutenbergr**](https://github.com/ropensci/software-review/issues/41)) -- **data extraction**: Packages that aid in retrieving data from unstructured sources such as text, images and PDFs, as well as parsing scientific data types and outputs from scientific equipment. Statistical/ML libraries for modeling or prediction are typically not included in this category, nor are code parsers. Trained models that act as utilities (e.g., for optical character recognition), may qualify. 
(Examples: [**tabulizer**](https://github.com/ropensci/software-review/issues/42) for extracting tables from PDF documents, [**genbankr**](https://github.com/ropensci/software-review/issues/47) for parsing files from GenBank, [**treeio**](https://github.com/ropensci/software-review/issues/179) for phylogentic reading in phylogentic tree files, [**lightr**](https://github.com/ropensci/software-review/issues/267) for parsing files from spectroscopic instruments)) +- **data extraction**: Packages that aid in retrieving data from unstructured sources such as text, images and PDFs, as well as parsing scientific data types and outputs from scientific equipment. + Statistical/ML libraries for modeling or prediction are typically not included in this category, nor are code parsers. + Trained models that act as utilities (e.g., for optical character recognition) may qualify. + (Examples: [**tabulizer**](https://github.com/ropensci/software-review/issues/42) for extracting tables from PDF documents, [**genbankr**](https://github.com/ropensci/software-review/issues/47) for parsing files from GenBank, [**treeio**](https://github.com/ropensci/software-review/issues/179) for reading in phylogenetic tree files, [**lightr**](https://github.com/ropensci/software-review/issues/267) for parsing files from spectroscopic instruments) -- **data munging**: Packages for processing data from formats above. This area does not include broad data manipulations tools such as **reshape2** or **tidyr**, or tools for extracting data from R code itself. Rather, it focuses on tools for handling data in specific scientific formats generated from scientific workflows or exported from scientific instruments.
(Examples: [**plateR**](https://github.com/ropensci/software-review/issues/60) for reading in data structured as plate maps for scientific instruments, or [**phonfieldwork**](https://github.com/ropensci/software-review/issues/385) for processing annotated audio files for phonics research) +- **data munging**: Packages for processing data from formats above. + This area does not include broad data manipulations tools such as **reshape2** or **tidyr**, or tools for extracting data from R code itself. + Rather, it focuses on tools for handling data in specific scientific formats generated from scientific workflows or exported from scientific instruments. + (Examples: [**plateR**](https://github.com/ropensci/software-review/issues/60) for reading in data structured as plate maps for scientific instruments, or [**phonfieldwork**](https://github.com/ropensci/software-review/issues/385) for processing annotated audio files for phonics research) - **data deposition**: Packages that support deposition of data into research repositories, including data formatting and metadata generation. (Example: [**EML**](https://github.com/ropensci/software-review/issues/80)) -- **data validation and testing**: Tools that enable automated validation and checking of data quality and completeness as part of scientific workflows. (Example: [**assertr**](https://github.com/ropensci/software-review/issues/23)) +- **data validation and testing**: Tools that enable automated validation and checking of data quality and completeness as part of scientific workflows. + (Example: [**assertr**](https://github.com/ropensci/software-review/issues/23)) -- **workflow automation**: Tools that automate and link together workflows, such as build systems and tools to manage continuous integration. Does not include general tools for literate programming. (e.g., R markdown extensions not under the previous topics). 
(Example: [**drake**](https://github.com/ropensci/software-review/issues/156)) +- **workflow automation**: Tools that automate and link together workflows, such as build systems and tools to manage continuous integration. + Does not include general tools for literate programming. + (Example, R markdown extensions not under the previous topics; [**drake**](https://github.com/ropensci/software-review/issues/156)) -- **version control**: Tools that facilitate the use of version control in scientific workflows. Note that this does not include all tools that interact with online version control services (e.g., GitHub), unless they fit into another category. (Example: [**git2rdata**](https://github.com/ropensci/software-review/issues/263)) +- **version control**: Tools that facilitate the use of version control in scientific workflows. + Note that this does not include all tools that interact with online version control services (e.g., GitHub), unless they fit into another category. + (Example: [**git2rdata**](https://github.com/ropensci/software-review/issues/263)) -- **citation management and bibliometrics**: Tools that facilitate managing references, such as for writing manuscripts, creating CVs or otherwise attributing scientific contributions, or accessing, manipulating or otherwise working with bibliometric data. (Example: [**RefManageR**](https://github.com/ropensci/software-review/issues/119)) +- **citation management and bibliometrics**: Tools that facilitate managing references, such as for writing manuscripts, creating CVs or otherwise attributing scientific contributions, or accessing, manipulating or otherwise working with bibliometric data. + (Example: [**RefManageR**](https://github.com/ropensci/software-review/issues/119)) -- **scientific software wrappers**: Packages that wrap non-R utility programs used for scientific research. These programs must be specific to research fields, not general computing utilities. 
Wrappers must be non-trivial, in that there must be significant added value above simple `system()` calls or bindings, whether in parsing inputs and outputs, data handling, etc. Improved installation process, or extension of compatibility to more platforms, may constitute added value if installation is complex. This does not include wrappers of other R packages or C/C++ libraries that can be included in R packages. It also does not include packages that are clients for web APIs, which must fall into one of the other categories. We strongly encourage wrapping open-source and open-licensed utilities - exceptions will be evaluated case-by-case, considering whether open-source options exist. (Examples: [**babette**](https://github.com/ropensci/software-review/issues/208), [**nlrx**](https://github.com/ropensci/software-review/issues/262)) +- **scientific software wrappers**: Packages that wrap non-R utility programs used for scientific research. + These programs must be specific to research fields, not general computing utilities. + Wrappers must be non-trivial, in that there must be significant added value above simple `system()` calls or bindings, whether in parsing inputs and outputs, data handling, etc. + Improved installation process, or extension of compatibility to more platforms, may constitute added value if installation is complex. + This does not include wrappers of other R packages or C/C++ libraries that can be included in R packages. + It also does not include packages that are clients for web APIs, which must fall into one of the other categories. + We strongly encourage wrapping open-source and open-licensed utilities - exceptions will be evaluated case-by-case, considering whether open-source options exist. 
+ (Examples: [**babette**](https://github.com/ropensci/software-review/issues/208), [**nlrx**](https://github.com/ropensci/software-review/issues/262)) -- **field and laboratory reproducibility tools**: Packages that improve reproducibility of real-world workflows through standardization and automation of field and lab protocols, such as sample tracking and tagging, form and data sheet generation, interfacing with laboratory equipment or information systems, and executing experimental designs. (Example: [**baRcodeR**](https://github.com/ropensci/software-review/issues/336)) +- **field and laboratory reproducibility tools**: Packages that improve reproducibility of real-world workflows through standardization and automation of field and lab protocols, such as sample tracking and tagging, form and data sheet generation, interfacing with laboratory equipment or information systems, and executing experimental designs. + (Example: [**baRcodeR**](https://github.com/ropensci/software-review/issues/336)) - **database software bindings**: Bindings and wrappers for generic database APIs (Example: [**rrlite**](https://github.com/ropensci/software-review/issues/6)) In addition, we have some *specialty topics* with a slightly broader scope. -- **geospatial data**: We accept packages focused on accessing geospatial data, manipulating geospatial data, and converting between geospatial data formats. (Examples: [**osmplotr**](https://github.com/ropensci/software-review/issues/27), [**tidync**](https://github.com/ropensci/software-review/issues/174)). +- **geospatial data**: We accept packages focused on accessing geospatial data, manipulating geospatial data, and converting between geospatial data formats. + (Examples: [**osmplotr**](https://github.com/ropensci/software-review/issues/27), [**tidync**](https://github.com/ropensci/software-review/issues/174)). 
-- **translation**: As part of our work in [multilingual publishing](https://ropensci.org/multilingual-publishing/), we have a special interest in packages that facilitate the translation and publication of scientific and programming resources into multiple (human) languages so they are accessible to larger and more diverse audiences. These could include interfaces to automated translation programs, frameworks for managing documentation in multiple languages, or programs accessing specialized linguistic resources. This is a new and experimental scope, so please open a [pre-submission inquiry](https://github.com/ropensci/software-review/issues/new/choose) if you are interested in submitting a package in this category. +- **translation**: As part of our work in [multilingual publishing](https://ropensci.org/multilingual-publishing/), we have a special interest in packages that facilitate the translation and publication of scientific and programming resources into multiple (human) languages so they are accessible to larger and more diverse audiences. + These could include interfaces to automated translation programs, frameworks for managing documentation in multiple languages, or programs accessing specialized linguistic resources. + This is a new and experimental scope, so please open a [pre-submission inquiry](https://github.com/ropensci/software-review/issues/new/choose) if you are interested in submitting a package in this category. ### Other scope considerations {#other-scope-considerations} -Packages should be *general* in the sense that they should solve a problem as broadly as possible while maintaining a coherent user interface and code base. For instance, if several data sources use an identical API, we prefer a package that provides access to all the data sources, rather than just one. +Packages should be *general* in the sense that they should solve a problem as broadly as possible while maintaining a coherent user interface and code base. 
+For instance, if several data sources use an identical API, we prefer a package that provides access to all the data sources, rather than just one. Packages that include interactive tools to facilitate researcher workflows (e.g., shiny apps) must have a mechanism to make the interactive workflow reproducible, such as code generation or a scriptable API. @@ -102,14 +141,17 @@ Note that the packages developed internally by rOpenSci, through our events or t ### Package overlap {#overlap} -rOpenSci encourages competition among packages, forking and re-implementation as they improve options of users overall. However, as we want packages in the rOpenSci suite to be our top recommendations for the tasks they perform, we aim to avoid duplication of functionality of existing R packages in any repo without significant improvements. An R package that replicates the functionality of an existing R package may be considered for inclusion in the rOpenSci suite if it significantly improves on alternatives in any repository (RO, CRAN, BioC) by being: +rOpenSci encourages competition among packages, forking and re-implementation as they improve options of users overall. +However, as we want packages in the rOpenSci suite to be our top recommendations for the tasks they perform, we aim to avoid duplication of functionality of existing R packages in any repo without significant improvements. 
+An R package that replicates the functionality of an existing R package may be considered for inclusion in the rOpenSci suite if it significantly improves on alternatives in any repository (RO, CRAN, BioC) by being: - More open in licensing or development practices - Broader in functionality (e.g., providing access to more data sets, providing a greater suite of functions), but not only by duplicating additional packages - Better in usability and performance - Actively maintained while alternatives are poorly or no longer actively maintained -These factors should be considered *as a whole* to determine if the package is a significant improvement. A new package would not meet this standard only by following our package guidelines while others do not, unless this leads to a significant difference in the areas above. +These factors should be considered *as a whole* to determine if the package is a significant improvement. +A new package would not meet this standard only by following our package guidelines while others do not, unless this leads to a significant difference in the areas above. We recommend that packages highlight differences from and improvements over overlapping packages in their README and/or vignettes. @@ -119,7 +161,10 @@ We encourage developers whose packages are not accepted due to overlap to still ### Role of the rOpenSci team {#role-of-the-ropensci-team} -Authors of contributed packages essentially maintain the same ownership they had prior to their package joining the rOpenSci suite. Package authors will continue to maintain and develop their software after acceptance into rOpenSci. Unless explicitly added as collaborators, the rOpenSci team will not interfere much with day to day operations. However, this team may intervene with critical bug fixes, or address urgent issues if package authors do not respond in a timely manner (see [the section about maintainer responsiveness](#maintainer-responsiveness)). 
+Authors of contributed packages essentially maintain the same ownership they had prior to their package joining the rOpenSci suite. +Package authors will continue to maintain and develop their software after acceptance into rOpenSci. +Unless explicitly added as collaborators, the rOpenSci team will not interfere much with day to day operations. +However, this team may intervene with critical bug fixes, or address urgent issues if package authors do not respond in a timely manner (see [the section about maintainer responsiveness](#maintainer-responsiveness)). ### Maintainer responsiveness {#maintainer-responsiveness} @@ -134,15 +179,19 @@ The above is a bit vague, so the following are a few areas of consideration. - Package `hello` is not on CRAN, or on CRAN, but has no reverse dependencies. - Package `world` needs some fixes. The maintainer has responded but is simply very busy with a new job, or other reason, and will attend to soon. -We urge package maintainers to make sure they are receiving GitHub notifications, as well as making sure emails from rOpenSci staff and CRAN maintainers are not going to their spam box. Authors of onboarded packages will be invited to the rOpenSci Slack to chat with the rOpenSci team and the greater rOpenSci community. Anyone can also discuss with the rOpenSci community on the [rOpenSci discussion forum](https://discuss.ropensci.org/). +We urge package maintainers to make sure they are receiving GitHub notifications, as well as making sure emails from rOpenSci staff and CRAN maintainers are not going to their spam box. +Authors of onboarded packages will be invited to the rOpenSci Slack to chat with the rOpenSci team and the greater rOpenSci community. +Anyone can also discuss with the rOpenSci community on the [rOpenSci discussion forum](https://discuss.ropensci.org/). Should authors abandon the maintenance of an actively used package in our suite, we will consider petitioning CRAN to transfer package maintainer status to rOpenSci. 
### Quality commitment {#quality-commitment} -rOpenSci strives to develop and promote high quality research software. To ensure that your software meets our criteria, we review all of our submissions as part of the Software Peer Review process, and even after acceptance will continue to step in with improvements and bug fixes. +rOpenSci strives to develop and promote high quality research software. +To ensure that your software meets our criteria, we review all of our submissions as part of the Software Peer Review process, and even after acceptance will continue to step in with improvements and bug fixes. -Despite our best efforts to support contributed software, errors are the responsibility of individual maintainers. Buggy, unmaintained software may be removed from our suite at any time. +Despite our best efforts to support contributed software, errors are the responsibility of individual maintainers. +Buggy, unmaintained software may be removed from our suite at any time. ### Package removal {#package-removal} @@ -150,134 +199,83 @@ In the unlikely scenario that a contributor of a package requests removal of the ## Ethics, Data Privacy and Human Subjects Research {#ethics-data-privacy-and-human-subjects-research} -rOpenSci packages and other tools are used for a variety of purposes, but our focus is on -tools for research. We expect that tools will enable ethical use by research -practitioners, who are obligated to adhere to ethical codes such [Declaration of -Helsinki](https://www.wma.net/policies-post/wma-declaration-of-helsinki-ethical-principles-for-medical-research-involving-human-subjects/) -and [The Belmont -Report](https://www.hhs.gov/ohrp/regulations-and-policy/belmont-report/index.html). 
-Researchers bear responsibility for their use of software, but software -developers must consider the ethical use of their products, and developers -themselves adhere to ethical codes for computer professionals such as those -expressed by [IEEE](https://www.computer.org/education/code-of-ethics) and -[ACM](https://ethics.acm.org/). rOpenSci contributors often play both the role -of both researcher and developer. - -We ask that software developers place themselves in researchers' role and -consider the requirements of an ethical workflow using authors' software. -Given the variation and degree of flux of ethical approaches for Internet-based -analyses, judgement calls rather than recipes are required. The [Ethical -Guidelines of The Association of Internet Researchers](https://aoir.org/ethics/) -provides a robust framework and we encourage authors, editors, and reviewers to use -this in evaluating their work. In general, adherence to legal or regulatory -minimum requirements may not be sufficient, though these (e.g., -GDPR), may be relevant. Package authors should direct -users to relevant resources for the ethical use of the software. - -Some packages, due to the nature of data they handle, may be determined by editors to require enhanced scrutiny. For these, editors may require additional (or reduced) functionality, and robust -documentation, defaults, and warnings to direct users to relevant ethical -practices. The following topics may merit enhanced scrutiny: - -- ***Vulnerable populations***: Authors of packages and workflows that deal with - information related to vulnerable populations bear responsibility to protect - them from likely harms. - -- ***Personally identifiable or sensitive data***: The release of personally - identifiable or sensitive data is potentially harmful. This includes "reasonably - re-identifiable" data - which a motivated individual could trace back to the - owner or creator even if the data are anonymized. 
This includes both cases where identifiers - (e.g., name, date of birth) are available as part of data, and also if unique - pseudonyms/screen names are linked with full-text posts, through which one - can link back individuals through cross-reference with other data sets. - -While the best response to ethical concerns will be context-specific, these -general guidelines should be followed by packages where the challenges above arise: - -- Packages should adhere to data source's terms of use, as expressed in - website Terms and Conditions, ["robots.txt"](https://docs.ropensci.org/robotstxt/) files, privacy policies, and - other relevant restrictions, and link to them prominently in package - documentation. Packages should provide or document functionality to adhere - to such restrictions (e.g., scrape from only allowed endpoints, use - appropriate rate limiting in code, examples, or vignettes). Note that while Terms - and Conditions, Privacy Policies, etc., may not provide sufficient bounds on - ethical usage, they can provide an outer bound. - -- A key tool in addressing the risks posed in studying vulnerable populations or - using personally identifiable data is ***informed consent***. Package authors - should support users' acquisition of informed consent when relevant. This may - include providing links to data source's preferred method of acquiring consent, - contact information of data providers (e.g. forum moderators), documentation - of informed consent protocols, or getting pre-approval for general uses of a - package. +rOpenSci packages and other tools are used for a variety of purposes, but our focus is on tools for research. 
+We expect that tools will enable ethical use by research practitioners, who are obligated to adhere to ethical codes such [Declaration of Helsinki](https://www.wma.net/policies-post/wma-declaration-of-helsinki-ethical-principles-for-medical-research-involving-human-subjects/) and [The Belmont Report](https://www.hhs.gov/ohrp/regulations-and-policy/belmont-report/index.html). Researchers bear responsibility for their use of software, but software developers must consider the ethical use of their products, and developers themselves adhere to ethical codes for computer professionals such as those expressed by [IEEE](https://www.computer.org/education/code-of-ethics) and [ACM](https://ethics.acm.org/). +rOpenSci contributors often play both the role of both researcher and developer. + +We ask that software developers place themselves in researchers' role and consider the requirements of an ethical workflow using authors' software. +Given the variation and degree of flux of ethical approaches for Internet-based analyses, judgement calls rather than recipes are required. The [Ethical Guidelines of The Association of Internet Researchers](https://aoir.org/ethics/) provides a robust framework and we encourage authors, editors, and reviewers to use this in evaluating their work. +In general, adherence to legal or regulatory minimum requirements may not be sufficient, though these (e.g., GDPR), may be relevant. +Package authors should direct users to relevant resources for the ethical use of the software. + +Some packages, due to the nature of data they handle, may be determined by editors to require enhanced scrutiny. +For these, editors may require additional (or reduced) functionality, and robust documentation, defaults, and warnings to direct users to relevant ethical practices. 
+The following topics may merit enhanced scrutiny: + +- ***Vulnerable populations***: Authors of packages and workflows that deal with information related to vulnerable populations bear responsibility to protect them from likely harms. + +- ***Personally identifiable or sensitive data***: The release of personally identifiable or sensitive data is potentially harmful. + This includes "reasonably re-identifiable" data - which a motivated individual could trace back to the owner or creator even if the data are anonymized. + This includes both cases where identifiers (e.g., name, date of birth) are available as part of data, and also if unique pseudonyms/screen names are linked with full-text posts, through which one can link back individuals through cross-reference with other data sets. + +While the best response to ethical concerns will be context-specific, these general guidelines should be followed by packages where the challenges above arise: + +- Packages should adhere to data source's terms of use, as expressed in website Terms and Conditions, ["robots.txt"](https://docs.ropensci.org/robotstxt/) files, privacy policies, and other relevant restrictions, and link to them prominently in package documentation. + Packages should provide or document functionality to adhere to such restrictions (e.g., scrape from only allowed endpoints, use appropriate rate limiting in code, examples, or vignettes). + Note that while Terms and Conditions, Privacy Policies, etc., may not provide sufficient bounds on ethical usage, they can provide an outer bound. + +- A key tool in addressing the risks posed in studying vulnerable populations or using personally identifiable data is ***informed consent***. + Package authors should support users' acquisition of informed consent when relevant. + This may include providing links to data source's preferred method of acquiring consent, contact information of data providers (e.g. 
forum moderators), documentation of informed consent protocols, or getting pre-approval for general uses of a package. - Note that consent is not implicitly granted just because data are accessible. - Accessible data are not necessarily public, as different persons and contexts - have different normative expectations of privacy (see work by [Social Data - Lab](http://socialdatalab.net/ethics-resources)). - -- Packages accessing personally identifiable information should take special - care to follow \[security best practices\]\[Package Development Security Best Practices\] - (e.g., exclusive use of secure internet protocols, strong mechanisms for - storing credentials, etc.). - -- Packages that access or handle personally identifiable or sensitive data - should enable, document, and demonstrate workflows for de-identification, - secure storage, other best practices to minimize risk of harm. - -As standards for data privacy and research continue to evolve, we welcome input -from authors on considerations specific to their software and supplemental -documentation such as approval from university ethics review boards. These -may be attached to issue threads of package submissions or pre-submission inquiries, -or conveyed directly to editors if needed. General -suggestions may be filed as [issues in this book's repository](https://github.com/ropensci/dev_guide/issues). + Note that consent is not implicitly granted just because data are accessible. Accessible data are not necessarily public, as different persons and contexts have different normative expectations of privacy (see work by [Social Data Lab](http://socialdatalab.net/ethics-resources)). + +- Packages accessing personally identifiable information should take special care to follow \[security best practices\]\[Package Development Security Best Practices\] (e.g., exclusive use of secure internet protocols, strong mechanisms for storing credentials, etc.). 
+ +- Packages that access or handle personally identifiable or sensitive data should enable, document, and demonstrate workflows for de-identification, secure storage, other best practices to minimize risk of harm. + +As standards for data privacy and research continue to evolve, we welcome input from authors on considerations specific to their software and supplemental documentation such as approval from university ethics review boards. +These may be attached to issue threads of package submissions or pre-submission inquiries, or conveyed directly to editors if needed. +General suggestions may be filed as [issues in this book's repository](https://github.com/ropensci/dev_guide/issues). ### Resources {#resources} -The following resources may be helpful for researchers, package authors, editors -and reviewers in addressing ethical questions related to privacy and research software. - -- The [Declaration of - Helsinki](https://www.wma.net/policies-post/wma-declaration-of-helsinki-ethical-principles-for-medical-research-involving-human-subjects/) - and [The Belmont - Report](https://www.hhs.gov/ohrp/regulations-and-policy/belmont-report/index.html) - provide fundamental principles for ethical practice by researchers. -- Several organizations provide guidance on how to translate these principles - into the context of internet research. These include the [Ethical - Guidelines of The Association of Internet - Researchers](https://aoir.org/ethics/), the [NESH Guide to Internet - Research - Ethics](https://www.forskningsetikk.no/en/guidelines/social-sciences-humanities-law-and-theology/a-guide-to-internet-research-ethics/), - and [BPS' Ethics Guidelines for Internet-Mediated - Research](https://www.bps.org.uk/news-and-policy/ethics-guidelines-internet-mediated-research-2017). - [Anabo et al (2019)](https://doi.org/10.1007/s10676-018-9495-z) provide a - helpful overview of these. 
-- The Social Media Lab provides a [high-level - overview](http://socialdatalab.net/ethics-resources) with data on normative - expectations of privacy and use on social forums. -- Bechmann A., Kim J.Y. (2019) Big Data: A Focus on Social Media Research - Dilemmas. In: Iphofen R. (eds) Handbook of Research Ethics and Scientific - Integrity. [https://doi.org/10.1007/978-3-319-76040-7\_18-1](https://doi.org/10.1007/978-3-319-76040-7_18-1) -- Chu, K.-H., Colditz, J., Sidani, J., Zimmer, M., \& Primack, B. (2021). Re-evaluating standards of - human subjects protection for sensitive health data in social media networks. +The following resources may be helpful for researchers, package authors, editors and reviewers in addressing ethical questions related to privacy and research software. + +- The [Declaration of Helsinki](https://www.wma.net/policies-post/wma-declaration-of-helsinki-ethical-principles-for-medical-research-involving-human-subjects/) and [The Belmont Report](https://www.hhs.gov/ohrp/regulations-and-policy/belmont-report/index.html) provide fundamental principles for ethical practice by researchers. +- Several organizations provide guidance on how to translate these principles into the context of internet research. + These include the [Ethical Guidelines of The Association of Internet Researchers](https://aoir.org/ethics/), the [NESH Guide to Internet Research Ethics](https://www.forskningsetikk.no/en/guidelines/social-sciences-humanities-law-and-theology/a-guide-to-internet-research-ethics/), and [BPS' Ethics Guidelines for Internet-Mediated Research](https://www.bps.org.uk/news-and-policy/ethics-guidelines-internet-mediated-research-2017). + [Anabo et al (2019)](https://doi.org/10.1007/s10676-018-9495-z) provide a helpful overview of these. +- The Social Media Lab provides a [high-level overview](http://socialdatalab.net/ethics-resources) with data on normative expectations of privacy and use on social forums. +- Bechmann A., Kim J.Y. 
(2019) + Big Data: A Focus on Social Media Research Dilemmas. + In: Iphofen R. (eds) Handbook of Research Ethics and Scientific Integrity. [https://doi.org/10.1007/978-3-319-76040-7\_18-1](https://doi.org/10.1007/978-3-319-76040-7_18-1) +- Chu, K.-H., Colditz, J., Sidani, J., Zimmer, M., \& Primack, B. (2021). + Re-evaluating standards of human subjects protection for sensitive health data in social media networks. Social Networks, 67, 41–46. [https://dx.doi.org/10.1016/j.socnet.2019.10.010](https://dx.doi.org/10.1016/j.socnet.2019.10.010) -- Lomborg, S., \& Bechmann, A. (2014). Using APIs for Data Collection on Social - Media. The Information Society, 30(4), 256--265. +- Lomborg, S., \& Bechmann, A. (2014). + Using APIs for Data Collection on Social Media. + The Information Society, 30(4), 256--265. [https://dx.doi.org/10.1080/01972243.2014.915276](https://dx.doi.org/10.1080/01972243.2014.915276) -- Flick, C. (2016). Informed consent and the Facebook emotional manipulation - study. *Research Ethics*, *12*(1), 14--28. - [https://doi.org/10.1177/1747016115599568](https://doi.org/10.1177/1747016115599568) -- Sugiura, L., Wiles, R., \& Pope, C. (2017). Ethical challenges in online - research: Public/private perceptions. *Research Ethics*, *13*(3--4), +- Flick, C. (2016). + Informed consent and the Facebook emotional manipulation study. + *Research Ethics*, *12*(1), 14--28. [https://doi.org/10.1177/1747016115599568](https://doi.org/10.1177/1747016115599568) +- Sugiura, L., Wiles, R., \& Pope, C. (2017). + Ethical challenges in online research: Public/private perceptions. + *Research Ethics*, *13*(3--4), 184--199. [https://doi.org/10.1177/1747016116650720](https://doi.org/10.1177/1747016116650720) -- Taylor, J., \& Pagliari, C. (2018). Mining social media data: How are - research sponsors and researchers addressing the ethical challenges? +- Taylor, J., \& Pagliari, C. (2018). 
+ Mining social media data: How are research sponsors and researchers addressing the ethical challenges? Research Ethics, 14(2), 1--39. [https://doi.org/10.1177/1747016117738559](https://doi.org/10.1177/1747016117738559) -- Zimmer, M. (2010). "But the data is already public": on the ethics of - research in Facebook. Ethics and Information Technology, 12(4), 313--325. - [https://dx.doi.org/10.1007/s10676-010-9227-5](https://dx.doi.org/10.1007/s10676-010-9227-5) +- Zimmer, M. (2010). + "But the data is already public": on the ethics of research in Facebook. + Ethics and Information Technology, 12(4), 313--325. [https://dx.doi.org/10.1007/s10676-010-9227-5](https://dx.doi.org/10.1007/s10676-010-9227-5) ## Code of Conduct {#code-of-conduct} -rOpenSci's community is our best asset. Whether you're a regular contributor or a newcomer, we care about making this a safe place for you and we've got your back. We have a Code of Conduct that applies to all people participating in the rOpenSci community, including rOpenSci staff and leadership and to all modes of interaction online or in person. The [Code of Conduct](https://ropensci.org/code-of-conduct/) is maintained on the rOpenSci website. +rOpenSci's community is our best asset. +Whether you're a regular contributor or a newcomer, we care about making this a safe place for you and we've got your back. +We have a Code of Conduct that applies to all people participating in the rOpenSci community, including rOpenSci staff and leadership and to all modes of interaction online or in person. +The [Code of Conduct](https://ropensci.org/code-of-conduct/) is maintained on the rOpenSci website. 
From 5cc950430d8f20add6fa85be2ab8447e3bfbbc73 Mon Sep 17 00:00:00 2001 From: mpadge Date: Tue, 11 Mar 2025 13:15:27 +0100 Subject: [PATCH 2/3] consistent linebreaks in pkg_ci/security --- pkg_ci.Rmd | 34 +++++++++++++++++++++++++--------- pkg_security.Rmd | 15 ++++++++++----- 2 files changed, 35 insertions(+), 14 deletions(-) diff --git a/pkg_ci.Rmd b/pkg_ci.Rmd index 4b22bc007..da8942f55 100644 --- a/pkg_ci.Rmd +++ b/pkg_ci.Rmd @@ -42,15 +42,23 @@ R packages should have CI for all operating systems (Linux, Mac OSX, Windows) wh - Anything with file system / path calls -In case of any doubt regarding the applicability of these criteria to your package, it's better to add CI for all operating systems. Most CI services standards setups for R packages allow this with not much hassle. +In case of any doubt regarding the applicability of these criteria to your package, it's better to add CI for all operating systems. +Most CI services standards setups for R packages allow this with not much hassle. ## Which continuous integration service(s)? {#whichci} -There are a number of continuous integration services, including standalone services (CircleCI, AppVeyor), and others integrated into code hosting or related services (GitHub Actions, GitLab, AWS Code Pipeline). Different services support different operating system configurations. +There are a number of continuous integration services, including standalone services (CircleCI, AppVeyor), and others integrated into code hosting or related services (GitHub Actions, GitLab, AWS Code Pipeline). +Different services support different operating system configurations. -[GitHub Actions](https://github.com/features/actions) is a convenient option for many R developers who already use GitHub as it is integrated into the platform and supports all needed operating Systems. 
There are [actions supported for the R ecosystem](https://github.com/r-lib/actions/), as well and first-class support in the [{usethis}](https://usethis.r-lib.org/reference/github_actions.html) package. All packages submitted to rOpenSci for peer review are checked by our own [`pkgcheck` system](https://docs.ropensci.org/pkgcheck), described further in the [Guide for Authors](#authors-guide). These checks are also provided as a GitHub Action in the [`ropensci-review-tools/pkgcheck-action` repository](https://github.com/ropensci-review-tools/pkgcheck-action). Packages authors are encouraged to use that action to confirm prior to submission that a package passes all of our checks. See [our blog post](https://ropensci.org/blog/2022/02/01/pkgcheck-action/) for more information. +[GitHub Actions](https://github.com/features/actions) is a convenient option for many R developers who already use GitHub as it is integrated into the platform and supports all needed operating Systems. +There are [actions supported for the R ecosystem](https://github.com/r-lib/actions/), as well and first-class support in the [{usethis}](https://usethis.r-lib.org/reference/github_actions.html) package. +All packages submitted to rOpenSci for peer review are checked by our own [`pkgcheck` system](https://docs.ropensci.org/pkgcheck), described further in the [Guide for Authors](#authors-guide). +These checks are also provided as a GitHub Action in the [`ropensci-review-tools/pkgcheck-action` repository](https://github.com/ropensci-review-tools/pkgcheck-action). +Packages authors are encouraged to use that action to confirm prior to submission that a package passes all of our checks. +See [our blog post](https://ropensci.org/blog/2022/02/01/pkgcheck-action/) for more information. -[usethis supports CI setup for other systems](https://usethis.r-lib.org/reference/ci.html), though these functions are soft-deprecated. 
rOpenSci also supports the [circle](https://docs.ropensci.org/circle/) package, which aids in setting up CircleCI pipelines, and the [tic](https://docs.ropensci.org/tic/) package for building more complicated CI pipelines. +[usethis supports CI setup for other systems](https://usethis.r-lib.org/reference/ci.html), though these functions are soft-deprecated. +rOpenSci also supports the [circle](https://docs.ropensci.org/circle/) package, which aids in setting up CircleCI pipelines, and the [tic](https://docs.ropensci.org/tic/) package for building more complicated CI pipelines. #### Testing using different versions of R {#testing-using-different-versions-of-r} @@ -66,7 +74,8 @@ If you develop a package depending on or intended for Bioconductor, you might fi You can use these tips to minimize build time on CI: -- Cache installation of packages. The default [r-lib/actions workflows](https://github.com/r-lib/actions) do this. +- Cache installation of packages. + The default [r-lib/actions workflows](https://github.com/r-lib/actions) do this. #### System dependencies {#sysdeps-ci} @@ -78,11 +87,14 @@ We recommend [moving away from Travis](https://ropensci.org/technotes/2020/11/19 ### AppVeyor CI (Windows) {#app-veyor-ci-windows} -For continuous integration on Windows, see [R + AppVeyor](https://github.com/krlmlr/r-appveyor). Set it up using `usethis::use_appveyor()`. +For continuous integration on Windows, see [R + AppVeyor](https://github.com/krlmlr/r-appveyor). +Set it up using `usethis::use_appveyor()`. Here are tips to minimize AppVeyor build time: -- Cache installation of packages. [Example in a config file](https://github.com/r-lib/usethis/blob/2c52c06373849d52f78a26c5a0e080f518a2f825/inst/templates/appveyor.yml#L13). It'll already be in the config file if you set AppVeyor CI up using `usethis::use_appveyor()`. +- Cache installation of packages. 
+ [Example in a config file](https://github.com/r-lib/usethis/blob/2c52c06373849d52f78a26c5a0e080f518a2f825/inst/templates/appveyor.yml#L13).
+ It'll already be in the config file if you set AppVeyor CI up using `usethis::use_appveyor()`.

 - Enable [rolling builds](https://www.appveyor.com/docs/build-configuration/#rolling-builds).

@@ -111,11 +123,15 @@ If you run coverage on several CI services [the results will be merged](https://

 ## Even more CI: OpenCPU {#even-more-ci-open-cpu}

-After transfer to rOpenSci's "ropensci" GitHub organization, each push to the repo will be built on OpenCPU and the person committing will receive a notification email. This is an additional CI service for package authors that allows for R functions in packages to be called remotely via [https://ropensci.ocpu.io/](https://ropensci.ocpu.io/) using the [opencpu API](https://www.opencpu.org/api.html#api-json). For more details about this service, consult the OpenCPU [help page](https://www.opencpu.org/help.html) that also indicates where to ask questions.
+After transfer to rOpenSci's "ropensci" GitHub organization, each push to the repo will be built on OpenCPU and the person committing will receive a notification email.
+This is an additional CI service for package authors that allows for R functions in packages to be called remotely via [https://ropensci.ocpu.io/](https://ropensci.ocpu.io/) using the [opencpu API](https://www.opencpu.org/api.html#api-json).
+For more details about this service, consult the OpenCPU [help page](https://www.opencpu.org/help.html) that also indicates where to ask questions.

 ## Even more CI: rOpenSci docs {#rodocsci}

-After transfer to rOpenSci's "ropensci" GitHub organization, a pkgdown website will be built for your package after each push to the GitHub repo. You can find the status of these builds at `https://ropensci.r-universe.dev/ui#packages` and in the [commit status](https://ropensci.org/blog/2021/09/03/runiverse-docs/#how-it-works). The website build will use your pkgdown config file if you have one, except for the styling that will use the [`rotemplate` package](https://github.com/ropensci-org/rotemplate/).
+After transfer to rOpenSci's "ropensci" GitHub organization, a pkgdown website will be built for your package after each push to the GitHub repo.
+You can find the status of these builds at `https://ropensci.r-universe.dev/ui#packages` and in the [commit status](https://ropensci.org/blog/2021/09/03/runiverse-docs/#how-it-works).
+The website build will use your pkgdown config file if you have one, except for the styling that will use the [`rotemplate` package](https://github.com/ropensci-org/rotemplate/).

 Please report bugs, questions and feature requests about the central builds and about the template at [https://github.com/ropensci-org/rotemplate/](https://github.com/ropensci-org/rotemplate/).

diff --git a/pkg_security.Rmd b/pkg_security.Rmd
index 49fb1f7b9..0501f7606 100644
--- a/pkg_security.Rmd
+++ b/pkg_security.Rmd
@@ -15,7 +16,8 @@ We recommend the article [Ten quick tips for staying safe online](https://journa

 ## GitHub access security {#git-hub-access-security}

-- We recommend you [secure your GitHub account with two-factor (authentication) 2FA](https://help.github.com/articles/securing-your-account-with-two-factor-authentication-2fa/). It is *compulsory* for all ropensci GitHub organization members and outside collaborators so make sure to enable it before your package is approved.
+- We recommend you [secure your GitHub account with two-factor (authentication) 2FA](https://help.github.com/articles/securing-your-account-with-two-factor-authentication-2fa/).
+ It is *compulsory* for all ropensci GitHub organization members and outside collaborators so make sure to enable it before your package is approved.

 - We also recommend you regularly check who has access to your package repository, and that you prune any unused access (such as from former collaborators).
@@ -25,7 +26,8 @@ We recommend the article [Ten quick tips for staying safe online](https://journa

 ## Secrets in packages {#pkgsecrets}

-This section contains guidance for when you develop a package interacting with a web resource requiring credentials (API keys, tokens, etc.). Also refer to [the `httr` vignette about sharing secrets](https://httr.r-lib.org/articles/secrets.html).
+This section contains guidance for when you develop a package interacting with a web resource requiring credentials (API keys, tokens, etc.).
+Also refer to [the `httr` vignette about sharing secrets](https://httr.r-lib.org/articles/secrets.html).

 ### Secrets in packages and user protection {#secrets-in-packages-and-user-protection}

@@ -33,13 +35,15 @@ Say your package needs an API key for making requests on behalf of users of your

 - In your package documentation, guide the user so the API key doesn't end up in the .Rhistory/script of users of your package.

- - Encourage the use of environment variables to store the API key (or even remove the possibility to pass it as an argument to the functions?). You could link [to this intro to startup files](https://rstats.wtf/r-startup.html) and [`usethis::edit_r_environ()`](https://usethis.r-lib.org/reference/edit.html).
+ - Encourage the use of environment variables to store the API key (or even remove the possibility to pass it as an argument to the functions?).
+ You could link [to this intro to startup files](https://rstats.wtf/r-startup.html) and [`usethis::edit_r_environ()`](https://usethis.r-lib.org/reference/edit.html).

 - Or your package could depend on, or encourage the use of, [`keyring` to help user store variables](https://github.com/r-lib/keyring#readme) in the specific OS' credential stores (more secure than .Renviron): i.e. you'd create a function for setting the key, and have another one for retrieving the key; or the user would write `Sys.setenv(SUPERSECRETKEY = keyring::key_get("myservice"))` at the beginning of their script.
 - Do not print the API key even in verbose mode in any message, warning, error.

-- In the GitHub issue template, it should be stated not to share any credentials. If an user of your package accidentally shares credentials in an issue, make sure they're aware of that so they can revoke the key (i.e. ask them explicitly in an answer whether they realized they shared their key).
+- In the GitHub issue template, it should be stated not to share any credentials.
+ If an user of your package accidentally shares credentials in an issue, make sure they're aware of that so they can revoke the key (i.e. ask them explicitly in an answer whether they realized they shared their key).

 ### Secrets in packages and development {#secrets-in-packages-and-development}

@@ -47,7 +51,8 @@ You'll need to protect your secrets as you protect secrets of users, but there's

 #### Secrets and recorded requests in tests {#secrets-and-recorded-requests-in-tests}

-If you use [`vcr`](https://docs.ropensci.org/vcr/) or [`httptest`](https://enpiar.com/r/httptest/) in tests for caching API responses, you need to make sure the recorded requests / fixtures do not contain secrets. Refer to [`vcr` security guidance](https://books.ropensci.org/http-testing/security-chapter.html) and [`httptest` guidance "Redacting and Modifying Recorded Requests"](https://enpiar.com/r/httptest/articles/redacting.html), and inspect your recorded requests / fixtures before committing them the first time to be sure you got the setup right.
+If you use [`vcr`](https://docs.ropensci.org/vcr/) or [`httptest`](https://enpiar.com/r/httptest/) in tests for caching API responses, you need to make sure the recorded requests / fixtures do not contain secrets.
+Refer to [`vcr` security guidance](https://books.ropensci.org/http-testing/security-chapter.html) and [`httptest` guidance "Redacting and Modifying Recorded Requests"](https://enpiar.com/r/httptest/articles/redacting.html), and inspect your recorded requests / fixtures before committing them the first time to be sure you got the setup right.

 `vcr` being an rOpenSci package, you can post any question you might have to [rOpenSci forum](https://discuss.ropensci.org/).

From 1a58284250dc07f30c4a4043cdc46df9a0a501aa Mon Sep 17 00:00:00 2001
From: mpadge
Date: Tue, 11 Mar 2025 13:27:29 +0100
Subject: [PATCH 3/3] rectify linebreaks in softwarereview_ chapters
---
 softwarereview_author.Rmd            |  57 ++++++++++----
 softwarereview_editor.Rmd            | 112 ++++++++++++++++++---------
 softwarereview_editor_management.Rmd |  27 ++++---
 softwarereview_intro.Rmd             |  43 +++++++---
 softwarereview_policies.Rmd          |  12 ++-
 softwarereview_reviewer.Rmd          |  62 +++++++++++----
 6 files changed, 216 insertions(+), 97 deletions(-)

diff --git a/softwarereview_author.Rmd b/softwarereview_author.Rmd
index f7b932c2a..78bd54cfa 100644
--- a/softwarereview_author.Rmd
+++ b/softwarereview_author.Rmd
@@ -14,49 +14,72 @@ This concise guide presents the software peer review process for you as a packag
 - Do you expect to maintain your package for at least 2 years, or to be able to identify a new maintainer?
 - Consult our [policies](#policies) see if your package meets our criteria for fitting into our suite and does not overlap with other packages.
 - If you are unsure whether a package meets our criteria, feel free to open an issue as a pre-submission inquiry to ask if the package is appropriate.
- - [Example response regarding overlap](https://github.com/ropensci/software-review/issues/199#issuecomment-375358362). Also consider adding some points about similar packages to your [package documentation](#docs-general).
-- Please consider the best time in your package's development to submit. Your package should be sufficiently mature so that reviewers are able to review all essential aspects, but keep in mind that review may result in major changes.
- - We strongly suggest submitting your package for review *before* publishing on CRAN or submitting a software paper describing the package to a journal. Review feedback may result in major improvements and updates to your package, including renaming and breaking changes to functions.
+ - [Example response regarding overlap](https://github.com/ropensci/software-review/issues/199#issuecomment-375358362).
+ Also consider adding some points about similar packages to your [package documentation](#docs-general).
+- Please consider the best time in your package's development to submit.
+ Your package should be sufficiently mature so that reviewers are able to review all essential aspects, but keep in mind that review may result in major changes.
+ - We strongly suggest submitting your package for review *before* publishing on CRAN or submitting a software paper describing the package to a journal.
+ Review feedback may result in major improvements and updates to your package, including renaming and breaking changes to functions.
 - Do not submit your package for review while it or an associated manuscript is also under review at another venue, as this may result in conflicting requests for changes.
-- Please also consider the time and effort needed to respond to reviews: think about your availability or that of your collaborators in the next weeks and months following a submission. Note that reviewers are volunteers, and we ask that you respect their time and effort by responding in a timely and respectful manner.
-- If you use [repostatus.org badges](https://www.repostatus.org/) (which we recommend), submit when you're ready to get an *Active* instead of *WIP* badge. Similarly, if you use [lifecycle badges](https://www.tidyverse.org/lifecycle/), submission should happen when the package is *Stable*.
-- For any submission or pre-submission inquiry the README of your package should provide enough information about your package (goals, usage, similar packages) for the editors to assess its scope without having to install the package. Even better, set up a pkgdown website for allowing more detailed assessment of functionality online.
+- Please also consider the time and effort needed to respond to reviews: think about your availability or that of your collaborators in the next weeks and months following a submission.
+ Note that reviewers are volunteers, and we ask that you respect their time and effort by responding in a timely and respectful manner.
+- If you use [repostatus.org badges](https://www.repostatus.org/) (which we recommend), submit when you're ready to get an *Active* instead of *WIP* badge.
+ Similarly, if you use [lifecycle badges](https://www.tidyverse.org/lifecycle/), submission should happen when the package is *Stable*.
+- For any submission or pre-submission inquiry the README of your package should provide enough information about your package (goals, usage, similar packages) for the editors to assess its scope without having to install the package.
+ Even better, set up a pkgdown website for allowing more detailed assessment of functionality online.
 - At the submission stage, all major functions should be stable enough to be fully documented and tested; the README should make a strong case for the package.
- - Your README file should strive to explain your package's functionality and aims, assuming readers have little to no domain knowledge. All technical tems, including references to other software, should be clarified.
+ - Your README file should strive to explain your package's functionality and aims, assuming readers have little to no domain knowledge.
+ All technical tems, including references to other software, should be clarified.
 - Your package will continue to evolve after review, the chapter on *Package evolution* [provides guidance about the topic](#evolution).

 ## Preparing for Submission {#preparing-for-submission}

 - Read and follow [our packaging style guide](#building), [reviewer's guide](#preparereview) to ensure your package meets our style and quality criteria.
 - Feel free to ask any questions about the process, or your specific package, in our [Discussion Forum](https://discuss.ropensci.org).
-- All submissions are automatically checked by our [pkgcheck](https://docs.ropensci.org/pkgcheck/) system to ensure packages follow our guidelines. All authors are expected to have run [the main `pkgcheck` function](https://docs.ropensci.org/pkgcheck/reference/pkgcheck.html) locally to confirm that the package is ready to be submitted. Alternatively, an even easier way to ensure a package is ready for submission is to use [the `pkgcheck` GitHub Action](https://github.com/ropensci-review-tools/pkgcheck-action) to run `pkgcheck` as a GitHub Action, as described in [our blog post](https://ropensci.org/blog/2022/02/01/pkgcheck-action/).
-- If your package requires unusual system dependencies (see [*Packaging Guide*](#pkgdependencies)) for our GitHub Action to pass, please submit a pull request adding them to [our base Dockerfile](https://github.com/ropensci-review-tools/pkgcheck/blob/main/Dockerfile).
+- All submissions are automatically checked by our [pkgcheck](https://docs.ropensci.org/pkgcheck/) system to ensure packages follow our guidelines.
+ All authors are expected to have run [the main `pkgcheck` function](https://docs.ropensci.org/pkgcheck/reference/pkgcheck.html) locally to confirm that the package is ready to be submitted.
+ Alternatively, an even easier way to ensure a package is ready for submission is to use [the `pkgcheck` GitHub Action](https://github.com/ropensci-review-tools/pkgcheck-action) to run `pkgcheck` as a GitHub Action, as described in [our blog post](https://ropensci.org/blog/2022/02/01/pkgcheck-action/).
+- If your package requires unusual system dependencies (see [*Packaging Guide*](#pkgdependencies)) for our GitHub Action to pass, please submit a pull request adding them to [our base Dockerfile](https://github.com/ropensci-review-tools/pkgcheck/blob/main/Dockerfile).
 - If there are any aspects of `pkgcheck` which your package is unable to pass, please explain reasons in your submission template.
 - If you feel your package is in scope for the
- [Journal of Open-Source Software](https://joss.theoj.org/) (JOSS), do not submit it to JOSS consideration until after the rOpenSci review process is over: if your package is deemed in scope by JOSS editors, only the accompanying short paper would be reviewed, (not the software that will have been extended reviewed by rOpenSci by that time). Not all rOpenSci packages will meet the criteria for JOSS.
+ [Journal of Open-Source Software](https://joss.theoj.org/) (JOSS), do not submit it to JOSS consideration until after the rOpenSci review process is over: if your package is deemed in scope by JOSS editors, only the accompanying short paper would be reviewed, (not the software that will have been extended reviewed by rOpenSci by that time).
+ Not all rOpenSci packages will meet the criteria for JOSS.

 ## The Submission Process {#the-submission-process}

 - Software is submitted for review by [opening a new issue](https://github.com/ropensci/software-review/issues/new/choose) in the software review repository and filling out the template.
-- The template begins with a section which includes several HTML-styled variables (``). These are used by our `ropensci-review-bot`, and must be left in place, with values filled between the indicated start and end points, like this:
+- The template begins with a section which includes several HTML-styled variables (``).
+ These are used by our `ropensci-review-bot`, and must be left in place, with values filled between the indicated start and end points, like this:

 ```{bash, eval=F}
 insert value here