Skip to content

Zephyr in tree samples CI coverage criteria and elastic quality maintain policy enhancement #85359

Open
@hakehuang

Description

@hakehuang

Introduction

Samples in Zephyr tree have been a very critical code reference for Zephyr RTOS. the basic sample concepts are defined here:
[samples definition
Goal of samples need to be more clear as to:

  1. Samples are constructed in Zephyr RTOS and its eco-systems, so its has to be comply with the zephyr-eco system, so we expect Sample is portable in Zephyr-ecosystem, constrained by Zephyr RTOS's meta-data/device tree settings/Kconfig or other filterable tags maintained in tree.
  2. The samples quality need to be maintained, the scope can be enhanced gradually, but the minimal sets shall be buildable in defined platforms, and runs by board testing CI on those platforms(not necessary full function check, but ensure there is no issues for the application to start).
  3. long term goal for samples is to have full function validation by common tooling(defined by fixtures and harness by now).
  4. the samples portability shall be considered, and CI shall help to enhance the portability.
  5. portable and function issue is a valid issue that community need take a look, the owner shall be (sample owner)/(platform owner)/(dedicated working group)/(test working group)
  6. the sample user case shall be well documented for at least one platform.

Problem description

  1. Some samples are not CI(twister) runnable, it will be either filter out by harness or just timeout.
  2. Abuse the platform_allow without any clear reason on this.
  3. Sample ownership is not clear.
  4. CI for samples need consider upstream simplicity and downstream portability.
  5. Rules by now are:

The bottom line here is:

  • Samples SHALL not be used or treated as tests
  • The tests in the context of the sample verifies the sample as documented and with supported platforms that have been verified to work with the sample, i.e. by whoever wrote the sample.
  • There is no expectation that a sample shall build and run on platforms that are not documented or verified
  • Filters shall be used to limit the test to the platforms that are known to run the sample as per the documentation and expectation of the sample code. This includes ANY filters, i.e. platform_allow is FULLY valid if a sample is known to only work on a limited set of platform and was VERIFIED on those.
  • A filter might be wider in scope and might pull in platforms that were never verified or tested to work with the sample, or the opposite, a filter might be too limited and potential platforms that would be able to run the sample will be excluded. This is a problem we have right now and what this discussion is about. We can fix this in different ways, but the above points and documented policy SHALL be observed.
  • Samples are not Tests
  • There is no expectation from samples to build and run on ALL platforms
  • Platform X failing on sample when this platform was never tested, verified and documented is not a BUG
  • If someone thinks a certain platform should be used with a specific sample, they should be able to modify the sample code to support that platform, document this as a supported platform and verify it actually works.

Concerns with above bottom line are

  • Sample writer needs to be the maintainer, and if it is not runnable in CI, the quality is merely depends on the maintainer.
  • Portability of sample is not considered, which make samples in tree less attraction. Porting sample to a new platform shall always use dts/kconfig/tags to extend( and change source code by based on those), not by hardcode for different platforms, and this needs considered in sample design.
  • filter by tag/Kconfig/DTS is reasonable, but platform_allow is filtered by no reason. Some reasons are:
-  there are many samples with such platform_allow.
-  No one, even the owner does not know whether it can be build / run on other platform
  1. leave samples in free-style to evolve would make the Zephyr sample code quality un-consistent, that shall be avoid.

Proposed change

  1. Stating that we are less prefer to platform_allow, instead that dts/Kconfig/feature tag is recommend.
  2. Enhance sample maintainer group(e.g system group), and assign maintainer to each.
  3. add integration_platforms to sample is a MUST, and CI in upstream will only build for those platforms. And board testing from upstream can focus on integration_platforms. integration_platform shall not keep adding if there is no significant differences.
  4. issues with filtered valid platform on given samples are valid issues(enhancement issues).
  5. establish a design guide and porting guide for sample in general.
  6. downstream testing for filtered platform need maintained by each vender.

Detailed RFC

  • document that platform_allow is less acceptable, using dts/Kconfig/tag is the recommend way by now, and open to future change.
  • allocate samples to dedicated working group.
  • consider to have architecture working group to maintain the sample in general.
  • tag as portability for issues found in the future as enhancement issue.
  • enhance the existing sample code, to allow twister CI can check at least run without initial/system fault.
  • open to community to allow enhancement to change platform_allow by dts/Kconfig/tag, test working group and other function group can propose.

[ ] net
[ ] bluetooth
[ ] boards
[ ] modules
[ ] sensor
[ ] subsys

        [ ] bindesc
        [ ] canbus
        [ ] console
        [ ] dap
        [ ] debug
        [ ] demand_paging
        [ ] display
        [ ] edac
        [ ] fs
        [ ] input
        [ ] ipc
        [ ] llext
        [ ] logging
        [ ] lorawan
        [ ] mgmt
        [ ] modbus
        [ ] nvs
        [ ] pm
        [ ] portability
        [ ] profiling
        [ ] rtio
        [ ] sensing
        [ ] settings
        [ ] shell
        [ ] sip_svc
        [ ] smf
        [ ] subsys.rst
        [ ] task_wdt
        [ ] testsuite
        [ ] tracing
        [ ] usb
        [ ] usb_c
        [ ] zbus

[ ] drivers

       [ ] adc
       [ ] audio
       [ ] auxdisplay
       [ ] can
       [ ] charger
       [ ] clock_control_litex
       [ ] clock_control_xec
       [ ] counter
       [ ] crypto
       [ ] dac
       [ ] display
       [ ] drivers.rst
       [ ] eeprom
       [ ] espi
       [ ] ethernet
       [ ] flash_shell
       [ ] fpga
       [ ] gnss
       [ ] haptics
       [ ] ht16k33
       [ ] i2c
       [ ] i2s
       [ ] ipm
       [ ] jesd216
       [ ] kscan
       [ ] lcd_cyclonev_socdk
       [ ] lcd_hd44780
       [ ] led
       [ ] led_strip
       [ ] lora
       [ ] mbox
       [ ] mbox_data
       [ ] memc
       [ ] misc
       [ ] mspi
       [ ] peci
       [ ] ps2
       [ ] rtc
       [ ] smbus
       [ ] soc_flash_nand
       [ ] soc_flash_nrf
       [ ] spi_bitbang
       [ ] spi_flash
       [ ] spi_flash_at45
       [ ] spi_fujitsu_fram
       [ ] uart
       [ ] video
       [ ] virtualization
       [ ] w1
       [ ] watchdog

Proposed change (Detailed)

proposed PRs for such as below:

#84458
#85446

Dependencies

not clear

Concerns and Unresolved Questions

Need feedback

Alternatives

we can keep Sample as it is now, but the defects are lists above as concerns.

Metadata

Metadata

Assignees

Labels

Architecture ReviewDiscussion in the Architecture WG requiredRFCRequest For Comments: want input from the communityarea: SamplesSamples

Type

Projects

Status

No status

Status

Todo

Status

No status

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions