Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test(primitives): Extract commonalities from market and storage-provider pallet benchmarks, and use real proof data #818

Open
wants to merge 6 commits into
base: develop
Choose a base branch
from

Conversation

Jinxit
Copy link
Contributor

@Jinxit Jinxit commented Apr 4, 2025

Description

A problem arising from using real proofs for benchmarking the storage-provider pallet is that the runtime must use the real proof verifier, which means the market pallet benchmarks break. While the setup created for storage-provider could be copied across, it feels a bit more maintainable to:

  1. Extract commonalities and reuse them.
  2. Create a reproducible (and deterministic) way to generate the data.

This PR attempts to move test setup code, mostly for benchmarks, which is currently duplicated between the market and storage-provider pallets. I didn't try to extract everything, I was mostly aiming to get it to work to not make an already large PR any larger. There's a new CLI command to storage-provider-client called benchmark-data to generate code, keys and proofs which are then embedded in the benchmark binary. I've also created traits for things which are common to the respective pallets' configs, such as CurrencyProvider, BalanceOf, and traits for each pallet's config itself - since that is kind of part of the (internal) interfaces of the pallets.

This is the big one, so I'm very much open to comments for improvements. Going commit by commit might not compile, but the changes are grouped on a best-effort basis to make reviewing them easier. be6fb5d is a bit of a mess though, I'll admit.

There is some overlap with what this is doing and the downloading of parameters (just download-params), but I haven't really made any attempt to unify the two approaches.

Important points for reviewers

As a side effect of creating MarketProvider and StorageProviderProvider (what a name, it's like I'm back in Java-land), the #[pallet::constant] tags have disappeared from the constants which are used by the other pallet, respectively. This means the constants are no longer a part of the pallet metadata, which I think is fine for now. The end goal should probably be to merge the two pallets anyway and move stuff back out of primitives.

Checklist

  • Are there important points that reviewers should know?
    • Less pallet metadata
  • Make sure that you described what this change does.
  • If there are follow-ups, have you created issues for them?
  • Have you tested this solution?
  • Were there any alternative implementations considered?
    • Copy-paste the setup
    • Merge storage-provider and market (maybe later?)
  • Did you document new (or modified) APIs?

Jinxit added 6 commits April 4, 2025 11:03
…y instead of scale encoding a vector

Signed-off-by: Lucas Åström <[email protected]>
…orage-provider benchmarks, and use the same real proof data in both

Signed-off-by: Lucas Åström <[email protected]>
@Jinxit Jinxit added the ready for review Review is needed label Apr 4, 2025
Copy link
Contributor

@th7nder th7nder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff! In this first pass I just focused on the high-level functionality and concepts, not at the trait choices/code design.

let sector_size = seal_proof.sector_size();
let porep_params_path = params_root.join(format!("{sector_size}.{POREP_PARAMS_EXT}"));
let porep_params_vk_path = params_root.join(format!("{sector_size}.{POREP_VK_EXT_SCALE}"));
if !tokio::fs::try_exists(&porep_params_vk_path).await? {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should fail instead of generate, we should always use those shared params that are (should) be stored on blob.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was written before the downloading of params so it's generating a completely separate set - I'll switch it over.

Copy link
Contributor Author

@Jinxit Jinxit Apr 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, thinking about this again, shouldn't it generate if they are missing? If we're using this as a command to run and then commit to the repo (or to the blob store), then this should be how that data is generated in the first place. This command shouldn't be run on developer machines, really, outside of developing the command itself.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd agree about the generation, but to generate them properly you need to have a huge-ass machine and then update them in lots of places. Like the repo, chain, genesis, etc.
I think this is not a trivial step and needs to be done explicitly.

// This code has been generated by `target/release/polka-storage-provider-client proofs benchmark-data --sector-size 2KiB examples/test-data-big.car`
SectorSize::_2KiB => BenchmarkData {
storage_provider_name: "//StorageProvider",
verifying_key: include_bytes!("../../target/bench/params/2KiB.porep.vk.scale"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to store verifying key as well as generated proofs in the repo, somewhere committed. If we don't, CI will fail.
Params cannot be stored on the repo, as they're too big.

I'd suggest the workflow looks like this:

  • we generate the bench data for 1GiB sectors (both proofs, vks and stuff) on the beefy machine, by running the command benchmark-data
  • upload this as a backup to our blob storage (i'm doing it now)
  • commit the generate benched data to the repo (as well as the benchmarks)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good, I'll adjust the paths accordingly and add it to the download script.

.join(sector_size.to_string());
tokio::fs::create_dir_all(&proofs_root).await?;

for sector_number in 0..MAX_SECTORS_PER_CALL {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This MAX_SECTORS_PER_CALL is probably to be adjusted. I think that with 1GiB sectors we'll be able to ProveCommit only 1 per call, PreCommit is much different, it's not that compute heavy, but still.

To generate 20 seperate proofs for 1GiB, it'll take like 5 hours (1 proof -> ~20minutes).
Do we wanna do that if we'll accept one proof for prove commit anyways?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're planning on lowering the max sectors (even to 1), then yeah I agree this is a good time to do it.


pub fn load(sector_size: SectorSize) -> BenchmarkData<T> {
match sector_size {
// DO NOT MODIFY
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw. shouldn't we require only one sector_size here to be loaded?
Like if we need benchmarks only for 1GiB/8MiB to not pollute the binary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean with features or something? I mostly started like this when going back and forth between generating data and running it, it was convenient to easily swap them out.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it was bothering me that we'll only use SectorSize 1GiB for benchmarks and have multiple sector sizes in this switch which are not necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it's a good point, I think I'll switch it out (ha) to just returning the 1GB data as soon as I get that working.

Copy link
Member

@cernicc cernicc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This benchmarking looks like a rocket science :D Left some comments. Good job figuring this out.

}

fn generate_porep_params(
output_path: Option<PathBuf>,
output_path: Option<impl AsRef<Path>>,
seal_proof: RegisteredSealProof,
) -> Result<(), CliError> {
let output_path = if let Some(output_path) = output_path {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is better to accept an owned type if the ownership is needed by the function logic. The caller can then decide if another allocation is needed or if the ownership can just be transferred to the function.

In this example the owned value is immediately created from the reference. This is a good indication of the point above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was less about not cloning and more about developer experience in an earlier commit where I was using a TempDir directly which implements AsRef<Path>. That changed along the way, and now it's just a PathBuf on both call sites.

I'm up for changing it, but I don't think performance is the right reason. The command takes minutes to run after all. But writing simpler code I can agree to.

@@ -130,6 +144,59 @@ pub enum ProofsCommand {
/// CID - CommR of a replica (output of `porep` command)
comm_r: String,
},
/// Generates PoRep params, PoSt params, proofs and commitments used for benchmarking.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a feeling like we should hide everything required for benchmarks behind some feature flag. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. At first I considered runtime-benchmarks but I'm not sure whether that's already too overloaded with the actual benchmarking code.

pub type BalanceOf<T> = <<T as CurrencyProvider>::Currency as Currency<
<T as frame_system::Config>::AccountId,
>>::Balance;
pub trait CurrencyProvider: frame_system::Config {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing new line before the trait

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready for review Review is needed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants