Skip to content

AV-50 (AV-77) - Fetch missing cells from DHT #13

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Nov 29, 2022
Merged

Conversation

aterentic-ethernal
Copy link
Contributor

@aterentic-ethernal aterentic-ethernal commented Nov 22, 2022

This PR supports fetch of DHT cells on light client. Main changes are:

  • verify_equality now returns pair of verified and missing row indexes
  • rows function is added to data mod, to be able to construct data row out of a fetched cells

@aterentic-ethernal aterentic-ethernal force-pushed the AV-50 branch 2 times, most recently from 263c77a to bf69b72 Compare November 23, 2022 14:14
@aterentic-ethernal aterentic-ethernal changed the title Av 50 AV-50 (AV-77) - Fetch missing cells from DHT Nov 23, 2022
@aterentic-ethernal aterentic-ethernal marked this pull request as ready for review November 23, 2022 14:43
@aterentic-ethernal aterentic-ethernal requested review from a team November 24, 2022 09:35
@codecov-commenter
Copy link

codecov-commenter commented Nov 24, 2022

Codecov Report

Base: 58.19% // Head: 59.54% // Increases project coverage by +1.35% 🎉

Coverage data is based on head (0c7bea5) compared to base (b0721ef).
Patch coverage: 94.88% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #13      +/-   ##
==========================================
+ Coverage   58.19%   59.54%   +1.35%     
==========================================
  Files          39       39              
  Lines        5236     5411     +175     
==========================================
+ Hits         3047     3222     +175     
  Misses       2189     2189              
Impacted Files Coverage Δ
kate/recovery/src/index.rs 56.98% <20.00%> (-0.79%) ⬇️
kate/recovery/src/data.rs 63.17% <96.00%> (+9.55%) ⬆️
kate/recovery/src/commitments.rs 86.33% <96.80%> (+8.69%) ⬆️
kate/recovery/src/matrix.rs 82.83% <100.00%> (+3.55%) ⬆️
kate/src/com.rs 96.62% <100.00%> (-0.14%) ⬇️
primitives/nomad/merkle/src/error.rs 4.19% <0.00%> (-0.15%) ⬇️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@aterentic-ethernal aterentic-ethernal force-pushed the AV-50 branch 2 times, most recently from 483be54 to 0c7bea5 Compare November 25, 2022 07:28
if commitments.len() % config::COMMITMENT_SIZE > 0 {
return Err(Error::InvalidData(DataError::BadCommitmentsData));
}

let app_rows = com::app_specific_rows(index, dimension, app_id);
if <u32>::try_from(commitments.len() / config::COMMITMENT_SIZE)? != dimensions.extended_rows() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Always cast to the biggest type:

Suggested change
if <u32>::try_from(commitments.len() / config::COMMITMENT_SIZE)? != dimensions.extended_rows() {
if (commitments.len() / config::COMMITMENT_SIZE) != dimensions.extended_rows() as usize {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't that platform dependent and wouldn't as panic if 32-bit is not supported (should we even consider this case?). If we assume that smallest possible usize is u32 then 👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it is platform dependent. It could be u32 (like WASM) or u64 (like current micros).
See https://doc.rust-lang.org/std/primitive.usize.html

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to
if (commitments.len() / config::COMMITMENT_SIZE) != dimensions.extended_rows().try_into()?
to cast to bigger type and avoid panics. Wdyt?


if !all_rows_present {
return Err(Error::InvalidData(DataError::RowAndCommitmentsMismatch));
if <u32>::try_from(rows.len())? != dimensions.extended_rows() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Suggested change
if <u32>::try_from(rows.len())? != dimensions.extended_rows() {
if rows.len() != dimensions.extended_rows() as usize {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, used try_into for conversion.

Comment on lines 134 to 140
.collect::<Result<Vec<(u32, bool)>, Error>>()?;

let verified = verifications
.iter()
.filter(|(_, is_equal)| *is_equal)
.map(|&(i, _)| i)
.collect::<Vec<u32>>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

verifications does not need to be consolidated into a Vec. BTW you could merge the latest .map().filter().map() into a single filter_map() call.

Suggested change
.collect::<Result<Vec<(u32, bool)>, Error>>()?;
let verified = verifications
.iter()
.filter(|(_, is_equal)| *is_equal)
.map(|&(i, _)| i)
.collect::<Vec<u32>>();
.filter(|(_, is_equal)| *is_equal)
.map(|&(i, _)| i)
.collect::<Vec<u32>>();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was interesting to change, removed almost 50% of code, after finding transpose fn on result :) So now it looks like:

	let verified = commitments
		.chunks_exact(config::COMMITMENT_SIZE)
		.zip(rows.iter())
		.zip(0u32..)
		.filter(|(.., index)| app_rows.contains(index))
		.filter_map(|((commitment, row), index)| {
			try_into_scalars(row.as_ref()?)
				.map(|scalars| Evaluations::from_vec_and_domain(scalars, domain).interpolate())
				.and_then(|polynomial| prover_key.commit(&polynomial).map_err(From::from))
				.map(|result| (result.to_bytes() == commitment).then_some(index))
				.transpose()
		})
		.collect::<Result<Vec<u32>, Error>>()?;

	app_rows.retain(|row| !verified.contains(row));

	Ok((verified, app_rows))

wdyt?

Comment on lines 142 to 145
let missing = app_rows
.into_iter()
.filter(|&row| !verified.contains(&row))
.collect::<Vec<u32>>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove new allocations:

Suggested change
let missing = app_rows
.into_iter()
.filter(|&row| !verified.contains(&row))
.collect::<Vec<u32>>();
app_rows.retains(|&row| !verified.contains(&row));

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MiguelDD1 This might be an issue in regards to retain. Don't know if the subsequent change in the retain implementation changed anything, but it might be prudent to consider the perf impact mentioned in the issue. Wdyt?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated code @sh3ll3x3c , it seems that issue is solved, and vec should be smaller than 512, so lets revert if we notice anything here.

Comment on lines +39 to +40
pub fn rows(cells: &[Cell]) -> Vec<(u32, Vec<u8>)> {
let mut sorted_cells = cells.to_vec();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If 1st step is cells.to_vec(), let's simplify the interface:

Suggested change
pub fn rows(cells: &[Cell]) -> Vec<(u32, Vec<u8>)> {
let mut sorted_cells = cells.to_vec();
pub fn rows(mut cells: Vec<Cell>) -> Vec<(u32, Vec<u8>)> {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this mutates argument just for the sake of sorting (and we cannot enforce sorted vector through type I guess), its kinda awkward to have interface like that. Wdyt?

Copy link
Contributor

@MiguelDD1 MiguelDD1 Nov 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MiguelDD1 doesn't this change make the interface more ambiguous in terms whether the data is actually changing or not?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this one, by marking it as mut we allow caller to clone if needed (or to avoid cloning otherwise), but it introduces implicit coupling. If we remove sorting as some point, we need to change interface of fn. Also, its not clear why cells are mutated. I know its not that important here, but wanted to here more arguments.

Comment on lines 41 to 45
sorted_cells.sort_by(|a, b| {
let by_row = a.position.row.cmp(&b.position.row);
let by_col = a.position.col.cmp(&b.position.col);
by_row.then(by_col)
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is not the default implementation if you add #[derive(PartialOrd, Ord)] to Position?
(see Ord Derivable:

Suggested change
sorted_cells.sort_by(|a, b| {
let by_row = a.position.row.cmp(&b.position.row);
let by_col = a.position.col.cmp(&b.position.col);
by_row.then(by_col)
});
sorted_cells.sort_by_key(|cell| cell.position);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well it depends whats affects sort more, rows or columns. In this case its row index. Having default implementation would maybe make that implicit?

Copy link
Contributor

@MiguelDD1 MiguelDD1 Nov 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, you're right.
What do you think about next simplification?

sorted_cells.sort_by(|a, b| (a.position.row, a.position.col).cmp(&(b.position.row,b.position.col))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is cool, its intuitive to assume that first element of pair has precedence. Ive updated the code.

/// Size is nuber of the data cells in the matrix.
/// Index is list of pairs (app_id, start_index),
/// where start index is index of first cell for that application.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move comments (Size and Index) into each member.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@aterentic-ethernal aterentic-ethernal merged commit f536c60 into main Nov 29, 2022
@aterentic-ethernal aterentic-ethernal deleted the AV-50 branch November 29, 2022 09:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants