Add ReservoirSampling algorithm to randomized module #6204

Open: wants to merge 6 commits into base: master

@cureprotocols commented Mar 30, 2025

Algorithm Overview:

  • Efficient for selecting k random elements from a stream of unknown size
  • Commonly used in streaming systems, big data pipelines, and memory-limited environments
  • Time Complexity: O(n)
  • Space Complexity: O(k)
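
For readers unfamiliar with the technique, the following is a minimal sketch of the classic reservoir sampling scheme (often called Algorithm R), not the code submitted in this pull request. The class name `ReservoirSamplingSketch`, the `sample` signature, and the use of an `int[]` as the stream are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch; names and signature are illustrative, not taken from the PR.
final class ReservoirSamplingSketch {
    private ReservoirSamplingSketch() {
    }

    // Returns up to k elements chosen uniformly at random from the stream,
    // using one pass over the input and O(k) extra space.
    static List<Integer> sample(int[] stream, int k, Random random) {
        List<Integer> reservoir = new ArrayList<>(k);
        for (int i = 0; i < stream.length; i++) {
            if (i < k) {
                // Fill the reservoir with the first k elements.
                reservoir.add(stream[i]);
            } else {
                // Pick an index uniformly in [0, i]; the new element replaces
                // a reservoir slot with probability k / (i + 1).
                int j = random.nextInt(i + 1);
                if (j < k) {
                    reservoir.set(j, stream[i]);
                }
            }
        }
        return reservoir;
    }
}
```

This sketch does no input validation: if `k` exceeds the stream length it simply returns every element, and a negative `k` is rejected only by the `ArrayList` constructor.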

Implementation Details:

  • Class: ReservoirSampling
  • Package: com.thealgorithms.randomized
  • JavaDoc included for class and method
  • Demonstration included in main() method
  • File name and class name follow PascalCase
  • Fully formatted using clang-format

Reference:

Author: Michael Alexander Montoya (@cureprotocols)

  • I have read CONTRIBUTING.md.
  • This pull request is all my own work -- I have not plagiarized it.
  • All filenames are in PascalCase.
  • All functions and variable names follow Java naming conventions.
  • All new algorithms have a URL in their comments that points to Wikipedia or other similar explanations.
  • All new code is formatted with clang-format -i --style=file path/to/your/file.java

@DenizAltunkapan
Contributor

What if sampleSize > stream.length? Perhaps there should be error handling for invalid input. The rest LGTM.

@cureprotocols
Author

cureprotocols commented Mar 31, 2025

✅ Added input validation for sampleSize > stream.length as requested — thanks for the suggestion, @DenizAltunkapan!

Let me know if anything else is needed — happy to improve further 🙌
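
The validation discussed above might look roughly like the sketch below. The thread does not say which exception the PR throws or how the message is worded, so the use of `IllegalArgumentException`, the extra checks for a `null` stream and a negative size, and the class name `ValidatedReservoirSampling` are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of input validation; not the code from the PR.
final class ValidatedReservoirSampling {
    private ValidatedReservoirSampling() {
    }

    static List<Integer> sample(int[] stream, int sampleSize, Random random) {
        if (stream == null) {
            throw new IllegalArgumentException("stream must not be null");
        }
        // Reject the case raised in review (sampleSize > stream.length),
        // plus negative sizes, which are an assumed extra check.
        if (sampleSize < 0 || sampleSize > stream.length) {
            throw new IllegalArgumentException(
                "sampleSize must be between 0 and " + stream.length + ", got " + sampleSize);
        }
        List<Integer> reservoir = new ArrayList<>(sampleSize);
        for (int i = 0; i < stream.length; i++) {
            if (i < sampleSize) {
                reservoir.add(stream[i]);
            } else {
                int j = random.nextInt(i + 1); // uniform in [0, i]
                if (j < sampleSize) {
                    reservoir.set(j, stream[i]);
                }
            }
        }
        return reservoir;
    }
}
```

Failing fast here is one design choice; an alternative would be to return the whole stream when the requested sample is larger than the input.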

Member

@siriak left a comment

The code looks good, could you please add some JUnit tests and remove main? You could check that the correct number of elements is returned, that they are all from the initial set, maybe other properties of the algorithm (see https://github.com/TheAlgorithms/Java/tree/master/src/test/java/com/thealgorithms)
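
The two properties suggested in this review (correct number of elements, all drawn from the initial set) could be expressed as small predicates, sketched below without any test-framework dependency. The class `SampleProperties` and its method names are hypothetical; in the repository they would presumably be wrapped in JUnit assertions such as `assertTrue`.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper showing the properties a test might check.
final class SampleProperties {
    private SampleProperties() {
    }

    // Property 1: the sample contains exactly the requested number of elements.
    static boolean hasExpectedSize(List<Integer> sample, int sampleSize) {
        return sample.size() == sampleSize;
    }

    // Property 2: every sampled element comes from the stream, and no element
    // appears in the sample more often than it appears in the stream.
    static boolean isDrawnFrom(int[] stream, List<Integer> sample) {
        Map<Integer, Integer> remaining = new HashMap<>();
        for (int value : stream) {
            remaining.merge(value, 1, Integer::sum);
        }
        for (int value : sample) {
            Integer left = remaining.get(value);
            if (left == null || left == 0) {
                return false;
            }
            remaining.put(value, left - 1);
        }
        return true;
    }
}
```

Because the algorithm is randomized, checks like these are usually preferable to asserting an exact output, unless the random source is seeded.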

@codecov-commenter

codecov-commenter commented Apr 3, 2025

Codecov Report

Attention: Patch coverage is 91.66667% with 1 line in your changes missing coverage. Please review.

Project coverage is 73.79%. Comparing base (2570a99) to head (e2a0a9a).

Files with missing lines | Patch % | Lines
...om/thealgorithms/randomized/ReservoirSampling.java | 91.66% | 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff            @@
##             master    #6204   +/-   ##
=========================================
  Coverage     73.78%   73.79%           
- Complexity     5299     5304    +5     
=========================================
  Files           671      672    +1     
  Lines         18344    18356   +12     
  Branches       3546     3549    +3     
=========================================
+ Hits          13536    13545    +9     
- Misses         4262     4264    +2     
- Partials        546      547    +1     

@cureprotocols
Author

✅ Applied clang-format and added JUnit test to src/test/java/com/thealgorithms/randomized/.

All requested changes completed. Ready for final review 💪

siriak previously approved these changes Apr 6, 2025
Member

@siriak left a comment

Code looks good, please fix PR checks and it's ready to merge. Thank you for your patience, I'm very busy at the moment so response times are high :)

@cureprotocols
Author

✅ Final test file now formatted with clang-format --style=file
✅ Verified formatting with --dry-run --Werror
✅ Test class is correctly placed under:
src/test/java/com/thealgorithms/randomized/ReservoirSamplingTest.java

CI will re-run shortly. Appreciate all the support — ready for merge when you are 💪

@siriak
Member

siriak commented Apr 6, 2025

Error: /home/runner/work/Java/Java/src/main/java/com/thealgorithms/randomized/ReservoirSampling.java:19:1: Utility classes should not have a public or default constructor. [HideUtilityClassConstructor]
Error: /home/runner/work/Java/Java/src/test/java/com/thealgorithms/randomized/ReservoirSamplingTest.java:3:47: Using the '.*' form of import should be avoided - org.junit.jupiter.api.Assertions.*. [AvoidStarImport]
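
Both Checkstyle errors have conventional fixes, sketched below under stated assumptions. The class `SamplingUtils` and its method are hypothetical stand-ins for a class that exposes only static members; the actual change needed in `ReservoirSampling.java` may differ.

```java
// Hypothetical utility class; the name and method are illustrative only.
final class SamplingUtils {
    // A private constructor prevents instantiation, which is what the
    // HideUtilityClassConstructor check asks for.
    private SamplingUtils() {
    }

    // Returns true when a sample of the given size can be drawn from a
    // stream of the given length.
    static boolean isValidSampleSize(int sampleSize, int streamLength) {
        return sampleSize >= 0 && sampleSize <= streamLength;
    }
}

// For AvoidStarImport, a test file would name each assertion it uses, e.g.
//   import static org.junit.jupiter.api.Assertions.assertEquals;
//   import static org.junit.jupiter.api.Assertions.assertTrue;
// instead of a wildcard import of org.junit.jupiter.api.Assertions.
```

Which assertions need to be imported explicitly depends on what the test class actually calls; the two named in the comment are only examples.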

4 participants