Memory leak with small fragment size libraries in v0.24.0 #602
Comments
I also ran Valgrind debugging on FASTP using a CI sample:
==2096077== Memcheck, a memory error detector
==2096077== Copyright (C) 2002-2024, and GNU GPL'd, by Julian Seward et al.
==2096077== Using Valgrind-3.24.0 and LibVEX; rerun with -h for copyright info
==2096077== Command: /shared/software/fastp-0.24.0-custom/fastp --stdin --interleaved_in --out1 /tmp/snakemake_nutcracker_ffpe_7743_R1.fastq --out2 /tmp/snakemake_nutcracker_ffpe_7743_R2.fastq --adapter_fasta=resources/truseq.fa.gz --correction --disable_quality_filtering --thread=2 --html=results/81-fastp/nutcracker_ffpe.fastp.html --json=results/81-fastp/nutcracker_ffpe.fastp.json --length_required=30 --dont_eval_duplication
==2096077==
Read1 before filtering:
total reads: 9685298
total bases: 1452794700
Q20 bases: 1200457104(82.6309%)
Q30 bases: 1150074618(79.1629%)
Read2 before filtering:
Read1 after filtering:
Read2 after filtering:
Filtering result:
Insert size peak (evaluated by paired-end reads): 269
JSON report: results/81-fastp/nutcracker_ffpe.fastp.json
/shared/software/fastp-0.24.0-custom/fastp --stdin --interleaved_in --out1 /tmp/snakemake_nutcracker_ffpe_7743_R1.fastq --out2 /tmp/snakemake_nutcracker_ffpe_7743_R2.fastq --adapter_fasta=resources/truseq.fa.gz --correction --disable_quality_filtering --thread=2 --html=results/81-fastp/nutcracker_ffpe.fastp.html --json=results/81-fastp/nutcracker_ffpe.fastp.json --length_required=30 --dont_eval_duplication
Note that this CI sample was generated using bamsurgeon, and as such does not have any read-through adapter sequences. I would be more than happy to provide these test FASTQ files if that is helpful!
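In case a smaller file is easier to share, here is a minimal sketch for cutting the first read pairs out of an interleaved FASTQ stream. It assumes plain 4-line FASTQ records with R1 and R2 alternating, and `head_interleaved` is a hypothetical helper name, not part of fastp. A truncated sample may not reproduce the memory growth, so it would need to be checked before sharing.

```python
import itertools


def head_interleaved(lines, n_pairs):
    """Return an iterator over the lines of the first n_pairs read pairs.

    Assumption: plain 4-line FASTQ records, interleaved as R1, R2, R1, R2, ...
    so one read pair occupies 8 consecutive lines.
    """
    return itertools.islice(lines, n_pairs * 8)


# Possible usage as a filter on decompressed, interleaved input:
#   import sys
#   sys.stdout.writelines(head_interleaved(sys.stdin, 100000))
```

The output can be piped into fastp with the same `--stdin --interleaved_in` options used in the command above.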
Thank you very much! Could you please provide a data sample, along with the command, so that I can reproduce this issue?
First off, I would like to thank you for developing and supporting FASTP for all these years. We've used it quite extensively, and its runtime performance and feature set are AMAZING.
As a result of our FFPE DNA extraction and library prep protocol, we have been generating Illumina WGS data from libraries with very short fragment sizes (down to a mean of 100bp fragments in some really low quality samples). When we run FASTP on these samples we see memory usage absolutely explode (up to 100+GB, seems to increase as fragment size decreases) using the following command:
fastp --stdin --interleaved_in --length_required=30 --dont_eval_duplication --disable_quality_filtering --thread=5 --correction --out1 <fastq_R1> --out2 <fastq_R2> --adapter_fasta=<adapter.fa> --html=<html_path> --json=<json_path>
We tried applying the memory leak patch in v0.24.0, and while memory usage did definitely improve, we are still seeing 100+GB for some samples.
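To make the memory numbers comparable between samples and builds, peak memory can be recorded per run. Below is a minimal sketch, assuming a Linux host and Python 3; `peak_rss_of` is a hypothetical helper name, and the command it wraps is whatever fastp invocation is being tested.

```python
import resource
import subprocess


def peak_rss_of(cmd, stdin=None):
    """Run cmd to completion and return (exit code, peak RSS of children).

    ru_maxrss is reported in kilobytes on Linux (bytes on macOS).
    RUSAGE_CHILDREN reports the largest waited-for child of this Python
    process so far, so use one fresh Python process per measurement.
    """
    proc = subprocess.run(cmd, stdin=stdin)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return proc.returncode, usage.ru_maxrss


# Possible usage, with the interleaved FASTQ supplied on stdin:
#   import sys
#   code, peak = peak_rss_of(["fastp", "--stdin", "--interleaved_in"], stdin=sys.stdin)
#   print(f"exit={code} peak_rss={peak}", file=sys.stderr)
```

Recording this value for libraries with different mean fragment sizes would show how steeply memory grows as fragments get shorter.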
I am definitely not a C programmer but have been running Valgrind to try to find the source of this leak, and it has provided some insights:
I attempted to patch this leak in peprocessor.cpp using the following:
Which did reduce memory usage, but not significantly.
Valgrind also returned this:
I am completely out of my element here, so any advice or thoughts would be greatly appreciated.