Ip-location-api updatedb.mjs broken as of bun 1.2.0 #18662

Open
LucyEgan opened this issue Mar 31, 2025 · 8 comments · May be fixed by #18796
Labels: bug (Something isn't working), node.js (Compatibility with Node.js APIs), regression

Comments

@LucyEgan

What version of Bun is running?

1.2.7+5c0fa6dc2

What platform is your computer?

Linux 5.15.167.4-microsoft-standard-WSL2 x86_64 x86_64

What steps can reproduce the bug?

Upgrading from 1.1.38 to 1.2.0+ changes the behaviour of the following command:

bun install --no-cache -g ip-location-api && bun ~/.bun/install/global/node_modules/ip-location-api/script/updatedb.mjs

It seems to exit/crash partway through processing, as can be seen in the 1.1 vs 1.2 examples below, where the 1.2 output is missing "Database update completed!!".

What is the expected behavior?

Bun 1.1.38 - 1.1.45

docker run --rm --entrypoint bash oven/bun:1.1.45 -c 'bun install --no-cache -g ip-location-api && bun ~/.bun/install/global/node_modules/ip-location-api/script/updatedb.mjs && ls -lah ~/.bun/install/global/node_modules/ip-location-api/data/g/'
bun add v1.1.45 (196621f2)
Resolving dependencies
Resolved, downloaded and extracted [184]
Saved lockfile

installed [email protected]

46 packages installed [1.87s]
Downloading database
Decompressing GeoLite2-Country-CSV.zip
Extracting GeoLite2-Country-CSV_20250328/GeoLite2-Country-Blocks-IPv4.csv
Extracting GeoLite2-Country-CSV_20250328/GeoLite2-Country-Blocks-IPv6.csv
Extracting GeoLite2-Country-CSV_20250328/GeoLite2-Country-Locations-en.csv
[ "GeoLite2-Country-Locations-en.csv", "GeoLite2-Country-Blocks-IPv4.csv", "GeoLite2-Country-Blocks-IPv6.csv" ]
Creating database for ip-location-api
Database update completed!!
SUCCESS TO UPDATE
total 7.5M
drwxr-xr-x 2 root root 4.0K Mar 31 09:36 .
drwxr-xr-x 3 root root 4.0K Mar 31 09:36 ..
-rw-r--r-- 1 root root 1.3M Mar 31 09:36 4-1.dat
-rw-r--r-- 1 root root 1.3M Mar 31 09:36 4-2.dat
-rw-r--r-- 1 root root 616K Mar 31 09:36 4-3.dat
-rw-r--r-- 1 root root 2.0M Mar 31 09:36 6-1.dat
-rw-r--r-- 1 root root 2.0M Mar 31 09:36 6-2.dat
-rw-r--r-- 1 root root 506K Mar 31 09:36 6-3.dat
-rw-r--r-- 1 root root   64 Mar 31 09:36 GeoLite2-country-CSV.zip.sha256

What do you see instead?

Bun 1.2.0 - 1.2.7

docker run --rm --entrypoint bash oven/bun:1.2.7 -c 'bun install --no-cache -g ip-location-api && bun ~/.bun/install/global/node_modules/ip-location-api/script/updatedb.mjs && ls -lah ~/.bun/install/global/node_modules/ip-location-api/data/g/'
bun add v1.2.7 (5c0fa6dc)
Resolving dependencies
Resolved, downloaded and extracted [184]
Saved lockfile

installed [email protected]

46 packages installed [547.00ms]
Downloading database
Decompressing GeoLite2-Country-CSV.zip
Extracting GeoLite2-Country-CSV_20250328/GeoLite2-Country-Blocks-IPv4.csv
Extracting GeoLite2-Country-CSV_20250328/GeoLite2-Country-Blocks-IPv6.csv
Extracting GeoLite2-Country-CSV_20250328/GeoLite2-Country-Locations-en.csv
[ "GeoLite2-Country-Locations-en.csv", "GeoLite2-Country-Blocks-IPv4.csv", "GeoLite2-Country-Blocks-IPv6.csv" ]
Creating database for ip-location-api
total 2.7M
drwxr-xr-x 2 root root 4.0K Mar 31 09:35 .
drwxr-xr-x 3 root root 4.0K Mar 31 09:35 ..
-rw-r--r-- 1 root root 1.1M Mar 31 09:35 4-1.dat.tmp
-rw-r--r-- 1 root root 1.1M Mar 31 09:35 4-2.dat.tmp
-rw-r--r-- 1 root root 539K Mar 31 09:35 4-3.dat.tmp

Additional information

I have tested all versions from 1.1.38 to 1.2.7, and the behaviour changes at 1.2.0.
I haven't yet dug into the root cause, but I would assume it's byte/stream related, somewhere in the processing of the CSVs into the optimized .dat format.

LucyEgan added the bug (Something isn't working) and needs triage labels on Mar 31, 2025
@LucyEgan (Author)

It seems to fail at ip-location-api/src/db.mjs:500, after processing the same netblock at line 499,956 of GeoLite2-Country-Blocks-IPv4.csv, so a change in how buffers are allocated/written could have landed in 1.2.0?

Notes

  • Fails at the same netblock on 1.2.0 and 1.2.7

Around

if(ipv4){
        // IPv4 range boundaries fit in 32 bits
        buffer1 = Buffer.allocUnsafe(4)
        buffer1.writeUInt32LE(start)
        buffer2 = Buffer.allocUnsafe(4)
        buffer2.writeUInt32LE(end)
} else {
        // IPv6 range boundaries are 64-bit BigInts
        buffer1 = Buffer.allocUnsafe(8)
        buffer1.writeBigUInt64LE(start)
        buffer2 = Buffer.allocUnsafe(8)
        buffer2.writeBigUInt64LE(end)
}
// two-letter country code
buffer3 = Buffer.allocUnsafe(2)
buffer3.write(cc)
if(preBuffer1){
        // write() returning false signals backpressure, so the CSV read stream is paused
        if(!ws1.write(preBuffer1)) rs.pause()
        if (setting.smallMemory) {
                ws = createSmallMemoryFile(ws, ipv4, lineCount++, preBuffer2, preBuffer3)
        } else {
                if(!ws2.write(preBuffer2)) rs.pause()
                if(!ws3.write(preBuffer3)) rs.pause()
        }
}

@JasonTJames

Try replacing Buffer.allocUnsafe() with Buffer.alloc().

This will zero-fill memory, which helps determine whether it is a memory-handling issue:

buffer1 = Buffer.alloc(4)
buffer1.writeUInt32LE(start)

@LucyEgan (Author)

@JasonTJames just tried your suggestion: I replaced each of the .allocUnsafe calls with .alloc, and the same thing happens. Interestingly, it gets to the point where it pauses, then waits 5-6s before exiting, so it could be hanging, either internally in Bun or somewhere else in ip-location-api.
Its exit code is 0.
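
One way to check whether a write stream is still stuck waiting for drain at exit (a sketch only; ws1/ws2/ws3 are the stream variables from db.mjs and would need to be in scope where this handler is registered):

process.on('beforeExit', () => {
    // If writableNeedDrain is still true here, write() returned false at some
    // point and 'drain' never fired, i.e. the stream is stuck.
    for (const [name, ws] of Object.entries({ ws1, ws2, ws3 })) {
        console.log(name, {
            needDrain: ws.writableNeedDrain,
            buffered: ws.writableLength,
            highWaterMark: ws.writableHighWaterMark,
        });
    }
});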

@JasonTJames commented Mar 31, 2025

Try this to see exactly where logs stop.

console.log(`[${lineCount}] Processing: ${start} - ${end}`); 

if (ipv4) {
    buffer1 = Buffer.alloc(4);
    buffer1.writeUInt32LE(start);
    buffer2 = Buffer.alloc(4);
    buffer2.writeUInt32LE(end);
} else {
    buffer1 = Buffer.alloc(8);
    buffer1.writeBigUInt64LE(start);
    buffer2 = Buffer.alloc(8);
    buffer2.writeBigUInt64LE(end);
}

console.log(`[${lineCount}] Buffers allocated, writing to stream...`);

// Add logging before writes
console.log(`[${lineCount}] Writing preBuffer1`);
if (!ws1.write(preBuffer1)) {
    console.log(`[${lineCount}] Pausing stream...`);
    rs.pause();
    ws1.once('drain', () => {
        console.log(`[${lineCount}] Resuming stream`);
        rs.resume();
    });
}

// Similar logs for other writes
console.log(`[${lineCount}] Writing preBuffer2`);
if (!ws2.write(preBuffer2)) rs.pause();

console.log(`[${lineCount}] Writing preBuffer3`);
if (!ws3.write(preBuffer3)) rs.pause();

console.log(`[${lineCount}] Done processing`);

If logs stop before writing buffers, then something is breaking before buffer allocation.

If logs stop after .pause() but before .resume(), then Bun 1.2+ isn’t handling backpressure correctly.
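
If it does look like backpressure, a minimal standalone repro, independent of ip-location-api, could confirm whether Bun 1.2+ drops the drain event (a sketch; the file path, chunk size, and write count are arbitrary):

import fs from 'node:fs';

const ws = fs.createWriteStream('/tmp/backpressure-test.dat', {
    highWaterMark: 1024 * 1024, // 1MB write buffer for the demo
});

let i = 0;
function writeMore() {
    while (i < 5_000_000) {
        const chunk = Buffer.allocUnsafe(4);
        chunk.writeUInt32LE(i++);
        if (!ws.write(chunk)) {
            // Backpressure: stop writing until 'drain' fires.
            ws.once('drain', writeMore);
            return;
        }
    }
    ws.end(() => console.log('done after', i, 'writes'));
}

writeMore();

If this prints "done" on 1.1.x but hangs or exits silently on 1.2.x, the regression is in Bun's drain handling rather than in ip-location-api itself.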

RiskyMH added the node.js (Compatibility with Node.js APIs) and regression labels and removed the needs triage label on Apr 1, 2025
@LucyEgan (Author) commented Apr 1, 2025

It fails writing the buffer on 1.2.7 and logs 516; on 1.1.45 it never logs that line:

if(!ws1.write(preBuffer1)) {
    console.log(`[${lineCount}] 516`);
    rs.pause()
}

@JasonTJames

Since the log only appears in Bun 1.2.7 and not in 1.1.45, it could signal that Bun 1.2+ changed its handling of backpressure, which causes the write stream ws1 to reach its internal limit faster.

The reason it pauses for 5-6 seconds and exits silently could be that ws1.write() returns false because the stream is overwhelmed, so rs.pause() is called but the read stream is never resumed.

Here are some things that you can try:

Manually resume on drain:

Currently, the script pauses but never resumes once the stream has drained. Try listening for the drain event so you can resume the read stream:

if (!ws1.write(preBuffer1)) {
    console.log(`[${lineCount}] 516 - backpressure detected, pausing`);
    rs.pause();
    ws1.once('drain', () => {
        console.log(`[${lineCount}] resuming readable stream`);
        rs.resume();
    });
}

This may work because drain fires once the buffer has flushed, making it safe to call resume.

You could also try increasing the write stream's buffer size:

const ws1 = fs.createWriteStream('output.dat', { highWaterMark: 64 * 1024 }); // 64KB buffer

Since the default highWaterMark for writable streams is 16KB, increasing it will make backpressure occur less often.

@LucyEgan (Author) commented Apr 1, 2025

@JasonTJames It already has drain handling that resumes the stream when pause gets triggered. Also, they're already setting highWaterMark to 1MB, so 64KB would make the issue happen more often? I'll try increasing it further to see if it changes anything. Does it work as expected for you?

@LucyEgan (Author) commented Apr 2, 2025

@JasonTJames nice call on highWaterMark: raising it above the set value of 1MB pushes the problem further back, and setting it to 4MB makes it go away. I'm going to patch it to 128MB to work around this issue until it's sorted in Bun.
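
For reference, the workaround amounts to something like this when creating the write streams (a sketch only; the real code is in ip-location-api's src/db.mjs and the path shown here is illustrative):

import fs from 'node:fs';

const ws1 = fs.createWriteStream('data/g/4-1.dat.tmp', {
    // 128MB: large enough that write() never returns false for this dataset,
    // so the broken pause/drain path is never taken.
    highWaterMark: 128 * 1024 * 1024,
});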
