Issue with embed PDFs on www.professeurphifix.net #801

benoit74 · 2025-03-28T09:42:10Z

I'm trying to crawl www.professeurphifix.net and I'm running into an issue with embedded PDFs.

Let's focus on https://www.professeurphifix.net/orthographe_impression/ortho_a_1.html as an example.

The markup that displays the PDF is:

<embed src="ortho_a_1.pdf" width="680px" height="600px">

The crawler therefore does not follow it by default, but this is not a big deal thanks to the "recent" --selectLinks setting ;)

Command used:

crawl --scopeIncludeRx ortho_a_1 --selectLinks "a[href]->href,embed[src]->src" --seeds https://www.professeurphifix.net/orthographe_impression/ortho_a_1.html
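
For reference, here is a small sketch of what the `embed[src]->src` selector and the `ortho_a_1` scope pattern are expected to yield for this page. It is not how the crawler extracts links (that happens in the browser); it only uses the Python standard library to resolve the relative `src` against the page URL and check it against the pattern. The class and function names are made up for illustration, and treating the scope pattern as an unanchored search is an assumption.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Values taken from the report above.
PAGE_URL = "https://www.professeurphifix.net/orthographe_impression/ortho_a_1.html"
PAGE_HTML = '<embed src="ortho_a_1.pdf" width="680px" height="600px">'
SCOPE_INCLUDE = re.compile("ortho_a_1")


class EmbedSrcCollector(HTMLParser):
    """Collect absolute URLs from the src attribute of <embed> tags."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "embed":
            src = dict(attrs).get("src")
            if src:
                # Resolve the relative src against the page URL.
                self.links.append(urljoin(self.base_url, src))


def in_scope_embed_links(html: str, base_url: str, scope: re.Pattern) -> list[str]:
    """Return embed URLs that match the scope pattern (assumed: unanchored search)."""
    collector = EmbedSrcCollector(base_url)
    collector.feed(html)
    return [url for url in collector.links if scope.search(url)]


if __name__ == "__main__":
    for url in in_scope_embed_links(PAGE_HTML, PAGE_URL, SCOPE_INCLUDE):
        print(url)
```

Under these assumptions the PDF URL is both discovered and in scope, which is consistent with the PDF ending up in the WARC.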

With this "tweak", the resulting WARC contains the PDF, but "something" seems to prevent it from being displayed on replayweb.page (and in the ZIM as well, obviously).

Am I missing something? Or is this rather a wombat.js issue?

Sample WARC with the HTML and the PDF:
rec-da74c0c8fc0b-20250328092919995-0.warc.gz
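
In case it helps anyone reproduce this, here is a rough standard-library sketch for listing the response records in a `.warc.gz` file, to confirm that both the HTML and the PDF were captured. It only reads WARC header fields and skips record bodies using `Content-Length`; it is not a full WARC parser, and a dedicated library such as warcio would be the more robust choice. The function names are hypothetical.

```python
import gzip
from typing import BinaryIO, Iterator


def iter_warc_headers(stream: BinaryIO) -> Iterator[dict[str, str]]:
    """Yield the WARC header fields of each record in an uncompressed stream."""
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if not line.startswith(b"WARC/"):
            continue  # blank separator lines between records
        headers: dict[str, str] = {}
        while True:
            raw = stream.readline()
            if raw in (b"\r\n", b"\n", b""):
                break  # end of the WARC header section
            name, _, value = raw.decode("utf-8", "replace").partition(":")
            headers[name.strip().lower()] = value.strip()
        # Skip the record body so payload bytes are never parsed as headers.
        stream.read(int(headers.get("content-length", "0")))
        yield headers


def response_urls(stream: BinaryIO) -> list[str]:
    """Return the target URI of every 'response' record, in file order."""
    return [
        h.get("warc-target-uri", "")
        for h in iter_warc_headers(stream)
        if h.get("warc-type") == "response"
    ]


def response_urls_in_file(path: str) -> list[str]:
    """Same as response_urls, for a gzip-compressed WARC on disk."""
    with gzip.open(path, "rb") as stream:
        return response_urls(stream)
```

Calling `response_urls_in_file("rec-da74c0c8fc0b-20250328092919995-0.warc.gz")` on the sample above should list the captured URLs, assuming the file is in the current directory.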

ikreymer · 2025-04-01T05:45:52Z

I was able to load the PDF in both Chrome and Firefox just now in ReplayWeb.page. Or maybe it works in some cases?
What browser were you using?

benoit74 · 2025-04-01T07:46:30Z

I still can't get it to work in either Firefox or Chrome on macOS (latest versions or so).

Firefox:

Chrome (message is a bit clearer):

benoit74 mentioned this issue Mar 28, 2025

professeurphifix.net openzim/zim-requests#401