I wanted to generate Markdown from a really long PDF document (roughly 100 pages). Simply printing the text works, but as soon as the file is converted to Markdown, I get the error below. Is there a known limitation on the length of a document?
Traceback (most recent call last):
  File "/Users/user/Desktop/Repositories/markitdown/script/markdown.py", line 73, in <module>
    main()
    ~~~~^^
  File "/Users/user/Desktop/Repositories/markitdown/script/markdown.py", line 34, in main
    text = process_file(file_path)
  File "/Users/user/Desktop/Repositories/markitdown/script/markdown.py", line 19, in process_file
    result = md.convert(file_path)
  File "/Users/user/Desktop/Repositories/markitdown/packages/markitdown/src/markitdown/_markitdown.py", line 259, in convert
    return self.convert_local(source, stream_info=stream_info, **kwargs)
           ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/Desktop/Repositories/markitdown/packages/markitdown/src/markitdown/_markitdown.py", line 310, in convert_local
    return self._convert(file_stream=fh, stream_info_guesses=guesses, **kwargs)
           ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/Desktop/Repositories/markitdown/packages/markitdown/src/markitdown/_markitdown.py", line 541, in _convert
    raise UnsupportedFormatException(
        f"Could not convert stream to Markdown. No converter attempted a conversion, suggesting that the filetype is simply not supported."
    )
markitdown._exceptions.UnsupportedFormatException: Could not convert stream to Markdown. No converter attempted a conversion, suggesting that the filetype is simply not supported
Thanks for the report. Let's get to the bottom of this.
What version of the library are you using? Did you install it with [all] or at least [pdf]?
Is this a problem with all (e.g., smaller) PDFs? Or just this one?
Are you using the Python library or the command line?
Adding a debug option and more Python logging is on my plate, to better support debugging these kinds of scenarios.
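In the meantime, a small script along these lines might help answer the questions above. It is only a rough sketch: `diagnose_pdf` is a hypothetical helper, not part of markitdown, and it assumes that the `[pdf]` extra pulls in a package importable as `pdfminer`. It reports the installed markitdown version, whether that package can be found, and whether the file carries a `%PDF-` header near its start.

```python
import importlib.metadata
import importlib.util


def diagnose_pdf(path):
    """Collect basic facts that may explain an UnsupportedFormatException.

    Hypothetical helper for this thread; not part of markitdown.
    """
    # PDF files normally begin with "%PDF-"; look in the first 1024 bytes
    # to tolerate a little leading junk.
    with open(path, "rb") as fh:
        head = fh.read(1024)

    try:
        version = importlib.metadata.version("markitdown")
    except importlib.metadata.PackageNotFoundError:
        version = None  # markitdown is not installed in this environment

    return {
        "looks_like_pdf": b"%PDF-" in head,
        # Assumption: the [pdf] extra installs a package importable as "pdfminer".
        "pdfminer_available": importlib.util.find_spec("pdfminer") is not None,
        "markitdown_version": version,
    }
```

If `pdfminer_available` comes back `False`, reinstalling with `[pdf]` or `[all]` would be the first thing to try. If `looks_like_pdf` is `False`, the file itself is the more likely culprit than its length.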