Russian Webpage parsing support. #263
Comments
For more reference, see hello-efficiency-inc/raven-reader#269
The same applies to Chinese and other Asian languages: you get a bunch of raw Unicode codes rather than the actual content. See #264
Thanks for reporting, @mrgodhani and @HenryQW. I'll be honest: I don't have a ton of experience with encoding in these scenarios. This is where the encoding currently takes place: Does anything stand out to you as doing it wrong? Or any other suggestions? We're more than happy to accept help.
My findings #267
Thanks
This fix has been merged and will be included in the next release.
Expected Behavior
Proper encoding for Russian-language pages.
Current Behavior
When parsing this link, https://www.finam.ru/analysis/newsitem/putin-nagradil-grefa-ordenom-20190208-203615/?utm_source=rss&utm_medium=new_compaigns&utm_campaign=news_to_finamb, the parser doesn't return properly encoded output, so the formatting is messed up when the result is rendered as HTML.
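To illustrate what may be going on, here is a minimal Python sketch of charset-aware decoding. It assumes the page is served in a legacy single-byte encoding such as windows-1251; the issue does not state the page's actual charset, so that is a guess. `detect_charset` and `decode_html` are hypothetical helpers written for this example, not functions from the parser, and the sample HTML is made up rather than fetched from the link above.

```python
import re

# Hypothetical helpers for illustration only; not part of the parser's API.
# Matches charset=... in a Content-Type header or in a <meta> tag.
CHARSET_RE = re.compile(r"charset\s*=\s*[\"']?([\w.:-]+)", re.IGNORECASE)


def detect_charset(content_type, body, default="utf-8"):
    """Return the charset declared in the header, else in the markup, else default."""
    match = CHARSET_RE.search(content_type or "")
    if match:
        return match.group(1).lower()
    # Charset declarations are ASCII, so scanning the first bytes as ASCII is enough.
    head = body[:2048].decode("ascii", errors="ignore")
    match = CHARSET_RE.search(head)
    return match.group(1).lower() if match else default


def decode_html(content_type, body):
    """Decode raw response bytes using the declared charset."""
    charset = detect_charset(content_type, body)
    try:
        return body.decode(charset, errors="replace")
    except LookupError:
        # Unknown charset label: fall back to UTF-8.
        return body.decode("utf-8", errors="replace")


# Simulated response body (made-up sample, assumed to be windows-1251).
sample = '<html><head><meta charset="windows-1251"></head><body>Привет, мир</body></html>'
raw = sample.encode("windows-1251")

naive = raw.decode("utf-8", errors="replace")  # what a UTF-8-only pipeline would do
fixed = decode_html("text/html", raw)          # honours the declared charset

print("Привет, мир" in naive)  # False: the Cyrillic text is lost
print("Привет, мир" in fixed)  # True
```

If the parser always treats the response as UTF-8, Cyrillic bytes from such a page turn into replacement characters, which would match the broken output described here.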
Steps to Reproduce
Detailed Description
I use this API to parse articles in my reader app. For some of the Russian news feeds I try to use, I am not able to get properly formatted output.
Possible Solution