[kemonoparty] fix kemono api skipping latest posts #6931

Open
wants to merge 4 commits into
base: master

zWolfrost
Contributor

Making requests with either the offset (o) or query (q) parameter set to a blank string (e.g. "?o=") for some reason fixes the problem where the kemono API would skip posts that were just imported (see #6780).

@mikf
Owner

mikf commented Feb 10, 2025

I ran some tests and discovered that while this does fix the missing posts at the beginning, it "moves" the problem to the next batch of posts.

For example, when kemono skips the first 3 posts of a creator with o=0:

  • o= returns posts 1-50
  • o=0 returns posts 4-53
  • o=50 returns posts 54-103

meaning instead of posts 1-3, it is now missing posts 51-53.
(1-50 from o= + 54-103 from o=50)
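
The bookkeeping in this example can be checked with a short simulation. This is only a sketch: plain post numbers stand in for real post IDs, the ranges are the ones listed above, and the helper name `missing_posts` is made up for illustration.

```python
def missing_posts(batches, total):
    """Return the post numbers in 1..total that no batch covers."""
    seen = set()
    for first, last in batches:
        seen.update(range(first, last + 1))
    return [n for n in range(1, total + 1) if n not in seen]


# First batch requested with o=0, second batch with o=50
with_zero = [(4, 53), (54, 103)]
# First batch requested with a blank o=, second batch with o=50
with_blank = [(1, 50), (54, 103)]

print(missing_posts(with_zero, 103))   # [1, 2, 3]
print(missing_posts(with_blank, 103))  # [51, 52, 53]
```

Either choice for the first request leaves a gap of three posts; only the position of the gap changes.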

I think the easiest solution would be to do 2 API requests for the first o=0 batch, one with o=0 and another with o=, and somehow merge the results.
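
A minimal sketch of what such a merge could look like, assuming each post is a dict with an `"id"` key and that both responses are already parsed into lists. The function name `merge_first_batch` and the stand-in data are hypothetical and not taken from the PR.

```python
def merge_first_batch(blank_batch, zero_batch):
    """Merge two overlapping batches, keeping each post ID once.

    Posts from blank_batch come first; posts from zero_batch are
    appended only if their ID has not been seen yet.
    """
    seen = set()
    merged = []
    for post in (*blank_batch, *zero_batch):
        if post["id"] not in seen:
            seen.add(post["id"])
            merged.append(post)
    return merged


# Stand-in data: numbers instead of real post IDs
blank_batch = [{"id": str(n)} for n in range(1, 51)]  # o=  -> posts 1-50
zero_batch = [{"id": str(n)} for n in range(4, 54)]   # o=0 -> posts 4-53

merged = merge_first_batch(blank_batch, zero_batch)
print(len(merged))  # 53
```

Putting the blank-offset batch first keeps the newest posts at the front, on the assumption that both responses are ordered the same way.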

I have implemented a check for duplicate posts (or identical revisions, if revisions are available).
This should also fix the problem where the same post, even one without multiple revisions, would sometimes get extracted twice (also caused by the "moved" batch issue).
I have reverted the previous changes.
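
As an illustration of this kind of check (not the code from the PR), the sketch below filters repeats while iterating over pages. It assumes posts are dicts with an `"id"` key and an optional `"revision_id"` key; both field names and the generator name `unique_posts` are assumptions.

```python
def unique_posts(pages):
    """Yield posts from an iterable of pages, skipping repeats.

    A post counts as a repeat when its (id, revision_id) pair was
    already yielded; posts without a revision use None as revision.
    """
    seen = set()
    for page in pages:
        for post in page:
            key = (post["id"], post.get("revision_id"))
            if key in seen:
                continue
            seen.add(key)
            yield post


# Stand-in pages: post 53 is returned twice, post 54 has two revisions
pages = [
    [{"id": "51"}, {"id": "52"}, {"id": "53"}],
    [{"id": "53"}, {"id": "54", "revision_id": 1},
     {"id": "54", "revision_id": 2}],
]
result = list(unique_posts(pages))
print([(p["id"], p.get("revision_id")) for p in result])
```

Keying on the pair rather than the ID alone lets distinct revisions of one post through while still dropping a post that shows up in two batches.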
@zWolfrost
Contributor Author

I actually did some testing myself when I first made the commit, and the behavior I got was:

  • o= returning posts 1-50
  • o=0 returning posts 4-53
  • o=50 returning posts 51-100

So I just assumed fetching for o=0 was pointless.
Anyways, I have now handled the moved posts issue in the way you described.

@mikf
Owner

mikf commented Feb 10, 2025

It seems like the API is just horribly broken.

I used /fanbox/user/527684 for some more testing, and the returned posts are completely different for o=0 and o=, and there's even a slight difference when not specifying o at all.

Here are the first few returned post IDs:

| o=0     | o=      |
| ------- | ------- |
| 9255830 | 9335127 |
| 7631800 | 9326844 |
| 7606698 | 9321341 |
| 7602131 | 9317339 |
| 7597627 | 9316788 |
| 7541351 | 9309810 |
| 7488023 | 9307940 |
| 7034702 | 9276107 |
| 7033645 | 9271686 |
| 7032408 | 9267862 |
| 7016358 | 9259685 |
| 6931396 | 9255830 |
| 6883167 | 9254602 |
| 6879752 | 9136026 |
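
The two ID columns above can be compared mechanically. The snippet below is a small sketch using only the IDs quoted in this comment, so it says nothing about posts beyond this excerpt; the variable names are made up for illustration.

```python
# IDs as listed for o=0 and for a blank o=
offset_zero = [9255830, 7631800, 7606698, 7602131, 7597627, 7541351, 7488023,
               7034702, 7033645, 7032408, 7016358, 6931396, 6883167, 6879752]
offset_blank = [9335127, 9326844, 9321341, 9317339, 9316788, 9309810, 9307940,
                9276107, 9271686, 9267862, 9259685, 9255830, 9254602, 9136026]

zero_ids = set(offset_zero)
# IDs that appear in both responses, in the order of the blank-offset list
shared = [pid for pid in offset_blank if pid in zero_ids]
print(shared)  # [9255830]

for pid in shared:
    # Position of the shared ID in each list (0-based)
    print(pid, offset_zero.index(pid), offset_blank.index(pid))  # 9255830 0 11
```

Within this excerpt only one ID is shared, and it sits at position 0 in one list and position 11 in the other, so the two responses are not simply shifted copies of each other.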

@mikf
Owner

mikf commented Feb 10, 2025

Maybe we need to start using the /posts-legacy endpoint (/api/v1/fanbox/user/527684/posts-legacy?o=50) like the website itself does ...

@zWolfrost
Contributor Author

zWolfrost commented Feb 10, 2025

> Maybe we need to start using the /posts-legacy endpoint

From what I can see, it seems to have the same problems as the endpoint we currently use.
The additional metadata it returns might turn out to be useful in the future, though.

@mikf mikf added the site:bug label Mar 10, 2025