Contribution to fix small bug and to reformat #27
base: main
Conversation

linhkid commented on Apr 7, 2024
- Fix JSON error when an incomplete prompt leads to failed JSON generation
- Add more fields to the summarization, such as methodology, discussion, data used, etc.
- Add the generated date to the digest.html file
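The first fix above concerns model output that is truncated mid-object, which makes the whole response unparseable. A minimal sketch of one way to tolerate that (the helper name and the one-object-per-line layout are assumptions, not the repo's actual code):

```python
import json

def parse_json_lines(raw_text):
    """Parse one JSON object per line, skipping lines that are
    truncated or malformed (e.g. from an incomplete completion)."""
    results = []
    for line in raw_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            results.append(json.loads(line))
        except json.JSONDecodeError:
            # Skip incomplete or garbled entries instead of crashing.
            continue
    return results
```

Skipping a truncated trailing object loses at most one paper's summary, instead of failing the whole digest.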
src/relevancy_prompt.txt
Outdated

1. {"Relevancy score": "an integer score out of 10", "Reasons for match": "1-2 sentence short reasonings", "Goal": "What kind of pain points the paper is trying to solve?", "Data": "Summary of the data source used in the paper", "Methodology": "Summary of methodologies used in the paper", "Git": "Link to the code repo (if available)", "Experiments & Results": "Summary of any experiments & its results", "Discussion & Next steps": "Further discussion and next steps of the research"}

My research interests are: NLP, RAGs, LLM, Optimization in Machine learning, Data science, Generative AI, Optimization in LLM, Finance modelling ...
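The prompt above asks the model to return one JSON object per paper with the new fields. A small validation sketch against that schema (the helper name is hypothetical; the field names are taken verbatim from the prompt):

```python
# Field names copied from the relevancy prompt's JSON template.
REQUIRED_FIELDS = {
    "Relevancy score",
    "Reasons for match",
    "Goal",
    "Data",
    "Methodology",
    "Git",
    "Experiments & Results",
    "Discussion & Next steps",
}

def has_required_fields(entry):
    """Check a parsed relevancy dict for every field the prompt requests."""
    return REQUIRED_FIELDS.issubset(entry.keys())
```

Checking fields up front makes it easy to drop (or retry) entries where the model omitted one of the newly added sections.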
The interests get appended on here:

Line 23 in 0eadab7

`prompt += query['interest']`

No need to add them manually to the relevancy prompt.
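For context, the template and the user's interests are concatenated at runtime roughly like this (a sketch; only the `prompt += query['interest']` line appears in the linked code, and the surrounding function is hypothetical):

```python
def build_prompt(template, query):
    # The template (src/relevancy_prompt.txt) ends where the
    # interests belong; they are appended at runtime, so hard-coding
    # them in the template would duplicate them.
    prompt = template
    prompt += query['interest']  # mirrors: prompt += query['interest']
    return prompt
```

This is why the review asks for the interests to be removed from the prompt file: they would otherwise appear twice in the final prompt.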
OK thanks
README.md
Outdated

**ArXiv Digest and Personalized Recommendations using Large Language Models.**

**ArXiv Digest (extra version) and Personalized Recommendations using Large Language Models.**

*(Note: This is an adjusted repo to match my needs. For the original repo please refer to **AutoLLM**, which I forked from)*
This is a pull request to the original repo 😄
Sorry Richard, pls ignore haha.
Revamp workflow, UI and use multiple models for this repo

1. Content Extraction After Filtering:
   - Added a new step in the process between Stage 1 and Stage 2
   - After papers pass the relevancy filter, the system now extracts HTML content for them
   - Only papers that make it through the threshold get their content fetched
   - Uses the `crawl_html_version` function from `download_new_papers.py` that you already have
2. Process Flow:
   - Stage 1: quick filtering based on title and abstract only
   - Content Extraction: fetch HTML content for papers that passed the filter
   - Stage 2: detailed analysis that includes the full content
3. Updated Filter Prompt:
   - Made it clear in the Stage 1 prompt that this is just preliminary screening
   - Specified that papers scoring 7+ will be analyzed in depth with full content
   - Added clearer instructions for the relevancy scoring
4. Fixed Processing Flow:
   - Always processes all available papers
   - Uses fixed batches of 8 papers for Stage 1 (title & abstract only)
   - Guarantees at least 10 papers will be analyzed in depth
5. Revamped the UI using Gradio and removed email sending
6. Workflow Improvements:
   - Added clear comments explaining the fixed parameters
   - Removed the batch-update function, which is no longer needed
   - Simplified the code overall
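The flow described above can be sketched end to end (a sketch only: `crawl_html_version` is the one function named in the description, so here the scoring, fetching, analysis callables and the exact top-up rule are assumptions):

```python
def run_pipeline(papers, score_fn, analyze_fn, fetch_html_fn,
                 threshold=7, batch_size=8, min_papers=10):
    """Two-stage flow: Stage 1 scores title+abstract in fixed batches,
    content extraction fetches HTML only for papers that passed, and
    Stage 2 analyzes the fetched full content in depth."""
    # Stage 1: quick filtering in fixed batches of `batch_size`
    scored = []
    for i in range(0, len(papers), batch_size):
        batch = papers[i:i + batch_size]
        scored.extend(score_fn(batch))  # yields (paper, score) pairs

    # Keep papers at or above the threshold; if too few pass,
    # top up with the highest-scoring papers so at least
    # `min_papers` get the in-depth treatment.
    scored.sort(key=lambda ps: ps[1], reverse=True)
    selected = [p for p, s in scored if s >= threshold]
    if len(selected) < min_papers:
        selected = [p for p, _ in scored[:min_papers]]

    # Content extraction, then Stage 2 detailed analysis
    for paper in selected:
        paper["content"] = fetch_html_fn(paper)  # e.g. crawl_html_version
    return [analyze_fn(p) for p in selected]
```

Fetching HTML only after the cheap Stage 1 filter keeps the expensive full-content analysis limited to papers that are likely relevant.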