
Commit 6f20722

Update Text Representation (Embeddings).ipynb
1 parent 8c10f3e commit 6f20722

File tree

1 file changed: +5 -2 lines

Module 9 - GenAI (LLMs and Prompt Engineering)/1. Text Embeddings/Text Representation (Embeddings).ipynb

+5 -2
@@ -422,13 +422,16 @@
    "id": "1cabe4ef-4f68-4194-99b3-784d56e5cc09",
    "metadata": {},
    "source": [
-    "At this point, we can notice that the classifier is doing poorly with identifying relevant articles, while it is doing well with non-relevant ones. Our large feature vector could be creating a lot of noise in the form of very rarely occurring features that are not useful for learning. Let us change the count vectorizer to take a certain number of features as maximum.\n",
+    "At this point, we can notice that the classifier is doing poorly with identifying relevant articles, while it is doing well with non-relevant ones. \n",
     "\n",
     "**Potential Reasons for poor classifier performance**\n",
     "1. Perhaps we need to balance the data - Clearly there is class imbalance\n",
     "2. Perhaps we need a better learning algorithm - Implement Logistic Regression, SVM, RF, etc...\n",
     "3. Perhaps we should look for tuning the classifier's parameter with the help of Hyperparameter Tuning\n",
-    "4. Perhaps we need a better pre-processing and feature extraction mechanism - Right now we have a sparse and large feature vector"
+    "4. Perhaps we need a better pre-processing and feature extraction mechanism - Right now we have a sparse and large feature vector\n",
+    "\n",
+    "**Let's work with potential reason number 4.** \n",
+    "Our large feature vector could be creating a lot of noise in the form of very rarely occurring features that are not useful for learning. Let us change the count vectorizer to take a certain number of features as maximum."
     ]
   },
   {

0 commit comments