Description
Please describe the module you would like to add to bricks
A brick that returns an embedding list containing only the nouns of a text, so that they can be used as pointers.
Do you already have an implementation?
ATTRIBUTE = "text"  # name of the record attribute; expected to hold a spaCy Doc

def noun_splitter(record):
    nouns_sents = []
    for sent in record[ATTRIBUTE].sents:
        # keep tokens tagged as nouns, skipping single-character tokens
        nouns = [token.text for token in sent if token.pos_ == "NOUN" and len(token.text) > 1]
        nouns_sents.extend(nouns)
    # deduplicate; note that set() does not preserve the original order
    return list(set(nouns_sents))
Additional context
Can be implemented with spaCy.
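To try the idea without a downloaded spaCy model, here is a minimal sketch that runs the same filter on stand-in objects. `FakeToken` and `FakeDoc` are hypothetical helpers invented for this example; they only mimic the attributes the function reads (`sents`, `text`, `pos_`). This variant also assumes that keeping the nouns in first-seen order is desirable, which the implementation above does not guarantee.

```python
from dataclasses import dataclass
from typing import List

ATTRIBUTE = "text"


@dataclass
class FakeToken:
    # Hypothetical stand-in for a spaCy token (only the two fields used here).
    text: str
    pos_: str


@dataclass
class FakeDoc:
    # Hypothetical stand-in for a spaCy Doc: a list of sentences,
    # each sentence being a list of tokens.
    sents: List[List[FakeToken]]


def noun_splitter(record):
    # A dict keeps insertion order, so duplicates are dropped
    # while the first-seen order of the nouns is preserved.
    seen = {}
    for sent in record[ATTRIBUTE].sents:
        for token in sent:
            if token.pos_ == "NOUN" and len(token.text) > 1:
                seen.setdefault(token.text, None)
    return list(seen)


doc = FakeDoc(sents=[
    [FakeToken("The", "DET"), FakeToken("cat", "NOUN"),
     FakeToken("chased", "VERB"), FakeToken("a", "DET"),
     FakeToken("mouse", "NOUN")],
    [FakeToken("The", "DET"), FakeToken("mouse", "NOUN"),
     FakeToken("escaped", "VERB")],
])

result = noun_splitter({ATTRIBUTE: doc})
print(result)  # ['cat', 'mouse']
```

With spaCy installed, the stand-in document would be replaced by the output of a loaded pipeline, for example `spacy.load("en_core_web_sm")(text)`, which needs a tagger and sentence boundaries for `pos_` and `sents` to be available.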