Extractors
High Level Extractors
These extractors do not require you to supply a tokenizer. If you need a custom tokenizer, use the low-level extractors.
- elemeta.nlp.extractors.high_level package
- Submodules
- elemeta.nlp.extractors.high_level.acronym_count module
- elemeta.nlp.extractors.high_level.avg_word_length module
- elemeta.nlp.extractors.high_level.capital_letters_ratio module
- elemeta.nlp.extractors.high_level.date_count module
- elemeta.nlp.extractors.high_level.detect_language_langdetect module
- elemeta.nlp.extractors.high_level.email_count module
- elemeta.nlp.extractors.high_level.embedding module
- elemeta.nlp.extractors.high_level.emoji_count module
- elemeta.nlp.extractors.high_level.hashtag_count module
- elemeta.nlp.extractors.high_level.hinted_profanity_sentence_count module
- elemeta.nlp.extractors.high_level.hinted_profanity_words_count module
- elemeta.nlp.extractors.high_level.link_count module
- elemeta.nlp.extractors.high_level.mention_count module
- elemeta.nlp.extractors.high_level.must_appear_words_percentage module
- elemeta.nlp.extractors.high_level.ner_identifier module
- elemeta.nlp.extractors.high_level.number_count module
- elemeta.nlp.extractors.high_level.out_of_vocabulary_count module
- elemeta.nlp.extractors.high_level.pii_identify module
- elemeta.nlp.extractors.high_level.punctuation_count module
- elemeta.nlp.extractors.high_level.regex_match_count module
- elemeta.nlp.extractors.high_level.semantic_text_pair_similarity module
- elemeta.nlp.extractors.high_level.sentence_avg_length module
- elemeta.nlp.extractors.high_level.sentence_count module
- elemeta.nlp.extractors.high_level.sentiment_polarity module
- elemeta.nlp.extractors.high_level.sentiment_subjectivity module
- elemeta.nlp.extractors.high_level.special_chars_count module
- elemeta.nlp.extractors.high_level.stop_words_count module
- elemeta.nlp.extractors.high_level.syllable_count module
- elemeta.nlp.extractors.high_level.text_complexity module
- elemeta.nlp.extractors.high_level.text_length module
- elemeta.nlp.extractors.high_level.toxicity_extractor module
- elemeta.nlp.extractors.high_level.unique_word_count module
- elemeta.nlp.extractors.high_level.unique_word_ratio module
- elemeta.nlp.extractors.high_level.word_count module
- elemeta.nlp.extractors.high_level.word_regex_matches_count module
- Module contents
Low Level Extractors
- elemeta.nlp.extractors.low_level package
- Submodules
- elemeta.nlp.extractors.low_level.abstract_text_metafeature_extractor module
- elemeta.nlp.extractors.low_level.abstract_text_pair_metafeature_extractor module
- elemeta.nlp.extractors.low_level.avg_token_length module
- elemeta.nlp.extractors.low_level.hinted_profanity_token_count module
- elemeta.nlp.extractors.low_level.must_appear_tokens_parentage module
- elemeta.nlp.extractors.low_level.regex_token_matches_count module
- elemeta.nlp.extractors.low_level.semantic_embedding_pair_similarity module
- elemeta.nlp.extractors.low_level.semantic_text_to_group_similarity module
- elemeta.nlp.extractors.low_level.tokens_count module
- elemeta.nlp.extractors.low_level.unique_token_count module
- elemeta.nlp.extractors.low_level.unique_token_ratio module
- Module contents
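
The split between the two packages (high-level extractors need no tokenizer, low-level extractors take one from the caller) can be pictured with the minimal sketch below. It is an illustration of the pattern only: the class names `TokensCountSketch` and `WordCountSketch`, the `extract` method, and the regex-based default tokenizer are all hypothetical and are not taken from elemeta's actual API.

```python
import re
from typing import Callable, List

# A tokenizer is any callable that turns a string into a list of tokens.
Tokenizer = Callable[[str], List[str]]


class TokensCountSketch:
    """Low-level style (hypothetical): the caller must supply the tokenizer."""

    def __init__(self, tokenizer: Tokenizer) -> None:
        self.tokenizer = tokenizer

    def extract(self, text: str) -> int:
        # Count whatever tokens the supplied tokenizer produces.
        return len(self.tokenizer(text))


class WordCountSketch(TokensCountSketch):
    """High-level style (hypothetical): a default tokenizer is built in."""

    def __init__(self) -> None:
        # Assumed default: runs of word characters count as words.
        super().__init__(lambda text: re.findall(r"\w+", text))


text = "Hello, world! Custom-tokenized text."

# High-level usage: no tokenizer argument is needed.
high = WordCountSketch().extract(text)

# Low-level usage: pass a custom tokenizer, here plain whitespace splitting.
low = TokensCountSketch(str.split).extract(text)

print(high, low)
```

The two counts differ because the tokenizers disagree about the hyphenated word, which is the kind of control a caller-supplied tokenizer is meant to give.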