elemeta.nlp.runners package#

Submodules#

elemeta.nlp.runners.metafeature_extractors_runner module#

class elemeta.nlp.runners.metafeature_extractors_runner.MetafeatureExtractorsRunner(metafeature_extractors: List[AbstractTextMetafeatureExtractor] | None = None, compute_intensive: bool = False)#

Bases: object

This class is used to run multiple metafeature extractors on a text

metafeature_extractors#

a list of metafeature extractors to run; if not supplied, the runner uses all default metafeature extractors.

Type:

Optional[List[AbstractTextMetafeatureExtractor]]

run(text)#

runs all the metafeature extractors on the input text

run_on_dataframe(df, text_column)#

runs all the metafeature extractors on the given text_column in the given dataframe and returns a new dataframe with the metafeature values as columns

Methods

run(text)

runs all metafeature extractors on the given text

run_on_dataframe(dataframe, text_column)

returns a new dataframe with the values of all metafeature extractors as columns

add_metafeature_extractor

add_metafeature_extractor(metafeature_extractor: AbstractTextMetafeatureExtractor) None#
run(text: str) Dict[str, Any]#

runs all metafeature extractors on the given text

Parameters:

text (str) – the text to run all extractors on

Returns:

metafeature_value_dict – a dictionary mapping each extractor name to its metafeature value

Return type:

Dict[str, Any]

run_on_dataframe(dataframe: DataFrame, text_column: str) DataFrame#

returns a new dataframe with the values of all metafeature extractors as columns

Parameters:
  • dataframe (DataFrame) – dataframe with the text column

  • text_column (str) – the name of the text column in the given dataframe

Returns:

dataframe – dataframe with the values of the metafeature extractors as new columns

Return type:

DataFrame

elemeta.nlp.runners.pair_metafeature_extractors_runner module#

class elemeta.nlp.runners.pair_metafeature_extractors_runner.PairMetafeatureExtractorsRunner(input_1_extractors: List[AbstractTextMetafeatureExtractor], input_2_extractors: List[AbstractTextMetafeatureExtractor], input_1_and_2_extractors: List[AbstractTextPairMetafeatureExtractor])#

Bases: object

Methods

run(input_1, input_2)

run input_1_extractors on input_1, input_2_extractors on input_2 and input_1_and_2_extractors on the pair of input_1 and input_2

run(input_1: str, input_2: str) PairMetafeatureExtractorsRunnerResult#

runs input_1_extractors on input_1, input_2_extractors on input_2, and input_1_and_2_extractors on the pair (input_1, input_2)

Parameters:
  • input_1 (str) – the first text input

  • input_2 (str) – the second text input

Returns:

the metafeatures extracted from each input and from the pair

Return type:

PairMetafeatureExtractorsRunnerResult

class elemeta.nlp.runners.pair_metafeature_extractors_runner.PairMetafeatureExtractorsRunnerResult(*, input_1: Dict[str, Any], input_2: Dict[str, Any], input_1_and_2: Dict[str, Any])#

Bases: BaseModel

Attributes:
model_computed_fields

Get the computed fields of this model instance.

model_extra

Get extra fields set during validation.

model_fields_set

Returns the set of fields that have been explicitly set on this model instance.

Methods

copy(*[, include, exclude, update, deep])

Returns a copy of the model.

model_construct([_fields_set])

Creates a new instance of the Model class with validated data.

model_copy(*[, update, deep])

Usage docs: https://docs.pydantic.dev/2.5/concepts/serialization/#model_copy

model_dump(*[, mode, include, exclude, ...])

Usage docs: https://docs.pydantic.dev/2.5/concepts/serialization/#modelmodel_dump

model_dump_json(*[, indent, include, ...])

Usage docs: https://docs.pydantic.dev/2.5/concepts/serialization/#modelmodel_dump_json

model_json_schema([by_alias, ref_template, ...])

Generates a JSON schema for a model class.

model_parametrized_name(params)

Compute the class name for parametrizations of generic classes.

model_post_init(_BaseModel__context)

Override this method to perform additional initialization after __init__ and model_construct.

model_rebuild(*[, force, raise_errors, ...])

Try to rebuild the pydantic-core schema for the model.

model_validate(obj, *[, strict, ...])

Validate a pydantic model instance.

model_validate_json(json_data, *[, strict, ...])

Usage docs: https://docs.pydantic.dev/2.5/concepts/json/#json-parsing

model_validate_strings(obj, *[, strict, context])

Validate the given object contains string data against the Pydantic model.

construct

dict

from_orm

json

parse_file

parse_obj

parse_raw

schema

schema_json

update_forward_refs

validate

input_1: Dict[str, Any]#
input_1_and_2: Dict[str, Any]#
input_2: Dict[str, Any]#
model_config: ClassVar[ConfigDict] = {}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_fields: ClassVar[dict[str, FieldInfo]] = {'input_1': FieldInfo(annotation=Dict[str, Any], required=True), 'input_1_and_2': FieldInfo(annotation=Dict[str, Any], required=True), 'input_2': FieldInfo(annotation=Dict[str, Any], required=True)}#

Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo].

This replaces Model.__fields__ from Pydantic V1.

Module contents#