actableai.utils.autogluon.get_feature_links(predictor, model_name)¶TODO write documentation
actableai.utils.autogluon.get_final_features(predictor, model_name)¶TODO write documentation
actableai.utils.autogluon.transform_features(predictor, model_name, data)¶TODO write documentation
actableai.utils.categorical_numerical_convert.convert_categorical_to_num(df, inplace=False)¶Convert categorical features in a dataframe to numerical values.
Parameters: df (pandas DataFrame): The DataFrame containing the categorical features. inplace (bool, optional): Whether to modify df in place.
Returns: df (pandas DataFrame): The modified DataFrame with categorical features converted to numerical values. dict_label_encoders (dict): A dictionary containing the fitted LabelEncoder object for each converted column (the categorical features).
actableai.utils.categorical_numerical_convert.get_categorical_columns(df)¶
actableai.utils.categorical_numerical_convert.inverse_convert_categorical_to_num(df_new, d, feat_name=None)¶Convert numerical values back to their original categorical values.
This function takes in a DataFrame and a dictionary of unique values for each column, and converts the numerical values in the DataFrame back to their original categorical values. It can be used to reverse the effect of the convert_categorical_to_num function.
Parameters: df_new (pandas DataFrame): The DataFrame containing the numerical values to be converted back to categorical. d (dict): A dictionary containing the fitted LabelEncoder object for each column. feat_name (str, optional): The name of a specific feature to be converted. If None, all features in the DataFrame will be converted.
Returns: df (pandas DataFrame): The modified DataFrame with numerical values converted back to categorical values.
Raises: ValueError: If the feature name (column) is not in the DataFrame.
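The round trip between convert_categorical_to_num and inverse_convert_categorical_to_num can be pictured with a small pure-Python sketch. It does not call actableai or scikit-learn: `encode_columns` and `decode_columns` are hypothetical stand-ins, a table is represented as a dict of columns rather than a DataFrame, and the sorted-classes encoding is an assumption modelled on how a label encoder typically behaves.

```python
from typing import Dict, List, Optional, Tuple

Table = Dict[str, list]


def encode_columns(table: Table) -> Tuple[Table, Dict[str, List[str]]]:
    """Replace string columns with integer codes and remember the classes."""
    encoded: Table = {}
    classes_by_column: Dict[str, List[str]] = {}
    for name, values in table.items():
        if all(isinstance(v, str) for v in values):
            classes = sorted(set(values))  # code i maps to classes[i]
            index = {c: i for i, c in enumerate(classes)}
            encoded[name] = [index[v] for v in values]
            classes_by_column[name] = classes
        else:
            encoded[name] = list(values)  # non-categorical column is copied as-is
    return encoded, classes_by_column


def decode_columns(
    table: Table,
    classes_by_column: Dict[str, List[str]],
    feat_name: Optional[str] = None,
) -> Table:
    """Map integer codes back to the original categories."""
    if feat_name is not None and feat_name not in table:
        raise ValueError(f"Feature {feat_name!r} is not in the table")
    targets = [feat_name] if feat_name is not None else list(classes_by_column)
    decoded = {name: list(values) for name, values in table.items()}
    for name in targets:
        if name in classes_by_column:
            classes = classes_by_column[name]
            decoded[name] = [classes[code] for code in table[name]]
    return decoded


table = {"color": ["red", "blue", "red"], "size": [1, 2, 3]}
encoded, classes_by_column = encode_columns(table)
restored = decode_columns(encoded, classes_by_column)
```

In this sketch the `classes_by_column` dictionary plays the role of the returned dict_label_encoders: it has to be kept alongside the encoded data, since the codes alone cannot be reversed.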
actableai.utils.dataset_generator.DatasetGenerator¶Bases: object
generate(columns_parameters: List[dict], rows: int = 1000, output_path: Optional[Union[str, pathlib.Path]] = None, save_parameters_path: Optional[Union[str, pathlib.Path]] = None, random_state: Optional[int] = None) → Optional[pandas.core.frame.DataFrame]¶Generate a dataset. This function generates random data, so the values are not expected to be meaningful.
Each entry of columns_parameters is a dictionary describing one column.
- Parameters common to all column types: {
    "name": <column_name>,     # Name of the column, default: col_<index>
    "type": <type>,            # Type of the column, choices: ["text", "number", "date"]
    "values": [<value_list>]   # List of values. If len(values) == rows, those values are used in the given order.
                               # Otherwise each row picks a random value from values. It always overrides other parameters.
                               # If values is set, type is omitted
  }
- Text column parameters: {
    "type": "text",
    "n_categories": <n_categories>,              # The number of categories (number of unique strings), default: rows
    "range": (<min_range>, <max_range>),         # Range for generated word lengths, min included, max excluded, default: (5, 10)
    "word_range": (<min_range>, <max_range>)     # Range for the number of words to create, min included, max excluded, default: (1, 2)
  }
- Number column parameters: {
    "type": "number",
    "float": <True or False>,                    # default: True
    "range": (<min_range>, <max_range>)          # Min included, max excluded, default: (0, <rows>)
  }
- Date column parameters: {
    "type": "date",
    "freq": <frequency>,   # See pandas date_range freq parameter, default: "D"
    "start": <date>,       # Start date, default: None
    "end": <date>          # End date, default: Today - <random_number_of_days>
                           # At least one of those three parameters (freq, start, and end) must be None
  }
Examples
Column parameters examples:
- Text column containing yes or no: {
    "values": ["yes", "no"]
  }
- Text column containing unique random strings of length 10: {
    "type": "text", "range": (10, 11)
  }
- Number column with random floats between 0 and 1: {
    "type": "number", "float": True, "range": (0, 1)
  }
- Number column containing either 10, 100 or 1000: {
    "values": [10, 100, 1000]
  }
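The column dictionaries above can be combined into a single columns_parameters list. The following is a minimal sketch rather than a verified recipe: the column names are made up for illustration, the import is guarded so the snippet still runs where actableai is not installed, and calling generate on an instance with no constructor arguments is an assumption based on the signature shown in this reference.

```python
# One dictionary per column, mirroring the documented examples.
# The "name" values are illustrative only.
columns_parameters = [
    {"name": "answer", "values": ["yes", "no"]},
    {"name": "token", "type": "text", "range": (10, 11)},
    {"name": "ratio", "type": "number", "float": True, "range": (0, 1)},
    {"name": "magnitude", "values": [10, 100, 1000]},
]

try:
    from actableai.utils.dataset_generator import DatasetGenerator
except ImportError:
    DatasetGenerator = None  # actableai is not available in this environment

if DatasetGenerator is not None:
    # Assumption: generate can be called on a default-constructed instance.
    df = DatasetGenerator().generate(columns_parameters, rows=100, random_state=0)
    print(df.head())
```

Passing random_state should make the generated data reproducible between runs, which is useful when the dataset feeds a unit test.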
generate_from_file(parameters_path: Union[str, pathlib.Path], output_path: Optional[Union[str, pathlib.Path]] = None) → Optional[pandas.core.frame.DataFrame]¶Generate a dataset from a file containing the parameters.
Returns: pandas DataFrame containing the generated dataset.
actableai.utils.language.get_language_display_name(langcode: str) → str¶
actableai.utils.multilabel_predictor.MultilabelPredictor(labels, path, problem_types=None, eval_metrics=None, consider_labels_correlation=True, **kwargs)¶Bases: object
Tabular predictor for predicting multiple columns in a table. Creates multiple TabularPredictor objects, which you can also use individually. You can access the TabularPredictor for a particular label via: multilabel_predictor.get_predictor(label_i)
evaluate(data, **kwargs)¶Returns dict where each key is a label and the corresponding value is the evaluate() output for just that label.
fit(train_data, tuning_data=None, **kwargs)¶Fits a separate TabularPredictor to predict each of the labels.
get_predictor(label)¶Returns TabularPredictor which is used to predict this label.
load(path)¶Load MultilabelPredictor from disk path previously specified when creating this MultilabelPredictor.
multi_predictor_file = 'multilabel_predictor.pkl'¶
persist_models()¶TODO write documentation
predict(data, **kwargs)¶Returns DataFrame with label columns containing predictions for each label.
predict_proba(data, **kwargs)¶Returns dict where each key is a label and the corresponding value is the predict_proba() output for just that label.
save(path=None)¶Save MultilabelPredictor to disk.
unpersist_models()¶TODO write documentation
actableai.utils.openai.num_tokens_from_messages(messages, model='gpt-3.5-turbo-0301')¶Returns the number of tokens used by a list of messages.
actableai.utils.pdp_ice.get_pdp_and_ice(model, df_train, features='all', pdp=True, ice=True, grid_resolution=100, verbosity=0, n_samples=None)¶Get Partial Dependence Plot (PDP) and/or Individual Conditional Expectation (ICE) for a given model and dataframe.
Parameters:
    model: The trained model from AAIRegressionTask() or AAIClassificationTask().
    df_train (pandas DataFrame): Dataset on which to compute the PDP/ICE.
    features (list or str, optional): List of feature names/column numbers on which to compute PDP/ICE, or 'all' to use all columns. If only one feature is required, its name or column number should be in a list.
    pdp (bool, optional): Set to True to compute PDP.
    ice (bool, optional): Set to True to compute ICE.
    grid_resolution (int, optional): Number of points to sample in the grid and plot (x-axis values).
Returns: A dictionary with keys as feature names and values as the computed PDP/ICE results.
    If return_type='raw': tuple of two numpy arrays. The first array represents the feature values and the second array represents the model predictions.
    If return_type='plot': sklearn.inspection.PartialDependenceDisplay object containing the plot.
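To make the two quantities concrete, here is a pure-Python sketch of how ICE curves and a PDP relate for a single feature. It is not the implementation behind get_pdp_and_ice: `ice_and_pdp` and the toy `predict` function are hypothetical, rows are plain dicts, and the grid is spaced evenly between the observed minimum and maximum, which is a simplification.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Row = Dict[str, float]


def ice_and_pdp(
    predict: Callable[[Row], float],
    rows: Sequence[Row],
    feature: str,
    grid_resolution: int = 5,
) -> Tuple[List[float], List[List[float]], List[float]]:
    """Return (grid, ICE curves, PDP curve) for one feature."""
    values = [row[feature] for row in rows]
    low, high = min(values), max(values)
    if grid_resolution < 2 or low == high:
        grid = [low]
    else:
        step = (high - low) / (grid_resolution - 1)
        grid = [low + i * step for i in range(grid_resolution)]

    # ICE: for every row, sweep the feature over the grid and keep the
    # other features fixed, giving one prediction curve per row.
    ice = [[predict({**row, feature: g}) for g in grid] for row in rows]

    # PDP: the average of the ICE curves at each grid point.
    pdp = [sum(curve[j] for curve in ice) / len(ice) for j in range(len(grid))]
    return grid, ice, pdp


def predict(row: Row) -> float:
    """Toy model used only for this illustration."""
    return 2 * row["x"] + row["z"]


rows = [{"x": 0.0, "z": 1.0}, {"x": 4.0, "z": 3.0}]
grid, ice, pdp = ice_and_pdp(predict, rows, feature="x", grid_resolution=3)
```

A larger grid_resolution gives smoother curves at the cost of one extra model prediction per row and per grid point.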
actableai.utils.river.MultiOutputPipeline(pipeline: river.compose.pipeline.Pipeline, metric_class: Type[ray.util.metrics.Metric])¶Bases: object
Wrapper around a pipeline with a multi output regressor or classifier
learn_one(x: dict, y, learn_unsupervised=False, **params)¶Learn one data point and update metrics
predict_one(x: dict, learn_unsupervised=True)¶Predict one data point, use the internal metrics to give the best output
actableai.utils.river.MultiOutputRegressor(models: list)¶Bases: river.base.regressor.Regressor, river.base.multi_output.MultiOutputMixin
Class representing a regressor with multiple output, one regressor per output
learn_one(x: dict, y: numbers.Number, **kwargs) → river.base.regressor.Regressor¶Learn one data point
predict_one(x: dict) → List[numbers.Number]¶Predict one data point
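The "one regressor per output" idea can be illustrated without river. In the sketch below, `RunningMean` and `PerOutputRegressor` are hypothetical toy classes, and passing the targets as a sequence with one value per output is an assumption made for the illustration; it is not taken from the actableai source.

```python
from typing import List, Sequence


class RunningMean:
    """Toy online regressor that predicts the mean of the targets seen so far."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0

    def learn_one(self, x: dict, y: float) -> "RunningMean":
        self.n += 1
        self.mean += (y - self.mean) / self.n  # incremental mean update
        return self

    def predict_one(self, x: dict) -> float:
        return self.mean


class PerOutputRegressor:
    """Holds one regressor per output and routes each target to its own model."""

    def __init__(self, models: list) -> None:
        self.models = list(models)

    def learn_one(self, x: dict, y: Sequence[float]) -> "PerOutputRegressor":
        for model, target in zip(self.models, y):
            model.learn_one(x, target)
        return self

    def predict_one(self, x: dict) -> List[float]:
        return [model.predict_one(x) for model in self.models]


regressor = PerOutputRegressor([RunningMean(), RunningMean()])
regressor.learn_one({"a": 1}, [1.0, 10.0])
regressor.learn_one({"a": 2}, [3.0, 30.0])
predictions = regressor.predict_one({"a": 3})
```

Because each output has its own model, the outputs are learned independently; any correlation between them is ignored in this sketch.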
actableai.utils.river.NRMSE¶Bases: river.metrics.mse.RMSE
Normalized RMSE class (wrapper around river’s RMSE class)
get()¶Return the current value of the metric.
revert(y_true, y_pred, sample_weight=1.0)¶Revert the metric.
update(y_true, y_pred, sample_weight=1.0)¶Update the metric.
actableai.utils.river.metrics_to_dict(metrics_object: river.metrics.base.Metrics) → Dict[str, float]¶Transform a river metrics object into a dictionary
actableai.utils.sanitize.sanitize_timezone(df: pandas.core.frame.DataFrame) → pandas.core.frame.DataFrame¶Sanitize TimeZone from DataFrame
actableai.utils.testing.generate_date_range(np_rng=None, start_date=None, min_periods=10, max_periods=60, periods=None, freq=None)¶TODO write documentation
actableai.utils.testing.generate_forecast_dataset(np_rng, prediction_length, n_groups=1, n_targets=1, freq=None, n_real_dynamic_features=0, n_cat_dynamic_features=0, n_real_static_features=0, n_cat_static_features=0, date_range_kwargs=None)¶TODO write documentation
actableai.utils.testing.generate_forecast_df(np_rng, prediction_length, n_group_by=0, n_targets=1, freq=None, n_real_static_features=0, n_cat_static_features=0, n_real_dynamic_features=0, n_cat_dynamic_features=0, date_range_kwargs=None)¶TODO write documentation
actableai.utils.testing.generate_random_date(np_rng=None, min_year=1900, min_month=1, min_day=1, max_year=2000, max_month=1, max_day=1, random_state=None)¶TODO write documentation
actableai.utils.testing.init_ray(**kwargs)¶
actableai.utils.testing.unittest_autogluon_hyperparameters()¶
actableai.utils.testing.unittest_dml_parameters()¶
actableai.utils.testing.unittest_estimator_parameters()¶
actableai.utils.testing.unittest_hyperparameters()¶
actableai.utils.check_if_integer_feature(X: pandas.core.series.Series)¶
actableai.utils.custom_precision_recall_curve(y_true, probas_pred, *, pos_label=None, sample_weight=None)¶
actableai.utils.debiasing_feature_generator_args()¶
actableai.utils.debiasing_hyperparameters()¶
actableai.utils.explanation_hyperparameters()¶
actableai.utils.fast_categorical_hyperparameters()¶
actableai.utils.fill_na(df, fillna_dict=None, fill_median=True)¶
actableai.utils.get_all_subclasses(cls: Type[actableai.utils.ClassType]) → List[Type[actableai.utils.ClassType]]¶
actableai.utils.get_type_special(X: pandas.core.series.Series) → str¶
actableai.utils.get_type_special_no_ag(X: pandas.core.series.Series) → str¶From autogluon library TODO improve
actableai.utils.handle_boolean_features(df)¶
actableai.utils.handle_datetime_features(df)¶
actableai.utils.is_fitted(transformer)¶
actableai.utils.is_gpu_available()¶
actableai.utils.is_text_column(X, text_ratio=0.1)¶
actableai.utils.memory_efficient_hyperparameters(ag_automm_enabled: bool = False, tabpfn_enabled: bool = False)¶
actableai.utils.preprocess_dataset(df)¶
actableai.utils.quantile_regression_hyperparameters()¶
actableai.utils.random_directory(path='')¶Create random directory,