pyFUME model builder

class pyfume.pyfume.pyFUME(datapath=None, dataframe=None, nr_clus=2, process_categorical=False, method='Takagi-Sugeno', variable_names=None, merge_threshold=1.0, **kwargs)

Bases: object

Creates a new fuzzy model.

Parameters
  • datapath – The path to the csv file containing the input data (argument ‘datapath’ or ‘dataframe’ should be specified by the user).

  • dataframe – Pandas dataframe containing the input data (argument ‘datapath’ or ‘dataframe’ should be specified by the user).

  • nr_clus – Number of clusters that should be identified in the data (default = 2).

  • process_categorical – Boolean to indicate whether categorical variables should be processed (default = False).

  • method – The type of fuzzy model to build. At this moment, only Takagi-Sugeno models are supported (default = ‘Takagi-Sugeno’).

  • variable_names – Names of the variables. If not specified, the names are read from the first row of the csv file (default = None).

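  • merge_threshold – Threshold for GRABS to drop fuzzy sets from the model. If the Jaccard similarity between two fuzzy sets is higher than this threshold, one of them is dropped from the model (default = 1.0). Because a Jaccard similarity never exceeds 1, the default leaves all fuzzy sets in place.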

  • **kwargs – Additional arguments to change settings of the fuzzy model.
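To make the role of merge_threshold more concrete, the sketch below computes a Jaccard similarity between fuzzy sets whose membership functions are sampled on a shared grid. It uses the common "sum of minima over sum of maxima" formulation; whether GRABS in pyFUME computes the similarity in exactly this way is an assumption, and triangle, fuzzy_jaccard and grid are illustrative names that are not part of pyFUME.

```python
def triangle(x, left, peak, right):
    """Triangular membership function (illustrative helper, not pyFUME API)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzy_jaccard(mu_a, mu_b):
    """Jaccard similarity of two sampled membership functions: sum(min) / sum(max)."""
    intersection = sum(min(a, b) for a, b in zip(mu_a, mu_b))
    union = sum(max(a, b) for a, b in zip(mu_a, mu_b))
    return intersection / union if union > 0 else 0.0


# Sample three fuzzy sets on the same grid over [0, 1].
grid = [i / 100 for i in range(101)]
set_a = [triangle(x, 0.0, 0.5, 1.0) for x in grid]
set_b = [triangle(x, 0.1, 0.55, 1.0) for x in grid]  # overlaps heavily with set_a
set_c = [triangle(x, 0.6, 0.8, 1.0) for x in grid]   # overlaps only slightly with set_a

similar = fuzzy_jaccard(set_a, set_b)
dissimilar = fuzzy_jaccard(set_a, set_c)
```

With a hypothetical threshold of 0.7, the pair (set_a, set_b) would exceed it and the pair (set_a, set_c) would not.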

Returns

An object containing the fuzzy model, information about its settings (such as its antecedent and consequent parameters) and the different splits of the data.
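A minimal usage sketch, based only on the constructor signature above and on the calculate_error and get_model methods documented in this section. The function name build_and_evaluate and its csv_path argument are hypothetical; running it requires the pyfume package and a csv file laid out the way pyFUME expects.

```python
def build_and_evaluate(csv_path, nr_clus=2):
    """Build a pyFUME model from a csv file and report its test-set error.

    csv_path is a hypothetical path supplied by the caller.
    """
    # Imported lazily so this sketch can be loaded without pyFUME installed.
    from pyfume.pyfume import pyFUME

    # Creating the object builds the fuzzy model (see "Creates a new fuzzy model").
    fis = pyFUME(datapath=csv_path, nr_clus=nr_clus)

    # Performance on the test data, using the default metric.
    mae = fis.calculate_error(method='MAE')

    # The fuzzy model as an executable object.
    model = fis.get_model()
    return model, mae
```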

calculate_error(method='MAE')

Calculates the performance of the model given the test data.

Parameters

method – The performance metric to be used to evaluate the model (default = ‘MAE’). Choose from: Mean Absolute Error (‘MAE’), Mean Squared Error (‘MSE’), Root Mean Squared Error (‘RMSE’), Mean Absolute Percentage Error (‘MAPE’).

Returns

The performance as expressed by the chosen performance metric.
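For reference, the four metrics can be written out as follows. These are the standard textbook definitions, not pyFUME's own implementation; in particular, whether pyFUME reports MAPE as a fraction or as a percentage is not stated here, so the sketch assumes a fraction. The function names and the METRICS lookup table are illustrative.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean Squared Error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error."""
    return math.sqrt(mse(y_true, y_pred))


def mape(y_true, y_pred):
    """Mean Absolute Percentage Error as a fraction (assumption).

    Undefined when any target value is zero.
    """
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Lookup table keyed by the metric names listed above.
METRICS = {'MAE': mae, 'MSE': mse, 'RMSE': rmse, 'MAPE': mape}

y_true = [2.0, 4.0, 5.0]
y_pred = [1.0, 4.0, 7.0]
scores = {name: fn(y_true, y_pred) for name, fn in METRICS.items()}
```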

denormalize_values(data)

Takes normalized data points and returns their denormalized (raw) values. This method only works if the data was normalized with the min-max method during modeling.

Parameters

data – The normalized input data (as numpy array with each row a different data instance and variables in the same order as in the original training data set) for which the denormalized values should be calculated.

Returns

Denormalized values.
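The inverse of min-max scaling can be sketched as below, assuming the data was scaled to the range [0, 1] with per-variable minima and maxima taken from the training data. The helper denormalize and the values in col_min and col_max are hypothetical and only illustrate the arithmetic.

```python
def denormalize(rows, col_min, col_max):
    """Map values from [0, 1] back to the raw scale, column by column."""
    return [
        [v * (hi - lo) + lo for v, lo, hi in zip(row, col_min, col_max)]
        for row in rows
    ]


# Hypothetical per-variable minima and maxima of the training data.
col_min = [10.0, -1.0]
col_max = [20.0, 3.0]

# Two normalized data instances (rows) with two variables (columns).
normalized = [[0.0, 0.5], [1.0, 0.25]]
raw = denormalize(normalized, col_min, col_max)
```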

get_cluster_centers()

Returns the cluster centers as identified by pyFUME.

Returns

The cluster centers.

get_data(data_set='test')

Returns the training data, the test data, or both.

Parameters

data_set – Specifies whether the function should return the training set (data_set = “train”), the test set (data_set = “test”) or both training and test data (data_set = “all”). By default, the function returns the test set.

Returns

Tuple (x_data, y_data) containing the requested data set.

get_firing_strengths(data, normalize=True)

Calculates the (normalized) firing strength/activation level of each rule for each data instance of the given data.

Parameters
  • data – The input data (as numpy array with each row a different data instance and variables in the same order as in the original training data set) for which the firing strengths should be calculated.

  • normalize – Boolean that indicates whether the returned firing strengths should be normalized (normalize = True) or not (normalize = False). When the firing strengths are normalized, the summed firing strengths for each data instance equal one (default = True).

Returns

Firing strength/activation level of each rule (columns) for each data instance (rows).
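The normalization step can be illustrated as follows: each row (data instance) is divided by its own sum, so that the firing strengths of all rules add up to one. This is a sketch of the idea rather than pyFUME's code; the helper name and the handling of rows in which no rule fires are assumptions.

```python
def normalize_firing_strengths(strengths):
    """Scale each row so that the firing strengths of all rules sum to one."""
    normalized = []
    for row in strengths:
        total = sum(row)
        if total == 0:
            # No rule fires for this instance; keep zeros instead of dividing by zero.
            normalized.append([0.0 for _ in row])
        else:
            normalized.append([w / total for w in row])
    return normalized


# Rows are data instances, columns are rules.
raw_strengths = [[0.5, 0.25, 0.25], [0.125, 0.375, 0.0]]
normalized = normalize_firing_strengths(raw_strengths)
```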

get_fold_indices()

Returns a list with the fold indices of each model that is created if cross-validation is used during training.

Returns

Fold indices of each cross-validation model.
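As a generic illustration of what fold indices are, the sketch below partitions the row indices of a data set into k folds. It is not pyFUME's splitting routine: the exact structure of the returned list, and whether pyFUME shuffles the data first, are not specified here. The helper k_fold_indices is hypothetical.

```python
def k_fold_indices(n_samples, n_folds):
    """Split range(n_samples) into n_folds contiguous folds of near-equal size."""
    base, extra = divmod(n_samples, n_folds)
    folds, start = [], 0
    for k in range(n_folds):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if k < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


folds = k_fold_indices(n_samples=10, n_folds=3)
```

Each fold serves once as the held-out set while the remaining folds are used for training, which yields one model per fold.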

get_model()

Returns the fuzzy model created by pyFUME.

Returns

The fuzzy model (as an executable object).

get_performance_per_fold()

Returns a list with the performance of each model that is created if cross-validation is used during training.

Returns

Performance of each cross-validation model.

normalize_values(data)

Calculates the normalized values of a data point, using the same scaling that was applied to the training data of the model. This method only works if the data was normalized with the min-max method.

Parameters

data – The input data (as numpy array with each row a different data instance and variables in the same order as in the original training data set) for which the normalized values should be calculated.

Returns

Normalized values.
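Min-max scaling with training-set statistics can be sketched as below, assuming a target range of [0, 1]. The helper min_max_normalize and the example data are hypothetical. Note that a new point outside the range seen during training maps outside [0, 1].

```python
def min_max_normalize(rows, col_min, col_max):
    """Scale each column with the minima and maxima of the training data.

    A constant training column (max == min) would cause a division by zero.
    """
    return [
        [(v - lo) / (hi - lo) for v, lo, hi in zip(row, col_min, col_max)]
        for row in rows
    ]


# Hypothetical training data: rows are instances, columns are variables.
train = [[10.0, -1.0], [20.0, 3.0], [15.0, 1.0]]
col_min = [min(col) for col in zip(*train)]
col_max = [max(col) for col in zip(*train)]

# New points are scaled with the training statistics, not with their own.
new_points = [[12.5, 0.0], [25.0, 3.0]]
scaled = min_max_normalize(new_points, col_min, col_max)
```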

predict_label(xdata)

Calculates the predicted labels of a data set using the fuzzy model.

Parameters

xdata – The input data (as numpy array with each row a different data instance and variables in the same order as in the original training data set) for which the labels should be calculated.

Returns

Prediction labels.
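Since the supported method is Takagi-Sugeno, a prediction can be understood as a firing-strength-weighted average of the rule outputs. The sketch below shows a first-order Takagi-Sugeno inference step for a single instance. Gaussian membership functions, the product as conjunction, and the rule layout are all assumptions made for illustration; they do not describe pyFUME's internal representation.

```python
import math


def gaussian(x, mean, sigma):
    """Gaussian membership function."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


def ts_predict(x, rules):
    """First-order Takagi-Sugeno output for one instance x (a list of floats).

    Each rule is (antecedents, coefficients, intercept), where antecedents is a
    list of (mean, sigma) pairs, one per input variable.
    """
    weights, outputs = [], []
    for antecedents, coeffs, intercept in rules:
        # Firing strength: product of the memberships of all input variables.
        w = 1.0
        for xi, (mean, sigma) in zip(x, antecedents):
            w *= gaussian(xi, mean, sigma)
        weights.append(w)
        # Rule consequent: a linear function of the inputs.
        outputs.append(sum(c * xi for c, xi in zip(coeffs, x)) + intercept)
    # Weighted average of the rule outputs (assumes at least one rule fires).
    return sum(w * y for w, y in zip(weights, outputs)) / sum(weights)


rules = [
    ([(0.0, 1.0)], [1.0], 0.0),   # "x is low"  -> y = x
    ([(4.0, 1.0)], [0.0], 10.0),  # "x is high" -> y = 10
]
prediction = ts_predict([2.0], rules)
```

At x = 2 both rules fire equally, so the prediction is the plain average of the two rule outputs.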

predict_test_data()

Calculates the predicted labels of the test data using the fuzzy model.

Returns

Prediction labels.

test_model(xdata, ydata, error_metric='MAE')

Calculates the performance of the model using the given data.

Parameters
  • xdata – The input data (as numpy array with each row a different data instance and variables in the same order as in the original training data set) on which the model's predictions are evaluated.

  • ydata – The target data (as single-column numpy array).

  • error_metric – The error metric in which the performance should be expressed (default = ‘MAE’). Choose from: Mean Absolute Error (‘MAE’), Mean Squared Error (‘MSE’), Root Mean Squared Error (‘RMSE’), Mean Absolute Percentage Error (‘MAPE’).

Returns

The performance as expressed in the chosen metric.