pyFUME model builder
-
class
pyfume.pyfume.
pyFUME
(datapath=None, dataframe=None, nr_clus=2, process_categorical=False, method='Takagi-Sugeno', variable_names=None, merge_threshold=1.0, **kwargs) Bases:
object
Creates a new fuzzy model.
- Parameters
datapath – The path to the csv file containing the input data (argument ‘datapath’ or ‘dataframe’ should be specified by the user).
dataframe – Pandas dataframe containing the input data (argument ‘datapath’ or ‘dataframe’ should be specified by the user).
nr_clus – Number of clusters that should be identified in the data (default = 2).
process_categorical – Boolean to indicate whether categorical variables should be processed (default = False).
method – The type of fuzzy model to build. At this moment, only Takagi-Sugeno models are supported (default = ‘Takagi-Sugeno’).
variable_names – Names of the variables, if not specified the names will be read from the first row of the csv file (default = None).
merge_threshold – Threshold for GRABS to drop fuzzy sets from the model. If the Jaccard similarity between two fuzzy sets is higher than this threshold, the redundant set is dropped from the model (default = 1.0).
**kwargs – Additional arguments to change settings of the fuzzy model.
- Returns
An object containing the fuzzy model, information about its setting (such as its antecedent and consequent parameters) and the different splits of the data.
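To make the role of merge_threshold more concrete, the following is a minimal sketch of one common way to compute the Jaccard similarity of two fuzzy sets: the sum of pointwise minima divided by the sum of pointwise maxima over a shared grid of sample points. It is not taken from pyFUME's GRABS implementation, which may differ; the function name and the membership values are purely illustrative.

```python
def jaccard_similarity(mu_a, mu_b):
    """Jaccard index of two fuzzy sets sampled on the same grid of points."""
    if len(mu_a) != len(mu_b):
        raise ValueError("membership vectors must have the same length")
    # Fuzzy intersection and union taken as pointwise min and max.
    intersection = sum(min(a, b) for a, b in zip(mu_a, mu_b))
    union = sum(max(a, b) for a, b in zip(mu_a, mu_b))
    # Two empty sets are treated as identical by convention (an assumption).
    return 1.0 if union == 0 else intersection / union


# Hypothetical membership degrees of three fuzzy sets at five sample points.
low = [1.0, 0.8, 0.4, 0.1, 0.0]
also_low = [1.0, 0.7, 0.5, 0.1, 0.0]
high = [0.0, 0.1, 0.4, 0.8, 1.0]

sim_close = jaccard_similarity(low, also_low)  # nearly overlapping sets
sim_far = jaccard_similarity(low, high)        # largely disjoint sets
```

Under this definition the similarity never exceeds 1, so a threshold of 1.0 would leave all fuzzy sets in place, while a lower value such as 0.9 would flag the first pair above as near duplicates.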
-
calculate_error
(method='MAE') Calculates the performance of the model given the test data.
- Parameters
method: The performance metric to be used to evaluate the model (default = ‘MAE’). Choose from: Mean Absolute Error (‘MAE’), Mean Squared Error (‘MSE’), Root Mean Squared Error (‘RMSE’), Mean Absolute Percentage Error (‘MAPE’).
- Returns
The performance as expressed by the chosen performance metric.
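For reference, the four metrics named above can be written out in a few lines. This is a plain-Python sketch of the textbook definitions, not pyFUME's own code; in particular, whether pyFUME reports MAPE as a fraction or as a percentage is not stated here, so the fraction form below is an assumption.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean Squared Error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error."""
    return math.sqrt(mse(y_true, y_pred))


def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, returned as a fraction (assumption).

    Undefined when any true value is zero.
    """
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical targets and predictions.
y_true = [2.0, 4.0, 5.0, 10.0]
y_pred = [1.0, 5.0, 5.0, 8.0]

scores = {
    'MAE': mae(y_true, y_pred),
    'MSE': mse(y_true, y_pred),
    'RMSE': rmse(y_true, y_pred),
    'MAPE': mape(y_true, y_pred),
}
```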
-
denormalize_values
(data) Takes normalized data points and returns their denormalized (raw) values. This method only works if the data was normalized with the min-max method during modeling.
- Parameters
data – The normalized input data (as numpy array with each row a different data instance and variables in the same order as in the original training data set) for which the denormalized values should be calculated.
- Returns
Denormalized values.
-
get_cluster_centers
() Returns the cluster centers as identified by pyFUME.
- Returns
Cluster centers.
-
get_data
(data_set='test') Returns the training data, the test data, or both.
- Parameters
data_set – Used to specify whether the function should return the training set (data_set = “train”), the test set (data_set = “test”) or both training and test data (data_set = “all”). By default, the function returns the test set.
- Returns
Tuple (x_data, y_data) containing the requested data set.
-
get_firing_strengths
(data, normalize=True) Calculates the (normalized) firing strength/activation level of each rule for each data instance of the given data.
- Parameters
data – The input data (as numpy array with each row a different data instance and variables in the same order as in the original training data set) for which the firing strengths should be calculated.
normalize – Boolean that indicates whether the returned firing strengths should be normalized (normalize = True) or not (normalize = False). When the firing strengths are normalized, the summed firing strengths for each data instance equal one.
- Returns
Firing strength/activation level of each rule (columns) for each data instance (rows).
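The effect of the normalize flag can be illustrated with a small sketch: each row (one data instance) is divided by its sum so that the rule activations add up to one. This is only an illustration of the idea in plain Python; how pyFUME handles a row in which no rule fires is not documented here, so the all-zeros convention below is an assumption.

```python
def normalize_firing_strengths(strengths):
    """Scale each row so that the firing strengths of its rules sum to one."""
    normalized = []
    for row in strengths:
        total = sum(row)
        if total == 0:
            # No rule fires for this instance: keep zeros (assumed convention).
            normalized.append([0.0 for _ in row])
        else:
            normalized.append([value / total for value in row])
    return normalized


# Hypothetical raw firing strengths: rows are data instances, columns are rules.
raw = [
    [0.2, 0.6],
    [0.5, 0.5],
    [0.0, 0.0],
]
normalized = normalize_firing_strengths(raw)
```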
-
get_fold_indices
() Returns a list with the fold indices of each model that is created if cross-validation is used when training.
- Returns
Fold indices of each cross-validation model.
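As background on what fold indices are, the sketch below partitions sample indices into k disjoint folds, which is the general idea behind k-fold cross-validation. It is a generic illustration; the function name, the shuffling, and the seed are assumptions, and the exact structure that get_fold_indices returns is not specified here.

```python
import random


def k_fold_indices(n_samples, n_folds, seed=0):
    """Split the indices 0..n_samples-1 into n_folds disjoint folds."""
    indices = list(range(n_samples))
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(indices)
    # Fold i takes every n_folds-th shuffled index, starting at position i.
    return [indices[i::n_folds] for i in range(n_folds)]


folds = k_fold_indices(n_samples=10, n_folds=3)

# Each fold serves once as the held-out set; the remaining folds form the training set.
first_test_fold = folds[0]
first_train_indices = [i for fold in folds[1:] for i in fold]
```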
-
get_model
() Returns the fuzzy model created by pyFUME.
- Returns
The fuzzy model (as an executable object).
-
get_performance_per_fold
() Returns a list with the performance of each model that is created if cross-validation is used when training.
- Returns
Performance of each cross-validation model.
-
normalize_values
(data) Calculates the normalized values of a data point, using the same scaling that was applied to the training data of the model. This method only works when the data was normalized using the min-max method.
- Parameters
data – The input data (as numpy array with each row a different data instance and variables in the same order as in the original training data set) for which the normalized values should be calculated.
- Returns
Normalized values.
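The relationship between normalize_values and denormalize_values can be sketched with the standard min-max formulas: x' = (x - min) / (max - min) and x = x' * (max - min) + min, applied per variable with the minima and maxima of the training data. The helper names and data below are illustrative and are not pyFUME's internals.

```python
def column_ranges(rows):
    """Per-column minimum and maximum of the training data."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def min_max_normalize(rows, mins, maxs):
    """Map raw values to the training range; fails for constant columns (hi == lo)."""
    return [[(v - lo) / (hi - lo) for v, lo, hi in zip(row, mins, maxs)]
            for row in rows]


def min_max_denormalize(rows, mins, maxs):
    """Inverse of min_max_normalize: map normalized values back to raw values."""
    return [[v * (hi - lo) + lo for v, lo, hi in zip(row, mins, maxs)]
            for row in rows]


# Hypothetical training data with two variables.
train = [[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]]
mins, maxs = column_ranges(train)

new_point = [[2.5, 25.0]]
scaled = min_max_normalize(new_point, mins, maxs)
restored = min_max_denormalize(scaled, mins, maxs)
```

Because the scaling uses the training minima and maxima, a new point outside the training range maps outside the interval [0, 1] under these formulas.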
-
predict_label
(xdata) Calculates the predicted labels of a data set using the fuzzy model.
- Parameters
xdata – The input data (as numpy array with each row a different data instance and variables in the same order as in the original training data set) for which the labels should be calculated.
- Returns
Predicted labels.
-
predict_test_data
() Calculates the predicted labels of the test data using the fuzzy model.
- Returns
Predicted labels.
-
test_model
(xdata, ydata, error_metric='MAE') Calculates the performance of the model using the given data.
- Parameters
xdata – The input data (as numpy array with each row a different data instance and variables in the same order as in the original training data set) for which the labels should be calculated.
ydata – The target data (as single-column numpy array).
error_metric – The error metric in which the performance should be expressed (default = ‘MAE’). Choose from: Mean Absolute Error (‘MAE’), Mean Squared Error (‘MSE’), Root Mean Squared Error (‘RMSE’), Mean Absolute Percentage Error (‘MAPE’).
- Returns
The performance as expressed in the chosen metric.
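Putting the methods on this page together, a typical workflow might look like the sketch below. It uses only the class and method signatures documented above; the wrapper function build_and_evaluate, the csv_path argument, and the choice of nr_clus=3 are hypothetical. The import is placed inside the function so that the snippet can be loaded even where pyFUME is not installed.

```python
def build_and_evaluate(csv_path, nr_clus=3):
    """Train a pyFUME model from a CSV file and report errors on the test split."""
    # Lazy import: pyFUME is only needed when the function is actually called.
    from pyfume.pyfume import pyFUME

    # Build the fuzzy model from the data in the CSV file.
    fis = pyFUME(datapath=csv_path, nr_clus=nr_clus)

    # Error on the test data, using the default metric.
    mae = fis.calculate_error(method='MAE')

    # Retrieve the test split and evaluate it with a different metric.
    x_test, y_test = fis.get_data(data_set='test')
    rmse = fis.test_model(x_test, y_test, error_metric='RMSE')

    # Predictions for the same inputs.
    predictions = fis.predict_label(x_test)

    return {'MAE': mae, 'RMSE': rmse, 'predictions': predictions}
```

Calling build_and_evaluate with the path to a CSV file would return the two error values and the predicted labels; the expected layout of that file should be checked against the pyFUME documentation for the installed version.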