RankFM

class rankfm.rankfm.RankFM(factors=10, loss='bpr', max_samples=10, alpha=0.01, beta=0.1, sigma=0.1, learning_rate=0.1, learning_schedule='constant', learning_exponent=0.25)

Factorization Machines for Ranking Problems with Implicit Feedback Data

__init__(factors=10, loss='bpr', max_samples=10, alpha=0.01, beta=0.1, sigma=0.1, learning_rate=0.1, learning_schedule='constant', learning_exponent=0.25)

store hyperparameters and initialize internal model state

Parameters:

factors – latent factor rank
loss – optimization/loss function to use for training: [‘bpr’, ‘warp’]
max_samples – maximum number of negative samples to draw for WARP loss
alpha – L2 regularization penalty on [user, item] model weights
beta – L2 regularization penalty on [user-feature, item-feature] model weights
sigma – standard deviation to use for random initialization of factor weights
learning_rate – initial learning rate for gradient step updates
learning_schedule – schedule for adjusting learning rates by training epoch: [‘constant’, ‘invscaling’]
learning_exponent – exponent applied to epoch number to adjust learning rate: scaling = 1 / pow(epoch + 1, learning_exponent)

Returns:

None
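
The learning-rate parameters can be illustrated with a small stand-alone sketch. It is not the library's code: the function name `effective_learning_rate` is hypothetical, and it assumes that epochs are numbered from zero and that the documented scaling factor multiplies the initial `learning_rate`.

```python
def effective_learning_rate(learning_rate, epoch, learning_schedule="constant",
                            learning_exponent=0.25):
    """Return the step size for a given epoch (assumed to be zero-based)."""
    if learning_schedule == "constant":
        return learning_rate
    if learning_schedule == "invscaling":
        # Documented formula: scaling = 1 / pow(epoch + 1, learning_exponent)
        scaling = 1.0 / pow(epoch + 1, learning_exponent)
        # Assumption: the scaling factor multiplies the initial learning rate.
        return learning_rate * scaling
    raise ValueError(f"unknown learning_schedule: {learning_schedule!r}")


# Step sizes for a few epochs under the 'invscaling' schedule.
for epoch in (0, 15, 80):
    print(epoch, effective_learning_rate(0.1, epoch, "invscaling", 0.25))
```

Under these assumptions a larger `learning_exponent` shrinks the step size faster, while `learning_schedule='constant'` ignores the exponent entirely.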

fit(interactions, user_features=None, item_features=None, sample_weight=None, epochs=1, verbose=False)

clear previous model state and learn new model weights using the input data

Parameters:

interactions – dataframe of observed user/item interactions: [user_id, item_id]
user_features – dataframe of user metadata features: [user_id, uf_1, … , uf_n]
item_features – dataframe of item metadata features: [item_id, if_1, … , if_n]
sample_weight – vector of importance weights for each observed interaction
epochs – number of training epochs (full passes through observed interactions)
verbose – whether to print epoch number and log-likelihood during training

Returns:

self
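
A minimal usage sketch for `fit` follows. The helper names `count_interactions` and `train` are hypothetical, and using repeat counts as `sample_weight` is an assumption made for illustration, not a recommendation from this reference. The `train` function needs `numpy`, `pandas` and `rankfm` installed; they are imported lazily so the counting helper can be used on its own.

```python
from collections import Counter


def count_interactions(events):
    """Collapse raw (user_id, item_id) events into unique pairs plus repeat counts."""
    counts = Counter(events)
    pairs = list(counts)  # unique pairs in first-seen order
    weights = [float(counts[pair]) for pair in pairs]
    return pairs, weights


def train(events, epochs=20):
    # Imported lazily so count_interactions() works without these packages.
    import numpy as np
    import pandas as pd
    from rankfm.rankfm import RankFM

    pairs, weights = count_interactions(events)
    interactions = pd.DataFrame(pairs, columns=["user_id", "item_id"])
    sample_weight = np.asarray(weights)  # one weight per interaction row

    model = RankFM(factors=10, loss="warp", max_samples=10)
    # fit() clears any previous model state and returns the model itself.
    return model.fit(interactions, sample_weight=sample_weight,
                     epochs=epochs, verbose=True)
```

Calling `train([("u1", "a"), ("u2", "b"), ("u1", "a")])` would pass two interaction rows to `fit`, with the repeated pair given twice the weight of the other.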

fit_partial(interactions, user_features=None, item_features=None, sample_weight=None, epochs=1, verbose=False)

learn or update model weights using the input data and resuming from the current model state

Parameters:

interactions – dataframe of observed user/item interactions: [user_id, item_id]
user_features – dataframe of user metadata features: [user_id, uf_1, … , uf_n]
item_features – dataframe of item metadata features: [item_id, if_1, … , if_n]
sample_weight – vector of importance weights for each observed interaction
epochs – number of training epochs (full passes through observed interactions)
verbose – whether to print epoch number and log-likelihood during training

Returns:

self
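
Because `fit_partial` resumes from the current model state, it can be called repeatedly as new data arrives. The sketch below shows one way to do that; `train_incrementally` is a hypothetical helper, and it relies only on the documented signature and on `fit_partial` returning the model.

```python
def train_incrementally(model, batches, epochs_per_batch=1):
    """Feed interaction batches to model.fit_partial() one at a time."""
    for batch in batches:
        # Unlike fit(), fit_partial() keeps the weights learned so far.
        model.fit_partial(batch, epochs=epochs_per_batch)
    return model
```

Each element of `batches` is expected to be an interactions dataframe in the same `[user_id, item_id]` layout that `fit` accepts.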

predict(pairs, cold_start='nan')

calculate the predicted pointwise utilities for all (user, item) pairs

Parameters:

pairs – dataframe of [user, item] pairs to score
cold_start – whether to generate missing values (‘nan’) or drop (‘drop’) user/item pairs not found in training data

Returns:

np.array of real-valued model scores
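
The two `cold_start` modes can be pictured with a plain-Python sketch. This only mirrors the documented behaviour and is not the library's implementation; `score_pairs`, `score_fn`, `known_users` and `known_items` are hypothetical names.

```python
import math


def score_pairs(pairs, score_fn, known_users, known_items, cold_start="nan"):
    """Score (user, item) pairs, treating pairs unseen in training per cold_start."""
    if cold_start not in ("nan", "drop"):
        raise ValueError("cold_start must be 'nan' or 'drop'")
    scores = []
    for user, item in pairs:
        if user in known_users and item in known_items:
            scores.append(score_fn(user, item))
        elif cold_start == "nan":
            scores.append(math.nan)  # keep the row, mark the score as missing
        # cold_start == "drop": skip the pair, so the output is shorter than the input
    return scores
```

With ‘nan’ the result stays aligned with the input pairs, whereas with ‘drop’ the caller has to track which pairs were removed.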

recommend(users, n_items=10, filter_previous=False, cold_start='nan')

calculate the top-N recommended items for each user

Parameters:

users – iterable of user identifiers for which to generate recommendations
n_items – number of recommended items to generate for each user
filter_previous – remove observed training items from generated recommendations
cold_start – whether to generate missing values (‘nan’) or drop (‘drop’) users not found in training data

Returns:

pandas dataframe where the index values are user identifiers and the columns are recommended items
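
The effect of `n_items` and `filter_previous` for a single user can be sketched in plain Python. This is an illustration of the documented behaviour under simple assumptions, not the library's code; `top_n_items`, `item_scores` and `seen_items` are hypothetical names.

```python
def top_n_items(item_scores, n_items=10, seen_items=(), filter_previous=False):
    """Return up to n_items item ids ordered from highest to lowest score."""
    excluded = set(seen_items) if filter_previous else set()
    candidates = [item for item in item_scores if item not in excluded]
    # Sort by descending score; ties keep the dict's insertion order (stable sort).
    candidates.sort(key=lambda item: item_scores[item], reverse=True)
    return candidates[:n_items]


scores = {"a": 0.9, "b": 0.4, "c": 0.7, "d": 0.1}
print(top_n_items(scores, n_items=2))
print(top_n_items(scores, n_items=2, seen_items=["a"], filter_previous=True))
```

With `filter_previous=True`, items the user already interacted with in training are removed before the top items are selected, so the list is filled with the next-best items instead.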

similar_items(item_id, n_items=10)

find the items most similar to the given item with respect to their latent factor representations

Parameters:

item_id – item to search
n_items – number of similar items to return

Returns:

np.array of the top-N most similar items with respect to latent factor representations

similar_users(user_id, n_users=10)

find the users most similar to the given user with respect to their latent factor representations

Parameters:

user_id – user to search
n_users – number of similar users to return

Returns:

np.array of the top-N most similar users with respect to latent factor representations
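
A latent-factor similarity search, as used by both `similar_items` and `similar_users`, can be sketched as follows. This reference does not state which similarity measure the library uses, so the sketch assumes cosine similarity and assumes the query id is excluded from its own results; the library may instead use, for example, a plain dot product. `most_similar` and `factors` are hypothetical names.

```python
import math


def most_similar(target_id, factors, n=10):
    """Rank the other ids by cosine similarity to target_id's factor vector."""
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm if norm else 0.0

    target = factors[target_id]
    # Assumption: the query id is not returned as its own neighbour.
    others = [key for key in factors if key != target_id]
    others.sort(key=lambda key: cosine(target, factors[key]), reverse=True)
    return others[:n]


item_factors = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [-1.0, 0.0]}
print(most_similar("a", item_factors, n=2))
```

Here `factors` maps each id to its latent factor vector, so the same function covers items and users.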