RankFM

class rankfm.rankfm.RankFM(factors=10, loss='bpr', max_samples=10, alpha=0.01, beta=0.1, sigma=0.1, learning_rate=0.1, learning_schedule='constant', learning_exponent=0.25)

Factorization Machines for Ranking Problems with Implicit Feedback Data

__init__(factors=10, loss='bpr', max_samples=10, alpha=0.01, beta=0.1, sigma=0.1, learning_rate=0.1, learning_schedule='constant', learning_exponent=0.25)

store hyperparameters and initialize internal model state

Parameters:
  • factors – latent factor rank
  • loss – optimization/loss function to use for training: [‘bpr’, ‘warp’]
  • max_samples – maximum number of negative samples to draw for WARP loss
  • alpha – L2 regularization penalty on [user, item] model weights
  • beta – L2 regularization penalty on [user-feature, item-feature] model weights
  • sigma – standard deviation to use for random initialization of factor weights
  • learning_rate – initial learning rate for gradient step updates
  • learning_schedule – schedule for adjusting learning rates by training epoch: [‘constant’, ‘invscaling’]
  • learning_exponent – exponent applied to epoch number to adjust learning rate: scaling = 1 / pow(epoch + 1, learning_exponent)
Returns:

None
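The `invscaling` schedule described under `learning_exponent` can be sketched in a few lines. This is a minimal illustration, assuming the scaling factor multiplies the initial `learning_rate` and that epochs are counted from zero; the helper name `effective_learning_rate` is hypothetical and not part of RankFM.

```python
def effective_learning_rate(learning_rate, epoch, schedule="constant", exponent=0.25):
    """Return the step size for a zero-based epoch (illustrative helper, not RankFM API)."""
    if schedule == "constant":
        return learning_rate
    if schedule == "invscaling":
        # scaling = 1 / pow(epoch + 1, learning_exponent), as in the parameter description
        scaling = 1.0 / pow(epoch + 1, exponent)
        return learning_rate * scaling
    raise ValueError("schedule must be 'constant' or 'invscaling'")


# step sizes over the first few epochs with the default exponent
rates = [effective_learning_rate(0.1, epoch, "invscaling", 0.25) for epoch in range(4)]
```

Under these assumptions the first epoch uses the full `learning_rate`, and a larger `learning_exponent` makes the step size decay faster.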

fit(interactions, user_features=None, item_features=None, sample_weight=None, epochs=1, verbose=False)

clear previous model state and learn new model weights using the input data

Parameters:
  • interactions – dataframe of observed user/item interactions: [user_id, item_id]
  • user_features – dataframe of user metadata features: [user_id, uf_1, … , uf_n]
  • item_features – dataframe of item metadata features: [item_id, if_1, … , if_n]
  • sample_weight – vector of importance weights, one per observed interaction
  • epochs – number of training epochs (full passes through observed interactions)
  • verbose – whether to print epoch number and log-likelihood during training
Returns:

self
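This reference does not define the log-likelihood that `verbose` prints. For the ‘bpr’ loss, a common formulation sums the log-sigmoid of the score margin between an observed item and a sampled unobserved item. The sketch below assumes that formulation and assumes `sample_weight` simply multiplies each term; neither is confirmed by this page, and `bpr_log_likelihood` is a hypothetical name.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(1 / (1 + exp(-x)))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def bpr_log_likelihood(score_pairs, sample_weight=None):
    """Sum of weighted log-sigmoid margins over (positive, negative) score pairs."""
    if sample_weight is None:
        sample_weight = [1.0] * len(score_pairs)
    return sum(
        w * log_sigmoid(pos - neg)
        for (pos, neg), w in zip(score_pairs, sample_weight)
    )


# hypothetical scores: (observed item, sampled unobserved item)
pairs = [(2.0, 0.5), (1.0, 1.0), (0.2, 1.5)]
ll = bpr_log_likelihood(pairs)
```

The value is always negative and moves toward zero as observed items are scored further above unobserved ones, so a rising log-likelihood across epochs indicates improving pairwise ranking.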

fit_partial(interactions, user_features=None, item_features=None, sample_weight=None, epochs=1, verbose=False)

learn or update model weights using the input data, resuming from the current model state

Parameters:
  • interactions – dataframe of observed user/item interactions: [user_id, item_id]
  • user_features – dataframe of user metadata features: [user_id, uf_1, … , uf_n]
  • item_features – dataframe of item metadata features: [item_id, if_1, … , if_n]
  • sample_weight – vector of importance weights, one per observed interaction
  • epochs – number of training epochs (full passes through observed interactions)
  • verbose – whether to print epoch number and log-likelihood during training
Returns:

self
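The difference between `fit` and `fit_partial` comes down to whether earlier training is discarded. The toy class below is a hypothetical stand-in, not RankFM itself: it only counts epochs, and it assumes `fit` is equivalent to a state reset followed by `fit_partial`.

```python
class ToyModel:
    """Hypothetical stand-in that only counts the epochs it has been trained for."""

    def __init__(self):
        self.epochs_seen = 0

    def _reset_state(self):
        self.epochs_seen = 0

    def fit_partial(self, interactions, epochs=1):
        # resume from whatever state the model is currently in
        self.epochs_seen += epochs
        return self

    def fit(self, interactions, epochs=1):
        # discard earlier training, then train from scratch
        self._reset_state()
        return self.fit_partial(interactions, epochs=epochs)


data = [("u1", "i1"), ("u2", "i3")]   # hypothetical [user_id, item_id] rows
model = ToyModel().fit(data, epochs=5)
model.fit_partial(data, epochs=2)     # continues from the 5 epochs already run
resumed = model.epochs_seen
model.fit(data, epochs=3)             # starts over
restarted = model.epochs_seen
```

Because both methods return `self`, calls can be chained as in `ToyModel().fit(...)`.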

predict(pairs, cold_start='nan')

calculate the predicted pointwise utilities for all (user, item) pairs

Parameters:
  • pairs – dataframe of [user_id, item_id] pairs to score
  • cold_start – how to handle user/item pairs not found in training data: return missing values (‘nan’) or drop the pairs (‘drop’)
Returns:

np.array of real-valued model scores
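The two `cold_start` modes can be illustrated with a small stand-in scorer. This sketch assumes a pair counts as unseen when either its user or its item is absent from training; the function `score_pairs` and the additive placeholder score are hypothetical and are not the model's actual utility function.

```python
import math


def score_pairs(pairs, user_scores, item_scores, cold_start="nan"):
    """Score (user, item) pairs, handling users or items unseen in training."""
    if cold_start not in ("nan", "drop"):
        raise ValueError("cold_start must be 'nan' or 'drop'")
    scores = []
    for user, item in pairs:
        if user in user_scores and item in item_scores:
            # placeholder utility: the real model combines weights and latent factors
            scores.append(user_scores[user] + item_scores[item])
        elif cold_start == "nan":
            scores.append(math.nan)   # keep the row, mark the score as missing
        # with 'drop' the unseen pair is skipped entirely
    return scores


known_users = {"u1": 0.5, "u2": -0.2}   # hypothetical per-user values
known_items = {"i1": 0.3}               # hypothetical per-item values
query = [("u1", "i1"), ("u9", "i1"), ("u2", "i7")]
with_nan = score_pairs(query, known_users, known_items, cold_start="nan")
dropped = score_pairs(query, known_users, known_items, cold_start="drop")
```

In this sketch ‘nan’ keeps the output aligned row for row with the input pairs, while ‘drop’ returns a shorter result containing only the pairs that could be scored.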

recommend(users, n_items=10, filter_previous=False, cold_start='nan')

calculate the top-N recommended items for each user

Parameters:
  • users – iterable of user identifiers for which to generate recommendations
  • n_items – number of recommended items to generate for each user
  • filter_previous – whether to remove items observed in training from the generated recommendations
  • cold_start – how to handle users not found in training data: return missing values (‘nan’) or drop the users (‘drop’)
Returns:

pandas dataframe where the index values are user identifiers and the columns are recommended items
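The top-N selection and the effect of `filter_previous` can be sketched for a single user as follows. The helper `top_n_items`, its `seen_items` argument, and the score values are hypothetical; the sketch assumes recommendations are simply the highest-scoring items after optional filtering.

```python
import heapq


def top_n_items(item_scores, n_items=10, seen_items=None, filter_previous=False):
    """Return the n_items highest-scoring item ids for one user, best first."""
    candidates = list(item_scores.items())
    if filter_previous and seen_items:
        # exclude items the user already interacted with in training
        candidates = [(item, s) for item, s in candidates if item not in seen_items]
    best = heapq.nlargest(n_items, candidates, key=lambda pair: pair[1])
    return [item for item, _ in best]


scores = {"i1": 0.9, "i2": 0.4, "i3": 0.7, "i4": 0.1}   # hypothetical scores for one user
seen = {"i1"}                                            # items observed in training
all_items = top_n_items(scores, n_items=2)
unseen_only = top_n_items(scores, n_items=2, seen_items=seen, filter_previous=True)
```

Filtering happens before the top-N cut in this sketch, so the user still receives `n_items` recommendations whenever enough unseen items exist.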

similar_items(item_id, n_items=10)

find the items most similar to a given item, with respect to their latent factor representations

Parameters:
  • item_id – identifier of the item for which to find similar items
  • n_items – number of similar items to return
Returns:

np.array of the n_items most similar items, as measured in latent factor space

similar_users(user_id, n_users=10)

find the users most similar to a given user, with respect to their latent factor representations

Parameters:
  • user_id – identifier of the user for which to find similar users
  • n_users – number of similar users to return
Returns:

np.array of the n_users most similar users, as measured in latent factor space
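A similarity lookup in latent factor space, as used by `similar_items` and `similar_users`, can be sketched as a nearest-neighbour search over factor vectors. This page does not state which similarity measure is used, so the sketch assumes the dot product and assumes the query itself is excluded from the results; `most_similar` and the factor values are hypothetical.

```python
def most_similar(query_id, factors, n=10):
    """Rank other ids by dot-product similarity to query_id's factor vector."""
    query = factors[query_id]
    sims = {
        other: sum(a * b for a, b in zip(query, vec))
        for other, vec in factors.items()
        if other != query_id          # do not return the query itself
    }
    return sorted(sims, key=sims.get, reverse=True)[:n]


# hypothetical 2-dimensional item factors
item_factors = {
    "i1": [1.0, 0.0],
    "i2": [0.9, 0.1],
    "i3": [0.0, 1.0],
    "i4": [-1.0, 0.0],
}
neighbours = most_similar("i1", item_factors, n=2)
```

The same function works for user factors. Cosine similarity would be a reasonable alternative if vector norms should not influence the ranking.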