Probabilistic Latent Tensor Factorization Matlab Toolkit

About

Probabilistic Latent Tensor Factorization (PLTF) is a probabilistic framework for specifying and factorizing probabilistic models involving an arbitrary number of dimensions. Given any model specification, the standard PLTF update equations can be used to calculate any number of factors given in the model.

The toolkit currently implements two operations: generalized multiplication (tensor multiplication) and PLTF update rule iteration. Each operation has a parallel version that runs on Nvidia GPUs using CUDA and a sequential version that runs on the CPU.

This software is developed by Can Kavaklıoğlu and A. Taylan Cemgil with support from The Scientific and Technological Research Council of Turkey (TUBITAK) project number 110E292 and Boğaziçi University Scientific Research Projects grant number P5723.

DISCLAIMER: This is alpha stage software and is provided "AS IS", without warranty of any kind, express or implied.

Downloads

Instructions: Simply extract the compressed files to your working directory in Matlab. If you have a CUDA development environment set up, you should be able to use the binaries to run the examples. The binaries listed below are compiled with CUDA SDK 4.1 and require a CUDA-enabled GPU with compute capability 1.3 or above.

Important: on test systems with Matlab 7.10.0.499 (R2010a), the "GLIBCXX_3.4.14 not found" error is solved by moving some files out of the sys/os/glnxa64 directory of the Matlab installation. You can achieve this with the following shell commands, where matlab_home stands for your Matlab installation directory:
cd matlab_home/sys/os/glnxa64
mkdir old
mv libgcc_s.so.1 libstdc++.so.6.0.9 libstdc++.so.6 old/

Binaries

          64 bit Matlab                  32 bit Matlab
Linux     pltf_linux_64bit_0.01.tar.gz   coming soon
Windows   coming soon                    coming soon

Syntax

Generalized multiplication/division syntax

The gmult_seq and gmult_par functions multiply or divide tensors with an arbitrary number of dimensions. 'seq' stands for sequential code running on the CPU and 'par' stands for parallel code running on the GPU. The Matlab syntax is the same for both functions:
output_object = gmult_seq( input1_data, input1_cardinality_vector, ...
                           input2_data, input2_cardinality_vector, ...
                           output_cardinality_vector, ...
                           use_multiplication)

To perform the operation on the GPU, call gmult_par with the same syntax. With sufficiently large data the effect of the parallel code becomes observable. The examples below pass 1 as use_multiplication to request multiplication.

Tensor factorization with PLTF syntax

The pltf_seq and pltf_par functions run the PLTF update equations on the model specified by the function parameters. Currently only the KL divergence version of the PLTF update equations is available. To perform the operation on the GPU, call pltf_par with the same syntax:
[factor1 factor2] = pltf_seq ( iteration_number, ...
                               all_indices_symbol_vector, all_indices_cardinality_vector, ...
                               input_symbols_vector, input_data, ...
                               factor1_symbols_vector, ...
                               factor2_symbols_vector)
Please note that the number of output objects must match the number of factors specified in the function arguments.
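
For intuition about what a KL divergence update iteration does, the two-factor matrix case should correspond to the familiar multiplicative updates for non-negative matrix factorization under KL divergence. The sketch below is plain Matlab, does not call the toolkit, and is not the toolkit's implementation. The names X, W, H, R and kl are chosen only for this illustration.

```matlab
% Illustrative sketch only: multiplicative KL updates for X ~ W*H.
I = 6; K = 7; R = 3;            % hypothetical sizes
X = rand(I, K) + 0.1;           % strictly positive data
W = rand(I, R) + 0.1;           % random non-negative initial factors
H = rand(R, K) + 0.1;

% Generalized KL divergence, same expression as get_KL_div in the examples.
kl = @(P, Q) sum(sum(P .* (log(P) - log(Q)) - P + Q));
kl_before = kl(X, W * H);

for it = 1:50
    % Update W with H fixed, then H with the new W.
    W = W .* ((X ./ (W * H)) * H') ./ (ones(I, K) * H');
    H = H .* (W' * (X ./ (W * H))) ./ (W' * ones(I, K));
end

kl_after = kl(X, W * H);
fprintf('KL before: %g, KL after: %g\n', kl_before, kl_after);
```

Each update multiplies a factor by a ratio of non-negative terms, so the factors stay non-negative and the divergence does not increase from one iteration to the next.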

Examples

Make sure the config text file is in your current working directory. If it is missing, you may encounter the following error: common/tensorop_par.cu(152) : cutilCheckMsg() CUTIL CUDA error : Kernel execution failed : (9) invalid configuration argument.

Matrix multiplication using generalized multiplication notation

A = rand(10, 20);
B = rand(20, 30);
C = gmult_seq(A, [10 20 0], B, [0 20 30], [10 0 30], 1)
	
Please note that each cardinality vector must define a value for every dimension of the model. A value of 0 indicates that the corresponding tensor has no elements in that dimension. The ordering of the entries is therefore crucial: the same position must refer to the same dimension in every cardinality vector.
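
The role of the cardinality vectors can be illustrated without the toolkit. The following plain-Matlab sketch is an assumption about the semantics described above, not the toolkit's implementation: a cardinality of 0 is treated as a singleton dimension, the inputs are multiplied element-wise over the shared index space, and every dimension with cardinality 0 in the output is summed out. The variable names are hypothetical.

```matlab
% Illustrative sketch only: does not call gmult_seq or gmult_par.
A = rand(10, 20);  A_card = [10 20 0];
B = rand(20, 30);  B_card = [0 20 30];
C_card = [10 0 30];

% A cardinality of 0 becomes a singleton dimension in the shared index space.
A_full = reshape(A, max(A_card, 1));   % 10 x 20 (x 1)
B_full = reshape(B, max(B_card, 1));   % 1 x 20 x 30

% Element-wise product over the full index space (10 x 20 x 30).
P = bsxfun(@times, A_full, B_full);

% Sum out every dimension that the output does not keep.
for d = find(C_card == 0)
    P = sum(P, d);
end
C_ref = squeeze(P);                    % 10 x 30

% For this choice of cardinalities the result is the matrix product.
max_abs_err = max(max(abs(C_ref - A * B)));
```

In this reading, the middle dimension appears in both inputs but not in the output, which is why the operation reduces to an ordinary matrix product.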

Comparison with parallel version

I=100;
J=200;
K=500;

A = rand(I, J);
B = rand(J, K);

tic; C = gmult_seq(A, [I J 0], B, [0 J K], [I 0 K], 1); toc;
Elapsed time is 0.905167 seconds.
tic; C = gmult_par(A, [I J 0], B, [0 J K], [I 0 K], 1); toc;
Elapsed time is 0.267984 seconds.
	
These tests were run on a laptop with an Intel Core i7-2630QM processor and an Nvidia GT 550M GPU. Please note that optimized matrix routines will give better results in this particular case. However, such optimizations are not readily available for tensors with a higher number of dimensions.

PARAFAC using PLTF updates

I=7;
J=8;
K=9;
A=10;

V_card_sym=['i','j','k','a'];
V_cards=[I, J, K, A];

A_card_sym=['i','a'];
A_true = round(10*rand(I,1,1,A));

B_card_sym=['j','a'];
B_true = round(20*rand(1,J,1,A));

C_card_sym=['k','a'];
C_true = round(30*rand(1,1,K,A));

X_card_sym = ['i','j','k'];
X_true = get_parafac(A_true,B_true,C_true,I,J,K,A,[I J K]);

X = poissrnd(X_true);
X(X==0)=0.000001; % suppress zeros, division/log problems, not the best method

iter_num=30;

tic; [factor_A factor_B factor_C] = pltf_seq ( iter_num, V_card_sym, V_cards, X_card_sym, X, ...
                                               A_card_sym, B_card_sym, C_card_sym); toc;
Elapsed time is 5.771417 seconds.
get_KL_div(X, get_parafac(factor_A,factor_B,factor_C,I,J,K,A,size(X)))
ans =
       9723.62
tic; [factor_A factor_B factor_C] = pltf_par ( iter_num, V_card_sym, V_cards, X_card_sym, X, ...
                                               A_card_sym, B_card_sym, C_card_sym); toc;
Elapsed time is 5.175981 seconds.
get_KL_div(X, get_parafac(factor_A,factor_B,factor_C,I,J,K,A,size(X)))
ans =
       9723.62

iter_num=100;
tic; [factor_A factor_B factor_C] = pltf_seq ( iter_num, V_card_sym, V_cards, X_card_sym, X, ...
                                               A_card_sym, B_card_sym, C_card_sym); toc;
Elapsed time is 19.134505 seconds.
get_KL_div(X, get_parafac(factor_A,factor_B,factor_C,I,J,K,A,size(X)))
ans =
       4256.09
tic; [factor_A factor_B factor_C] = pltf_par ( iter_num, V_card_sym, V_cards, X_card_sym, X, ...
                                               A_card_sym, B_card_sym, C_card_sym); toc;
Elapsed time is 17.015702 seconds.
get_KL_div(X, get_parafac(factor_A,factor_B,factor_C,I,J,K,A,size(X)))
ans =
       4256.09
-------------------------------------------------------------------------------------------------------------

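function [X1] = get_parafac(A_data,B_data,C_data,I,J,K,A,X_dims)
  % Reconstruct X1(i,j,k) = sum over a of A(i,a)*B(j,a)*C(k,a).
  % Each factor is indexed as (i,j,k,a), with 1 where an index is absent.

  X1=zeros(X_dims);
  for i=1:I
    for j=1:J
      for k=1:K
        for a=1:A
            X1(i,j,k) = X1(i,j,k) + ...
                        A_data(i,1,1,a)*B_data(1,j,1,a)*C_data(1,1,k,a);
        end
      end
    end
  end
end

function [ KL ] = get_KL_div( p_mat, q_mat )
    % Generalized KL divergence between p_mat and q_mat, summed over three dimensions.
    KL = sum(sum(sum( p_mat .* (log( p_mat ) - log(q_mat)) - p_mat + q_mat  )));
end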
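
The PARAFAC example replaces zeros in X with 0.000001 before computing the divergence and notes that this is not the best method. One common alternative, sketched here as an assumption and not as part of the toolkit, is to use the convention 0*log(0) = 0, so that entries where the data is zero contribute only the model value. The names P, Q, nz, KL_safe and KL_eps are hypothetical.

```matlab
% Illustrative sketch only: KL divergence that tolerates exact zeros in the data.
P = [0 2; 3 0];            % observed data containing exact zeros
Q = [0.5 1.5; 2.5 0.5];    % strictly positive model estimate

nz = P > 0;                % entries where P .* log(P ./ Q) is well defined
terms = Q - P;             % part of the divergence that is valid for every entry
terms(nz) = terms(nz) + P(nz) .* log(P(nz) ./ Q(nz));
KL_safe = sum(terms(:));   % works for any number of dimensions

% For comparison: the zero-suppression approach used in the example.
P_eps = P;
P_eps(P_eps == 0) = 0.000001;
KL_eps = sum(sum(P_eps .* (log(P_eps) - log(Q)) - P_eps + Q));
```

The two values agree closely here, but the masked version leaves the data untouched and never evaluates log(0).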
	

References

  • Yilmaz, K. Y.; Cemgil, A. T. & Simsekli, U., Generalized Coupled Tensor Factorization, NIPS, 2011
  • Yilmaz, K. & Cemgil, A. T., Probabilistic Latent Tensor Factorisation, Proc. of International Conference on Latent Variable Analysis and Signal Separation, 2010, 6365, 346-353
  • Cemgil, A. T.; Simsekli, U. & Subakan, Y. C., Probabilistic Latent Tensor Factorization Framework for Audio Modeling, Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA ’11), 2011
  • Cemgil, A. T., Bayesian Inference in Non-negative Matrix Factorisation Models, Computational Intelligence and Neuroscience, 2009
  • Fevotte, C. & Cemgil, A. T., Nonnegative Matrix Factorisations as Probabilistic Inference in Composite Models, Proc. 17th European Signal Processing Conference (EUSIPCO’09), 2009

Limitations

  • Memory: The current implementation needs enough GPU and host memory to hold the largest possible object, that is, one with the maximum cardinality on all dimensions. Although this is good for code clarity and testing purposes, it is in no way optimal. The next version of the software will provide ways of trading speed against memory requirements.
  • Hot start: The current implementation cannot be seeded with the results of a previous run. The next version of the software will provide ways of inputting initial states for the randomly initialized factors.
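
As a rough guide to the memory limitation, the size of one object spanning all dimensions can be estimated from the cardinality vector. The sketch below assumes double precision storage (8 bytes per element), which is an assumption and not something stated for the toolkit, and it counts a single full-size object only. The actual requirement may be higher.

```matlab
% Illustrative estimate only: size of one object over all dimensions.
bytes_per_element = 8;                 % assumption: double precision

V_cards = [7 8 9 10];                  % cardinalities from the PARAFAC example
num_elements = prod(V_cards);          % 5040 elements
full_bytes = num_elements * bytes_per_element;    % 40320 bytes

big_cards = [100 200 500];             % cardinalities from the timing comparison
big_bytes = prod(big_cards) * bytes_per_element;  % 80000000 bytes

fprintf('%d elements, %d bytes\n', num_elements, full_bytes);
fprintf('%.1f MiB\n', big_bytes / 2^20);
```

Because the estimate is a product of the cardinalities, adding a dimension or doubling one cardinality scales the requirement multiplicatively.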

Known Bugs

  • Known bugs will be listed here while they are being fixed.

FAQ

Q. I cannot get the binaries to work!
A. Make sure you have a working CUDA environment installed for your operating system. If all else fails, you can contact me via email. Please note that answers to your emails may be posted in this FAQ section.

Q. Why only binaries?
A. The sources will be made available under the GPL after publication of the work.

Q. How can I ask questions, comment, or make suggestions?
A. Feel free to contact me via email.


Created on Dec 2011