matops

Matrix operations. An almost complete implementation of the BLAS level 1, 2 and 3 routines for double-precision floating point. All computation is done in place. The implementation supports matrix views (submatrices of larger matrices), and some functions can already run in parallel in multiple threads for better performance.

NOTE: I am not working on this code base anymore. I have moved most of the functionality to a new package github.com/hrautila/gomas. See there for more details.

Supported functionality is:

BLAS level 3

Mult(C, A, B, alpha, beta, flags)           General matrix-matrix multiplication  (GEMM)
MultSymm(C, A, B, alpha, beta, flags)       Symmetric matrix-matrix multiplication (SYMM)
MultTrm(B, A, alpha, flags)                 Triangular matrix-matrix multiplication (TRMM)  
SolveTrm(B, A, alpha, flags)                Triangular solve with multiple RHS (TRSM)
RankUpdateSym(C, A, alpha, beta, flags)     Symmetric matrix rank-k update (SYRK)
RankUpdate2Sym(C, A, B, alpha, beta, flags) Symmetric matrix rank-2k update (SYR2K)
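
As a usage sketch for the routines listed above (not taken from the package documentation): the program below calls Mult to compute C = alpha*A*B + beta*C. It assumes that matrices come from the companion github.com/hrautila/matrix package and that its FloatOnes/FloatZeros constructors exist; the 0 passed as the flags argument is meant as "no transpose", but the actual flag constants (e.g. for transposing A or B) should be checked in the sources.

package main

import (
    "fmt"

    "github.com/hrautila/matops"
    "github.com/hrautila/matrix" // assumed companion matrix package
)

func main() {
    A := matrix.FloatOnes(3, 4)  // 3x4 matrix of ones (assumed constructor)
    B := matrix.FloatOnes(4, 2)  // 4x2
    C := matrix.FloatZeros(3, 2) // 3x2 result, overwritten in place

    // C = 1.0*A*B + 0.0*C (GEMM); 0 as the flags value is an assumption.
    matops.Mult(C, A, B, 1.0, 0.0, 0)
    fmt.Println(C)
}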

BLAS level 2

MVMult(X, A, Y, alpha, beta, flags)         General matrix-vector multiplication (GEMV)
MVRankUpdate(A, X, Y, alpha, flags)         General matrix rank-1 update (GER)
MVRankUpdateSym(A, X, alpha, flags)         Symmetric matrix rank-1 update (SYR)
MVRankUpdate2Sym(A, X, Y, alpha, flags)     Symmetric matrix rank-2 update (SYR2)
MVSolveTrm(X, A, alpha, flags)              Triangular solve (TRSV)
MVMultTrm(X, A, flags)                      Triangular matrix-vector multiplication (TRMV)
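
A corresponding sketch for the level-2 routines above, under the same assumptions. Following the in-place convention, MVMult presumably overwrites its first argument, i.e. X = alpha*A*Y + beta*X (GEMV).

package main

import (
    "github.com/hrautila/matops"
    "github.com/hrautila/matrix" // assumed companion matrix package
)

func main() {
    A := matrix.FloatOnes(3, 4)  // assumed constructors, as above
    Y := matrix.FloatOnes(4, 1)  // input vector (4x1)
    X := matrix.FloatZeros(3, 1) // result vector, overwritten in place

    // X = 1.0*A*Y + 0.0*X (GEMV); 0 as the flags value is an assumption.
    matops.MVMult(X, A, Y, 1.0, 0.0, 0)
}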

BLAS level 1

Norm2(X, Y)         Vector norm (NRM2)
Dot(X, Y)           Inner product (DOT)
Swap(X, Y)          Vector-vector swap (SWAP)
InvScale(X, alpha)  Inverse scaling of X 
Scale(X, alpha)     Scaling of X (SCAL)
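
The level-1 routines above operate on vectors; a brief sketch under the same assumptions (column vectors as n x 1 matrices):

package main

import (
    "fmt"

    "github.com/hrautila/matops"
    "github.com/hrautila/matrix" // assumed companion matrix package
)

func main() {
    X := matrix.FloatOnes(5, 1) // 5-element column vectors (assumed constructor)
    Y := matrix.FloatOnes(5, 1)

    fmt.Println(matops.Dot(X, Y)) // inner product (DOT)
    matops.Scale(X, 2.0)          // X = 2.0*X, in place (SCAL)
    matops.Swap(X, Y)             // exchange the contents of X and Y (SWAP)
}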

Additional

ScalePlus(A, B, alpha, beta, flags)         Calculate A = alpha*op(A) + beta*op(B)
NormP(X, norm)                              Matrix or vector norm, _1, _2, _Inf
UpdateTrm(C, A, B, alpha, beta, flags)      Triangular matrix update
MVUpdateTrm(C, X, Y, alpha, flags)          Triangular matrix update with vectors.
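
For example, ScalePlus from the list above can be used for a plain matrix sum. A minimal sketch, again with assumed constructors and with 0 standing in for "no transpose" flags:

package main

import (
    "github.com/hrautila/matops"
    "github.com/hrautila/matrix" // assumed companion matrix package
)

func main() {
    A := matrix.FloatOnes(3, 3)
    B := matrix.FloatOnes(3, 3)

    // A = 1.0*A + 1.0*B; the 0 flags value (no transposes) is an assumption.
    matops.ScalePlus(A, B, 1.0, 1.0, 0)
}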

LAPACK

DecomposeCHOL(A, nb)                Cholesky factorization (DPOTRF)
DecomposeLDLnoPiv(A, nb)            LDL factorization without pivoting
DecomposeLDL(A, W, ipiv, flgs, nb)  LDL factorization with pivoting
DecomposeLUnoPiv(A, nb)             LU factorization without pivoting
DecomposeLU(A, pivots, nb)          LU factorization with pivoting (DGETRF)
DecomposeQR(A, tau, nb)             QR factorization (DGEQRF)
DecomposeQRT(A, T, W, nb)           QR factorization, compact WY version (DGEQRT)
MultQ(C, A, tau, W, flgs, nb)       Multiply by Q  (DORMQR)
MultQT(C, A, T, W, flgs, nb)        Multiply by Q, compact WY version (DGEMQRT)
BuildQ(A, tau, W, nb)               Build matrix Q with orthonormal columns (DORGQR)
BuildQT(A, T, W, nb)                Build matrix Q with orthonormal columns, compact WY version
BuildT(T, A, tau)                   Build block reflector T from elementary reflectors (DLARFT)
SolveCHOL(B, A, flags)              Solve Cholesky factorized linear system (DPOTRS)
SolveLDL(B, A, pivots, flags)       Solve LDL factorized linear system
SolveLU(B, A, pivots, flags)        Solve LU factorized linear system (DGETRS)
SolveQR(B, A, tau, W, flgs, nb)     Solve least squares problem when m >= n (DGELS)
SolveQRT(B, A, T, W, flgs, nb)      Solve least squares problem when m >= n, compact WY (DGELS)
InverseTrm(A, flags, nb)            Inverse of a triangular matrix (DTRTRI)
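
Putting a factorization and its solver together, here is a hedged sketch of solving S*X = B for a symmetric positive definite S with the Cholesky routines listed above. The matrix.FloatMatrix type, the LOWER flag name and the handling of return values are assumptions; consult the package sources for the exact API.

package sketch

import (
    "github.com/hrautila/matops"
    "github.com/hrautila/matrix" // assumed companion matrix package
)

// cholSolve factors the symmetric positive definite matrix S in place and then
// overwrites B with the solution of S*X = B. nb is the blocking size passed on
// to the routines. Any return values of the matops calls are discarded here.
func cholSolve(S, B *matrix.FloatMatrix, nb int) {
    matops.DecomposeCHOL(S, nb)          // Cholesky factorization (DPOTRF)
    matops.SolveCHOL(B, S, matops.LOWER) // triangular solves (DPOTRS); LOWER is an assumed flag name
}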

Support functions

TriL(A)                   Make A triangular, lower 
TriLU(A)                  Make A triangular, lower, unit-diagonal 
TriU(A)                   Make A triangular, upper 
TriUU(A)                  Make A triangular, upper, unit-diagonal 

Parameter functions

BlockingParams(m, n, k)   Blocking size parameters for low-level functions
NumWorkers(nwrk)          Number of threads for use in operations
DecomposeBlockSize(nb)    Block size for blocked decomposition algorithms
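
These appear to be global tuning knobs (a guess from their descriptions). A short sketch of calling them before doing any heavy computation; the specific values below are only examples:

package main

import "github.com/hrautila/matops"

func main() {
    matops.NumWorkers(4)               // use up to 4 worker threads
    matops.DecomposeBlockSize(32)      // block size for blocked factorizations
    matops.BlockingParams(64, 64, 160) // blocking sizes for low-level functions (example values)
}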

This is still WORK IN PROGRESS. Consider it beta-level code, at best.

Overall performance is comparable to the ATLAS BLAS library. Some performance testing programs are in the test subdirectory. Running the package and performance tests requires the github.com/hrautila/linalg packages, as results are compared against an existing BLAS/LAPACK implementation.

See the Wiki pages for some additional information.

License

GPL-3.0; see the COPYING and COPYING.GPL files.
