矩阵论课程学习的一些代码实践

  • 前言
  • LU & LDV分解
  • 初等矩阵求逆
  • 利用LU分解解线性方程组
  • 上代码
  • examples
  • 未完待续


前言

  万恶的矩阵论终于考试结束,本人学艺不精,但是喜欢在学习新东西的过程中用python来实践理论,所以肯定也得和矩阵论“贴贴”。也算一时兴起,有些代码也参考了其他大神(抱大腿)。因为不是数学专业,也是根据自己的粗浅理解,总体还不是很完善,仅供参考,必定有错,毋庸置疑。也算是对自己一年的总结吧~ 谢谢各位



正文开始~

CSDN (゜-゜)つロ 干杯



LU & LDV分解

  若方阵A可以分解为A=LU,其中,L和U分别是上下三角矩阵,我们称这种分解为LU分解,L可以表示lower,U可以表示upper。当A可以分解为A=LDV形式,且L,V分别是对角线元素为1的下三角和上三角阵,D为对角矩阵时,称此种分解为LDV分解。这里我们不过多探究LU或者LDV分解的存在性和唯一性判定,具体可以参考相关的Linear Algebra书籍。


  给个大概的既视感:
lua矩阵定义 矩阵论中l是什么意思_矩阵

lua矩阵定义 矩阵论中l是什么意思_矩阵_02



  具体操作方法呢,我们可以先对方阵A进行初等行变换化为上三角阵,假设每一次的行变换操作等价于左乘一个变换矩阵P,我们有
lua矩阵定义 矩阵论中l是什么意思_python_03
  记lua矩阵定义 矩阵论中l是什么意思_矩阵论_04,其中lua矩阵定义 矩阵论中l是什么意思_lua矩阵定义_05是一个可逆的下三角矩阵。而我们在以前的学习中也知道行变换矩阵的逆有对应的形式,所以得到lua矩阵定义 矩阵论中l是什么意思_线性代数_06,即LU分解。

  而我们可以由LU分解继续得到LDV分解(注:有LU分解不一定就有LDV分解),此时U是一个上三角矩阵,我们把对角线的元素抽出组成对角矩阵D,然后将U去除以各行对应的对角元素即可得到V,下面是个简单例子:

lua矩阵定义 矩阵论中l是什么意思_python_07



初等矩阵求逆

  对于初等矩阵,即由初等变换(还不知道可以度娘)得到的矩阵,它们都是可逆矩阵(det不为0),且有着固定的逆矩阵形式,如:

  1. lua矩阵定义 矩阵论中l是什么意思_矩阵论_08表示第i行乘上一个常数k,其逆矩阵为 lua矩阵定义 矩阵论中l是什么意思_矩阵_09
  2. lua矩阵定义 矩阵论中l是什么意思_矩阵_10表示第i行和第j行互换,其逆矩阵为其本身,即lua矩阵定义 矩阵论中l是什么意思_矩阵_10,很好理解,再换一次就复原了;
  3. lua矩阵定义 矩阵论中l是什么意思_矩阵论_12表示第i行的k倍加到第j行,其逆矩阵为lua矩阵定义 矩阵论中l是什么意思_python_13.

利用LU分解解线性方程组

  对于线性方程组(暂不考虑相不相容的情况)lua矩阵定义 矩阵论中l是什么意思_矩阵论_14,我们对A进行LU分解,得到lua矩阵定义 矩阵论中l是什么意思_python_15,所以有lua矩阵定义 矩阵论中l是什么意思_lua矩阵定义_16,我们记lua矩阵定义 矩阵论中l是什么意思_线性代数_17,即把UX看成一个整体。这里是要利用L的下三角特性,可以比较方便地解出Y的值,然后在lua矩阵定义 矩阵论中l是什么意思_矩阵论_18中又可以利用U的上三角特性,方便地解出X。
lua矩阵定义 矩阵论中l是什么意思_矩阵_19



上代码

  具体在编程的时候还要注意大数吃小数的问题,所以可以在行变换时对各行进行排序,从小到大,让小的数去乘上对应的系数约去大数。

lua矩阵定义 矩阵论中l是什么意思_矩阵论_20

@classmethod
def LDV_2D(self, A, mode='LDV', eps=1e-6, Test=False, solve_equation=False):
	"""custom function for 2D-matrix LDV-factorization by Junno
	
	Args:
	    A ([np.darray]): [2D-matrix to be processed with LDV-factorization. ]
	    mode (str, optional): [choose mode]. Defaults to LDV.
	    eps ([float]): numerical threshold.
	    Test (bool, optional): [show checking information]. Defaults to False.
	    solve_equation(bool,optional): [True when use this function to solve equation like Ax=b]. Defaults to False.
	
	Raises:
	    ValueError: [matrix don't meet the requirement for LDV or LU factorization]
	
	Returns:
	    [np.darray]: [resort-matrix, L, D, V] or [resort-matrix, L, U]
	
	Last edited: 21-12-31

    Author: Junno
	"""
	
	assert len(A.shape) == 2 and A.shape[0] == A.shape[1], "please be sure A is 2D and square"
    # primary check for LDV:  nonsingular matrices has LDV factorization uniquely.
    if mode == 'LDV':
    	# full-rank check, we will introduce later!
        if not self.Check_full_rank(A):
            raise ValueError(
                "This matrix can't be operated with LDV factorization, maybe you can try LU factorization instead ^_^")
    M, N = A.shape
    # cm is to be operated
    cm = np.copy(A)
    # A with row exchange
    resort_m = np.copy(A)
    # record primary row transform operators
    operator_list = []
    L = np.eye(M)
    if Test:
    	print('original matrix:\n', A)

    for i in range(M):
        for j in range(i, N):
            if Test:
                print('During process in row:{}, col:{}'.format(i, j))
            # there are non-zero value in col i
            if sum(abs(cm[i:, j])) != 0:
                zero_index = []
                non_zero_index = []
                for k in range(i, M):
                    if abs(cm[k, j])>eps:
                        non_zero_index.append(k)
                    else:
                        zero_index.append(k)

                non_zero_index = sorted(
                    non_zero_index, key=lambda x: abs(cm[x, j]))
				# from small to large ans non-zero first
                sort_idnex = non_zero_index+zero_index

                if Test:
                    print('Sorting index for {} cols:\n'.format(i), sort_idnex)
				
				# do row-exchange and record
				# t means row-exchange transform and pre means primary row transform coefficients 
                if sort_idnex[0] != i:
                    operator_list.append(['t', [sort_idnex[0], i]])
                    cm[[sort_idnex[0], i], :] = cm[[i, sort_idnex[0]], :]
                    resort_m[[sort_idnex[0], i],
                             :] = resort_m[[i, sort_idnex[0]], :]
                if Test:
                    print('After resorting\n', cm)
				
				# operate elementary row transformation on matrix based on elimination coefficient
                if np.sum(abs(cm[i+1:, j]))>eps:
                    prefix = -cm[i+1:, j]/cm[i, j]
                    operator_list.append(['pre', i, prefix.tolist()])
                    temp = (np.array(prefix).reshape(-1, 1)
                            )@cm[i, :].reshape((1, -1))
                    cm[i+1:, :] += temp

                if Test:
                    print('After primary row transformation:')
                    print(cm)

                break

            else:
                continue

    # Prints a new matrix arranged according to the main elements (just swap rows without changing the matrix information)
    if Test:
        print('resort matrix:\n', resort_m)
        print('After primary row transformation: \n', cm)
        print('Operation list:\n', operator_list)

    # Operate on the identity matrix according to the saved sorting and elimination coefficients to get the matrix L
    for i in range(len(operator_list)-1, -1, -1):
        op = operator_list[i]
        if op[0] == 't':
            m, n = op[1]
            L[[m, n], :] = L[[n, m], :]
        elif op[0] == 'pre':
            ind = op[1]
            temp = -(np.array(op[2]).reshape(-1, 1)
                     )@L[ind, :].reshape((1, -1))
            L[ind+1:, :] += temp

    # Operate the same row transpose on the L to let it be lower triangle
    for i in range(len(operator_list)):
        op = operator_list[i]
        if op[0] == 't':
            m, n = op[1]
            L[[m, n], :] = L[[n, m], :]

    # LU mode
    if mode == 'LU':
        if solve_equation:
            return operator_list, resort_m, L, cm
        else:
            return resort_m, L, cm
    # LDV mode
    elif mode == 'LDV':
        # extract the principal diagonal elements of the transformed matrix
        diag = np.diag(cm)

        # Error
        if 0 in diag:
            raise ValueError(
                'this matrix can\'t be operated with LDV factorization')
        return resort_m, L, np.diag(diag), np.triu(cm/diag.reshape((-1, 1)))

  我这捉鸡的代码功底大家见笑,大致思路已经注释在里面,核心不过初等行变换以及记录变换的一些信息,随后合成L和U矩阵,继而可以得到LDV分解的结果。



examples

  先上一个教材的例子(《矩阵论》杨明,刘先忠
lua矩阵定义 矩阵论中l是什么意思_矩阵_21

>>> import numpy as np
>>> from Matrix_Solutions import Matrix_Solutions
>>> import math
>>> from copy import deepcopy
>>> np.set_printoptions(precision=3, suppress=True, linewidth=100)

>>> a=np.array([[2,2,3],[4,7,7],[-2,4,5]]).reshape((3,3)).astype(np.float32)
>>> a
array([[ 2.,  2.,  3.],
       [ 4.,  7.,  7.],
       [-2.,  4.,  5.]], dtype=float32)
# test LU factorization
>>> rm,L,U=Matrix_Solutions.LDV_2D(a, mode="LU",test=False)
>>> rm
array([[ 2.,  2.,  3.],
       [ 4.,  7.,  7.],
       [-2.,  4.,  5.]], dtype=float32)
>>> L
array([[ 1.,  0.,  0.],
       [ 2.,  1.,  0.],
       [-1.,  2.,  1.]])
>>> U
array([[2., 2., 3.],
       [0., 3., 1.],
       [0., 0., 6.]], dtype=float32)
>>> L@U
array([[ 2.,  2.,  3.],
       [ 4.,  7.,  7.],
       [-2.,  4.,  5.]])

# test LDV factorization
>>> re,L,D,V=Matrix_Solutions.LDV_2D(a, mode="LDV",Test=False) 
>>> L
array([[ 1.,  0.,  0.],
       [ 2.,  1.,  0.],
       [-1.,  2.,  1.]])
>>> D
array([[2., 0., 0.],
       [0., 3., 0.],
       [0., 0., 6.]], dtype=float32)
>>> V
array([[1.   , 1.   , 1.5  ],
       [0.   , 1.   , 0.333],
       [0.   , 0.   , 1.   ]], dtype=float32)
>>> L@D@V
array([[ 2.,  2.,  3.],
       [ 4.,  7.,  7.],
       [-2.,  4.,  5.]])

# test a random matrix
>>> a=np.random.randn(4,4)
>>> a
array([[-1.407, -0.787,  1.194,  0.275],
       [-0.539,  0.482, -0.562,  3.241],
       [-0.822,  0.482, -1.329,  1.048],
       [-0.386, -0.432,  0.55 ,  1.209]])
>>> rm,L,D,V=Matrix_Solutions.LDV_2D(a, mode="LDV",Test=False) 
# we can see that rm is the result of resorting a, and LDV is linked to rm
>>> rm
array([[-0.386, -0.432,  0.55 ,  1.209],
       [-1.407, -0.787,  1.194,  0.275],
       [-0.539,  0.482, -0.562,  3.241],
       [-0.822,  0.482, -1.329,  1.048]])
>>> L@D@V
array([[-0.386, -0.432,  0.55 ,  1.209],
       [-1.407, -0.787,  1.194,  0.275],
       [-0.539,  0.482, -0.562,  3.241],
       [-0.822,  0.482, -1.329,  1.048]])