矩阵论课程学习的一些代码实践
- 前言
- LU & LDV分解
- 初等矩阵求逆
- 利用LU分解解线性方程组
- 上代码
- examples
- 未完待续
前言
万恶的矩阵论终于考试结束,本人学艺不精,但是喜欢在学习新东西的过程中用python来实践理论,所以肯定也得和矩阵论“贴贴”。也算一时兴起,有些代码也参考了其他大神(抱大腿)。因为不是数学专业,也是根据自己的粗浅理解,总体还不是很完善,仅供参考,必定有错,毋庸置疑。也算是对自己一年的总结吧~ 谢谢各位
正文开始~
CSDN (゜-゜)つロ 干杯
LU & LDV分解
若方阵A可以分解为A=LU,其中,L和U分别是上下三角矩阵,我们称这种分解为LU分解,L可以表示lower,U可以表示upper。当A可以分解为A=LDV形式,且L,V分别是对角线元素为1的下三角和上三角阵,D为对角矩阵时,称此种分解为LDV分解。这里我们不过多探究LU或者LDV分解的存在性和唯一性判定,具体可以参考相关的Linear Algebra书籍。
给个大概的既视感:
具体操作方法呢,我们可以先对方阵A进行初等行变换化为上三角阵,假设每一次的行变换操作等价于左乘一个变换矩阵P,我们有
记,其中是一个可逆的下三角矩阵。而我们在以前的学习中也知道行变换矩阵的逆有对应的形式,所以得到,即LU分解。
而我们可以由LU分解继续得到LDV分解(注:有LU分解不一定就有LDV分解),此时U是一个上三角矩阵,我们把对角线的元素抽出组成对角矩阵D,然后将U去除以各行对应的对角元素即可得到V,下面是个简单例子:
初等矩阵求逆
对于初等矩阵,即由初等变换(还不知道可以度娘)得到的矩阵,它们都是可逆矩阵(det不为0),且有着固定的逆矩阵形式,如:
- 表示第i行乘上一个常数k,其逆矩阵为 ;
- 表示第i行和第j行互换,其逆矩阵为其本身,即,很好理解,再换一次就复原了;
- 表示第i行的k倍加到第j行,其逆矩阵为.
利用LU分解解线性方程组
对于线性方程组(暂不考虑相不相容的情况),我们对A进行LU分解,得到,所以有,我们记,即把UX看成一个整体。这里是要利用L的下三角特性,可以比较方便地解出Y的值,然后在中又可以利用U的上三角特性,方便地解出X。
上代码
具体在编程的时候还要注意大数吃小数的问题,所以可以在行变换时对各行进行排序,从小到大,让小的数去乘上对应的系数约去大数。
@classmethod
def LDV_2D(self, A, mode='LDV', eps=1e-6, Test=False, solve_equation=False):
"""custom function for 2D-matrix LDV-factorization by Junno
Args:
A ([np.darray]): [2D-matrix to be processed with LDV-factorization. ]
mode (str, optional): [choose mode]. Defaults to LDV.
eps ([float]): numerical threshold.
Test (bool, optional): [show checking information]. Defaults to False.
solve_equation(bool,optional): [True when use this function to solve equation like Ax=b]. Defaults to False.
Raises:
ValueError: [matrix don't meet the requirement for LDV or LU factorization]
Returns:
[np.darray]: [resort-matrix, L, D, V] or [resort-matrix, L, U]
Last edited: 21-12-31
Author: Junno
"""
assert len(A.shape) == 2 and A.shape[0] == A.shape[1], "please be sure A is 2D and square"
# primary check for LDV: nonsingular matrices has LDV factorization uniquely.
if mode == 'LDV':
# full-rank check, we will introduce later!
if not self.Check_full_rank(A):
raise ValueError(
"This matrix can't be operated with LDV factorization, maybe you can try LU factorization instead ^_^")
M, N = A.shape
# cm is to be operated
cm = np.copy(A)
# A with row exchange
resort_m = np.copy(A)
# record primary row transform operators
operator_list = []
L = np.eye(M)
if Test:
print('original matrix:\n', A)
for i in range(M):
for j in range(i, N):
if Test:
print('During process in row:{}, col:{}'.format(i, j))
# there are non-zero value in col i
if sum(abs(cm[i:, j])) != 0:
zero_index = []
non_zero_index = []
for k in range(i, M):
if abs(cm[k, j])>eps:
non_zero_index.append(k)
else:
zero_index.append(k)
non_zero_index = sorted(
non_zero_index, key=lambda x: abs(cm[x, j]))
# from small to large ans non-zero first
sort_idnex = non_zero_index+zero_index
if Test:
print('Sorting index for {} cols:\n'.format(i), sort_idnex)
# do row-exchange and record
# t means row-exchange transform and pre means primary row transform coefficients
if sort_idnex[0] != i:
operator_list.append(['t', [sort_idnex[0], i]])
cm[[sort_idnex[0], i], :] = cm[[i, sort_idnex[0]], :]
resort_m[[sort_idnex[0], i],
:] = resort_m[[i, sort_idnex[0]], :]
if Test:
print('After resorting\n', cm)
# operate elementary row transformation on matrix based on elimination coefficient
if np.sum(abs(cm[i+1:, j]))>eps:
prefix = -cm[i+1:, j]/cm[i, j]
operator_list.append(['pre', i, prefix.tolist()])
temp = (np.array(prefix).reshape(-1, 1)
)@cm[i, :].reshape((1, -1))
cm[i+1:, :] += temp
if Test:
print('After primary row transformation:')
print(cm)
break
else:
continue
# Prints a new matrix arranged according to the main elements (just swap rows without changing the matrix information)
if Test:
print('resort matrix:\n', resort_m)
print('After primary row transformation: \n', cm)
print('Operation list:\n', operator_list)
# Operate on the identity matrix according to the saved sorting and elimination coefficients to get the matrix L
for i in range(len(operator_list)-1, -1, -1):
op = operator_list[i]
if op[0] == 't':
m, n = op[1]
L[[m, n], :] = L[[n, m], :]
elif op[0] == 'pre':
ind = op[1]
temp = -(np.array(op[2]).reshape(-1, 1)
)@L[ind, :].reshape((1, -1))
L[ind+1:, :] += temp
# Operate the same row transpose on the L to let it be lower triangle
for i in range(len(operator_list)):
op = operator_list[i]
if op[0] == 't':
m, n = op[1]
L[[m, n], :] = L[[n, m], :]
# LU mode
if mode == 'LU':
if solve_equation:
return operator_list, resort_m, L, cm
else:
return resort_m, L, cm
# LDV mode
elif mode == 'LDV':
# extract the principal diagonal elements of the transformed matrix
diag = np.diag(cm)
# Error
if 0 in diag:
raise ValueError(
'this matrix can\'t be operated with LDV factorization')
return resort_m, L, np.diag(diag), np.triu(cm/diag.reshape((-1, 1)))
我这捉鸡的代码功底大家见笑,大致思路已经注释在里面,核心不过初等行变换以及记录变换的一些信息,随后合成L和U矩阵,继而可以得到LDV分解的结果。
examples
先上一个教材的例子(《矩阵论》杨明,刘先忠)
>>> import numpy as np
>>> from Matrix_Solutions import Matrix_Solutions
>>> import math
>>> from copy import deepcopy
>>> np.set_printoptions(precision=3, suppress=True, linewidth=100)
>>> a=np.array([[2,2,3],[4,7,7],[-2,4,5]]).reshape((3,3)).astype(np.float32)
>>> a
array([[ 2., 2., 3.],
[ 4., 7., 7.],
[-2., 4., 5.]], dtype=float32)
# test LU factorization
>>> rm,L,U=Matrix_Solutions.LDV_2D(a, mode="LU",test=False)
>>> rm
array([[ 2., 2., 3.],
[ 4., 7., 7.],
[-2., 4., 5.]], dtype=float32)
>>> L
array([[ 1., 0., 0.],
[ 2., 1., 0.],
[-1., 2., 1.]])
>>> U
array([[2., 2., 3.],
[0., 3., 1.],
[0., 0., 6.]], dtype=float32)
>>> L@U
array([[ 2., 2., 3.],
[ 4., 7., 7.],
[-2., 4., 5.]])
# test LDV factorization
>>> re,L,D,V=Matrix_Solutions.LDV_2D(a, mode="LDV",Test=False)
>>> L
array([[ 1., 0., 0.],
[ 2., 1., 0.],
[-1., 2., 1.]])
>>> D
array([[2., 0., 0.],
[0., 3., 0.],
[0., 0., 6.]], dtype=float32)
>>> V
array([[1. , 1. , 1.5 ],
[0. , 1. , 0.333],
[0. , 0. , 1. ]], dtype=float32)
>>> L@D@V
array([[ 2., 2., 3.],
[ 4., 7., 7.],
[-2., 4., 5.]])
# test a random matrix
>>> a=np.random.randn(4,4)
>>> a
array([[-1.407, -0.787, 1.194, 0.275],
[-0.539, 0.482, -0.562, 3.241],
[-0.822, 0.482, -1.329, 1.048],
[-0.386, -0.432, 0.55 , 1.209]])
>>> rm,L,D,V=Matrix_Solutions.LDV_2D(a, mode="LDV",Test=False)
# we can see that rm is the result of resorting a, and LDV is linked to rm
>>> rm
array([[-0.386, -0.432, 0.55 , 1.209],
[-1.407, -0.787, 1.194, 0.275],
[-0.539, 0.482, -0.562, 3.241],
[-0.822, 0.482, -1.329, 1.048]])
>>> L@D@V
array([[-0.386, -0.432, 0.55 , 1.209],
[-1.407, -0.787, 1.194, 0.275],
[-0.539, 0.482, -0.562, 3.241],
[-0.822, 0.482, -1.329, 1.048]])