Per-pixel simple linear regression: fitting many series at once with array operations

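Simple linear regression could hardly be simpler, and there are many ways to implement it: scikit-learn's linear_model.LinearRegression(), scipy.polyfit() or numpy.polyfit(), stats.linregress(), optimize.curve_fit(), numpy.linalg.lstsq, statsmodels OLS(), the closed-form solution by matrix inversion, and the least-squares formula taught in high-school mathematics. For details, see the blog post "8 simple linear regression algorithms in Python" (python環境下的8種簡單線性回歸演算法).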

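But if the data y (y1, y2, y3, ..., yn) contains invalid values (zeros) at random positions and in random numbers, the trend of y against x is only accurate once those invalid values and the matching entries of x (x1, x2, x3, ..., xn) have been removed. With only a few (x, y) groups, a for loop can check and fit them one group at a time. With hundreds of millions of groups, a for loop is far too slow. I ran into this problem while computing trends of multi-year NDVI series, where each year's NDVI has invalid values (zeros) at random positions and in random numbers.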
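For reference, the loop-based approach described above might look like the following minimal sketch. The helper name `fit_columns_loop` and the sample data are made up for illustration, and it assumes x is simply the index 1..n and that a zero in y marks an invalid sample.

```python
import numpy as np

def fit_columns_loop(y):
    """Fit y = k*x + b column by column, skipping zeros (hypothetical helper)."""
    n, m = y.shape
    x = np.arange(1, n + 1, dtype=float)   # time index 1..n
    k = np.full(m, np.nan)
    b = np.full(m, np.nan)
    for j in range(m):
        valid = y[:, j] != 0               # zeros mark invalid samples
        if valid.sum() >= 2:               # a line needs at least two points
            k[j], b[j] = np.polyfit(x[valid], y[valid, j], 1)
    return k, b

# Two series of length 5; the second has an invalid value in year 3.
y = np.array([[3.0, 1.0],
              [5.0, 2.0],
              [7.0, 0.0],
              [9.0, 4.0],
              [11.0, 5.0]])
k, b = fit_columns_loop(y)
```

This is fine for a handful of columns, but the Python-level loop runs once per series, which is the cost that becomes prohibitive for a full image stack.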

After some exploration, and inspired by a blog post by 健忘主義, I found a solution to this problem. The implementation idea is shared below.

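The approach adapts the 健忘主義 post: it uses the sum and square functions that NumPy arrays provide, with axis=0 so that each sum runs down a column. Because invalid positions are set to 0 in both x and y, they contribute nothing to any of these sums, and the number of non-zero entries in a column of x is the number of valid points in that series. Working on whole arrays instead of looping is what makes the computation fast.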
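Written out, the column-wise quantities are the usual least-squares formulas. The notation is an assumption introduced here: $j$ indexes a column (one series), $i$ runs over all $n$ rows, and $n_j$ is the number of non-zero entries of x in column $j$. Since invalid entries are zero in both arrays, summing over all rows gives the same result as summing over the valid rows only.

```latex
\begin{aligned}
\bar{x}_j &= \frac{1}{n_j}\sum_{i=1}^{n} x_{ij}, \qquad
\bar{y}_j = \frac{1}{n_j}\sum_{i=1}^{n} y_{ij}, \\
k_j &= \frac{\sum_{i=1}^{n} x_{ij}\,y_{ij} - n_j\,\bar{x}_j\,\bar{y}_j}
            {\sum_{i=1}^{n} x_{ij}^{2} - n_j\,\bar{x}_j^{2}}, \\
b_j &= \bar{y}_j - k_j\,\bar{x}_j .
\end{aligned}
```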

The input arrays x and y must be prepared in advance, as described in the function-call section below.

import numpy as np

def lrg_ols(x, y):
    '''
    Calculate slope and intercept by OLS, one fit per column.

    Refer to the blog post by 健忘主義.

    Parameters
    ----------
    x : n*m ndarray
        Each column is an independent-variable series x1, x2, ..., xn,
        where x1, x2, ..., xn is 1, 2, ..., n in sequence. x can be created by
            x = np.arange(1, n + 1).reshape(n, 1)
            x = np.broadcast_to(x, (n, m)).copy()  # broadcast_to alone is read-only
        Entries paired with an invalid y must be set to 0.
    y : n*m ndarray
        Each column is a dependent-variable series y1, y2, ..., yn,
        with invalid values set to 0.
        Note that y must have the same shape as x.

    Returns
    -------
    k : 1*m ndarray
        Slope of each column.
    b : 1*m ndarray
        Intercept of each column.
    r2 : 1*m ndarray
        Coefficient of determination of each column.
    x_size : ndarray of length m
        Number of valid (non-zero) points in each column.
    '''
    valid = x != 0
    x_size = np.count_nonzero(x, axis=0)  # x_size should equal y_size
    x_mean = np.sum(x, axis=0).reshape(1, -1) / x_size
    y_mean = np.sum(y, axis=0).reshape(1, -1) / x_size
    zi = np.sum(x * y, axis=0).reshape(1, -1) - x_size * x_mean * y_mean
    mu = np.sum(x * x, axis=0).reshape(1, -1) - x_size * x_mean * x_mean
    k = zi / mu
    b = y_mean - k * x_mean
    # coefficient of determination
    y_pred = x * k + b
    y_pred[~valid] = 0
    # regression sum of squares; invalid positions are masked so that they
    # do not each add (0 - y_mean)**2
    ssr = np.sum(np.where(valid, np.square(y_pred - y_mean), 0), axis=0).reshape(1, -1)
    # residual sum of squares; invalid positions give 0 - 0
    sse = np.sum(np.square(y - y_pred), axis=0).reshape(1, -1)
    sst = ssr + sse  # total sum of squares
    r2 = ssr / sst
    return k, b, r2, x_size
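One way to gain confidence in the zero-masking trick is to compare it with a per-column fit that drops the invalid points explicitly. The sketch below restates only the slope and intercept part under a made-up name, `slope_intercept_zero_masked`, and checks it against numpy.polyfit on synthetic data. The data, the mask pattern, and the helper name are all assumptions for illustration.

```python
import numpy as np

def slope_intercept_zero_masked(x, y):
    """Column-wise OLS where zeros in x and y mark invalid points (hypothetical name)."""
    n_valid = np.count_nonzero(x, axis=0)
    x_mean = x.sum(axis=0) / n_valid
    y_mean = y.sum(axis=0) / n_valid
    k = ((x * y).sum(axis=0) - n_valid * x_mean * y_mean) / \
        ((x * x).sum(axis=0) - n_valid * x_mean ** 2)
    return k, y_mean - k * x_mean

n, m = 12, 6
rng = np.random.default_rng(0)
x = np.repeat(np.arange(1.0, n + 1).reshape(n, 1), m, axis=1)
y = 0.2 + 0.03 * x + rng.normal(0.0, 0.01, size=(n, m))   # synthetic trend plus noise

# Deterministic pattern that invalidates one or two points per column.
invalid = (np.arange(n)[:, None] * 3 + np.arange(m)[None, :] * 5) % 7 == 0
y[invalid] = 0
x[invalid] = 0

k, b = slope_intercept_zero_masked(x, y)

# Reference: drop the invalid points explicitly and fit each column on its own.
k_ref = np.empty(m)
b_ref = np.empty(m)
for j in range(m):
    keep = x[:, j] != 0
    k_ref[j], b_ref[j] = np.polyfit(x[keep, j], y[keep, j], 1)

agree = np.allclose(k, k_ref) and np.allclose(b, b_ref)
```

If `agree` is true, the array version and the explicit per-column fits give the same slopes and intercepts on this data.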

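Input arrays x and y. Arrange the dependent variable y as a 2-D array in which each column is one dependent-variable time series. Arrange the independent variable x as an array with the same shape as y, in which each column is one independent-variable series. Then filter the data: set the invalid values of the dependent variable to 0, and set the corresponding positions of the independent variable to 0 as well.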

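# data is assumed to be a 3-D array with shape (years, rows, cols)
shp = data.shape
# after the reshape, each column is one time series
data = data.reshape(shp[0], -1)
# build the independent-variable series 1, 2, ..., n
x = np.arange(1, shp[0] + 1).reshape(shp[0], 1)
x = np.repeat(x, shp[1] * shp[2], axis=1)
# data filtering: zero and negative values are invalid, so set them to 0
# in data and set the matching entries of x to 0
cond = data <= 0
data[cond] = 0
x[cond] = 0
k, b, r2, xsize = lrg_ols(x, data)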
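The results come back with one value per column, so they still have to be mapped back onto the image grid. The sketch below shows that step on a tiny synthetic stack, using the per-pixel count of valid years as the example result. The (years, rows, cols) layout, the variable names, and the treatment of zeros and negatives as invalid are assumptions here. Pixels with fewer than two valid years have no defined slope, and the formulas divide by zero there, so it may be worth flagging them.

```python
import numpy as np

# Synthetic stack: 4 years of a 2 x 3 pixel image (assumed layout: years, rows, cols).
data = np.arange(1.0, 25.0).reshape(4, 2, 3)
data[1, 0, 0] = -1.0        # one invalid year at pixel (0, 0)
data[:, 1, 2] = -1.0        # pixel (1, 2) is invalid in every year

shp = data.shape
flat = data.reshape(shp[0], -1).copy()   # one column per pixel
x = np.repeat(np.arange(1, shp[0] + 1).reshape(-1, 1), flat.shape[1], axis=1)

invalid = flat <= 0
flat[invalid] = 0
x[invalid] = 0

# Any 1*m or length-m result maps back to the grid with the same reshape.
valid_years = np.count_nonzero(x, axis=0).reshape(shp[1], shp[2])
too_few = valid_years < 2    # the slope is undefined with fewer than two points
```

The same `reshape(shp[1], shp[2])` applies to the slope, intercept, and r2 arrays, and `too_few` can then be used to blank out the pixels that should not be trusted.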
