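AES (Advanced Encryption Standard) is a data encryption standard established by the United States. It is a symmetric block cipher designed by the Belgian cryptographers Joan Daemen and Vincent Rijmen, and is therefore also known as the Rijndael algorithm.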

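Depending on the key length, the standard defines three variants: AES-128, AES-192 and AES-256. The longer the key, the more rounds are applied to each data block: AES-128/192/256 use 10/12/14 rounds respectively. AES-256, for example, has a 256-bit key and processes each block in 14 rounds (13 full rounds plus one final round without MixColumns, preceded by an initial AddRoundKey). The block length is always 128 bits, whatever the key length.
Because the initial AddRoundKey and each of the Nr rounds need their own round key, the input key (also called the seed key) must first be expanded into Nr+1 round keys in total (11/13/15 for Nr = 10/12/14). Each round key is 128 bits (4 words), so the expanded key holds 4*(Nr+1) words.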

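AES encryption steps:
Key expansion (every AddRoundKey needs a round key) -> initial round (AddRoundKey with round key 0, taken from the input key) -> Nr-1 repeated rounds (SubBytes/ShiftRows/MixColumns/AddRoundKey with the expanded keys) -> final round (SubBytes/ShiftRows/AddRoundKey with the last expanded key)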

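AES decryption steps:
Key expansion (the same round keys, used in reverse order) -> initial round (AddRoundKey with the last round key, number Nr) -> Nr-1 repeated rounds (InvShiftRows/InvSubBytes/AddRoundKey/InvMixColumns with the expanded keys) -> final round (InvShiftRows/InvSubBytes/AddRoundKey with round key 0)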

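Encryption and decryption are built from the following basic operations:
AddRoundKey: XOR the round key into the state
SubBytes: byte substitution
InvSubBytes: inverse byte substitution
ShiftRows: row shifting
InvShiftRows: inverse row shifting
MixColumns: column mixing
InvMixColumns: inverse column mixing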

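AES encryption and decryption are inverse processes of each other, so applying one after the other, in either order, restores the original block.
To encrypt a file with AES, the file is split into data blocks of 128 bits each, and every block is then encrypted as described above.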

References:
Advanced Encryption Standard (AES) (FIPS PUB 197) (November 26, 2001)
Advanced Encryption Standard by Example  (by Adam Berent)

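Below is a complete, commented AES-256 encryption/decryption program. It also contains test data for AES-128 and AES-192, so you can select a different variant for testing if you are interested.
To keep the demonstration simple, the program encrypts and decrypts a single block only. It prints the expanded key and the result of every step in each round, so the output can be compared with the examples in the AES standard document.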

To compile and run on Linux:
gcc -o aes256 aes256.c
./aes256

/*---------------------------------------------------------------------
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License version 2 as
published by the Free Software Foundation.

A test for AES encryption (RIJNDAEL symmetric key encryption algorithm).

Reference:
	1. Advanced Encryption Standard (AES) (FIPS PUB 197)
	2. Advanced Encryption Standard by Example  (by Adam Berent)


Note:
1. Standard and parameters.
	   Key Size	 Block Size    Number of Rounds
	  (Nk words)     (Nb words)        (Nr)
AES-128       4               4             10
AES-192       6               4             12
AES-256       8               4             14


Midas Zhou
midaszhou@yahoo.com
https://github.com/widora/wegi
----------------------------------------------------------------------*/
#include <stdio.h>
#include <stdint.h>
#include <string.h>


/* S-box: forward substitution table used by SubBytes */
static const uint8_t sbox[256] = {
/* 0     1    2      3     4    5     6     7      8     9     A     B     C     D     E     F  */
  0x63, 0x7C, 0x77, 0x7B, 0xF2, 0x6B, 0x6F, 0xC5, 0x30, 0x01, 0x67, 0x2B, 0xFE, 0xD7, 0xAB, 0x76,
  0xCA, 0x82, 0xC9, 0x7D, 0xFA, 0x59, 0x47, 0xF0, 0xAD, 0xD4, 0xA2, 0xAF, 0x9C, 0xA4, 0x72, 0xC0,
  0xB7, 0xFD, 0x93, 0x26, 0x36, 0x3F, 0xF7, 0xCC, 0x34, 0xA5, 0xE5, 0xF1, 0x71, 0xD8, 0x31, 0x15,
  0x04, 0xC7, 0x23, 0xC3, 0x18, 0x96, 0x05, 0x9A, 0x07, 0x12, 0x80, 0xE2, 0xEB, 0x27, 0xB2, 0x75,
  0x09, 0x83, 0x2C, 0x1A, 0x1B, 0x6E, 0x5A, 0xA0, 0x52, 0x3B, 0xD6, 0xB3, 0x29, 0xE3, 0x2F, 0x84,
  0x53, 0xD1, 0x00, 0xED, 0x20, 0xFC, 0xB1, 0x5B, 0x6A, 0xCB, 0xBE, 0x39, 0x4A, 0x4C, 0x58, 0xCF,
  0xD0, 0xEF, 0xAA, 0xFB, 0x43, 0x4D, 0x33, 0x85, 0x45, 0xF9, 0x02, 0x7F, 0x50, 0x3C, 0x9F, 0xA8,
  0x51, 0xA3, 0x40, 0x8F, 0x92, 0x9D, 0x38, 0xF5, 0xBC, 0xB6, 0xDA, 0x21, 0x10, 0xFF, 0xF3, 0xD2,
  0xCD, 0x0C, 0x13, 0xEC, 0x5F, 0x97, 0x44, 0x17, 0xC4, 0xA7, 0x7E, 0x3D, 0x64, 0x5D, 0x19, 0x73,
  0x60, 0x81, 0x4F, 0xDC, 0x22, 0x2A, 0x90, 0x88, 0x46, 0xEE, 0xB8, 0x14, 0xDE, 0x5E, 0x0B, 0xDB,
  0xE0, 0x32, 0x3A, 0x0A, 0x49, 0x06, 0x24, 0x5C, 0xC2, 0xD3, 0xAC, 0x62, 0x91, 0x95, 0xE4, 0x79,
  0xE7, 0xC8, 0x37, 0x6D, 0x8D, 0xD5, 0x4E, 0xA9, 0x6C, 0x56, 0xF4, 0xEA, 0x65, 0x7A, 0xAE, 0x08,
  0xBA, 0x78, 0x25, 0x2E, 0x1C, 0xA6, 0xB4, 0xC6, 0xE8, 0xDD, 0x74, 0x1F, 0x4B, 0xBD, 0x8B, 0x8A,
  0x70, 0x3E, 0xB5, 0x66, 0x48, 0x03, 0xF6, 0x0E, 0x61, 0x35, 0x57, 0xB9, 0x86, 0xC1, 0x1D, 0x9E,
  0xE1, 0xF8, 0x98, 0x11, 0x69, 0xD9, 0x8E, 0x94, 0x9B, 0x1E, 0x87, 0xE9, 0xCE, 0x55, 0x28, 0xDF,
  0x8C, 0xA1, 0x89, 0x0D, 0xBF, 0xE6, 0x42, 0x68, 0x41, 0x99, 0x2D, 0x0F, 0xB0, 0x54, 0xBB, 0x16
};

/* Inverse S-box: substitution table used by InvSubBytes */
static const uint8_t rsbox[256] = {
/* 0     1    2      3     4    5     6     7      8     9     A     B     C     D     E     F  */
  0x52, 0x09, 0x6A, 0xD5, 0x30, 0x36, 0xA5, 0x38, 0xBF, 0x40, 0xA3, 0x9E, 0x81, 0xF3, 0xD7, 0xFB,
  0x7C, 0xE3, 0x39, 0x82, 0x9B, 0x2F, 0xFF, 0x87, 0x34, 0x8E, 0x43, 0x44, 0xC4, 0xDE, 0xE9, 0xCB,
  0x54, 0x7B, 0x94, 0x32, 0xA6, 0xC2, 0x23, 0x3D, 0xEE, 0x4C, 0x95, 0x0B, 0x42, 0xFA, 0xC3, 0x4E,
  0x08, 0x2E, 0xA1, 0x66, 0x28, 0xD9, 0x24, 0xB2, 0x76, 0x5B, 0xA2, 0x49, 0x6D, 0x8B, 0xD1, 0x25,
  0x72, 0xF8, 0xF6, 0x64, 0x86, 0x68, 0x98, 0x16, 0xD4, 0xA4, 0x5C, 0xCC, 0x5D, 0x65, 0xB6, 0x92,
  0x6C, 0x70, 0x48, 0x50, 0xFD, 0xED, 0xB9, 0xDA, 0x5E, 0x15, 0x46, 0x57, 0xA7, 0x8D, 0x9D, 0x84,
  0x90, 0xD8, 0xAB, 0x00, 0x8C, 0xBC, 0xD3, 0x0A, 0xF7, 0xE4, 0x58, 0x05, 0xB8, 0xB3, 0x45, 0x06,
  0xD0, 0x2C, 0x1E, 0x8F, 0xCA, 0x3F, 0x0F, 0x02, 0xC1, 0xAF, 0xBD, 0x03, 0x01, 0x13, 0x8A, 0x6B,
  0x3A, 0x91, 0x11, 0x41, 0x4F, 0x67, 0xDC, 0xEA, 0x97, 0xF2, 0xCF, 0xCE, 0xF0, 0xB4, 0xE6, 0x73,
  0x96, 0xAC, 0x74, 0x22, 0xE7, 0xAD, 0x35, 0x85, 0xE2, 0xF9, 0x37, 0xE8, 0x1C, 0x75, 0xDF, 0x6E,
  0x47, 0xF1, 0x1A, 0x71, 0x1D, 0x29, 0xC5, 0x89, 0x6F, 0xB7, 0x62, 0x0E, 0xAA, 0x18, 0xBE, 0x1B,
  0xFC, 0x56, 0x3E, 0x4B, 0xC6, 0xD2, 0x79, 0x20, 0x9A, 0xDB, 0xC0, 0xFE, 0x78, 0xCD, 0x5A, 0xF4,
  0x1F, 0xDD, 0xA8, 0x33, 0x88, 0x07, 0xC7, 0x31, 0xB1, 0x12, 0x10, 0x59, 0x27, 0x80, 0xEC, 0x5F,
  0x60, 0x51, 0x7F, 0xA9, 0x19, 0xB5, 0x4A, 0x0D, 0x2D, 0xE5, 0x7A, 0x9F, 0x93, 0xC9, 0x9C, 0xEF,
  0xA0, 0xE0, 0x3B, 0x4D, 0xAE, 0x2A, 0xF5, 0xB0, 0xC8, 0xEB, 0xBB, 0x3C, 0x83, 0x53, 0x99, 0x61,
  0x17, 0x2B, 0x04, 0x7E, 0xBA, 0x77, 0xD6, 0x26, 0xE1, 0x69, 0x14, 0x63, 0x55, 0x21, 0x0C, 0x7D
};

/* Galois Field multiplication E-table (antilog): Etab[i] = 0x03^i in GF(2^8) */
static const uint8_t Etab[256]= {
/* 0     1      2    3     4     5     6     7      8     9     A     B     C     D     E     F  */
  0x01, 0x03, 0x05, 0x0F, 0x11, 0x33, 0x55, 0xFF, 0x1A, 0x2E, 0x72, 0x96, 0xA1, 0xF8, 0x13, 0x35,
  0x5F, 0xE1, 0x38, 0x48, 0xD8, 0x73, 0x95, 0xA4, 0xF7, 0x02, 0x06, 0x0A, 0x1E, 0x22, 0x66, 0xAA,
  0xE5, 0x34, 0x5C, 0xE4, 0x37, 0x59, 0xEB, 0x26, 0x6A, 0xBE, 0xD9, 0x70, 0x90, 0xAB, 0xE6, 0x31,
  0x53, 0xF5, 0x04, 0x0C, 0x14, 0x3C, 0x44, 0xCC, 0x4F, 0xD1, 0x68, 0xB8, 0xD3, 0x6E, 0xB2, 0xCD,
  0x4C, 0xD4, 0x67, 0xA9, 0xE0, 0x3B, 0x4D, 0xD7, 0x62, 0xA6, 0xF1, 0x08, 0x18, 0x28, 0x78, 0x88,
  0x83, 0x9E, 0xB9, 0xD0, 0x6B, 0xBD, 0xDC, 0x7F, 0x81, 0x98, 0xB3, 0xCE, 0x49, 0xDB, 0x76, 0x9A,
  0xB5, 0xC4, 0x57, 0xF9, 0x10, 0x30, 0x50, 0xF0, 0x0B, 0x1D, 0x27, 0x69, 0xBB, 0xD6, 0x61, 0xA3,
  0xFE, 0x19, 0x2B, 0x7D, 0x87, 0x92, 0xAD, 0xEC, 0x2F, 0x71, 0x93, 0xAE, 0xE9, 0x20, 0x60, 0xA0,
  0xFB, 0x16, 0x3A, 0x4E, 0xD2, 0x6D, 0xB7, 0xC2, 0x5D, 0xE7, 0x32, 0x56, 0xFA, 0x15, 0x3F, 0x41,
  0xC3, 0x5E, 0xE2, 0x3D, 0x47, 0xC9, 0x40, 0xC0, 0x5B, 0xED, 0x2C, 0x74, 0x9C, 0xBF, 0xDA, 0x75,
  0x9F, 0xBA, 0xD5, 0x64, 0xAC, 0xEF, 0x2A, 0x7E, 0x82, 0x9D, 0xBC, 0xDF, 0x7A, 0x8E, 0x89, 0x80,
  0x9B, 0xB6, 0xC1, 0x58, 0xE8, 0x23, 0x65, 0xAF, 0xEA, 0x25, 0x6F, 0xB1, 0xC8, 0x43, 0xC5, 0x54,
  0xFC, 0x1F, 0x21, 0x63, 0xA5, 0xF4, 0x07, 0x09, 0x1B, 0x2D, 0x77, 0x99, 0xB0, 0xCB, 0x46, 0xCA,
  0x45, 0xCF, 0x4A, 0xDE, 0x79, 0x8B, 0x86, 0x91, 0xA8, 0xE3, 0x3E, 0x42, 0xC6, 0x51, 0xF3, 0x0E,
  0x12, 0x36, 0x5A, 0xEE, 0x29, 0x7B, 0x8D, 0x8C, 0x8F, 0x8A, 0x85, 0x94, 0xA7, 0xF2, 0x0D, 0x17,
  0x39, 0x4B, 0xDD, 0x7C, 0x84, 0x97, 0xA2, 0xFD, 0x1C, 0x24, 0x6C, 0xB4, 0xC7, 0x52, 0xF6, 0x01
};

/* Galois Field multiplication L-table (log): Etab[Ltab[x]] == x for x != 0; Ltab[0] is a placeholder */
static const uint8_t Ltab[256]= {
/* 0     1      2    3     4     5     6     7      8     9     A     B     C     D     E     F  */
  0x0,  0x0,  0x19, 0x01, 0x32, 0x02, 0x1A, 0xC6, 0x4B, 0xC7, 0x1B, 0x68, 0x33, 0xEE, 0xDF, 0x03, // 0
  0x64, 0x04, 0xE0, 0x0E, 0x34, 0x8D, 0x81, 0xEF, 0x4C, 0x71, 0x08, 0xC8, 0xF8, 0x69, 0x1C, 0xC1, // 1
  0x7D, 0xC2, 0x1D, 0xB5, 0xF9, 0xB9, 0x27, 0x6A, 0x4D, 0xE4, 0xA6, 0x72, 0x9A, 0xC9, 0x09, 0x78, // 2
  0x65, 0x2F, 0x8A, 0x05, 0x21, 0x0F, 0xE1, 0x24, 0x12, 0xF0, 0x82, 0x45, 0x35, 0x93, 0xDA, 0x8E, // 3
  0x96, 0x8F, 0xDB, 0xBD, 0x36, 0xD0, 0xCE, 0x94, 0x13, 0x5C, 0xD2, 0xF1, 0x40, 0x46, 0x83, 0x38, // 4
  0x66, 0xDD, 0xFD, 0x30, 0xBF, 0x06, 0x8B, 0x62, 0xB3, 0x25, 0xE2, 0x98, 0x22, 0x88, 0x91, 0x10, // 5
  0x7E, 0x6E, 0x48, 0xC3, 0xA3, 0xB6, 0x1E, 0x42, 0x3A, 0x6B, 0x28, 0x54, 0xFA, 0x85, 0x3D, 0xBA, // 6
  0x2B, 0x79, 0x0A, 0x15, 0x9B, 0x9F, 0x5E, 0xCA, 0x4E, 0xD4, 0xAC, 0xE5, 0xF3, 0x73, 0xA7, 0x57, // 7
  0xAF, 0x58, 0xA8, 0x50, 0xF4, 0xEA, 0xD6, 0x74, 0x4F, 0xAE, 0xE9, 0xD5, 0xE7, 0xE6, 0xAD, 0xE8, // 8
  0x2C, 0xD7, 0x75, 0x7A, 0xEB, 0x16, 0x0B, 0xF5, 0x59, 0xCB, 0x5F, 0xB0, 0x9C, 0xA9, 0x51, 0xA0, // 9
  0x7F, 0x0C, 0xF6, 0x6F, 0x17, 0xC4, 0x49, 0xEC, 0xD8, 0x43, 0x1F, 0x2D, 0xA4, 0x76, 0x7B, 0xB7, // A
  0xCC, 0xBB, 0x3E, 0x5A, 0xFB, 0x60, 0xB1, 0x86, 0x3B, 0x52, 0xA1, 0x6C, 0xAA, 0x55, 0x29, 0x9D, // B
  0x97, 0xB2, 0x87, 0x90, 0x61, 0xBE, 0xDC, 0xFC, 0xBC, 0x95, 0xCF, 0xCD, 0x37, 0x3F, 0x5B, 0xD1, // C
  0x53, 0x39, 0x84, 0x3C, 0x41, 0xA2, 0x6D, 0x47, 0x14, 0x2A, 0x9E, 0x5D, 0x56, 0xF2, 0xD3, 0xAB, // D
  0x44, 0x11, 0x92, 0xD9, 0x23, 0x20, 0x2E, 0x89, 0xB4, 0x7C, 0xB8, 0x26, 0x77, 0x99, 0xE3, 0xA5, // E
  0x67, 0x4A, 0xED, 0xDE, 0xC5, 0x31, 0xFE, 0x18, 0x0D, 0x63, 0x8C, 0x80, 0xC0, 0xF7, 0x70, 0x07  // F
};

/* Rcon table: round constants for key expansion, held in the most significant byte */
static const uint32_t Rcon[15]= {
  0x01000000,
  0x02000000,
  0x04000000,
  0x08000000,
  0x10000000,
  0x20000000,
  0x40000000,
  0x80000000,
  0x1B000000,
  0x36000000,
  0x6C000000,
  0xD8000000,
  0xAB000000,
  0x4D000000,
  0x9A000000
};

/* Functions */
void print_state(const uint8_t *s);	/* Print a state (data block) */
int aes_ShiftRows(uint8_t *state);  	/* ShiftRows */
int aes_InvShiftRows(uint8_t *state);	/* InvShiftRows */
int aes_ExpRoundKeys(uint8_t Nr, uint8_t Nk, const uint8_t *inkey, uint32_t *keywords);  /* Key expansion */
int aes_AddRoundKey(uint8_t Nr, uint8_t Nk, uint8_t round, uint8_t *state, const uint32_t *keywords);  /* AddRoundKey */
int aes_EncryptState(uint8_t Nr, uint8_t Nk, uint32_t *keywords, uint8_t *state);	/* Encrypt one block */
int aes_DecryptState(uint8_t Nr, uint8_t Nk, uint32_t *keywords, uint8_t *state);	/* Decrypt one block */



/*==============
      MAIN
===============*/
int main(void)
{
	int i,k;
	const uint8_t Nb=4;	/* Block size in words, 4/4/4 for AES-128/192/256 */
	uint8_t Nk;		/* Key size in words (number of key columns), 4/6/8 for AES-128/192/256 */
	uint8_t Nr;		/* Number of rounds, 10/12/14 for AES-128/192/256 */
	uint8_t state[4*4];	/* State array, stored row by row: state[4*row+col] */
	uint64_t ns;		/* Total number of states (data blocks) */

	/* Data to encrypt (one plaintext block) */
	const uint8_t input_msg[]= {
		0x00,0x11,0x22,0x33,0x44,0x55,0x66,0x77,0x88,0x99,0xaa,0xbb,0xcc,0xdd,0xee,0xff
	};

	/* Key size, number of rounds and input key for AES-128/192/256 */
	#if 0 /* TEST data --- AES-128 */
	Nk=4;
	Nr=10;
	const uint8_t inkey[4*4]= {  /* Nb*Nk */
		0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09,0x0a,0x0b,0x0c,0x0d,0x0e,0x0f
	};
	#endif
	#if 0 /* TEST data --- AES-192 */
	Nk=6;
	Nr=12;
	const uint8_t inkey[4*6]= {  /* Nb*Nk */
		0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09,0x0a,0x0b,0x0c,0x0d,0x0e,0x0f,
		0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17
	};
	#endif

	#if 1 /* TEST data --- AES-256 */
	Nk=8;
	Nr=14;
	const uint8_t inkey[4*8]= {  /* Nb*Nk */
		0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09,0x0a,0x0b,0x0c,0x0d,0x0e,0x0f,
		0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,0x18,0x19,0x1a,0x1b,0x1c,0x1d,0x1e,0x1f
	};
	#endif


	/* TEST: data for key expansion (the example keys of FIPS PUB 197 Appendix A) */
	#if 0
	Nk=4;
	Nr=10;
	const uint8_t inkey[4*4]=  /* Nb*Nk */
	{ 0x2b, 0x7e, 0x15, 0x16, 0x28, 0xae, 0xd2, 0xa6, 0xab, 0xf7, 0x15, 0x88, 0x09, 0xcf, 0x4f, 0x3c };
	#endif

	#if 0
	Nk=6;
	Nr=12;
	const uint8_t inkey[4*6]=  /* Nb*Nk */
	{ 0x8e, 0x73, 0xb0, 0xf7, 0xda, 0x0e, 0x64, 0x52, 0xc8, 0x10, 0xf3, 0x2b,
	  0x80, 0x90, 0x79, 0xe5, 0x62, 0xf8, 0xea, 0xd2, 0x52, 0x2c, 0x6b, 0x7b  };
	#endif

	#if 0
	Nk=8;
	Nr=14;
	const uint8_t inkey[4*8]=  /* Nb*Nk */
	{ 0x60, 0x3d, 0xeb, 0x10, 0x15, 0xca, 0x71, 0xbe, 0x2b, 0x73, 0xae, 0xf0, 0x85, 0x7d, 0x77, 0x81,
          0x1f, 0x35, 0x2c, 0x07, 0x3b, 0x61, 0x08, 0xd7, 0x2d, 0x98, 0x10, 0xa3, 0x09, 0x14, 0xdf, 0xf4 };
	#endif

	uint32_t keywords[Nb*(Nr+1)];  /* All expanded keys: Nr+1 round keys of Nb(=4) words each; one word packs a column as 0xb0b1b2b3 */

	/* Generate Nr+1 round keys from the input key, one round key per AddRoundKey step */
	aes_ExpRoundKeys(Nr, Nk, inkey, keywords);

	/* Total number of states (data blocks); only one block is processed here */
	//ns=(sizeof(input_msg)+15)/16;
	ns=1;
	/* With several blocks, each block i=0 ~ ns-1 would be encrypted in turn */
	i=0;  /* i=0 ~ ns-1 */

	/* Load the data into state[]. Note: the input bytes fill the state column by column! */
	memset(state,0,16);
	for(k=0; k<16; k++)
		state[(k%4)*4+k/4]=input_msg[i*16+k];

	/* Encrypt the state */
	aes_EncryptState(Nr, Nk, keywords, state);

	/* Print the encrypted state */
	printf("***********************************\n");
	printf("*******  Finish Encryption  *******\n");
	printf("***********************************\n");
	print_state(state);

	/* Decrypt the state */
	aes_DecryptState(Nr, Nk, keywords, state);
	//printf("Finish decrypt message, Round Nr=%d, KeySize Nk=%d, States ns=%llu.\n", Nr, Nk, ns);

	/* Print the decrypted state; it should equal the original input data */
	printf("***********************************\n");
	printf("*******  Finish Decryption  *******\n");
	printf("***********************************\n");
	print_state(state);

	return 0;
}



/* Print the state, one row per line */
void print_state(const uint8_t *s)
{
	int i,j;
	for(i=0; i<4; i++) {
		for(j=0; j<4; j++) {
			printf("%02x",s[i*4+j]);
			//printf("'%c'",s[i*4+j]); /* !!! A control key MAY erase previous chars on screen! !!! */
			printf(" ");
		}
		printf("\n");
	}
	printf("\n");
}

/*------------------------------
ShiftRows
Cyclically shift row k of the state to the left by k bytes (k=0..3).

@state[4*4]

Return:
	0	OK
	<0	Fails
-------------------------------*/
int aes_ShiftRows(uint8_t *state)
{
	int j,k;
	uint8_t tmp;

	if(state==NULL)
		return -1;

	for(k=0; k<4; k++) {
		/* each row shift k times */
		for(j=0; j<k; j++) {
			tmp=*(state+4*k);   /* save the first byte */
			//memcpy(state+4*k, state+4*k+1, 3);
			memmove(state+4*k, state+4*k+1, 3);
			*(state+4*k+3)=tmp; /* set the last byte */
		}
	}
	return 0;
}

/*------------------------------
InvShiftRows
Cyclically shift row k of the state to the right by k bytes (k=0..3).

@state[4*4]

Return:
	0	OK
	<0	Fails
-------------------------------*/
int aes_InvShiftRows(uint8_t *state)
{
	int j,k;
	uint8_t tmp;

	if(state==NULL)
		return -1;

	for(k=0; k<4; k++) {
		/* each row shift k times */
		for(j=0; j<k; j++) {
			tmp=*(state+4*k+3); /* save the last byte */
			memmove(state+4*k+1, state+4*k, 3);
			*(state+4*k)=tmp;   /* set the first byte */
		}
	}
	return 0;
}


/*-------------------------------------------------------------
AddRoundKey
XOR the round key of the given round into the state.

@Nr:	     	Number of rounds, 10/12/14 for AES-128/192/256
@Nk:        	Key size, in words.
@round:	Current round number.
@state:	Pointer to state.
@keywords[Nb*(Nr+1)]:   All round keys, in words.

--------------------------------------------------------------*/
int aes_AddRoundKey(uint8_t Nr, uint8_t Nk, uint8_t round, uint8_t *state, const uint32_t *keywords)
{
	int k;
	if(state==NULL || keywords==NULL)
		return -1;

	for(k=0; k<4*4; k++)
		state[k] = ( keywords[round*4+k%4]>>((3-(k>>2))<<3) &0xFF )^state[k];

	return 0;
}


/*----------------------------------------------------------------------------------------------------------
Key expansion
Expand the input key (also called the seed key) into Nr+1 round keys
(Nr = 10/12/14), i.e. Nb*(Nr+1) 32-bit words in total.
Generate round keys.

@Nr:	    Number of rounds, 10/12/14 for AES-128/192/256
@Nk:        		Key size, in words.
@inkey[4*Nk]:  	        Original key, 4*Nk bytes; every 4 consecutive bytes form one word.
@keywords[Nb*(Nr+1)]:   Output keys, in words.  Nb*(Nr+1)
			Each keyword (32 bits) holds one column of key bytes (4 bytes).

Note:
1. The caller MUST ensure enough mem space of input params.

Return:
	0	Ok
	<0	Fails
---------------------------------------------------------------------------------------------------------------*/
int aes_ExpRoundKeys(uint8_t Nr, uint8_t Nk, const uint8_t *inkey, uint32_t *keywords)
{
	int i;
	const int Nb=4;
	uint32_t temp;

	if(inkey==NULL || keywords==NULL)
		return -1;

	/* Pack inkey into keywords: every 4 consecutive bytes become one 32-bit keyword (one key column). */
	for( i=0; i<Nk; i++ ) {
		/* Cast before shifting: a promoted uint8_t shifted left by 24 may overflow a signed int */
		keywords[i]=((uint32_t)inkey[4*i]<<24)|((uint32_t)inkey[4*i+1]<<16)|((uint32_t)inkey[4*i+2]<<8)|inkey[4*i+3];
	}

	/* Expand round keys */
	for(i=Nk; i<Nb*(Nr+1); i++) {
		temp=keywords[i-1];
		if( i%Nk==0 ) {
			/* RotWord */
			temp=( temp<<8 )|( temp>>24 );
			/* SubWord */
			temp=((uint32_t)sbox[temp>>24]<<24) |((uint32_t)sbox[(temp>>16)&0xFF]<<16) |((uint32_t)sbox[(temp>>8)&0xFF]<<8)
				|sbox[temp&0xFF];
			/* temp=SubWord(RotWord(temp)) XOR Rcon[i/Nk-1] */
			temp=temp ^ Rcon[i/Nk-1];
		}
		else if (Nk>6 && i%Nk==4 ) {
			/* SubWord: extra step needed only when Nk>6 (AES-256) */
			temp=((uint32_t)sbox[temp>>24]<<24) |((uint32_t)sbox[(temp>>16)&0xFF]<<16) |((uint32_t)sbox[(temp>>8)&0xFF]<<8)
				|sbox[temp&0xFF];
		}

		/* Get keywords[i] */
		keywords[i]=keywords[i-Nk]^temp;
	}
	/* Print all keys */
	for(i=0; i<Nb*(Nr+1); i++)
		printf("keywords[%d]=0x%08X\n", i, keywords[i]);

	return 0;
}


/*----------------------------------------------------------------------
Encrypt one state (data block) in place.

@Nr:	   		 Number of rounds, 10/12/14 for AES-128/192/256
@Nk:        		 Key length, in words.
@keywords[Nb*(Nr+1)]:    All round keys, in words.
@state[4*4]:		The state block.

Note:
1. The caller MUST ensure enough mem space of input params.

Return:
	0	Ok
	<0	Fails
------------------------------------------------------------------------*/
int aes_EncryptState(uint8_t Nr, uint8_t Nk, uint32_t *keywords, uint8_t *state)
{
	int i,k;
	uint8_t round;
	uint8_t mc[4];		    /* Temp. var */

	if(keywords==NULL || state==NULL)
		return -1;

	/* 1. AddRoundKey: initial round, uses round key 0 */
	printf(" --- Add Round_key ---\n");
	aes_AddRoundKey(Nr, Nk, 0, state, keywords);
	print_state(state);

	/* Run Nr-1 full rounds */
	for( round=1; round<Nr; round++) {

		/* 2. SubBytes: substitute state bytes through the S-box */
		printf(" --- SubBytes() Round:%d ---\n",round);
		for(k=0; k<16; k++)
			state[k]=sbox[state[k]];
		print_state(state);

		/* 3. ShiftRows: shift state rows */
		printf(" --- ShiftRows() Round:%d ---\n",round);
		aes_ShiftRows(state);
		print_state(state);

		/* 4. MixColumns: mix state columns */
		/* Galois Field multiplication by the matrix:
			2 3 1 1
			1 2 3 1
			1 1 2 3
			3 1 1 2
		   Each product is computed through the log/antilog tables:
		   a*b = Etab[(Ltab[a]+Ltab[b])%0xFF] for non-zero a and b.
		   Note:
		   1. Any number multiplied by 1 is equal to the number itself.
		   2. Any number multiplied by 0 is 0!
		*/
		printf(" --- MixColumn() Round:%d ---\n",round);
		for(i=0; i<4; i++) { /* i as column index */
	   		mc[0]=  ( state[i]==0 ? 0 : Etab[(Ltab[state[i]]+Ltab[2])%0xFF] )
  				^( state[i+4]==0 ? 0 : Etab[(Ltab[state[i+4]]+Ltab[3])%0xFF] )
				^state[i+8]^state[i+12];
			mc[1]=  state[i]
				^( state[i+4]==0 ? 0 : Etab[(Ltab[state[i+4]]+Ltab[2])%0xFF] )
				^( state[i+8]==0 ? 0 : Etab[(Ltab[state[i+8]]+Ltab[3])%0xFF] )
				^state[i+12];
			mc[2]=  state[i]^state[i+4]
				^( state[i+8]==0 ? 0 : Etab[(Ltab[state[i+8]]+Ltab[2])%0xFF] )
				^( state[i+12]==0 ? 0 : Etab[(Ltab[state[i+12]]+Ltab[3])%0xFF] );
			mc[3]=  ( state[i]==0 ? 0 : Etab[(Ltab[state[i]]+Ltab[3])%0xFF] )
				^state[i+4]^state[i+8]
				^( state[i+12]==0 ? 0 : Etab[(Ltab[state[i+12]]+Ltab[2])%0xFF] );

			state[i+0]=mc[0];
			state[i+4]=mc[1];
			state[i+8]=mc[2];
			state[i+12]=mc[3];
		}
		print_state(state);

		/* 5. AddRoundKey: add the round key to the state */
		printf(" --- Add Round_key ---\n");
		aes_AddRoundKey(Nr, Nk, round, state, keywords);
		print_state(state);

	} /* END Nr-1 full rounds */

	/* 6. SubBytes: final round (round==Nr here), substitute state bytes through the S-box */
	printf(" --- SubBytes() Round:%d ---\n",round);
	for(k=0; k<16; k++)
		state[k]=sbox[state[k]];
	print_state(state);

	/* 7. ShiftRows: shift state rows */
	printf(" --- ShiftRows() Round:%d ---\n",round);
	aes_ShiftRows(state);
	print_state(state);

	/* 8. AddRoundKey: add the last round key; the final round has no MixColumns */
	printf(" --- Add Round_key ---\n");
	aes_AddRoundKey(Nr, Nk, round, state, keywords);
	print_state(state);

	return 0;
}


/*----------------------------------------------------------------------
Decrypt one state (data block) in place.

@Nr:	   		 Number of rounds, 10/12/14 for AES-128/192/256
@Nk:        		Key length, in words.
@keywords[Nb*(Nr+1)]:   All round keys, in words.
@state[4*4]:		The state block.

Note:
1. The caller MUST ensure enough mem space of input params.

Return:
	0	Ok
	<0	Fails
------------------------------------------------------------------------*/
int aes_DecryptState(uint8_t Nr, uint8_t Nk, uint32_t *keywords, uint8_t *state)
{
	int i,k;
	uint8_t round;
	uint8_t mc[4];		    /* Temp. var */

	if(keywords==NULL || state==NULL)
		return -1;

	/* 1. AddRoundKey: initial round, uses the last round key */
	printf(" --- Add Round_key ---\n");
	aes_AddRoundKey(Nr, Nk, Nr, state, keywords);  /* From Nr_th round */
	print_state(state);

	/* Run Nr-1 full decryption rounds, round keys in reverse order */
	for( round=Nr-1; round>0; round--) {  /* round [Nr-1  1]  */

		/* 2. InvShiftRows: inverse shift of state rows */
		printf(" --- InvShiftRows() Round:%d ---\n",Nr-round);
		aes_InvShiftRows(state);
		print_state(state);

		/* 3. InvSubBytes: substitute state bytes through the inverse S-box */
		printf(" --- (Inv)SubBytes() Round:%d ---\n",Nr-round);
		for(k=0; k<16; k++)
			state[k]=rsbox[state[k]];
		print_state(state);

		/* 4. AddRoundKey: add the round key to the state */
		printf(" --- Add Round_key Round:%d ---\n", Nr-round);
		aes_AddRoundKey(Nr, Nk, round, state, keywords);
		print_state(state);

		/* 5. InvMixColumns: inverse mix of state columns */
		/* Galois Field multiplication by the matrix:
			0x0E 0x0B 0x0D 0x09
			0x09 0x0E 0x0B 0x0D
			0x0D 0x09 0x0E 0x0B
			0x0B 0x0D 0x09 0x0E
		   Note:
		   1. Any number multiplied by 1 is equal to the number itself.
		   2. Any number multiplied by 0 is 0!
		*/
		printf(" --- InvMixColumn() Round:%d ---\n",Nr-round);
		for(i=0; i<4; i++) { 	/* i as column index */
	   		mc[0]=  ( state[i]==0 ? 0 : Etab[(Ltab[state[i]]+Ltab[0x0E])%0xFF] )
  				^( state[i+4]==0 ? 0 : Etab[(Ltab[state[i+4]]+Ltab[0x0B])%0xFF] )
  				^( state[i+8]==0 ? 0 : Etab[(Ltab[state[i+8]]+Ltab[0x0D])%0xFF] )
  				^( state[i+12]==0 ? 0 : Etab[(Ltab[state[i+12]]+Ltab[0x09])%0xFF] );
	   		mc[1]=  ( state[i]==0 ? 0 : Etab[(Ltab[state[i]]+Ltab[0x09])%0xFF] )
  				^( state[i+4]==0 ? 0 : Etab[(Ltab[state[i+4]]+Ltab[0x0E])%0xFF] )
  				^( state[i+8]==0 ? 0 : Etab[(Ltab[state[i+8]]+Ltab[0x0B])%0xFF] )
  				^( state[i+12]==0 ? 0 : Etab[(Ltab[state[i+12]]+Ltab[0x0D])%0xFF] );
	   		mc[2]=  ( state[i]==0 ? 0 : Etab[(Ltab[state[i]]+Ltab[0x0D])%0xFF] )
  				^( state[i+4]==0 ? 0 : Etab[(Ltab[state[i+4]]+Ltab[0x09])%0xFF] )
  				^( state[i+8]==0 ? 0 : Etab[(Ltab[state[i+8]]+Ltab[0x0E])%0xFF] )
  				^( state[i+12]==0 ? 0 : Etab[(Ltab[state[i+12]]+Ltab[0x0B])%0xFF] );
	   		mc[3]=  ( state[i]==0 ? 0 : Etab[(Ltab[state[i]]+Ltab[0x0B])%0xFF] )
  				^( state[i+4]==0 ? 0 : Etab[(Ltab[state[i+4]]+Ltab[0x0D])%0xFF] )
  				^( state[i+8]==0 ? 0 : Etab[(Ltab[state[i+8]]+Ltab[0x09])%0xFF] )
  				^( state[i+12]==0 ? 0 : Etab[(Ltab[state[i+12]]+Ltab[0x0E])%0xFF] );

			state[i+0]=mc[0];
			state[i+4]=mc[1];
			state[i+8]=mc[2];
			state[i+12]=mc[3];
		}
		print_state(state);

	} /* END Nr-1 full rounds */

	/* 6. InvShiftRows: final round (round==0 here), inverse shift of state rows */
	printf(" --- InvShiftRows() Round:%d ---\n",Nr-round);
	aes_InvShiftRows(state);
	print_state(state);

	/* 7. InvSubBytes: substitute state bytes through the inverse S-box */
	printf(" --- InvSubBytes() Round:%d ---\n",Nr-round);
	for(k=0; k<16; k++)
		state[k]=rsbox[state[k]];
	print_state(state);

	/* 8. AddRoundKey: add round key 0 to recover the original data */
	printf(" --- Add Round_key Round:%d ---\n",Nr-round);
	aes_AddRoundKey(Nr, Nk, 0, state, keywords);
	print_state(state);

	return 0;
}