Link to the original post (in English)

I came across a brilliant idea from Uncle Bob and could not resist sharing it: any nested loop can be reduced to a single loop.

1. Introducing the problem

The conventional code for prime factorization:

public List<Integer> factorsOf(int n) {
  ArrayList<Integer> factors = new ArrayList<>();

  for (int d = 2; n > 1; d++)
    for (; n % d == 0; n /= d)
      factors.add(d);

  return factors;
}

This is a doubly nested loop.
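As a quick sanity check before the transformations begin, here is a minimal runnable sketch of the nested-loop version. The class name `NestedLoopFactors`, the `static` modifier and the `main` method are assumptions added for illustration; the loop itself is the one shown above.

```java
import java.util.ArrayList;
import java.util.List;

class NestedLoopFactors {
  static List<Integer> factorsOf(int n) {
    List<Integer> factors = new ArrayList<>();
    // Outer loop walks the candidate divisors;
    // inner loop divides out the current divisor as often as it fits.
    for (int d = 2; n > 1; d++)
      for (; n % d == 0; n /= d)
        factors.add(d);
    return factors;
  }

  public static void main(String[] args) {
    System.out.println(factorsOf(360)); // prints [2, 2, 2, 3, 3, 5]
    System.out.println(factorsOf(1));   // prints []
  }
}
```

Every later version in this post should return the same lists for the same inputs.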

But Uncle Bob saw a different, recursive form in the Clojure community (if you would rather not read it, skip to the Java translation below):

(defn prime-factors [n]
  (loop [n n d 2 factors []]
    (if (> n 1)
      (if (zero? (mod n d))
        (recur (/ n d) d (conj factors d))
        (recur n (inc d) factors))
      factors)))

The Java translation:

private List<Integer> factorsOf(int n) {
  return factorsOf(n, 2, new ArrayList<Integer>());
}

private List<Integer> factorsOf(int n, int d, List<Integer> factors) {
  if (n > 1) {
    if (n % d == 0) {
      factors.add(d);
      return factorsOf(n / d, d, factors);
    } else {
      return factorsOf(n, d + 1, factors);
    }
  }
  return factors;
}

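At this point it may not look like much. But this is tail recursion, and a tail-recursive function can be converted to a loop and back at will. So if we replace the tail recursion with a loop: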

private List<Integer> factorsOf(int n, int d, List<Integer> factors) {
  while (true) {
    if (n > 1) {
      if (n % d == 0) {
        factors.add(d);
        n /= d;
      } else {
        d++;
      }
    } else {
      return factors;
    }
  }
}

Huh? One level of looping has suddenly disappeared.
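The conversion used here is mechanical: the parameters become mutable locals, each tail call becomes a reassignment of those locals, and the whole body is wrapped in `while (true)`. As a sketch of the same recipe on a different function, here is Euclid's algorithm in both forms. The `gcd` example and the class name `TailCallToLoop` are illustrations chosen for this note and do not come from the original post.

```java
class TailCallToLoop {
  // Tail-recursive form: the recursive call is the very last action.
  static int gcdRecursive(int a, int b) {
    if (b == 0) {
      return a;
    }
    return gcdRecursive(b, a % b);
  }

  // Loop form: same tests, but the tail call is replaced by
  // reassigning the parameters and going around again.
  static int gcdLoop(int a, int b) {
    while (true) {
      if (b == 0) {
        return a;
      }
      int nextA = b;      // compute both new arguments first,
      int nextB = a % b;  // exactly as the call would have done
      a = nextA;
      b = nextB;
    }
  }

  public static void main(String[] args) {
    System.out.println(gcdRecursive(48, 18)); // prints 6
    System.out.println(gcdLoop(48, 18));      // prints 6
  }
}
```

Java does not eliminate tail calls, so the loop form also avoids growing the call stack on long runs.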

2. What is really going on

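Start with the inner loop of the nested version. Its control condition is whether n % d == 0 holds, so as soon as the outer loop tests that condition itself, the inner loop is no longer needed.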
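The same move works on an ordinary counting loop, which may make the idea easier to see. The sketch below walks a grid of coordinates twice: once with two nested `for` loops, once with a single `while` whose body tests the former inner-loop condition. The grid example and the names `FlattenedGrid`, `nested` and `flat` are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class FlattenedGrid {
  // Reference version: two nested loops.
  static List<String> nested(int rows, int cols) {
    List<String> out = new ArrayList<>();
    for (int i = 0; i < rows; i++)
      for (int j = 0; j < cols; j++)
        out.add(i + "," + j);
    return out;
  }

  // Single loop: the inner condition (j < cols) is tested inside the outer loop.
  static List<String> flat(int rows, int cols) {
    List<String> out = new ArrayList<>();
    int i = 0;
    int j = 0;
    while (i < rows) {
      if (j < cols) {
        out.add(i + "," + j); // what the inner loop body did
        j++;                  // the inner loop's update step
      } else {
        i++;                  // the outer loop's update step
        j = 0;                // re-run the inner loop's initializer
      }
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println(nested(2, 2)); // prints [0,0, 0,1, 1,0, 1,1]
    System.out.println(flat(2, 2));   // prints [0,0, 0,1, 1,0, 1,1]
  }
}
```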

Following this idea, we can pull every control condition, inner and outer alike, out into its own named variable. The factoring routine then becomes:

private List<Integer> factorsOf(int n, int d, List<Integer> factors) {
  while (true) {
    boolean factorsRemain = n > 1;                // former outer-loop condition
    boolean currentDivisorIsFactor = n % d == 0;  // former inner-loop condition
    if (factorsRemain) {
      if (currentDivisorIsFactor) {
        factors.add(d);
        n /= d;
      } else {
        d++;
      }
    } else {
      return factors;
    }
  }
}

The control conditions of the original nested loops have turned into while (true) plus a nested if.

If the nested if feels uncomfortable, the two Boolean variables can be combined so that the nested if becomes a flat sequence of ifs:

private List<Integer> factorsOf(int n, int d, List<Integer> factors) {
  while (true) {
    boolean factorsRemain = n > 1;
    boolean currentDivisorIsFactor = n % d == 0;
    if (factorsRemain && currentDivisorIsFactor) {
      factors.add(d);
      n /= d;
    }
    if (factorsRemain && !currentDivisorIsFactor)
      d++;
    if (!factorsRemain)
      return factors;
  }
}

The combined conditions in the flat version are a little harder to read, so we give each of them a name:

private List<Integer> factorsOf(int n, int d, List<Integer> factors) {
  while (true) {
    boolean factorsRemain = n > 1;
    boolean currentDivisorIsFactor = n % d == 0;

    boolean factorOutCurrentDivisor = factorsRemain && currentDivisorIsFactor;
    boolean tryNextDivisor = factorsRemain && !currentDivisorIsFactor;
    boolean allDone = !factorsRemain;

    if (factorOutCurrentDivisor) {
      factors.add(d);
      n /= d;
    }
    if (tryNextDivisor) {
      d++;
    }
    if (allDone)
      return factors;
  }
}
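The flat sequence of ifs is only equivalent to the nested if because exactly one of the three named conditions is true on any given pass, and because all three are computed before `n` or `d` is changed. The sketch below checks the first point by enumerating the four possible inputs. The class name `FlagCheck` and the helper `trueCount` are hypothetical names added for this check.

```java
class FlagCheck {
  // Returns how many of the three derived flags are true for the given inputs.
  static int trueCount(boolean factorsRemain, boolean currentDivisorIsFactor) {
    boolean factorOutCurrentDivisor = factorsRemain && currentDivisorIsFactor;
    boolean tryNextDivisor = factorsRemain && !currentDivisorIsFactor;
    boolean allDone = !factorsRemain;

    int count = 0;
    if (factorOutCurrentDivisor) count++;
    if (tryNextDivisor) count++;
    if (allDone) count++;
    return count;
  }

  public static void main(String[] args) {
    boolean[] values = {false, true};
    for (boolean remain : values) {
      for (boolean isFactor : values) {
        // Each of the four lines printed ends in "-> 1".
        System.out.println(remain + " " + isFactor + " -> " + trueCount(remain, isFactor));
      }
    }
  }
}
```

Since the conditions are mutually exclusive and cover every case, each pass through the loop is in exactly one of three situations.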

To take the argument further, we turn the control conditions into enum values:

private enum State {Starting, Factoring, Searching, Done}

private List<Integer> factorsOf(int n, int d, List<Integer> factors) {
  State state = State.Starting;
  while (true) {
    boolean factorsRemain = n > 1;
    boolean currentDivisorIsFactor = n % d == 0;

    // Decide the state for this pass from the two conditions.
    if (factorsRemain && currentDivisorIsFactor)
      state = State.Factoring;
    if (factorsRemain && !currentDivisorIsFactor)
      state = State.Searching;
    if (!factorsRemain)
      state = State.Done;

    // Act on the chosen state.
    switch (state) {
      case Factoring:
        factors.add(d);
        n /= d;
        break;
      case Searching:
        d++;
        break;
      case Done:
        return factors;
    }
  }
}

At this point we notice that each state can work out the state of the next iteration by itself:

private List<Integer> factorsOf(int n, int d, List<Integer> factors) {
  State state = State.Starting;
  while (true) {
    switch (state) {
      case Starting:
        // n <= 1 rather than n == 1, so that n < 1 still terminates
        // exactly as the earlier "n > 1" guard did.
        if (n <= 1)
          state = State.Done;
        else if (n % d == 0)
          state = State.Factoring;
        else
          state = State.Searching;
        break;
      case Factoring:
        factors.add(d);
        n /= d;
        if (n <= 1)
          state = State.Done;
        else if (n % d != 0)
          state = State.Searching;
        break;
      case Searching:
        d++;
        if (n <= 1)
          state = State.Done;
        else if (n % d == 0)
          state = State.Factoring;
        break;
      case Done:
        return factors;
    }
  }
}

What is this good for? Let me simplify it first:

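private List<Integer> factorsOf(int n, int d, List<Integer> factors) {
  State state = State.Starting;
  while (true) {
    // Action attached to the current state.
    switch (state) {
      case Starting:
        break;
      case Factoring:
        factors.add(d);
        n /= d;
        break;
      case Searching:
        d++;
        break;
      case Done:
        return factors;
    }

    // Transition: the next state depends only on n and d.
    // n <= 1 rather than n == 1, so that n < 1 still terminates
    // exactly as the earlier "n > 1" guard did.
    if (n <= 1)
      state = State.Done;
    else if (n % d == 0)
      state = State.Factoring;
    else
      state = State.Searching;
  }
}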

Much clearer. So here is the conclusion:


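The original doubly nested loop can in fact be expressed as a Moore finite state machine, and each pass through the loop is simply a transition from one state to another.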
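To watch those transitions happen, the sketch below runs the same action-then-transition loop but records which state each pass is in instead of collecting factors. The class name `FactorTrace` and the method `trace` are assumptions added for illustration, and the termination test is written as `n <= 1` so that inputs below 1 also stop.

```java
import java.util.ArrayList;
import java.util.List;

class FactorTrace {
  enum State {Starting, Factoring, Searching, Done}

  // Returns the sequence of states the machine passes through for n.
  static List<State> trace(int n) {
    List<State> visited = new ArrayList<>();
    int d = 2;
    State state = State.Starting;
    while (true) {
      visited.add(state);
      // Action attached to the current state.
      switch (state) {
        case Starting:
          break;
        case Factoring:
          n /= d;
          break;
        case Searching:
          d++;
          break;
        case Done:
          return visited;
      }
      // Transition to the next state.
      if (n <= 1)
        state = State.Done;
      else if (n % d == 0)
        state = State.Factoring;
      else
        state = State.Searching;
    }
  }

  public static void main(String[] args) {
    // prints [Starting, Factoring, Factoring, Searching, Factoring, Done]
    System.out.println(trace(12));
  }
}
```

For 12 the machine factors out 2 twice, searches once to move from 2 to 3, factors out 3, and halts.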

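This kind of jumping between the states of a finite state machine is exactly what Alan Turing envisioned back in his 1936 paper. Charles Petzold's book "The Annotated Turing" covers it in detail.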

3. Why it is so striking

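It feels ingenious because, after years of writing code, I had never once connected loops with state machines. Uncle Bob gives his own view of why: a Java for loop can store and mutate the state of its control condition in place, which hides the state machine. That is why it took the Clojure version, which favors immutable variables, for him to make the connection right away. Seen this way, functional programming sits closer to the Moore finite state machine.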

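Another idea from Uncle Bob: since the outermost loop is just while (true), perhaps this could be the starting point for a new language that drops even that outermost loop and simply lets a state machine run. There would be no for, while, if, else, or goto, only switch. The program would follow the switch from one state node of the finite state machine (FSM) to the next until it is told to stop.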
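Java cannot express such a language directly, but a rough approximation is to hide the only loop inside a small helper and write the program as nothing more than a step function from state to state. Everything named below (`HiddenLoopMachine`, `run`, `Factorizer`, `step`) is hypothetical and invented for this sketch, and the step function still relies on Java's `if` and `?:` to pick the next state, which the imagined language would express as transitions instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

class HiddenLoopMachine {
  enum State {Starting, Factoring, Searching, Done}

  // Stand-in for the language runtime: owns the only loop
  // and keeps stepping until the halt state is reached.
  static <S> void run(S start, S halt, UnaryOperator<S> step) {
    S state = start;
    while (!state.equals(halt)) {
      state = step.apply(state);
    }
  }

  // The "program": registers plus one step function, with no loop of its own.
  static final class Factorizer {
    int n;
    int d = 2;
    final List<Integer> factors = new ArrayList<>();

    Factorizer(int n) {
      this.n = n;
    }

    State step(State state) {
      switch (state) {
        case Factoring:
          factors.add(d);
          n /= d;
          break;
        case Searching:
          d++;
          break;
        default:
          break; // Starting has no action; Done is never stepped
      }
      if (n <= 1) {
        return State.Done;
      }
      return (n % d == 0) ? State.Factoring : State.Searching;
    }
  }

  static List<Integer> factorsOf(int n) {
    Factorizer machine = new Factorizer(n);
    run(State.Starting, State.Done, machine::step);
    return machine.factors;
  }

  public static void main(String[] args) {
    System.out.println(factorsOf(360)); // prints [2, 2, 2, 3, 3, 5]
  }
}
```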