Keynote of LCPC'10

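I have always found this puzzling: so many people are researching how to make compilers parallelize programs automatically. If it really works, why does parallelization still seem to be done mostly by hand, with no truly feasible and executable model available to use? And if it does not work, why has nobody simply proved that it is an NP-hard problem, so that people could at least stop pouring wave after wave of time and effort into it for now...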

Summary:

As its title almost gives away, this PDF explains why compilers have failed at auto-parallelization.

Over the past 25 years, compilers have had accomplishments in:

-- Instruction-level parallelism (ILP)

-- Memory-hierarchy optimization (optimizing perfectly and imperfectly nested DO-loops over dense arrays, using loop transformations and program abstractions; this is related to polyhedral methods)

-- Performance portability 

  - Java: bytecode interpretation + just-in-time (JIT) compilation

  - FFTW, SPIRAL: codelets + empirical search

  - ATLAS: parameterized program + empirical search
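To make "parameterized program + empirical search" concrete for myself, here is a minimal sketch in Python. It is not how ATLAS is actually implemented: the function names `matmul_tiled` and `empirical_search`, the candidate tile sizes, and the test matrices are all my own assumptions. The parameterized program is a tiled (blocked) matrix multiply, which is also a small instance of the loop transformations mentioned under memory-hierarchy optimization, and the search simply times each tile size and keeps the fastest.

```python
import time


def matmul_tiled(a, b, n, tile):
    """Multiply two n x n matrices (lists of lists) in tile x tile blocks.

    `tile` is the tunable parameter; any positive value gives the same result.
    """
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for kk in range(0, n, tile):
            for jj in range(0, n, tile):
                # Work on one block at a time so its data stays "close".
                for i in range(ii, min(ii + tile, n)):
                    row_a, row_c = a[i], c[i]
                    for k in range(kk, min(kk + tile, n)):
                        a_ik, row_b = row_a[k], b[k]
                        for j in range(jj, min(jj + tile, n)):
                            row_c[j] += a_ik * row_b[j]
    return c


def empirical_search(n, candidates, repeats=3):
    """Time every candidate tile size; return (fastest_tile, timings)."""
    # Hypothetical input data, chosen only so the timing loop has work to do.
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    timings = {}
    for tile in candidates:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            matmul_tiled(a, b, n, tile)
            best = min(best, time.perf_counter() - start)
        timings[tile] = best  # keep the best of `repeats` runs
    return min(timings, key=timings.get), timings


if __name__ == "__main__":
    best_tile, timings = empirical_search(n=32, candidates=[4, 8, 16, 32])
    print("fastest tile size on this machine:", best_tile)
```

In pure Python the interpreter overhead probably hides most cache effects, so the winning tile size means little here. The point is only the structure: the program exposes a knob, and measurement on the target machine, rather than a compiler's cost model, picks its value.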

 

For auto-parallelization itself, the only real success has been the vectorization of some dense-matrix programs. Parallelism has nevertheless succeeded in two other fields: databases (where SQL programmers are separated from DBMS implementers) and numerical linear algebra.

Lessons learnt:

-- Compilers are good at lowering the abstraction level of a program but poor at raising it

-- We need to figure out the right abstraction levels and the software contracts between those levels
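My own reading of "software contracts between levels", as a small Python sketch rather than anything taken from the keynote: the upper level calls a reduction operator and promises that the operation is associative; in exchange, the lower level is free to split the work however it likes. The name `parallel_reduce` and its parameters are hypothetical, and the contract is assumed, not checked.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def parallel_reduce(op, items, identity, workers=4):
    """Reduce `items` with `op`, splitting the work into chunks.

    Contract (assumed, NOT verified here): `op` is associative and
    `identity` is its identity element. Commutativity is not required
    because chunk order is preserved.
    """
    items = list(items)
    if not items:
        return identity
    # Ceiling division so that at most `workers` chunks are produced.
    chunk = max(1, (len(items) + workers - 1) // workers)
    chunks = [items[i:i + chunk] for i in range(0, len(items), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as `chunks`.
        partials = list(pool.map(lambda c: reduce(op, c, identity), chunks))
    # Combine the per-chunk results, again relying on associativity.
    return reduce(op, partials, identity)


if __name__ == "__main__":
    print(parallel_reduce(lambda x, y: x + y, range(1, 101), 0))
```

Nothing here tries to discover the parallelism by analyzing a sequential loop, which is the "raising" direction compilers are poor at. The caller states the property, and the implementation only has to lower it. With CPython threads this will probably not run faster for CPU-bound work, so treat it as an illustration of the contract, not of the speedup.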