CORESPEQ Inc High Performance Computing Through Formal Methods


The CORESPEQ C Compiler is a special-purpose compiler that accepts a proper subset of C programs as input. The scope of input programs is carefully chosen to allow the specification of most scientific algorithms, yet enable high level loop analysis and transformations that are essential for high performance implementations. Program transformations include optimizations such as tiling for parallelization or data locality maximization, memory re-layout for prefetch-friendly array accesses, unrolling, etc. The final code-generation step performs target-specific vectorization.

COREquations Engine

The COREquations engine accepts mathematical equations as inputs; which are the most succinct and precise specification for scientific kernels. At this highest level of declarative specification, the design space of semantic-preserving program transformations that can be applied is also the most complete. This design space exploration provides various choices for parallelization, data partitioning, tiling (with both shape and size as parameters), loop ordering and memory layout. Furthermore, the engine can determine the value lifetime of all computations at all iterations analytically and therefore perform the memory minimization of temporary arrays. From the precise mathematical foundations of the polyhedral model, the engine can also determine incomplete and/or double definitions of variables as well as out-of-bounds array accesses. The engine can generate single-threaded tiled C code for the specification, as well as MPI for distributed memory machines (complete with data partitioning, communication and synchronization).

CORESPEQ Test Generator

Optimized code needs to be perfectly conformant with the provided (unoptimized) function within the parameter space of valid inputs. And this conformance should be verifiable. We generate extensive test harnesses (with an optional hadoop based parallel test execution framework) to help with this verification. Our test harnesses run the original code for result baselining, may optionally execute an extended-precision reference code to catch any overflows in the provided function itself, and finally, the optimized function for output equivalence. It provides for auto-generated as well as specially entered data vectors for continuum and corner case testing of the code. The test harnesses also provide speedup numbers under various loads (single execution thread, full system load etc.).