LLVM gcc

Guetti

LLVM gcc

Post by Guetti »

I just downloaded the Xcode 3.1 update (Developer Tools for OS X).

What's New
• SDK support for targeting non-Mac OS X platforms, including iPhone OS SDK.
• GCC 4.2 & LLVM GCC 4.2 optional compilers for use with Mac OS X 10.5 SDK
• Updated assistants to create new projects, targets, and source files
• Toolbar uses a single popup to choose platform, target, and debug/release


LLVM GCC 4.2 is a new optional compiler based on the LLVM.org open source project. LLVM GCC 4.2 provides an LLVM-based back-end optimizer using the GCC 4.2 front-end parser. This compiler is both source and binary compatible with GCC 4.2 and requires the Mac OS X 10.5 SDK or "Current OS" SDK.

But what exactly does LLVM GCC do? How does it differ from GCC?
jdart
Posts: 4408
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: LLVM gcc

Post by jdart »

But what exactly does LLVM GCC do? How does it differ from GCC?
My understanding is LLVM is an alternative optimizer/code generator. Whether it is better than GCC's stock back end, I don't know. GCC has a pretty good optimizer but it's still buggy and every release seems to break something new, along with some fixes.

--Jon
Ron Murawski
Posts: 397
Joined: Sun Oct 29, 2006 4:38 am
Location: Schenectady, NY

Re: LLVM gcc

Post by Ron Murawski »

From the LLVM (Low Level Virtual Machine) website:

"Did you know that LLVM has a GCC 4.0 compatible C++ front-end and a great optimizer? We find that LLVM is able to compile C++ into substantially better code than GCC (for example). Also, because LLVM code can be converted to C, you can even use LLVM as a C++-to-C translator."

Also:

"LLVM is also .. great ... [for] ... compile-time, link-time, or run-time optimization"

The LLVM package looks interesting. Thanks for bringing it to my attention!

Ron
Guetti

Re: LLVM gcc

Post by Guetti »

Thanks. In the meantime I found the location of llvm-gcc.
The llvm-gcc binary is installed in /Developer/usr/bin/.
Apparently one can just set CC=/Developer/usr/bin/llvm-gcc (or llvm-g++ for C++ code) in the makefile.
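
A minimal sketch of that makefile change, assuming the makefile invokes the compilers through the conventional CC and CXX variables and that llvm-g++ sits in the same directory as llvm-gcc (the post only confirms the llvm-gcc location):

```make
# Assumption: the build rules use $(CC) for C and $(CXX) for C++.
CC  = /Developer/usr/bin/llvm-gcc
# Assumption: llvm-g++ is installed alongside llvm-gcc.
CXX = /Developer/usr/bin/llvm-g++
```

Variables given on the make command line normally override assignments inside the makefile, so running `make CC=/Developer/usr/bin/llvm-gcc` should allow a trial build without editing anything.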

I will try it out when I have the time.
Dann Corbit
Posts: 12792
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: LLVM gcc

Post by Dann Corbit »

Ron Murawski wrote: From the LLVM (Low Level Virtual Machine) website:

"Did you know that LLVM has a GCC 4.0 compatible C++ front-end and a great optimizer? We find that LLVM is able to compile C++ into substantially better code than GCC (for example). Also, because LLVM code can be converted to C, you can even use LLVM as a C++-to-C translator."

Also:

"LLVM is also .. great ... [for] ... compile-time, link-time, or run-time optimization"

The LLVM package looks interesting. Thanks for bringing it to my attention!

Ron
It looks to me like it is good for portability, but from what I have read, it stinks as an optimizing compiler (does well compared to GCC with -O0, but that's nothing great).

LLVM is like the Microsoft CLR languages -- it compiles down to bytecode and then an interpreter handles the byte stream. I can't imagine how it would be a great optimizer.

On the other hand, if you are targeting something that does not have a C++ compiler and your source code is in C++, then it will rewrite the C++ as C for you.
wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: LLVM gcc

Post by wgarvin »

Dann Corbit wrote: It looks to me like it is good for portability, but from what I have read, it stinks as an optimizing compiler (does well compared to GCC with -O0, but that's nothing great).

LLVM is like the Microsoft CLR languages -- it compiles down to bytecode and then an interpreter handles the byte stream. I can't imagine how it would be a great optimizer.

On the other hand, if you are targeting something that does not have a C++ compiler and your source code is in C++, then it will rewrite the C++ as C for you.
Uhh... do you have a source for these opinions? I think you might be wrong here. I haven't used it myself, but I've read a fair bit about LLVM over the past 7 years. It's much, much more than a bytecode interpreter.

The LLVM suite includes all the same pieces as a production compiler and linker -- frontends, intermediate representations, strong optimization passes, and backend code generators. The difference is that the pieces have well-defined interfaces and formats to exchange data with, and so you can use them whenever you want: at compile time or "link time", on-the-fly in your IDE, and even while your program is running (e.g. using the optimizer and code generator pieces to make a JIT).

The GCC-LLVM compiler that is being talked about is a native-code-generating optimizing compiler, just like GCC by itself is. They just replaced the optimizer/code generator pieces with the LLVM equivalents. From things I've read, I vaguely believe LLVM's optimizer is stronger than GCC 4's in some ways, although it may be less efficient at some other things. There might be bugs (though that is true of any compiler).
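
To make the "same front end, different optimizer and code generator" point concrete, here is a small sketch: a bit-counting routine of the kind chess engines use, with comments showing how one might inspect the intermediate and native output. The file name popcount.c and the function name are made up for illustration, and the command lines are how llvm-gcc is commonly documented to be invoked, so check them against your installed version.

```c
#include <stdint.h>

/*
 * Possible ways to look at the stages (assumed invocations, not verified
 * against Xcode 3.1's llvm-gcc):
 *
 *   llvm-gcc -O2 -emit-llvm -S popcount.c -o popcount.ll   LLVM IR as text
 *   llvm-gcc -O2 -S popcount.c -o popcount.s               native assembly
 *   gcc      -O2 -S popcount.c -o popcount-gcc.s           stock GCC, to compare
 */

/* Count set bits by clearing the lowest set bit on each iteration. */
int popcount64(uint64_t b)
{
    int n = 0;
    while (b) {
        b &= b - 1;   /* drops the lowest set bit */
        n++;
    }
    return n;
}
```

Comparing the two assembly files for a hot routine like this is probably a more reliable guide than any general claim about which optimizer is stronger.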

LLVM is mature technology, almost (but not quite) ready for wide-scale production use. Its most promising aspect is that it's a collection of building blocks that language designers, compiler and JIT writers, and tool builders can use to get decent-quality, optimized machine code out of their new custom language or tool.

You can read more about it here: http://llvm.org/
Low Level Virtual Machine (LLVM) is:

A compilation strategy designed to enable effective program optimization across the entire lifetime of a program. LLVM supports effective optimization at compile time, link-time (particularly interprocedural), run-time and offline (i.e., after software is installed), while remaining transparent to developers and maintaining compatibility with existing build scripts.

A virtual instruction set - LLVM is a low-level object code representation that uses simple RISC-like instructions, but provides rich, language-independent, type information and dataflow (SSA) information about operands. This combination enables sophisticated transformations on object code, while remaining light-weight enough to be attached to the executable. This combination is key to allowing link-time, run-time, and offline transformations.

A compiler infrastructure - LLVM is also a collection of source code that implements the language and compilation strategy. The primary components of the LLVM infrastructure are a GCC-based C & C++ front-end, a link-time optimization framework with a growing set of global and interprocedural analyses and transformations, static back-ends for the X86, X86-64, PowerPC 32/64, ARM, Thumb, IA-64, Alpha, SPARC, MIPS and CellSPU architectures, a back-end which emits portable C code, and a Just-In-Time compiler for X86, X86-64, PowerPC 32/64 processors, and an emitter for MSIL.

LLVM does not imply things that you would expect from a high-level virtual machine. It does not require garbage collection or run-time code generation (In fact, LLVM makes a great static compiler!). Note that optional LLVM components can be used to build high-level virtual machines and other systems that need these services.
Dann Corbit
Posts: 12792
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: LLVM gcc

Post by Dann Corbit »

wgarvin wrote:
Dann Corbit wrote: It looks to me like it is good for portability, but from what I have read, it stinks as an optimizing compiler (does well compared to GCC with -O0, but that's nothing great).

LLVM is like the Microsoft CLR languages -- it compiles down to bytecode and then an interpreter handles the byte stream. I can't imagine how it would be a great optimizer.

On the other hand, if you are targeting something that does not have a C++ compiler and your source code is in C++, then it will rewrite the C++ as C for you.
Uhh... do you have a source for these opinions? I think you might be wrong here. I haven't used it myself, but I've read a fair bit about LLVM over the past 7 years. It's much, much more than a bytecode interpreter.
http://cliffhacks.blogspot.com/2007/03/ ... -llvm.html
wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: LLVM gcc

Post by wgarvin »

That guy is comparing runtime-JITted code from LLVM against statically-compiled code from GCC. I'd be surprised if the JIT includes the sort of expensive optimizations you would find at compile or link time in a static compiler at -O3. In an apples to apples comparison it might do better.

http://www.cs.uiuc.edu/news/articles.ph ... 7Jun25-274
Apple is using LLVM in the forthcoming version of their operating system, MacOS 10.5 (Leopard). MacOS uses LLVM both at compile-time and run-time to optimize graphics shader codes. Shader codes are used to render individual scenes in visualization or game applications, and are largely composed of calls to standard OpenGL operations. At compile time, LLVM is used to compile and optimize individual operations in the MacOS OpenGL library. The library is saved as LLVM "bytecode," a compact, persistent code representation. LLVM is then used again at run-time, when shader codes are loaded into a visualization or game application, to translate these codes into efficient native code for the host processor.

Both Cray and Ageia are using LLVM to develop "back ends" (i.e., native code generators) for their architectures. Cray is creating a back end for the AMD Opteron processors, and Ageia for a custom physics-based processor used for accelerating video games.
I think what they're really saying there (about Apple's use in OpenGL) is that the OpenGL interface is procedural and stateful, and they use LLVM to compile fast implementations at run-time that are optimized for the current states that you've set. It's similar to what game programmers do with GPU shader variations, except those are usually all pre-compiled to have them ready. The OpenGL API has so many states that pre-optimizing for all the useful combinations of them is probably impossible.
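
A rough sketch of the state-specialization idea in plain C, with no LLVM or OpenGL involved. The State struct, its two flags, and all function names are hypothetical. The generic routine re-checks the state for every pixel, while the precompiled variants (the approach compared above to shader variations) each handle exactly one state combination and are selected once.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical two-flag render state; real OpenGL state is far larger. */
typedef struct {
    int invert; /* invert the source value before writing */
    int blend;  /* average the source with the destination */
} State;

/* Generic path: re-tests every state flag for every pixel. */
static void fill_generic(const State *s, uint8_t *dst,
                         const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        unsigned v = src[i];
        if (s->invert) v = 255u - v;
        if (s->blend)  v = (v + dst[i]) / 2u;
        dst[i] = (uint8_t)v;
    }
}

/* Specialized variants: one loop per state combination, no flag tests. */
static void fill_plain(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++) dst[i] = src[i];
}

static void fill_invert(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++) dst[i] = (uint8_t)(255u - src[i]);
}

static void fill_blend(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++) dst[i] = (uint8_t)((src[i] + dst[i]) / 2u);
}

static void fill_invert_blend(uint8_t *dst, const uint8_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint8_t)((255u - src[i] + dst[i]) / 2u);
}

typedef void (*FillFn)(uint8_t *dst, const uint8_t *src, size_t n);

/* Pick the matching variant once, when the state changes. */
static FillFn select_fill(const State *s)
{
    static const FillFn table[4] = {
        fill_plain, fill_invert, fill_blend, fill_invert_blend
    };
    return table[(s->blend ? 2 : 0) | (s->invert ? 1 : 0)];
}
```

Two flags already need four variants, and k independent flags would need 2^k. That growth is presumably why generating code at run time for only the current state is attractive.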

http://www.appleinsider.com/articles/08 ... plier.html
In addition to the pure LLVM Clang project, which uses an early, developmental front end code parser for Objective C/C/C++, Apple also started work on integrating components of LLVM into the existing GCC based on Lattner's LLVM/GCC Integration Proposal. That has resulted in a hybrid system that leverages the mature components of GCC, such as its front end parser, while adding the most valuable components of LLVM, including its modern code optimizers.

That project, known as LLVM-GCC, inserts the optimizer and code generator from LLVM into GCC, providing modern methods for "aggressive loop, standard scalar, and interprocedural optimizations and interprocedural analyses" missing in the standard GCC components.

LLVM-GCC is designed to be highly compatible with GCC so that developers can move to the new compiler and benefit from its code optimizations without making substantial changes to their workflow. Sources report that LLVM-GCC "compiles code that consistently runs 33% faster" than code output from GCC.

Apple also uses LLVM in the OpenGL stack in Leopard, leveraging its virtual machine concept of common IR to emulate OpenGL hardware features on Macs that lack the actual silicon to interpret that code. Code is instead interpreted or JIT on the CPU.
Dann Corbit
Posts: 12792
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: LLVM gcc

Post by Dann Corbit »

wgarvin wrote:
That guy is comparing runtime-JITted code from LLVM against statically-compiled code from GCC. I'd be surprised if the JIT includes the sort of expensive optimizations you would find at compile or link time in a static compiler at -O3. In an apples to apples comparison it might do better.

http://www.cs.uiuc.edu/news/articles.ph ... 7Jun25-274
Apple is using LLVM in the forthcoming version of their operating system, MacOS 10.5 (Leopard). MacOS uses LLVM both at compile-time and run-time to optimize graphics shader codes. Shader codes are used to render individual scenes in visualization or game applications, and are largely composed of calls to standard OpenGL operations. At compile time, LLVM is used to compile and optimize individual operations in the MacOS OpenGL library. The library is saved as LLVM "bytecode," a compact, persistent code representation. LLVM is then used again at run-time, when shader codes are loaded into a visualization or game application, to translate these codes into efficient native code for the host processor.

Both Cray and Ageia are using LLVM to develop "back ends" (i.e., native code generators) for their architectures. Cray is creating a back end for the AMD Opteron processors, and Ageia for a custom physics-based processor used for accelerating video games.
I think what they're really saying there (about Apple's use in OpenGL) is that the OpenGL interface is procedural and stateful, and they use LLVM to compile fast implementations at run-time that are optimized for the current states that you've set. It's similar to what game programmers do with GPU shader variations, except those are usually all pre-compiled to have them ready. The OpenGL API has so many states that pre-optimizing for all the useful combinations of them is probably impossible.

http://www.appleinsider.com/articles/08 ... plier.html
In addition to the pure LLVM Clang project, which uses an early, developmental front end code parser for Objective C/C/C++, Apple also started work on integrating components of LLVM into the existing GCC based on Lattner's LLVM/GCC Integration Proposal. That has resulted in a hybrid system that leverages the mature components of GCC, such as its front end parser, while adding the most valuable components of LLVM, including its modern code optimizers.

That project, known as LLVM-GCC, inserts the optimizer and code generator from LLVM into GCC, providing modern methods for "aggressive loop, standard scalar, and interprocedural optimizations and interprocedural analyses" missing in the standard GCC components.

LLVM-GCC is designed to be highly compatible with GCC so that developers can move to the new compiler and benefit from its code optimizations without making substantial changes to their workflow. Sources report that LLVM-GCC "compiles code that consistently runs 33% faster" than code output from GCC.

Apple also uses LLVM in the OpenGL stack in Leopard, leveraging its virtual machine concept of common IR to emulate OpenGL hardware features on Macs that lack the actual silicon to interpret that code. Code is instead interpreted or JIT on the CPU.
Since I do parsing all the time (I do SQL parsing as well as other things), I think that LLVM is something very interesting.

If the LLVM compiler is beating GCC by 1/3 that is very impressive. And with Apple's weight behind it, it can't help but grow better. It is nice to see new alternatives for programmers.
Ron Murawski
Posts: 397
Joined: Sun Oct 29, 2006 4:38 am
Location: Schenectady, NY

Re: LLVM gcc

Post by Ron Murawski »

wgarvin wrote:
That guy is comparing runtime-JITted code from LLVM against statically-compiled code from GCC. I'd be surprised if the JIT includes the sort of expensive optimizations you would find at compile or link time in a static compiler at -O3. In an apples to apples comparison it might do better.
From Dann's link to Cliff's Hacks:
"This is when I started to notice something: the JIT is impressively fast, considering that I had not activated any of LLVM's impressive suite of optimizations."

The LLVM JITted code results blew me away. Great stuff!

Ron