Commit b6473e5

Merge pull request #26 from SRI-CSL/master
Dragonegg now compiles apache, AND is part of the travis_ci
2 parents 41b6ce0 + 2e0bd3b commit b6473e5

8 files changed: +145 -97 lines changed

.travis.yml

Lines changed: 3 additions & 3 deletions
@@ -8,14 +8,14 @@ python:
 install:
   - sudo apt-get update
   - sudo apt-get install llvm-3.4 clang-3.4 libapr1-dev libaprutil1-dev
-  - sudo apt-get install dragonegg llvm-3.3 llvm-gcc
+  - sudo apt-get install dragonegg llvm-3.0 llvm-gcc
   - export WLLVM_HOME=`pwd`

 # command to run tests
 script:
   # build apache with clang (i.e. httpd-2.4.12)
   - ${WLLVM_HOME}/.travis/apache_clang.sh
-  # when it works we can uncomment this puppy out
-  # - ${WLLVM_HOME}/.travis/apache_dragonegg.sh
+  # build apache with gcc and dragonegg (i.e. httpd-2.4.12)
+  - ${WLLVM_HOME}/.travis/apache_dragonegg.sh

.travis/apache_dragonegg.sh

Lines changed: 1 addition & 0 deletions
@@ -2,6 +2,7 @@
 # Make sure we exit if there is a failure
 set -e

+export dragonegg_disable_version_check=true

 export PATH=/usr/lib/llvm-3.0/bin:${WLLVM_HOME}:${PATH}
 export LLVM_COMPILER=dragonegg

NOTES.txt

Lines changed: 0 additions & 23 deletions
This file was deleted.

README.md

Lines changed: 64 additions & 45 deletions
@@ -1,51 +1,53 @@

-[![Build Status](https://travis-ci.org/travitch/whole-program-llvm.svg?branch=master)](https://travis-ci.org/travitch/whole-program-llvm)
+[![Build Status](https://travis-ci.org/SRI-CSL/whole-program-llvm.svg?branch=master)](https://travis-ci.org/SRI-CSL/whole-program-llvm)


 Introduction
 ============

-This is a small python-based wrapper around a GCC-compatible compiler
-to make it easy to build whole-program (or whole-library) LLVM bitcode
-files. The idea is that it first invokes the compiler as normal to
-build a real object file. It then invokes a bitcode compiler to
-generate the corresponding bitcode, recording the location of the
-bitcode file in an ELF section of the actual object file.
-
-When object files are linked together, the contents of non-special ELF
-sections are just concatenated (so we don't lose the locations of any
-of the constituent bitcode files).
-
-This package contains an extra utility, extract-bc, to read the
-contents of this ELF section and link all of the bitcode into a single
-whole-program bitcode file. This utility can also be used on built
-native static libraries to generate LLVM bitcode archives.
-
-This two-phase build process is slower and more elaborate than normal,
-but in practice is necessary to be a drop-in replacement for gcc in
-any build system. Approaches using the LTO framework in gcc and the
-gold linker plugin work for many cases, but fail in the presence of
-static libraries in builds. This approach has the distinct advantage
-of generating working binaries, in case some part of a build process
-actually requires that.
-
-Currently, this package only works using clang or the dragonegg plugin
-with gcc 4.5 (with the required patch for dragonegg).
+This project, WLLVM, provides tools for building whole-program (or
+whole-library) LLVM bitcode files from an unmodified C or C++
+source package. It currently runs on `*nix` platforms such as Linux,
+FreeBSD, and Mac OS X.
+
+WLLVM provides python-based compiler wrappers that work in two
+steps. The wrappers first invoke the compiler as normal. Then, for
+each object file, they call a bitcode compiler to produce LLVM
+bitcode. The wrappers also store the location of the generated bitcode
+file in a dedicated section of the object file. When object files are
+linked together, the contents of the dedicated sections are
+concatenated (so we don't lose the locations of any of the constituent
+bitcode files). After the build completes, one can use an WLLVM
+utility to read the contents of the dedicated section and link all of
+the bitcode into a single whole-program bitcode file. This utility
+works for both executable and native libraries.
+
+Currently, WLLVM works with either clang or the gcc dragonegg plugin.
+
+This two-phase build process is necessary to be a drop-in replacement
+for gcc or g++ in any build system. Using the LTO framework in gcc
+and the gold linker plugin works in many cases, but fails in the
+presence of static libraries in builds. WLLVM's approach has the
+distinct advantage of generating working binaries, in case some part
+of a build process requires that.
+

 Usage
 =====

-There are three environment variables that must be set to use this
-wrapper script:
+WLLVM includes two python executables: `wllvm` for compiling C code
+and `wllvm++` for C++, and an auxiliary tool `extract-bc`.
+
+Three environment variables must be set to use these wrappers:

-* `LLVM_COMPILER` should be set to 'dragonegg' or 'clang'.
+* `LLVM_COMPILER` should be set to either 'dragonegg' or 'clang'.
 * `LLVM_GCC_PREFIX` should be set to the prefix for the version of gcc that should
 be used with dragonegg. This can be empty if there is no prefix. This variable is
 not used if `$LLVM_COMPILER == clang`.
 * `LLVM_DRAGONEGG_PLUGIN` should be the full path to the dragonegg plugin. This
 variable is not used if `$LLVM_COMPILER == clang`.

-Once the environment is set up, just use wllvm and wllvm++ as your C
+Once the environment is set up, just use `wllvm` and `wllvm++` as your C
 and C++ compilers, respectively.

 In addition to the above environment variables the following can be optionally used:
@@ -58,13 +60,28 @@ In addition to the above environment variables the following can be optionally u
 variable.
 Example `LLVM_COMPILER_PATH=/home/user/llvm_and_clang/Debug+Asserts/bin`.

-* `WLLVM_CONFIGURE_ONLY` can be set to anything, when set `wllvm` and `wllvm++`
-will not carry out the second phase that involves the production of bitcode.
-This may prevent configuration errors being cause by the unexpected production
-of the hidden bitcode files.
+* `WLLVM_CONFIGURE_ONLY` can be set to anything. If it is set, `wllvm`
+and `wllvm++` behave like a normal C or C++ compiler. They do not
+the produce bitcode. Setting `WLLVM_CONFIGURE_ONLY` may prevent
+configuration errors caused by the unexpected production of hidden
+bitcode files.
+
+
+Building a bitcode module with clang
+====================================
+
+    export LLVM_COMPILER=clang

-Example building bitcode module
-===============================
+    tar xf pkg-config-0.26.tar.gz
+    cd pkg-config-0.26
+    CC=wllvm ./configure
+    make
+
+    # Produces pkg-config.bc
+    extract-bc pkg-config
+
+Building a bitcode module with dragonegg
+========================================

     export LLVM_COMPILER=dragonegg
     export LLVM_GCC_PREFIX=llvm-
@@ -78,8 +95,9 @@ Example building bitcode module
     # Produces pkg-config.bc
     extract-bc pkg-config

-Example building bitcode archive
-================================
+
+Building bitcode archive
+========================

     export LLVM_COMPILER=clang
     tar -xvf bullet-2.81-rev2613.tgz
@@ -91,20 +109,21 @@ Example building bitcode archive
     # Produces src/LinearMath/libLinearMath.bca
     extract-bc src/LinearMath/libLinearMath.a

-Example building an Operating System
-================================

-To see how to build freeBSD 10.0 from scratch check out the guide
-[here.](../master/README-freeBSD.md)
+Building an Operating System
+============================

+To see how to build freeBSD 10.0 from scratch check out this
+[guide.](../master/README-freeBSD.md)

-Example configuring without building bitcode
+
+Configuring without building bitcode
 ================================


     WLLVM_CONFIGURE_ONLY=1 CC=wllvm ./configure
     CC=wllvm make
-
+

 Debugging
 =========

driver/as

Lines changed: 2 additions & 5 deletions
@@ -53,18 +53,15 @@ except AttributeError as e:
     logging.error('Output file argument not found.\nException message: ' + str(e))
     sys.exit(1)

-bcfilename = '.{0}.bc'.format(filename)
-bcPath = os.path.join(dirs, bcfilename)
-fakeAssembler = [llvmAssembler, infile, '-o', bcPath]
+fakeAssembler = [llvmAssembler, infile, '-o', argFilter.outFileName]
+
 asmProc = Popen(fakeAssembler)
 realRet = asmProc.wait()

 if realRet != 0:
     logging.error('llvm-as failed')
     sys.exit(realRet)

-attachBitcodePathToObject(bcPath, argFilter.outFileName)
-
 sys.exit(realRet)

driver/utils.py

Lines changed: 28 additions & 19 deletions
@@ -32,8 +32,8 @@
 # Internal logger
 _logger = logging.getLogger(__name__)

-# Flag for debugging
-DEBUG = False
+# Flag for dumping
+DUMPING = False


 # This class applies filters to GCC argument lists. It has a few
@@ -179,11 +179,15 @@ def __init__(self, inputList, exactMatches={}, patternMatches={}):
             '-static' : (0, ArgumentListFilter.linkUnaryCallback),
             '-nostdlib' : (0, ArgumentListFilter.linkUnaryCallback),
             '-nodefaultlibs' : (0, ArgumentListFilter.linkUnaryCallback),
+            '-rdynamic' : (0, ArgumentListFilter.linkUnaryCallback),
             # darwin flags
             '-dynamiclib' : (0, ArgumentListFilter.linkUnaryCallback),
             '-current_version' : (1, ArgumentListFilter.linkBinaryCallback),
             '-compatibility_version' : (1, ArgumentListFilter.linkBinaryCallback),

+            # dragonegg mystery argument
+            '--64' : (0, ArgumentListFilter.compileUnaryCallback),
+
             #
             # BD: need to warn the darwin user that these flags will rain on their parade
             # (the Darwin ld is a bit single minded)
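
Review note: the table above maps each flag to a pair of an argument count and a callback. The sketch below illustrates how such a table could drive argument partitioning. It is not the `ArgumentListFilter` implementation: the function name `classify`, the string tags `'compile'` and `'link'`, and the reading of the first tuple element as "number of extra arguments consumed" are assumptions made for illustration. The fallback of unknown flags to the compile list mirrors the warning path visible later in this diff.

```python
def classify(args, exact_matches):
    """Split `args` into compile and link arguments using a flag table.

    `exact_matches` maps a flag to (arity, kind), where arity is assumed to be
    the number of extra arguments the flag consumes and kind is 'compile' or
    'link' (hypothetical stand-ins for the callbacks in the real table).
    """
    compile_args, link_args = [], []
    sinks = {'compile': compile_args, 'link': link_args}
    i = 0
    while i < len(args):
        flag = args[i]
        # Unrecognized flags are treated as unary compile flags.
        arity, kind = exact_matches.get(flag, (0, 'compile'))
        sinks[kind].extend(args[i:i + 1 + arity])
        i += 1 + arity
    return compile_args, link_args


# A three-entry excerpt of the table, using the flags touched by this commit.
FLAGS = {
    '-rdynamic': (0, 'link'),
    '-current_version': (1, 'link'),
    '--64': (0, 'compile'),
}

compile_args, link_args = classify(
    ['-rdynamic', '--64', '-current_version', '1.0', '-O2'], FLAGS)
print(compile_args)
print(link_args)
```

With this reading, `-rdynamic` lands in the link arguments and `--64` stays with the compile arguments, which matches the callbacks chosen in the hunk.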
@@ -278,7 +282,7 @@ def __init__(self, inputList, exactMatches={}, patternMatches={}):
                 _logger.warning('Did not recognize the compiler flag "{0}"'.format(currentItem))
                 self.compileUnaryCallback(currentItem)

-        if DEBUG:
+        if DUMPING:
             self.dump()

     def _shiftArgs(self, nargs):
@@ -296,7 +300,7 @@ def abortUnaryCallback(self, flag):
     def inputFileCallback(self, infile):
         _logger.debug('Input file: ' + infile)
         self.inputFiles.append(infile)
-        if re.search('\\.(s|S)', infile):
+        if re.search('\\.(s|S)$', infile):
             self.isAssembly = True

     def outputFileCallback(self, flag, filename):
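
Review note: the added `$` matters because `re.search` matches anywhere in the string, so the old pattern accepts any name that merely contains `.s` or `.S`. The sketch below contrasts the two patterns from this hunk. The helper `is_assembly` and the sample file names are hypothetical and chosen only for illustration.

```python
import re

OLD_PATTERN = '\\.(s|S)'    # pattern before this commit
NEW_PATTERN = '\\.(s|S)$'   # pattern after this commit


def is_assembly(pattern, infile):
    """Return True if `pattern` classifies `infile` as an assembly source."""
    return re.search(pattern, infile) is not None


# Hypothetical input names: two real assembly files and two false positives.
for name in ['boot.s', 'entry.S', 'libfoo.so', 'parser.scm.c']:
    print(name, is_assembly(OLD_PATTERN, name), is_assembly(NEW_PATTERN, name))
```

Under the old pattern, `libfoo.so` and `parser.scm.c` would both be flagged as assembly. The anchored pattern accepts only names that end in `.s` or `.S`.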
@@ -364,7 +368,7 @@ def getOutputFilename(self):

     # iam: returns a pair [objectFilename, bitcodeFilename] i.e .o and .bc.
     # the hidden flag determines whether the objectFile is hidden like the
-    # bitcodeFile is (starts with a '.'), use the DEBUG flag to get a sense
+    # bitcodeFile is (starts with a '.'), use the logging level & DUMPING flag to get a sense
     # of what is being written out.
     def getArtifactNames(self, srcFile, hidden=False):
         (srcpath, srcbase) = os.path.split(srcFile)
@@ -381,15 +385,15 @@ def getArtifactNames(self, srcFile, hidden=False):

     #iam: for printing our partitioning of the args
     def dump(self):
-        print("compileArgs: ", self.compileArgs)
-        print("inputFiles: ", self.inputFiles)
-        print("linkArgs: ", self.linkArgs)
-        print("objectFiles: ", self.objectFiles)
-        print("outputFilename: ", self.outputFilename)
+        _logger.debug('compileArgs: {0}'.format(self.compileArgs))
+        _logger.debug('inputFiles: {0}'.format(self.inputFiles))
+        _logger.debug('linkArgs: {0}'.format(self.linkArgs))
+        _logger.debug('objectFiles: {0}'.format(self.objectFiles))
+        _logger.debug('outputFilename: {0}'.format(self.outputFilename))
         for srcFile in self.inputFiles:
-            print("srcFile: ", srcFile)
+            _logger.debug('srcFile: {0}'.format(srcFile))
             (objFile, bcFile) = self.getArtifactNames(srcFile)
-            print("{0} ===> ({1}, {2})".format(srcFile, objFile, bcFile))
+            _logger.debug('{0} ===> ({1}, {2})'.format(srcFile, objFile, bcFile))
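
Review note: one consequence of moving `dump()` from `print` to `_logger.debug` is that its output now depends on the logging level as well as on the `DUMPING` flag. The sketch below demonstrates that behaviour with the standard `logging` module. The logger name `wllvm_sketch`, the trimmed `dump` function, and the helper `capture_dump` are hypothetical and are not part of `driver/utils.py`.

```python
import io
import logging

_logger = logging.getLogger('wllvm_sketch')  # hypothetical logger name


def dump(compile_args, input_files):
    """Trimmed stand-in for the dump() method shown in the hunk above."""
    _logger.debug('compileArgs: {0}'.format(compile_args))
    _logger.debug('inputFiles: {0}'.format(input_files))


def capture_dump(level):
    """Run dump() with the logger set to `level`; return the emitted text."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    _logger.addHandler(handler)
    _logger.setLevel(level)
    try:
        dump(['-O2'], ['main.c'])
    finally:
        _logger.removeHandler(handler)
    return stream.getvalue()


print(repr(capture_dump(logging.WARNING)))  # debug records are filtered out
print(capture_dump(logging.DEBUG))          # both records are emitted
```

At `WARNING` level the dump produces no output at all, so enabling `DUMPING` alone is not enough to see the partitioning.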
@@ -455,6 +459,7 @@ def attachBitcodePathToObject(bcPath, outFileName):
     # Don't try to attach a bitcode path to a binary. Unfortunately
     # that won't work.
     (root, ext) = os.path.splitext(outFileName)
+    _logger.debug('attachBitcodePathToObject: {0} ===> {1} [ext = {2}]\n'.format(bcPath, outFileName, ext))
     #iam: this also looks very dodgey; we need a more reliable way to do this:
     if ext not in ('.o', '.lo', '.os', '.So', '.po'):
         _logger.warning('Cannot attach bitcode path to "{0} of type {1}"'.format(outFileName, FileType.getFileType(outFileName)))
@@ -520,6 +525,12 @@ def __init__(self, cmd, isCxx, prefixPath=None):
         else:
             self.prefixPath = ''

+    #clang and drogonegg share the same taste in bitcode filenames.
+    def getBitcodeFileName(self, argFilter):
+        (dirs, baseFile) = os.path.split(argFilter.getOutputFilename())
+        bcfilename = os.path.join(dirs, '.{0}.bc'.format(baseFile))
+        return bcfilename
+
 class ClangBuilder(BuilderBase):
     def __init__(self, cmd, isCxx, prefixPath=None):
         super(ClangBuilder, self).__init__(cmd, isCxx, prefixPath)
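
Review note: `getBitcodeFileName`, now hoisted into `BuilderBase`, derives a hidden bitcode path that sits next to the output file. The standalone sketch below reproduces the same path arithmetic outside the class. The function name `hidden_bitcode_path` and the sample paths are hypothetical, and the results described afterwards assume a POSIX path separator.

```python
import os


def hidden_bitcode_path(output_filename):
    """Return the dot-prefixed .bc path that sits beside `output_filename`."""
    (dirs, base_file) = os.path.split(output_filename)
    return os.path.join(dirs, '.{0}.bc'.format(base_file))


# Hypothetical output names, as a compiler driver might receive via -o.
print(hidden_bitcode_path('build/src/main.o'))
print(hidden_bitcode_path('main.o'))
```

On a POSIX system, `build/src/main.o` maps to `build/src/.main.o.bc` and a bare `main.o` maps to `.main.o.bc`, so the bitcode stays hidden from ordinary directory listings and globbing in the build tree.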
@@ -537,11 +548,6 @@ def getCompiler(self):
     def getBitcodeArglistFilter(self):
         return ClangBitcodeArgumentListFilter(self.cmd)

-    def getBitcodeFileName(self, argFilter):
-        (dirs, baseFile) = os.path.split(argFilter.getOutputFilename())
-        bcfilename = os.path.join(dirs, '.{0}.bc'.format(baseFile))
-        return bcfilename
-
     def extraBitcodeArgs(self, argFilter):
         bcPath = self.getBitcodeFileName(argFilter)
         return ['-o', bcPath]
@@ -562,8 +568,9 @@ def getBitcodeCompiler(self):
         # We use '-B' to tell gcc where to look for an assembler.
         # When we build LLVM bitcode we do not want to use the GNU assembler,
         # instead we want gcc to use our own assembler (see driver/as).
-        return cc + ['-B', driverDir, '-fplugin={0}'.format(pth),
-                     '-fplugin-arg-dragonegg-emit-ir']
+        cmd = cc + ['-B', driverDir, '-fplugin={0}'.format(pth), '-fplugin-arg-dragonegg-emit-ir']
+        _logger.debug(cmd)
+        return cmd

     def getCompiler(self):
         pfx = ''
@@ -687,6 +694,7 @@ def buildBitcodeFile(builder, srcFile, bcFile):
     bcc.extend(af.compileArgs)
     bcc.extend(['-c', srcFile])
     bcc.extend(['-o', bcFile])
+    _logger.debug('buildBitcodeFile: {0}\n'.format(bcc))
     proc = Popen(bcc)
     rc = proc.wait()
     if rc != 0:
@@ -699,6 +707,7 @@ def buildObjectFile(builder, srcFile, objFile):
     cc.extend(af.compileArgs)
     cc.append(srcFile)
     cc.extend(['-c', '-o', objFile])
+    _logger.debug('buildObjectFile: {0}\n'.format(cc))
     proc = Popen(cc)
     rc = proc.wait()
     if rc != 0:

extract-bc

Lines changed: 2 additions & 2 deletions
@@ -344,7 +344,7 @@ def process_file_unix(inputFile, outputFile, llvmLinker, llvmArchiver):
     arCmd = ['ar', 'xv'] if verboseFlag else ['ar', 'x']
     ofileType = FileType.ELF_OBJECT

-    if ft == FileType.ELF_EXECUTABLE or ft == FileType.ELF_SHARED:
+    if ft == FileType.ELF_EXECUTABLE or ft == FileType.ELF_SHARED or ft == FileType.ELF_OBJECT:
         logging.info('Generating LLVM Bitcode module')
         return handleExecutable(inputFile, outputFile, extractor, llvmLinker)
     elif ft == FileType.ARCHIVE:
@@ -364,7 +364,7 @@ def process_file_darwin(inputFile, outputFile, llvmLinker, llvmArchiver):
     arCmd = ['ar', '-x', '-v'] if verboseFlag else ['ar', '-x']
     ofileType = FileType.MACH_OBJECT

-    if ft == FileType.MACH_EXECUTABLE or ft == FileType.MACH_SHARED:
+    if ft == FileType.MACH_EXECUTABLE or ft == FileType.MACH_SHARED or ft == FileType.MACH_OBJECT:
         logging.info('Generating LLVM Bitcode module')
         return handleExecutable(inputFile, outputFile, extractor, llvmLinker)
     elif ft == FileType.ARCHIVE:
