Commit 0459efc

test/bench suite improvements
* 'rustc' tests now download and set (only in the directory) the right version of the compiler and tell you if that version is getting outdated.
* Benchmarks look for a `.benchignore` file in the sample-sources directory. If they find one, the files listed in it are skipped by benchmarks.
* Added a benchmarks README and updated the tests README
* Changed some folder names, moved some stuff
* Added a shell-script for populating 'sample-sources' (and tweaked .gitignore to be useful here)
* Fixed some buggy comparison code in rustc-tests
* Fixed mods to differentiate at the AST level between `mod foo;` and `mod foo { }`
* Fixed some parser issues:
  - Float literals with underscores in them
  - Comments starting with `////` or `/***` are _not_ doc comments
  - Named arguments are now different than general arguments
  - Paths allow the right segments

There is one outstanding issue: `union?` (so `union` as an identifier) fails to parse.
1 parent aac6846 commit 0459efc

27 files changed: +440 -169 lines changed

.gitignore

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+# Stack related files
+.stack-work/
+stack.yaml
+
+# Benchmark output folders
+bench/allocations/
+bench/timings/
+
+# Sample source files
+sample-sources/
+!sample-sources/attributes.rs
+!sample-sources/empty.rs
+!sample-sources/expressions.rs
+!sample-sources/items.rs
+!sample-sources/let.rs
+!sample-sources/literals.rs
+!sample-sources/macros.rs
+!sample-sources/patterns.rs
+!sample-sources/precedences.rs
+!sample-sources/statement-expressions.rs
+!sample-sources/statements.rs
+!sample-sources/types.rs

aa.rs

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+mod foo {}
+

bench/README.md

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+We have two types of benchmarks. If you are using `stack` you can run them with
+
+```
+$ stack bench                           # runs all benchmarks
+$ stack bench :allocation-benchmarks    # runs allocation benchmarks only (faster)
+$ stack bench :timing-benchmarks        # runs timing benchmarks only (slower)
+```
+
+## `allocation-benchmarks`
+
+Benchmark how much memory the runtime allocates when parsing the files inside the
+`sample-sources` directory at the project root. Resulting information is stored in a JSON file in
+the `allocations` folder (automatically created in this directory).
+
+## `timing-benchmarks`
+
+Benchmark how long it takes to parse the files inside the `sample-sources` directory. Resulting
+information is stored in a JSON file in the `timings` folder (automatically created in this
+directory).
+
+# Tools
+
+Since some of these tests take a while, you can add a `.benchignore` file in `sample-sources` which
+lists files to skip for benchmarking (one file name per line).
+
+There is also a `bench.py` utility in this directory which lets you compare benchmarks across
+different commits. It relies on the JSON files in `allocations` and `timings`, so you will have to
+check out and run the benchmarks on the commits you want to compare against (to generate the
+corresponding JSON files).
+
+```
+$ ./bench.py --folder allocations    # compare the last several commits for allocations
+$ ./bench.py --folder timings        # compare the last several commits for timings
+```
+
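
The `.benchignore` behaviour described in the README can be sketched in Python. This is only an illustration of the idea, not the benchmark code itself (which is Haskell): the names `load_ignored` and `benchmark_files` are hypothetical, and two choices here are assumptions rather than documented behaviour, namely that blank lines and surrounding whitespace are ignored and that `.benchignore` itself is never benchmarked.

```python
from pathlib import Path
from typing import List, Set


def load_ignored(sample_dir: Path) -> Set[str]:
    """Return the file names listed in .benchignore (one per line), if it exists."""
    ignore_file = sample_dir / ".benchignore"
    if not ignore_file.is_file():
        return set()
    # Assumption: blank lines and surrounding whitespace are not significant.
    lines = ignore_file.read_text().splitlines()
    return {line.strip() for line in lines if line.strip()}


def benchmark_files(sample_dir: Path) -> List[Path]:
    """List the regular files in sample_dir that are not ignored."""
    ignored = load_ignored(sample_dir)
    # Assumption: the ignore file itself is never a benchmark input.
    ignored.add(".benchignore")
    return sorted(
        entry
        for entry in sample_dir.iterdir()
        if entry.is_file() and entry.name not in ignored
    )
```

Calling `benchmark_files(Path("sample-sources"))` would then return every sample file except those named in the ignore file.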

benchmarks/allocation-benchmarks/Main.hs renamed to bench/allocation-benchmarks/Main.hs

Lines changed: 9 additions & 4 deletions
@@ -32,11 +32,16 @@ main = do
   -- Get the test cases
   workingDirectory <- getCurrentDirectory
   let sampleSources = workingDirectory </> "sample-sources"
+      benchIgnore = sampleSources </> ".benchignore"
+  benchIgnoreExists <- doesFileExist benchIgnore
+  ignore <- if benchIgnoreExists
+              then (\f -> map (sampleSources </>) (lines f)) <$> readFile benchIgnore
+              else pure []
   entries <- map (sampleSources </>) <$> listDirectory sampleSources
-  files <- filterM doesFileExist entries
+  files <- filterM doesFileExist (filter (`notElem` ignore) entries)
 
   -- Clear out previous WIP (if there is one)
-  catch (removeFile (workingDirectory </> "allocations" </> "WIP" <.> "json"))
+  catch (removeFile (workingDirectory </> "bench" </> "allocations" </> "WIP" <.> "json"))
         (\e -> if isDoesNotExistError e then pure () else throwIO e)
 
   -- Run 'weigh' tests
@@ -57,8 +62,8 @@ main = do
     ]
 
   -- Save the output to JSON
-  createDirectoryIfMissing False (workingDirectory </> "allocations")
-  let logFile = workingDirectory </> "allocations" </> logFileName <.> "json"
+  createDirectoryIfMissing False (workingDirectory </> "bench" </> "allocations")
+  let logFile = workingDirectory </> "bench" </> "allocations" </> logFileName <.> "json"
   putStrLn $ "writing results to: " ++ logFile
   logFile `BL.writeFile` encode results
 

bench.py renamed to bench/bench.py

Lines changed: 44 additions & 2 deletions
@@ -58,12 +58,55 @@ def flattenListDict(d, indent=0):
 
 # Currently not used...
 def fmtSize(num):
+    """format a number of bytes on disk into a human readable form"""
     for unit in ['','KB','MB','GB','TB','PB','EB','ZB']:
         if abs(num) < 1024.0:
             return "%3.1f%s" % (num, unit)
         num /= 1024.0
     return "%.1f%s" % (num, 'YB')
 
+def revParse(commit, useAbbreviated=False):
+    """get the hash for a commit"""
+    abbreviated = subprocess.run(
+        ["git", "rev-parse", "--abbrev-ref", commit],
+        stdout=subprocess.PIPE,
+        check=True
+    ).stdout.decode("utf8").strip()
+
+    other = subprocess.run(
+        ["git", "rev-parse", commit],
+        stdout=subprocess.PIPE,
+        check=True
+    ).stdout.decode("utf8").strip()
+
+    return (useAbbreviated and abbreviated) or other
+
+# Run benchmarks for a commit
+def runBenchmarks(commit):
+    """temporarily check out the given commit to run the benchmarks"""
+
+    print("Running benchmarks for '" + commit + "'")
+    commit = revParse(commit)
+    print('\033[31m' + "Do not make any changes to files!" + '\033[0m')
+    init = revParse("HEAD")
+
+    localChanges = b"" != subprocess.run(
+        ["git", "status", "--porcelain", "--untracked-files=no"],  # empty output: nothing to stash
+        stdout=subprocess.PIPE
+    ).stdout.strip()
+
+    if localChanges:
+        subprocess.run(["git", "stash"], stdout=subprocess.PIPE)
+
+    subprocess.run(["git", "checkout", commit])
+    subprocess.run(["stack", "bench"])
+    subprocess.run(["git", "checkout", init])
+
+    if localChanges:
+        subprocess.run(["git", "stash", "pop"], stdout=subprocess.PIPE)
+
+    print('\033[32m' + "Back to initial state" + '\033[0m')
+
 
 if __name__ == "__main__":
     # Argument parser
@@ -84,8 +127,7 @@ def fmtSize(num):
     sanitized = ["WIP"]
     for commit in commits[1:]:
         try:
-            c = subprocess.check_output(["git", "rev-parse", commit]).decode("utf-8").strip()
-            sanitized.append(c)
+            sanitized.append(revParse(commit))
         except:
             print('Invalid commit "' + commit + '"')
 
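
`runBenchmarks` stashes local changes before switching commits, so it needs a reliable way to tell whether there is anything to stash. A minimal sketch of one such check follows, assuming the machine-readable output of `git status --porcelain` is used; the helper names `has_stashable_changes` and `working_tree_is_dirty` are hypothetical and not part of `bench.py`. Untracked (`??`) and ignored (`!!`) entries are skipped because a plain `git stash` leaves them alone.

```python
import subprocess


def has_stashable_changes(porcelain_output: str) -> bool:
    """Return True if porcelain status output lists a tracked change."""
    for line in porcelain_output.splitlines():
        if not line:
            continue
        # The first two characters are the XY status code of the entry.
        status = line[:2]
        if status not in ("??", "!!"):
            return True
    return False


def working_tree_is_dirty() -> bool:
    """Ask git for the current status and check for tracked changes."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        stdout=subprocess.PIPE,
        check=True,
    )
    return has_stashable_changes(result.stdout.decode("utf8"))
```

Keeping the parsing separate from the `git` call means the decision logic can be checked without a repository.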

benchmarks/timing-benchmarks/Main.hs renamed to bench/timing-benchmarks/Main.hs

Lines changed: 15 additions & 7 deletions
@@ -2,11 +2,12 @@
 
 import Criterion
 import Criterion.Main (defaultConfig)
-import Criterion.Types (anMean, reportAnalysis, timeLimit, anOutlierVar, ovEffect, OutlierEffect(Severe))
+import Criterion.Types (anMean, reportAnalysis, timeLimit, anOutlierVar, ovEffect, OutlierEffect(Moderate))
 import Statistics.Resampling.Bootstrap (Estimate(..))
 
 import Control.Monad (filterM)
 import Control.Exception (catch, throwIO)
+import Data.Foldable (for_)
 import Data.Traversable (for)
 import GHC.Exts (fromString)
 
@@ -32,32 +33,39 @@ main = do
   -- Get the test cases
   workingDirectory <- getCurrentDirectory
   let sampleSources = workingDirectory </> "sample-sources"
+      benchIgnore = sampleSources </> ".benchignore"
+  benchIgnoreExists <- doesFileExist benchIgnore
+  ignore <- if benchIgnoreExists
+              then (\f -> map (sampleSources </>) (lines f)) <$> readFile benchIgnore
+              else pure []
   entries <- map (sampleSources </>) <$> listDirectory sampleSources
-  files <- filterM doesFileExist entries
+  files <- filterM doesFileExist (filter (`notElem` ignore) entries)
 
   -- Clear out previous WIP (if there is one)
-  catch (removeFile (workingDirectory </> "timings" </> "WIP" <.> "json"))
+  catch (removeFile (workingDirectory </> "bench" </> "timings" </> "WIP" <.> "json"))
         (\e -> if isDoesNotExistError e then pure () else throwIO e)
 
   -- Run 'criterion' tests
   reports <- for files $ \f -> do
     let name = takeFileName f
     putStrLn name
     is <- readInputStream f
-    bnch <- benchmarkWith' defaultConfig{ timeLimit = 15 } (nf (parse' @(SourceFile Span)) is)
+    bnch <- benchmarkWith' defaultConfig{ timeLimit = 20 } (nf (parse' @(SourceFile Span)) is)
     pure (name, bnch)
   let results = object [ fromString name .= object [ "mean" .= m
                                                    , "lower bound" .= l
                                                    , "upper bound" .= u
                                                    ]
                        | (name,report) <- reports
                        , let Estimate m l u _ = anMean (reportAnalysis report)
-                       , ovEffect (anOutlierVar (reportAnalysis report)) /= Severe
+                       , ovEffect (anOutlierVar (reportAnalysis report)) < Moderate
                        ]
+  for_ [ name | (name,report) <- reports, ovEffect (anOutlierVar (reportAnalysis report)) >= Moderate ] $ \n ->
+    putStrLn $ "Benchmark for `" ++ n ++ "' will not be considered since it was inflated"
 
   -- Save the output to JSON
-  createDirectoryIfMissing False (workingDirectory </> "timings")
-  let logFile = workingDirectory </> "timings" </> logFileName <.> "json"
+  createDirectoryIfMissing False (workingDirectory </> "bench" </> "timings")
+  let logFile = workingDirectory </> "bench" </> "timings" </> logFileName <.> "json"
   putStrLn $ "writing results to: " ++ logFile
   logFile `BL.writeFile` encode results
 
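
The change above tightens the outlier filter: timings are now kept only when the outlier effect is strictly below `Moderate`, and every dropped benchmark is reported. The same partitioning can be sketched in Python. The enum below is an assumption that mirrors the ordering `Unaffected < Slight < Moderate < Severe` (only `Moderate` and `Severe` appear in this diff), and `partition_reports` is a hypothetical helper, not part of the benchmark suite.

```python
from enum import IntEnum
from typing import Dict, List, Tuple


class OutlierEffect(IntEnum):
    """Assumed ordering of outlier effect levels, mildest first."""
    UNAFFECTED = 0
    SLIGHT = 1
    MODERATE = 2
    SEVERE = 3


def partition_reports(
    reports: Dict[str, Tuple[float, OutlierEffect]]
) -> Tuple[Dict[str, float], List[str]]:
    """Split reports into kept mean timings and names of dropped benchmarks."""
    kept: Dict[str, float] = {}
    dropped: List[str] = []
    for name, (mean, effect) in reports.items():
        if effect < OutlierEffect.MODERATE:
            kept[name] = mean
        else:
            dropped.append(name)
    return kept, dropped


if __name__ == "__main__":
    sample = {
        "items.rs": (0.012, OutlierEffect.SLIGHT),
        "macros.rs": (0.034, OutlierEffect.SEVERE),
    }
    kept, dropped = partition_reports(sample)
    for name in dropped:
        print("Benchmark for `" + name + "' will not be considered since it was inflated")
```

Using an ordered comparison instead of a single inequality means `Moderate` results are excluded as well as `Severe` ones.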

get-rust-sources.sh

Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
+#!/bin/bash
+
+# Usage info
+if ! [ $# = 1 ]
+then
+  echo "This script gets all of the (> 1000 LOC) source files in repositories"
+  echo "under 'rust-lang' and 'rust-lang-nursery' organizations"
+  echo ""
+  echo "Expected usage:"
+  echo "  $0 <destination-folder>"
+  echo ""
+  echo "You probably want to run:"
+  echo "  $0 sample-sources"
+  exit 1
+else
+  DEST="$1"
+fi
+
+# Work inside a temporary directory
+TEMP=temp
+mkdir $TEMP
+cd $TEMP
+
+# Get the JSON files
+curl https://api.github.com/orgs/rust-lang/repos > rust-lang.json
+curl https://api.github.com/orgs/rust-lang-nursery/repos > rust-lang-nursery.json
+
+# Make one big JSON array of repos and extract the name and clone url
+(jq -rs '.[0] + .[1] | .[] | (.name, .clone_url)' rust-lang.json rust-lang-nursery.json \
+) | while read -r REPO_NAME; read -r REPO_CLONE; do
+
+  # Skip 'multirust-rs-binaries' and 'rustc-timing-archive' in particular
+  if [ "$REPO_NAME" = "multirust-rs-binaries" ] || [ "$REPO_NAME" = "rustc-timing-archive" ]
+  then
+    continue
+  fi
+
+  # Do a shallow clone of the repo
+  echo "Cloning $REPO_NAME at $REPO_CLONE"
+  git clone --depth=1 "$REPO_CLONE"
+
+  # Find all rust files in the repo and copy each of these files to the DEST folder, provided they
+  # are more than 1000 lines long. The 1000 line stipulation serves several purposes: to
+  # provide files whose parsing time is non-trivial and also source files which are expected to
+  # compile.
+  echo "Finding rust files in $REPO_NAME"
+  find "$REPO_NAME" -type f -name '*.rs' | while read -r FILE; do
+
+    # Escaped file name
+    DEST_FILE="../$DEST/${FILE//\//|}"
+
+    # Check the file is longer than 1000 lines
+    if (( 1000 < $(wc -l < "$FILE") ))
+    then
+      cp "$FILE" "$DEST_FILE"
+    fi
+
+  done;
+
+  # Delete the cloned repo
+  rm -rf "$REPO_NAME"
+
+done;
+
+# Clean up
+cd ..
+rm -rf $TEMP
+
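
The per-file step of the script (flatten the path into a single file name, then copy only files longer than the threshold) can be sketched in Python for clarity. This is an illustration under stated assumptions, not a replacement for the script: `escaped_name`, `count_lines`, and `should_copy` are hypothetical names, and the threshold of 1000 follows the check in the script body.

```python
from pathlib import Path

MIN_LINES = 1000  # threshold taken from the script's line-count check


def escaped_name(relative_path: str) -> str:
    """Flatten a repo-relative path into one file name by replacing '/' with '|'."""
    return relative_path.replace("/", "|")


def count_lines(path: Path) -> int:
    """Count lines in a file (a final unterminated line is counted too, unlike wc -l)."""
    with path.open("rb") as handle:
        return sum(1 for _ in handle)


def should_copy(path: Path, min_lines: int = MIN_LINES) -> bool:
    """Return True when the file has strictly more than min_lines lines."""
    return count_lines(path) > min_lines
```

For example, `escaped_name("cargo/src/lib.rs")` gives `cargo|src|lib.rs`, which is how files from different repositories avoid colliding in the destination folder.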

language-rust.cabal

Lines changed: 5 additions & 4 deletions
@@ -78,7 +78,7 @@ library
 
 
 test-suite unit-tests
-  hs-source-dirs: tests/unit-tests
+  hs-source-dirs: test/unit-tests
   ghc-options: -Wall
   main-is: Main.hs
   other-modules: LexerTest
@@ -95,7 +95,7 @@ test-suite unit-tests
     , language-rust
 
 test-suite rustc-tests
-  hs-source-dirs: tests/rustc-tests
+  hs-source-dirs: test/rustc-tests
   ghc-options: -Wall
   main-is: Main.hs
   other-modules: Diff
@@ -113,9 +113,10 @@ test-suite rustc-tests
     , text >=1.2.0
     , unordered-containers >= 0.2.7
     , language-rust
+    , time >=1.2.0.0
 
 benchmark timing-benchmarks
-  hs-source-dirs: benchmarks/timing-benchmarks
+  hs-source-dirs: bench/timing-benchmarks
   ghc-options: -Wall
   main-is: Main.hs
   type: exitcode-stdio-1.0
@@ -131,7 +132,7 @@ benchmark timing-benchmarks
     , aeson >= 1.0.0.0
 
 benchmark allocation-benchmarks
-  hs-source-dirs: benchmarks/allocation-benchmarks
+  hs-source-dirs: bench/allocation-benchmarks
   ghc-options: -Wall
   main-is: Main.hs
   type: exitcode-stdio-1.0

src/Language/Rust/Parser.hs

Lines changed: 3 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -23,7 +23,7 @@ sourceFile :: SourceFile Span
 
 module Language.Rust.Parser (
   -- * Parsing
-  parse, parse', parseSourceFile', Parse(..), P, execParser, initPos, Span,
+  parse, parse', readSourceFile, Parse(..), P, execParser, initPos, Span,
   -- * Lexing
   lexToken, lexNonSpace, lexTokens, translateLit,
   -- * Input stream
@@ -56,8 +56,8 @@ parse' is = case execParser parser is initPos of
   Right x -> x
 
 -- | Given a path pointing to a Rust source file, read that file and parse it into a 'SourceFile'
-parseSourceFile' :: FilePath -> IO (SourceFile Span)
-parseSourceFile' fileName = parse' <$> readInputStream fileName
+readSourceFile :: FilePath -> IO (SourceFile Span)
+readSourceFile fileName = parse' <$> readInputStream fileName
 
 -- | Exceptions that occur during parsing
 data ParseFail = ParseFail Position String deriving (Eq, Typeable)
@@ -74,7 +74,6 @@ class Parse a where
 
 instance Parse (Lit Span) where parser = parseLit
 instance Parse (Attribute Span) where parser = parseAttr
-instance Parse (Arg Span) where parser = parseArg
 instance Parse (Ty Span) where parser = parseTy
 instance Parse (Pat Span) where parser = parsePat
 instance Parse (Expr Span) where parser = parseExpr
