PermissionDenied exception related to MoveFileEx on Windows #1904

Open
SiriusStarr opened this issue Jul 8, 2020 · 8 comments

@SiriusStarr
Collaborator

(Opening an issue here rather than keeping it on Slack.)

Windows builds of a project using Dhall (in Haskell) have been failing seemingly at random on GitHub's CI. Sometimes the tests succeed normally; sometimes they fail, always with the error below. I've been unable to reproduce the problem on an actual Windows computer, where the whole test suite passes.

uncaught exception: IOException of type PermissionDenied
C:\Users\runneradmin\AppData\Local\dhall-haskell\ato5C23.write: renameFile:renamePath:MoveFileEx "\\\\?\\C:\\Users\\runneradmin\\AppData\\Local\\dhall-haskell\\ato5C23.write" Just "\\\\?\\C:\\Users\\runneradmin\\AppData\\Local\\dhall-haskell\\122026a29e0113646fb623fba2a6657b31b99127b689d510ef6761df7dd49da8a5bb": permission denied (Access is denied.)

The hex values change, e.g. this was another error thrown on a different build:

uncaught exception: IOException of type PermissionDenied
C:\Users\runneradmin\AppData\Local\dhall-haskell\atoCD11.write: renameFile:renamePath:MoveFileEx "\\\\?\\C:\\Users\\runneradmin\\AppData\\Local\\dhall-haskell\\atoCD11.write" Just "\\\\?\\C:\\Users\\runneradmin\\AppData\\Local\\dhall-haskell\\12205b43b1207f0c5f69e80a94bf78d52a2b2189b2658a70652cb805001ced08b5ae": permission denied (Access is denied.)

The failing test simply imports a Dhall expression from a file (using Dhall.inputFile) and compares it to an expected Haskell value (i.e. no file manipulation occurs beyond whatever Dhall.inputFile does). It is always the very first import in a test run that fails; all others seem to succeed.

Turning off hspec's parallel spec evaluation just for this first spec seems to fix the error, even though parallel spec evaluation can be on for any number of other specs that also include Dhall.inputFile. So this may be a non-issue, but it still seems peculiar, since it seems to be Windows-specific.

@SiriusStarr added the bug and CI (Continuous Integration) labels and removed the CI (Continuous Integration) label on Jul 8, 2020
@sjakobi
Collaborator

sjakobi commented Jul 8, 2020

Thanks for the report @SiriusStarr! :)

For reference, it's this code that writes files to the "semi-semantic" dhall-haskell cache:

writeToSemisemanticCache :: Dhall.Crypto.SHA256Digest -> Data.ByteString.ByteString -> IO ()
writeToSemisemanticCache semisemanticHash bytes = do
    _ <- Maybe.runMaybeT $ do
        cacheFile <- getCacheFile "dhall-haskell" semisemanticHash
        liftIO (AtomicWrite.Binary.atomicWriteFile cacheFile bytes)
    return ()

The use of atomic-write here was introduced in #1544.

This is the atomicWriteFile function that we use:

http://hackage.haskell.org/package/atomic-write-0.2.0.7/docs/System-AtomicWrite-Writer-ByteString-Binary.html#v:atomicWriteFile

And this is the renameFile function used there:

http://hackage.haskell.org/package/directory-1.3.6.1/docs/System-Directory.html#v:renameFile

I admittedly don't have a huge amount of trust in the atomic-write code – I think it's possible that dhall is one of very few projects using atomic-write on Windows. atomic-write doesn't appear to have CI for Windows, for instance.


@jneira As our Windows expert, would you have a recommendation for how to tackle this? :)

@Gabriella439
Collaborator

My suspicion is that the atomic-write logic might not be concurrency-safe on Windows. Based on the error message, I'm guessing that two parallel dhall interpreters were trying to write out the same cache file at the same time. If that hypothesis is true, then it should be possible to narrow this down to a minimal reproducing example on Windows without using dhall.
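A dhall-free reproducer along those lines might look like the following. This is only a sketch under assumptions: the directory name, file names, payload size and writer count are made up for illustration, the helpers `writeViaRename` and `raceRenames` are hypothetical, and the code mirrors only the general write-then-rename shape rather than atomic-write's actual implementation. Whether it triggers the failure on a given Windows machine is not guaranteed.

```haskell
module Main (main) where

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (IOException, try)
import Control.Monad (forM)
import System.Directory (createDirectoryIfMissing, renameFile)
import System.FilePath ((</>))

-- Write to a private temporary file, then rename it over the shared target.
-- Hypothetical helper: it imitates the write-then-rename shape only.
writeViaRename :: FilePath -> Int -> IO ()
writeViaRename dir n = do
  let tmp = dir </> ("tmp" ++ show n ++ ".write")
  writeFile tmp (replicate 100000 'x')
  renameFile tmp (dir </> "target")

-- Start `count` writers at once and collect each writer's outcome,
-- so a failed rename shows up as a Left instead of killing its thread.
raceRenames :: FilePath -> Int -> IO [Either IOException ()]
raceRenames dir count = do
  createDirectoryIfMissing True dir
  dones <- forM [1 .. count] $ \n -> do
    done <- newEmptyMVar
    _ <- forkIO (try (writeViaRename dir n) >>= putMVar done)
    return done
  mapM takeMVar dones

main :: IO ()
main = raceRenames "rename-race" 8 >>= mapM_ print
```

To give the writers a chance to overlap, the program would presumably need to be built with -threaded and run with +RTS -N. Deleting the rename-race directory between runs keeps each attempt comparable.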

@Gabriella439
Collaborator

@SiriusStarr: Also, I suspect the reason this only affects the first test is that it's the one that populates the cache.

@sjakobi
Collaborator

sjakobi commented Jul 11, 2020

I just noticed this bit from the atomic-write package description:

Atomically write to a file on POSIX-compliant systems while preserving permissions.

That sounds as if the package was never meant to guarantee atomicity on Windows. I must have missed that when I picked it to address #1540.

@Gabriella439
Collaborator

Gabriella439 commented Jul 12, 2020

The underlying primitive that atomic-write needs to be atomic is System.Directory.renameFile.

On a POSIX system it's a wrapper around System.Posix.rename and on a Windows system it's a wrapper around System.Win32.moveFileEx. So one possible explanation is that System.Posix.rename is atomic while System.Win32.moveFileEx is not.
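To make the role of renameFile concrete, here is a minimal sketch of the write-to-a-temporary-file-then-rename pattern. It is an illustration under assumptions, not atomic-write's actual code: `writeThenRename`, the temp-file template and the example file name are all hypothetical.

```haskell
module Main (main) where

import qualified Data.ByteString as BS
import System.Directory (renameFile)
import System.FilePath (takeDirectory)
import System.IO (hClose, openBinaryTempFile)

-- Write the bytes to a fresh temporary file in the target's directory,
-- then rename it onto the target. Readers only ever see the old file or
-- the complete new one, provided the rename itself is atomic.
writeThenRename :: FilePath -> BS.ByteString -> IO ()
writeThenRename target bytes = do
  (tmp, h) <- openBinaryTempFile (takeDirectory target) "ato.write"
  BS.hPut h bytes
  hClose h
  renameFile tmp target -- the step whose atomicity is in question

main :: IO ()
main = do
  -- Hypothetical file name, chosen only for this example.
  writeThenRename "example-cache-entry" (BS.pack [1, 2, 3])
  BS.readFile "example-cache-entry" >>= print . BS.unpack
```

The temporary file is created next to the target so that the rename stays within one directory. Everything before the final line of writeThenRename touches only a private file, which is why the rename is the only step that has to be atomic.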

Either way, it seems like an issue that would need to be fixed upstream, in either the directory package or the Win32 package.

@sjakobi
Collaborator

sjakobi commented Jul 12, 2020

> Either way, it seems like an issue that would need to be fixed upstream, in either the directory package or the Win32 package

I guess we should try to make a bug report then. We could also try creating a workaround in dhall, possibly by using a file lock.
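As a rough illustration of the workaround idea, the sketch below serializes the write-then-rename step. Note the substitution: it uses an in-process MVar rather than a real file lock, so it only covers threads inside one process (such as parallel hspec specs) and not separate dhall processes. `withCacheLock`, `lockedWrite` and the directory name are hypothetical and not part of dhall.

```haskell
module Main (main) where

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (MVar, newEmptyMVar, newMVar, putMVar, takeMVar, withMVar)
import Control.Exception (IOException, try)
import Control.Monad (forM)
import System.Directory (createDirectoryIfMissing, renameFile)
import System.FilePath ((</>))

-- Run an action while holding the lock, so only one thread at a time
-- performs the write-then-rename step. Hypothetical helper.
withCacheLock :: MVar () -> IO a -> IO a
withCacheLock lock action = withMVar lock (\() -> action)

-- Same write-then-rename shape as a cache write, serialized by the lock.
lockedWrite :: MVar () -> FilePath -> Int -> IO ()
lockedWrite lock dir n = withCacheLock lock $ do
  let tmp = dir </> ("tmp" ++ show n ++ ".write")
  writeFile tmp "cached bytes"
  renameFile tmp (dir </> "target")

main :: IO ()
main = do
  let dir = "locked-cache"
  createDirectoryIfMissing True dir
  lock <- newMVar ()
  dones <- forM [1 .. 8 :: Int] $ \n -> do
    done <- newEmptyMVar
    _ <- forkIO (try (lockedWrite lock dir n) >>= putMVar done)
    return done
  results <- mapM takeMVar dones
  mapM_ (print :: Either IOException () -> IO ()) results
```

A genuine file lock would be needed to protect against several processes sharing the same cache directory; this sketch trades that coverage for simplicity.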

In any case, it would be good to have a proper reproducer for the issue. @SiriusStarr Could you possibly help us with that?

@SiriusStarr
Collaborator Author

Okay, after fidgeting around, I've been able to reproduce it pretty simply on my own Windows box, rather than in CI.

Requirements:

  • Delete %localappdata%\dhall and %localappdata%\dhall-haskell between runs. The error only occurs on the first run, when the cache is populated (which is why it was showing up in CI and not on my own computer).

Minimal Main.hs (using hspec for easy parallelism):

{-# LANGUAGE OverloadedStrings #-}

module Main (main) where

import qualified Dhall as D
import Test.Hspec

main :: IO ()
main = hspec $ parallel $ do
  it "import 1" $ D.input D.auto "let B = https://prelude.dhall-lang.org/Bool/package.dhall in B.not True" `shouldReturn` False
  it "import 2" $ D.input D.auto "let B = https://prelude.dhall-lang.org/Bool/package.dhall in B.not False" `shouldReturn` True

To reproduce: stack run --compiler=ghc-8.8.2 (8.8.3 is borked on Windows due to a bug that will be fixed in 8.8.4.)

import 1
import 2 FAILED [1]

Failures:

  src\Main.hs:12:3:
  1) import 2
       uncaught exception: IOException of type PermissionDenied
       C:\Users\Username\AppData\Local\dhall-haskell\ato3EE7.write: renameFile:renamePath:MoveFileEx "\\\\?\\C:\\Users\\Username\\AppData\\Local\\dhall-haskell\\ato3EE7.write" Just "\\\\?\\C:\\Users\\Username\\AppData\\Local\\dhall-haskell\\1220262d2dcb718ae7f37b6ce6142fb0aa73b714802582809d20ad49d8e4627f35ff": permission denied (Access is denied.)

  To rerun use: --match "/import 2/"

Randomized with seed 74949068

Finished in 0.6375 seconds
2 examples, 1 failure

stack.yaml:

resolver: lts-16.0

packages:
- .

package.yaml (Important: note the ghc-options; threading/parallelism must be turned on, or the error doesn't occur.):

name: atomic-write-err
version: 0.1.0.0

dependencies:
  - base >= 4.7 && < 5
  - dhall >= 1.32
  - hspec >= 2.7.1

executables:
  atomic-write-err:
    main: Main.hs
    source-dirs: src
    ghc-options:
      - -threaded
      - -rtsopts
      - -with-rtsopts=-N

@Gabriella439
Collaborator

I opened an issue here: haskell/directory#109
