
Migrate integration tests to some Makefile/shell based format #2797

Closed
ezyang opened this issue Aug 27, 2015 · 39 comments
Comments

@ezyang
Contributor

ezyang commented Aug 27, 2015

Currently, the process for package tests in Cabal is something like:

  1. Create a new Haskell file and write a lot of boilerplate to call a bunch of Cabal/ghc-pkg commands
  2. Wire it up to the main test caller
  3. Call 'cabal test' and wait for the test suite to build
  4. Actually run the test suite

This really sucks, for a few reasons:

  1. It takes a lot of boilerplate to write the actual package test, way more than an equivalent shell script plus expected output to diff against would
  2. Under the most normal workflow, we have to rebuild the test suite every time we make a change (e.g. tweak a command). This makes for a pretty annoying compile-run cycle
  3. It's opaque: it's much harder to reverse engineer what commands are actually being run

The way GHC does these tests is that it just has a test runner which runs some shell scripts (really a Makefile) which run a test, and then diffs the output against some expected output. There's a bit of fuzz to make it all work out, but it is quite nice. Cabal should grow something similar; maybe that would encourage more tests! (Maybe there is a good out-of-the-box thing we can use here.)
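
To make the idea concrete, here is a minimal sketch of what such a runner could boil down to in shell -- the directory layout and behaviour are purely illustrative, not a proposed implementation:

#!/bin/sh
# Illustrative runner loop: execute each test script and, if an expected-output
# file sits next to it, diff the actual stdout against it (GHC-testsuite style).
for t in tests/IntegrationTests/*/should_run/*.sh; do
  actual=$(sh -e "$t") || { echo "FAIL (exit status): $t"; continue; }
  if [ -f "${t%.sh}.out" ]; then
    printf '%s\n' "$actual" | diff -u "${t%.sh}.out" - || echo "FAIL (output): $t"
  fi
done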

@BardurArantsson
Collaborator

+1, if you (or someone else) could make this happen it would be great.

I guess a feasible way to go here would be to put this new system in place and not bother converting the existing test cases? (I may be sufficiently motivated to do so because I may need to add some PackageTest cases soonish.)

EDIT: Removed a bit of nonsense; I wasn't really familiar enough with the way cabal-install's PackageTests currently work.

@BardurArantsson
Collaborator

Given that one of my "contributions" had to be reverted a few days ago (basically due to an error caused by a lack of automated tests), I think I'll take a stab at an initial implementation this weekend. @ezyang : Just to make sure we're not duplicating effort: are you actively working on this, or...?

@ezyang
Contributor Author

ezyang commented Sep 18, 2015

Nope, have at it!

@BardurArantsson
Collaborator

So, I think I have a basic test runner set up working now:

Using cabal: /home/bardur/wd/cabal/cabal-install/dist/build/cabal/cabal
Using ghc-pkg: /usr/bin/ghc-pkg
Current directory is: /home/bardur/wd/cabal/cabal-install
Running tests in category 'cat1'...
  Running should_run tests...
    Found 3 test case(s)
    Running "see_me_fail.sh"...
    Running "see_me_succeed.sh"...
    Running "should_fail_because_of_output.sh"...
  Running should_fail tests...
    Found 1 test case(s)
    Running "does_not_fail.sh"...
Running tests in category 'cat2'...
  Running should_run tests...
    Found 0 test case(s)
  Running should_fail tests...
    Found 1 test case(s)
    Running "fails_as_it_should.sh"...

The file layout is like this:

tests/IntegrationTests
tests/IntegrationTests/cat1
tests/IntegrationTests/cat1/should_run
tests/IntegrationTests/cat1/should_run/see_me_fail.sh
tests/IntegrationTests/cat1/should_run/see_me_succeed.sh
tests/IntegrationTests/cat1/should_run/see_me_succeed.out
tests/IntegrationTests/cat1/should_run/should_fail_because_of_output.sh
tests/IntegrationTests/cat1/should_run/should_fail_because_of_output.out
tests/IntegrationTests/cat1/should_fail
tests/IntegrationTests/cat1/should_fail/does_not_fail.sh
tests/IntegrationTests/cat2
tests/IntegrationTests/cat2/should_run
tests/IntegrationTests/cat2/should_fail
tests/IntegrationTests/cat2/should_fail/fails_as_it_should.sh

(I'll probably rename to something other than IntegrationTests, but you get the idea.)

Each of the shell scripts can have an accompanying ".out" and ".err" file alongside it which will be compared against the stdout/stderr of running the script.

(At the moment I'm not actually running the shell scripts nor comparing stdout/stderr, but that should be pretty easy to do -- I'm mostly just working on the overall setup wrt. locations of test cases/data/etc. at the moment.)

A few things to think about:

  • Should scripts be run using "sh -e" as a wrapper? This would mean that any commands in the scripts which fail (and whose exit status isn't explicitly ignored using "|| true") will automatically cause the test case to fail... but then what about the "should_fail" tests?
  • There needs to be some way of isolating tests and specifically having separate .cabal files for each test (if necessary), but it should probably also be possible to share .cabal files across multiple individual tests (shell scripts). I'm not sure how that should be done. Each test should perhaps also have its own separate working directory.
  • What kind of state do the tests need access to? Should we provide the location of the built "cabal" executable via $PATH (and risk them using the system-wide one, or perhaps stripping PATH of everything else) or via a $CABAL environment variable? What other paths will test cases want to know? Perhaps this is one that can be left up to future expansion -- save the location of the "cabal-install" executable.

Comments/thoughts?

@phadej
Collaborator

phadej commented Sep 19, 2015

For should fail tests:

sh-3.2$ ! true
sh-3.2$ echo $?
1
sh-3.2$ ! false
sh-3.2$ echo $?
0
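
In other words, a should-fail expectation can be expressed inside an "sh -e" script by negating the command whose failure you expect; a hypothetical sketch (the fixture directory and the way cabal is invoked are illustrative):

#!/bin/sh -e
# The script's exit status is that of its last command, so ending with a
# negated command makes the test pass exactly when 'cabal configure' fails.
cd broken-package     # hypothetical fixture directory
! cabal configure     # exit 0 only if 'cabal configure' fails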

@ezyang
Contributor Author

ezyang commented Sep 19, 2015

You can always crib the relevant design decisions from GHC's test suite:

  1. Make always stops on the first error
  2. This is something GHC does quite poorly, which has caused us some pain. I think it would be adequate if each Cabal call has a dist directory suffixed with the test name; no need for a fresh working directory!
  3. GHC does this with TEST_HC (which has the path to the GHC executable) and TEST_HC_OPTS (which is the options it should be called with.) I think this is pretty good.
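
By analogy, a Cabal test script would just pick up environment variables exported by the runner; a hedged sketch (variable names like CABAL and GHC_PKG mirror TEST_HC/TEST_HC_OPTS and are only illustrative here):

#!/bin/sh -e
# Hypothetical test body: the runner exports CABAL and GHC_PKG pointing at the
# freshly built executables; each call gets its own per-test dist directory.
"$CABAL" configure --builddir=dist-mytest
"$CABAL" build     --builddir=dist-mytest
"$GHC_PKG" list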

@BardurArantsson
Collaborator

Here's the work so far:

https://github.com/BardurArantsson/cabal/tree/gh-2797

or as a single commit:

BardurArantsson@5e2a05b

The code uses the usual "tasty" infrastructure, so we get color, timing information and failed/success counts for free.
For now, I've marked outstanding issues in the code with "XXX". Here are the main remaining issues:

  1. The current way of reading stdout/stderr is almost guaranteed to deadlock for non-trivial examples. This should be easily solvable by sparking off a couple of separate threads to perform the reading. (I wonder if it would be permissible to add async as a test dependency? I don't see any mention of it as a dependency in the .cabal files, but stack.yml mentions it, so presumably it's been there at some point.)
  2. Right now I'm calling /bin/sh, but how would that work on Windows? (Presumably the current PackageTests test suite is cross-platform by virtue of invoking only "cabal".) Is it reasonable to require MSYS or similar to be able to run the test suite? I see that the current PackageTests/Exec/Check.hs runs "bash", so something must be going on here...
  3. I'm providing a few environment variables so that "cabal" can be invoked properly (CABAL, CABAL_CONFIG_FILE, GHC_PKG, etc.), but I really want to make this simpler such that "cabal blah blah" is possible in the common case. I was thinking of maybe auto-generating a little wrapper shell script which defines a "cabal" shell function.
  4. The should_fail tests could be a bit error-prone as-is. I'm thinking that perhaps we should require that either .out or an .err file be provided (or both) for all test cases, just to make sure that stupid things like misspelled commands don't end up causing a successful test (because of the "-e" flag to "sh").
  5. Handling of the "dist" directory per @ezyang's comment: Yup, this is probably a good idea. Right now it's just set to "dist".

I still haven't rewritten any of the actual existing PackageTest tests, so it remains to be seen if I've forgotten anything crucial.

@BardurArantsson
Collaborator

Oh, yes, here's an example of the output (sans color):

Current directory is: /home/bardur/wd/cabal/cabal-install
Integration Tests
  cat1
    should_run
      see_me_fail.sh (ignoring stdout+stderr):                         FAIL
        Unexpected exit status: 1
      see_me_succeed.sh (ignoring stderr):                             OK
      should_fail_because_of_output.sh (ignoring stderr):              FAIL
        <stdout> did not match file '/home/bardur/wd/cabal/cabal-install/tests/IntegrationTests/cat1/should_run/should_fail_because_of_output.out'. Was: "unexpected\n"
    should_fail
      does_not_fail.sh (ignoring stdout+stderr):                       FAIL
        Unexpected exit status: 0
  cat2
    should_fail
      fails_as_it_should.sh (ignoring stdout+stderr):                  OK
      succeeds_which_should_fail_the_test.sh (ignoring stdout+stderr): FAIL
        Unexpected exit status: 0

4 out of 6 tests failed (0.01s)

(I auto-add the "ignoring ..." bit just to make sure that it's possible to see at a glance if the discovery code is finding the .out and .err files as expected. I'm not sure if we can make it even more in-your-face or if we need to.)

@ezyang
Contributor Author

ezyang commented Sep 21, 2015

  1. I don't see any problem with adding an async for the test suite
  2. GHC's test suite works on Windows by assuming that Make is available, since you can't dev GHC without MSYS. Honestly, if someone is dev'ing on Windows I think it's OK to assume msys. Make sure you test it there, I assume.
  3. I don't like a wrapper script. It's an extra point of indirection, and if I want to run the commands manually I have to know to do the wrapper script. GHC's test suite has fairly long command lines, but this makes it very explicit and easy to understand.

@BardurArantsson
Collaborator

I've pushed an update which now reads the sub-process stdout/stderr asynchronously.

Yeah, I'll just go with "bash" for the shell, I think. Since it's only a requirement for development, I guess the MSYS requirement isn't too onerous -- and as I mentioned it seems to be a de-facto requirement for the existing test suite due to its invocation of "bash".

About the wrapper script: I can see your point, but I should think it would be a good thing to avoid a) excess boilerplate, but perhaps more importantly b) accidentally calling the system-wide "cabal" executable -- it's easy to forget that you have to use "special syntax" in test suites when reviewing code, etc. Hence I'd want to shadow the name "cabal" in test scripts anyway -- perhaps just to provide an appropriate error message about using $CABAL $CABAL_ARGS instead. What do you think about the latter approach?
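
A minimal sketch of that shadowing idea (how the snippet gets into each test -- e.g. sourced by the runner -- is left open here):

# Hypothetical prelude sourced into every test script: make bare 'cabal'
# an error so tests cannot silently pick up the system-wide executable.
cabal() {
  echo "error: use \"\$CABAL \$CABAL_ARGS\" instead of plain 'cabal'" >&2
  return 1
}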

@BardurArantsson
Collaborator

Pushed a further little update:

  • Uses withAsync now for automatic cleanup in case of exceptions. (Not really a big deal, but I guess it can't hurt.)
  • Creates a working directory for each test case now and cleans it up after the test runs. The working directory is called ".work" relative to the test case directory. This is necessary because a) cabal is very stateful (fs-wise) and so we need a clean slate for each test case, and b) cabal tends to show paths in output, so we want relative paths for e.g. --config-file and CABAL_BUILDDIR.

Next up:

  • Use "bash" instead of "sh". (Not sure if I can use the one in $PATH simply, but if not, I think I might just use /bin/bash.)
  • Port a few tests over from PackageTests
  • (Maybe:) Implement a wrapper script which "bans" the "cabal" command by defining a "cabal" function which displays an error if called. (See previous comment for reasoning.)

I think that should about do it.

@BardurArantsson
Collaborator

Pushed a little update:

  • Migrated the "cabal freeze" tests to integration-test. Net gain in LoC is ~90 lines. (Not that that should be the driving force, but it's a nice little bonus.)

Please have a look and see what you think!

This has uncovered a new potential source of problems, namely "unpredictable" file system state created by cabal-install and the need to do proper cleanup in a simple and reliable way.

In the case of "cabal freeze" it's cabal.config, and when porting these test cases I have resorted to deleting "cabal.config" in a "common.sh" script which is sourced by each of the test cases at the start. It would perhaps be a lot nicer if we just copied all the test data to a (temporary) work directory and removed the directory afterwards. (Assuming we didn't want to keep the work directory around for debugging the test case.)
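
Concretely, that common.sh is little more than this (a sketch of the approach, not the exact file):

# common.sh -- sourced at the start of each "freeze" test case; removes state
# that a previous test case may have left behind in the shared directory.
rm -f cabal.config

Each test case then starts with ". ./common.sh" before invoking cabal.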

@ezyang
Contributor Author

ezyang commented Sep 23, 2015

You mean "loss" :)

Can't you just set the --config-file to specify a separate config file? Would prefer not to be slinging files to random temporary directories, prefer it to be deterministic so we can look at the state if we're debugging.

@ttuegel
Member

ttuegel commented Sep 23, 2015

Use "bash" instead of "sh". (Not sure if I can use the one in $PATH simply, but if not, I think I might just use /bin/bash.)

Please use either /bin/sh (required to exist by POSIX, but might not be bash) or /usr/bin/env bash (required to exist by LSB, but also present on all other Unices) as /bin/bash does not exist on all systems, even if bash is installed.
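
That is, when bash really is needed, the portable way to request it is:

#!/usr/bin/env bash
# 'env' locates bash via $PATH instead of assuming it lives at /bin/bash.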

@ezyang
Contributor Author

ezyang commented Sep 23, 2015

I think it might just be better to use Make, in this case?

@ttuegel
Member

ttuegel commented Sep 23, 2015

I think it might just be better to use Make, in this case?

That would probably be ideal; we wouldn't have to worry about the shell interpreter at all. I just wanted to make sure that if we do use a shell, we call it in a standards-compliant way. :)

@BardurArantsson
Collaborator

@ttuegel: Yes, the plan was to use "/usr/bin/env" with "bash". AFAIUI the "sh" emulation in "bash" isn't nearly strict enough to ensure that people would only write valid "sh" code (i.e. bash accepts many bash-isms even in "sh mode"), so it's unrealistic to expect people to write proper "sh" code anyway.

@ezyang: I don't understand why you'd want to involve "make" here. Make just calls out to the system shell to execute commands, and I don't see what "make" buys us. (Plus, it's a dependency which I'm pretty sure you don't need to have to hack on Cabal/cabal-install, but I could be mistaken about that. "bash" may also not be strictly necessary, but as of today it's already effectively required and also generally massively more likely to be included in a "base" install of e.g. MSYS, though I don't know for sure if it's actually required to be installed with MSYS.)

(@ezyang: I'll respond to your other comment separately. Don't want a wall of text -- rather keep the "threads" separate. :))

@ezyang
Contributor Author

ezyang commented Sep 24, 2015

@BardurArantsson Based on a recent discussion on this very thing here: https://phabricator.haskell.org/D1267 GHC doesn't have a bash dependency because they don't want to force BSD users to install bash to run the test suite.

...Honestly, I think the biggest reason to use make is so you can put multiple tests in one file. Helps reduce overhead of making tests.

@BardurArantsson
Collaborator

@ezyang: "cabal.config" is just an example of a larger issue (see below). Btw, I'm already giving cabal the "--config-file" argument, but "cabal freeze" is actually creating a file named "cabal.config" regardless. Could this actually be a bug in "cabal freeze"?

I understand and share your concerns about debuggability, and I certainly want to have the option of preserving whatever files are generated by the tests, etc. -- perhaps doing so by default if a test fails. (And pointing the developer towards them in test output, etc.) Ideally, it should also be possible to re-run only that specific test -- I'm not sure if "tasty" has support for that out of the box, but I'd definitely want that as a feature.

However, there's a real danger here of accidentally introducing dependencies between tests in the same "group" simply by forgetting to do proper cleanup. (This is exacerbated by the need to have each test preemptively clean up everything that any of the other tests might have left around. In short, this is very fragile and error-prone.) Forgetting to do cleanup has already happened once during my simple port of the "freeze" tests and it took me a good while to realize what was going on. I've also seen it happen repeatedly in other largish systems with similar characteristics. When you have this much ambient state floating around it's effectively impossible, in practice, to always remember to do proper cleanup.

Trust me, you really don't want to end up in a situation where you have these accidental dependencies between tests because it makes tests incredibly fragile and frustrating to work on[2]... a situation we're trying to avoid as much as possible with this work.

The only realistic way of mitigating this issue that I've found works somewhat in practice is to forcibly randomize the order in which tests run[1]. However, this also carries a real risk of becoming enormously frustrating since it's not guaranteed to root out dependencies, so you may end up having to fix an issue that somebody else introduced (which, by sheer luck, happened to survive their own testing and the pre-merge tests). It happens a lot less frequently, but it does happen from time to time.

I still haven't come up with a great solution for this issue, so this is all just tossing ideas around. I think my really ideal solution would be something like a layered virtual file system where the tests are all running in their directory with read-only access to everything, but where all writes transparently go into a separate isolated file system layered on top (at the OS layer, so no "escaping" the sandbox). That, however, isn't practical for a multitude of reasons, so we'll have to come up with something else.

Btw, calling the moral equivalent of cp -a "magic" is a bit hyperbolic, isn't it? ;)

As usual, any ideas around this welcome. I think it's a problem that needs a solution, but I also don't want to let 'perfect' be the enemy of 'good', so...

(Sorry about the wall of text. Let me know if anything's unclear in the above.)

[1] Given that getDirectoryContents doesn't guarantee any ordering (even though it's usually stable on a given system), you'd want to explicitly sort or shuffle anyway.

[2] Yeah, I was often "the cleanup guy" in these situations.

@BardurArantsson
Collaborator

I don't disagree with using "sh" in principle, I just see this as a practical matter since those of us on Linux (the vast majority, probably?) or OS X are probably guaranteed to accidentally introduce bashisms? (Is the /bin/sh on OS X actually a real sh? Anybody know?)

@ezyang
Contributor Author

ezyang commented Sep 24, 2015

So, with certain driver tests in GHC, this can be a problem, but most of us run the test suite in parallel (e.g. with 12 tests running at the same time) so we tend to notice when things clobber each other. Yes, occasionally I have to fix someone clobbering something, but it's not that frequent.

I also object to the claim that there is "too much ambient state". If this is true, then our application is designed poorly and we should fix it. (This is confounded slightly by the fact that you've been working on cabal-install tests and not Cabal tests; I think things should be a lot better for Cabal.) Like, if it is THAT hard to convince cabal-install to not rely on some unexpected state, cabal-install is broken and should be fixed.

But it seems like the easy answer to this problem is to just force each test case to live in its own directory?

@BardurArantsson
Collaborator

Cabal isn't an executable, so I don't understand how you'd do this kind of testing using shell scripts (or make or whatever) for that :).

Oh, and cabal-install is designed poorly... or rather it's extremely ad-hoc since it has basically evolved to where it is now in starts and fits by a lot of different people with no coherent vision. That doesn't tend to produce well-factored designs. :/

EDIT: Almost everything of any consequence happens in IO, for example. What I wouldn't give for a Free/Operational monad version of cabal-install :).

Like, if it is THAT hard to convince cabal-install to not rely on some unexpected state, cabal-install is broken and should be fixed.

Changing the test framework to make tests more reliable and easier to write is -- in my conception of the world, at least -- a prerequisite for being able to fix the problem without introducing new ones.

Getting rid of non-relocatable state in cabal-install is a highly non-trivial undertaking.

But it seems like the easy answer to this problem is to just force each test case to live in its own directory?

Yup, I considered that too, but that leads to duplication of the .cabal files that a lot of the test cases share. I suppose that could be worked around by dropping symlinks in there, but then I'd be worried about Windows support[1].

[1] But perhaps it's a non-issue on Windows with MSYS? I don't know.

@BardurArantsson
Collaborator

Perhaps separate directories with symlinks for .cabal files is the 80% solution that can get us to where we want to be. I'll try a little experiment later to see what the other downsides are, if any.

@ezyang
Contributor Author

ezyang commented Sep 24, 2015

Yeah, I suspected your real problem is that it's really annoying to set up Cabal projects.

Perhaps we should define some format for easily specifying Cabal projects in a compact way. Could be something as simple as:

foo.cabal
  name: blah
Foo.hs
  module Foo where

etc. Should make it less unpleasant to do this?

@ezyang
Contributor Author

ezyang commented Sep 24, 2015

BTW, it is awesome that you are working on this, and would really like to see even an imperfect version of this go in. Will make dev experience a lot better.

@BardurArantsson
Collaborator

So, I've been converting a few more test cases... and I'm not sure having ".out" and ".err" files is actually worth it. It sounds like a good idea to standardize it, but I find myself having to do "blah > /dev/null" quite a lot. Checking for the presence of some particular text in the output is mostly also pretty easy: In "should_run" tests you just do "blah && grep foo". (This changes a bit if you want to check for multiple things on different lines, etc. but...). In particular, the output from auto-configure also necessitates always doing an explicit "cabal configure > /dev/null" before actually running the command(s) of interest.
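
To illustrate the style (a hypothetical "freeze" test, assuming the runner exports $CABAL and runs scripts under "sh -e"):

#!/bin/sh -e
# Silence the chatty configure step, then check the state we actually care about.
"$CABAL" configure > /dev/null
"$CABAL" freeze > /dev/null
# 'cabal freeze' writes cabal.config; assert that it pins at least 'base'.
grep -q 'base ==' cabal.config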

I probably won't get around to trying the "directory-per-test-case" thing today, but I'll try to get to it tomorrow. However, I'm currently quite confident that it'll be... cumbersome. (Turns out some tests share a lot more than just a .cabal file... so more symlinks would be necessary. I think it'll be pretty annoying to have to maintain manually.)

EDIT: As an example the "exec" tests share "my.cabal", "My.hs", "Foo.hs" and a "subdir" directory (which is empty at the start, not sure if it's actually necessary since I haven't converted all the test cases yet).

@BardurArantsson
Collaborator

Oh, and a further observation: it seems it takes roughly 0.60s at a bare minimum to execute each test case. This surprised me -- I'm not sure if it's purely "cabal" start-up overhead or what it is, but I think some work may be needed to bring it down significantly. A slow-running test suite is always a pain.

@ezyang
Contributor Author

ezyang commented Sep 24, 2015

Re out/err, it's interesting that in the GHC test suite (which has Cabal tests), Setup (which is how we test the Cabal library) is usually called with -v0, and then only the output of a few commands that query what the resulting system looks like is actually checked. This might be a good fit? I think we want to avoid lots of grepping... sometimes it's necessary but it's pretty annoying, and then you have to manually update tests if output changes.

If directory-per-test-case is going to be cumbersome, let's not do it. I'm OK with the original plan of copying things to a separate directory and running them, as long as it's obvious from the test runner how this setup is done.

Re length: parallelization should help, right? I think that is a must have feature.

@BardurArantsson
Collaborator

-v0: Might be workable -- I'll have to try it out on a few test cases, but I could imagine the auto-configure thing being a bit of an issue here in that you probably want to selectively switch off the configure output when auto-configure kicks in. Either way, we can just leave the .out and .err support code in -- it doesn't actually do anything unless you explicitly create .out and .err files, so it's not really harmful as such. (EDIT: And you can always combine usage of "> /dev/null" or "-v0" on carefully selected commands with usage of .out/.err.)

Yeah, parallelization should help. I'm not seeing particularly high CPU usage (~10% of one CPU on an i7-4470K), so the time taken per test doesn't seem to be CPU bound, at least. I'm not sure if we can get parallel test runs from tasty, but I'd certainly hope so. (Of course this is predicated on being able to isolate test cases sufficiently from one another, but I think the "copy + run somewhere else" thing should be able to achieve sufficient isolation for that.)

@ezyang
Contributor Author

ezyang commented Sep 24, 2015

I dunno, I don't feel like I ever really care about the output of configure. 🤷

@BardurArantsson
Collaborator

Indeed not :). Other than perhaps when testing "configure" :).

TODO list to self:

  1. Investigate parallelization. (To speed up further development.)
  2. Investigate how hard "copy+run" would be and if it has any non-trivial performance impact.
  3. Convert remaining test cases. Only MultipleSource left to go! :)

@BardurArantsson
Collaborator

Pushed the converted "exec" tests just now.

@BardurArantsson
Collaborator

Alright, parallelism is built into tasty so it just needed a +RTS -N -RTS to be supplied as an option to the test program on the command line. Of course that fails miserably at the moment since different instances of cabal will stomp all over each other's data, but that will hopefully be fixed by "copy-before-running". :)
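
For reference, that amounts to invoking the test binary roughly like this (the path and binary name are whatever your build produced; shown here only as an example):

# Run the tasty-based integration tests on all cores via the threaded RTS:
./dist/build/integration-tests/integration-tests +RTS -N -RTS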

I'm not sure if we can do that by default, though, since the ARM build disables the "-threaded" flag (for good reasons, I presume). Either way it's not that big a deal since running in parallel is a much bigger win if you keep re-running the test suite, and for that you can easily supply the RTS flags directly on the command line.

Btw, time to run went from 14s to <4s, but that's probably unfair since I'm not sure if the tests actually ran to completion (likely that at least some of them didn't.)

Will implement "copy+run" next to see if I can get parallel runs to be stable and repeatable. (Will probably also add randomization on test order to try to prevent accidental dependencies between test cases.)

@BardurArantsson
Collaborator

Just a little note to self: As was mentioned on #2664, line endings may be a problem for "err" and "out" files.

@BardurArantsson
Collaborator

Alright, I've pushed what I think could be considered Release Candidate 1 of this. See https://github.com/BardurArantsson/cabal/tree/gh-2797.

There are still a couple of minor issues marked with XXX, but nothing major. One is the use of the 'temporary' package -- this is just a bit of laziness on my part since Distribution.Compat.TempFile.createTempDirectory is all that's needed, but it's currently not an "exposed" module. However, the comments in the TempFile.hs source lead me to think that the best approach here might actually be to replace the module by adding a dependency on "temporary" instead. Any thoughts on this?

Running in parallel, running all the tests takes 5.5 seconds on my machine. All previous PackageTests have been converted to the new test framework -- and I think this validates this approach. There's a lot less boilerplate and it's much easier to see exactly what commands get run.

The overall strategy for running the tests is now this:

  • Create a temporary directory somewhere under where getTemporaryDirectory says the system's temporary directory is (e.g. /tmp on Linux).
  • If a test fails, the temporary directory is left in place, just to help debugging.
  • If a test succeeds, the temporary directory is removed automatically.
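
In shell terms, the per-test lifecycle is roughly the following (illustration only -- the real runner does this from Haskell, and the paths/names here are made up):

#!/bin/sh -e
# Copy a test case to a fresh temporary directory, run it there, and only
# clean up on success so a failing test leaves its state behind for debugging.
src="tests/IntegrationTests/freeze"
work=$(mktemp -d "${TMPDIR:-/tmp}/cabal-test.XXXXXX")
cp -R "$src"/. "$work"
if (cd "$work" && sh -e ./some_test.sh); then
  rm -rf "$work"
else
  echo "test failed; working directory kept at $work" >&2
  exit 1
fi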

Review and comments would be very welcome. Hopefully, I'll be able to submit a final pull request this weekend.

@grayjay
Collaborator

grayjay commented Oct 2, 2015

I tested it in parallel on Windows. Everything passed after I fixed the four issues that I encountered with the previous version (#2664 (comment)). I really like how quickly I can modify and run the tests.

@ezyang
Contributor Author

ezyang commented Oct 2, 2015

Ship it!!!

@BardurArantsson
Collaborator

Pushed RC2 (I guess) with a few updates. Still not quite sure about "sh" vs. "/bin/sh" and need to handle the Unix/Windows line endings somehow.

@BardurArantsson
Collaborator

Pull request submitted: #2864

Let's see what Travis says...

BardurArantsson added a commit to BardurArantsson/cabal that referenced this issue Oct 11, 2015
BardurArantsson added a commit to BardurArantsson/cabal that referenced this issue Oct 11, 2015
23Skidoo added a commit that referenced this issue Oct 15, 2015
Migrate integration tests to shell based format, fixed travis