Github Action for Buildbot Builders #5

Closed
wants to merge 52 commits into from
61637dd
ARROW-6652: [Python] Fix ChunkedArray.to_pandas to retain timezone
jorisvandenbossche Sep 24, 2019
5a918ce
ARROW-3777: [C++] Add Slow input streams and slow filesystem
pitrou Sep 24, 2019
c6faaed
ARROW-6669: [Rust] [DataFusion] Implement binary expression for physi…
andygrove Sep 24, 2019
b780c56
ARROW-6187: [C++] Fallback to storage type when writing ExtensionType…
jorisvandenbossche Sep 24, 2019
2c7fb24
ARROW-6629: [Doc] [C++] Add filesystem docs
pitrou Sep 24, 2019
a89c803
ARROW-6649: [R] print methods for Array, ChunkedArray, Table, RecordB…
nealrichardson Sep 25, 2019
232cde0
ARROW-6674: [Python] Fix or ignore the test warnings
jorisvandenbossche Sep 25, 2019
199d3cf
ARROW-6158: [C++/Python] Validate child array types with type fields …
jorisvandenbossche Sep 25, 2019
4fe330a
ARROW-6678: [C++][Parquet] Binary data stored in Parquet metadata mus…
wesm Sep 25, 2019
7f2d637
ARROW-6089: [Rust] [DataFusion] Implement physical plan for "selecti…
andygrove Sep 25, 2019
502865d
ARROW-6667: [Python] remove cyclical object references in pyarrow.par…
AaronOpfer Sep 25, 2019
511c089
ARROW-6677: [FlightRPC][C++] Document Flight in C++
lidavidm Sep 25, 2019
d4dcfa9
ARROW-6675: [JS] Add scanReverse function to dataFrame and filteredDa…
mmaclach Sep 25, 2019
0d0e4cc
ARROW-6622: [R] Normalize paths for filesystem API on Windows
nealrichardson Sep 25, 2019
37b6c20
ARROW-6086: [Rust] [DataFusion] Add support for partitioned Parquet d…
andygrove Sep 25, 2019
883d9eb
ARROW-6679: [RELEASE] Add license info for the autobrew scripts
nealrichardson Sep 25, 2019
196face
ARROW-6630: [Doc] Document C++ file formats
pitrou Sep 25, 2019
6dec194
ARROW-6472: [Java] ValueVector#accept may has potential cast exception
tianchen92 Sep 25, 2019
07ab508
ARROW-4218: [Rust][Parquet] Initial support for array reader.
liurenjie1024 Sep 26, 2019
f39d2c2
ARROW-6687: [Rust] [DataFusion] Bug fix in DataFusion Parquet reader
andygrove Sep 26, 2019
5b4a08f
ARROW-6705: [Rust] [DataFusion] README has invalid github URL
alippai Sep 26, 2019
a75a602
ARROW-6703: [Packaging][Linux] Restore ARROW_VERSION environment vari…
kou Sep 26, 2019
2dc020c
ARROW-6709: [JAVA] Jdbc adapter currentIndex should increment when va…
tianchen92 Sep 26, 2019
dec0cfb
ARROW-6606: [C++] Add PathTree tree structure
fsaintjacques Sep 26, 2019
df2791c
ARROW-6683: [Python] Test for fastparquet <-> pyarrow cross-compatibi…
jorisvandenbossche Sep 26, 2019
46a14db
ARROW-6429: [Integration] Adding patch to fix Spark compilation for I…
BryanCutler Sep 26, 2019
fa92fae
ARROW-6716: [Rust] Bump nightly to nightly-2019-09-25 to fix CI
andygrove Sep 27, 2019
cf9df14
ARROW-6532 [R] write_parquet() uses writer properties (general and ar…
romainfrancois Sep 27, 2019
cf3990e
ARROW-6701: [C++][R] Lint failing on R cpp code
nealrichardson Sep 27, 2019
7fb6b75
ARROW-6714: [R] Fix untested RecordBatchWriter case
nealrichardson Sep 27, 2019
a476bee
Update main.yml
kszucs Sep 28, 2019
b1cdbe7
use checkout master branch [skip ci]
kszucs Sep 28, 2019
85bbb05
repository [skip ci]
kszucs Sep 28, 2019
a0f8aa4
ref [skip ci]
kszucs Sep 28, 2019
36e97b1
plural heads [skip ci]
kszucs Sep 28, 2019
019eff9
check python3 [skip ci]
kszucs Sep 28, 2019
a8ba09f
install ursabot [skip ci]
kszucs Sep 28, 2019
e4aea33
disable other workflows [skip ci]
kszucs Sep 28, 2019
739d177
list directories; use relative path [skip ci]
kszucs Sep 28, 2019
537155e
pwd [skip ci]
kszucs Sep 28, 2019
f27e583
ls [skip ci]
kszucs Sep 28, 2019
12f047f
env [skip ci]
kszucs Sep 28, 2019
e66ef0e
parent dir [skip ci]
kszucs Sep 28, 2019
ea49a40
from pip [skip ci]
kszucs Sep 28, 2019
98e71b6
ensure python [skip ci]
kszucs Sep 28, 2019
c1a41e4
run rust builder [skip ci]
kszucs Sep 28, 2019
246ddb0
repo [skip ci]
kszucs Sep 28, 2019
6ac3f77
matrix [skip ci]
kszucs Sep 28, 2019
9de8a8c
add the rest of the builders [skip ci]
kszucs Sep 28, 2019
a709192
rename [skip ci]
kszucs Sep 28, 2019
9d68c44
enable msvc [skip ci]
kszucs Sep 28, 2019
0d48f98
build titles [skip ci]
kszucs Sep 28, 2019
54 changes: 54 additions & 0 deletions .github/workflows/buildbot.yml
@@ -0,0 +1,54 @@
name: Buildbot

on: [push]

jobs:
  build:
    name: Build
    runs-on: ubuntu-latest

    strategy:
      matrix:
        builder:
          - "AMD64 Debian 9 Rust 1.35"
          - "AMD64 Debian 9 Go 1.11.11"
          - "AMD64 Debian 9 Go 1.12.6"
          - "AMD64 Conda C++"
          - "AMD64 Conda Python 2.7"
          - "AMD64 Conda Python 3.6"
          - "AMD64 Conda Python 3.7"
          - "AMD64 Conda R"
          - "AMD64 Debian 9 NodeJS 11"
          - "AMD64 Java OpenJDK 11"
          - "AMD64 Java OpenJDK 8"
          - "AMD64 Ubuntu 18.04 C GLib"
          - "AMD64 Ubuntu 18.04 C++"
          - "AMD64 Ubuntu 18.04 Python 3"
          - "AMD64 Ubuntu 18.04 R"

    steps:

      - name: Checkout Ursabot
        uses: actions/checkout@v1
        with:
          repository: ursa-labs/ursabot
          ref: refs/heads/master
          path: ursabot
      - name: Install Python
        uses: actions/setup-python@v1
        with:
          python-version: '3.7'
      - name: Install Ursabot
        run: pip install -e ../ursabot
      - name: Check Ursabot Command
        run: ursabot --help

      - name: Run ${{ matrix.builder }} Builder
        run: |
          cd ../ursabot/projects/arrow
          ursabot project build \
            --repo https://github.com/$GITHUB_REPOSITORY \
            --branch $GITHUB_REF \
            --commit $GITHUB_SHA \
            "${{ matrix.builder }}"

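Each matrix entry in the workflow above turns into one `ursabot project build` invocation. The following is a minimal Python sketch of how that command line is composed from the GitHub-provided environment variables. The helper names `build_command` and `render` are hypothetical and exist only for illustration; the flags are the ones that appear in the workflow file, and nothing here is part of ursabot itself.

```python
import shlex


def build_command(builder, env):
    """Compose the argv that the final workflow step runs for one builder.

    `env` is any mapping holding GITHUB_REPOSITORY, GITHUB_REF and GITHUB_SHA
    (hypothetical helper; os.environ would be passed on a real runner).
    """
    return [
        "ursabot", "project", "build",
        "--repo", "https://github.com/{}".format(env["GITHUB_REPOSITORY"]),
        "--branch", env["GITHUB_REF"],
        "--commit", env["GITHUB_SHA"],
        builder,  # builder names contain spaces, so they stay one argument
    ]


def render(argv):
    """Render argv as a shell-safe string, quoting arguments with spaces."""
    return " ".join(shlex.quote(arg) for arg in argv)
```

Keeping the builder name as a single list element mirrors the quoting of `"${{ matrix.builder }}"` in the workflow, which is what prevents a name such as `AMD64 Conda C++` from being split into three arguments.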
32 changes: 30 additions & 2 deletions LICENSE.txt
@@ -1841,8 +1841,8 @@ This project includes code from the autobrew project.
* r/tools/autobrew and dev/tasks/homebrew-formulae/autobrew/apache-arrow.rb
are based on code from the autobrew project.

- Copyright: Copyright (c) 2017 - 2019, Jeroen Ooms.
- All rights reserved.
+ Copyright (c) 2019, Jeroen Ooms
+ License: MIT
Homepage: https://github.com/jeroen/autobrew

--------------------------------------------------------------------------------
@@ -1874,3 +1874,31 @@ SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

----------------------------------------------------------------------

cpp/src/arrow/vendored/base64.cpp has the following license

ZLIB License

Copyright (C) 2004-2017 René Nyffenegger

This source code is provided 'as-is', without any express or implied
warranty. In no event will the author be held liable for any damages arising
from the use of this software.

Permission is granted to anyone to use this software for any purpose, including
commercial applications, and to alter it and redistribute it freely, subject to
the following restrictions:

1. The origin of this source code must not be misrepresented; you must not
claim that you wrote the original source code. If you use this source code
in a product, an acknowledgment in the product documentation would be
appreciated but is not required.

2. Altered source versions must be plainly marked as such, and must not be
misrepresented as being the original source code.

3. This notice may not be removed or altered from any source distribution.

René Nyffenegger [email protected]
2 changes: 1 addition & 1 deletion ci/travis_script_python.sh
@@ -226,7 +226,7 @@ if [ "$ARROW_TRAVIS_PYTHON_DOCS" == "1" ]; then
doxygen
popd
cd ../docs
- sphinx-build -q -b html -d _build/doctrees -W source _build/html
+ sphinx-build -q -b html -d _build/doctrees -W --keep-going source _build/html
fi

popd # $ARROW_PYTHON_DIR
2 changes: 2 additions & 0 deletions cpp/apidoc/Doxyfile
@@ -2074,7 +2074,9 @@ INCLUDE_FILE_PATTERNS =

PREDEFINED = __attribute__(x)= \
__declspec(x)= \
PARQUET_EXPORT= \
ARROW_EXPORT= \
ARROW_FLIGHT_EXPORT= \
ARROW_EXTERN_TEMPLATE= \
ARROW_DEPRECATED(x)=

17 changes: 8 additions & 9 deletions cpp/build-support/run_clang_format.py
@@ -28,12 +28,10 @@

# examine the output of clang-format and if changes are
# present assemble a (unified)patch of the difference
- def _check_one_file(completed_processes, filename):
+ def _check_one_file(filename, formatted):
with open(filename, "rb") as reader:
original = reader.read()

- returncode, stdout, stderr = completed_processes[filename]
- formatted = stdout
if formatted != original:
# Run the equivalent of diff -u
diff = list(difflib.unified_diff(
@@ -106,20 +104,21 @@ def _check_one_file(completed_processes, filename):
[arguments.clang_format_binary, filename]
for filename in formatted_filenames
], stdout=PIPE, stderr=PIPE)
- for returncode, stdout, stderr in results:
+
+ checker_args = []
+ for filename, res in zip(formatted_filenames, results):
# if any clang-format reported a parse error, bubble it
+ returncode, stdout, stderr = res
if returncode != 0:
print(stderr)
sys.exit(returncode)
+ checker_args.append((filename, stdout))
+
error = False
- checker = partial(_check_one_file, {
- filename: result
- for filename, result in zip(formatted_filenames, results)
- })
pool = mp.Pool()
try:
# check the output from each invocation of clang-format in parallel
- for filename, diff in pool.imap(checker, formatted_filenames):
+ for filename, diff in pool.starmap(_check_one_file, checker_args):
if not arguments.quiet:
print("Checking {}".format(filename))
if diff:
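
The change above replaces a `partial` bound to a dictionary of all results with a list of per-file argument tuples passed through `starmap`, so each worker receives only the data for its own file. The following is a minimal, self-contained sketch of that pattern, not the script itself. It assumes in-memory strings instead of files on disk, uses hypothetical names `check_one` and `check_all`, and swaps in `multiprocessing.pool.ThreadPool` (which shares the `Pool` API) so that it runs the same way on every platform.

```python
import difflib
from multiprocessing.pool import ThreadPool


def check_one(name, original, formatted):
    """Return (name, diff_lines); diff_lines is empty when nothing changed."""
    if formatted == original:
        return name, []
    # Equivalent of `diff -u` between the original and the formatted text
    diff = list(difflib.unified_diff(
        original.splitlines(keepends=True),
        formatted.splitlines(keepends=True),
        fromfile=name,
        tofile="{} (after clang format)".format(name),
    ))
    return name, diff


def check_all(jobs):
    """Check every (name, original, formatted) tuple in `jobs` in parallel."""
    with ThreadPool(2) as pool:
        # starmap unpacks each tuple into the positional arguments of check_one
        return pool.starmap(check_one, jobs)
```

Results come back in the same order as `jobs`, so the caller can print a "Checking ..." line per file and report any non-empty diff, much as the loop in the script does.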
3 changes: 3 additions & 0 deletions cpp/src/arrow/CMakeLists.txt
@@ -117,6 +117,7 @@ set(ARROW_SRCS
filesystem/filesystem.cc
filesystem/localfs.cc
filesystem/mockfs.cc
filesystem/path_tree.cc
filesystem/path_util.cc
filesystem/util_internal.cc
io/buffered.cc
@@ -127,6 +128,7 @@ set(ARROW_SRCS
io/interfaces.cc
io/memory.cc
io/readahead.cc
io/slow.cc
testing/util.cc
util/basic_decimal.cc
util/bit_util.cc
@@ -144,6 +146,7 @@ set(ARROW_SRCS
util/thread_pool.cc
util/trie.cc
util/utf8.cc
vendored/base64.cpp
vendored/datetime/tz.cpp)

# Add dependencies for third-party allocators.
10 changes: 9 additions & 1 deletion cpp/src/arrow/array.cc
@@ -1234,6 +1234,7 @@ struct ValidateVisitor {
}

Status Visit(const StructArray& array) {
const auto& struct_type = checked_cast<const StructType&>(*array.type());
if (array.num_fields() > 0) {
// Validate fields
int64_t array_length = array.field(0)->length();
@@ -1245,10 +1246,17 @@ struct ValidateVisitor {
it->type()->ToString(), " at position [", idx, "]");
}

auto it_type = struct_type.child(i)->type();
if (!it->type()->Equals(it_type)) {
return Status::Invalid("Child array at position [", idx,
"] does not match type field: ", it->type()->ToString(),
" vs ", it_type->ToString());
}

const Status child_valid = it->Validate();
if (!child_valid.ok()) {
return Status::Invalid("Child array invalid: ", child_valid.ToString(),
- " at position [", idx, "}");
+ " at position [", idx, "]");
}
++idx;
}
77 changes: 43 additions & 34 deletions cpp/src/arrow/csv/options.h
@@ -32,82 +32,91 @@ class DataType;

namespace csv {

+ // Silly workaround for https://github.com/michaeljones/breathe/issues/453
+ constexpr char kDefaultEscapeChar = '\\';
+
struct ARROW_EXPORT ParseOptions {
// Parsing options

- // Field delimiter
+ /// Field delimiter
char delimiter = ',';
- // Whether quoting is used
+ /// Whether quoting is used
bool quoting = true;
- // Quoting character (if `quoting` is true)
+ /// Quoting character (if `quoting` is true)
char quote_char = '"';
- // Whether a quote inside a value is double-quoted
+ /// Whether a quote inside a value is double-quoted
bool double_quote = true;
- // Whether escaping is used
+ /// Whether escaping is used
bool escaping = false;
- // Escaping character (if `escaping` is true)
- char escape_char = '\\';
- // Whether values are allowed to contain CR (0x0d) and LF (0x0a) characters
+ /// Escaping character (if `escaping` is true)
+ char escape_char = kDefaultEscapeChar;
+ /// Whether values are allowed to contain CR (0x0d) and LF (0x0a) characters
bool newlines_in_values = false;
- // Whether empty lines are ignored. If false, an empty line represents
- // a single empty value (assuming a one-column CSV file).
+ /// Whether empty lines are ignored. If false, an empty line represents
+ /// a single empty value (assuming a one-column CSV file).
bool ignore_empty_lines = true;

+ /// Create parsing options with default values
static ParseOptions Defaults();
};

struct ARROW_EXPORT ConvertOptions {
// Conversion options

- // Whether to check UTF8 validity of string columns
+ /// Whether to check UTF8 validity of string columns
bool check_utf8 = true;
- // Optional per-column types (disabling type inference on those columns)
+ /// Optional per-column types (disabling type inference on those columns)
std::unordered_map<std::string, std::shared_ptr<DataType>> column_types;
- // Recognized spellings for null values
+ /// Recognized spellings for null values
std::vector<std::string> null_values;
- // Recognized spellings for boolean values
+ /// Recognized spellings for boolean true values
std::vector<std::string> true_values;
+ /// Recognized spellings for boolean false values
std::vector<std::string> false_values;
- // Whether string / binary columns can have null values.
- // If true, then strings in "null_values" are considered null for string columns.
- // If false, then all strings are valid string values.
+ /// Whether string / binary columns can have null values.
+ ///
+ /// If true, then strings in "null_values" are considered null for string columns.
+ /// If false, then all strings are valid string values.
bool strings_can_be_null = false;

// XXX Should we have a separate FilterOptions?

- // If non-empty, indicates the names of columns from the CSV file that should
- // be actually read and converted (in the vector's order).
- // Columns not in this vector will be ignored.
+ /// If non-empty, indicates the names of columns from the CSV file that should
+ /// be actually read and converted (in the vector's order).
+ /// Columns not in this vector will be ignored.
std::vector<std::string> include_columns;
- // If false, columns in `include_columns` but not in the CSV file will error out.
- // If true, columns in `include_columns` but not in the CSV file will produce
- // a column of nulls (whose type is selected using `column_types`,
- // or null by default)
- // This option is ignored if `include_columns` is empty.
+ /// If false, columns in `include_columns` but not in the CSV file will error out.
+ /// If true, columns in `include_columns` but not in the CSV file will produce
+ /// a column of nulls (whose type is selected using `column_types`,
+ /// or null by default)
+ /// This option is ignored if `include_columns` is empty.
bool include_missing_columns = false;

+ /// Create conversion options with default values, including conventional
+ /// values for `null_values`, `true_values` and `false_values`
static ConvertOptions Defaults();
};

struct ARROW_EXPORT ReadOptions {
// Reader options

- // Whether to use the global CPU thread pool
+ /// Whether to use the global CPU thread pool
bool use_threads = true;
- // Block size we request from the IO layer; also determines the size of
- // chunks when use_threads is true
+ /// Block size we request from the IO layer; also determines the size of
+ /// chunks when use_threads is true
int32_t block_size = 1 << 20; // 1 MB

- // Number of header rows to skip (not including the row of column names, if any)
+ /// Number of header rows to skip (not including the row of column names, if any)
int32_t skip_rows = 0;
- // Column names for the target table.
- // If empty, fall back on autogenerate_column_names.
+ /// Column names for the target table.
+ /// If empty, fall back on autogenerate_column_names.
std::vector<std::string> column_names;
- // Whether to autogenerate column names if `column_names` is empty.
- // If true, column names will be of the form "f0", "f1"...
- // If false, column names will be read from the first CSV row after `skip_rows`.
+ /// Whether to autogenerate column names if `column_names` is empty.
+ /// If true, column names will be of the form "f0", "f1"...
+ /// If false, column names will be read from the first CSV row after `skip_rows`.
bool autogenerate_column_names = false;

+ /// Create read options with default values
static ReadOptions Defaults();
};

3 changes: 3 additions & 0 deletions cpp/src/arrow/csv/reader.h
@@ -35,12 +35,15 @@ class InputStream;

namespace csv {

/// A class that reads an entire CSV file into a Arrow Table
class ARROW_EXPORT TableReader {
public:
virtual ~TableReader() = default;

/// Read the entire CSV file and convert it to a Arrow Table
virtual Status Read(std::shared_ptr<Table>* out) = 0;

/// Create a TableReader instance
static Status Make(MemoryPool* pool, std::shared_ptr<io::InputStream> input,
const ReadOptions&, const ParseOptions&, const ConvertOptions&,
std::shared_ptr<TableReader>* out);
3 changes: 3 additions & 0 deletions cpp/src/arrow/extension_type.cc
@@ -138,4 +138,7 @@ std::shared_ptr<ExtensionType> GetExtensionType(const std::string& type_name) {
return registry->GetType(type_name);
}

extern const char kExtensionTypeKeyName[] = "ARROW:extension:name";
extern const char kExtensionMetadataKeyName[] = "ARROW:extension:metadata";

} // namespace arrow
3 changes: 3 additions & 0 deletions cpp/src/arrow/extension_type.h
@@ -142,4 +142,7 @@ Status UnregisterExtensionType(const std::string& type_name);
ARROW_EXPORT
std::shared_ptr<ExtensionType> GetExtensionType(const std::string& type_name);

ARROW_EXPORT extern const char kExtensionTypeKeyName[];
ARROW_EXPORT extern const char kExtensionMetadataKeyName[];

} // namespace arrow
1 change: 1 addition & 0 deletions cpp/src/arrow/filesystem/CMakeLists.txt
@@ -20,6 +20,7 @@ arrow_install_all_headers("arrow/filesystem")

add_arrow_test(filesystem_test)
add_arrow_test(localfs_test)
add_arrow_test(path_tree_test)

if(ARROW_S3)
add_arrow_test(s3fs_test)