-
Notifications
You must be signed in to change notification settings - Fork 239
/
CHANGES.txt
611 lines (446 loc) · 26.9 KB
/
CHANGES.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
1.6.10 - 2018-09-20
1. Add an option to specify max bytes rewritten per rewrite request when
`fs.gs.copy.with.rewrite.enable` is set to `true`:
fs.gs.rewrite.max.bytes.per.call (default: 536870912)
Even though GCS does not require this parameter for rewrite requests,
rewrite requests are flaky without it.
1.6.9 - 2018-09-04
1. Change default values for GCS batch/directory operations properties to
improve performance:
fs.gs.copy.max.requests.per.batch (default: 1 -> 15)
fs.gs.copy.batch.threads (default: 50 -> 15)
fs.gs.max.requests.per.batch (default: 25 -> 15)
fs.gs.batch.threads (default: 25 -> 15)
2. Update all dependencies to latest versions.
1.6.8 - 2018-08-10
1. Support parallel execution of GCS batch requests.
Number of threads to execute batch requests configurable via property:
fs.gs.batch.threads (default: 0)
If `fs.gs.batch.threads` value is set to 0 then batch requests will be
executed sequentially by caller thread.
2. Do not send batch request when performing operations (rename, delete, copy)
on 1 object.
3. Add configuration properties to control batching of copy operations
separately from other operations:
fs.gs.copy.max.requests.per.batch (default: 30)
fs.gs.copy.batch.threads (default: 0)
4. Fix `RejectedExecutionException` during parallel execution of GCS batch
requests.
5. Change default values for GCS batch/directory operations properties:
fs.gs.copy.with.rewrite.enable (default: false -> true)
fs.gs.copy.max.requests.per.batch (default: 30 -> 1)
fs.gs.copy.batch.threads (default: 0 -> 50)
fs.gs.max.requests.per.batch (default: 30 -> 25)
fs.gs.batch.threads (default: 0 -> 25)
1.6.7 - 2018-06-12
1. Remove Hadoop3 support.
2. Add interface through which user can directly provide the access token.
3. Update Hadoop dependencies to 2.8.4 version.
4. Add more retries and error handling in GoogleCloudStorageReadChannel,
to make it more resilient to network errors; also add a property to allow users
to specify number of retries on low level GCS HTTP requests in case of server
errors and I/O errors.
5. Add properties to allow users to specify connect timeout and read timeout on
low level GCS HTTP requests.
6. Include prefix/directory objects metadata into `storage.objects.list`
requests response to improve performance (i.e. set
`includeTrailingDelimiter` parameter for `storage.objects.list` GCS
requests to `true`).
1.6.6 - 2018-05-08
1. Support Hadoop 3.
2. Change Maven project structure to be better compatible with IDEs.
3. Fix thread leaks that were occurring when YARN log aggregation uploaded
logs to GCS.
1.6.5 - 2018-04-12
1. Add support for using Cloud Storage Rewrite requests for copy operation:
fs.gs.copy.with.rewrite.enable (default: false)
This allows to copy files between different locations and storage classes.
2. Update all dependencies to latest versions.
3. Decrease default value for max requests per batch from 1,000 to 30.
4. Make max requests per batch value configurable with property:
fs.gs.max.requests.per.batch (default: 30)
1.6.4 - 2018-03-19
1. Fixed an issue where JSON auth files containing user auth e.g.
application_default_credentials.json does not work with
google.cloud.auth.service.account.json.keyfile
2. Honor GOOGLE_APPLICATION_DEFAULT_CREDENTIALS environment variable. For
Google Application Default Credentials (but not other defaults).
3. Make fs.gs.project.id optional. It is still required for listing buckets,
creating buckets, and entire BigQuery connector.
4. Disable GCS Metadata Cache by default (e.g. set default value of
`fs.gs.metadata.cache.enable` property to `false`).
5. Support GCS Requester Pays feature that could be configured with new
properties:
fs.gs.requester.pays.mode (default=DISABLED)
fs.gs.requester.pays.project.id (no default value)
fs.gs.requester.pays.buckets (no default value)
6. Add support for specifying marker files pattern that should be copied
last during folder rename operation. Pattern is configured with property:
fs.gs.marker.file.pattern
1.6.3 - 2018-01-25
1. Use new GCS batch requests endpoint.
6. Disable GCS Metadata Cache by default (e.g. set default value of
`fs.gs.metadata.cache.enable` property to `false`).
1.6.2 - 2017-11-21
1. Wire HTTP transport settings into Credential logic.
1.6.1 - 2017-04-14
1. Added a polling loop when determining if a createEmptyObjects error can
safely be ignored and expanded the cases in which we will attempt to
determine if an empty object already exists.
Previously, if a rate limiting exception was encountered while creating
empty objects the connector would issue a single get request for that
object. If the object exists and is zero length we would consider the
createEmptyObjects call successful and suppress the rate limit exception.
The new implementation will poll for the existence of the object, up to
a user-configurable maximum, and will poll when either a rate limiting
error occurs or when a 500-level error occurs. The maximum can be
configured by the following setting:
fs.gs.max.wait.for.empty.object.creation.ms
Any positive value for this setting will be interpreted to mean "poll
for up to this many milliseconds before making a final determination".
The default value will cause a maximum wait of 3 seconds. Polling can
be disabled by setting this key to 0.
1.6.0 - 2016-12-16
1. Added new PerformanceCachingGoogleCloudStorage; unlike the existing
CacheSupplementedGoogleCloudStorage which only serves as an advisory
cache for enforcement of list consistency, the new optional caching
layer is able to serving certain metadata and listing requests purely
out of a short-lived in-memory cache to enhance performance of
some workloads. By default this feature is disabled, and can be
controlled with the config settings:
fs.gs.performance.cache.enable=true (default=false)
fs.gs.performance.cache.list.caching.enable=true (default=false)
The first option enables the cache to serve getFileStatus requests,
while the second option additionally enables serving listStatus. The
duration of cache entries can be controlled with:
fs.gs.performance.cache.max.entry.age.ms (default=3000)
It is not recommended to always run with this feature enabled; it should
be used specifically to address cases where frameworks perform redundant
sequential list/stat operations in a non-distributed manner, and on
datasets which are not frequently changing. It is additionally advised
to validate data integrity separately whenever using this feature. There
is no cooperative cache invalidation between different processes when
using this feature, so concurrent mutations to a location from multiple
clients *will* produce inconsistent/stale results if this feature is
enabled.
1.5.5 - 2016-11-04
1. Minor refactoring of logic in CacheSupplementedGoogleCloudStorage to
extract a reusable ForwardingGoogleCloudStorage that can be used
for other GCS-delegating implementations.
1.5.4 - 2016-10-05
1. Fixed a bug in GoogleCloudStorageReadChannel where multiple in-place
seeks without any "read" in between could throw a range exception.
2. Fixed plumbing of GoogleCloudStorageReadOptions into the in-memory
test helpers to improve unittest alignment with integration tests.
3. Fixed handling of parent timestamp updating when a full URI is provided
as a path for which timestamps should be updated. This allows specifying
gs://bucket/p1/p2/object instead of simply p1/p2/object.
4. Updated Hadoop 2 dependency to 2.7.2.
5. Imported full set of Hadoop FileSystem contract tests, except for the
currently-unsupported Concat and Append tests. Minor changes to pass
all the tests:
-available() now throws ClosedChannelException if called after close()
instead of returning 0.
-read() and seek() throw ClosedChannelException if called after close()
instead of NullPointerException.
-Out-of-bounds seeks now throw EOFException instead of
IllegalArgumentException.
-Blocked overwrites throw org.apache.hadoop.fs.FileAlreadyExistsException
instead of generic IOException.
-Deleting root '/' recursively or non-recursively when empty will no
longer delete the underlying GCS bucket by default. To re-enable
deletion of GCS buckets when recursively deleting root, set
'fs.gs.bucket.delete.enable' = true.
1.5.3 - 2016-09-21
1. Misc updates in credential acquisition on GCE.
1.5.2 - 2016-08-23
1. Updated AbstractGoogleAsyncWriteChannel to always set the
X-Goog-Upload-Desired-Chunk-Granularity header independently from the
deprecated X-Goog-Upload-Max-Raw-Size; in general this improves
performance of large uploads.
1.5.1 - 2016-07-29
1. Optimized InMemoryDirectoryListCache to use a TreeMap internally instead
of a HashMap, since most of its usage is "listing by prefix"; aside
from some benefits for normal list-supplementation, glob listings
with a large number of subdirectories should be orders of magnitude
faster.
1.5.0 - 2016-07-15
1. Changed the single-argument constructor
GoogleCloudStorageFileSystem(GoogleCloudStorage) to inherit the inner
GoogleCloudStorageOptions from the passed-in GoogleCloudStorage rather
than simply falling back to default GoogleCloudStorageFileSystemOptions.
2. Changed RetryHttpInitializer to treat HTTP code 429 as a retriable
error with exponential backoff instead of erroring out.
3. Added a new output stream type which can be used by setting:
fs.gs.outputstream.type=SYNCABLE_COMPOSITE
With the SYNCABLE_COMPOSITE type, output streams obtained with a call
to FileSystem.create(Path) will have (limited) support for Hadoop's
Syncable interface, where calling hsync() will commit the data written
so far into persistent storage without closing the output stream; once
a writer calls hsync(), readers will be able to discover and read the
contents of the file written up to the last successful hsync() call
immediately. This feature is implemented using "composite objects" in
Google Cloud Storage, and thus has a current hard limit of 1023 calls
to hsync() over the lifetime of the stream after which subsequent hsync()
calls will throw a CompositeLimitExceededException, but will still
allow writing additional data and closing the stream without losing data
despite a failed hsync().
4. Added several optimizations and options controlling the behavior of
the GoogleHadoopFSInputStream:
-Removed internal prepopulated ByteBuffer inside GoogleHadoopFSInputStream
in favor of non-blocking buffering in the lower layer channel; this
improves performance for independent small reads without noticeable
impact on large reads. Old behavior can be specified by setting
fs.gs.inputstream.internalbuffer.enable=true (default=false)
-Added option to save an extra metadata GET on first read of a channel
if objects are known not to use the 'Content-Encoding' header. To
use this optimization set:
fs.gs.inputstream.support.content.encoding.enable=false (default=true)
-Added option to save an extra metadata GET on FileSystem.open(Path) if
objects are known to exist already, or the caller is resilient to delayed
errors for nonexistent objects which occur at read time, rather than
immediately on open(). To use this optimization set:
fs.gs.inputstream.fast.fail.on.not.found.enable=false (default=true)
-Added support for "in-place" small seeks by defining a byte limit where
forward seeks smaller than the limit will be done by reading/discarding
bytes rather than opening a brand-new channel. This is set to 8MB by
default, which is approximately the point where the cost of opening a
new channel is equal to the cost of reading/discarding the bytes in-place.
Disable by setting the value to 0, or adjust to other thresholds:
fs.gs.inputstream.inplace.seek.limit=0 (default=8388608)
1.4.5 - 2016-03-24
1. Add support for paths that cannot be parsed by Java's URI.create method.
This support is off by default, but can be enabled by setting Hadoop
configuration key fs.gs.path.encoding to the string uri-path. The
current behavior, and default value of fs.gs.path.encoding, is 'legacy'.
The new path encoding scheme will become the default in a future release.
2. VerificationAttributes are now exposed in GoogleCloudStorageItemInfo. The
current support is limited to reading these attributes from what was
computed by GCS server side. A future release will add support for
specifying VerificationAttributes in CreateOptions when creating new
objects.
1.4.4 - 2016-02-02
1. Add support for JSON keyfiles via a new configuration key:
google.cloud.auth.service.account.json.keyfile. This key should
point to a file that is available locally to jobs to clusters.
1.4.3 - 2015-11-12
1. Minor bug fixes and enhancements.
1.4.2 - 2015-09-15
1. Added checking in GoogleCloudStorageImpl.createEmptyObject(s) to handle
rateLimitExceeded (429) errors by fetching the fresh underlying info
and ignoring the error if the object already exists with the intended
metadata and size. This fixes an issue which mostly affects Spark:
https://github.com/GoogleCloudPlatform/bigdata-interop/issues/10
2. Added logging in GoogleCloudStorageReadChannel for high-level retries.
3. Added support for configuring the permissions reported to the Hadoop
FileSystem layer; the permissions are still fixed per FileSystem instance
and aren't actually enforced, but can now be set with:
fs.gs.reported.permissions [default = "700"]
This allows working around some clients like Hive-related daemons and tools
which pre-emptively check for certain assumptions about permissions.
1.4.1 - 2015-07-09
1. Switched from the custom SeekableReadableByteChannel to
Java 7's java.nio.channels.SeekableByteChannel.
2. Removed the configurable but default-constrained 250GB upload limit;
uploads can now exceed 250GB without needing to modify config settings.
3. Added helper classes related to GCS retries.
4. Added workaround support for read retries on objects with content-encoding
set to gzip; such content encoding isn't generally correct to use since
it means filesystem reported bytes will not match actual read bytes, but
for cases which accept byte mismatches, the read channel can now manually
seek to where it left off on retry rather than having a GZIPInputStream
throw an exception for a malformed partial stream.
5. Added an option for enabling "direct uploads" in
GoogleCloudStorageWriteChannel which is not directly used by the Hadoop
layer, but can be used by clients which directly access the lower
GoogleCloudStorage layer.
6. Added CreateBucketOptions to the GoogleCloudStorage interface so that
clients using the low-level GoogleCloudStorage directly can create buckets
with different locations and storage classes.
7. Fixed https://github.com/GoogleCloudPlatform/bigdata-interop/issues/5 where
stale cache entries caused stuck phantom directories if the directories
were deleted using non-Hadoop-based GCS clients.
8. Fixed a bug which prevented the Apache HTTP transport from working with
Hadoop 2 when no proxy was set.
9. Misc updates in library dependencies; google.api.version
(com.google.http-client, com.google.api-client) updated from 1.19.0 to
1.20.0, google-api-services-storage from v1-rev16-1.19.0 to
v1-rev35-1.20.0, and google-api-services-bigquery from v2-rev171-1.19.0
to v2-rev217-1.20.0, and Guava from 17.0 to 18.0.
1.4.0 - 2015-05-27
1. The new inferImplicitDirectories option to GoogleCloudStorage tells
it to infer the existence of a directory (such as foo) when that
directory node does not exist in GCS but there are GCS files
that start with that path (such as as foo/bar). This allows
the GCS connector to be used on read-only filesystems where
those intermediate directory nodes can not be created by the
connector. The value of this option can be controlled by the
Hadoop boolean config option "fs.gs.implicit.dir.infer.enable".
The default value is true.
2. Increased Hadoop dependency version to 2.6.0.
3. Fixed a bug introduced in 1.3.2 where, during marker file creation,
file info was not properly updated between attempts. This lead
to backoff-retry-exhaustion with 412-preconditon-not-met errors.
4. Added support for changing the HttpTransport implementation to use,
via fs.gs.http.transport.type = [APACHE | JAVA_NET]
5. Added support for setting a proxy of the form "host:port" via
fs.gs.proxy.address, which works for both APACHE and JAVA_NET
HttpTransport options.
6. All logging converted to use slf4j instead of the previous
org.apache.commons.logging.Log; removed the LogUtil wrapper which
previously wrapped org.apache.commons.logging.Log.
7. Automatic retries for premature end-of-stream errors; the previous
behavior was to throw an unrecoverable exception in such cases.
8. Made close() idempotent for GoogleCloudStorageReadChannel
9. Added a low-level method for setting Content-Type metadata in the
GoogleCloudStorage interface.
10.Increased default DirectoryListCache TTL to 4 hours, wired out TTL
settings as top-level config params:
fs.gs.metadata.cache.max.age.entry.ms
fs.gs.metadata.cache.max.age.info.ms
1.3.3 - 2015-02-26
1. When performing a retry in GoogleCloudStorageReadChannel, attempts to
close() the underlying channel are now performed explicitly instead of
waiting for performLazySeek() to do it, so that SSLException can be
caught and ignored; broken SSL sockets cannot be closed normally,
and are responsible for already cleaning up on error.
2. Added explicit check of currentPosition == size when -1 is read from
underlying stream in GoogleCloudStorageReadChannel, in case the
stream fails to identify an error case and prematurely reaches
end-of-stream.
1.3.2 - 2015-01-22
1. In the create file path, marker file creation is now configurable. By
default, marker files will not be created. The default is most suitable
for MapReduce applications. Setting fs.gs.create.marker.files.enable to
true in core-site.xml will re-enable marker files. The use of marker files
should be considered for applications that depend on early failing when
two concurrent writes attempt to write to the same file. Note that file
overwrite semantics are preserved with or without marker files, but
failures will occur sooner with marker files present.
1.3.1 - 2014-12-16
1. Fixed a rare NullPointerException in FileSystemBackedDirectoryListCache
which can occur if a directory being listed is purged from the cache
between a call to "exists()" and "listFiles()".
2. Fixed a bug in GoogleHadoopFileSystemCacheCleaner where cache-cleaner
fails to clean any contents when a bucket is non-empty but expired.
3. Fixed a bug in FileSystemBackedDirectoryListCache which caused garbage
collection to require several passes for large directory hierarchies;
now we can successfully garbage-collect an entire expired tree in a
single pass, and cache files are also processed in-place without having
to create a complete in-memory list.
4. Updated handling of new file creation, file copying, and file deletion
so that all object modification requests sent to GCS contain preconditions
that should prevent race-conditions in the face of retried operations.
1.3.0 - 2014-10-17
1. Directory timestamp updating can now be controlled via user-settable
properties "fs.gs.parent.timestamp.update.enable",
"fs.gs.parent.timestamp.update.substrings.excludes". and
"fs.gs.parent.timestamp.update.substrings.includes" in core-site.xml. By
default, timestamp updating is enabled for the YARN done and intermediate
done directories and excluded for everything else. Strings listed in
includes take precedence over excludes.
2. Directory timestamp updating will now occur on a background thread inside
GoogleCloudStorageFileSystem.
3. Attempting to acquire an OAuth access token will be now be retried when
using .p12 or installed application (JWT) credentials if there is a
recoverable error such as an HTTP 5XX response code or an IOException.
4. Added FileSystemBackedDirectoryListCache, extracting a common interface
for it to share with the (InMemory)DirectoryListCache; instead of using
an in-memory HashMap to enforce only same-process list consistency, the
FileSystemBacked version mirrors GCS objects as empty files on a local
FileSystem, which may itself be an NFS mount for cluster-wide or even
potentially cross-cluster consistency groups. This allows a cluster to
be configured with a "consistent view", making it safe to use GCS as the
DEFAULT_FS for arbitrary multi-stage or even multi-platform workloads.
This is now enabled by default for machine-wide consistency, but it is
strongly recommended to configure clusters with an NFS directory for
cluster-wide strong consistency. Relevant configuration settings:
fs.gs.metadata.cache.enable [default: true]
fs.gs.metadata.cache.type [IN_MEMORY (default) | FILESYSTEM_BACKED]
fs.gs.metadata.cache.directory [default: /tmp/gcs_connector_metadata_cache]
5. Optimized seeks in GoogleHadoopFSDataInputStream which fit within
the pre-fetched memory buffer by simply repositioning the buffer in-place
instead of delegating to the underlying channel at all.
6. Fixed a performance-hindering bug in globStatus where "foo/bar/*" would
flat-list "foo/bar" instead of "foo/bar/"; causing the "candidate matches"
to include things like "foo/bar1" and "foo/bar1/baz", even though the
results themselves would be correct due to filtering out the proper glob
client-side in the end.
7. The versions of java API clients were updated to 1.19 derived versions.
1.2.9 - 2014-09-18
1. When directory contents are updated e.g., files or directories are added,
removed, or renamed the GCS connector will now attempt to update a
metadata property on the parent directory with a modification time. The
modification time recorded will be used as the modification time in
subsequent FileSystem#getStatus(...), FileSystem#listStatus and
FileSystem#globStatus(...) calls and is the time as reported by
the system clock of the system that made the modification.
1.2.8 - 2014-08-07
1. Changed the manner in which the GCS connector jar is built to A) reduce
included dependencies to only those parts which are used and B) repackaged
dependencies whose versions conflict with those bundled with Hadoop.
2. Deprecated fs.gs.system.bucket config.
1.2.7 - 2014-06-23
1. Fixed a bug where certain globs incorrectly reported the parent directory
being not found (and thus erroring out) in Hadoop 2.2.0 due to an
interaction with the fs.gs.glob.flatlist.enable feature; doesn't affect
Hadoop 1.2.1 or 2.4.0.
1.2.6 - 2014-06-05
1. When running in hadoop 0.23+ (hadoop 2+), listStatus will now throw a
FileNotFoundException when a non-existent path is passed in.
2. The GCS connector now uses the v1 JSON API when accessing Google
Cloud Storage.
3. The GoogleHadoopFileSystem now treats the parent of the root path as if
it is the root path. This behavior mimics the POSIX behavior of "/.."
being the same as "/".
4. When creating a new file, a zero-length marker file will be created
before the FSDataOutputStream is returned in create(). This allows for
early detection of overwrite conflicts that may occur and prevents
certain race conditions that could be encountered when relying on
a single exists() check.
5. The dependencies on cglib and asm were removed from the GCS connector
and the classes for these are no longer included in the JAR.
1.2.5 - 2014-05-08
1. Fixed a bug where fs.gs.auth.client.file was unconditionally being
overwritten by a default value.
2. Enabled direct upload for directory creation to save one round-trip call.
3. Added wiring for GoogleHadoopFileSystem.close() to call through to close()
its underlying helper classes as well.
4. Added a new batch mode for creating directories in parallel which requires
manually parallelizing in the client. Speeds up nested directory creation
and repairing large numbers of implicit directories in listStatus.
5. Eliminated redundant API calls in listStatus, speeding it up by ~half.
6. Fixed a bug where globStatus didn't correctly handle globs containing '?'.
7. Implemented new version of globStatus which initially performs a flat
listing before performing the recursive glob logic in-memory to
dramatically speed up globs with lots of directories; the new behavior is
default, but can disabled by setting fs.gs.glob.flatlist.enable = false.
1.2.4 - 2014-04-09
1. The value of fs.gs.io.buffersize.write is now rounded up to 8MB if set to
a lower value, otherwise the backend will error out on unaligned chunks.
2. Misc refactoring to enable reuse of the resumable upload classes in other
libraries.
1.2.3 - 2014-03-21
1. Fixed a bug where renaming a directory could cause the file contents to get
shuffled between files when the fully-qualified file paths have different
lengths. Does not apply to renames on files directly, such as when using
a glob expression inside a flat directory.
2. Changed the behavior of batch request API calls such that they are retried
on failure in the same manner as non-batch requests.
3. Eliminated an unnecessary dependency on com/google/protobuf which could
cause version-incompatibility issues with Apache Shark.
1.2.2 - 2014-02-12
1. Fixed a bug where filenames with '+' were unreadable due to premature
URL-decoding.
2. Modified a check to allow fs.gs.io.buffersize.write to be a non-multiple
of 8MB, just printing out a warning instead of check-failing.
3. Added some debug-level logging of exceptions before throwing in cases
where Hadoop tends to swallows the exception along with its useful info.
1.2.1 - 2014-01-23
1. Added CHANGES.txt for release notes.
2. Fixed a bug where accidental URI decoding make it impossible to use
pre-escaped filenames, e.g. foo%3Abar. This is necessary for Pig to work.
3. Fixed a bug where an IOException was thrown when trying to read a
zero-byte file. Necessary for Pig to work.
1.2.0 - 2014-01-14
1. Preview release of GCS connector.