Skip to content

Commit

Permalink
Use singleton per-region for S3Client (#384)
Browse files Browse the repository at this point in the history
Summary:
Pull Request resolved: #384

# What
* Use a singleton for each region when constructing our S3Client instead of a _single_ singleton
* This is kind of a follow-up to D36727729 (47182c6)
# Why
* Follow-up to S281873 - S3Client singleton breaks multi-region support in PCF IO

NOTE: PCF should be owning this code in the long-term since it's out of PSI's scope, but since we owned the initial SEV, I took ownership of this follow-up.

Differential Revision: D39174441

fbshipit-source-id: a44d96b31a140ed2a220d516dfbda74504bed7f7
  • Loading branch information
Logan Gore authored and facebook-github-bot committed Sep 13, 2022
1 parent 6541152 commit 6a3db69
Show file tree
Hide file tree
Showing 3 changed files with 40 additions and 5 deletions.
2 changes: 1 addition & 1 deletion fbpcf/io/cloud_util/CloudFileUtil.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -76,7 +76,7 @@ std::unique_ptr<IFileUploader> getCloudFileUploader(
return std::make_unique<S3FileUploader>(
fbpcf::cloudio::S3Client::getInstance(
fbpcf::aws::S3ClientOption{.region = ref.region})
.getS3Client(),
->getS3Client(),
filePath);
} else {
throw fbpcf::PcfException("Not supported yet.");
Expand Down
41 changes: 38 additions & 3 deletions fbpcf/io/cloud_util/S3Client.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -7,12 +7,47 @@

#include "fbpcf/io/cloud_util/S3Client.h"

#include <string>

#include <aws/core/Aws.h>
#include <aws/s3/S3Client.h>

#include <folly/container/F14Map.h>
#include <folly/Synchronized.h>

namespace fbpcf::cloudio {
S3Client& S3Client::getInstance(const fbpcf::aws::S3ClientOption& option) {
static S3Client s3Client(option);
return s3Client;
std::shared_ptr<S3Client> S3Client::getInstance(const fbpcf::aws::S3ClientOption& option) {
/* Due to previous problems, we create a Singleton instance of the S3Client,
* but there's a catch: we need a distinct S3Client for each region, or we
* run into other issues. For that reason, we store this map from string to
* S3Client with the assumption that the keys are region names. Since region
* is optional, we also allow for a default empty string region.
* ***************************** NOT THREAD SAFE ****************************
* NOTE: Significant refactoring is required to make this thread safe
* Downstream usage wants a mutable reference, but a folly::Synchronized
* RWLock will return a const ref to a reader, meaning it's hard to refactor.
* Simply trying to use folly::Synchronized around the map isn't sufficient,
* because we'll leak a reference to an object in the map which is unsafe.
* ***************************** NOT THREAD SAFE ****************************
*/
static folly::Synchronized<folly::F14FastMap<std::string, std::shared_ptr<S3Client>>> m;

std::string defaultStr{};
auto region = option.region.value_or(defaultStr);

m.withWLock([&](auto& clientMap) {
if (clientMap.find(region) == clientMap.end()) {
std::shared_ptr<S3Client> ptr{new S3Client{option}};
clientMap.at(region) = ptr;
}
});

/* You may see this and think, "Hey, the NOT THREAD SAFE warning above is
* outdated, it looks like we fixed it!", but you're wrong. This still does
* not fully solve the problem. Because the downstream consumer takes a
* mutable reference, there's no guarantee that this is thread safe. It's
* better than nothing, but you still shouldn't fully trust this code.
*/
return m.wlock()->at(region);
}
} // namespace fbpcf::cloudio
2 changes: 1 addition & 1 deletion fbpcf/io/cloud_util/S3Client.h
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,7 @@ class S3Client {
}

public:
static S3Client& getInstance(const fbpcf::aws::S3ClientOption& option);
static std::shared_ptr<S3Client> getInstance(const fbpcf::aws::S3ClientOption& option);

std::shared_ptr<Aws::S3::S3Client> getS3Client() {
return awsS3Client_;
Expand Down

0 comments on commit 6a3db69

Please sign in to comment.