Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Object storage #8383

Merged
merged 34 commits into from
Jun 27, 2014
Merged

Object storage #8383

merged 34 commits into from
Jun 27, 2014

Conversation

DeepDiver1975
Copy link
Member

don't merge before owncloud-archive/3rdparty#99 is in 3rdparty

@@ -57,7 +90,9 @@ public static function setupFS( $user = '' ) {
// set up quota for home storages, even for other users
// which can happen when using sharing

if ($storage instanceof \OC\Files\Storage\Home) {
if ($storage instanceof \OC\Files\Storage\Home
|| $storage instanceof \OCA\ObjectStorage\S3 // FIXME introduce interface \OC\Files\Storage\HomeStorage? or add method?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose we might also want to support OpenStack ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We started to implement this approach against a VM with ceph installed. Ceph supports S3 API as well as Swift.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, nice 😄
I just saw in the files_external project that S3 and Swift have a slightly different implementation. It might make sense to consolidate them at some point.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not to sure if we can merge the implementations - these primary storage backends will work on the basic assumption that ownCloud is the only system accessing the 'files'. We never scan and we never store folder structure information in the object store - in contrary to our existing files_external implementations.

Let's see ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, makes sense.

@karlitschek karlitschek added this to the ownCloud 7 milestone May 13, 2014
@butonic
Copy link
Member

butonic commented Jun 6, 2014

@DeepDiver1975 @karlitschek @craigpg I managed to set up an openstack server in a virtual machine using http://devstack.org/ Objectstore on swift, s3 and rackspace are working. This PR adds the necessary configuration parameters to core. https://github.com/owncloud/objectstore contains the implementations for each backend as an app.

@karlitschek
Copy link
Contributor

great. Let´s finish this next week. We need Swift and S3 as separate apps and the documentation should be updated about how to configure Swift.
Does this work for you?

@butonic
Copy link
Member

butonic commented Jun 7, 2014

@karlitschek sure, I got a http://devstack.org/ vm running and it provides a swift as well as an s3 api. will split the code next week and update the docs accordingly.

@karlitschek
Copy link
Contributor

Sweeet

@PVince81
Copy link
Contributor

PVince81 commented Jun 8, 2014

@butonic will those docs include instructions how to build that VM ?
Did you just run their cloud-init script or did the VM needs additional tweaks to provide a swift and s3 interface ? (would be nice to have such a VM for testing ext storage)

@butonic
Copy link
Member

butonic commented Jun 10, 2014

@PVince81 I downloaded the ubuntu server 14.04 x64 iso and installed that in a VM with virtualbox. i used stack as the username during the installation. In the vm I had to install memcached prior to get the devstack script running:

sudo apt-get install memcached git
git clone https://github.com/openstack-dev/devstack.git

my devstack/local.conf literally looks like this

[[local|localrc]]
ADMIN_PASSWORD=password
MYSQL_PASSWORD=password
RABBIT_PASSWORD=password
SERVICE_PASSWORD=password
SERVICE_TOKEN=tokentoken

SWIFT_HASH=66a3d6b56c1f479c8b4e70ab5c2000f5

enable_service s-proxy s-object s-container s-account

after booting the vm you have to start the services manually by running

cd devstack
./stack.sh

The first time you execute stack.sh will take quite a while, downloading and installing all packages.

You will end up in a screen session with a lot of openstack services started in individual tab so you can follow any log messages. Quite nice.

I'm trying to slim down the stack configuration and add s3 because that should speed up deployment and reduce ram usage.

@butonic
Copy link
Member

butonic commented Jun 12, 2014

@PVince81 @icewind1991 @schiesbn I need a review regarding the approach I took to replace the root storage as well as the users home storage with an object storage implementation. Currently, the objectstore is magically enabled by putting something like the following in the config.php:

  'home_storage' => array(
    'class' => 'OC\\Files\\ObjectStore\\Swift',
    'arguments' => array(
      'username' => 'demo',
      'password' => 'password',
      'container' => 'devobjectstore',
      'region' => 'RegionOne', //required, devstack defaults to 'RegionOne'
      'url' => 'http://192.168.42.114:5000/v2.0', // The Identity / Keystone endpoint
      'tenantName' => 'demo', // required on devstack
      'serviceName' => 'swift', //devstack uses 'swift' by default, the lib defaults to 'cloudFiles' if omitted
    ),
  ),
  'root_storage' => array(
    'class' => 'OC\\Files\\ObjectStore\\Swift',
    'arguments' => array (
      'username' => 'demo',
      'password' => 'password',
      'container' => 'devobjectstore',
      'region' => 'RegionOne', //required, devstack defaults to 'RegionOne'
      'url' => 'http://192.168.42.114:5000/v2.0', // The Identity / Keystone endpoint
      'tenantName' => 'demo', // required on devstack
      'serviceName' => 'swift', //devstack uses 'swift' by default, the lib defaults to 'cloudFiles' if omitted
    ),
  ),

Yes, there are some annoyances with this configuration, such as having to specify the configuration twice: once for the root storage and once for the users home storage. What I am more interested in is a review of the storage assembly in setupFS(). I deisbled the creation of the skeleton for objectstorage based home folders because it does not use the oc filesystem. Should we move skeleton creation to post filesystem setup, maybe an app? Should I unify the configuration for root and home storage or do you see a valid use case where they might be set up differently?

The AbstractObjectStorage uses a NoopScanner because the files don't reside on the local filesystem and we can pull the metadata from the filecache. The objectstorage is assumed to be used by owncloud exclusively.

@butonic butonic changed the title [WIP] Object storage Object storage Jun 12, 2014
@butonic
Copy link
Member

butonic commented Jun 13, 2014

Sharing and encryption work fine.
@schiesbn The problem with encryption is that it totally kills performance because we have to fetch the key from the object store as well as the file itself.

  • After this has been merged investigate improving performance by caching the encryption keys with memcached / apc?

@icewind1991 Did I understand correctly the I have to wrap the users home storage so we can apply quota? If so, that is the reason whyI had to mount the object store as '/' as well as ''. There is no real need to use a quota wrapper on the root storage, is there?

@butonic
Copy link
Member

butonic commented Jun 16, 2014

tests fail on mysql because it has problems storing characters like 💩 (Pile of poo ... or 💩) likely tracked in #7030

//default home storage configuration:
'class' => '\OC\Files\Storage\Home',
'arguments' => array()
));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the homestorage config option objectstore ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only wanted to have on config option to make it easier to set up an objectstorage based owncloud. It does not really make sense to have two separate configuration options like "root_storage" and "home_storage" which I removed with 547791b. But I need the configuration when setting up these storages, hence the IMHO more meaningful name of "objectstore". Does that make sense?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the config option can be used to setup other home storages besides object store

@icewind1991
Copy link
Contributor

I think it would be better to split of the object store methods getObject(), createObject() (writeObject()?) and deleteObject() and probably getUrn() into a seperate interface/class which is then passed to the ObjectStoreStorage class to seperate the objectstore logic from the logic of maping an objectstore to a storage backend.

This would also allow to have a seperate HomeObjectStoreStorage to be used as home storage while sill having a more general ObjectStoreStorage which can be mounted everywhere by the user.

@butonic
Copy link
Member

butonic commented Jun 16, 2014

Hm, that will only add another indirection. Currently the "objectstore" config option configures which class should be used to provide the objectstore implementation. The AbstractObjectStorage class is meant to help implement the individual storage backends like swift or s3 ... depending on what capabilities the storage provides developers might want to provide completely different implementations of \OC\Files\Storage\Common is_dir() or any other method in that class. So, I think inheritance makes more sense than composition in this case.

I'll have to think on the matter of an implementation that users can mount themselves ...

@ghost
Copy link

ghost commented Jun 16, 2014

💣 Test Failed. 💣
Refer to this link for build results: https://ci.owncloud.org/job/pull-request-analyser/5633/

@icewind1991
Copy link
Contributor

The problem with the current inheritance approach is that you can't have multiple types of objectstorage storages, you can't have a seperate class for HomeObjectStoreStorage because you can't create a subclass of AbstractObjectStorage for that since all implementations extend AbstractObjectStorage and not AbstractHomeObjectStoreStorage.

Imo, composition makes sense here since the type of object storage used is not related to the type of storage backend.

The current implementation just doesn't provide the flexibility I would like to see from this

@butonic
Copy link
Member

butonic commented Jun 17, 2014

@icewind1991 ok, changed the architecture from inheritance to composition. For now I kept the single "objectstore" configuration and hardcoded using a plain ObjectStoreStorage for the root storage vs a HomeObjectStoreStorage for user homes. Currently the HomeObjectStoreStorage implementation is a subclass of ObjectStoreStorage. As a result the getId implementation is currently the same. For the HomeObjectStoreStorage I could require a user and then always use 'objstore::user:' as the id. But how should I calculate a unique storage id for the plain ObjectStoreStorage class which is intended to be mounted in other mount points? I will have a user name but I need another datum to make the id unique. Maybe the mount point with the slashes squased to underscores? Or the mountpoint as a md5? @icewind1991 got a recommendation? The files external implementation uses the bucket .... hm ... that would only work when the same bucket is used more than once because the fileid is unique.

@ghost
Copy link

ghost commented Jun 17, 2014

💣 Test Failed. 💣
Refer to this link for build results: https://ci.owncloud.org/job/pull-request-analyser/5655/

@butonic
Copy link
Member

butonic commented Jun 26, 2014

With regard to folders being permanent: yes, they only need to be stored as metadata in the local filecache. The storage for it is an objectstore though. Originally when the constructor failed to connect to the object store it would throw an exception directing users to theri administrator. The connection is now only checked when an actual file up/download is tried due to the lazy initialization introduced by @icewind1991 with eb97eee
Not sure if we need to fail earlier.

@butonic
Copy link
Member

butonic commented Jun 26, 2014

  • BUG skeleton (creating the welcome.txt) does not work because it copies the files without the owncloud stream wrapper

@PVince81
Copy link
Contributor

Manual test of the SWIFT external storage (against devstack)

  • test: web: mount
  • test: web: create file
  • test: web: rename file
  • test: web: delete file
  • test: web: move file to subdir
  • test: web: download file
  • test: web: save file with text editor
  • test: web: upload file
  • test: web: overwrite file
  • test: web: create dir
  • test: web: delete dir
  • test: web: move/drag file to mount
  • test: web: move/drag file out of mount
  • test: web: move/drag folder to mount
  • test: web: move/drag folder out of mount (recursive delete!)
  • test: web: image previews appear
  • test: webdav: upload small file
  • test: webdav: overwrite file (part file handling)
  • test: webdav: rename file
  • test: sync client: can sync up small files
  • test: sync client: can sync up big files
  • test: sync client: can sync down
  • test: sync client: can sync down big files
  • test: sync client: upload existing file on upload

@PVince81
Copy link
Contributor

  • BUG: SWIFT ext storage: WebDAV/sync client upload seem to have issue with mime type (ALSO on master)

To be raised/handled separately.

@PVince81
Copy link
Contributor

The "cannot detect remote changes" for now, this is an old issue that is discussed here: #8633
Since the library upgrade doesn't break nor improve that issue, I suggest to ignore it for now so we can merge this PR faster.

@PVince81
Copy link
Contributor

I quickly compared the SWIFT ext storage with master and it seems to be faster on this branch.

This PR doesn't break existing functionality and the issues related to object store can be fixed separately. Issues must be raised after this is merged.

👍 from me.

@icewind1991
Copy link
Contributor

@icewind1991 now the testStat() fails for both: objectstore and files_external. The objectstore uses the local database to store the mtime, so hasUpdated actually works correctly. Please revert 92f2a43 because the NoopScanner replaces the Scanner implementation and has the same effect. Or am I wrong?

Simply returning false from hasUpdated() prevents having to query the mtime from the database, since the mtime comparison will always return false it's a completely wasted query

@craigpg
Copy link

craigpg commented Jun 26, 2014

Keeping in oC 7.0 CE

@butonic
Copy link
Member

butonic commented Jun 26, 2014

On 26. Juni 2014 16:32:03 MESZ, icewind1991 [email protected] wrote:

Simply returning false from hasUpdated() prevents having to query
the mtime from the database, since the mtime comparison will always
return false it's a completely wasted query
That depends on the mtime parameter, but I admit that I don't know who is using this method to determine if a file has updated since a given time. And since we can provide that Implementation based on the filecache, I fail to see the reason for just always returning false when we are breaking the tests with it.

@icewind1991
Copy link
Contributor

The method is only used for updating the cache which is not needed in this case.

For the unit test, I think it's fine best to just overwrite the unit test since it's for a use case not supported

@butonic
Copy link
Member

butonic commented Jun 27, 2014

For the sake of performance I'll change the Test method. But shouldn't hasUpdated then be something like needsCacheUpdate()? Otherwise someone else might start relying on the functionality. Anyway, I'll update the PHPDoc comment to clarify a bit.

@PVince81
Copy link
Contributor

I suggest to ignore / disable the testStat() issue as it isn't directly related to this feature. The problem already existed before. Let's merge this as soon as possible.

Maybe you can just override and NOOP that test for now and follow up in #8633 for a proper solution later.

@scrutinizer-notifier
Copy link

A new inspection was created.

@butonic
Copy link
Member

butonic commented Jun 27, 2014

@PVince81 @icewind1991 @DeepDiver1975 overwrote the testStat and updated the hasUpdated descriptions. lets see what jenkins says and then merge it

@scrutinizer-notifier
Copy link

The inspection completed: 32 new issues, 54 updated code elements

@ghost
Copy link

ghost commented Jun 27, 2014

🚀 Test Passed. 🚀
Refer to this link for build results: https://ci.owncloud.org/job/pull-request-analyser/5799/

$path = str_replace('//', '/', $path);

if (!$path) {
$path = '.';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reasoning behind not just returning an empty string here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dirname('/folder') returns .. So identify the root folder as . I assume you don't want to see '.' in the filecache table? I'll look into that.

@scrutinizer-notifier
Copy link

A new inspection was created.

@butonic
Copy link
Member

butonic commented Jun 27, 2014

@icewind1991 ok, now the root is stored as '' in the cache. Since I started with a copy of files_externals swift implementation I assume it also stores root as '.' in the cache, but thats a different issue.

good to merge?

@icewind1991
Copy link
Contributor

👍

@butonic
Copy link
Member

butonic commented Jun 27, 2014

@karlitschek only waiting for jenkins build before merging

@ghost
Copy link

ghost commented Jun 27, 2014

🚀 Test Passed. 🚀
Refer to this link for build results: https://ci.owncloud.org/job/pull-request-analyser/5802/

PVince81 pushed a commit that referenced this pull request Jun 27, 2014
@PVince81 PVince81 merged commit fd8b568 into master Jun 27, 2014
@PVince81 PVince81 deleted the object_storage branch June 27, 2014 14:53
@PVince81
Copy link
Contributor

Yay! Clicked the big fat green button 😄
Congrats, guys!

@butonic
Copy link
Member

butonic commented Jun 27, 2014

ok, next up fixing versions and text editor previews ...

@lock lock bot locked as resolved and limited conversation to collaborators Aug 21, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants