brianmay edited this page Dec 14, 2014 · 19 revisions

Future plans (Brian May)

Karaage 4

Karaage4 bugs: https://github.com/Karaage-Cluster/karaage/issues?q=is%3Aopen+is%3Aissue+milestone%3AKaraage4, https://github.com/vlsci/karaage/issues

Changes to new schema

  • PublicNotes: not absolutely sure about this table; I think the same thing can already be done with the karaage.LogEntry table and setting action_flag == COMMENT.

  • Suggest Resource have some sort of field to indicate units of measurement, e.g. kb or seconds.

  • There are two types of resources: accumulative and non-accumulative.

    For accumulative resources (e.g. cpu hours), if you want the usage for the entire year, you add the values up. Values like max/avg have no meaning (that I can think of anyway).

    For non-accumulative resources (e.g. disk space), if you add the values you get an estimate of "MB days" (note the change in units), which could still be useful(?); however, values like max/avg are going to be useful too.

    It might be worth trying to classify the type somehow, so the user interface knows how to deal with the data in a generic way.
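As a rough illustration of the classification idea above, here is a minimal sketch of how generic code could decide which statistics to show for a resource. The names `ResourceKind` and `summarise` are made up for this example and are not part of the Karaage schema; it also assumes one sample per day.

```python
from enum import Enum


class ResourceKind(Enum):
    """Hypothetical classification; not a field in the actual schema."""
    ACCUMULATIVE = "accumulative"          # e.g. cpu hours
    NON_ACCUMULATIVE = "non-accumulative"  # e.g. disk space


def summarise(kind, daily_values):
    """Return only the statistics that are meaningful for this kind.

    daily_values is assumed to be a non-empty list with one sample per day.
    """
    if not daily_values:
        raise ValueError("need at least one sample")
    total = sum(daily_values)
    if kind is ResourceKind.ACCUMULATIVE:
        # Summing gives the usage for the whole period; max/avg are skipped.
        return {"total": total}
    # Summing daily samples changes the units (e.g. MB becomes "MB days").
    return {
        "unit_days": total,
        "max": max(daily_values),
        "avg": total / len(daily_values),
    }
```

With something like this the user interface would only need to know the kind and the unit of measurement of a resource, not what the resource actually is.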

Implications of new schema

  • Resources could be used for tracking software usage: Add each software application per machine to the resources table. Create software as the ResourcePool. Track accumulative hours and times used for each package used on a daily basis. This will increase the size of the Usage table, although hopefully good indexing will mean this doesn't result in reduced performance.

  • Project applications should have some screen for confirming resource limits (or using parent limits)? Who sets this? Applicant? Project Leader? Administrator? Or you could give each role in this list the ability to review/change the limits set by the previous role (this is similar to what already happens for applications).

  • Does the UI block changes to a group if it is owned by a project? That information is now tracked by the ProjectMembership table too.

Future directions for schema

  • Remove usedmodules table. Instead clusters use xmlrpc calls, which (should be altered to) update the cpujob_software table directly.

  • InstituteQuota: not required any more. It provides quotas, which are superseded by the new allocations stuff. It provides a many2many between Institutes and Machine Categories, however I think it is only used by datastore methods that aren't actually used, and these can go too.

  • New project applications are filed against an institute. Change this so they are filed against a parent project instead. Need to have some permission controls so people only create project applications against authorized top level projects (maybe a per project flag)?

  • The only use for institutes now is for the primary group; maybe this could be replaced by a primary project or something instead?

  • Remove institutes table.

  • ProjectQuota is still required, although not for the quota aspect, i.e. some of the fields will become obsolete, and the name will become misleading. It provides a many2many between Projects and Machine Categories, and this is required in various places (e.g. when a user joins a project they will automatically get accounts on these machine categories). Using ProjectQuota for the usage stuff will be superseded by the new Allocations stuff. It is possible we could replace ProjectQuota with something more relevant to what is required.

  • Machine Categories: At the present time they are only ever used for legacy usage data and data for Massive at VPAC. They were designed to manage a collection of machines as a separate entity with unique data stores. This is a lot of complexity for something that is rarely used. Without a good use case, it is hard to know exactly what is required here.

  • Karaage core now tracks usage in a more generalized way than the karaage-usage plugin, hence the name of the karaage-usage plugin could be confusing :-(. Exactly what to do with the usage views perhaps needs more thought; maybe we could generalize parts and move them back into core at some stage.

Future directions for karaage4

  • Add the following to every file:

      from __future__ import division
      from __future__ import absolute_import
      from __future__ import unicode_literals
      from __future__ import print_function
    

Implications for V3

This is just a quick summary of my understanding of what Karaage 4 will mean for the installation at V3. It is current as of the end of the meeting yesterday, and includes the database schema changes. Projects are to become a tree. So we could have:

  • RMIT
    • RMIT Project A
    • RMIT Project B
    • ...
  • LaTrobe
    • LaTrobe Project A
    • LaTrobe Project B
    • ...
  • Other
    • Deakin
      • Deakin Project A
      • Deakin Project B
    • Swinburne
      • Swinburne Project A
      • Swinburne Project B
    • V3
      • V3 Project A
      • Make friend with Eliza

Usage no longer gets attributed to Institutes. Rather it gets attributed to a project based on its position in the tree. So "V3 Project A" would get usage attributed to the "V3" project as well as the "V3 Project A" project.
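A minimal sketch of the attribution rule, as I understand it. The `PARENT` mapping and the `attribute_usage` function are hypothetical and only mirror part of the example tree above. I am also assuming that usage propagates all the way up to the root, so "Other" would receive it as well.

```python
from collections import defaultdict

# Hypothetical child -> parent mapping for part of the example tree.
PARENT = {
    "V3 Project A": "V3",
    "V3": "Other",
    "Other": None,
    "RMIT Project A": "RMIT",
    "RMIT": None,
}


def attribute_usage(parents, records):
    """Add each usage record to its project and to every ancestor.

    records is an iterable of (project_name, amount) pairs.
    """
    totals = defaultdict(float)
    for project, amount in records:
        node = project
        while node is not None:
            totals[node] += amount
            node = parents.get(node)  # None once we walk past the root
    return dict(totals)


usage = attribute_usage(PARENT, [("V3 Project A", 5.0), ("RMIT Project A", 2.0)])
```

Here `usage` holds 5.0 for each of "V3 Project A", "V3" and "Other", and 2.0 for each of "RMIT Project A" and "RMIT".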

As a result, assigning quotas to Institutes as we know them will become unsupported; rather, allocations get assigned to the top level Institute projects, e.g.

  • Project: RMIT
    • Grant: RMIT; 2015-01-01 to 2015-12-31
      • Allocation: RMIT 200M cpu hours
        • AllocationPool; Period=2015; Project=RMIT;
  • Project: LaTrobe
    • Grant: LaTrobe; 2015-01-01 to 2015-12-31
      • Allocation: LaTrobe 200M cpu hours
        • AllocationPool; Period=2015; Project=LaTrobe;
  • Project: Other
    • Project: Deakin
      • Grant: Fair share; 2015-01-01 to 2015-12-31
        • Allocation: Fair share 33M cpu hours
          • AllocationPool; Period=2015; Project=Other;
    • Project: Swinburne
      • Grant: Fair share; 2015-01-01 to 2015-12-31
        • Allocation: Fair share 33M cpu hours
          • AllocationPool; Period=2015; Project=Other;
    • Project: V3
      • Grant: Fair share; 2015-01-01 to 2015-12-31
        • Allocation: Fair share 33M cpu hours
          • AllocationPool; Period=2015; Project=Other;

Where:

  • The grant is saying where the funding for this usage has come from, and for what period. I believe V3 funding is on a per year basis.

  • Allocation is where the limit on resources is provided; Allocations can be created for other resources here too, e.g. disk space limits. Karaage will not enforce these limits, but will raise an alert if they are exceeded.

  • Usage is assigned to the AllocationPool. An AllocationPool can be composed of multiple Allocations, and hence multiple Grants. However, I think we only need one of each at the present time, so the names we give them aren't so important.
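To make the relationship between these three concepts a bit more concrete, here is a rough sketch. The class and field names are illustrative only and are not the real Karaage models. It assumes the limit of a pool is simply the sum of its allocations, and that going over the limit only raises a flag.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Allocation:
    grant: str       # name of the Grant that funds this allocation
    quantity: float  # limit in cpu hours (assumed unit)


@dataclass
class AllocationPool:
    project: str
    period: str
    allocations: List[Allocation] = field(default_factory=list)
    usage: float = 0.0  # usage is recorded against the pool

    def limit(self):
        """Total limit: assumed to be the sum of all allocations."""
        return sum(a.quantity for a in self.allocations)

    def over_limit(self):
        """True if an alert should be raised; nothing is enforced."""
        return self.usage > self.limit()


pool = AllocationPool("Other", "2015", [Allocation("Fair share", 33e6)])
pool.usage = 34e6
alert = pool.over_limit()
```

In this example the pool has used 34M cpu hours against a 33M limit, so `alert` is `True`, but nothing stops further usage.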

The idea is that cpu usage limits are stored in the database in "cpu time" format. However, a GUI could convert these into a percentage of the total time available, e.g. as described in https://github.com/Karaage-Cluster/karaage/issues/222
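A sketch of what such a conversion might look like. This is my own assumption about the arithmetic, not something taken from the linked issue: the total time available is assumed to be the number of cores multiplied by the number of hours in the period, and `limit_as_percentage` is a made-up name.

```python
def limit_as_percentage(limit_cpu_hours, total_cores, period_hours):
    """Express a cpu hour limit as a percentage of the available cpu hours.

    Assumes available time = total_cores * period_hours.
    """
    available = total_cores * period_hours
    if available <= 0:
        raise ValueError("no cpu time available in this period")
    return 100.0 * limit_cpu_hours / available


# 876,000 cpu hours on a 1,000 core cluster over a 365 day year.
share = limit_as_percentage(876000, 1000, 365 * 24)
```

With these made-up numbers the cluster offers 8,760,000 cpu hours in the year, so the limit works out to 10% of the total.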

I hope I have now understood this properly :-)

I have an ER diagram on my desk, with some modifications made by hand.