Architectural resources to "tame" the codebase #694

Open
Splines opened this issue Sep 17, 2024 · 6 comments

@Splines
Member

Splines commented Sep 17, 2024

This is a "structured dumping ground" for useful resources that could help refactoring our code base. I think some issues are pretty-well summarized by a teaser for Jason Swett's book "Growing Large Rails Applications".

"How did our Rails app get to be such a confusing mess?"

Your models are bloated
When you open up a model file, you see a grab bag of random stuff. There are five methods that relate to this concept, two methods that relate to that concept, and about 20 one-off methods that don't have anything to do with anything else. The problem is especially bad in one or two "god objects" that you can't seem to figure out any way to break up.

Your controllers are confusing
Your controllers are bloated, but in a different way. By necessity, you've added custom methods to some of your controllers beyond the defaults that Rails gives you. In some cases you've added quite a lot of custom methods. You want to practice "fat models, skinny controllers" but somehow the controllers are still left carrying a lot of weight.

Your views are messy
You know that view code should be kept as simple as possible. Yet there's still some code that doesn't seem to fit comfortably in the view layer, nor the controller or model layers, nor anywhere else. You've tried using helpers, and that helps some, but it also creates a mess of its own.

Your app generally lacks structure
Your Rails app feels like an amorphous blob, a heap of parts. You can't easily find what you want when you want to find it. When you add new code, it's unclear where to put it, so you just add it to one of the growing piles that are lying everywhere. Your team has tried to fix the problem by adding an app/services directory, and that helps a little, but mostly it just moves the mess around.

While it might not be that bad, we currently tend towards that direction with ever-growing models that act more like a dumping ground. Let's tackle this issue. It's a big one of course and will probably need a lot of time to solve, if it's ever to be solved. It's rather a continuous improvement I guess.

I don't think there's one single recipe that you follow to achieve a super clean and understandable codebase. See also Rails can only take you so far. Apps are so different and operate in such varied domains that they need different "solutions". Instead, let's collect many resources here to learn about new design patterns and about the experience of other people who deal with big code bases. I don't think buying Jason's book is necessary (it is not that cheap in the end) and am certain that we can gather a good pool of websites, blog posts etc. to move forward.

Feel free to edit my comments down below to add links you find useful.

@Splines
Member Author

Splines commented Sep 17, 2024

🎯 Clean Architecture

Ever since I learned about "clean architecture", I have been in love with the concept and have employed it in my own projects. The project structure and the philosophy, as well as the design patterns, really helped separate the concerns in my code. While it might be too much effort to rejig MaMpf to use this kind of architecture, it might still be worth checking out to borrow some ideas.

  • Using DDD and Clean Architecture With Rails / November 2021
    One brave soul described how they went all in and used Clean Architecture in Ruby on Rails, even working around the Rails conventions and trying to use the best of both worlds. Seems like a nice "trying things out" endeavor, but way too much effort to adapt this in our code as this is changing things at the roots. It also looks like a lot of pain to work around Rails conventions.
  • Ways of approaching Clean Architecture & Why take a Clean Architecture approach to Rails? / December 2019
    Two very short posts about how there could be a middle ground: use Rails niceties but also sprinkle in some clean architecture here and there: A more friendly approach for Rails developers is to use more Rails idioms. There are plenty of idioms that match up with the entity, use case and gateway classes from Clean Architecture. Also a note on service objects, which I think is written very well.
  • Clean Architecture / November 2011.
    A short manifesto to use Clean Architecture by Uncle Bob himself.

This isn’t rocket science. The basic idea is very simple. You separate the UI from the business rules by passing simple data structures between the two. You don’t let your controllers know anything about the business rules. Instead, the controllers unpack the HttpRequest object into a simple vanilla data structure, and then pass that data structure to an interactor object that implements the use case by invoking business objects. The interactor then gathers the response data into another vanilla data structure and passes it back to the UI. The views do not know about the business objects. They just look in that data structure and present the response. There are, of course, more details than that; and they are well described in the references above. But at the bottom, that’s all there is to it.

Other dissenting (or perhaps a better word is “skeptical”) views have been less formal. One person simply asked me: “Have you ever actually done this – in a Rails project”, as if Rails was somehow special and changed the game so much that the normal rules of good design don’t apply.

  • Usually, the database is split from the domain in clean architecture. I'm certain we don't want to do this as this would be overkill. ActiveRecord works just fine in my opinion.

  • 37Signals: Code I like
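
As a rough illustration of the interactor idea from the Uncle Bob quote above, here is a minimal plain-Ruby sketch. All names (RedeemRequest, RedeemResponse, RedeemVoucherInteractor) and the voucher use case are made up for illustration and are not MaMpf code; the params hash merely stands in for what a controller would receive.

```ruby
# Hypothetical names throughout; a sketch, not MaMpf code.

# "Simple vanilla data structures" that cross the boundary between UI and business rules.
RedeemRequest = Struct.new(:code, :user_id, keyword_init: true)
RedeemResponse = Struct.new(:success, :message, keyword_init: true)

# The interactor implements exactly one use case and knows nothing about HTTP.
class RedeemVoucherInteractor
  def initialize(valid_codes)
    @valid_codes = valid_codes
  end

  def call(request)
    if @valid_codes.include?(request.code)
      RedeemResponse.new(success: true, message: "Voucher #{request.code} redeemed")
    else
      RedeemResponse.new(success: false, message: "Unknown voucher code")
    end
  end
end

# What a controller action would boil down to: unpack, delegate, hand the result to the view.
params = { "code" => "ABC123", "user_id" => 7 }
request = RedeemRequest.new(code: params["code"], user_id: params["user_id"])
response = RedeemVoucherInteractor.new(["ABC123"]).call(request)
puts response.message
```

The view would only ever see the response struct, never the business objects. Whether that extra layer pays off in a Rails app is exactly the open question of this thread.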

Actual examples

  • See e.g. Paul's and my ResultWizard project.
  • At uni, we did a course on Clean architecture and were supposed to make a project. In my documentation, I also explain many concepts of clean architecture and domain-driven design. (Note that the images are currently broken, I've opened an issue on the plantuml GitHub repo for that).

@Splines
Member Author

Splines commented Sep 17, 2024

📂 Folder structure

In my opinion, the topic of file/folder structure is one of the most important topics to achieve a sustainable codebase. I mean, it's like the house we're living in whenever we work with the MaMpf code, so it should feature a clear separation of what the rooms are for. And it shouldn't take too long to find a room.

To be honest, I feel a bit uneasy with the Rails default folder structure. For example, I find it weird that it separates files by their type and not by the feature they belong to.

  • Organizing Rails files by meaning / October 2022
    This one explains at a high-level what I mean.
  • Folder structure / September 2021
    This one goes more in-depth with actual examples. I like that they find a good balance: still the rigorous MVC separation (that I also like), but then subfolders to cluster according to meaning/feature. I don't like every idea there (e.g. the naming for the web folder is a bit weird IMHO), but I think this is a great resource to take inspiration from.
  • An extensive example of project organization / May 2020
    I didn't get any value reading this and didn't like the ideas presented there.

@Splines
Member Author

Splines commented Sep 17, 2024

💫 Design patterns / Refactoring

We don't have to reinvent the wheel: there are many established design patterns out there that could be useful. But I feel like there is sometimes an over-emphasis on these patterns; more often than not, some plain old Ruby classes should work just fine to create cohesive modules with low coupling between them.

  • Refactoring Guru
    This website is language-agnostic. I've been looking at it for many years and it's always amazing. Among others, it features a catalogue of design patterns, with concise examples and really beautiful visualizations. If you want to know what patterns there are, this is the go-to reference. It also features Refactoring techniques.
    But note that design patterns in general are quite a bit "hyped" on the web IMHO. Clear naming for methods, variables and files, docstrings, tests, file structure etc. are also important. Generally, I like to think about "communicate this or that as clearly as possible" and everything else follows from that, e.g. having files that are very limited in scope (single-responsibility principle) etc.
  • 7 Patterns to Refactor Fat ActiveRecord Models October 2012, but timeless
    This one hooked me right from the start with precise descriptions. And it feels easy to implement, most of these things are really just POROs.

Concerns

Later, however, you find that working with the model class isn’t any easier. In fact it’s worse. All the complicated dependencies and interactions still exist, but now they are spread across multiple files, making them even harder to understand.

But, this blog post isn't about convincing you not to use them; it is about explaining the reasons why I don't use them. I prefer using Ruby's built-in, more flexible include/extend syntax.

Service Objects

  • Why Service Objects are an Anti-Pattern. This one is really great as it explains every major statement with an example and then goes on to present valuable alternatives to service objects and implements them. This article is "pro concerns" and here you see a very good example of how concerns could be used in a reasonable way. You might want to skip the first paragraphs and start with the one that begins with "We have a pretty simple data model".

@Splines
Member Author

Splines commented Sep 18, 2024

▶ Concrete ideas on how to proceed in our codebase

These are just some ideas to make models smaller, more cohesive and reduce their inter-dependency (if there is any?).
These might be very different approaches, I just want to write them down such that I don't forget. Nothing set in stone yet.

  • Start inside-out, e.g. take a big model like User. Identify groups of methods/scopes that semantically belong together, e.g. by identifying common nouns. Try to put those in a separate file, use patterns and Rails niceties. Start here and make changes in other layers if necessary. But also note this contra opinion on filesize. Maybe also useful: concerned_with
    PRs should be very small: literally just handle this one semantic group and change nothing else on the user object. This way, we can also progressively write unit and maybe even more importantly full integration/system tests that cover whole user workflows / use cases. And it's a lot easier to review than if the whole user model was refactored at once.
  • One very specific case in the User model: we have methods like tutor?, teacher? etc. Those are all role-related, so it might be worth having a separate class/... for User roles. This could be one way to make the User class smaller.
  • 🙄 To be honest: the more I read about design patterns, architectural patterns, blog posts etc. the more I feel overwhelmed. There's so many approaches out there and I haven't had a breakthrough yet on how to magically have a more structured codebase. With the VSCode search, I'm already quite fast and find everything I need in the codebase. And I feel like I know my way around the code and what to find in which folders, so it's probably not that bad after all.
  • Maybe use a command-line tool to find the biggest files in subfolders.
  • ✅ Whenever we refactor anything in the sense of this issue, write a test for it. This will help us to understand the changes and make sure it really does what we want. It will also help onboard new developers later on as they can look at the tests to see how things are used.
  • A UML diagram of models and their relationships might help. See Init support for entity-relationship diagram creation (Ruby ERD) #697.
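
To make the role idea from the list above a bit more tangible, here is a minimal plain-Ruby sketch. UserRoles is a hypothetical name, and deriving the roles from two lists of lectures is purely an assumption for the example; how the real User model determines tutor? and teacher? is not shown in this issue.

```ruby
require "forwardable"

# Hypothetical value object bundling all role-related questions about a user.
class UserRoles
  def initialize(tutored_lectures:, taught_lectures:)
    @tutored_lectures = tutored_lectures
    @taught_lectures = taught_lectures
  end

  def tutor?
    @tutored_lectures.any?
  end

  def teacher?
    @taught_lectures.any?
  end
end

# Stand-in for the real ActiveRecord model; only the role-related part is shown.
class User
  extend Forwardable

  # Keeps user.tutor? and user.teacher? working for existing callers.
  def_delegators :roles, :tutor?, :teacher?

  def initialize(tutored_lectures: [], taught_lectures: [])
    @tutored_lectures = tutored_lectures
    @taught_lectures = taught_lectures
  end

  def roles
    UserRoles.new(tutored_lectures: @tutored_lectures, taught_lectures: @taught_lectures)
  end
end

user = User.new(tutored_lectures: ["Linear Algebra"])
puts user.tutor?
puts user.teacher?
```

Because the old method names are delegated, such a change could go into one of the very small PRs mentioned above without touching any caller.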

Pain points to address

  • I think the biggest pain point is the non-DRYness of our controllers and models. Especially the models have so many different behaviors and controllers also include too much business logic. Maybe one approach could be to create subfolders in models and split the big models into multiple classes in these subfolders. Every file should only deal with one very specific thing (single-responsibility principle) and be as cohesive as possible. It might be tempting to just move functionality into separate files, but this good post argues against it.
  • JavaScript code is currently present in app/assets but also in the views/ subfolders. For me, this is very annoying as I constantly have to jump between folders that may live far apart from each other. Why not have something like a frontend folder (or web or whatever) with subfolders that reflect use cases? There, we can put the .html.erb "views" (templates) as well as the JavaScript and CSS files. Whenever I work in the frontend, I will most likely touch the trinity of HTML, JS and CSS, so I feel like they could also live together.
  • In the MaMpf user interface, we have a very clear distinction between the normal user view and the view for users with elevated rights. We even separate this by a change of color in the top title bar, so it's a big thing. However, I don't see this reflected anywhere in the code. Taking inspiration from here, why not introduce respective subfolders?

@fosterfarrell9
Collaborator

fosterfarrell9 commented Sep 23, 2024

Thank you for all your work here!
Since it has come up in the review of #671, I think we should see if we can agree on some concept on how to proceed here, so that new PRs can already be written in a way that follows our ideas. I totally agree with you on this point (which in my mind is also the most pressing one):

I think the biggest pain point is the non-DRYness of our controllers and models. Especially the models have so many different behaviors and controllers also include too much business logic. Maybe one approach could be to create subfolders in models and split the big models into multiple classes in these subfolders. Every file should only deal with one very specific thing (single-responsibility principle) and be as cohesive as possible. It might be tempting to just move functionality into separate files, but this good post argues against it.

As I did in our basecamp pings, I want to contribute this link, henceforth known as A. I personally think that using the things that Rails already offers (instead of completely overworking everything like the person in the first DDD link) would be a good way to go, also considering our limited resources. In the links above there are two links which are on different ends of the spectrum regarding this (in particular towards concerns), namely this one, henceforth known as B, and the one I linked above. I personally have a preference for the architecture described in A, but I would love to hear your opinion on this, and I can still be convinced of B (also, B offers some concepts that we can use as well even if we follow A's way concerning concerns).

This would be relevant already in #671 as it currently introduces a service model VoucherProcessor. This could be discussed in #671 as well but I think it provides a nice example of what we want to achieve in #694 so I will discuss it here.

If I understand the point of view of A correctly (please correct me if I am wrong here), it would be better to do it this way (or in a similar way):

# app/models/voucher.rb
class Voucher < ApplicationRecord
  include Redeemable
  ...
end
# app/models/voucher/redeemable.rb
module Voucher::Redeemable
  extend ActiveSupport::Concern

  included do
    has_many :redemptions, dependent: :destroy
  end

  def redeem(...)
    # what is currently in the VoucherProcessor
  end
end

such that in the redeem action in the VouchersController we would directly interact with the Voucher model by something like voucher.redeem(...) instead of VoucherProcessor.call(voucher, current_user, check_voucher_params).

Refactoring other classes could then follow a similar pattern just as you described (for A, it would mean concerns).

@Splines
Member Author

Splines commented Oct 3, 2024

I personally think that using the things that Rails already offers (instead of completely overworking everything like the person in the first DDD link) would be a good way to go, also considering our limited resources.

I agree that OOP is a tool we shouldn't underestimate. Creating POROs can greatly help orchestrating everything while keeping it DRY. Citing resource A: "Delegating functionality to additional systems of objects (AKA using plain object-oriented programming)".

And yes: going full into DDD with clean architecture is not feasible for us with the already existing code. It might still be a valuable source for inspiration and we might borrow very specific things from it.


A vs. B

💠 Concerns

The main point of B that rebels against A is this paragraph:

Prefer composition to inheritance. Using mixins [concerns] like this is akin to “cleaning” a messy room by dumping the clutter into six separate junk drawers and slamming them shut. Sure, it looks cleaner at the surface, but the junk drawers actually make it harder to identify and implement the decompositions and extractions necessary to clarify the domain model.

This is countered in A by:

[...] you must differentiate between SRP violations at the interface level or the implementation level:

The SRP violation we care more about is violation at the implementation level. Plainly put, we care whether the class really does all of that stuff or whether it just delegates to a couple of other classes. If it delegates, we don’t have a large monolithic class; we just have a class that is a facade, a front end for a bunch of little classes and that can be easier to manage.

So B argues that it's just a "facade": we clean the room by putting the dirt into smaller bins (e.g. using concerns aka "mixins"). In contrast, A argues that this is exactly what makes looking at the room nicer: bins can be labeled, and then it's easier to work with those smaller pieces inside a cleaner room.

The distinction can be really subtle. I find that pattern 6 (policies) in B provides a good example:

class ActiveUserPolicy
  def initialize(user)
    @user = user
  end

  def active?
    @user.email_confirmed && @user.last_login_at > 14.days.ago
  end
end

We could easily model this exact same thing using concerns. The only real difference would be that, using concerns, we wouldn't pass the user variable to another independent object (here of type ActiveUserPolicy). Instead, the active? method would be callable directly on the user object itself, which could clutter its public API. The question is whether that really bothers us. At least, with Rails concerns, we still have a separation into another file. So, on the implementation level, User and a possible concern similar to ActiveUserPolicy are independent, but not so much from the runtime/architectural point of view, where one file is kind of included in the other. But if a method name like active? is really clear, why not have it in the scope of the user object?

For a simple example as this one here, I'd prefer the point of view of B where we have a really independent PORO: both on the implementation level and the "runtime" level as it gets passed the user as parameter. Also for number 1 in B, this would be my go-to approach, e.g. extracting simple value objects such as the presented class Rating.
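
To see the two variants side by side, here is a minimal plain-Ruby sketch. The Activatable module is a hypothetical name for what a concern would boil down to here, User is a simple Struct standing in for the ActiveRecord model, and FOURTEEN_DAYS replaces the Rails-only 14.days.ago so that the snippet runs without Rails.

```ruby
FOURTEEN_DAYS = 14 * 24 * 60 * 60 # seconds; plain-Ruby stand-in for 14.days

# Variant 1: a mixin (roughly what a Rails concern would amount to in this case).
module Activatable
  def active?
    email_confirmed && last_login_at > Time.now - FOURTEEN_DAYS
  end
end

# Variant 2: B's independent policy object, which receives the user as a parameter.
class ActiveUserPolicy
  def initialize(user)
    @user = user
  end

  def active?
    @user.email_confirmed && @user.last_login_at > Time.now - FOURTEEN_DAYS
  end
end

# Stand-in for the ActiveRecord model; it includes the mixin to demonstrate variant 1.
User = Struct.new(:email_confirmed, :last_login_at) do
  include Activatable
end

user = User.new(true, Time.now - 3600)
puts user.active?                        # variant 1: the method is part of the user's public API
puts ActiveUserPolicy.new(user).active?  # variant 2: only needs the two attribute readers
```

The logic is identical in both cases; the only thing that changes is whether active? ends up on the user object or on a separate object that wraps it.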

But what do you do if you need more than a simple value object and instead have to access data from the database? You might need some has_many statements, but then you can't just easily outsource from a big model to a new class. I feel like article B falls short in these cases, whereas A provides more real-life examples, e.g. the module Recording::Copyable. I'm not sure how this scenario could be tackled with only the tools that B offers. The question is how to outsource methods that semantically belong together into a new class when that new class needs access to the database and also shares some data with the original class it came from. Here, concerns could help by allowing us to at least put code in separate files.

Tip

In my opinion, the better approach would be to first look at the infrastructure level and think about whether we can outsource some parts of a class into a completely separate "thing" (PORO or ActiveRecord) that might stand in its own right (which might include adding another table in the DB). If that doesn't work, one can still resort to concerns in order to extend an existing thing by means of another file.

Just one point that I don't like in the example in A:

class Recording < ApplicationRecord
  include Incineratable, Copyable
end

module Recording::Incineratable
  def incinerate
    Incineration.new(self).run
  end
end

For module Recording::Copyable, I understand that they use concerns as we don't only outsource the method, but also the has_many :copies, ... statement. But for Recording::Incineratable I see no benefit. The incinerate method is just delegating to yet another object (here of type Recording::Incineration). So why not just put def incinerate ... as-is into the original Recording class? I don't see any benefit of employing concerns here (but for Recording::Copyable I still do).
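
What I have in mind would look roughly like this plain-Ruby sketch. The body of Incineration, the incinerated? flag and mark_incinerated! are invented for the example, since the snippet from A does not show them; Recording is a bare class here instead of an ApplicationRecord so that it runs without Rails.

```ruby
class Recording
  attr_reader :title

  def initialize(title)
    @title = title
    @incinerated = false
  end

  # Defined directly on the class: it only delegates, so a separate concern adds little.
  def incinerate
    Incineration.new(self).run
  end

  def incinerated?
    @incinerated
  end

  def mark_incinerated!
    @incinerated = true
  end

  # Hypothetical body; the actual work would live here, not in Recording.
  class Incineration
    def initialize(recording)
      @recording = recording
    end

    def run
      @recording.mark_incinerated!
    end
  end
end

recording = Recording.new("Lecture 1")
recording.incinerate
puts recording.incinerated?
```

Recording stays small either way, because the heavy lifting sits in Recording::Incineration; the one-line delegation hardly justifies its own file.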

💠 Services

I feel that A & B somehow agree on the point that we shouldn't resort to services too hastily. From A:

Don’t lean too heavily toward modeling a domain concept as a Service. Do so only if the circumstances fit. If we aren’t careful, we might start to treat Services as our modeling “silver bullet.” Using Services overzealously will usually result in the negative consequences of creating an Anemic Domain Model, where all the domain logic resides in Services rather than mostly spread across Entities and Value Objects.

And B (point 2):

I reach for Service Objects when an action meets one or more of these criteria:

  • The action is complex (e.g. closing the books at the end of an accounting period)
  • The action reaches across multiple models (e.g. an e-commerce purchase using Order, CreditCard and Customer objects)
  • [...] The action is not a core concern of the underlying model (e.g. sweeping up outdated data after a certain time period). [...]

For me that means, think twice when using a concept that resembles a service. Maybe a simple value object (or something else?) might fit better. But if an action is complex enough or even reaches across multiple models, we might consider using a service.

Now A shies away from calling their POROs (that are kind of services) by that name: "service". For me it's still kind of like a service, but that's just naming. Whatever you might call it, I like A's idea of not treating services in a special way (for me that also means not necessarily putting them in a dedicated services folder):

We don’t use services as first-class architectural artifacts in the DDD sense (stateless, named after a verb), but we have many classes that exist to encapsulate operations. We don’t call those services and they don’t receive special treatment. We usually prefer to present them as domain models that expose the needed functionality instead of using a mere procedural syntax to invoke the operation.

See A's signup example
class Projects::InvitationTokens::SignupsController < Projects::InvitationTokens::BaseController
  def create
    @signup = Project::InvitationToken::Signup.new(signup_params)

    if @signup.valid?
      claim_invitation @signup.create_identity!
    else
      redirect_to invitation_token_join_url(@invitation_token), alert: @signup.errors.first.message
    end
  end
end

class Project::InvitationToken::Signup
  include ActiveModel::Model
  include ActiveModel::Validations::Callbacks

  attr_accessor :name, :email_address, :password, :time_zone_name, :account

  validate :validate_email_address, :validate_identity, :validate_account_within_user_limits

  def create_identity!
    # ...
  end
end

So instead of having a SigningUpService in charge of the “signing up” domain operation, we have a Signup class that lets you validate and create an identity in the app.

The signup is modeled as a PORO and then instantiated. It uses include statements to import relevant ActiveModel "things", e.g. to be able to call validate. Now compare to B's approach (point 2):

See B's authentication example
class UserAuthenticator
  def initialize(user)
    @user = user
  end

  def authenticate(password)
    return false unless @user

    if BCrypt::Password.new(@user.password_digest) == password
      @user
    else
      false
    end
  end
end

class SessionsController < ApplicationController
  def create
    user = User.where(email: params[:email]).first

    if UserAuthenticator.new(user).authenticate(params[:password])
      self.current_user = user
      redirect_to dashboard_path
    else
      flash[:alert] = "Login failed."
      render "new"
    end
  end
end

I don't see a big difference here. In A, the controller stores the object in the @signup variable and then calls methods on it. We do the same in B (but instead of storing the object in a variable first, we call authenticate directly on the newly created object). But that's not an important difference here. Maybe B's approach is a bit closer to what you have implemented so far as VoucherProcessor. I strongly agree with A that we shouldn't treat these "services" in a special way, which means (for me) not calling them Processor and not introducing a special call method. Instead, just POROs would be better in my opinion, with names that make sense for the specific scenario, e.g. here VoucherRedeemer or something similar.
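
Just to illustrate the naming point, a PORO along those lines could look like the following plain-Ruby sketch. Voucher and User are simple Structs standing in for the real models, the redeemers attribute and the "only once per user" rule are assumptions for the example, and none of this mirrors what VoucherProcessor currently does in #671.

```ruby
# Stand-ins for the real models; the attribute names are assumptions.
Voucher = Struct.new(:code, :redeemers)
User = Struct.new(:name)

# A PORO named after what it does, with a descriptive method instead of a generic `call`.
class VoucherRedeemer
  def initialize(voucher, user)
    @voucher = voucher
    @user = user
  end

  # Hypothetical rule: a user can redeem a given voucher only once.
  def redeem
    return false if @voucher.redeemers.include?(@user)

    @voucher.redeemers << @user
    true
  end
end

voucher = Voucher.new("ABC123", [])
alice = User.new("Alice")
puts VoucherRedeemer.new(voucher, alice).redeem
puts VoucherRedeemer.new(voucher, alice).redeem
```

In the controller this would read VoucherRedeemer.new(voucher, current_user).redeem, which already says more than VoucherProcessor.call(...).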

But this is actually a great example where stepping back for a moment yields another possibility, namely the one you proposed. Without any "service", but instead using concerns.

Your proposed approach with concerns instead.
class Voucher < ApplicationRecord
  include Redeemable
  ...
end

module Voucher::Redeemable
  extend ActiveSupport::Concern

  included do
    has_many :redemptions, dependent: :destroy
  end

  def redeem(...)
    # what is currently in the VoucherProcessor
  end
end

I think this really shows the power of concerns. In this example, Voucher and Redemption are closely related: redemption is an essential part of a voucher. So I feel it's ok to not model the redemptions far away from the vouchers and instead include them in the voucher itself. But to avoid too much clutter in the voucher, we use concerns to split it out into another file. Calling voucher.redeem() feels very natural to me.
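
For anyone who wants to play with the call site outside of Rails, here is a rough plain-Ruby approximation of the concern above. It is only a sketch: an array replaces has_many :redemptions, Redemption is a simple Struct, users are plain strings, and the rule inside redeem is invented, as the real logic currently lives in VoucherProcessor.

```ruby
# Without Rails autoloading, the namespace has to exist before the module is defined.
class Voucher; end

Redemption = Struct.new(:voucher, :user)

module Voucher::Redeemable
  # Plain-Ruby stand-in for `has_many :redemptions`.
  def redemptions
    @redemptions ||= []
  end

  # Hypothetical rule: a user can redeem a given voucher only once.
  def redeem(user)
    return nil if redemptions.any? { |redemption| redemption.user == user }

    Redemption.new(self, user).tap { |redemption| redemptions << redemption }
  end
end

class Voucher
  include Redeemable
end

voucher = Voucher.new
puts voucher.redeem("alice").user
puts voucher.redeem("alice").inspect
puts voucher.redemptions.size
```

The call site is just voucher.redeem(user), while everything about redemptions stays together in one file.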

Maybe this also shows that we should sometimes just write down what we'd like the statement to roughly look like in the end (regardless of any squiggly lines and warnings you get in the IDE). For example, you might have written voucher.redeem() right from the start here. With that in mind, it can be easier to identify the strategies to use. This is because one sits down to think about a nice-to-read statement, and while doing that, one inevitably has to think through object compositions and their relations to each other. If I had started to implement the voucher redemption, I would have put it directly in the voucher class first, just to realize later that the redemption part could stand on its own. This is just to underline how closely vouchers and redemptions are related.

💠 Summary

In summary, I think the patterns presented in B can be very useful if you happen to be in the exact or a similar situation to the one the examples were made up for. This is always the thing with these guides: there's no one-size-fits-all solution and there won't be, as requirements are just too context-sensitive for the very specific programming project in mind. There's also no definitive "How to write a book" guide as there are just too many genres and everyone has their own style.

I like the pragmatic approaches taken in A even though they raise my hackles sometimes as they are so different from what I'm used to from (my limited experience with) DDD. But I acknowledge that their approach works as well and that Rails might even be designed in a way that is tailored to the specific needs of web applications, where boundaries of layers can be blurry. And apparently, they navigate the landscape well with their approach. The more I work with Rails, the more I like their pragmatism.

We don’t separate application-level and domain-level artifacts. Instead, we have a set of domain models (both Active Records and POROs) exposing public interfaces to be invoked from the system boundaries, typically controllers or jobs. We don’t separate that API from the domain model, architecturally speaking.

We care a lot about how we design these models and the API they expose, we just find little value in an additional layer to orchestrate the access to them. [...] In other words, we don’t default to create services, actions, commands, or interactors to implement controller actions.

Most of our controllers use this approach of accessing models directly: a model exposes a method, and the controller invokes it.

As this is already our approach (I think?), maybe we can leave that as it is. But if classes get too big, introduce composition within the domain model to outsource to objects of their own. Most of this can probably be done within app/models.

Maybe this whole thing is even more about just taking the time to step back and think through component boundaries:

  1. What is this new feature really about? Where do I want to integrate it in the frontend (roughly)? Also see Interface First.
  2. How can I represent the data structure in terms of domain objects? What are the key data fields needed for it?
  3. With what other components of my software will this feature interact? And in which ways? Is it "close" to another component? Even a 1:1 relationship? Is it very similar to an already existing component such that generalization would make sense (still prefer composition over inheritance)?
  4. Based on all this: where could be a good place to settle the domain-related code for this feature? Can we think of simple and succinct names for its public interface? As written in A: care a lot about the public interface of any class. Naming is oftentimes key! If you see this code in one year, do you know what to expect from this method based on just its name?

For existing classes that are too big

  • Identify where (e.g. inside a single class) multiple domain concepts are mixed together. Group them in natural language, e.g. these two methods discuss the same thing, these two attributes discuss another thing etc. Here it can be (again) helpful to look at the frontend and see how boundaries are defined there.
  • Think about how the identified groups interact with each other. Do they stand completely on their own? Perfect, then it's even easier to separate them into distinct files. Start with: is it possible to design their own PORO class for them decoupled from everything else? If that's not possible, what's the minimal set of attributes that needs to be shared with other objects?

As I'm also learning a lot of new things here, I might put forward contradictory arguments. Please point it out to me should I contradict myself in some places ;)

@Splines Splines changed the title Contain the ever-growing code base Architectural resources to "tame" the codebase Oct 8, 2024
@Splines Splines pinned this issue Oct 8, 2024
Splines added a commit that referenced this issue Oct 8, 2024
According to concepts discussed in #694.
Also auto-load subdirectories in models folder
and set Current.user for usage in models.

2 participants