
Subscribe Update Publish

robnagler edited this page Sep 10, 2022 · 1 revision

Definitional GUI

A Definitional GUI is one that is driven off a purely declarative definition of a Sirepo "app". Currently, we use a "schema" to do this, but it is propped up by too many lines of JavaScript, HTML, and SVG. This needs to change, because our development needs to be scalable.

We have made some great strides towards this with the next generation GUI written in React, but we need to go further. Logic is already creeping in which requires denormalization of the schema. This is implicit coupling without assertions; consider, e.g., what would happen if friendly were misspelled or changed.

In a definitional GUI, the server and the client share the exact same description of what the GUI sees. This was the intent of SGML/HTML, but obviously HTML is no longer definitional.

Reactive Dogs

When a dog is friendly in the Sirepo toy app called myapp, the user gets an opportunity to enter a Favorite Treat; that's what the logic noted above does. This is quite typical for Sirepo apps.

We really want our GUI to be "reactive", that is, when we click, it should do something. If you change the Disposition of the dog model to friendly, the Favorite Treat box should pop up. Or, more importantly, when you change a value that invalidates a report, it should clear the report or at least show it no longer matches the data. This is why people prefer Overleaf to running TeX manually, or worse, they use Word instead of a declarative language like TeX to do word processing.
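The friendly/Favorite Treat dependency above could, in principle, be expressed entirely in the schema. Here is a minimal sketch in Python; the schema shape, the visible_if key, and the field names are invented for illustration and are not Sirepo's actual schema format:

```python
# Hypothetical declarative schema: the field dependency lives in data,
# not in GUI code. "visible_if" is an invented key for this sketch.
SCHEMA = {
    "dog": {
        "disposition": {"widget": "select", "choices": ["friendly", "aggressive"]},
        "favoriteTreat": {
            "widget": "text",
            # shown only when this condition on the model values holds
            "visible_if": {"field": "disposition", "equals": "friendly"},
        },
    },
}

def visible_fields(model_name, model_values, schema=SCHEMA):
    """Return the fields to render, driven purely by the schema."""
    res = []
    for name, defn in schema[model_name].items():
        cond = defn.get("visible_if")
        if cond is None or model_values.get(cond["field"]) == cond["equals"]:
            res.append(name)
    return res
```

A GUI driven this way needs no app-specific logic: changing disposition to friendly makes Favorite Treat appear because the schema says so, and a misspelled value simply fails the condition rather than silently breaking distant code.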

Illusive Reactivity

The Favorite Treat example is trivial and a bit of an illusion. Consider the problem of changing the US state in a GUI that also requires you to enter the county, or, adding to that, changing the country in a GUI that requires you to enter a valid state/province/canton. Ideally you would allow the user to type the state or select it from a drop down. Most applications don't do that, and instead let the user enter something invalid and validate on the server. It's not practical to load the entire database of countries and states into the client, which is why most address entry applications used to be so lousy.

Enter Google Maps and similar services that allow you to start typing an address which gets autocompleted, usually just based on the street numbers. This requires the GUI (browser) to hit a server on each character. It's very cool and very fast, and it does not require any coupling in the GUI except that there will be autocomplete choices coming back dynamically as characters are entered.
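The server side of such an autocomplete can be very simple; the coupling is only that choices come back as characters are entered. A minimal sketch (the index and function names are invented, and a real service would query an indexed store rather than scan a list):

```python
# Toy autocomplete endpoint logic: called once per keystroke with the
# text typed so far, it returns candidate completions.
COUNTRIES = ["Canada", "Cambodia", "Cameroon", "Chad", "Chile", "China"]

def autocomplete(prefix, index=COUNTRIES, limit=5):
    """Return up to `limit` entries starting with `prefix`, case-insensitively."""
    p = prefix.lower()
    return [e for e in index if e.lower().startswith(p)][:limit]
```

The GUI never holds the database; it just renders whatever list comes back, which is exactly the decoupling the text describes.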

ReportingDatabase

What makes Google Maps fast is the concept of a ReportingDatabase. When you see discussions about why Google Maps is so fast, people fail to mention this important concept:

The operational needs and the reporting needs are, however, often quite different - with different requirements from a schema and different data access patterns. When this happens it's often a wise idea to separate the reporting needs into a reporting database, which takes a copy of the essential operational data but represents it in a different schema.

The (as always excellent) Martin Fowler article goes on to say:

A reporting database fits well when you have a lot of domain logic in a domain model or other in-memory code. The domain logic can be used to process updates to the operational data, but also to calculate derived data with which to enrich the reporting database.

We have a lot of domain logic in the GUI that is reactive, some of which is in the GUI only, e.g. changing the color map of a graph. We also have a lot of domain logic that defines what a report is and how it is generated. Most of that resides on the server, but too much remains in the GUI, which makes it hard to test and understand.

We do not have an official reporting database. Rather, the GUI maintains this in a variety of ways, mostly in appState, which is a cache of all the models. The cache is complex and not centrally synchronized so it is easy for application updates to get lost. Cache updates are manually managed.
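The operational/reporting split Fowler describes can be sketched in a few lines. The table shapes and names below are invented for illustration; the point is that the reporting copy denormalizes the operational data into the shape the reader needs:

```python
# Invented operational schema: normalized, shaped for updates.
operational = {
    "simulations": {1: {"name": "myapp", "owner": 7}},
    "users": {7: {"display": "alice"}},
    "runs": {10: {"sim": 1, "percent": 42.0}},
}

def project(db):
    """Copy essential operational data into a reporting-friendly shape:
    one flat record per run, with the joins already done."""
    out = {}
    for run_id, run in db["runs"].items():
        sim = db["simulations"][run["sim"]]
        out[run_id] = {
            "simulation": sim["name"],
            "owner": db["users"][sim["owner"]]["display"],
            "percent": run["percent"],
        }
    return out
```

An official reporting database would make this projection explicit and centrally synchronized, instead of being reassembled ad hoc in appState.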

MemoryImage

The other thing that makes Google Maps fast is the MemoryImage concept. Google can respond so quickly only by maintaining in-memory images all over the planet.

Fowler writes about MemoryImage:

The key element to a memory image is using event sourcing, which essentially means that every change to the application's state is captured in an event which is logged into a persistent store. Furthermore it means that you can rebuild the full application state by replaying these events. The events are then the primary persistence mechanism.

A familiar example of a system that uses event sourcing is a version control system. Every change is captured as a commit, and you can rebuild the current state of the code base by replaying the commits into an empty directory. In practice, of course, it's too slow to replay all the events, so the system persists periodic snapshots of the application state. Then rebuilding involves loading the latest snapshot and replaying any events since that snapshot.

His choice of example is somewhat interesting in that Git does not need the version history to recreate the present: HEAD is always the top of the branch. However, if you want to know about the history, you need all the commits, and then you can replay them to recreate the history on a new branch, for example.

Our problem is slightly different: we do not need to reproduce history, although that would be a side effect of this approach. Our problem is that the data are distributed in many different forms (run dirs, for example), and we do not have a consistent way of describing them. We use background_percent_complete, among many other examples, as a grab bag of data. This anything-goes approach is, naturally, problematic.
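The snapshot-plus-replay scheme Fowler describes can be sketched in a few lines. This is a toy, with invented event and state shapes, assuming each event sets one field of the application state:

```python
import copy

def apply_event(state, event):
    # each logged event describes exactly one change to application state
    state[event["field"]] = event["value"]
    return state

def rebuild(snapshot, snapshot_serial, log):
    """Load the latest snapshot, then replay only events logged after it.
    `snapshot_serial` is the index of the first event NOT in the snapshot."""
    state = copy.deepcopy(snapshot)
    for serial, event in enumerate(log):
        if serial >= snapshot_serial:
            apply_event(state, event)
    return state
```

Replaying the full log from an empty state and replaying from a snapshot must yield the same result; that invariant is what makes the events the primary persistence mechanism.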

Eventual Consistency

Modern systems are built around eventual consistency. Sirepo is no different. We submit a simulation request for an animation, and the data approximates what the simulation is actually doing. Somebody else looking at the same running simulation might have a different view.

In Eventual Consistency is a UX Nightmare, Derek Comartin suggests solutions to the eventual consistency problems for the UI. One of them is this:

Another option is to push to the client once the replication (or projection) has occurred. This could be accomplished with something like WebSockets where you establish a connection from the client to server and then have the server push to the client to notify them the replication (or projection) has occurred. At this point, the client can then perform a query to get the latest data.

Many people suggest this solution: push notification on the web socket followed by a query to get the data. However, why not simply get the data back after registering interest?

Command Query Responsibility Segregation

The Command Query Responsibility Segregation (CQRS) pattern is pretty common on the web. This was popularized by Representational state transfer (REST). There are atomicity issues: a stack pop, for example, must read and mutate in one operation, so it cannot be split into a command and a query. However, that is not something we support today. We do just the opposite: write the data and then send the cached data to begin a simulation. (The stack-pop analogy would be: write and simulate in one operation.)
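The essence of CQRS is that commands mutate a write model and return nothing, while queries read a separate view and mutate nothing. A minimal sketch, with invented class and method names:

```python
class WriteModel:
    """Command side: accepts updates, returns nothing, emits events."""

    def __init__(self):
        self.models = {}
        self.listeners = []

    def update(self, name, data):
        # command: mutate, notify, return no value
        self.models[name] = data
        for fn in self.listeners:
            fn(name, data)

class ReadModel:
    """Query side: a projection kept in sync by events from the write side."""

    def __init__(self, write_model):
        self.view = {}
        write_model.listeners.append(self._on_update)

    def _on_update(self, name, data):
        self.view[name] = data

    def query(self, name):
        # query: read only, no side effects
        return self.view.get(name)
```

Because the read side is fed by events, it could just as well live in another process, be a MemoryImage, or, as proposed below, push to subscribers instead of waiting to be queried.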

Fowler writes:

CQRS naturally fits with some other architectural patterns.

  • As we move away from a single representation that we interact with via CRUD, we can easily move to a task-based UI.
  • CQRS fits well with event-based programming models. It's common to see CQRS systems split into separate services communicating with Event Collaboration. This allows these services to easily take advantage of Event Sourcing.
  • Having separate models raises questions about how hard to keep those models consistent, which raises the likelihood of using eventual consistency.
  • For many domains, much of the logic is needed when you're updating, so it may make sense to use EagerReadDerivation to simplify your query-side models.
  • If the write model generates events for all updates, you can structure read models as EventPosters, allowing them to be MemoryImages and thus avoiding a lot of database interactions.
  • CQRS is suited to complex domains, the kind that also benefit from Domain-Driven Design.

[Note: Domain-Driven Design link was changed from the original.]

This section was near the beginning of the research that brings us to the point of the present discussion: use CQRS for all these reasons, but instead of Query, use subscriptions.

Schema Tells All

If we stretch the definition of Domain-Driven Design, we can say that our domain experts defined some code which we use as the language for how we talk about Sirepo apps. Specifically, the schema is the Rosetta Stone between the expert and Sirepo. It almost completely identifies all updates and queries. This proposal says: do it all, and use CQRS with subscriptions.

CQRS normally implies that the reporting database is distinct from the application. Fowler writes:

A few years ago I wrote about a couple of systems using an EventPoster architecture. This style provides read access to the in-memory model to lots of UIs for analytic purposes. Multiple UIs mean multiple threads, but there's only one writer (the event processor) which greatly simplifies concurrency issues.

The Sirepo GUI is a reader, but not for "analytic purposes". Rather, the GUI is a display engine for very complex domains. Because those domains are complex, there are lots of tweaks in the GUI. If we get rid of the tweaks, we can move to a fully definitional model, with commands in the CQRS model simply being updates to the models, and analysis (runs) persisted on the server.

Subscribe-Update-Publish (SUP)

In this model, reads are subscription-based: the GUI registers interest in a page, which is well defined in the schema. The subscription results in a publication that includes all the data associated with that page: reports, form values, etc. This data is wholly consistent with what's on the server, because it comes from the server. When updates happen to the subscribed data, they are published to the GUI.

When the page changes, the subscription changes, and the process starts over.
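The subscribe-then-publish flow can be sketched as a small server-side session object. The class, method names, and message shape below are all invented for illustration; a real implementation would send over a websocket:

```python
class Session:
    """One session per GUI instance: holds the page subscription and
    publishes all page data on subscribe and on each relevant update."""

    def __init__(self, send):
        self._send = send          # e.g. a websocket send function
        self._page = None
        self._models = {}

    def subscribe(self, page, page_models):
        self._page = (page, page_models)
        self._publish()            # first publication carries all page data

    def update(self, model, data):
        self._models[model] = data
        page, page_models = self._page
        if model in page_models:
            self._publish()        # push only if the page shows this model

    def _publish(self):
        page, page_models = self._page
        self._send({"page": page,
                    "models": {m: self._models.get(m) for m in page_models}})
```

Note that an update to a model outside the current page produces no publication at all; changing pages replaces the subscription and the cycle starts over.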

Eventual consistency is resolved as follows:

  • Read-your-writes consistency. This is an important model where process A, after it has updated a data item, always accesses the updated value and will never see an older value. This is a special case of the causal consistency model.
  • Session consistency. This is a practical version of the previous model, where a process accesses the storage system in the context of a session. As long as the session exists, the system guarantees read-your-writes consistency. If the session terminates because of a certain failure scenario, a new session needs to be created and the guarantees do not overlap the sessions.

The Subscribe-Update-Publish (SUP, or 'sup) model is supported by a session on the server that is bound to a GUI instance (browser window running the SPA). Publications occur over a websocket asynchronously. The server maintains a session which holds the websocket and enough data to define the reporting database for this particular instance. If the websocket dies, the GUI needs to reattach to its session, and that may mean something was missed. This can be resolved through Event Sourcing or simply a refresh. There can be optimizations like an update serial that ensures monotonicity.
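The update-serial optimization mentioned above might look like this on the client side; the serial field and return values are assumptions for this sketch, not an existing Sirepo protocol:

```python
class Subscriber:
    """Client-side view of a subscription, guarded by a monotonic serial."""

    def __init__(self):
        self.serial = 0
        self.state = None

    def on_publication(self, msg):
        if msg["serial"] <= self.serial:
            return "stale"      # duplicate or out-of-order: ignore
        if msg["serial"] > self.serial + 1:
            return "refresh"    # gap: something was missed while detached
        self.serial = msg["serial"]
        self.state = msg["data"]
        return "applied"
```

A gap after reconnecting is exactly the "something was missed" case: the client can request a full refresh (or an event-sourced replay) rather than render inconsistent state.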

Updates are completely decoupled: they are sent directly to the server with no "shortcuts" to the GUI, that is, the Favorite Treat field only shows up when the session says it does.

Simulate Start and End

We have had many issues with simulate start and end buttons and the logic around them. This should be simplified. When a user presses simulate start, the event is persisted to the database. However, there's a lot of complex business logic around that event.

With SUP, the update from the GUI is simply "start button", not "start this simulation with this particular set of data". When the button is pressed (just like Save Changes), the button is disabled. Only when the server publishes the "End Simulation" button does the button change to End Simulation on the GUI.

This means that when a refresh happens, the GUI does not have to infer that a simulation is running by collecting a set of data. It simply gets the "End Simulation" button on a refresh. This might also include the background percent complete, and so on. All the logic sits in the server, where it is easily testable and therefore maintainable.
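The server-owned button state described above amounts to a tiny state machine. The states and labels below follow the text; the function names and the exact state set are assumptions of this sketch:

```python
def button_for(status):
    """What the GUI renders is purely a function of server-published state."""
    return {
        "idle": {"label": "Start Simulation", "enabled": True},
        "pending": {"label": "Start Simulation", "enabled": False},
        "running": {"label": "End Simulation", "enabled": True},
    }[status]

def press_start(status):
    """The GUI sends only 'start button'; the server decides what it means."""
    if status == "idle":
        return "pending"    # disabled until the server confirms the run
    return status           # presses in any other state are ignored
```

On a refresh, the GUI just asks for the current status and renders button_for(status); there is no client-side inference about whether a simulation is running.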

Conclusion

This model is not new. The [curses library](https://en.wikipedia.org/wiki/Curses_%28programming_library%29) worked this way: updates (character inputs) were completely independent of publications (screen outputs). Curses maintained a buffer (MemoryImage) of what was displayed on the remote terminal.

This SUP model is much more sophisticated. It must guarantee consistency and liveness. The consistency is straightforward, because no work is done on the client; it only needs to render publications. Liveness should be possible over a websocket running at modern internet speeds.
