This repository has been archived by the owner on Feb 15, 2024. It is now read-only.

How can we programmatically terminate a session? #13

Closed
atc0005 opened this issue Apr 24, 2020 · 16 comments · Fixed by #30

atc0005 commented Apr 24, 2020

Blocking a user account can be done by adding the account to a specific flat-file referenced by EZproxy, but historically we've had to log in to the admin panel to terminate a session. Is there an API we can use? If we can't kill existing sessions, perhaps we can match a session to its IP Address and block that instead?

References:

@atc0005 atc0005 added question Further information is requested user session labels Apr 24, 2020

atc0005 commented May 3, 2020

The following comments were pulled from an internal issue I opened elsewhere to log some thoughts. Transferring here as they're relevant and (AFAICT) not particularly sensitive in nature.


EDIT: As of this writing, I have yet to hear back from OCLC Support. I have to assume that the outcome will be that we're on our own to come up with a solution.

atc0005 commented May 3, 2020

Does blocking an IP require restarting EZproxy?

Perhaps we can write to a config file to block further sessions, but match existing sessions to their IPs and block those IPs?

atc0005 commented May 3, 2020

https://help.oclc.org/Library_Management/EZproxy/EZproxy_configuration/Set_limits_for_your_institution

Determines how long in minutes an EZproxy session should remain valid after the last time it is accessed. The default of 120 determines that a session remains valid until 2 hours after the last time the user accesses a database through EZproxy. MaxLifetime is the only setting that is position dependent in config.txt. In normal use, it should appear before the first TITLE line.

If nothing else, the MaxLifetime value could be lowered further so that an abuser's existing session times out sooner, at which point the account is subject to the same block as other abusers.

atc0005 commented May 3, 2020

https://help.oclc.org/Library_Management/EZproxy/Configure_resources/RejectIP

Perhaps have the main EZproxy config file pull in an include file of rejected IP Address entries?

The logic could perhaps check to see when a user was blocked and if it was X minutes past, add a RejectIP entry in order to force the user session to terminate. After X minutes, the IP could be removed and the blocked/disabled user account entry would serve to prevent repeat abuse for the existing account. The temporary IP block would limit potential Denial of Service to legitimate users of the system.
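
A minimal sketch of the temporary-block idea above: given a record of when each IP was blocked, emit RejectIP lines only for the entries that are still inside the block window. The function name, the in-memory record, and the 125 minute window are assumptions for illustration. Only the single-address form of RejectIP is used, and whether EZproxy would pick up such an include file without a restart is still the open question raised earlier.

```python
from datetime import datetime, timedelta


def render_reject_ip_lines(blocked_at, now, block_for):
    """Return RejectIP lines for IPs whose temporary block has not expired.

    blocked_at maps an IP address string to the datetime it was blocked.
    """
    lines = []
    for ip, when in sorted(blocked_at.items()):
        if now - when < block_for:
            lines.append(f"RejectIP {ip}")
    return lines


now = datetime(2020, 5, 3, 12, 0)
blocked = {
    "192.0.2.10": now - timedelta(minutes=30),     # still inside the window
    "198.51.100.7": now - timedelta(minutes=180),  # window elapsed, dropped
}
print("\n".join(render_reject_ip_lines(blocked, now, timedelta(minutes=125))))
```

Regenerating the whole include file from the record on each pass, rather than editing it in place, keeps expiry a matter of simply leaving an entry out.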

atc0005 commented May 3, 2020

Workflow (scratch notes, some logic debugging needed):

  1. EZproxy logs activity
  2. Splunk ingests updates to general log messages file
  3. Splunk ingests updates to disabled user file
  4. Splunk thresholds for general usage tripped, alert submitted via JSON payload
  5. brick web app processes request, writes out user account to disabled users file
  6. brick web app records the event metadata, likely to a local database
  7. EZproxy sees update to file, blocks new logins for specified user account
    • existing sessions are unaffected at this point
  8. Splunk ingests updates to general log messages file
  9. Splunk ingests updates to disabled user file
  10. Splunk thresholds for disabled user activity tripped, alert submitted via JSON payload
    • this would be for activity associated with user accounts in the disabled file X minutes past the ingest time
    • potentially submitted to a different endpoint on brick web app since the goal/logic would be different than for the initial block alert payloads
  11. brick web app writes associated IP Address of offending user to temporary.blocked.ips.txt (or whatever name)
    • this event could occur multiple times for each offending IP Address up to Limit (EZproxy setting for maximum concurrent sessions per account) times, resulting in Limit number of temporary IP blocks
  12. brick records the block somewhere (likely a local database)
  13. brick web app restarts EZproxy (if necessary)
  14. EZproxy sees new IP rejection config settings, blocks all connections from that IP
    • TODO: Does this also trigger intruder logic? If so, how would we clear intruder blocks once we remove an explicit rejection?
  15. Splunk ingests blocked IP Address include file update
    • this could be useful for Network Security team, or us if we need to check against our other systems
  16. EZproxy: All existing user sessions for the blocked IP time out and are closed/expired
  17. brick (on an internal timer) re-evaluates blocked IP entries and disables them after a set time
    • brick records this event also

At this point legitimate users (other than the blocked user account) are mostly unaffected as we no longer have temporary IP blocks for the specific user account in place.
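
Steps 5 and 6 of the workflow above could be sketched roughly as follows. This is only an illustration: the payload field names (username, ip) and the one-username-per-line file layout are assumptions, not the actual Splunk payload or the format EZproxy expects for its disabled users file.

```python
import io
import json


def disable_user(payload_text, disabled_users_file, already_disabled):
    """Append the reported username to the disabled users file, once only."""
    payload = json.loads(payload_text)
    username = payload["username"].strip()
    if not username or username in already_disabled:
        return False
    disabled_users_file.write(username + "\n")
    already_disabled.add(username)
    return True


disabled_file = io.StringIO()  # stands in for the real flat file
seen = set()
alert = json.dumps({"username": "jdoe", "ip": "192.0.2.10"})
print(disable_user(alert, disabled_file, seen))  # first report is written
print(disable_user(alert, disabled_file, seen))  # repeat report is skipped
```

Tracking already-disabled usernames matters because the same account may be reported several times before its existing sessions expire.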

@atc0005 atc0005 pinned this issue May 3, 2020

atc0005 commented May 3, 2020

@auadamw How does the workflow in #13 (comment) sound?

We would potentially need to ingest one more file in order to handle this, and would probably want to add an ingest for one more file past that (so two additional files overall).

We might also drop the MaxLifetime value further than we already have it (not mentioning that specific detail here) to help with this as well, though if we get the logic working as specified in #13 (comment) it wouldn't matter as much if we use a temporary IP block to force session expiration.

The assumption there is that the bulk of the unauthorized behavior would occur during early morning or off-hours and any IP block triggered would minimally impact legitimate users.

atc0005 commented May 3, 2020

I gave this some more thought and I think we can use an existing tool to lighten the development time/costs for the initial implementation: fail2ban.

Modified workflow below.


  1. EZproxy is running and logging usage activity
  2. Splunk agent is running and monitoring for updates
    • looking for and ingesting any new updates to general log messages file
    • looking for and ingesting any new updates to disabled user file
  3. brick web app is running with several log files open
    • log file for user account block requests
    • log file for IP Address block requests
  4. fail2ban is running and monitoring brick web app log file for IP Address block requests
  5. Splunk server thresholds for general usage tripped (based on general log messages file), alert submitted via JSON payload
  6. brick web app processes request, writes out user account to disabled users file
  7. brick web app records the event to a local log file
  8. EZproxy sees update to file, blocks new logins for specified user account
    • existing sessions are unaffected at this point
  9. Splunk agent ingests updates to general log messages file
  10. Splunk agent ingests updates to disabled user file
  11. Splunk server thresholds for disabled user activity tripped, alert submitted via JSON payload
    • this would be for activity associated with user accounts in the disabled file X minutes past the ingest time
    • potentially submitted to a different endpoint on brick web app since the goal/logic would be different than for the initial block alert payloads
  12. brick web app writes log message to IP Address block requests log file
    • this event could occur multiple times for each offending IP Address up to Limit (EZproxy setting for maximum concurrent sessions per account) times, resulting in Limit number of IP Address block requests from Splunk server
  13. fail2ban sees each new entry in IP Address block requests log file
  14. fail2ban blocks each offending IP Address for a configured amount of time slightly greater than the MaxLifetime value in EZproxy
  15. Splunk agent ingests brick web app IP Address block requests log file (optional)
    • this could be useful for Network Security team; the fact that we opted to block the IP is an event they're probably interested in tracking
    • this could also prove useful if we need to check against our other systems for related activity
  16. EZproxy: All existing user sessions for the blocked IP time out and are closed/expired
  17. fail2ban expires the temporary IP block
  18. At this point legitimate users (other than the blocked user account) are mostly unaffected as we no longer have temporary IP blocks for the specific user account in place.

This allows the brick web app to focus exclusively on:

  • parsing Splunk "server" requests
  • writing disabled user file entries
  • writing IP Address block request log entries

Direct/measurable upsides:

  • brick web app
    • reduces the complexity of the brick web application (quite a bit)
    • should help with debugging problems
  • EZproxy
    • does not require modifying "main" config settings in EZproxy
    • does not require restarting EZproxy to block IPs
      • an automated restart based on and combined with config-level changes is risky
  • fail2ban
    • not reinventing existing/available functionality
    • well known (e.g., support from community)
    • time-tested
    • local familiarity (I've worked with it off/on for several years now)

atc0005 commented May 3, 2020

Further refinement:

  • Splunk ingests only the traffic logs like it is doing now via local agent
  • Splunk sends alerts via JSON payloads to a single endpoint like we've discussed previously
    • no need for a second endpoint, second alert or second ingest (though this might still be useful for a later point in time)
  • brick logs both the block request and the IP Address (two separate files) each time a payload is delivered by Splunk
  • fail2ban blocks the logged IP Address for MaxLifetime + some small additional amount of time

This workflow should allow the live sessions to time out, the user account to be blocked and new sessions to be blocked.
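
For the "MaxLifetime + some small additional amount of time" arithmetic, a tiny worked example. MaxLifetime is expressed in minutes, while fail2ban's bantime is given in seconds by default. The 5 minute margin is an arbitrary assumption, and 120 is simply the documented EZproxy default rather than our actual setting.

```python
def ban_seconds(max_lifetime_minutes, margin_minutes=5):
    """Convert an EZproxy MaxLifetime (minutes) plus a margin into seconds."""
    return (max_lifetime_minutes + margin_minutes) * 60


# With the documented default MaxLifetime of 120 minutes:
print(ban_seconds(120))  # 7500
```

The margin only needs to cover the gap between the last request EZproxy saw from the session and the moment the block took effect.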

atc0005 commented May 4, 2020

I missed a response from OCLC Support on April 28th (just found it). Snippet of that response (leaving out the support tech's information):

Thank you for contacting OCLC Product Support.

Unfortunately, EZproxy does not currently have the mechanism to terminate sessions with given attributes.
There is only limitations on session duration and user permissions based on your authentication method.

To secure your EZproxy server from suspicious login activity, you may set IntruderIPAttempts. This will set any restrictions based on the directive definitions when logging in.
For more details, see the following documentation: https://help.oclc.org/Library_Management/EZproxy/Configure_resources/IntruderIPAttempts

Let me know if I can be of any further assistance on this request.

My response back:

Hi,

Sorry for my late response (I just found this email).

We're already using the IntruderIPAttempts directive to block user accounts based on thresholds that seem to work fairly well for the most egregious abuse (bots, "rage" logins when users forget passwords, etc).

Our use case in this discussion is tied to an external indicator that an account has been abused, so we're looking to shut down a specific user account based on external tooling. After some research, it is beginning to look like our solution will be to (automatically) block the associated IP at the host firewall level long enough for the EZproxy MaxLifetime value to be reached and force the session to time out. That host firewall entry could then be (automatically) removed a short time later. Combined with adding the associated user account to a flat-file (or AD group or ...), this should prevent new sessions for the associated account.

Any major flaws in that plan?

@atc0005 atc0005 self-assigned this May 7, 2020
@atc0005 atc0005 added this to the v0.1.0 milestone May 7, 2020

atc0005 commented May 7, 2020

Leaving out some emails in the thread, but received one back today confirming that the recommended RejectIP directive does require restarting EZproxy in order to take effect. That requirement makes the option less desirable (e.g., if we accidentally introduce a config error, an automated restart could bring down the service).

atc0005 commented May 11, 2020

brick web app is running with several log files open

  • log file for user account block requests
  • log file for IP Address block requests

Note to self: Unfortunately I don't recall why I suggested two log files; fail2ban is perfectly capable of (and is intended for) monitoring log files based on patterns. We log to one log file and fail2ban parses the entries looking for a pattern that it has been configured to take action on. When it finds the pattern, it extracts the IP Address and blocks it (temporarily, or "permanently") as indicated previously.
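
To make the single-log-file idea concrete, here is a rough Python stand-in for what a fail2ban filter would do: scan every line, act only on the ones matching a pattern, and pull out the IP Address. The log line layout and the "disabled" keyword are hypothetical, not the format brick actually writes, and a real filter would use fail2ban's own failregex syntax rather than this code.

```python
import re

# Hypothetical log line layout; the real brick format may differ.
DISABLED_LINE = re.compile(
    r"^\S+ brick: disabled username=\S+ ip=(?P<ip>\d{1,3}(?:\.\d{1,3}){3})$"
)


def ips_to_block(log_lines):
    """Return the IPs from lines recording that a user was disabled."""
    ips = []
    for line in log_lines:
        match = DISABLED_LINE.match(line.rstrip("\n"))
        if match:
            ips.append(match.group("ip"))
    return ips


sample = [
    "2020-05-11T14:30:00Z brick: reported username=jdoe ip=192.0.2.10",
    "2020-05-11T14:30:00Z brick: disabled username=jdoe ip=192.0.2.10",
    "2020-05-11T14:31:00Z brick: ignored username=asmith ip=192.0.2.11",
]
print(ips_to_block(sample))
```

Only the "disabled" entry yields an IP here; the "reported" and "ignored" entries share the file but are skipped by the pattern.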

At this point I'm trying to figure out exactly how the log messages will be recorded. I've got the templates set up and I've figured out when logging occurs, but now I'm trying to figure out when fail2ban will be triggered.

Some misc notes below as I think out loud.

Every payload received ...

  • is intended as a "block this user" request from Splunk
  • is logged with an indicator that the user was reported
  • is logged a second time, indicating whether the user account was disabled or whether the username or the associated IP Address was ignored (based on its presence in an ignore file)

It is tempting to have fail2ban look at the reported log entries, but that removes the ability to ignore specific usernames or IP Addresses and still log that the payload was received.
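
The two-entries-per-payload behaviour described above might look something like this sketch. The entry wording, the function name, and the use of in-memory sets for the ignore files are all assumptions. The point is only that every payload produces a "reported" entry, while fail2ban would key on the separate "disabled" entry.

```python
def handle_report(username, ip, ignored_users, ignored_ips):
    """Return the log entries produced for one "block this user" payload."""
    entries = [f"reported username={username} ip={ip}"]
    if username in ignored_users:
        entries.append(
            f"ignored username={username} ip={ip} reason=username-in-ignore-file"
        )
    elif ip in ignored_ips:
        entries.append(
            f"ignored username={username} ip={ip} reason=ip-in-ignore-file"
        )
    else:
        entries.append(f"disabled username={username} ip={ip}")
    return entries


for entry in handle_report("jdoe", "192.0.2.10", {"svc-account"}, {"203.0.113.5"}):
    print(entry)
```

Because ignored usernames and IPs never produce a "disabled" entry, they stay visible in the log without ever being handed to fail2ban.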

atc0005 commented May 11, 2020

Reminder to self:

fail2ban requires a timestamp in the source file to determine when the event occurred. This is needed (if I recall correctly) so that it can tell where it last processed. I don't recall if this is also so it can tell when an IP was last blocked (I believe that this state is tracked elsewhere in case the origin log file rotates, etc).

Regardless, a list of bare IPs for fail2ban to process is probably not advisable for numerous reasons, though it could be useful to sysadmins who wish to quickly remove a blocked IP. I think for that we'll need to use Ansible or become comfortable using "recipes" to unblock fail2ban-blocked IPs as needed (which is doable to begin with).
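
As a sketch of what a timestamped entry (as opposed to a bare IP) could look like, the helper below prefixes each entry with a UTC timestamp. The layout is hypothetical, and whether fail2ban detects this exact timestamp style without a custom datepattern is an assumption that would need checking with fail2ban-regex.

```python
from datetime import datetime, timezone


def format_block_entry(when, username, ip):
    """Format one log entry with a leading UTC timestamp."""
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"{stamp} brick: disabled username={username} ip={ip}"


event_time = datetime(2020, 5, 11, 14, 30, 0, tzinfo=timezone.utc)
print(format_block_entry(event_time, "jdoe", "192.0.2.10"))
```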

@atc0005 atc0005 unpinned this issue May 19, 2020

atc0005 commented May 22, 2020

Finished giving a demo earlier to our team where we were given the "go ahead" to install this application on our test EZproxy server for real world testing. Afterwards I happened to search GitHub for "ezproxy" and found this project:

https://github.com/calvinm/ezproxy-abuse-checker

which has a Perl script named block_user.pl with this block close to the end of the file:

if ($block_session) {
	system("/opt/ezproxy/ezproxy kill $block_session");
}

Strange, but that looks like they're able to terminate a user session using built-in EZproxy functionality. This is the support that the OCLC Support rep told me wasn't available. I suspect the tech I spoke with honestly didn't know about the feature, and it's possible that it's not even documented well.

Will dig further.

atc0005 commented May 22, 2020

Strange, but that looks like they're able to terminate a user session using built-in EZproxy functionality. This is the support that the OCLC Support rep told me wasn't available. I suspect the tech I spoke with honestly didn't know about the feature, and it's possible that it's not even documented well.

Will dig further.

OCLC Support misunderstood what I wrote back, so I took our test EZproxy instance and did some testing. The net result is that I was able to retrieve my own login session ID from two different locations:

  • EZPROXY_INSTALL_PATH/ezproxy.hst
    • S SESSION_ID OTHER_STUFF
  • EZPROXY_INSTALL_PATH/audit/YYYYMMDD.txt
    • Login.Success
    • Login.Success.Relogin

Further testing would be needed to determine if both files can provide the session ID reliably.

I then went through the steps to confirm that I could terminate my session using the retrieved ID:

$ sudo ./ezproxy kill
Session must be specified

$ sudo grep -E '^S ' ezproxy.hst
S SESSION_ID_HERE REDACTED

$ sudo ./ezproxy kill SESSION_ID_HERE
Session SESSION_ID_HERE terminated

SESSION_ID_HERE is a placeholder for the real session ID, which I've omitted from the example output.
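
The manual steps above could be scripted along these lines. This is a sketch under assumptions: it relies only on the "S SESSION_ID OTHER_STUFF" layout observed in ezproxy.hst, treats everything after the session ID as opaque, and does not attempt to map a session to a username. The helper names and sample lines are made up.

```python
def session_ids(hst_lines):
    """Return the session IDs from lines of the form "S SESSION_ID ..."."""
    ids = []
    for line in hst_lines:
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "S":
            ids.append(fields[1])
    return ids


def kill_command(ezproxy_path, session_id):
    """Build the argument list for "ezproxy kill SESSION_ID"."""
    # A list of arguments avoids passing the session ID through a shell.
    return [ezproxy_path, "kill", session_id]


sample_hst = [
    "X not-a-session-record",       # hypothetical non-session line
    "S SESSION_ID_HERE REDACTED",
    "S",                            # malformed line, skipped
]
for sid in session_ids(sample_hst):
    print(kill_command("/opt/ezproxy/ezproxy", sid))
```

The resulting list could be handed to subprocess.run(..., check=True); it is only printed here so the sketch has no side effects.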

atc0005 commented May 23, 2020

Wrapping up initial release for v0.1.0 today/tomorrow. Going to leave this issue open for further research/testing with the goal of deciding on a "final" (as much as anything can be final) direction for the next release.

atc0005 added a commit that referenced this issue May 23, 2020
Features of the initial prototype release:

- Highly configurable (with more configuration choices to be exposed
  in the future)

- Supports configuration settings from multiple sources
  - command-line flags
  - environment variables
  - configuration file
  - reasonable default settings

- Ignore individual usernames (i.e., prevent disabling listed accounts)
- Ignore individual IP Addresses (i.e., prevent disabling associated
  account)

- User configurable logging settings
  - levels, format and output

- Microsoft Teams notifications
  - generated for multiple events
    - alert received
    - disabled user
    - ignored user
    - ignored IP Address
    - error occurred
  - configurable retries
  - configurable notifications delay in order to respect remote API
    limits

- Logging
  - Payload receipt from monitoring system
  - Action taken due to payload
    - username ignored
      - due to username inclusion in ignore file for usernames
      - due to IP Address inclusion in ignore file for IP Addresses
    - username disabled

- contrib files/content provided to allow for spinning up a demo
   environment in order to provide a hands-on sense of what this
   project can do
  - fail2ban
  - postfix
  - docker
    - Maildev container
  - brick
  - rsyslog
  - systemd
  - sample JSON payloads for use with curl or other http/API clients
  - demo environment doc
  - slides from group presentation/demo

Worth noting:

- Go modules (vs classic GOPATH setup)
- GitHub Actions Workflows which apply linting and build checks
- Makefile for general use cases (including local linting)
  - Note: See README first if building on Windows

refs:

- GH-26
- GH-21
- GH-16
- GH-15
- GH-13
- GH-12
- GH-11
- GH-7
- GH-6
- GH-4
- GH-1
@atc0005 atc0005 modified the milestones: v0.1.0, Future May 24, 2020

atc0005 commented May 24, 2020

Wrapping up initial release for v0.1.0 today/tomorrow. Going to leave this issue open for further research/testing with the goal of deciding on a "final" (as much as anything can be final) direction for the next release.

Spun off GH-31 for that research instead of dragging this existing issue across milestones. Leaving this at the v0.1.0 milestone since the bulk of the work/notes reflects the fail2ban implementation direction.

@atc0005 atc0005 closed this as completed May 24, 2020