A repository and namespace for STIX custom objects and extensions, as well as proposals for how they could be used. Because we see opportunities to expand the scope of STIX's influence in infosec, naturally.
Built for discussion and evaluation, to try to find any devils in the details. Please open an issue or submit a pull request if you're interested in contributing.
Caveat: This is alpha content, not (yet) intended for production use. However, all customizations should be compliant with the STIX 2.1 spec, so it shouldn't break anything to try it out.
- Support wider adoption of STIX data abstractions in detection and response.
  A small expansion of observable types will go a long way towards addressing the applicability of STIX to detection and response use-cases.
- Support improved data source documentation and relationships in MITRE ATT&CK.
  Describing object types (vs. just instances) as first-class STIX domain objects lets us attach useful metadata (like data sources!) and empowers more flexible relationships.
- Stay within the STIX spec and keep customizations minimal to reduce standards proliferation.
The Structured Threat Information Expression (STIX) is an open specification for sharing cyber threat intelligence (CTI) maintained by the OASIS CTI technical committee (TC). STIX provides consistent abstractions and a prescriptive exchange format for cyber threat activity, and it's the canonical format for the popular MITRE ATT&CK framework.
We love STIX, but we've run into two limitations:
- STIX v2.1 chose to limit its catalog of observable object types, omitting certain types useful for detection and response.
A STIX working document captured this comment by Rich Struse:
"while I understand that organizations want and need to track specific assets that were impacted by an event, I find it hard to imagine the general-purpose use-case where organizations are sharing such info widely. As such, it seems out of scope for a CTI exchange format."
We appreciate the desire to manage the scope of the STIX project, but we respectfully disagree:
- Security operations, digital forensics, and incident response frequently require sharing information across organizational boundaries. We commonly share between departments, sister organizations, regulators, attorneys, law enforcement, vendors, customers, and more. It just tends to be ad hoc, governed by whatever tools or procedures happen to be in place. Lots of Word documents and PDFs.
- STIX has features for controlling how information should be shared (e.g., TLP): it's not exclusively designed for sharing information "widely."
- Attacks are often launched from victims' compromised systems, so the distinction between "friendly assets" and "attacker infrastructure" is already pretty blurry.
- It leads to arbitrary distinctions (e.g., having a `user-account` observable in scope, but the system a `user-account` uses out of scope).
- Parallel efforts for describing "friendly" or "internal" observables would have to solve problems STIX already solved ... along with compatibility with STIX itself!
Bottom line: we think STIX has a lot to offer beyond just CTI, particularly in the realm of detection and response, and a few custom objects would vastly increase its usefulness without unduly compromising concision.
Fortunately, STIX v2.1 supports custom objects and extensions, and this project respects those constraints, making it compatible with the existing spec.
- There's no obvious way to capture data about object types themselves (as opposed to instances of types) within STIX. That is, types are not objects to which you can add "static" or "prototypical" data, nor can you refer to object types directly in relationships.
MITRE ATT&CK's recent efforts around data sources show how it would be nice, for example, to have a STIX representation of the `process` type to which we could attach useful data (e.g., where to find concrete evidence about `process`es on a system).
Also, having types as first-class STIX objects makes relationships much more powerful. You could say things like "`tool` t creates `process` objects with properties of a certain kind."
With custom objects we can make this happen too!
- System (`x-scope-system`)
  - Physical System Extension (`x-scope-physical-ext`), e.g., laptop, desktop, smartphone
  - Virtual System Extension (`x-scope-virtual-ext`), e.g., vm, container
  - Datastore System Extension (`x-scope-datastore-ext`), e.g., database, wiki, s3
  - Appliance System Extension (`x-scope-appliance-ext`), e.g., load balancer, wifi gateway, WAF
  - Platform System Extension (`x-scope-platform-ext`), e.g., cloud email tenant, CRM instance
The concept of a "system" (a.k.a., host, endpoint, asset) is ubiquitous, and has an intuitive definition, something like "a logically distinct combination of hardware and software." A system is where most observables are in fact observed: `file`, `process`, `user-account`, etc., are observed on systems.
Except STIX doesn't have a type for them. CybOX used to (called "System"), ECS does (called "Host"), and it's floated as a possibility for STIX v2.2+ in the working documents, but it's not in the current spec.
Another way to think of a system is as anything supporting sessions (see below), but regardless of the formal definition, it's a prerequisite to making STIX more relevant to detection and response. We think it'll help the CTI use-case too.
We propose bringing it back as a new, inclusive `system` type with extensions to capture the fact that systems aren't just physical desktop boxes anymore. See the details and comments in system/x-scope-system.yml.
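To make the base-type-plus-extension idea concrete, here is a minimal sketch of what a virtual system might look like as data. Only the type names (`x-scope-system`, `x-scope-virtual-ext`) come from this proposal; the `name` property, the `virtualization` key, and the `make_system` helper are assumptions for illustration, not fields defined in the yml files.

```python
import json
import uuid


def make_system(name, extension_type, extension_data):
    """Build a dict shaped like a hypothetical x-scope-system object.

    "type", "id", and "extensions" follow general STIX 2.1 conventions;
    "name" and the extension contents are illustrative assumptions.
    """
    return {
        "type": "x-scope-system",
        "spec_version": "2.1",
        "id": f"x-scope-system--{uuid.uuid4()}",
        "name": name,
        # One system object carries whichever extension describes its kind.
        "extensions": {extension_type: extension_data},
    }


container = make_system(
    "build-runner-7",
    "x-scope-virtual-ext",
    {"virtualization": "container"},  # hypothetical extension property
)
print(json.dumps(container, indent=2))
```

Keeping the kind-specific details in an extension means consumers that only understand the base `system` type can still read the object and ignore the rest.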
- Session (`x-scope-session`)
A session is any period of interaction between an account and a system (see above). ECS has an open RFC that captures their variety well: sessions can be local, remote, network, or more, and on any type of system (virtual, physical, appliance, etc.).
They're characterized by an account, system, start time, and end time, though in the simplest case these times may be the same (say, for a single REST call). Also, either time may be unknown.
Current STIX fields like `user-account`'s first and last login times don't give sufficient granularity to describe activity in the context of detection and response, and they don't capture the reality that user accounts can access many systems.
Bringing a `session` type into the mix provides a great tool for sharing information about timelines, which are critical for detection, response, and CTI alike.
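As a rough sketch of the shape described above (an account, a system, and optional start and end times), a session could be assembled like this. The property names `account_ref`, `system_ref`, `start`, and `end`, the `make_session` helper, and the example IDs are all assumptions for illustration; the actual field names are still up for discussion.

```python
import uuid
from datetime import datetime, timezone


def _stix_timestamp(dt):
    """Format a timezone-aware datetime as a UTC timestamp with millisecond precision."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"


def make_session(account_ref, system_ref, start=None, end=None):
    """Build a dict shaped like a hypothetical x-scope-session object."""
    if start is not None and end is not None and end < start:
        raise ValueError("session end precedes start")
    session = {
        "type": "x-scope-session",
        "spec_version": "2.1",
        "id": f"x-scope-session--{uuid.uuid4()}",
        "account_ref": account_ref,  # hypothetical property name
        "system_ref": system_ref,    # hypothetical property name
    }
    # Unknown times are omitted rather than emitted as nulls.
    for key, value in (("start", start), ("end", end)):
        if value is not None:
            session[key] = _stix_timestamp(value)
    return session


# A single REST call: start and end are the same instant.
call_time = datetime(2020, 11, 2, 14, 3, 7, tzinfo=timezone.utc)
rest_call = make_session(
    "user-account--9f1c2d3e-4b5a-4c6d-8e7f-0a1b2c3d4e5f",    # made-up ID
    "x-scope-system--1a2b3c4d-5e6f-4a7b-8c9d-0e1f2a3b4c5d",  # made-up ID
    start=call_time,
    end=call_time,
)
print(rest_call["start"], rest_call["end"])
```

Because each session points at exactly one account and one system, an account that touches many systems simply yields many session objects, which is what makes timelines easy to reconstruct.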
- API (`x-scope-api`)
Application programming interfaces (APIs) are central to many adversary techniques, but there's currently no way to describe them in STIX. As with `system`, there was such a type in CybOX, but it was culled.
We recommend resurrecting it, if only for its applicability to ATT&CK. It might be a bit tricky to get the details right, but the juice will be worth the squeeze.
For example, when discussing ATT&CK `attack-pattern`s like process injection and its sub-techniques, it's helpful to refer to specific Windows APIs and API calls.
Like `system`, `api` would benefit from an updated definition to include the breadth of modern API types, from OS system calls to RPC to REST.
So why did we choose these custom types? STIX already has many observable types, and we want to be judicious about adding more, consistent with our goals above. Along with the detection and response use-cases (from our experience), we used the following analysis to understand coverage of existing data sources:
ATT&CK v8, across all platforms and non-revoked techniques, has ~1,627 data source descriptions:
# i'm not a jq expert ...
$ jq '[.objects[] | select(.type == "attack-pattern" and (.revoked | not) and (.x_mitre_deprecated | not) ) | .x_mitre_data_sources ] | flatten | .[] | select(. != null)' < enterprise-attack-v8.json | wc -l
1627
These represent ~65 unique values following a long-tail, power-law descent:
[Figure: Data Source Prevalence]
The top 10 data source types account for just over 67% of the total:
source | count | % | % cumulative |
---|---|---|---|
Process monitoring | 290 | 17.8% | 17.8% |
Process command-line parameters | 186 | 11.4% | 29.3% |
File monitoring | 176 | 10.8% | 40.1% |
Packet capture | 75 | 4.6% | 44.7% |
API monitoring | 74 | 4.5% | 49.2% |
Netflow/Enclave netflow | 64 | 3.9% | 53.2% |
Process use of network | 61 | 3.7% | 56.9% |
Authentication logs | 59 | 3.6% | 60.5% |
Windows Registry | 55 | 3.4% | 63.9% |
Network protocol analysis | 52 | 3.2% | 67.1% |
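For anyone who would rather reproduce the table than trust our jq, here is a small Python sketch of the same count with percentages and a running total. It applies the same filters as the jq query (non-revoked, non-deprecated `attack-pattern` objects); the function name and the tiny inline bundle are ours, for illustration only.

```python
from collections import Counter


def data_source_prevalence(bundle):
    """Return (source, count, percent, cumulative percent) rows, most common first."""
    counts = Counter()
    for obj in bundle.get("objects", []):
        if obj.get("type") != "attack-pattern":
            continue
        if obj.get("revoked") or obj.get("x_mitre_deprecated"):
            continue
        # Techniques without data sources contribute nothing.
        counts.update(obj.get("x_mitre_data_sources") or [])

    total = sum(counts.values())
    rows, running = [], 0
    for source, count in counts.most_common():
        running += count
        rows.append((source, count, 100 * count / total, 100 * running / total))
    return rows


# Tiny stand-in for an ATT&CK bundle, just to show the output shape.
sample = {
    "objects": [
        {"type": "attack-pattern",
         "x_mitre_data_sources": ["Process monitoring", "File monitoring"]},
        {"type": "attack-pattern",
         "x_mitre_data_sources": ["Process monitoring"]},
        {"type": "attack-pattern", "revoked": True,
         "x_mitre_data_sources": ["API monitoring"]},
        {"type": "malware", "name": "not a technique"},
    ]
}

rows = data_source_prevalence(sample)
for source, count, pct, cumulative in rows:
    print(f"{source} | {count} | {pct:.1f}% | {cumulative:.1f}% |")
```

To run it against the real data, load `enterprise-attack-v8.json` with `json.load` and pass the resulting dict in place of `sample`.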
Since STIX already has `process`, `file`, `network-traffic`, and `windows-registry-key`, `api` was a nice choice to round it out. Notice too that authentication logs (`session`) is the only other source on this list that doesn't have an existing STIX type (outside implicit, embedded login times).
Of course, "covering the most techniques" isn't the only measure of usefulness of a data source. To get a sense of which data sources covered the most commonly observed techniques, we pulled the top 20 techniques from the (excellent) 2020 Red Canary Threat Report as a representative sample. Without going into detail on their numbers (the report's free, check it out!), we can actually look at all 16 data sources for these techniques:
source | count | % |
---|---|---|
Process monitoring | 36 | 31.6% |
Process command-line parameters | 21 | 18.4% |
File monitoring | 17 | 14.9% |
API monitoring | 11 | 9.6% |
PowerShell logs | 7 | 6.1% |
Windows event logs | 6 | 5.3% |
Binary file metadata | 4 | 3.5% |
DLL monitoring | 2 | 1.8% |
Netflow/Enclave netflow | 2 | 1.8% |
System calls | 2 | 1.8% |
Windows Registry | 1 | 0.9% |
Process use of network | 1 | 0.9% |
Packet capture | 1 | 0.9% |
Named Pipes | 1 | 0.9% |
Authentication logs | 1 | 0.9% |
Network protocol analysis | 1 | 0.9% |
As with the coverage analysis, STIX has most of these (PowerShell logs are logs of `process`es, named pipes are just `file`s, etc.). `api` joins the club again, including "System calls," as do "Authentication logs" for `session`s.
- STIX Type (`x-scope-stix-type`)
This gets a little meta, but bear with us 😃: the idea is to capture details about STIX object types as concrete STIX objects, because it's only STIX objects that can contain real data and be the target of references.
Put another way: currently the details of the `process` type live in a Word document and a non-normative JSON schema. They're not STIX data. This would allow us to capture that in a STIX object itself, a la:
{
"type": "x-scope-stix-type",
"id": "x-scope-stix-type--GUID-FOR-PROCESS-TYPE",
"name": "process",
"schema": "json schema from https://github.com/oasis-open/cti-stix2-json-schemas/blob/master/schemas/observables/process.json",
"external_references": [],
"other_fields": "with other content"
}
This provides some benefits:
- You can store data that applies to all objects of that type. For you programmers, think static members in C++/Java or prototype properties in JavaScript.
  Improving MITRE ATT&CK's data sources led to the idea of storing them in SCO types themselves. It'd be useful to describe what Windows event logs would help us populate a `process` observable, for example. With a first-class `stix-type` object, we could add this data under the `external_references` field of the `stix-type` object for the `process` SCO (or to a custom `evidence_locations` field, or whatever).
- Relationships (SROs) could then refer to types rather than just instances.
  It can be useful to describe the relationships of one SCO type to another. Some of these, like a `user-account` creating a `process`, are embedded relationships captured by a `_ref` field in the object. These are well-suited to `external_references` on the SCO type as noted above. In pseudo-code:
  {
    "type": "x-scope-stix-type",
    "id": "x-scope-stix-type--GUID-FOR-PROCESS-TYPE",
    "name": "process",
    "external_references": [
      "info about windows EID 4688 and how it supports filling the creator_user_ref field",
      "(the user-account -> process embedded relationship)"
    ]
  }
  Others, like a `process` setting a `windows-registry-key`, don't have embedded `_ref`s. These could be described with a new relationship object (SRO). With a first-class `stix-type`, we can create an SRO that captures this, and attach `external_references` or other fields to that SRO:
  {
    "type": "relationship",
    "id": "relationship--GUID",
    "relationship_type": "set",
    "source_ref": "x-scope-stix-type--GUID-FOR-PROCESS-TYPE",
    "target_ref": "x-scope-stix-type--GUID-FOR-REGISTRY-KEY-TYPE",
    "external_references": [
      "info about sysmon EID 13 and how it links process info to registry key activity",
      "(the process -> windows-registry-key non-embedded relationship)"
    ]
  }
  Take another example unrelated to logs: let's say we wanted to express "the `attack-pattern` Create or Modify System Process: Windows Service (T1543.003) creates `process` observables." The current model doesn't allow relationships between `attack-pattern`s and observable types, but with a `stix-type` object you could express it easily.
- You can transmit the STIX specification as STIX data. It's like having a compiler written in the language itself.
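The type-to-type and technique-to-type relationships described in this list can be sketched in a few lines of Python. Treat this as a sketch under stated assumptions, not a reference implementation: the helper names are ours, the `created`/`modified` timestamps are added because STIX 2.1 relationship objects require them, and the `attack-pattern` ID is a freshly generated placeholder, not the real ATT&CK ID for T1543.003.

```python
import uuid
from datetime import datetime, timezone


def make_stix_type(name):
    """Build a dict shaped like the proposed x-scope-stix-type object."""
    return {
        "type": "x-scope-stix-type",
        "spec_version": "2.1",
        "id": f"x-scope-stix-type--{uuid.uuid4()}",
        "name": name,
    }


def relate(source, relationship_type, target):
    """Build a relationship (SRO) between any two objects that have an "id"."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
    return {
        "type": "relationship",
        "spec_version": "2.1",
        "id": f"relationship--{uuid.uuid4()}",
        "created": now,
        "modified": now,
        "relationship_type": relationship_type,
        "source_ref": source["id"],
        "target_ref": target["id"],
    }


process_type = make_stix_type("process")
registry_key_type = make_stix_type("windows-registry-key")

# Type-to-type: a process sets a registry key (no embedded _ref exists for this).
sets_key = relate(process_type, "set", registry_key_type)

# Technique-to-type: placeholder attack-pattern with a made-up ID.
windows_service = {
    "type": "attack-pattern",
    "id": f"attack-pattern--{uuid.uuid4()}",
    "name": "Create or Modify System Process: Windows Service",
}
creates_process = relate(windows_service, "creates", process_type)

print(sets_key["source_ref"], "->", sets_key["target_ref"])
```

Since the type objects are ordinary objects with IDs, nothing about the relationship itself has to change; the only new ingredient is the `stix-type` object on one or both ends.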
The proposed objects are stored in the objects directory as YAML files. These can be transformed into a reasonable facsimile of the STIX spec format using the Node script in the render folder.
Forgive the spaghetti-code EJS template - this is just to prove you can build a similar-looking spec from YAML, which would be easier to serialize into the `stix-type` objects described above. And would be easier to collaborate on via GitHub ...
# needs nodejs
$ cd ./render
$ npm install
# back to the repository root, since the paths below are relative to it:
$ cd ..
# user-account.yml is a STIX v2.1 built-in, for testing the formatting:
$ ./render/render.js render/template.ejs objects/user-account/user-account.yml > user-account.html
# custom types are stubs for now:
$ ./render/render.js render/template.ejs objects/system/x-scope-system.yml > system.html
All of this is in respectful collaboration with the folks fighting the good fight for interoperability, consistency, and openness! Thanks for everything you do.
- MITRE ATT&CK data source initiative (discussion, blog part 1, blog part 2)
- MITRE Cyber Analytics Repository (CAR)
- CybOX, an archived project that was worked into STIX SCOs.
- Elastic Common Schema (ECS), used primarily to add normalized fields to logging data.
Copyright 2020 Counteractive Security
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License in the LICENSE file or at:
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.