A small script that enriches CVEs with data from other sources, with all data stored as STIX 2.1 objects.
Here at DOGESEC we work with a lot of CVE data across our products. cve2stix generates core STIX 2.1 Vulnerability objects from CVE data.
However, we have lots of other sources (EPSS, KEV, ATT&CK...) that we want to enrich this data with.
We built Arango CVE Processor to handle the generation and maintenance of these enrichments.
In short, Arango CVE Processor is a script that:
- reads the ingested CVE STIX data in ArangoDB
- creates STIX objects to represent the relationships between CVE and other datasets
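To make the second step concrete, here is a minimal sketch of what a STIX 2.1 relationship object linking a CVE to an object from another dataset might look like. It is built as a plain dictionary using only the standard library. The helper name make_relationship, the relationship_type value, and the example IDs are assumptions for illustration, not the objects the script actually emits.

```python
import json
import uuid
from datetime import datetime, timezone


def make_relationship(source_ref, target_ref, relationship_type, description=None):
    """Build a STIX 2.1 relationship (SRO) as a plain dict (hypothetical helper)."""
    # STIX timestamps are UTC with a trailing "Z"; keep millisecond precision
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
    sro = {
        "type": "relationship",
        "spec_version": "2.1",
        "id": f"relationship--{uuid.uuid4()}",
        "created": now,
        "modified": now,
        "relationship_type": relationship_type,
        "source_ref": source_ref,
        "target_ref": target_ref,
    }
    if description:
        sro["description"] = description
    return sro


# Placeholder IDs: neither object exists in any real dataset
sro = make_relationship(
    source_ref="vulnerability--11111111-1111-4111-8111-111111111111",
    target_ref="weakness--22222222-2222-4222-8222-222222222222",
    relationship_type="exploited-using",  # assumed value, for illustration only
    description="Example link between a CVE and a CWE",
)
print(json.dumps(sro, indent=2))
```

The actual relationship types and object IDs used for each enrichment are described in the project documentation rather than here.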
# clone the latest code
git clone https://github.com/muchdogesec/arango_cve_processor
# create a venv
cd arango_cve_processor
python3 -m venv arango_cve_processor-venv
source arango_cve_processor-venv/bin/activate
# install requirements
pip3 install -r requirements.txt
Arango CVE Processor has various settings that are defined in an .env file.
To create a template for the file:
cp .env.example .env
To see more information about how to set the variables, and what they do, read the .env.markdown file.
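For readers unfamiliar with the .env format, the following is a minimal sketch of how KEY=VALUE settings in such a file can be read with the standard library alone. The variable names are hypothetical placeholders (the real ones are listed in .env.example), and the script itself may load its settings differently.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may themselves contain "="
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value
        settings[key.strip()] = value.strip().strip("'\"")
    return settings


# Hypothetical variable names, for illustration only
example = """
# connection settings (illustrative)
EXAMPLE_DB_HOST=http://127.0.0.1:8529
EXAMPLE_DB_USER='root'
EXAMPLE_DB_PASSWORD=
"""
settings = parse_env(example)
print(settings)
```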
python3 arango_cve_processor.py \
--database DATABASE \
--relationship RELATIONSHIP \
--ignore_embedded_relationships BOOLEAN \
--modified_min DATETIME \
--created_min DATETIME \
--cve_id CVE-NNNN-NNNN CVE-NNNN-NNNN
Where:
- --database (required): the ArangoDB database name where the objects you want to link are found. It must contain the collections nvd_cve_vertex_collection and nvd_cve_edge_collection.
- --relationship (optional, dictionary): you can apply updates to certain relationships at run time. Default is all. Note, you should ensure your database contains all the required seeded data. You can select from:
  - cve-cwe
  - cve-capec
  - cve-attack
  - cve-epss
  - cve-kev
- --ignore_embedded_relationships (optional, boolean): default is false. If true is passed, this will stop any embedded relationships from being generated. This is a stix2arango feature where STIX SROs will also be created for _ref and _refs properties inside each object (e.g. if a _ref property = identity--1234, an SRO between the object with the _ref property and identity--1234 will be created). See the stix2arango docs for more detail if required; essentially this is a wrapper for the same --ignore_embedded_relationships setting implemented by stix2arango.
- --modified_min (optional, date): by default arango_cve_processor will consider all CVEs in the database specified with the property _is_latest==true (that is, the latest version of the object). Using this flag with a modified time value will further filter the results processed by arango_cve_processor to STIX objects with a modified time >= the value specified. This is useful when you don't want to process data for very old CVEs in the database.
- --created_min (optional, date): same as --modified_min but considers the created date.
- --cve_id (optional, list of CVE IDs): will only process the relationships for the CVEs passed, otherwise all CVEs will be considered. Separate each CVE with a whitespace character (e.g. CVE-NNNN-NNNN CVE-NNNN-NNNN).
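The embedded relationship behaviour described for --ignore_embedded_relationships can be sketched roughly as follows. This is not the stix2arango implementation: the function names are hypothetical, the generated objects are reduced to a few fields, and using the property name as the relationship_type is an assumption made only to keep the example short.

```python
def embedded_refs(obj):
    """Yield (property_name, target_id) for each _ref / _refs property of a STIX object."""
    for key, value in obj.items():
        if key.endswith("_ref") and isinstance(value, str):
            yield key, value
        elif key.endswith("_refs") and isinstance(value, list):
            for target in value:
                yield key, target


def embedded_relationships(obj, ignore_embedded_relationships=False):
    """Return one simplified SRO per embedded reference, or nothing if ignored."""
    if ignore_embedded_relationships:
        return []
    return [
        {
            "type": "relationship",
            "relationship_type": key,  # assumption: real naming may differ
            "source_ref": obj["id"],
            "target_ref": target,
        }
        for key, target in embedded_refs(obj)
    ]


# Placeholder object with one _ref and one _refs property
cve = {
    "type": "vulnerability",
    "id": "vulnerability--1234",
    "created_by_ref": "identity--1234",
    "object_marking_refs": ["marking-definition--aaaa", "marking-definition--bbbb"],
}

generated = embedded_relationships(cve)
skipped = embedded_relationships(cve, ignore_embedded_relationships=True)
print(len(generated), len(skipped))
```

With the flag left at its default, each reference inside the object yields its own SRO; passing true suppresses all of them.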
Process CVE -> CWE relationships for all CVEs modified on or after 2023-01-01:
python3 arango_cve_processor.py \
--database arango_cve_processor_standard_tests_database \
--relationship cve-cwe \
--modified_min 2023-01-01 \
--ignore_embedded_relationships true
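The selection of CVEs implied by --modified_min, --created_min and --cve_id could be expressed as an AQL query along the following lines. This is a sketch under stated assumptions rather than the query the script runs: the function name build_cve_query is hypothetical, and it assumes the CVE ID is stored in the name property of each Vulnerability object. It only builds the query string and bind variables, so it runs without a database.

```python
def build_cve_query(modified_min=None, created_min=None, cve_ids=None,
                    collection="nvd_cve_vertex_collection"):
    """Return (aql, bind_vars) selecting the latest CVE objects to process."""
    # Always restrict to the latest version of each Vulnerability object
    filters = ['doc.type == "vulnerability"', "doc._is_latest == true"]
    # "@collection" is the bind-variable key for the @@collection placeholder
    bind_vars = {"@collection": collection}

    if modified_min:
        # ISO 8601 timestamps compare correctly as plain strings
        filters.append("doc.modified >= @modified_min")
        bind_vars["modified_min"] = modified_min
    if created_min:
        filters.append("doc.created >= @created_min")
        bind_vars["created_min"] = created_min
    if cve_ids:
        # Assumption: the CVE ID is held in the object's name property
        filters.append("doc.name IN @cve_ids")
        bind_vars["cve_ids"] = list(cve_ids)

    aql = (
        "FOR doc IN @@collection\n"
        "  FILTER " + "\n    AND ".join(filters) + "\n"
        "  RETURN doc"
    )
    return aql, bind_vars


aql, bind_vars = build_cve_query(modified_min="2023-01-01")
print(aql)
print(bind_vars)
```

Passing values as bind variables rather than formatting them into the query string keeps user-supplied dates and CVE IDs out of the AQL text itself.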
If you would like to know how the logic of this script works in detail, please consult the /docs directory.
- To generate STIX 2.1 extensions: stix2 Python Lib
- STIX 2.1 specifications for objects: STIX 2.1 docs
- ArangoDB docs