Commit 98eb848

committed: updates to
1 parent b1c585a commit 98eb848

File tree

398 files changed

+58558
-1
lines changed


Dockerfile

Lines changed: 36 additions & 0 deletions
```dockerfile
FROM python:3.11-bullseye

WORKDIR /code

# Add necessary groups and users
RUN groupadd -g 3000 scot4api
RUN useradd -M -r -u 3000 -g 3000 -s /bin/bash scot4api

RUN apt-get update && \
    apt-get upgrade -y && \
    apt-get install -y \
    curl \
    mariadb-client \
    python3-dev \
    default-libmysqlclient-dev \
    build-essential \
    libxml2-dev \
    libxslt-dev

# Create the default file storage directories
RUN mkdir -p /var/scot_files/_deleted_items
RUN chown -R scot4api /var/scot_files

# Copy over the required files
COPY requirements.txt /code/requirements.txt
COPY ./src/app /code/app

# Install requirements and upgrade pip
RUN pip install --upgrade pip && pip install -r requirements.txt

# Set deployment user and give correct permissions
RUN chown -R scot4api /code
USER scot4api

# Start option
CMD ["uvicorn", "app.main:app", "--host", "127.0.0.1", "--port", "8000"]
```

Dockerfile-util

Lines changed: 37 additions & 0 deletions
```dockerfile
FROM python:3.11-bullseye

WORKDIR /code

# Add necessary groups and users
RUN groupadd -g 3000 scot4api
RUN useradd -M -r -u 3000 -g 3000 -s /bin/bash scot4api

RUN apt-get update && \
    apt-get upgrade -y && \
    apt-get install -y \
    curl \
    mariadb-client \
    python3-dev \
    default-libmysqlclient-dev \
    build-essential \
    libxml2-dev \
    libxslt-dev

# Create the default file storage directories
RUN mkdir -p /var/scot_files/_deleted_items
RUN chown -R scot4api /var/scot_files

# Copy over the required files
COPY requirements.txt /code/requirements.txt
COPY requirements-test.txt /code/requirements-test.txt

COPY ./src/app /code/app
COPY ./tests /code/tests
COPY ./conversion /code/conversion

# Install requirements and upgrade pip
RUN pip install --upgrade pip && pip install -r requirements-test.txt

# Set deployment user and give correct permissions
RUN chown -R scot4api /code
USER scot4api
```

LICENSE

Lines changed: 23 additions & 0 deletions
Copyright (2024) Sandia Corporation. Under the terms of Contract DE-AC04-94AL85000, there is a non-exclusive license for use of this work by or on behalf of the U.S. Government. Export of this program may require a license from the United States Government.

NOTICE:

For five (5) years from 09/01/2024, the United States Government is granted for itself and others acting on its behalf a paid-up, nonexclusive, irrevocable worldwide license in this data to reproduce, prepare derivative works, and perform publicly and display publicly, by or on behalf of the Government. There is provision for the possible extension of the term of this license. Subsequent to that period or any extension granted, the United States Government is granted for itself and others acting on its behalf a paid-up, nonexclusive, irrevocable worldwide license in this data to reproduce, prepare derivative works, distribute copies to the public, perform publicly and display publicly, and to permit others to do so. The specific term of the license can be identified by inquiry made to Sandia Corporation or DOE.

NEITHER THE UNITED STATES GOVERNMENT, NOR THE UNITED STATES DEPARTMENT OF ENERGY, NOR SANDIA CORPORATION, NOR ANY OF THEIR EMPLOYEES, MAKES ANY WARRANTY, EXPRESS OR IMPLIED, OR ASSUMES ANY LEGAL RESPONSIBILITY FOR THE ACCURACY, COMPLETENESS, OR USEFULNESS OF ANY INFORMATION, APPARATUS, PRODUCT, OR PROCESS DISCLOSED, OR REPRESENTS THAT ITS USE WOULD NOT INFRINGE PRIVATELY OWNED RIGHTS.

Any licensee of this software has the obligation and responsibility to abide by the applicable export control laws, regulations, and general prohibitions relating to the export of technical data. Failure to obtain an export control license or other authority from the Government may result in criminal liability under U.S. laws.

Copyright [2024] Sandia Corporation.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

README.md

Lines changed: 85 additions & 1 deletion
# SCOT4 API

## Build and Deploy

**To build images for testing**: push code to the default branch of this repository. The CI pipeline will create a new image and push it to the [unsorted container registry](https://baltig.sandia.gov/scot/scot4/SCOT-API/container_registry/313) with a tag matching the short SHA of the commit.

**To build an image for quality**: [identify the pipeline created by your latest push](https://baltig.sandia.gov/scot/scot4/SCOT-API/-/pipelines). Click the play button on the "Tag Qual Image" job. This will take the image and push it to the [quality container registry](https://baltig.sandia.gov/scot/scot4/SCOT-API/container_registry/326) with the tag `latest`.

**To build an image for production**: [create a new release for this project](https://baltig.sandia.gov/scot/scot4/SCOT-API/-/releases). When selecting a tag, choose a new tag name that follows a valid semantic versioning scheme (MAJOR.MINOR.PATCH), for instance 1.0.17. Make sure that this version is greater than any previous release. **Note**: only a maintainer of this repository may create a patch for the default branch. Once the image is created, it will be placed in [the production container registry](https://baltig.sandia.gov/scot/scot4/SCOT-API/container_registry/340) with a tag name matching the git tag, and it will also overwrite `latest`.

On tag validity: a pipeline job verifies that the tag is a valid semantic version string and is greater than any previous version. The script lives in the [SCOT4 Pipeline Support Repo](https://baltig.sandia.gov/scot/scot4/pipeline-support/-/blob/main/scripts/tag_validate.py?ref_type=heads) and is bundled into a container image for use in pipelines. It uses the GitLab release API to check all of the repo's releases and the git tags associated with them.
#### Initial Setup

Create a `.env` file:

```shell
touch src/.env
```

It needs to contain these keys:
```
# PROD or DEV
ENV=
SECRET_KEY=
SQLALCHEMY_DATABASE_URI=sqlite:///../scot4-test.db
```
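For illustration only (the app presumably loads these through its own settings machinery, e.g. pydantic), `KEY=VALUE` lines like the ones above can be read with a few lines of stdlib Python:

```python
def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env

sample = "# PROD or DEV\nENV=DEV\nSQLALCHEMY_DATABASE_URI=sqlite:///../scot4-test.db"
assert parse_env(sample)["ENV"] == "DEV"
```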
Note that `main.py` is called from the top-level directory:
```shell
export PYTHONPATH=$PWD/src
python src/app/main.py
```

#### Running

Using main:
```shell
python src/app/main.py
```

**OR**

Using uvicorn:
```shell
export PYTHONPATH=$PWD/src
cd src/app
uvicorn main:app --host=127.0.0.1 --port=8080 --reload
```

#### Running Tests
Now in parallel!
- With `-n auto`, pytest-xdist will use as many processes as your computer has physical CPU cores.
- With `--dist loadfile`, tests are grouped by their containing file, and groups are distributed to available workers as whole units. This guarantees that all tests in a file run in the same worker.
- Make sure that the SQLite database is in memory, otherwise tests can crash.

```shell
export PYTHONPATH=$PWD/src:$PWD/tests
export SQLALCHEMY_DATABASE_URI="sqlite://"
export ENV=TEST
pytest -n auto --dist loadfile tests/
```

To run pytest normally:
```shell
export PYTHONPATH=$PWD/src:$PWD/tests
pytest tests/
```
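The in-memory requirement can be illustrated with the stdlib `sqlite3` module; SQLAlchemy's bare `sqlite://` URI selects the same `:memory:` mode, so each connection gets a private, throwaway database:

```python
import sqlite3

# ":memory:" creates a fresh database that lives only as long as the connection
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, subject TEXT)")
conn.execute("INSERT INTO events (subject) VALUES (?)", ("test event",))
rows = conn.execute("SELECT subject FROM events").fetchall()
conn.close()
# rows == [("test event",)]
```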
What still needs to be done or thought about:

Roles & Permissions
* Administrator - full access
* Incident Commander - view/edit events and alerts
* Observer - view events

Schemas need:
* PositiveInt
* EmailStr
* AnyUrl
* None
* Fix for

DB models
* need to be pluralized

conversion/conversion.md

Lines changed: 48 additions & 0 deletions
# SCOT3 to SCOT4 Conversion Utilities

This directory contains scripts for migrating data from version 3 of SCOT to version 4. They are grouped into three categories: database migrations, file migrations, and extra (optional) migrations. Three bash shell scripts have been provided for you to run the applicable migrations in each category.

## Database Conversion
This set of scripts migrates the core database data to the SCOT4 database by pulling data directly from the SCOT3 MongoDB database. Almost all SCOT3 installations migrating to SCOT4 will want to do this. The `database_conversion.sh` script will run all of the necessary scripts for you.

The following environment variables should be set when running `database_conversion.sh`:
- MONGO_DB_URI - the URI used to connect to the SCOT3 MongoDB database
- MONGO_DB_NAME - the name of the SCOT3 MongoDB database
- SQL_URI - the URI used to connect to the SCOT4 SQL database
- SQL_PW - the password used to connect to the SCOT4 SQL database
- SCOT_MIGRATION_STAGING_DIRECTORY (optional) - the directory used to stage intermediate files for the conversion (created if it does not exist; defaults to /data/scot4_migration_sync/)
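A small pre-flight check of the required variables can save a failed half-run. This is a hypothetical helper sketched for illustration, not part of the shipped scripts:

```python
import os

REQUIRED = ["MONGO_DB_URI", "MONGO_DB_NAME", "SQL_URI", "SQL_PW"]

def missing_vars(environ=os.environ):
    """Return the names of required migration variables that are unset or empty."""
    return [name for name in REQUIRED if not environ.get(name)]

# With an empty environment, every required variable is reported missing
assert missing_vars({}) == REQUIRED
```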
## Extra Migrations
This set of scripts contains useful entries that are not strictly required, but which will ease the transition from SCOT3 to SCOT4.

### Signature Migrations
One of the primary ways that SCOT4 differs from SCOT3 is that guides must be linked to alerts by way of a signature. In SCOT4, signatures more explicitly represent the rules that generate alerts, so guides are linked to specific signatures, and those signatures are then linked to alerts when they fire.

Because of this, in order for guides for new and past alerts to be linked properly, each must be linked to a signature. The script `guide_sigs_link.py` will attempt to link guides to signatures by name, or create a new signature for a guide if there isn't already a signature with the same name. Likewise, `link_alerts_signatures.py` will attempt to link all existing alerts with a signature as if those alerts had just been generated. `signature_permissions.py` will also fix permissions on existing signatures (since signatures didn't have permissions in SCOT3).

If you would like to perform all of these extra signature migration steps, run the `signature_conversion.sh` script.

The following environment variables should be set when running `signature_conversion.sh`:
- SQLALCHEMY_DATABASE_URI - set to the SCOT4 database URI, as if running the SCOT4 API
- PYTHONPATH - set to include the src/ directory of the SCOT4 API (the scripts borrow code from the API to run)

### Admin Migration
By default, the SCOT4 migration creates a user named `scot-admin` to be the initial superuser for SCOT. You can give this user a password and an API key by setting the `SCOT_ADMIN_PASSWORD` and/or `SCOT_ADMIN_APIKEY` environment variables respectively, then running the `extra_migration/update_admin_password_and_api_key.py` script.

The following environment variables should be set when running `update_admin_password_and_api_key.py`:
- SQLALCHEMY_DATABASE_URI - set to the SCOT4 database URI, as if running the SCOT4 API
- PYTHONPATH - set to include the src/ directory of the SCOT4 API (the script borrows code from the API to run)
## File Conversion
Finally, if you uploaded files to SCOT3 and wish to migrate them to SCOT4, they must be migrated separately. This also applies to cached images in entries that were downloaded and subsequently hosted through SCOT. These scripts upload the files and cached images to the SCOT4 file store and also rewrite existing entries to point to the new files.

Before files and images can be migrated, **you must configure a file storage mechanism on the SCOT4 instance**. This usually means that you must set up the API and frontend, then configure a file storage option through the admin panel on the frontend. Once you have done this, you can run the `file_conversion.sh` script to migrate both files and cached images from SCOT3.

The following environment variables should be set when running `file_conversion.sh`:
- MONGO_DB_URI - the URI of the SCOT3 MongoDB database
- MONGO_DB_NAME - the name of the SCOT3 MongoDB database
- SCOT4_URI - the base URI of the SCOT4 installation (e.g. https://scot4.example.com)
- SCOT_ADMIN_APIKEY - a SCOT4 API key with admin privileges (see above for one way to create one)
- SCOT3_FILE_PREFIX (needed for file migration) - the directory under which the files were stored in the SCOT3 database; this defaults to the SCOT3 default, `/opt/scotfiles/`
- SCOT_FILES_DIR (needed for file migration) - the directory on the current machine in which the old SCOT3 files are stored (with the same file structure that the SCOT3 installation had)
- SCOT_CACHED_IMAGES_DIR (needed for cached images migration) - the directory on the current machine that contains the SCOT3 cached images in their original file structure (this is usually the /cached_images/ directory in the SCOT3 files)

conversion/database_conversion.sh

Lines changed: 17 additions & 0 deletions
```shell
#!/bin/bash
set -e

CONVERSION_DIR=$(dirname "$0")
SCOT_MIGRATION_STAGING_DIRECTORY="${SCOT_MIGRATION_STAGING_DIRECTORY:-/data/scot4_migration_sync/}"

# Set up dirs
mkdir -p "$SCOT_MIGRATION_STAGING_DIRECTORY/conversion_staging"

# Create all TSVs from mongo data
cd "$CONVERSION_DIR/database_migration"
python3 ./scot3_scot4_mongo_tsv_export.py

# Tear down DB and import TSVs
mysqlsh "$SQL_URI" --password="$SQL_PW" --file ./initial_scot4_database.sql
mysqlsh "$SQL_URI" --password="$SQL_PW" --file ./scot3_scot4_tsv_import.py
mysqlsh "$SQL_URI" --password="$SQL_PW" --file ./fix_parent_entry_ids.sql
```
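The staging-directory default relies on the `${VAR:-default}` expansion, which substitutes the fallback only when the variable is unset or empty:

```shell
# With the variable unset, the default wins; an exported value would override it
unset SCOT_MIGRATION_STAGING_DIRECTORY
staging="${SCOT_MIGRATION_STAGING_DIRECTORY:-/data/scot4_migration_sync/}"
echo "$staging"   # prints /data/scot4_migration_sync/
```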
Lines changed: 66 additions & 0 deletions
```python
import os
import csv
import json

import tqdm


def main(mongo_db=None):
    schema_id_map = {}
    staging_directory = os.getenv('SCOT_MIGRATION_STAGING_DIRECTORY')
    scot3_alertgroup_count = mongo_db.alertgroup.count_documents({})
    scot3_alertgroups = mongo_db.alertgroup.find()
    _id = 1
    with open(f'{staging_directory}/alertgroup_schema_keys.csv', 'w+') as alertgroup_schema_keys_csv:
        writer = csv.writer(alertgroup_schema_keys_csv, dialect='unix', delimiter="\t", quotechar="'")
        writer.writerow(['schema_key_name', 'alertgroup_id', 'schema_key_order', 'schema_key_id'])
        with tqdm.tqdm(total=scot3_alertgroup_count) as pbar:
            for alertgroup in scot3_alertgroups:
                alerts = mongo_db.alert.find({'alertgroup': alertgroup['id']})
                schema_keys = set()
                for alert in alerts:
                    schema_keys.update([k.lower() for k in alert['data'].keys()])
                new_schema_keys = [[x.lower(), alertgroup['id'], c] for c, x in enumerate(schema_keys)
                                   if x.lower() not in ('_raw', 'columns', 'search')]
                for schema_key_iter in new_schema_keys:
                    schema_key_iter.append(_id)
                    schema_key_name = schema_key_iter[0]
                    alertgroup_id = schema_key_iter[1]
                    _key = f"{schema_key_name}-{alertgroup_id}"
                    schema_id_map[_key] = _id
                    writer.writerow(schema_key_iter)
                    _id += 1
                pbar.update(1)
    scot3_alerts = mongo_db.alert.find()
    scot3_alert_count = mongo_db.alert.count_documents({})
    # Write one row per alert key into the alert data csv
    with tqdm.tqdm(total=scot3_alert_count) as pbar:
        with open(f'{staging_directory}/alert_data.csv', 'w+') as alert_data_csv:
            writer = csv.writer(alert_data_csv, dialect='unix', delimiter="\t", quotechar="'")
            for alert in scot3_alerts:
                alert_datas = transform_alert(alert=alert, schema_id_map=schema_id_map)
                writer.writerows(alert_datas)
                pbar.update(1)


def transform_alert(alert=None, schema_id_map=None):
    alert_datas = []
    # First transform the alert['data'] and alert['data_with_flair'] dictionaries to have
    # only lowercase keys. This eliminates duplicate keys.
    alert['data'] = {k.lower(): v for k, v in alert['data'].items()}
    alert['data_with_flair'] = {k.lower(): v for k, v in alert['data_with_flair'].items()}
    alertgroup_id = alert['alertgroup']
    unique_keys = set(list(alert['data'].keys()) + list(alert['data_with_flair'].keys()))
    for k in unique_keys:
        if k in ('columns', 'search', '_raw'):
            # We don't care about these columns because they should not show up in an alertgroup table
            continue
        # Get the schema key id from the map we created beforehand
        schema_id = schema_id_map.get(f"{k}-{alertgroup_id}")
        if schema_id is None:
            continue
        data_value = json.dumps(alert['data'].get(k))
        data_value_flaired = json.dumps(alert['data_with_flair'].get(k))
        alert_datas.append([data_value, data_value_flaired, schema_id, alert['id']])
    return alert_datas
```
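The lowercasing in `transform_alert` is what collapses duplicate keys that differ only in case; a tiny standalone illustration with made-up alert data:

```python
data = {"SrcIP": "1.2.3.4", "srcip": "1.2.3.4", "Message": "login failure"}

# The dict comprehension keeps the last value seen for each lowercased key
lowered = {k.lower(): v for k, v in data.items()}
assert sorted(lowered) == ["message", "srcip"]
```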
Lines changed: 36 additions & 0 deletions
```python
import csv
import os
from datetime import datetime
from datetime import timezone

import tqdm

from conversion_utilities import write_permission, write_tag_source_links


def main(mongo_db=None, role_lookup=None, tag_lookup=None, source_lookup=None):
    staging_directory = os.getenv('SCOT_MIGRATION_STAGING_DIRECTORY')
    permission_csv = open(f'{staging_directory}/alertgroup_permissions.csv', 'w+')
    permission_csv_writer = csv.writer(permission_csv, dialect='unix', delimiter='\t', quotechar="'")
    permission_csv_writer.writerow(['role_id', 'target_type', 'target_id', 'permission'])

    links_csv = open(f'{staging_directory}/alertgroup_links.csv', 'w+')
    link_csv_writer = csv.writer(links_csv, dialect='unix', delimiter='\t', quotechar="'")
    link_csv_writer.writerow(['v0_type', 'v0_id', 'v1_type', 'v1_id'])
    scot3_alertgroup_count = mongo_db.alertgroup.count_documents({})
    scot3_alertgroups = mongo_db.alertgroup.find()
    with open(f'{staging_directory}/alertgroups.csv', 'w+') as alertgroup_csv:
        writer = csv.writer(alertgroup_csv, dialect='unix', delimiter="\t", quotechar="'")
        writer.writerow(['alertgroup_id', 'tlp', 'subject', 'created_date', 'modified_date', 'view_count'])
        with tqdm.tqdm(total=scot3_alertgroup_count) as pbar:
            for alertgroup in scot3_alertgroups:
                view_count = alertgroup.get('views')
                if view_count is None:
                    view_count = 0
                new_alertgroup = [
                    alertgroup['id'],
                    'unset',
                    alertgroup['subject'],
                    datetime.fromtimestamp(alertgroup['created']).astimezone(timezone.utc).replace(tzinfo=None),
                    datetime.fromtimestamp(alertgroup['updated']).astimezone(timezone.utc).replace(tzinfo=None),
                    view_count,
                ]
                writer.writerow(new_alertgroup)
                write_permission(thing=alertgroup, thing_type='alertgroup', role_lookup=role_lookup,
                                 permission_csv_writer=permission_csv_writer)
                write_tag_source_links(thing=alertgroup, thing_type='alertgroup', tag_lookup=tag_lookup,
                                       source_lookup=source_lookup, link_csv_writer=link_csv_writer)
                pbar.update(1)

    permission_csv.close()
    links_csv.close()


if __name__ == "__main__":
    main()
```
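The `created`/`updated` handling above converts SCOT3 epoch seconds into naive UTC datetimes regardless of the machine's local timezone; the same chain in isolation:

```python
from datetime import datetime, timezone

created = 1704067200  # 2024-01-01 00:00:00 UTC as epoch seconds

# fromtimestamp() yields local time; astimezone(utc) normalizes it, and
# replace(tzinfo=None) strips the zone so the DB stores a naive UTC value
naive_utc = datetime.fromtimestamp(created).astimezone(timezone.utc).replace(tzinfo=None)
assert naive_utc == datetime(2024, 1, 1, 0, 0, 0)
```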
