Study of the Reproducibility and Longevity of Dockerfiles

ECG is a program that automates software environment checking for scientific artifacts.

It is meant to be executed periodically to analyze variations in the software environment of the artifact through time.

How it works

ECG takes as input a JSON configuration that specifies where to download the artifact, where to find the Dockerfile to build within the artifact, and which package managers the Docker container uses.

ECG then downloads the artifact, builds the Dockerfile, and lists the packages installed in the Docker container. It also records any errors encountered while building the Dockerfile, and logs the hash of the artifact for future comparison.
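To illustrate why logging the artifact hash is useful, here is a minimal sketch of how a downloaded artifact could be hashed and compared against a previously logged value. This is not taken from ecg.py: the function names file_sha256 and artifact_changed are hypothetical, and the choice of SHA-256 is an assumption, since the hash algorithm used by ECG is not stated here.

```python
import hashlib


def file_sha256(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`.

    SHA-256 is an assumption made for this sketch; ECG may use another hash.
    The file is read in chunks so that large artifacts need not fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as artifact:
        for chunk in iter(lambda: artifact.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def artifact_changed(previous_hash, current_hash):
    """Return True if the artifact differs from the previously logged version."""
    return previous_hash != current_hash
```

If two runs at different dates yield different hashes, the artifact itself has changed, so differences in the build result cannot be attributed to the software environment alone.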

Setup

A Linux operating system and the following packages are required:

  • snakemake
  • gawk
  • nickel

The following Python package is also required:

  • requests

Alternatively, you can use the Nix package manager and run nix develop in this directory to set up the full software environment.

Usage

Run ecg.py as follows:

python3 ecg.py <config_file> -p <pkglist_path> -l <log_file> -b <build_status_file> -a <artifact_hash_log> -c <cache_directory>

Where:

  • <config_file> is the configuration file of the artifact in JSON format. An example is given in artifacts_json/test.json.
  • <pkglist_path> is the path to the file where the package list generated by the program should be written.
  • <log_file> is the path to the file where the output of the program is logged.
  • <build_status_file> is the path to the file where the build summary of the Docker image given in the configuration file is written.
  • <artifact_hash_log> is the path to the file where the hash of the downloaded artifact is logged.
  • <cache_directory> is the path to the cache directory, where downloaded artifacts will be stored for future usage.
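As a sketch of how the arguments above fit together, the following Python snippet assembles the command line for one run. The helper build_ecg_command and all output paths (pkglist.csv, ecg.log, build_status.csv, artifact_hash.csv, cache) are hypothetical and chosen only for illustration; only the flags and artifacts_json/test.json come from this document.

```python
import subprocess


def build_ecg_command(config_file, pkglist_path, log_file,
                      build_status_file, artifact_hash_log, cache_directory):
    """Assemble the ecg.py command line using the flags documented above."""
    return [
        "python3", "ecg.py", config_file,
        "-p", pkglist_path,
        "-l", log_file,
        "-b", build_status_file,
        "-a", artifact_hash_log,
        "-c", cache_directory,
    ]


def run_ecg(command):
    """Run ECG and raise CalledProcessError if it exits with a non-zero status."""
    subprocess.run(command, check=True)


# Hypothetical output paths; adapt them to your own layout.
example_command = build_ecg_command(
    "artifacts_json/test.json",
    "pkglist.csv",
    "ecg.log",
    "build_status.csv",
    "artifact_hash.csv",
    "cache",
)
print(" ".join(example_command))
```

Calling run_ecg(example_command) from the repository root would then start the analysis, provided Docker and the dependencies listed in Setup are available.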

License

TBD