\item\href{https://orcid.org/0009-0003-7645-5044}{Quentin \textsc{Guilloteau}}: Conceptualization, Methodology, Software, Data Curation, Supervision, Project administration
\item ...
\end{itemize}
\subsection{Description of the project}
This project aims to show the limitations of using Docker containers as a reliable reproducibility tool.
In particular, as Docker relies on non-reproducible tools, it is difficult to construct a \dfile\ that will rebuild the \emph{exact} same software environment in the future.
In this project, we will collect research artifacts coming from various scientific conferences containing \dfile s, rebuild them periodically, and observe the variation in the resulting software environments.
This Python script\ \cite{ecg_code} takes as input a (verified) JSON representation of the Nickel artifact description, and then tries to build the \dfile\ contained in the artifact.
\item If the build is successful, gather information about the produced software environment (Sections \ref{sec:package_managers}, \ref{sec:git}, \ref{sec:misc}, and \ref{sec:pyenv})
\dfile\ authors can also install packages from source.
One way to do this is via Git.
In this case, once the container built sucessfully, \ecg\ logs into the container and extract the commit hash of the repository (via \texttt{git log}).
\paragraph{Example of Data}
Below is an example of data collected for a Git repository called \texttt{ctf}:
In the case where the \dfile\ download content from the internet (\eg\ archives, binaries), \ecg\ will download the same content on the host machine (\ie\ not in the container) and then compute the cryptographic hash of the downloaded content.
\paragraph{Example of Data}
Below is an example of data collected for the downloading of the \texttt{Miniconda3} binary:
Even if \texttt{pip} is managed in the ``Package Managers'' section (Section \ref{sec:package_managers}), when authors use a virtual environment, \ecg\ needs to query this exact Python environment, and not the global one.
The gathering part of the \dfile s will be done right after the publication of the proceeding of a conference.
Contributors of the ``Data Curation'' phase will go through all the papers and their artifact to extract artifact containing \dfile s.
These \dfile s will then be captured with the Nickel description (see Section \ref{sec:nickel}).
To avoid mistake, at least two contributors will be assigned by paper.
If there is any difference in the Nickel description of an artifact, a discussion between the contributors will be initiated to conclude on the correct artifact description.
The first part of the analysis can be done statically from the description of the artifacts.
\begin{itemize}
\item Number/Proportion of \dfile s using particular package managers
\item Number/Proportion of \dfile s downloading from Git repositories
\item Number/Proportion of \dfile s downloading from internet
\end{itemize}
\subsection{Dynamic Analysis}
The second part of the analysis will be done after the first year of data collection, and will focus on the temporal evolution of properties of the artifacts.
\paragraph{Artifact Sources}
\begin{itemize}
\item Number/Proportion of artifacts that can be downloaded
\item Number/Proportion of artifacts which content has changed
\end{itemize}
\paragraph{Build Status}
\begin{itemize}
\item Number/Proportion of \dfile s that build succesfully
\item Number/Proportion of \dfile s errors (\texttt{baseimage\_unavailable}, \texttt{time\_execeed}, \texttt{unknown\_error}) for the failed builds
\end{itemize}
\paragraph{Software Environment}
\begin{itemize}
\item Number of installed packages per container
\item Number/Proportion of packages that changed version since last build
\item Package sources (package manager, Git, Misc) from where packages are changing the most
\end{itemize}
\section{Other}
\subsection{Computational Environment}
The builds of the \dfile s will be executed on the french testbed Grid'5000 \cite{grid5000}.
\noteqg{TODO: Which cluster?}
The software environment will be managed by Nix \cite{dolstra2004nix}.