The Frontier Development Lab (FDL) is a partnership between NASA, ESA, the SETI Institute, and commercial AI partners, formed with the purpose of applying ML technologies to space science and pushing the frontiers of research across a broad range of space-related topics. The program consists of an annual intensive eight-week research sprint during which small teams, with members drawn from the fields of AI, data science, and the space sector, work on a range of challenge areas.
The PyATMOS dataset is the result of the work of one of the teams from the FDL 2018 Astrobiology challenge. It comprises a set of model Earth-like exoplanet atmospheres spanning a range of plausible parameter space.
ATMOS is a software package produced and maintained by the Virtual Planetary Laboratory which is publicly available online. The ATMOS package is a coupled photochemistry-climate model, designed to simulate "stable" atmospheres given thermodynamic and chemical inputs. The inputs to ATMOS can be either a set of gas concentrations or a set of gas fluxes at the surface of the planet, as well as planetary parameters and stellar parameters (e.g. the gravitational field strength of the planet, and the stellar spectrum). If the inputs are gas concentrations, then the software will find what gas fluxes are required to maintain the given concentrations in a stable atmosphere. Conversely, if the inputs are fluxes, then the software will calculate the gas concentrations of the resulting stable atmosphere. A stable atmosphere is defined as one where the gas concentrations do not change over a sufficiently long timestep—the length of which varies depending on context but can be as long as the age of the universe. Sometimes a "stable" atmosphere cannot be maintained, for example if certain chemical species are broken down or produced by photochemical reactions too quickly. The output of ATMOS is a 1D column of the resultant atmosphere's temperature, pressure, gas concentrations and gas fluxes as a function of altitude.
PyATMOS (available on GitHub) is the software used to run ATMOS hundreds of thousands of times and create a database of exoplanet atmospheres. It is a Dockerized wrapper for ATMOS, controllable via Python, and should therefore be usable on any machine with Docker and Python, regardless of the operating system.
The NASA Exoplanet Archive hosts around 125,000 simulated exoplanet atmospheres. The astrobiology team of the 2018 NASA FDL programme used the PyATMOS software to create a database of possible stable exoplanet atmospheres. All of these are based around an Earth-like planet that orbits a star similar to the Sun, but with different gas mixtures in their atmospheres. A parameter space of possible atmospheres was scanned by varying the concentrations of the following gases:
Clicking on the 'PyATMOS Dataset' tab opens an interface for interactively browsing the dataset. The interface is composed of two panels, each comprising a table and an interactive plot.
The upper panel displays a summary table of all the available atmospheric models, with one model per row, and with columns representing the basic input and output parameters for each model. By default, the plot to the right of the table shows a two-dimensional histogram of the distribution of the models across parameter space—initially against surface temperature and pressure.
The bottom panel shows a preview of the model indicated by the currently highlighted row in the summary table (top panel). The table in the bottom panel shows the content of one of the main output files as returned by PyATMOS, parsed_clima_final.csv (see 'Data Products Description' for more information). This table records the variation of several quantities (pressure, temperature, etc.) through each layer of the atmosphere, with one row per layer in the atmosphere. The plot to its right shows a visualisation of the data in the table, by default plotting the curve of temperature against altitude.
Clicking on any row in the summary table (top panel) highlights that row, and updates the model preview panel (bottom panel) accordingly. The highlighted row (atmospheric layer) in the bottom table is indicated by a similarly-colored point on the curve in the model preview plot to the right. Clicking any row in the lower table updates the marked point in that plot.
Both of the tables can be sorted by clicking on the column headings, and filtered by entering filter criteria into the text boxes underneath the column headings (e.g., you might enter '<300' in the 'Temperature' column of the summary table). The respective plots will be updated accordingly.
The table and plot interfaces are inherited from a toolkit developed at IPAC to support the IRSA Viewer and Catalog Search tools, and provide a lot more functionality via the icons in the gray bar at the top of each. For more information, refer to the following parts of the IRSA Catalog Search Tool documentation:
Downloading a single model
To download the full data for a single model, simply highlight the row
corresponding to the model you wish to download in the summary table
in the top panel, and then click 'Download This Model' in the bottom
panel. This will immediately download a tarred set of all the files
associated with the chosen model.
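Once a model bundle has been downloaded, its contents can be inspected
without unpacking it. The following is a minimal sketch using Python's
standard tarfile module. The function name list_model_files is
hypothetical, and the mode "r:*" is used so that the archive opens
whether or not it is gzip-compressed.

```python
import tarfile


def list_model_files(bundle_path):
    """Return the sorted names of the regular files stored in a tar bundle."""
    # "r:*" lets tarfile detect the compression (none, gzip, ...) by itself.
    with tarfile.open(bundle_path, "r:*") as bundle:
        return sorted(m.name for m in bundle.getmembers() if m.isfile())
```

Calling list_model_files on the path of a downloaded bundle should
return the data files described under 'Data Products Description'.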
Downloading a selection of models
To download data for a selection of models, indicate each of
the models you wish to download by checking their respective checkboxes
at the left side of the summary table in the top panel. If you have a
large number of models to download, judicious application of
filters on the columns, followed by use of the 'check-all' check box at
the very top-left of the table should save considerable effort. Once
you have checked all the required models, click 'Download All Checked
Models'. The system will generate a single (possibly large!) wget
script which you can run to retrieve all the required models. The
script will retrieve a .csv file containing selected rows of the
summary table (pyatmos_models.csv), and a single gzipped tar bundle for every model that you have
requested. Note that this can result in a large number of files, and a
long download time - you may wish to split the wget script in order to
manage the download on your filesystem. If you are
trying to download a significant fraction of the dataset, you may find
it more efficient to consider instead downloading the entire dataset in
compressed form (see below).
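One possible way to split a large wget script is sketched below in
Python. It assumes, without having been checked against a real
generated script, that each download is a single line beginning with
"wget". The function names and the output filename prefix are
hypothetical.

```python
def split_commands(script_lines, chunk_size):
    """Split the wget command lines of a script into chunks of chunk_size."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    # Assumption: one download per line, each line starting with "wget".
    commands = [ln.strip() for ln in script_lines if ln.strip().startswith("wget")]
    return [commands[i:i + chunk_size] for i in range(0, len(commands), chunk_size)]


def write_chunks(chunks, prefix="pyatmos_part"):
    """Write each chunk to its own shell script and return the filenames."""
    names = []
    for index, chunk in enumerate(chunks, start=1):
        name = f"{prefix}_{index:03d}.sh"
        with open(name, "w") as handle:
            handle.write("#!/bin/sh\n" + "\n".join(chunk) + "\n")
        names.append(name)
    return names
```

Each resulting script can then be run separately, for example into its
own directory, so that no single directory receives every bundle.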
Downloading the entire dataset
To download the entire dataset comprising all the atmospheric
models and the complete summary table, download and run the following wget
script:
Download Entire FDL PyATMOS Dataset
(compressed: ~40 GB; uncompressed: ~110 GB)
Note that the dataset is large, and the download may take considerable time!
For a description of the files delivered with each model, see 'Data Products Description' below.
There is a set of 20 data files associated with each model atmosphere, organized into a single directory labelled with a unique hash for each atmosphere (identified in the 'Directory Name' column of the summary table).
In each "atmosphere" directory we see the following files:
(* Files marked with an asterisk may or may not be present for a given model - see below.)
In addition, if you downloaded multiple models, a single summary table file is included corresponding to the table in the top panel of the interactive tool:
Below is a description of each of these files. The products for each model can broadly be separated into two classes, those produced and used directly by the ATMOS software, and those produced by PyATMOS.
pyatmos_models.csv
Single summary table for all the downloaded data, with one model
per row, and with columns representing the basic input and output
parameters for each model, as well as the hashes used to label the
directories for each model. See 'Column
Definitions' for a complete description of the columns. Provided
only if more than one model is downloaded.
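The summary table can also be filtered offline, in the same spirit as
entering '<300' in the 'Temperature' column of the interactive tool.
The sketch below uses only Python's standard csv module and the
column names "hash" and "temperature_kelvin" listed in the column
definitions. The function name select_hashes and the sample rows are
made up for illustration.

```python
import csv
import io


def select_hashes(csv_stream, max_temperature_k):
    """Return directory hashes of models with surface temperature below a limit."""
    selected = []
    for row in csv.DictReader(csv_stream):
        value = (row.get("temperature_kelvin") or "").strip()
        if not value:
            continue  # skip rows with no recorded surface temperature
        if float(value) < max_temperature_k:
            selected.append(row["hash"])
    return selected


# Tiny made-up table with the same column names, for illustration only.
sample = io.StringIO(
    "hash,temperature_kelvin,pressure_bar\n"
    "aaa111,288.0,1.0\n"
    "bbb222,310.5,1.1\n"
)
print(select_hashes(sample, 300))  # prints ['aaa111']
```

With a real download, the same function can be given an open file
handle for pyatmos_models.csv instead of the in-memory sample.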
species.dat
This file controls the chemical species that were present
during the atmosphere simulation. One of the important fields is
"FIXEDMR", which corresponds to the mixing ratio of this particular
gas at the surface of the planet. The other important field is
"SGFLUX" which corresponds to the gas flux at the surface of the
planet. One can only have either a constant mixing ratio or a constant
flux, and this is toggled by the LBOUND field. An LBOUND of 1
indicates a constant mixing ratio is being set, a value of 2 indicates
a constant flux is being set. Other fields are described in more
detail inside each species.dat file.
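As a small illustration of the LBOUND toggle, the following Python
sketch maps the two values described above to the boundary condition
they select. The function name is hypothetical, and any other value is
simply reported as "other", since only the values 1 and 2 are described
here.

```python
def describe_lower_boundary(lbound):
    """Map an LBOUND value to the lower boundary condition it selects."""
    conditions = {
        1: "constant mixing ratio (FIXEDMR)",
        2: "constant flux (SGFLUX)",
    }
    # Any other value is explained inside each species.dat file.
    return conditions.get(int(lbound), "other (see species.dat)")
```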
Clima_log.txt and Photo_log.txt
These two files are simply a recording of the text output from running
the "clima" and "photochem" parts of the ATMOS software, respectively.
TempIn.dat and TempOut.dat
These two files are used internally by the climate model. They contain
the temperature profile (as a function of altitude, or atmospheric
layer number), and also the "FH2O" values in the atmosphere at that
altitude. This information is repeated in parsed_clima_initial.csv and
parsed_clima_final.csv.
out.out
Contains lists and tables of many of the results of ATMOS, in
particular the photochemical model. These results include the
calculated fluxes and mixing ratios of gases of the stable planetary
atmosphere. This file has been parsed for the user by the PyATMOS
software to extract the relevant information. The two resulting files
are "parsed_photochem_fluxes.csv" and
"parsed_photochem_mixing_ratios.csv".
clima_allout.tab
This file stores tabulated results from the climate model within
ATMOS, as well as some of the inputs to the climate model. The inputs
include the initial state of the atmosphere (temperature, pressure as
a function of altitude) as well as some stellar parameters and
planetary albedo. The atmosphere's final state (again temperature and
pressure as a function of altitude) is also stored. This file has been
parsed for the user by the PyATMOS software to extract the relevant
information. The three resulting files are "parsed_clima_initial.csv",
"parsed_clima_final.csv" and "parsed_clima_iterations.csv".
parsed_clima_initial.csv and parsed_clima_final.csv
These contain the pressure and temperature of the atmosphere as a
function of altitude for the initial state of the atmosphere as well
as the final state, respectively. The relevant columns are "ALT"
(altitude of the atmosphere layer in km), "P" (pressure of the
atmosphere layer in bar), "T" (temperature of the atmosphere layer in
Kelvin). The other columns are not relevant.
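A minimal Python sketch for reading the altitude, pressure and
temperature columns from one of these files is given below. It uses
only the standard csv module and matches column names
case-insensitively, because the altitude column appears as both "ALT"
and "Alt" in this documentation. The function names are hypothetical.

```python
import csv
import io


def read_profile(csv_stream):
    """Return (altitude_km, pressure_bar, temperature_k) tuples, one per layer."""
    profile = []
    for row in csv.DictReader(csv_stream):
        # Match column names case-insensitively and ignore padding.
        clean = {key.strip().upper(): value for key, value in row.items() if key}
        profile.append((float(clean["ALT"]), float(clean["P"]), float(clean["T"])))
    return profile


def surface_layer(profile):
    """Return the layer with the lowest altitude."""
    return min(profile, key=lambda layer: layer[0])
```

Passing an open file handle for parsed_clima_final.csv to read_profile
gives the final temperature and pressure profile, which can then be
plotted or compared with the initial state.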
parsed_clima_iterations.csv
A record of the temperature and temperature change between each
iteration of the photochemical model computation.
parsed_photochem_fluxes.csv and
parsed_photochem_mixing_ratios.csv
These files contain information parsed from the "out.out" file into a
more computer-friendly format: the fluxes and mixing ratios,
respectively, of each gas species (labelled by the column headings) at
each altitude ("Z", in cm).
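Because "Z" is given in cm while the climate files use km, a unit
conversion is useful when combining the two. The sketch below extracts
the profile of a single species with Python's standard csv module. The
function name is hypothetical, and the species label passed in must
match one of the column headings actually present in the file.

```python
import csv
import io

CM_PER_KM = 1.0e5


def species_profile(csv_stream, species):
    """Return (altitude_km, value) pairs for one gas species column."""
    pairs = []
    for row in csv.DictReader(csv_stream):
        altitude_km = float(row["Z"]) / CM_PER_KM  # "Z" is given in cm
        pairs.append((altitude_km, float(row[species])))
    return pairs
```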
in.dist
The set of parameters used to configure the photochemical model.
input_clima.dat
This file contains the inputs to the climate model of ATMOS, including
the stellar parameters and some of the planetary parameters—for
example, the gravitational field strength of the planet (at the
planet's surface), or the fraction of solar radiance received
(relative to what the Earth receives).
Mostly the default settings in this file are used, except in cases
where the input methane mixing ratio was > 1e-4, in which case, a
special flag was set to allow the climate model to consider hazes,
which are relevant for atmospheres with large methane concentrations.
If this file is not present in the download, then one can assume the
default settings were used.
mixing_ratios.dat
Mixing ratios of the relevant gases passed from the photochemical part
of the model to the climate model. This file is handled internally by
ATMOS and so is not always included in the downloadable files.
out.dist
The set of parameters produced by the photochemical
model. This can be used as a replacement for "in.dist" in further
iterations of the photochemical model if a slightly different set of
parameters were used.
run_metadata.json
This file stores metadata about the PyATMOS run that was used to
generate this atmosphere, including the initial gas concentrations,
duration of the calculations, and number of iterations required by the
photochemical model to find a stable atmosphere.
*npy.npz files
These are duplicates of the csv files that were created by PyATMOS in
a compressed numpy format.
Column Name | Table Label | Units | Description |
---|---|---|---|
concentration_CH4 | CH4 Concentration | fractional | CH4 concentration at planet surface calculated by model |
concentration_CO2 | CO2 Concentration | fractional | CO2 concentration at planet surface calculated by model |
concentration_H2 | H2 Concentration | fractional | H2 concentration at planet surface calculated by model |
concentration_H2O | H2O Concentration | fractional | H2O concentration at planet surface calculated by model |
concentration_O2 | O2 Concentration | fractional | O2 concentration at planet surface calculated by model |
flux_CH4 | CH4 Flux | molecules/s/cm2 | CH4 flux required to maintain gas concentrations at planet surface, calculated by model |
flux_CO2 | CO2 Flux | molecules/s/cm2 | CO2 flux required to maintain gas concentrations at planet surface, calculated by model |
flux_H2 | H2 Flux | molecules/s/cm2 | H2 flux required to maintain gas concentrations at planet surface, calculated by model |
flux_H2O | H2O Flux | molecules/s/cm2 | H2O flux required to maintain gas concentrations at planet surface, calculated by model |
flux_O2 | O2 Flux | molecules/s/cm2 | O2 flux required to maintain gas concentrations at planet surface, calculated by model |
hash | Directory Name | none | Identifier for the directory containing full model data |
input_CH4 | Input CH4 concentration | fractional | CH4 concentration input to model (at planet surface) |
input_CO2 | Input CO2 concentration | fractional | CO2 concentration input to model (at planet surface) |
input_H2 | Input H2 concentration | fractional | H2 concentration input to model (at planet surface) |
input_H2O | Input H2O concentration | fractional | H2O concentration input to model (at planet surface) |
input_O2 | Input O2 concentration | fractional | O2 concentration input to model (at planet surface) |
pressure_bar | Pressure | bar | Pressure at planet surface |
temperature_kelvin | Temperature | K | Temperature at planet surface |
Column Name | Table Label | Units | Description |
---|---|---|---|
J | Layer Number | none | Atmospheric layer number |
P | Pressure | bar | Atmospheric pressure at this layer |
Alt | Altitude | km | Altitude at this layer |
T | Temperature | K | Temperature at this layer |
CONVEC | Is Layer Convective? | none | Is the atmosphere convective in this layer? (0 = no; 1 = yes) |
FH2O | H2O Fraction | none | Fraction of the atmosphere composed of H2O in this layer |
FO3 | O3 Fraction | none | Fraction of the atmosphere composed of O3 in this layer |
FDL team members
Aaron Bell (University of Tokyo), Aditya Chopra (Australian National University), William Fawcett (University of Cambridge), Rodd Tabeli (Georgia Institute of Technology).
FDL mentors
Daniel Angerhausen (Universität Bern), Anamaria Bera (University of Central Florida), Natalie Cabrol (SETI Institute), Chris Kempes (Santa Fe Institute), Massimo Mascaro (Applied AI, Google).
Last updated: 02 August 2019