EXOFAST Recipe: Run an Advanced Fit

Documentation

Using the Service:

Recipe: Run a Simple Fit
Recipe: Run an Advanced Fit
Acknowledgments

This example serves as a tutorial that will walk you through some of the aspects of applying the EXOFAST tool in a more "real-world" situation, fitting public data from HAT-P-3 b.

For the purposes of this exercise, we will only run χ² fits, rather than the full MCMC. The χ² fits run a lot faster, and are in any case run automatically as a precursor to during a full MCMC run. It therefore generally makes sense to explore your data using χ² fitting before proceeding to an MCMC run.

This particular fit is generally more difficult than most fits because the RV data are sparse, and so not very constraining. That makes this a useful demonstration of the types of problems you may encounter (and what you can do about it), but try not to be intimidated—it's usually much easier. (For a simpler “quick-look” example that focuses on the main aspects of the EXOFAST interface, try the Run a Simple Fit walk-through.)

Step 1: Gathering Input Data

Presumably, you have your own data files. If you're missing RVs of a known planet, you can often get them from the NASA Exoplanet Archive (for HAT-P-3b, both the light curve and RVs can be accessed from the HAT-P-3 b Confirmed Planet Overview Page. While you can do the fit with just transit data or just RV data, you can learn much more about the system with both, which is the real advantage of using EXOFAST.

The light curves and RV files in the archive are always in IPAC table format, which can be parsed by the EXOFAST web interface, along with a variety of other formats (see the file format requirements for transit and RV). However, the RV file must contain, at a minimum, three columns representing: BJD_TDB, RV (m s^-1), and error (m s^-1). The light curve must also have at least three columns: BJD_TDB, normalized flux (mean out-of-transit flux approximately equals 1), and error.

Optionally, any additional columns in the transit file can be used to represent systematic trends that will be used to detrend the light curve (e.g., airmass at each time).

We have converted two files from the archive for the purposes of this walk-through:

hat3.rv (1 KB)
hat3.flux (12 KB)

These files are in simple, plain-text columnar format. The time stamps are all converted to BJD-TDB for accuracy, and the magnitude units in the original transit light curve are converted to normalized flux.

Tip: Radial velocity units are all already m s^-1 in the archive.

Step 2: Upload Your Input Files

Load the EXOFAST web interface and make sure you are on the EXOFAST Inputs tab. If you are already there, click Start New Session.
In the Transit File Options pane, select SDSS i' as the band for the input transit file.
Click Choose File and then browse to and select your local copy of the hat3.flux file.
Click Upload.
In the Radial Velocity File Options pane, click Choose File and repeat the same steps to upload the hat3.rv file.

Parameters are already available in the archive for HAT-P-3 b, and we could pull them in directly in the same way as in the simple walkthrough for HD 17156 b. However, for the purposes of this tutorial, we will pretend we have independent estimates for a subset of parameters, and no handle yet on the period.

To save you the effort of manually entering prior parameters, download this priors parameter file.
In the Prior and Prior Widths Inputs pane, click Choose File button under Upload Prior Parameter File.
Click Upload. Notice the requisite prior values and widths are now loaded into the appropriate text boxes.
Where present, the prior values will be used as starting points for the fitting (otherwise EXOFAST will use Hot-Jupiter-like guesses, which are usually good enough). Where widths are available, those widths will be used as initial stepping scales for the χ² minimization algorithm. These values and widths will also serve as the Bayesian priors if we later run a full MCMC calculation as well.
Use the default setting for the rest of the options in this pane.
Click Run EXOFAST.

Step 3: Analyze First Set of Results

First, the program will find the period with a Lomb-Scargle periodogram of the RV data. Assuming a circular orbit, it finds the N min best periods. Normally, there will be one obvious period, and it'll just work.

This time, however, you'll notice there are four periods with low χ²:

Best peaks in the RV fit:
     T_C            Period        ecosw        esinw        K          gamma          slope      chi^2      chi^2/dof
2454194.513285     2.509754    -0.040753     0.013526   115.808745   -22.060623     0.000000    73.595057    24.531686
2454196.563903     3.190485    -0.214598    -0.016286   111.215690   -10.738981     0.000000     5.837212     1.945737
2454195.676539     2.900505    -0.129177     0.013844   104.960806   -16.348920     0.000000    10.519117     3.506372
2454195.934438     2.988335    -0.157248    -0.047552   106.561277   -14.253449     0.000000     4.929913     1.643304
2454195.150633     2.728094    -0.062298     0.056450   101.638085   -19.409819     0.000000     6.324167     2.108056

It will proceed with the best fit (lowest of the RV data, in this case a period of 2.99 days) but that doesn't mean it is correct. In fact, a good indication that it's NOT correct are the χ² of the combined fits:

Combined fit:
Chi^2 of Transit data = 467.15740 (368 data points)
Chi^2 of RV data = 7087.4714 (9 data points)
Chi^2 of Priors = 77.895843 (3 priors)
Chi^2/dof = 22.722426

Then, if you look at the best-fit parameters and models, you'll see it's not a good fit at all. The eccentricity is really high, the RV data don't fit at all, and the transit model is also a poor fit to the data. This is really an indication that the best fit transit and best fit RV were not consistent, and thus the 2.99-day period is likely the problem.

Step 4: Refine Minimum and Maximum Periods

We will force EXOFAST to pick the next most likely (3.19 days) instead:

Return to Period and Setting Inputs on the Inputs tab and insert 3.15 as the minimum period and 3.25 as the maximum period, thus allowing only the 3.19-day peak.
Click Run EXOFAST.

This works, but you should be leery for two reasons. First, you should investigate the other two periods, because they looked equally likely. Second, you'll see from the RV plot the orbit really only has data at three or four distinct phases, and so it doesn't really have the power to constrain the eccentricity (i.e., it's probably just fitting the noise).

We should investigate options for a circular orbit, noting the χ² values for the combined fit. Note that these are after the errors have been scaled to force Ρ(χ²) = 0.5 for the individual fits.

When we run through above procedure for each of the four best periods, we find results like:

Chi^2/dof = 23.464731 (P = 2.99 days)
Chi^2/dof = 1.0283665 (P = 3.19 days)
Chi^2/dof = 1.0770624 (P = 2.66 days)
Chi^2/dof = 1.0096547 (P = 2.90 days)

Step 5: Refine the Period

So it looks like the 2.90 day period is slightly favored, but there's really no telling between 2.66, 2.90, and 3.19 days (note that the RV-only preferred period of 2.73 days was changed to 2.65 days in the combined fit). Since we really don't have the power to constrain the eccentricity, we'll fix it to zero.

Return to the EXOFAST Inputs tab.
Check the Force Circular Orbit box.
Click Run EXOFAST.

Best peaks in the RV fit:
     T_C            Period        ecosw        esinw        K          gamma          slope      chi^2      chi^2/dof
      1.107424     2.509629     0.000000     0.000000   111.702373   -19.695690     0.000000    79.680061    15.936012
      0.032483     3.191384     0.000000     0.000000    88.122421    -8.571517     0.000000    60.684224    12.136845
      1.241369     2.901193     0.000000     0.000000    90.201515   -14.637107     0.000000    31.508634     6.301727
      0.604251     2.985977     0.000000     0.000000    89.044937   -12.943334     0.000000    34.917544     6.983509
      1.105912     2.727992     0.000000     0.000000    95.094301   -17.178435     0.000000    27.056734     5.411347
Chi^2/dof = 5.4113468
Scaling errors by 2.4935620

We see the same periods, though the 3.19-day period is much less appealing. Let's look at all of them anyway by constraining the period ranges as before:

Chi^2/dof = 1.0498914 (P = 2.65 days)
Chi^2/dof = 0.9978962 (P = 2.90 days the combined fit finds approximately this minimum for both 2.99 and 2.90 days)
Chi^2/dof = 1.0074201 (P = 3.19 days)

The 2.5-day period one failed altogether, with a very poor fit to the RV data. An error is displayed in the message bar of the Inputs tab, pointing us to the Results panel. Looking there at the Run Log, we see this error reported:

EXOFAST_GETMCMCSCALE: Parameter 6 is unconstrained. Check your starting conditions

This essentially means the transit time predicted from the RVs were so far off that all the transit models were flat lines. We can safely ignore this period.

So still, there's a slight preference for the 2.90-day period, but two others (2.65 and 3.19 days) are reasonable. We really need more information to be sure. Fortunately (as will almost always be the case), we know the period precisely from the transit survey data (2.89970 ± 0.000054 days), so we know which to use. We can also use that value as an explicit prior by entering it in the prior and width boxes (note, however, that the period prior and the limits on the Lomb-Scargle periodogram are independent. You need to make sure you have the right period for both.)

Step 6: Run Without Previously Published Parameters

So what happens if your planet has no previously published parameters in the literature to use as a starting point (like a new planet)? In this case, we'll pretend we have no prior information.

Return to Period and Setting Inputs on the Inputs tab and reset the period ranges: 1 for minimum and 91.850 for maximum. (Or, simply reload the radial velocity data file.)
Uncheck the Force Circular Orbit box.

In order to use the information from both RV and transit data sets, we'll need our own priors on Teff and [Fe/H] (logg helps, but is adequately constrained by the transit + Torres relations alone). These can all be estimated from a precise spectrum. If we don't have that, Teff can be estimated from multi-color photometry and we can assume the metallicity is solar.

In this example, we'll manually enter the values we used before and pretend we measured them from somewhere.

Insert the following values into the appropriate prior and prior width boxes:
- Teff = 5190 +/- 80
- [Fe/H] = 0.27 +/- 0.08
If you don't have these priors, enter roughly solar values with very large error bars (Teff = 6000 ± 2000, [Fe/H] = 0 ± 1), and then do not believe the stellar parameters or anything derived from them. However, be warned that since a transit alone can determine the stellar density, the stellar parameters can make a difference to the actual light curve model. If you get a high contribution to the χ² from the priors, you may want to revisit them.
Click Run EXOFAST.

We see a similar error to before:

Best peaks in the RV fit:
      T_C             Period         ecosw         esinw         K           gamma           slope       chi^2       chi^2/dof
2454194.513285      2.509754     -0.040753      0.013526    115.808745    -22.060623      0.000000     73.595057     24.531686
2454196.563903      3.190485     -0.214598     -0.016286    111.215690    -10.738981      0.000000      5.837212      1.945737
2454195.676539      2.900505     -0.129177      0.013844    104.960806    -16.348920      0.000000     10.519117      3.506372
2454195.934438      2.988335     -0.157248     -0.047552    106.561277    -14.253449      0.000000      4.929913      1.643304
2454195.150633      2.728094     -0.062298      0.056450    101.638085    -19.409819      0.000000      6.324167      2.108056
Chi^2/dof = 1.6433043
Scaling errors by 1.4434929

Transit fit:
Chi^2/dof = 63.833157
Scaling errors by 7.9970128
RMS of residuals = 0.0066387347
% EXOFAST_GETMCMCSCALE: EXOFAST_GETMCMCSCALE: Parameter 8 is unconstrained.
                         Check your starting conditions

Or you may also see another similar error:

ERROR: Transit fit not constrained. Check your starting conditions.

These errors basically mean the same thing as the previous one. Because we didn't use our previously-known starting values to start the transit fit, it only relied on the RV to predict the transit time. Since the period it picked was wrong, the predicted transit time was way off, and all transit models it tried were indistinguishable, flat lines.

Now we can play the same game, narrowing the window of periods allowed by the Lomb-Scargle periodogram, trying circular and eccentric fits, and we'll end up with roughly the same results as when we used the previously published starting values.

Step 7: Run with Radial Velocity Data Only

Since not every planet transits, often times, all you'll have is RV data. In that case, skip loading the transit file in the Transit File Options pane of the EXOFAST Inputs tab. While it's not necessary to fit, you are required to plug in logg, Teff, and [Fe/H]. It'll use the Torres relation to derive a stellar mass and radius, which is used, along with the RV parameters, to derive a bunch of interesting things about the system, including transit probabilities, durations, and times.

For the transit durations, it assumes a point planet (R_p=0) and a central crossing (b=0). Just like above, you'll have to explore different periods that seem nearly equally likely (and with/without eccentricity), but you won't have any additional data (e.g., the transit time) as a sanity check. If you were really trying to fit a data set this messy with the RV alone, you'd just need more data.

Step 8: Run with Transit Data Only

This is the least robust because the transit model depends on the eccentricity, argument of periastron, and period, which a single transit alone can't constrain very well. Fortunately, this shouldn't be necessary for the vast majority of planets. If at all possible, try to find a source of RVs from the literature (check the archive via the RV File Host Lookup option!). If they don't exist (such as for faint Kepler data), proceed.

In principle, the period can be constrained if multiple transits are given, but deriving it from first principles requires a Box Least Squares algorithm, which we don't do. If your data span less than one period, you may be able to pull a period estimate from the archive as a starting point. Otherwise, you'll need to give it some other starting value for the period, even if it's determined from the same photometry you're fitting. If you have multiple transits, a prior width on the period is not necessary, otherwise, the results may not be believable. Failing to include a prior width may cause the period to wander dramatically, and many things scale with the period, so none of your fits will be believable. A warning is issued if this is attempted.

Forcing the orbit to be circular (by checking the Force Circular Orbit box in Period and Setting Inputs) is highly recommended for most transit-only fits. If the orbit is eccentric, you probably know that from RVs, so you should include them instead of a prior. Nevertheless, if you'd prefer, you can include a prior on eccentricity and the argument of periastron if the orbit is eccentric.

Like the combined fit, we require you to provide an estimate of Teff and [Fe/H] so we can derive the stellar properties and therefore the planetary properties.

«Previous Recipe: Run a Simple Fit Acknowledgments Next »

Last update: 20 July 2017