Documentation
Using the Service:Recipe: Run a Simple Fit
Recipe: Run an Advanced Fit
Acknowledgments
This example serves as a tutorial that will walk you through some of the aspects of applying the EXOFAST tool in a more "real-world" situation, fitting public data from HAT-P-3 b.
For the purposes of this exercise, we will only run χ2 fits, rather than the full MCMC. The χ2 fits run a lot faster, and are in any case run automatically as a precursor to during a full MCMC run. It therefore generally makes sense to explore your data using χ2 fitting before proceeding to an MCMC run.
This particular fit is generally more difficult than most fits because the RV data are sparse, and so not very constraining. That makes this a useful demonstration of the types of problems you may encounter (and what you can do about it), but try not to be intimidated—it's usually much easier. (For a simpler “quick-look” example that focuses on the main aspects of the EXOFAST interface, try the Run a Simple Fit walk-through.)
Presumably, you have your own data files. If you're missing RVs of a known planet, you can often get them from the NASA Exoplanet Archive (for HAT-P-3b, both the light curve and RVs can be accessed from the HAT-P-3 b Confirmed Planet Overview Page. While you can do the fit with just transit data or just RV data, you can learn much more about the system with both, which is the real advantage of using EXOFAST.
The light curves and RV files in the archive are always in IPAC table format, which can be parsed by the EXOFAST web interface, along with a variety of other formats (see the file format requirements for transit and RV). However, the RV file must contain, at a minimum, three columns representing: BJD_TDB, RV (m s-1), and error (m s-1). The light curve must also have at least three columns: BJD_TDB, normalized flux (mean out-of-transit flux approximately equals 1), and error.
Optionally, any additional columns in the transit file can be used to represent systematic trends that will be used to detrend the light curve (e.g., airmass at each time).
We have converted two files from the archive for the purposes of this walk-through:
These files are in simple, plain-text columnar format. The time stamps are all converted to BJD-TDB for accuracy, and the magnitude units in the original transit light curve are converted to normalized flux.
Parameters are already available in the archive for HAT-P-3 b, and we could pull them in directly in the same way as in the simple walkthrough for HD 17156 b. However, for the purposes of this tutorial, we will pretend we have independent estimates for a subset of parameters, and no handle yet on the period.
Where present, the prior values will be used as starting points for the fitting (otherwise EXOFAST will use Hot-Jupiter-like guesses, which are usually good enough). Where widths are available, those widths will be used as initial stepping scales for the χ2 minimization algorithm. These values and widths will also serve as the Bayesian priors if we later run a full MCMC calculation as well.
First, the program will find the period with a Lomb-Scargle periodogram of the RV data. Assuming a circular orbit, it finds the N min best periods. Normally, there will be one obvious period, and it'll just work.
This time, however, you'll notice there are four periods with low χ2:
Best peaks in the RV fit: T_C Period ecosw esinw K gamma slope chi^2 chi^2/dof 2454194.513285 2.509754 -0.040753 0.013526 115.808745 -22.060623 0.000000 73.595057 24.531686 2454196.563903 3.190485 -0.214598 -0.016286 111.215690 -10.738981 0.000000 5.837212 1.945737 2454195.676539 2.900505 -0.129177 0.013844 104.960806 -16.348920 0.000000 10.519117 3.506372 2454195.934438 2.988335 -0.157248 -0.047552 106.561277 -14.253449 0.000000 4.929913 1.643304 2454195.150633 2.728094 -0.062298 0.056450 101.638085 -19.409819 0.000000 6.324167 2.108056
It will proceed with the best fit (lowest of the RV data, in this case a period of 2.99 days) but that doesn't mean it is correct. In fact, a good indication that it's NOT correct are the χ2 of the combined fits:
Combined fit: Chi^2 of Transit data = 467.15740 (368 data points) Chi^2 of RV data = 7087.4714 (9 data points) Chi^2 of Priors = 77.895843 (3 priors) Chi^2/dof = 22.722426
Then, if you look at the best-fit parameters and models, you'll see it's not a good fit at all. The eccentricity is really high, the RV data don't fit at all, and the transit model is also a poor fit to the data. This is really an indication that the best fit transit and best fit RV were not consistent, and thus the 2.99-day period is likely the problem.
We will force EXOFAST to pick the next most likely (3.19 days) instead:
This works, but you should be leery for two reasons. First, you should investigate the other two periods, because they looked equally likely. Second, you'll see from the RV plot the orbit really only has data at three or four distinct phases, and so it doesn't really have the power to constrain the eccentricity (i.e., it's probably just fitting the noise).
We should investigate options for a circular orbit, noting the χ2 values for the combined fit. Note that these are after the errors have been scaled to force Ρ(χ2) = 0.5 for the individual fits.
When we run through above procedure for each of the four best periods, we find results like:
Chi^2/dof = 23.464731 (P = 2.99 days) Chi^2/dof = 1.0283665 (P = 3.19 days) Chi^2/dof = 1.0770624 (P = 2.66 days) Chi^2/dof = 1.0096547 (P = 2.90 days)
So it looks like the 2.90 day period is slightly favored, but there's really no telling between 2.66, 2.90, and 3.19 days (note that the RV-only preferred period of 2.73 days was changed to 2.65 days in the combined fit). Since we really don't have the power to constrain the eccentricity, we'll fix it to zero.
Best peaks in the RV fit: T_C Period ecosw esinw K gamma slope chi^2 chi^2/dof 1.107424 2.509629 0.000000 0.000000 111.702373 -19.695690 0.000000 79.680061 15.936012 0.032483 3.191384 0.000000 0.000000 88.122421 -8.571517 0.000000 60.684224 12.136845 1.241369 2.901193 0.000000 0.000000 90.201515 -14.637107 0.000000 31.508634 6.301727 0.604251 2.985977 0.000000 0.000000 89.044937 -12.943334 0.000000 34.917544 6.983509 1.105912 2.727992 0.000000 0.000000 95.094301 -17.178435 0.000000 27.056734 5.411347 Chi^2/dof = 5.4113468 Scaling errors by 2.4935620
We see the same periods, though the 3.19-day period is much less appealing. Let's look at all of them anyway by constraining the period ranges as before:
Chi^2/dof = 1.0498914 (P = 2.65 days) Chi^2/dof = 0.9978962 (P = 2.90 days the combined fit finds approximately this minimum for both 2.99 and 2.90 days) Chi^2/dof = 1.0074201 (P = 3.19 days)
The 2.5-day period one failed altogether, with a very poor fit to the RV data. An error is displayed in the message bar of the Inputs tab, pointing us to the Results panel. Looking there at the Run Log, we see this error reported:
EXOFAST_GETMCMCSCALE: Parameter 6 is unconstrained. Check your starting conditions
This essentially means the transit time predicted from the RVs were so far off that all the transit models were flat lines. We can safely ignore this period.
So still, there's a slight preference for the 2.90-day period, but two others (2.65 and 3.19 days) are reasonable. We really need more information to be sure. Fortunately (as will almost always be the case), we know the period precisely from the transit survey data (2.89970 ± 0.000054 days), so we know which to use. We can also use that value as an explicit prior by entering it in the prior and width boxes (note, however, that the period prior and the limits on the Lomb-Scargle periodogram are independent. You need to make sure you have the right period for both.)
So what happens if your planet has no previously published parameters in the literature to use as a starting point (like a new planet)? In this case, we'll pretend we have no prior information.
In order to use the information from both RV and transit data sets, we'll need our own priors on Teff and [Fe/H] (logg helps, but is adequately constrained by the transit + Torres relations alone). These can all be estimated from a precise spectrum. If we don't have that, Teff can be estimated from multi-color photometry and we can assume the metallicity is solar.
In this example, we'll manually enter the values we used before and pretend we measured them from somewhere.
If you don't have these priors, enter roughly solar values with very large error bars (Teff = 6000 ± 2000, [Fe/H] = 0 ± 1), and then do not believe the stellar parameters or anything derived from them. However, be warned that since a transit alone can determine the stellar density, the stellar parameters can make a difference to the actual light curve model. If you get a high contribution to the χ2 from the priors, you may want to revisit them.
We see a similar error to before:
Best peaks in the RV fit: T_C Period ecosw esinw K gamma slope chi^2 chi^2/dof 2454194.513285 2.509754 -0.040753 0.013526 115.808745 -22.060623 0.000000 73.595057 24.531686 2454196.563903 3.190485 -0.214598 -0.016286 111.215690 -10.738981 0.000000 5.837212 1.945737 2454195.676539 2.900505 -0.129177 0.013844 104.960806 -16.348920 0.000000 10.519117 3.506372 2454195.934438 2.988335 -0.157248 -0.047552 106.561277 -14.253449 0.000000 4.929913 1.643304 2454195.150633 2.728094 -0.062298 0.056450 101.638085 -19.409819 0.000000 6.324167 2.108056 Chi^2/dof = 1.6433043 Scaling errors by 1.4434929 Transit fit: Chi^2/dof = 63.833157 Scaling errors by 7.9970128 RMS of residuals = 0.0066387347 % EXOFAST_GETMCMCSCALE: EXOFAST_GETMCMCSCALE: Parameter 8 is unconstrained. Check your starting conditions
Or you may also see another similar error:
ERROR: Transit fit not constrained. Check your starting conditions.
These errors basically mean the same thing as the previous one. Because we didn't use our previously-known starting values to start the transit fit, it only relied on the RV to predict the transit time. Since the period it picked was wrong, the predicted transit time was way off, and all transit models it tried were indistinguishable, flat lines.
Now we can play the same game, narrowing the window of periods allowed by the Lomb-Scargle periodogram, trying circular and eccentric fits, and we'll end up with roughly the same results as when we used the previously published starting values.
Since not every planet transits, often times, all you'll have is RV data. In that case, skip loading the transit file in the Transit File Options pane of the EXOFAST Inputs tab. While it's not necessary to fit, you are required to plug in logg, Teff, and [Fe/H]. It'll use the Torres relation to derive a stellar mass and radius, which is used, along with the RV parameters, to derive a bunch of interesting things about the system, including transit probabilities, durations, and times.
For the transit durations, it assumes a point planet (Rp=0) and a central crossing (b=0). Just like above, you'll have to explore different periods that seem nearly equally likely (and with/without eccentricity), but you won't have any additional data (e.g., the transit time) as a sanity check. If you were really trying to fit a data set this messy with the RV alone, you'd just need more data.
This is the least robust because the transit model depends on the eccentricity, argument of periastron, and period, which a single transit alone can't constrain very well. Fortunately, this shouldn't be necessary for the vast majority of planets. If at all possible, try to find a source of RVs from the literature (check the archive via the RV File Host Lookup option!). If they don't exist (such as for faint Kepler data), proceed.
In principle, the period can be constrained if multiple transits are given, but deriving it from first principles requires a Box Least Squares algorithm, which we don't do. If your data span less than one period, you may be able to pull a period estimate from the archive as a starting point. Otherwise, you'll need to give it some other starting value for the period, even if it's determined from the same photometry you're fitting. If you have multiple transits, a prior width on the period is not necessary, otherwise, the results may not be believable. Failing to include a prior width may cause the period to wander dramatically, and many things scale with the period, so none of your fits will be believable. A warning is issued if this is attempted.
Forcing the orbit to be circular (by checking the Force Circular Orbit box in Period and Setting Inputs) is highly recommended for most transit-only fits. If the orbit is eccentric, you probably know that from RVs, so you should include them instead of a prior. Nevertheless, if you'd prefer, you can include a prior on eccentricity and the argument of periastron if the orbit is eccentric.
Like the combined fit, we require you to provide an estimate of Teff and [Fe/H] so we can derive the stellar properties and therefore the planetary properties.
«Previous Recipe: Run a Simple Fit Acknowledgments Next »
Last update: 20 July 2017