DESCRIPTION OF THE KEPLER DATA VALIDATION ONE-PAGE SUMMARY REPORTS

Quarters 1-12

Version: 1.0
Delivered by the Kepler Project on Dec 7, 2012


CONTENT

  1. INTRODUCTION
    A. FULL TIME-SERIES FLUX PLOT
    B. PHASED FULL-ORBIT FLUX PLOT
    C. PHASED TRANSIT-ONLY FLUX PLOT
    D. WHITENED PHASED TRANSIT-ONLY PLOT
    E. ODD-EVEN TRANSIT PLOT
    F. CENTROID OFFSET PLOT
    G. DV ANALYSIS TABLE

Figure 1: Example of a one-page summary that corresponds to Q1-12 TCE 1 of 6 in KIC 006541920. The event is also known as Kepler-11e and KOI 157.03. The large, red letters identify each part of the one-page summary described below.

1. INTRODUCTION

Within the Kepler pipeline, stars that have been identified with at least one Threshold Crossing Event (TCE) — which is a series of at least three transit-like signals with a consistent period — are put through a process called Data Validation (DV). In DV, diagnostic parameters are computed and plotted for each TCE to help determine if it is an instrumental artifact from the spacecraft, a blended binary or other astrophysical false positive, or a true planetary candidate. Although very comprehensive multi-page reports are generated and archived for each Kepler star with at least one TCE, these simple one-page summaries provide much of the critical information for a quick assessment of candidacy. This document describes these one-page summaries, with an example shown in Figure 1. Large, red letters have been added to the figure for guidance throughout the rest of this document.

At the very top of each report is a single line of text that contains the Kepler Input Catalog (KIC) number, the planet candidate number, and its orbital period. Immediately below this is another line of text that contains the Kepler magnitude (Kp) of the star and its radius (R*) in solar radii, effective temperature (Teff), surface gravity (log g), and metallicity ([Fe/H]). The remainder of the one-page summary is divided into sections designated by letters A-G. Each is explained in the following sections of this document, along with an explanation of how each plot and parameter can be used to help disposition a TCE. The software revision URL that appears on the bottom of the page identifies the version of the pipeline code used for the DV run. The date of summary generation is also provided.

A. FULL TIME-SERIES FLUX PLOT

Plot A shows the full flux time-series for the TCE with relative flux on the y-axis and time in Barycentric Kepler Julian Date (BKJD) on the x-axis (BJD = BKJD + 2,454,833.0). The light curve has been detrended beyond that accomplished by Presearch Data Conditioning (PDC) using a running median filter to remove any long-period systematics. The start of each new quarter is marked with a vertical dashed red line and labeled with the quarter number (e.g., Q2 for Quarter 2). Along the bottom of the plot are triangles that mark the expected position of the transits for this particular TCE, corresponding to the best period and epoch identified.

This plot helps identify any potential inter-quarter systematics that may have triggered the TCE. Gaps in the data along the quarter boundaries, and at monthly intervals within each quarter, are expected because the spacecraft is re-oriented to download data. If the TCE is a planet candidate, then a transit should be visible at every triangle where data exist. TCEs whose transits occur primarily near quarter boundaries are more suspect because the strongest systematics are at the start of quarters. Additional transits may be visible, especially if DV has identified more than one TCE for the system. (The total number of TCEs found for the KIC target is shown at the very top of the DV report.)
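
The time conversion and the triangle positions described above can be sketched in a few lines of Python. This is an illustration only, not pipeline code: the function names and the epoch, period, and time range in the usage note are hypothetical, and only the BKJD offset is taken from this document.

```python
import math

# Offset stated in this document: BJD = BKJD + 2,454,833.0
BKJD_OFFSET = 2454833.0


def bkjd_to_bjd(bkjd):
    """Convert Barycentric Kepler Julian Date to Barycentric Julian Date."""
    return bkjd + BKJD_OFFSET


def expected_transit_times(epoch_bkjd, period_days, t_start, t_end):
    """Predicted transit mid-times (BKJD) that fall within [t_start, t_end].

    Assumes a strictly periodic signal: t_n = epoch + n * period.
    """
    # Smallest integer n whose transit time is not before t_start.
    n = math.ceil((t_start - epoch_bkjd) / period_days)
    times = []
    while epoch_bkjd + n * period_days <= t_end:
        times.append(epoch_bkjd + n * period_days)
        n += 1
    return times
```

For example, with a hypothetical epoch of 100.0 BKJD and a period of 10.0 days, expected_transit_times(100.0, 10.0, 100.0, 135.0) lists the four predicted mid-times at which a triangle would be drawn. Any predicted time that falls in a data gap will have a triangle but no visible transit.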

B. PHASED FULL-ORBIT FLUX PLOT

Plot B shows the phase-folded light curve for the TCE, folded according to its best-fit period so that the phase in days is plotted on the x-axis. The epoch of the primary transit, as well as the expected epoch of the secondary eclipse (assuming zero eccentricity), are indicated by the two triangles along the bottom of the plot, at 0.0 and half the orbital period, respectively. Blue dots are phase-binned averages of the data. A transit model fit is performed on the whitened data (see Section D), and the resulting (de-whitened) model is shown on this plot via the solid red line.

This plot helps assess whether the phased data can be adequately explained by a physical transit model. If the TCE is a viable planet candidate, the transit model should accurately fit the phased transit, although discrepancies may occur since the transit model is actually fit to the whitened data in Plot D. The out-of-transit baseline should generally be flat. A secondary eclipse typically should not be visible, except in cases of hot-Jupiter planets with very short orbital periods. If an eclipse is visible, this suggests the TCE may be an eclipsing binary false positive. It is not unusual to observe additional transits scattered about in this light curve, especially if DV has identified more than one TCE for the system. Generally, these transits should not be in-phase at the period of the current TCE under examination.
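
As a rough illustration of the phase folding and phase binning behind Plot B, the following Python sketch folds a list of times on a given period and averages the flux in fixed-width phase bins. The function names, the choice of a phase range running from -P/4 to 3P/4, and the bin width are assumptions made for this example; the document does not specify how DV implements these steps.

```python
import math


def fold_phase_days(times, epoch, period):
    """Phase in days relative to the transit epoch.

    The primary transit lands at 0.0 and a circular-orbit secondary
    eclipse at period / 2. The range [-period/4, 3*period/4) is an
    assumption chosen so both events sit away from the plot edges.
    """
    quarter = period / 4.0
    return [((t - epoch + quarter) % period) - quarter for t in times]


def bin_phased(phases, fluxes, bin_width):
    """Average the flux in fixed-width phase bins.

    Returns a list of (bin_center, mean_flux) pairs sorted by phase.
    Empty bins are simply omitted.
    """
    sums = {}
    for p, f in zip(phases, fluxes):
        k = math.floor(p / bin_width)
        total, count = sums.get(k, (0.0, 0))
        sums[k] = (total + f, count + 1)
    return [((k + 0.5) * bin_width, total / count)
            for k, (total, count) in sorted(sums.items())]
```

Transits belonging to a different TCE in the same system fold to different phases on each orbit, which is why they appear scattered across this plot rather than stacked at a single phase.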

C. PHASED TRANSIT-ONLY FLUX PLOT

Plot C shows the phase-folded light curve for the TCE, with the range on the x-axis reduced so that only the primary transit is visible. The x-axis unit is hours, and the blue dots are phase-binned averages of the original data. As explained in Section B, a transit model fit is performed on the whitened data (see Section D), and the resulting (de-whitened) model is shown on this plot via the solid red line.

This plot allows a detailed assessment of the primary transit and theoretical model fit. If the TCE is a viable planet candidate, the transit model should accurately fit the phased transit, although discrepancies may occur since the transit model is actually fit to the whitened data in Plot D. The primary transit should also be fairly symmetric around Phase 0.0. Asymmetry in the light curve is an indication that the TCE could be a result of spacecraft systematics or other astrophysical phenomena.

D. WHITENED PHASED TRANSIT-ONLY PLOT

Plot D shows the phase-folded, binned light curve for the TCE with a whitening filter applied to remove any further long-period systematics. The y-axis shows the Whitened Flux Values (Tenenbaum et al. 2010). A best-fit transit model is shown via a solid red line, which has also been passed through the whitening filter. Residuals of the best fit to the binned data are shown by green dots (offset in flux for clarity), while the magenta dots are data centered around phase 0.5, where a secondary eclipse would fall for a circular orbit (also offset in flux for clarity). For non-circular orbits the secondary eclipse may occur elsewhere. Above the plot are values for the Multiple Event Statistic (MES), the total number of individual transits that have been fit, the Signal-to-Noise Ratio (SNR) of the iterative whitened transit model fit, the reduced Chi-Squared value (χ²/DoF), where DoF is the number of Degrees of Freedom in the fit, and the transit depth in parts per million (ppm), with the error on the transit depth shown in brackets.

This plot compares the primary transit and the model fit, to determine how any systematics in the data are affected by the whitening filter. It is not unusual to see an increase in flux in both the binned data and the transit model, immediately before and after the transit, due to the whitening filter. The transit model for a good planet candidate should fit the binned data, with no obvious trends observed in the residuals that would indicate an asymmetric transit. A good fit should have a reduced Chi-Squared near 1.0. Although the signal-to-noise should be somewhat similar to the MES, it will generally be higher than the MES due to fitting a fully detailed transit model. High MES and SNR values indicate a more significant detection of a transit-like signature.
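
For readers unfamiliar with the reduced Chi-Squared statistic quoted above Plot D, the following Python sketch shows the standard definition. The function and parameter names are hypothetical, and taking DoF as the number of data points minus the number of free model parameters is the usual convention rather than something this document states about DV.

```python
def reduced_chi_squared(data, model, sigma, n_free_params):
    """Standard reduced chi-squared: sum(((d - m) / s)^2) / DoF.

    DoF is taken as (number of points - number of fitted parameters),
    which is the usual convention and an assumption about DV here.
    """
    dof = len(data) - n_free_params
    if dof <= 0:
        raise ValueError("need more data points than free parameters")
    chi2 = sum(((d - m) / s) ** 2 for d, m, s in zip(data, model, sigma))
    return chi2 / dof
```

A value near 1.0 means the residuals are about as large as the quoted uncertainties. Values well above 1.0 suggest the model does not describe the data or the uncertainties are underestimated, and values well below 1.0 suggest overestimated uncertainties.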

E. ODD-EVEN TRANSIT PLOT

Plot E shows the phase-folded light curve (black dots) separately for the odd and even transits. Binned data are indicated by blue dots. On the left side, only the odd (i.e., the first, third, fifth, etc.) transit signatures are phase-folded and shown, while on the right side only the even (i.e., the second, fourth, sixth, etc.) transit signatures are shown. A transit model has been independently fit to the odd and even sets (in the whitened domain) to determine the transit depth of each set. The red line indicates the transit depth of all the data fitted together, with the red boxes indicating the uncertainty in that measurement. At the top of the plot the significance of the difference in depth for the odd and even numbered transits is shown, both in terms of a percentile and sigma.

This plot exposes any alternating difference in transit depth. If the TCE under investigation is a valid planetary candidate, there should be no statistical difference between the depths of the odd and even numbered transits. If the TCE is instead an eclipsing binary whose primary and secondary eclipses appear as the alternating odd and even events, then unless the two eclipse depths are exactly equal, a difference in the odd and even transit depths should be seen. A statistically significant difference indicates that the object is likely to be an astrophysical false positive. Note, however, that a background eclipsing binary could have nearly equal eclipse depths, or a very small secondary eclipse depth. Furthermore, if the candidate is a background eclipsing binary or similar astrophysical false positive, then it is possible that the TCE is folding the data at exactly half the period of that binary. Therefore a lack of significant transit depth variation does not, by itself, confirm the planetary nature of a TCE.
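
One simple way to quantify an odd-even depth difference is to divide the difference by the combined uncertainty of the two independent depth measurements. The Python sketch below does this. It is an assumed, textbook-style statistic with hypothetical names, and the document does not state that DV computes its odd-even significance in exactly this way.

```python
import math


def odd_even_depth_sigma(depth_odd, err_odd, depth_even, err_even):
    """Depth difference in units of its combined 1-sigma uncertainty.

    Assumes the odd and even depth measurements are independent with
    Gaussian errors, so their uncertainties add in quadrature.
    Depths and errors must share the same unit (e.g. ppm).
    """
    combined_err = math.hypot(err_odd, err_even)
    return abs(depth_odd - depth_even) / combined_err
```

For instance, hypothetical odd and even depths of 1100 ppm and 1000 ppm with errors of 30 ppm and 40 ppm differ by 100 ppm against a combined error of 50 ppm, a 2σ difference that would not usually be treated as significant on its own.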

F. CENTROID OFFSET PLOT

Plot F shows the PRF centroid offset with the RA Offset in arcseconds on the x-axis, and the Dec Offset in arcseconds on the y-axis. For each quarter, two separate pixel-level images of the source are computed, one using the average of only the in-transit data, and the other using the average of data just outside of transit. The difference between the in-transit and out-of-transit images gives a difference image, in which a star-like image appears at the location of the transit signal.

The Kepler Pixel Response Function (PRF) is the Kepler point spread function combined with expected spacecraft pointing jitter and other systematic effects (Bryson et al. 2010). The PRF is fit separately to the difference and out-of-transit images to compute centroid positions. The fit to the difference image gives the location of the transit source, and the fit to the out-of-transit image gives the location of the target star (assuming there are no other bright stars in the aperture). Subtracting the target star location from the transit source location gives the offset of the transit source from the target star. This is performed on a per-quarter basis, and the quarterly offsets are shown as green cross-hairs and labeled with the quarter number, where the length of the arms of each cross-hair represents the 1σ error in RA and Dec.

Asterisks in the image show the location of known stars in the aperture, with the red asterisk being the target star. The coordinates of these stars are chosen so that the target star is at (0,0). A robust fit (i.e., an error-weighted fit that iteratively removes extreme outliers) is performed using all the quarterly centroid offsets to compute an average in-transit offset position, and is shown with 1σ error bars as a magenta cross. A dark blue circle is shown, always centered on the magenta cross, that represents the 3σ limit on the magnitude of the robustly-fit, quarter-averaged offset of the transit source from the target star. The numerical value of the quarter-averaged offset of the transit source from the target star is given by OotOffset-rm in the DV analysis table (G).

This plot graphically indicates whether there is a significant centroid offset between the transit source and target star location during transits, and if an associated KIC star is likely to be the true source of the TCE. In general, a significant (i.e., >3σ) centroid offset is seen if the red asterisk lies outside the dark blue circle. In this case it is likely that the observed transit is not due to a transit on the target star. However, there are several ways in which this diagnostic can be misleading: 1) If the offset (distance of the center of the magenta cross-hair from the target star) is less than 0.1 arcsec, then the offset is likely due to systematic measurement error and the transit is likely to be on the target star regardless of the offset value in sigma. 2) If there are other stars in the aperture with brightness equal to or greater than the target star, then the offset computation can be very inaccurate. This situation can be detected by comparing OotOffset-rm with KicOffset-rm in the DV analysis table (G). When they differ by more than 2 arcsec and there are bright stars in the aperture, then the plot is likely invalid. In this case OotOffset-rm may be invalid and KicOffset-rm may be used to estimate the offset of the transit source from the target. Finally, these diagnostics are valid only if the TCE is due to a transit or eclipse on a star in the aperture. If the TCE results from a systematic error, such as a spacecraft pointing tweak, pixel sensitivity dropout, or other similar effect, then this method of measuring centroids is invalid.
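
The decision logic described above can be sketched in Python as follows. This is a simplified illustration with hypothetical function names: it averages the quarterly offsets with a plain inverse-variance weighted mean and omits the iterative outlier rejection of the robust fit used by DV. Only the 0.1 arcsec floor and the 3σ threshold are taken from this document.

```python
import math


def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean and its 1-sigma uncertainty."""
    weights = [1.0 / (s * s) for s in sigmas]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, math.sqrt(1.0 / total)


def assess_centroid_offset(ra, ra_err, dec, dec_err,
                           floor_arcsec=0.1, n_sigma=3.0):
    """Return (offset_arcsec, significance_sigma, verdict).

    ra, dec: quarterly offsets of the transit source from the target
    star in arcsec; ra_err, dec_err: their 1-sigma errors.
    NOTE: no outlier rejection is done here, unlike the robust fit in DV.
    """
    mean_ra, sig_ra = weighted_mean(ra, ra_err)
    mean_dec, sig_dec = weighted_mean(dec, dec_err)
    offset = math.hypot(mean_ra, mean_dec)
    if offset == 0.0:
        return 0.0, 0.0, "no significant offset"
    # First-order error propagation for r = sqrt(x^2 + y^2).
    offset_err = math.hypot(mean_ra * sig_ra, mean_dec * sig_dec) / offset
    significance = offset / offset_err
    if offset < floor_arcsec:
        # Caveat 1 above: tiny offsets are attributed to systematics
        # regardless of their formal significance.
        verdict = "below systematic floor"
    elif significance > n_sigma:
        verdict = "significant offset"
    else:
        verdict = "no significant offset"
    return offset, significance, verdict
```

The floor is checked before the significance so that a formally significant but sub-0.1 arcsec offset is not flagged. The sketch does not cover the second caveat (bright neighbors in the aperture), which requires comparing OotOffset-rm with KicOffset-rm.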

G. DV ANALYSIS TABLE

Section G shows a table of fit parameters, derived parameters, and vetting statistics generated by the DV analysis. The left column contains best-fit parameters from a fit of a Mandel-Agol (2002) transit model to the whitened data, assuming the TCE is a transiting planet. The orbital period of the planetary candidate is shown in days, and the epoch (i.e., the central time of the first transit) is shown in BKJD. The parameter Rp/R* is the ratio of the planetary radius to the stellar radius, a/R* is the ratio of the planet-star separation at time of transit to the stellar radius, and b is the impact parameter, where b = 0 represents a central transit and b = 1 represents the most extreme grazing transit. These last three parameters are unitless. The errors for these parameters are shown in brackets. At the bottom of the left column, Teq is the calculated equilibrium temperature of the planet's surface, Rp is the calculated planetary radius in units of Earth radii, and a is the calculated semi-major axis of the system in au.
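
To show how the derived quantities relate to the fitted ratios, the Python sketch below converts Rp/R* and a/R* into physical units and estimates an equilibrium temperature. It is a sketch under stated assumptions, not the DV calculation: the function names, the physical constants, the default albedo of 0.3, the assumption of full heat redistribution, and the use of a/R* as a stand-in for the semi-major axis (strictly valid only for a circular orbit) are all choices made for this example, so values in the table may differ.

```python
import math

# Assumed constants (not taken from this document).
R_SUN_KM = 695700.0       # nominal solar radius
R_EARTH_KM = 6378.1       # nominal Earth equatorial radius
AU_KM = 149597870.7       # astronomical unit


def planet_radius_earth(rp_over_rstar, rstar_solar):
    """Planet radius in Earth radii from Rp/R* and R* in solar radii."""
    return rp_over_rstar * rstar_solar * R_SUN_KM / R_EARTH_KM


def semi_major_axis_au(a_over_rstar, rstar_solar):
    """Orbital distance in au from a/R*, assuming a circular orbit."""
    return a_over_rstar * rstar_solar * R_SUN_KM / AU_KM


def equilibrium_temperature(teff_k, a_over_rstar, albedo=0.3):
    """Teq = Teff * sqrt(R* / (2a)) * (1 - A)^(1/4).

    Assumes a blackbody planet with uniform heat redistribution;
    the albedo default is an assumption for this example.
    """
    return teff_k * math.sqrt(1.0 / (2.0 * a_over_rstar)) * (1.0 - albedo) ** 0.25
```

Because Rp and a scale linearly with R*, any error in the stellar radius listed at the top of the summary carries directly into these derived values.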

On the right-hand side, various diagnostic parameters are shown. Epoch-sig is a metric for how well the epochs computed separately for the odd-only and even-only transits agree with each other. A value of 100% (0.0σ) indicates a perfect match, while lower percentages (higher σs) indicate more significant odd-even epoch differences. A significant value of Epoch-sig suggests that the TCE is an eclipsing binary with a slightly eccentric orbit (so that the secondary eclipse is slightly offset from phase 0.5), with the TCE period half of the binary's true orbital period.

The values of ShortPeriod-sig and LongPeriod-sig are metrics that indicate whether the current TCE has a similar period to any other TCE found in the same system. Specifically, ShortPeriod-sig compares the period of the current TCE to the next shortest period TCE in the system, and LongPeriod-sig compares the period of the current TCE to the next longest period TCE in the system. A value of 100% (0.0σ) indicates no match at all between the two periods, with lower percentages (higher sigmas) indicating increasingly more significant agreements between the TCE periods. A significant value of ShortPeriod-sig or LongPeriod-sig may indicate that the system contains an eclipsing binary whose primary and secondary eclipse events have been detected as two different TCEs, thus having very similar periods but different epochs. If ShortPeriod-sig or LongPeriod-sig has a value of "NA", it means that there are no additional TCEs detected in the system with a shorter or longer period, respectively, than the current TCE under examination.
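
The percentage and σ values quoted for these metrics move in opposite directions: 100% pairs with 0.0σ, and smaller percentages pair with larger σ. One mapping with this behavior is the two-sided Gaussian tail probability, sketched below in Python. This mapping and the function name are assumptions made for illustration; the document does not state which conversion DV uses.

```python
import math


def sigma_to_percent(sigma):
    """Two-sided Gaussian tail probability, expressed as a percentage.

    Gives 100% at 0 sigma and falls toward 0% as sigma grows.
    ASSUMPTION: DV's actual sigma-to-percent conversion is not
    specified in this document and may differ.
    """
    return 100.0 * math.erfc(abs(sigma) / math.sqrt(2.0))
```

Under this assumed mapping, a 3σ result corresponds to roughly 0.27%, so percentages well below 1% mark the statistically significant cases discussed above.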

Centroid-sig is a measure of whether there is a statistically significant centroid shift correlated with the transit as measured by flux-weighted centroids. OotOffset-rm is the measured angular distance between the quarterly-averaged out-of-transit source location and the quarterly-averaged location of the transiting source, both determined via PRF fitting. KicOffset-rm is the difference between the quarterly-averaged transit location and the target star position listed in the KIC. The significance of each measured offset is provided in square brackets. Pipeline algorithms for populating Bootstrap-pfa, OotOffset-bf, and KicOffset-bf are under development, so these parameters should currently be ignored.