207.1. Timeseries values#

207_1_Timeseries_values

207.1. Timeseries values¶

For the Rubin Science Platform at data.lsst.cloud.
Data Release: Data Preview 1
Container Size: Large
LSST Science Pipelines version: r29.2.0
Last verified to run: 2025-12-30
Repository: github.com/lsst/tutorial-notebooks
DOI: 10.11578/rubin/dc.20250909.20

Learning objective: Understand how the values of the timeseries features in the DiaObject table are derived.

LSST data products: DiaObject, DiaSource

Packages: scipy

Credit: Originally developed by the Rubin Community Science team. Please consider acknowledging them if this notebook is used for the preparation of journal articles, software releases, or other notebooks.

Get Support: Everyone is encouraged to ask questions or raise issues in the Support Category of the Rubin Community Forum. Rubin staff will respond to all questions posted there.

1. Introduction¶

A set of timeseries features, also called variability characterization parameters, are computed for all rows of the DiaObject table using flux measurements from the DiaSource table.

This tutorial serves as a reference for interpreting the timeseries features of a given DiaObject. For each feature, this tutorial provides descriptions and equations, code to recalculate the parameter using the DiaSource flux measurements, and an illustration of what the feature indicates about the lightcurve.

A known pulsating variable star is used as the example DiaObject. It is important to note that although all timeseries features are calculated for all DiaObjects, they are not all appropriate for the interpretation of all DiaObjects. For example, a linear slope is calculated for all DiaObjects, but many types of transients and variables are not expected to exhibit a linear slope.

For future data releases, additional timeseries features are likely to be included, as discussed in Data Management Tech Note 118 Review of Timeseries Features (DMTN-118).

Related tutorials: The 200-level tutorials on the DiaObject and DiaSource tables, and the other tutorial in the 207 series on timeseries features.

1.1. Features summary¶

The timeseries features are calculated based on the fluxes in the DiaSource table, and this table only includes visits (observations) for which there was a SNR $>5$ detection in the difference image. For example, for periodic variables, visits for which the star had a brightness similar to its brightness in the template image - and thus very little flux in the difference image - do not contribute to the timeseries features' values.

The following parameters are calculated per filter (<f>), using the difference image photometry (psfFlux) column from the diaSources that are associated with a given DiaObject.

<f>_psfFluxNdata : The number of associated diaSources.
<f>_psfFluxErrMean : Mean of the diaSource PSF flux errors.
<f>_psfFluxMin, Max : Minimum and maximum diaSource PSF flux.
<f>_psfFluxMean, MeanErr : Weighted mean of diaSource PSF flux, and its standard error.
<f>_psfFluxSigma : Standard deviation of the distribution of <f>_psfFlux.
<f>_psfFluxMAD : Median absolute deviation (MAD) of diaSource PSF flux. Does not include scale factor for comparison to sigma.
<f>_psfFluxChi2 : $\chi^2$ statistic for the scatter of <f>_psfFlux around <f>_psfFluxMean.
<f>_psfFluxSkew : Skew of diaSource PSF flux.
<f>_psfFluxPercentile05, 25, 50, 75, 95 : 5th, 25th, 50th, 75th, and 95th percentile diaSource PSF flux.
<f>_psfFluxStetsonJ: StetsonJ statistic of diaSource PSF flux.
<f>_psfFluxLinearIntercept : The y-intercept of a linear model fit to diaSource PSF flux vs time.
<f>_psfFluxLinearSlope : Slope of a linear model fit to diaSource PSF flux vs time.
<f>_psfFluxMaxSlope : Maximum ratio of time ordered $\Delta f / \Delta t$.

The following parameters are calculated per filter with forced photometry on the science image (also called the direct or visit image) at the diaSource positions (the scienceFlux column).

<f>_scienceFluxMean : Weighted mean of the PSF flux (forced photometered on the visit image).
<f>_scienceFluxMeanErr : Standard error on <f>_scienceFluxMean.
<f>_scienceFluxSigma : Standard deviation of the PSF flux (forced photometered on the visit image).

Although the scienceFlux features are forced photometry, because they are from the DiaSource table they only include images in which the astrophysical object was detected with SNR$>5$ in the difference image. These timeseries features do not use the forced photometry that is performed at the location of all diaObjects in all visit and difference images, which is stored in the ForcedSourceOnDiaObject table.

1.2. Import packages¶

Import numpy, a fundamental package for scientific computing with arrays in Python (numpy.org), and matplotlib, a comprehensive library for data visualization (matplotlib.org; matplotlib gallery). Also import scipy.stats, a package containing a large number of statistical functions (scipy.stats).

From the lsst package, import modules for accessing the Table Access Protocol (TAP) service and making colorblind-friedly plots.

In [1]:

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
from scipy.optimize import lsq_linear

from lsst.rsp import get_tap_service
from lsst.utils.plotting import (get_multiband_plot_colors,
                                 get_multiband_plot_symbols,
                                 get_multiband_plot_linestyles)

1.3. Define parameters¶

Define colors, symbols, and linestyles to represent the six LSST filters, $ugrizy$.

In [2]:

f_name = ['u', 'g', 'r', 'i', 'z', 'y']
f_col = get_multiband_plot_colors()
f_sym = get_multiband_plot_symbols()
f_lin = get_multiband_plot_linestyles()

Get an instance of the TAP service, and assert that it exists.

In [3]:

service = get_tap_service("tap")
assert service is not None

For this tutorial, use only the $g$- and $r$-band to illustrate the timeseries features.

In [4]:

use_filters = ['g', 'r']

2. Retrieve data for a known variable star¶

This tutorial uses a known SX Phoenicis-type pulsating variable star captured in DP1: Gaia DR3 2912281258855051520 (see Carlin et al. 2025). This star has coordinates RA, Dec = $94.9226329830, -25.2318482104$ deg and a diaObjectId in the DP1 dataset as defined in the following cell.

In [5]:

star_diaObjectId = 614435753623027782

2.1. The diaObject record¶

Retrieve all of the $g$- and $r$-band timeseries features from the DiaObject table. Create a comma-separated list of columns to retrieve from the DiaObject table. Include the $g$- and $r$-band timeseries features, plus the nDiaSources column.

In [6]:

query = "SELECT column_name " \
        "FROM tap_schema.columns " \
        "WHERE table_name = 'dp1.DiaObject'"
results = service.search(query).to_table()

columns_list = ''
for filt in use_filters:
    for name in results['column_name']:
        if name.find(filt + '_') == 0:
            columns_list += name + ', '
columns_list += 'nDiaSources'

del query, results

Define a query to retrieve all columns in the list from the DiaObject table for the row corresponding to the variable star, and submit the query to the TAP service.

In [7]:

query = """SELECT {} FROM dp1.DiaObject WHERE diaObjectId = {}
        """.format(columns_list, star_diaObjectId)

job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
if job.phase == 'ERROR':
    job.raise_if_error()

del query

Job phase is COMPLETED

Retrieve the results from the TAP service as an astropy table named star_diaObject.

In [8]:

assert job.phase == 'COMPLETED'
star_diaObject = job.fetch_result().to_table()
assert len(star_diaObject) == 1

Option to view star_diaObject.

In [9]:

# star_diaObject

In [10]:

job.delete()

2.2. The diaSource records¶

In general, for scientific analyses of lightcurves it is recommended to use the forced flux measurements in the ForcedSourceOnDiaObject table. However, because the timeseries features in the DiaObject table are calculated using the detection (unforced) flux measurements from the DiaSource table, this tutorial also uses them. The key difference to be aware of is that the DiaSource table only contains flux measurements for visits in which the diaObject was detected with SNR$\geq5$ in the difference image, whereas the ForcedSourceOnDiaObject table contains forced flux measurements for all visits.

Define a query to retrieve the $g$ and $r$-band difference-image (psfFlux) and direct-image (scienceFlux) measurements, the flux errors, and the exposure time midpoint Modified Julian Date (midpointMjdTai) from the DiaSource table.

In [11]:

query = """SELECT psfFlux, psfFluxErr, scienceFlux, scienceFluxErr, band, midpointMjdTai
        FROM dp1.DiaSource
        WHERE (band = '{}' OR band = '{}')
        AND diaObjectId = {}
        """.format(str(use_filters[0]), str(use_filters[1]), str(star_diaObjectId))

job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
if job.phase == 'ERROR':
    job.raise_if_error()

del query

Job phase is COMPLETED

Retrieve the results from the TAP service as an astropy table named star_diaSources.

In [12]:

assert job.phase == 'COMPLETED'
star_diaSources = job.fetch_result().to_table()
assert len(star_diaSources) == 129

In [13]:

job.delete()

Visualize the measured fluxes from the DiaSource table as a lightcurve.

In [14]:

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(6, 4), sharex=True)
for filt in use_filters:
    tx = np.where(star_diaSources['band'] == filt)[0]
    ax1.errorbar(star_diaSources[tx]['midpointMjdTai'], star_diaSources[tx]['psfFlux'],
                 yerr=star_diaSources[tx]['psfFluxErr'], fmt=f_sym[filt], ms=7,
                 alpha=0.5, mew=0, color=f_col[filt], label=filt)
    ax2.errorbar(star_diaSources[tx]['midpointMjdTai'], star_diaSources[tx]['scienceFlux'],
                 yerr=star_diaSources[tx]['scienceFluxErr'], fmt=f_sym[filt], ms=7,
                 alpha=0.5, mew=0, color=f_col[filt], label=filt)
    del tx
ax1.set_ylabel('psfFlux [nJy]')
ax2.set_ylabel('scienceFlux [nJy]')
ax2.set_xlabel('MJD [d]')
ax1.legend(loc='upper left')
plt.tight_layout()
plt.show()

No description has been provided for this image

Figure 1: The psfFlux (difference image flux; top) and scienceFlux (direct image flux; bottom) for all visits in which the star was detected with SNR>5 in the difference image, versus the MJD of the visit, for the $g$- and $r$-bands only. The errors bars on the flux are, for most data points, relatively too small to be seen.

3. Timeseries features¶

The following subsections demonstrate how each of the timeseries features in the DiaObject table are derived by recalculating them from the observations in the DiaSource table. For scientific analyses it is not necessary to recalculate these features; this tutorial does it only as a demonstration.

All features are calculated from the PSF flux measured on the difference images, psfFlux, except the three features in the last sub-section which are calculated from the PSF flux measured on the science images, scienceFlux.

3.1. Ndata¶

The <f>_psfFluxNdata column is the number of rows in the DiaSource table that are associated with a given diaObject, for a given filter. In other words, this is the number of difference-image detections of a given diaObject in a given band.

Recalculate the <f>_psfFluxNdata column using the DiaSource table, and confirm it matches the DiaObject table.

In [15]:

for f, filt in enumerate(use_filters):
    diao_ndata = int(star_diaObject[filt+'_psfFluxNdata'])
    tx = np.where(star_diaSources['band'] == filt)[0]
    dias_ndata = len(tx)
    print(filt + '-band   diaObject    diaSource')
    print('Ndata   %10i   %10i' % (diao_ndata, dias_ndata))
    if f == 0:
        print(' ')
    del diao_ndata, tx, dias_ndata

g-band   diaObject    diaSource
Ndata           61           61

r-band   diaObject    diaSource
Ndata           68           68

3.2. ErrMean¶

The <f>_psfFluxErrMean column is the unweighted average of the psfFluxErr values: $\frac{1}{N}\sum{e_{f}}$, where $e_f$ is the flux error (psfFluxErr) and $N$ is the number of flux measurements.

Recalculate the <f>_psfFluxErrMean column using the DiaSource table, and confirm it matches the DiaObject table.

In [16]:

for f, filt in enumerate(use_filters):
    diao_emean = float(star_diaObject[filt + '_psfFluxErrMean'])
    tx = np.where(star_diaSources['band'] == filt)[0]
    dias_emean = np.sum(star_diaSources['psfFluxErr'][tx])/len(tx)
    print(filt + '-band    diaObject    diaSource')
    print('ErrMean  %10.0f   %10.0f ' % (diao_emean, dias_emean))
    if f == 0:
        print(' ')
    del diao_emean, tx, dias_emean

g-band    diaObject    diaSource
ErrMean         344          344

r-band    diaObject    diaSource
ErrMean         368          368

3.3. Min, Max¶

The <f>_psfFluxMin and Max values are the lowest and highest values of the psfFlux column (difference-image fluxes) for the diaObject.

Recalculate the <f>_psfFluxMin and Max columns using the DiaSource table, and confirm they match the DiaObject table.

In [17]:

for f, filt in enumerate(use_filters):
    diao_min = float(star_diaObject[filt + '_psfFluxMin'])
    diao_max = float(star_diaObject[filt + '_psfFluxMax'])
    tx = np.where(star_diaSources['band'] == filt)[0]
    dias_min = np.min(star_diaSources['psfFlux'][tx])
    dias_max = np.max(star_diaSources['psfFlux'][tx])
    print(filt + '-band    diaObject    diaSource')
    print('Min      %10.0f   %10.0f ' % (diao_min, dias_min))
    print('Max      %10.0f   %10.0f ' % (diao_max, dias_max))
    if f == 0:
        print(' ')
    del diao_min, diao_max, tx, dias_min, dias_max

g-band    diaObject    diaSource
Min          -37333       -37333
Max           31071        31071

r-band    diaObject    diaSource
Min          -23334       -23334
Max           20747        20747

3.4. Mean, MeanErr¶

The <f>_psfFluxMean is the weighted mean of the measured difference-image fluxes: $\bar{f_w} = \frac{\sum{f\times w}}{\sum{w}}$.

The weights ($w$) are the inverse of the square of the psfFluxErr ($e_f$) values: $w = \frac{1}{e_f^2}$.

The <f>_psfFluxMeanErr is the error on the weighted mean flux, and it is the inverse of the root of the sum of the weights: $\epsilon = \frac{1}{\sqrt{\sum{w}}}$.

Recalculate the <f>_psfFluxMean and MeanErr columns using the DiaSource table, and confirm they match the DiaObject table.

In [18]:

for f, filt in enumerate(use_filters):
    diao_mean = float(star_diaObject[filt + '_psfFluxMean'])
    diao_meane = float(star_diaObject[filt + '_psfFluxMeanErr'])
    tx = np.where(star_diaSources['band'] == filt)[0]
    dias_weights = 1.0/(star_diaSources['psfFluxErr'][tx]**2)
    dias_wmean = np.sum(star_diaSources['psfFlux'][tx] * dias_weights)/np.sum(dias_weights)
    dias_meane = 1.0 / np.sqrt(np.sum(dias_weights))
    print(filt + '-band    diaObject    diaSource')
    print('Mean     %10.0f   %10.0f ' % (diao_mean, dias_wmean))
    print('MeanErr  %10.2f   %10.2f ' % (diao_meane, dias_meane))
    if f == 0:
        print(' ')
    del diao_mean, diao_meane, tx, dias_weights, dias_wmean, dias_meane

g-band    diaObject    diaSource
Mean         -21039       -21039
MeanErr       43.64        43.64

r-band    diaObject    diaSource
Mean         -11608       -11608
MeanErr       44.43        44.43

3.5. Sigma, MAD¶

The <f>_psfFluxSigma is the standard deviation in the measured difference-image fluxes ($\sigma_f$). Note that this feature uses the unweighted average flux ($\bar{f}$) in its calculation: $\sigma_f = \sqrt{\frac{\sum{(f - \bar{f})^2}}{N-1}}$.

The <f>_psfFluxMAD, the median absolute deviation (MAD), is the median value of an array composed of the absolute values of the differences between the psfFlux and the median value of the psfFlux, for a given filter.

Recalculate the <f>_psfFluxSigma and MAD columns using the DiaSource table, and confirm they match the DiaObject table.

In [19]:

for f, filt in enumerate(use_filters):
    diao_sigma = float(star_diaObject[filt + '_psfFluxSigma'])
    diao_mad = float(star_diaObject[filt + '_psfFluxMAD'])
    tx = np.where(star_diaSources['band'] == filt)[0]
    dias_mean = np.sum(star_diaSources['psfFlux'][tx])/len(tx)
    dias_sigma = np.sqrt(np.sum((star_diaSources['psfFlux'][tx] - dias_mean)**2)/(len(tx)-1))
    dias_mad = np.ma.median(np.abs(star_diaSources['psfFlux'][tx]
                                   - np.ma.median(star_diaSources['psfFlux'][tx])))
    print(filt + '-band  diaObject    diaSource')
    print('Sigma  %10.0f   %10.0f ' % (diao_sigma, dias_sigma))
    print('MAD    %10.0f   %10.0f' % (diao_mad, dias_mad))
    if f == 0:
        print(' ')
    del diao_sigma, diao_mad, tx, dias_mean, dias_sigma, dias_mad

g-band  diaObject    diaSource
Sigma       22382        22382
MAD          6509         6509

r-band  diaObject    diaSource
Sigma       11559        11559
MAD          4647         4647

Show the minimum, maximum, mean, sigma, and MAD values as lines on the lightcurve and the flux distribution.

Define the labels, linewidths, and linestyles of the lines to represent the statistical features, then create the plot.

In [20]:

s_lb = ['min/max', None, 'mean', 'sigma', None, 'MAD', None]
s_lw = [1, 1, 1, 2, 2, 1, 1]
s_ls = ['dashed', 'dashed', 'solid', 'dotted', 'dotted', '-.', '-.']
fig, ax = plt.subplots(2, 2, figsize=(8, 5), sharex='col')
for f, filt in enumerate(use_filters):
    tx = np.where(star_diaSources['band'] == filt)[0]
    ax[f, 0].errorbar(star_diaSources['midpointMjdTai'][tx], star_diaSources['psfFlux'][tx],
                      yerr=star_diaSources['psfFluxErr'][tx], fmt=f_sym[filt],
                      ms=7, alpha=0.5, mew=0, color=f_col[filt], label=filt + ', star')
    ax[f, 1].hist(star_diaSources['psfFlux'][tx], bins=30,
                  alpha=0.5, color=f_col[filt], label=filt)
    s_vl = [star_diaObject[filt + '_psfFluxMin'], star_diaObject[filt + '_psfFluxMax'],
            star_diaObject[filt + '_psfFluxMean'],
            star_diaObject[filt + '_psfFluxMean'] - star_diaObject[filt + '_psfFluxSigma'],
            star_diaObject[filt + '_psfFluxMean'] + star_diaObject[filt + '_psfFluxSigma'],
            star_diaObject[filt + '_psfFluxMean'] - star_diaObject[filt + '_psfFluxMAD'],
            star_diaObject[filt + '_psfFluxMean'] + star_diaObject[filt + '_psfFluxMAD']]
    for s in range(len(s_vl)):
        ax[f, 0].axhline(s_vl[s], lw=s_lw[s], ls=s_ls[s], color='grey', label=s_lb[s])
        ax[f, 1].axvline(s_vl[s], lw=s_lw[s], ls=s_ls[s], color='grey', label=s_lb[s])
    ax[f, 0].set_ylabel('psfFlux [nJy]')
    ax[f, 1].set_ylabel('N(diaSources)')
    ax[f, 1].legend(bbox_to_anchor=(1.7, 1), loc='upper right')
    del tx, s_vl
ax[1, 0].set_xlabel('MJD [d]')
ax[1, 1].set_xlabel('psfFlux [nJy]')
plt.suptitle('Statistical features on the lightcurve and flux distribution of a variable star')
plt.tight_layout()
plt.show()
del s_lb, s_lw, s_ls

Figure 2: The $g$- and $r$-band lightcurves (left) and flux distributions (right) for the variable star, with the minimum and maximum (dashed), mean (solid), and standard deviation (sigma; dotted) of the flux values overplotted as grey lines.

3.6. Chi2¶

The <f>_psfFluxChi2 for a given filter is $\chi^2 = \sum\left(\frac{f - \bar{f}}{e_f}\right)^2$, where $f$ is the psfFlux, $\bar{f}$ is the weighted mean of the psfFlux, and $e_f$ is the psfFluxErr.

Notice that the equation for $\chi^2$ is quite similar to the one for $\sigma_f$, except in $\chi^2$ the flux differences from the mean are weighted by the flux error.

In a sense, the $\chi^2$ is a measure of whether the flux variability is more than that expected from random variations due to measurement uncertainty.

Recalculate the <f>_psfFluxChi2 column using the DiaSource table, and confirm it matches the DiaObject table.

In [21]:

for f, filt in enumerate(use_filters):
    diao_chi2 = float(star_diaObject[filt+'_psfFluxChi2'])
    tx = np.where(star_diaSources['band'] == filt)[0]
    dias_chi2 = np.sum(((star_diaSources['psfFlux'][tx]
                         - star_diaObject[filt + '_psfFluxMean'])
                        / (star_diaSources['psfFluxErr'][tx]))**2)
    print(filt + 'band   diaObject   diaSource')
    print('Chi2   %10.0f  %10.0f' % (diao_chi2, dias_chi2))
    if f == 0:
        print(' ')
    del diao_chi2, tx, dias_chi2

gband   diaObject   diaSource
Chi2       220810      220810

rband   diaObject   diaSource
Chi2        66132       66132

3.7. Skew¶

The skewness of a distribution describes its asymetry about its peak, as in the diagram below.

Figure 3: An illustration of negatively and positively skewed distributions. By Rodolfo Hermans (Godot) at en.wikipedia (CC BY-SA 3.0).

Demonstrate how to derive the <f>_psfFluxSkew using the scipy.stats package.

In [22]:

for f, filt in enumerate(use_filters):
    diao_skew = float(star_diaObject[filt + '_psfFluxSkew'])
    tx = np.where(star_diaSources['band'] == filt)[0]
    dias_skew = stats.skew(np.asarray(star_diaSources['psfFlux'][tx], dtype='float64'),
                           bias=False, nan_policy='omit')
    print(filt + 'band    diaObject    diaSource')
    print('Skew    %10.4f   %10.4f ' % (diao_skew, dias_skew))
    if f == 0:
        print(' ')
    del diao_skew, tx, dias_skew

gband    diaObject    diaSource
Skew        1.1426       1.1426

rband    diaObject    diaSource
Skew        1.1073       1.1073

The right-hand panels of Figure 2 in Section 3.5 show that the flux distribution for the known variable star has positive skew.

3.8. Percentiles¶

The <f>_psfFluxPercentiles are 5, 25, 50, 75, and 95th percentiles of the cumulative distribution of difference-image fluxes.

Recalculate the <f>_psfFluxPercentiles columns using the DiaSource table and the numpy.nanpercentiles function. Compress the array to remove masked values before using np.percentile. Do not use np.nanpercentile because it ignores the mask and produces a warning message.

In [23]:

percentiles = {'05': 0.05, '25': 0.25, '50': 0.5, '75': 0.75, '95': 0.95}

for f, filt in enumerate(use_filters):
    tx = np.where(star_diaSources['band'] == filt)[0]
    print(filt + '-band   diaObject    diaSource')
    for p, pc in enumerate(percentiles):
        diao_pc = star_diaObject[filt + '_psfFluxPercentile' + pc]
        values = star_diaSources['psfFlux'][tx]
        dias_pc = float(np.percentile(values.compressed(),
                                         100.0*percentiles[pc]))
        print('PC %2s   %10.0f   %10.0f ' % (pc, diao_pc, dias_pc))
        del diao_pc, dias_pc, values
    if f == 0:
        print(' ')
    del tx

g-band   diaObject    diaSource
PC 05       -36710       -36710
PC 25       -35295       -35295
PC 50       -29042       -29042
PC 75        -8579        -8579
PC 95        27488        27488

r-band   diaObject    diaSource
PC 05       -22417       -22417
PC 25       -19595       -19595
PC 50       -17730       -17730
PC 75        -6018        -6018
PC 95         9629         9629

Plot the cumulative flux distribution for the known variable star and mark the percentiles.

In [24]:

fig, ax = plt.subplots(2, 1, figsize=(6, 4), sharex=True)
for f, filt in enumerate(use_filters):
    tx = np.where(star_diaSources['band'] == filt)[0]
    sx = np.argsort(star_diaSources['psfFlux'][tx])
    xvals = star_diaSources['psfFlux'][tx[sx]]
    yvals = (np.arange(len(sx), dtype='float') + 1.0)/float(len(sx))
    ax[f].plot(xvals, yvals, 'None',
               ls=f_lin[filt], color=f_col[filt], label=filt)
    del tx, sx, xvals, yvals
    for p, pc in enumerate(percentiles):
        xval = star_diaObject[filt + '_psfFluxPercentile' + pc]
        yval = percentiles[pc]
        if p == 0:
            ax[f].plot(xval, yval, '*', ms=7, color='black', label='percentile')
        else:
            ax[f].plot(xval, yval, '*', ms=7, color='black')
        del xval, yval
    ax[f].legend(loc='lower right')
    ax[f].set_ylabel('Percentile')
ax[1].set_xlabel('Sorted psfFlux [nJy]')
ax[0].set_title('Cumulative flux distribution of the known variable star')
plt.tight_layout()
plt.show()

Figure 4: The cumulative flux distribution for the $g$- and the $r$-band, with the percentiles marked as black stars.

3.9. StetsonJ¶

The <f>_psfFluxStetsonJ is a variability index developed for Cepheids (defined in Stetson 1996).

Define a function to calculate the StetsonJ parameter, using the same input parameters that are used by default in the lsst.meas.base module diaCalculationPlugins.py.

In [25]:

def StetsonJ(fluxes, errors):
    """
    Calculate the StetsonJ parameter with the same parameters
    as used in the LSST Science Pipelines.

    Parameters
    ----------
    fluxes: float
            The difference-image fluxes in nJy.
    errors: float
            The errors on the fluxes in nJy.

    Returns
    -------
    stetsonj: float
              The StetsonJ parameter.
    """

    alpha = 2
    beta = 2
    n_iter = 20
    tol = 1e-6
    n_points = len(fluxes)
    n_factor = np.sqrt(n_points / (n_points - 1))
    inv_var = 1 / errors**2
    mean = np.average(fluxes, weights=inv_var)
    for iter_idx in range(n_iter):
        chi = np.fabs(n_factor * (fluxes - mean) / errors)
        tmp_mean = np.average(fluxes,
                              weights=inv_var / (1 + (chi / alpha) ** beta))
        diff = np.fabs(tmp_mean - mean)
        mean = tmp_mean
        if diff / mean < tol and diff < tol:
            break
    flux_mean = mean
    delta_val = (np.sqrt(n_points / (n_points - 1)) * (fluxes - flux_mean) / errors)
    p_k = delta_val ** 2 - 1
    stetsonj = np.mean(np.sign(p_k) * np.sqrt(np.fabs(p_k)))
    return stetsonj

Recalculate the <f>_psfFluxStetsonJ parameter.

In [26]:

for f, filt in enumerate(use_filters):
    diao_sj = float(star_diaObject[filt + '_psfFluxStetsonJ'])
    tx = np.where(star_diaSources['band'] == filt)[0]
    dias_sj = StetsonJ(star_diaSources['psfFlux'][tx],
                       star_diaSources['psfFluxErr'][tx])
    print(filt + 'band     diaObject    diaSource')
    print('StetsonJ     %6.2f       %6.2f ' % (diao_sj, dias_sj))
    if f == 0:
        print(' ')
    del diao_sj, tx, dias_sj

gband     diaObject    diaSource
StetsonJ      44.61        44.61

rband     diaObject    diaSource
StetsonJ      24.68        24.68

3.10. Linear fit¶

The <f>_psfFluxLinearSlope and <f>_psfFluxLinearIntercept are the result of fitting a linear regression using the scipy.optimize.lsq_linear function.

The psfFluxMaxSlope is the maximum "instantaneous" slope, or the maximum ration of the series of time-ordered values of $\Delta f / \Delta t$.

Demonstrate how to derive the linear fit parameters.

In [27]:

for f, filt in enumerate(use_filters):
    diao_b = float(star_diaObject[filt + '_psfFluxLinearIntercept'])
    diao_m = float(star_diaObject[filt + '_psfFluxLinearSlope'])
    diao_mm = float(star_diaObject[filt + '_psfFluxMaxSlope'])
    tx = np.where(star_diaSources['band'] == filt)[0]
    A = np.array([star_diaSources['midpointMjdTai'][tx]
                  / star_diaSources['psfFluxErr'][tx],
                  1.0 / star_diaSources['psfFluxErr'][tx]]).transpose()
    dias_m, dias_b = lsq_linear(A, star_diaSources['psfFlux'][tx]
                                / star_diaSources['psfFluxErr'][tx]).x
    sx = np.argsort(star_diaSources['midpointMjdTai'][tx])
    dias_mm = (np.diff(star_diaSources['psfFlux'][tx[sx]])
               / np.diff(star_diaSources['midpointMjdTai'][tx[sx]])).max()
    print(filt + 'band       diaObject   diaSource')
    print('Intercept  %10.0f  %10.0f ' % (diao_b, dias_b))
    print('Slope      %10.2f  %10.2f ' % (diao_m, dias_m))
    print('Max Slope  %10.0f  %10.0f ' % (diao_mm, dias_mm))
    if f == 0:
        print(' ')
    del diao_b, diao_m, diao_mm, tx, A, dias_b, dias_m, sx, dias_mm

gband       diaObject   diaSource
Intercept    39115056    39121655
Slope         -645.28     -645.39
Max Slope    10225682    10225817

rband       diaObject   diaSource
Intercept     4841781     4838914
Slope          -80.02      -79.98
Max Slope     7048395     7048321

> Notice: For the linear fits, the re-derived parameters from the DiaSource table are not an exact match to the parameters stored in the DiaObject table.

The known variable star is not expected to exhibit a linear slope. Plot the lightcurve and show the best fit line.

In [28]:

fig, ax = plt.subplots(2, 1, figsize=(6, 4), sharex=True)
for f, filt in enumerate(use_filters):
    tx = np.where(star_diaSources['band'] == filt)[0]
    ax[f].errorbar(star_diaSources['midpointMjdTai'][tx], star_diaSources['psfFlux'][tx],
                   yerr=star_diaSources['psfFluxErr'][tx], fmt=f_sym[filt],
                   ms=7, alpha=0.5, mew=0, color=f_col[filt], label=filt + ', star')
    m = star_diaObject[filt + '_psfFluxLinearSlope']
    b = star_diaObject[filt + '_psfFluxLinearIntercept']
    xvals = np.sort(star_diaSources['midpointMjdTai'][tx])
    yvals = m * xvals + b
    ax[f].plot(xvals, yvals, lw=1, ls=f_lin[filt],
               alpha=0.8, color=f_col[filt], label='linear fit')
    ax[f].set_ylabel('psfFlux [nJy]')
    ax[f].legend(loc='upper left')
    del tx, m, b, xvals, yvals
ax[1].set_xlabel('MJD [d]')
ax[0].set_title('Linear fit to the lightcurve of the known variable star')
plt.tight_layout()
plt.show()

Figure 5: As expected for a pulsating variable, the results of a linear fit are not a useful interpretation for the known variable star lightcurve.

3.11. Science flux features¶

The <f>_scienceFluxMean is the weighted mean of the forced PSF fluxes measured on the science (direct) images. The <f>_scienceFluxMeanErr is the error on the weighted mean flux. Both features use the same formulae as quoted in Section 3.4.

The <f>_scienceFluxSigma is the standard deviation in the forced PSF fluxes measured on the science (direct) images. It has the same definition as in Section 3.5, and also uses the unweighted mean flux in its formula.

Demonstrate how to derive these statistics for the scienceFlux.

In [29]:

for f, filt in enumerate(use_filters):
    diao_sci_mean = float(star_diaObject[filt + '_scienceFluxMean'])
    diao_sci_meane = float(star_diaObject[filt + '_scienceFluxMeanErr'])
    diao_sci_sigma = float(star_diaObject[filt + '_scienceFluxSigma'])
    tx = np.where(star_diaSources['band'] == filt)[0]
    dias_sci_w = 1.0/(star_diaSources['scienceFluxErr'][tx]**2)
    dias_sci_wmean = np.sum(star_diaSources['scienceFlux'][tx] * dias_sci_w) / np.sum(dias_sci_w)
    dias_sci_meane = 1.0 / np.sqrt(np.sum(dias_sci_w))
    dias_sci_mean = np.sum(star_diaSources['scienceFlux'][tx])/len(tx)
    dias_sci_sigma = np.sqrt(np.sum((star_diaSources['scienceFlux'][tx]
                                     - dias_sci_mean)**2)/(len(tx)-1))
    print(filt + '-band    diaObject    diaSource')
    print('Mean     %10.0f   %10.0f ' % (diao_sci_mean, dias_sci_wmean))
    print('MeanErr  %10.2f   %10.2f ' % (diao_sci_meane, dias_sci_meane))
    print('Sigma    %10.0f   %10.0f ' % (diao_sci_sigma, dias_sci_sigma))
    if f == 0:
        print(' ')
    del diao_sci_mean, diao_sci_meane, diao_sci_sigma
    del tx, dias_sci_w, dias_sci_wmean, dias_sci_meane, dias_sci_mean, dias_sci_sigma

g-band    diaObject    diaSource
Mean         109668       109668
MeanErr       43.36        43.36
Sigma         21279        21279

r-band    diaObject    diaSource
Mean         114239       114239
MeanErr       43.96        43.96
Sigma         11588        11588

In [ ]: