201.6. ForcedSourceOnDiaObject table¶
201.6. Forced Source on DiaObject table¶
Data Release: Data Preview 1
Container Size: large
LSST Science Pipelines version: r29.1.1
Last verified to run: 2025-06-20
Repository: github.com/lsst/tutorial-notebooks
Learning objective: To understand the contents of the ForcedSourceOnDiaObject
table and how to access it.
LSST data products: ForcedSourceOnDiaObject
Packages: lsst.rsp
, lsst.daf.butler
Credit: Originally developed by the Rubin Community Science team. Please consider acknowledging them if this notebook is used for the preparation of journal articles, software releases, or other notebooks.
Get Support: Everyone is encouraged to ask questions or raise issues in the Support Category of the Rubin Community Forum. Rubin staff will respond to all questions posted there.
1. Introduction¶
The ForcedSourceOnDiaObject
table contains forced PSF measurements in individual visit_images
and difference_images
at the sky coordinates of all DiaObjects
.
DiaObjects
are the union set of all diaSources
detected in difference_images
.
It is recommended to use the ForcedSourceOnDiaObject
table for light curves.
- TAP table name:
dp1.ForcedSourceOnDiaObject
- butler table name:
dia_object_forced_source
- columns: 28
- rows: 196,911,566
Related tutorials: The TAP and butler data access services are demonstrated in the 100-level "How to" tutorials. There are 200-level tutorials on difference_images
and the DiaObject
table.
1.1. Import packages¶
Import standard python packages re
, numpy
, matplotlib
and astropy
.
From the lsst
package, import modules for the TAP service, the butler, and plotting.
import re
import numpy as np
import matplotlib.pyplot as plt
from lsst.rsp import get_tap_service
from lsst.daf.butler import Butler
from lsst.utils.plotting import (get_multiband_plot_colors,
get_multiband_plot_symbols,
get_multiband_plot_linestyles)
1.2. Define parameters and functions¶
Create an instance of the TAP service, and assert that it exists.
service = get_tap_service("tap")
assert service is not None
Create an instance of the Rubin data butler, and assert that it exists.
butler = Butler('dp1', collections="LSSTComCam/DP1")
assert butler is not None
Define the colors and symbols to represent the LSST filters in plots.
filter_names = ['u', 'g', 'r', 'i', 'z', 'y']
filter_colors = get_multiband_plot_colors()
filter_symbols = get_multiband_plot_symbols()
filter_linestyles = get_multiband_plot_linestyles()
2. Schema (columns)¶
To browse the table schema visit the Rubin schema browser, or use the TAP service via the Portal Aspect or as demonstrated in Section 2.1.
2.1. Retrieve table schema¶
To retrieve the table schema, define a query for the schema columns of the ForcedSourceOnDiaObject
table and run the query job.
query = "SELECT column_name, datatype, description, unit " \
"FROM tap_schema.columns " \
"WHERE table_name = 'dp1.ForcedSourceOnDiaObject'"
job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
if job.phase == 'ERROR':
job.raise_if_error()
assert job.phase == 'COMPLETED'
Job phase is COMPLETED
Retrieve the query results and display them as an astropy
table with the to_table()
attribute.
results = job.fetch_result().to_table()
results
column_name | datatype | description | unit |
---|---|---|---|
str64 | str64 | str512 | str64 |
band | char | Abstract filter that is not associated with a particular instrument | |
coord_dec | double | Fiducial ICRS Declination of centroid used for database indexing | deg |
coord_ra | double | Fiducial ICRS Right Ascension of centroid used for database indexing | deg |
detector | long | Id of the detector where this forced source was measured. Datatype short instead of byte because of DB concerns about unsigned bytes. | |
diaObjectId | long | Id of the DiaObject that this DiaForcedSource was associated with. | |
diff_PixelFlags_nodataCenter | boolean | Source center is outside usable region on image difference (masked NO_DATA) | |
invalidPsfFlag | boolean | Forced source has an invalid PSF. | |
parentObjectId | long | Unique ObjectId of the parent of the ObjectId in context of the deblender. | |
patch | long | Skymap patch ID | |
pixelFlags_bad | boolean | Bad pixel in the Source footprint | |
pixelFlags_cr | boolean | Cosmic ray in the Source footprint | |
pixelFlags_crCenter | boolean | Cosmic ray in the Source center | |
pixelFlags_edge | boolean | Source is on the edge of an exposure region (masked EDGE) | |
pixelFlags_interpolated | boolean | Interpolated pixel in the Source footprint | |
pixelFlags_interpolatedCenter | boolean | Interpolated pixel in the Source center | |
pixelFlags_nodata | boolean | Source is outside usable exposure region (masked NO_DATA) | |
pixelFlags_saturated | boolean | Saturated pixel in the Source footprint | |
pixelFlags_saturatedCenter | boolean | Saturated pixel in the Source center | |
pixelFlags_suspect | boolean | Sources footprint includes suspect pixels | |
pixelFlags_suspectCenter | boolean | Sources center is close to suspect pixels | |
psfDiffFlux | float | Flux derived from linear least-squares fit of psf model forced on the image difference | nJy |
psfDiffFlux_flag | boolean | Failure to derive linear least-squares fit of psf model forced on the image difference | |
psfDiffFluxErr | float | Uncertainty on the flux derived from linear least-squares fit of psf model forced on the image difference | nJy |
psfFlux | float | Flux derived from linear least-squares fit of psf model forced on the calexp | nJy |
psfFlux_flag | boolean | Failure to derive linear least-squares fit of psf model forced on the calexp | |
psfFluxErr | float | Uncertainty on the flux derived from linear least-squares fit of psf model forced on the calexp | nJy |
tract | long | Skymap tract ID | |
visit | long | Id of the visit where this forced source was measured. |
The table displayed above has been truncated.
Option to print every column name as a list.
# for col in results['column_name']:
# print(col)
Option to use the regular expressions package re
to search for column names that contain the string temp
.
# temp = 'Err'
# temp = 'Flux'
# temp = 'psf'
# for col in results['column_name']:
# if re.search(temp, col):
# print(col)
Delete the job, but not the results
.
del query
job.delete()
2.2.2. Coordinates¶
The sky coordinates in decimal degrees for the associated DiaObject
is held fixed during the forced measurements, and has no quoted uncertainty in the ForcedSourceOnDiaObject
table.
coord_ra
,coord_dec
2.2.3. Observation¶
The band, visit, and detector in which the ForcedSourceOnDiaObject
was measured.
band
,visit
,detector
2.2.4. Photometry¶
A forced fit of the Point Spread Function (PSF) at the object's coordinates in each individual visit_image
and difference_image
.
psfFlux
,psfFluxErr
psfDiffFlux
,psfDiffFluxErr
Forced fluxes on difference images can be negative.
It is not recommended to convert forced fluxes to magnitudes, but if they are positive, the fluxes ($f$) are in nanojanskies and the corresponding AB magnitude ($m$) is:
$m = -2.5\log(f) + 31.4$
2.2.5. Flags¶
A variety of flags indicating whether pixels that are saturated, or affected by cosmic rays, contributed to the source's measurements.
pixelFlags_*
2.3. Descriptions and units¶
For a subset of the key columns show the table of their descriptions and units.
col_list = set(['diaObjectId', 'coord_ra', 'coord_dec', 'band', 'visit', 'detector',
'psfFlux', 'psfFluxErr', 'psfDiffFlux', 'psfDiffFluxErr'])
tx = [i for i, item in enumerate(results['column_name']) if item in col_list]
results[tx]
column_name | datatype | description | unit |
---|---|---|---|
str64 | str64 | str512 | str64 |
band | char | Abstract filter that is not associated with a particular instrument | |
coord_dec | double | Fiducial ICRS Declination of centroid used for database indexing | deg |
coord_ra | double | Fiducial ICRS Right Ascension of centroid used for database indexing | deg |
detector | long | Id of the detector where this forced source was measured. Datatype short instead of byte because of DB concerns about unsigned bytes. | |
diaObjectId | long | Id of the DiaObject that this DiaForcedSource was associated with. | |
psfDiffFlux | float | Flux derived from linear least-squares fit of psf model forced on the image difference | nJy |
psfDiffFluxErr | float | Uncertainty on the flux derived from linear least-squares fit of psf model forced on the image difference | nJy |
psfFlux | float | Flux derived from linear least-squares fit of psf model forced on the calexp | nJy |
psfFluxErr | float | Uncertainty on the flux derived from linear least-squares fit of psf model forced on the calexp | nJy |
visit | long | Id of the visit where this forced source was measured. |
Clean up.
del col_list, tx, results
3. Data access¶
The ForcedSourceOnDiaObject
table is available via the TAP service and the butler.
Recommended access method: TAP.
3.1. Advisory: avoid full-table queries¶
Avoid full-table queries. Always include a constraint on the coordinates, diaObjectId
, or visit
.
The ForcedSourceOnDiaObject
table is a large, inclusive union set of forced measurements made in all the visit and difference images at the locations of all DiaObjects
.
Furthermore, skipping spatial constraints is not a good habit to form, because future data release ForcedSourceOnDiaObject
tables will likely contain trillions of rows.
3.2. TAP (Table Access Protocol)¶
The ForcedSourceOnDiaObject
table is stored in Qserv and accessible via the TAP services using ADQL queries.
Include spatial constraints:
Qserv stores catalog data sharded by coordinate (RA, Dec), so ADQL query statements that include constraints by coordinate do not requre a whole-catalog search and are typically faster (and can be much faster) than ADQL query statements which only include constraints for other columns.
Use either an ADQL cone or polygon search for faster queries (do not use WHERE ... BETWEEN
statements to set boundaries on RA and Dec).
3.2.1. Demo query¶
Define a query to return the ten "key columns" from Section 2.3.
Use the coordinates of a known extragalactic transient:
- RA = 53.125
- Dec = -27.740
Impose spatial constraints: search within a 1 arcsecond radius of a known transient near the center of the Extended Chandra Deep Field South (ECDFS) field, RA, Dec = $53.125, -27.740$.
query = "SELECT diaObjectId, coord_ra, coord_dec, band, visit, detector, " \
"psfFlux, psfFluxErr, psfDiffFlux, psfDiffFluxErr " \
"FROM dp1.ForcedSourceOnDiaObject " \
"WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec), " \
"CIRCLE('ICRS', 53.125, -27.740, 0.00028)) = 1 " \
"ORDER BY visit ASC "
job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
if job.phase == 'ERROR':
job.raise_if_error()
assert job.phase == 'COMPLETED'
Job phase is COMPLETED
Fetch the results as an astropy
table.
results = job.fetch_result().to_table()
print(len(results))
446
Option to display the table.
# results
Show that all of the ForcedSourceOnDiaObject
photometry that was returned is associated with only 1 DiaObject
.
values, counts = np.unique(results['diaObjectId'], return_counts=True)
assert len(values) == 1
for value, count in zip(values, counts):
print(value, count)
del values, counts
611255759837069401 446
As an example, calculate and print the minimum, mean, maximum, and standard deviation in PSF flux in the difference_images
per filter. Negative fluxes demonstrate that the transient was likely present in the reference template used for the image subtraction.
for filt in filter_names:
fx = np.where(results['band'] == filt)[0]
print('%s %3i %8.1f %8.1f %8.1f %8.1f' %
(filt, len(fx),
np.min(results['psfDiffFlux'][fx]), np.mean(results['psfDiffFlux'][fx]),
np.max(results['psfDiffFlux'][fx]), np.std(results['psfDiffFlux'][fx])))
u 22 -17794.6 -1184.0 1651.0 5280.5 g 121 -38815.1 -275.9 3379.5 4096.1 r 116 -38037.2 -1497.7 5476.7 5217.8 i 88 -9918.1 1432.2 9298.7 5791.1 z 87 -64061.1 -4087.0 5891.5 9607.9 y 12 -2178.1 -306.9 1851.8 1119.6
Clean up.
job.delete()
del query, results
3.2.2. Joinable tables¶
The ForcedSourceOnDiaObject
table can be joined to:
- the
Visit
table on columnvisit
- the
DiaObject
table on columndiaObjectId
The Visit
table contains information about the observation, such as the date and time.
The DiaObject
table contains measurement statistics made on the difference_images
.
To plot a light curve, join the ForcedSourceOnDiaObject
and Visit
tables.
To the query used in Section 3.2.1, add a table join to the Visit
table and retrieve the column expMidptMJD
.
Change the WHERE
statement to select all rows of the ForcedSourceOnDiaObjectId
catalog for the diaObjectId
identified in the previous section.
This query would also work the same with the CONTAINS(POINT(), CIRCLE())
statement.
query = "SELECT fsodo.diaObjectId, fsodo.coord_ra, fsodo.coord_dec, fsodo.band, fsodo.visit, " \
"fsodo.detector, fsodo.psfFlux, fsodo.psfFluxErr, fsodo.psfDiffFlux, " \
"fsodo.psfDiffFluxErr, v.expMidptMJD, " \
"diao.u_psfFluxMean, diao.g_psfFluxMean, diao.r_psfFluxMean, " \
"diao.i_psfFluxMean, diao.z_psfFluxMean, diao.y_psfFluxMean " \
"FROM dp1.ForcedSourceOnDiaObject AS fsodo " \
"JOIN dp1.Visit AS v ON fsodo.visit = v.visit " \
"JOIN dp1.DiaObject AS diao ON diao.diaObjectId = fsodo.diaObjectId " \
"WHERE diao.diaObjectId = 611255759837069401 " \
"ORDER BY v.expMidptMJD ASC "
job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
if job.phase == 'ERROR':
job.raise_if_error()
assert job.phase == 'COMPLETED'
Job phase is COMPLETED
Fetch the results as an astropy
table.
results = job.fetch_result().to_table()
print(len(results))
446
Option to display the results table.
# results
Plot the light curve for this object, as measured from the difference_images
(psfDiffFlux
) and visit_images
(psfFlux
).
Warning message: The following cell will produce a pink warning about masked elements being converted to nans, which is just saying that some of the flux measurements in the array were masked (i.e., were not successful) and that the corresponding values are converted to "not a number", "nan". They are not plotted.
fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(10, 4))
for filt in filter_names:
fx = np.where(results['band'] == filt)[0]
if len(fx) > 0:
ax1.axhline(results[filt+'_psfFluxMean'][fx[0]], color=filter_colors[filt],
ls=filter_linestyles[filt], alpha=0.2, label=filt)
ax1.plot(results['expMidptMJD'][fx],
results['psfDiffFlux'][fx],
filter_symbols[filt], ms=5, mew=0, alpha=0.7,
color=filter_colors[filt])
ax2.plot(results['expMidptMJD'][fx],
results['psfFlux'][fx],
filter_symbols[filt], ms=5, mew=0, alpha=0.7,
color=filter_colors[filt], label=filt)
ax1.set_xlabel('MJD')
ax1.set_ylabel('PSF Diff Flux')
ax1.legend(loc='lower left', ncol=2)
ax2.set_xlabel('MJD')
ax2.set_ylabel('PSF Flux')
ax2.legend(loc='lower left', ncol=2)
plt.tight_layout()
plt.show()
/opt/lsst/software/stack/conda/envs/lsst-scipipe-10.0.0/lib/python3.12/site-packages/matplotlib/cbook.py:1719: UserWarning: Warning: converting a masked element to nan. return math.isfinite(val) /opt/lsst/software/stack/conda/envs/lsst-scipipe-10.0.0/lib/python3.12/site-packages/matplotlib/cbook.py:1355: UserWarning: Warning: converting a masked element to nan. return np.asarray(x, float)
Figure 1: At left, the PSF forced photometry light curve on the
difference_images
for the diaobject. Horizontal lines mark the diaobject's mean flux from associateddiaSource
measurements in thedifference_images
. At right, the PSF forced photometry light curve on thevisit_images
for the diaobject.
Note that the photometry statistics from the DiaObject
table (e.g. r_psfFluxMean
) are calculated from associated DiaSources
, where the signal-to-noise ratio of the detections are $\geq5$, not from the ForcedSourceOnDiaObject
measurements.
Clean up.
job.delete()
del query, results
3.4. Butler¶
TAP is the recommended way to access the source table, but the butler can also be used.
Show that the dimensions for the object_forced_source
table are the skymap's tract and patch, and that both are required.
butler.get_dataset_type('dia_object_forced_source')
DatasetType('dia_object_forced_source', {skymap, tract, patch}, ArrowAstropy)
butler.get_dataset_type('dia_object_forced_source').dimensions.required
{skymap, tract, patch}
3.4.1. Demo query¶
Retrieve all dataset_refs
for dia_object_forced_source
tables for patches that overlap the coordinates that were also used in Section 3.2.
ra = 53.125
dec = -27.740
refs = butler.query_datasets("dia_object_forced_source",
where="patch.region OVERLAPS POINT(:ra, :dec)",
bind={"ra": ra, "dec": dec})
Print the number of dataset references returned.
print(len(refs))
1
refs
[DatasetRef(DatasetType('dia_object_forced_source', {skymap, tract, patch}, ArrowAstropy), {skymap: 'lsst_cells_v1', tract: 5063, patch: 34}, run='LSSTComCam/runs/DRP/DP1/v29_0_0/DM-50260/20250419T073356Z', id=53be93ed-96c6-4e4c-a45d-a7551a9f3a7f)]
Print the dataId
of the second element of the refs
.
for ref in refs:
print(ref.dataId)
{skymap: 'lsst_cells_v1', tract: 5063, patch: 34}
Define the columns to retrieve.
col_list = ['diaObjectId', 'coord_ra', 'coord_dec',
'band', 'psfDiffFlux', 'psfDiffFluxErr']
Get the data from the butler for the patch in the list of returned dataset refs.
results = butler.get(refs[0],
parameters={'columns': col_list})
print(len(results))
3428663
Option to display the results.
# results
Without a small constraint on the region as in Section 3.2.1, data for many more unique objects has been returned.
values = np.unique(results['diaObjectId'])
print(len(values))
del values
6344
Check that the transient is in the results table.
tx = np.where(results['diaObjectId'] == 611255759837069401)[0]
print(len(tx))
446
Print the same table of psfDiffFlux
statistics in Section 3.2.1.
for filt in filter_names:
fx = np.where(results['band'][tx] == filt)[0]
print('%s %3i %8.1f %8.1f %8.1f %8.1f' %
(filt, len(fx),
np.min(results['psfDiffFlux'][tx[fx]]),
np.mean(results['psfDiffFlux'][tx[fx]]),
np.max(results['psfDiffFlux'][tx[fx]]),
np.std(results['psfDiffFlux'][tx[fx]])))
u 22 -17794.6 -1184.0 1651.0 5280.5 g 121 -38815.1 -275.9 3379.5 4096.1 r 116 -38037.2 -1497.7 5476.7 5217.9 i 88 -9918.1 1432.2 9298.7 5791.1 z 87 -64061.1 -4087.0 5891.5 9607.9 y 12 -2178.1 -306.9 1851.9 1119.6
del ra, dec, refs, col_list
del results, tx, fx