201.3. ForcedSource table¶
Data Release: Data Preview 1
Container Size: large
LSST Science Pipelines version: r29.1.1
Last verified to run: 2025-06-21
Repository: github.com/lsst/tutorial-notebooks
Learning objective: To understand the contents of the ForcedSource table and how to access it.
LSST data products: ForcedSource
Packages: lsst.rsp, lsst.daf.butler
Credit: Originally developed by the Rubin Community Science team. Please consider acknowledging them if this notebook is used for the preparation of journal articles, software releases, or other notebooks.
Get Support: Everyone is encouraged to ask questions or raise issues in the Support Category of the Rubin Community Forum. Rubin staff will respond to all questions posted there.
1. Introduction¶
The ForcedSource table contains forced PSF measurements in individual visit_images and difference_images at the sky coordinates of all Objects.
Objects are the union set of all sources detected in the deep_coadd images or in the individual visit images (even those detected only once in one filter).
It is recommended to use the Forced Source and Difference Image Analysis (DIA) tables for light curves.
- TAP table name: dp1.ForcedSource
- Butler table name: object_forced_source
- columns: 28
- rows: 268,796,943
Related tutorials: The TAP and Butler data access services are demonstrated in the 100-level "How to" tutorials. There are 200-level tutorials on visit_images, difference_images, and the Object table.
1.1. Import packages¶
Import standard python packages re, numpy, matplotlib, and astropy.
From the lsst package, import modules for the TAP service, the Butler, and plotting.
import re
import numpy as np
import matplotlib.pyplot as plt
from lsst.rsp import get_tap_service
from lsst.daf.butler import Butler
from lsst.utils.plotting import (get_multiband_plot_colors,
get_multiband_plot_symbols,
get_multiband_plot_linestyles)
1.2. Define parameters and functions¶
Create an instance of the TAP service, and assert that it exists.
service = get_tap_service("tap")
assert service is not None
Create an instance of the Rubin data Butler, and assert that it exists.
butler = Butler('dp1', collections="LSSTComCam/DP1")
assert butler is not None
Define the colors and symbols to represent the LSST filters in plots.
filter_names = ['u', 'g', 'r', 'i', 'z', 'y']
filter_colors = get_multiband_plot_colors()
filter_symbols = get_multiband_plot_symbols()
filter_linestyles = get_multiband_plot_linestyles()
2. Schema (columns)¶
To browse the table schema visit the Rubin schema browser, or use the TAP service via the Portal Aspect or as demonstrated in Section 2.1.
2.1. Retrieve table schema¶
To retrieve the table schema, define a query for the schema columns of the ForcedSource table and run the query job.
query = "SELECT column_name, datatype, description, unit " \
"FROM tap_schema.columns " \
"WHERE table_name = 'dp1.ForcedSource'"
job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
if job.phase == 'ERROR':
job.raise_if_error()
Job phase is COMPLETED
Retrieve the query results and display them as an astropy table with the to_table() method.
assert job.phase == 'COMPLETED'
results = job.fetch_result().to_table()
results
column_name | datatype | description | unit |
---|---|---|---|
str64 | str64 | str512 | str64 |
band | char | Abstract filter that is not associated with a particular instrument | |
coord_dec | double | Fiducial ICRS Declination of centroid used for database indexing | deg |
coord_ra | double | Fiducial ICRS Right Ascension of centroid used for database indexing | deg |
detector | long | Id of the detector where this source was measured. Datatype short instead of byte because of DB concerns about unsigned bytes. | |
diff_PixelFlags_nodataCenter | boolean | Source center is outside usable region on image difference (masked NO_DATA) | |
invalidPsfFlag | boolean | Source has an invalid PSF. | |
objectId | long | Unique Object ID. Primary Key of the Object Table | |
parentObjectId | long | Unique ObjectId of the parent of the ObjectId in context of the deblender. | |
patch | long | Skymap patch ID | |
pixelFlags_bad | boolean | Bad pixel in the Source footprint | |
pixelFlags_cr | boolean | Cosmic ray in the Source footprint | |
pixelFlags_crCenter | boolean | Cosmic ray in the Source center | |
pixelFlags_edge | boolean | Source is on the edge of an exposure region (masked EDGE) | |
pixelFlags_interpolated | boolean | Interpolated pixel in the Source footprint | |
pixelFlags_interpolatedCenter | boolean | Interpolated pixel in the Source center | |
pixelFlags_nodata | boolean | Source is outside usable exposure region (masked NO_DATA) | |
pixelFlags_saturated | boolean | Saturated pixel in the Source footprint | |
pixelFlags_saturatedCenter | boolean | Saturated pixel in the Source center | |
pixelFlags_suspect | boolean | Sources footprint includes suspect pixels | |
pixelFlags_suspectCenter | boolean | Sources center is close to suspect pixels | |
psfDiffFlux | float | Flux derived from linear least-squares fit of psf model forced on the image difference | nJy |
psfDiffFlux_flag | boolean | Failure to derive linear least-squares fit of psf model forced on the image difference | |
psfDiffFluxErr | float | Uncertainty on the flux derived from linear least-squares fit of psf model forced on the image difference | nJy |
psfFlux | float | Flux derived from linear least-squares fit of psf model forced on the calexp | nJy |
psfFlux_flag | boolean | Failure to derive linear least-squares fit of psf model forced on the calexp | |
psfFluxErr | float | Uncertainty on the flux derived from linear least-squares fit of psf model forced on the calexp | nJy |
tract | long | Skymap tract ID | |
visit | long | Id of the visit where this source was measured. |
The table displayed above has been truncated.
Option to print every column name as a list.
# for col in results['column_name']:
# print(col)
Option to use the regular expressions package re to search for column names that contain the string stored in temp.
# temp = 'Err'
# temp = 'Flux'
# temp = 'psf'
# for col in results['column_name']:
# if re.search(temp, col):
# print(col)
Delete the job, but not the results.
del query
job.delete()
2.2.2. Coordinates¶
The sky coordinates in decimal degrees for the associated Object are held fixed during the forced measurements, and have no quoted uncertainty in the ForcedSource table.
coord_ra, coord_dec
2.2.3. Observation¶
The band, visit, and detector in which the ForcedSource was measured.
band, visit, detector
2.2.4. Photometry¶
A forced fit of the Point Spread Function (PSF) at the object's coordinates in each individual visit_image and difference_image.
psfFlux, psfFluxErr
psfDiffFlux, psfDiffFluxErr
Forced fluxes can be negative.
Forced PSF photometry on an image in which the background at the object's coordinates is slightly oversubtracted can yield a negative flux. Photometry on difference images can be negative as well, e.g., when the source is fainter than in the template.
It is not recommended to convert forced fluxes to magnitudes, but for positive fluxes, a flux $f$ in nanojanskys corresponds to an AB magnitude $m$ of:
$m = -2.5\log_{10}(f) + 31.4$
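The conversion above can be sketched as a small helper function. This is illustrative only (the function name and the first-order error-propagation step are not part of the LSST Science Pipelines):

```python
import numpy as np

def nanojansky_to_abmag(flux, flux_err=None):
    """Convert a positive flux in nJy to AB magnitude: m = -2.5*log10(f) + 31.4."""
    flux = np.asarray(flux, dtype=float)
    mag = -2.5 * np.log10(flux) + 31.4
    if flux_err is None:
        return mag
    # First-order error propagation: dm = (2.5 / ln 10) * (df / f)
    mag_err = 2.5 / np.log(10) * np.asarray(flux_err, dtype=float) / flux
    return mag, mag_err

m, me = nanojansky_to_abmag(1.0e5, 1.0e3)
print(m)  # 18.9 (to within floating-point rounding)
```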
2.2.5. Flags¶
A variety of flags indicating whether problem pixels (e.g., saturated or affected by cosmic rays) contributed to the source's measurements.
pixelFlags_*
2.3. Descriptions and units¶
For a subset of the key columns, show the table of their descriptions and units.
col_list = set(['objectId', 'coord_ra', 'coord_dec', 'band', 'visit', 'detector',
'psfFlux', 'psfFluxErr', 'psfDiffFlux', 'psfDiffFluxErr'])
tx = [i for i, item in enumerate(results['column_name']) if item in col_list]
results[tx]
column_name | datatype | description | unit |
---|---|---|---|
str64 | str64 | str512 | str64 |
band | char | Abstract filter that is not associated with a particular instrument | |
coord_dec | double | Fiducial ICRS Declination of centroid used for database indexing | deg |
coord_ra | double | Fiducial ICRS Right Ascension of centroid used for database indexing | deg |
detector | long | Id of the detector where this source was measured. Datatype short instead of byte because of DB concerns about unsigned bytes. | |
objectId | long | Unique Object ID. Primary Key of the Object Table | |
psfDiffFlux | float | Flux derived from linear least-squares fit of psf model forced on the image difference | nJy |
psfDiffFluxErr | float | Uncertainty on the flux derived from linear least-squares fit of psf model forced on the image difference | nJy |
psfFlux | float | Flux derived from linear least-squares fit of psf model forced on the calexp | nJy |
psfFluxErr | float | Uncertainty on the flux derived from linear least-squares fit of psf model forced on the calexp | nJy |
visit | long | Id of the visit where this source was measured. |
Clean up.
del col_list, tx, results
3. Data access¶
The ForcedSource
table is available via the TAP service and the Butler.
Recommended access method: TAP.
3.1. Advisory: avoid full-table queries¶
Avoid full-table queries. Always include a constraint on the coordinates, objectId, or visit.
The ForcedSource table is a large, inclusive union set of forced measurements made in all the visit and difference images at the locations of all Objects.
Furthermore, skipping spatial constraints is not a good habit to form, because future data release ForcedSource tables will contain trillions of rows.
3.2. TAP (Table Access Protocol)¶
The ForcedSource table is stored in Qserv and accessible via the TAP service using ADQL queries.
Include spatial constraints:
Qserv stores catalog data sharded by coordinate (RA, Dec), so ADQL queries that include coordinate constraints do not require a whole-catalog search and are typically faster (and can be much faster) than queries that only constrain other columns.
Use an ADQL cone or polygon search for faster queries (do not use WHERE ... BETWEEN statements to set boundaries on RA and Dec).
3.2.1. Demo query¶
Define a query to return the ten "key columns" from Section 2.3.
Use the coordinates of a variable star discovered in LSSTComCam data (Carlin et al. 2025, in prep.):
- RA = 94.9226329830
- Dec = -25.2318482104
Impose spatial constraints: search within a 2 arcsecond radius of the star's coordinates.
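The radius passed to CIRCLE() must be in decimal degrees; the 0.00056 used in the query is simply 2 arcseconds converted to degrees, as a quick check shows:

```python
# Convert a 2-arcsecond search radius to decimal degrees for ADQL CIRCLE()
radius_arcsec = 2.0
radius_deg = radius_arcsec / 3600.0
print(round(radius_deg, 5))  # 0.00056
```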
query = "SELECT objectId, coord_ra, coord_dec, band, visit, detector, " \
"psfFlux, psfFluxErr, psfDiffFlux, psfDiffFluxErr " \
"FROM dp1.ForcedSource " \
"WHERE CONTAINS(POINT('ICRS', coord_ra, coord_dec), " \
"CIRCLE('ICRS', 94.9226329830, -25.2318482104, 0.00056)) = 1 " \
"ORDER BY objectId ASC "
job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
if job.phase == 'ERROR':
job.raise_if_error()
Job phase is COMPLETED
Fetch the results as an astropy table.
assert job.phase == 'COMPLETED'
results = job.fetch_result().to_table()
print(len(results))
219
Option to display the table.
# results
Show that all of the ForcedSource photometry that was returned is associated with only one Object.
values, counts = np.unique(results['objectId'], return_counts=True)
assert len(values) == 1
for value, count in zip(values, counts):
print(value, count)
del values, counts
614435753623041404 219
As an example, calculate and print the minimum, mean, maximum, and standard deviation of the PSF flux in the visit_images, per filter.
for filt in filter_names:
fx = np.where(results['band'] == filt)[0]
print('%s %3i %8.1f %8.1f %8.1f %8.1f' %
(filt, len(fx),
np.min(results['psfFlux'][fx]), np.mean(results['psfFlux'][fx]),
np.max(results['psfFlux'][fx]), np.std(results['psfFlux'][fx])))
u 20 43299.7 49598.4 74063.9 7876.5
g 64 363.7 109666.4 162934.0 29583.2
r 68 485.2 114955.3 147708.0 18085.7
i 17 111610.0 122977.4 133661.0 8460.0
z 42 -361.3 100899.1 133390.0 24700.9
y 8 111474.0 114524.8 116633.0 1416.7
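Per-band statistics like these can feed a simple variability check. Below is a minimal sketch (not a pipeline product) of a reduced chi-square test about the inverse-variance weighted mean flux, run here on synthetic data standing in for the psfFlux and psfFluxErr columns:

```python
import numpy as np

def reduced_chi2(flux, flux_err):
    """Reduced chi-square of fluxes about their inverse-variance weighted mean.
    Values well above 1 suggest variability beyond the quoted uncertainties."""
    flux = np.asarray(flux, dtype=float)
    err = np.asarray(flux_err, dtype=float)
    weights = 1.0 / err**2
    wmean = np.sum(weights * flux) / np.sum(weights)
    chi2 = np.sum(((flux - wmean) / err) ** 2)
    return chi2 / (len(flux) - 1)

# Synthetic stand-in for one band: a constant source with noise at the quoted error level
rng = np.random.default_rng(42)
errs = np.full(50, 100.0)
fluxes = 1.0e5 + rng.normal(0.0, 100.0, size=50)
print(reduced_chi2(fluxes, errs))  # near 1 for a non-variable source
```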
Clean up.
job.delete()
del query, results
3.2.2. Joinable tables¶
The ForcedSource table can be joined to:
- the Visit table on column visit
- the Object table on column objectId
The Visit table contains information about the observation, such as the date and time.
The Object table contains measurements made on the deep_coadd image.
To plot a light curve, join the ForcedSource and Visit tables.
To the query used in Section 3.2.1, add a table join to the Visit table and retrieve the column expMidptMJD.
Change the WHERE statement to select all rows of the ForcedSource catalog for the objectId identified in the previous section.
This query would work the same way with the CONTAINS(POINT(), CIRCLE()) statement instead.
query = "SELECT fs.objectId, fs.coord_ra, fs.coord_dec, fs.band, fs.visit, fs.detector, " \
"fs.psfFlux, fs.psfFluxErr, fs.psfDiffFlux, fs.psfDiffFluxErr, v.expMidptMJD, " \
"o.u_psfFlux, o.g_psfFlux, o.r_psfFlux, o.i_psfFlux, o.z_psfFlux, o.y_psfFlux " \
"FROM dp1.ForcedSource AS fs " \
"JOIN dp1.Visit AS v ON fs.visit = v.visit " \
"JOIN dp1.Object AS o ON o.objectId = fs.objectId " \
"WHERE o.objectId = 614435753623041404 " \
"ORDER BY v.expMidptMJD ASC "
job = service.submit_job(query)
job.run()
job.wait(phases=['COMPLETED', 'ERROR'])
print('Job phase is', job.phase)
if job.phase == 'ERROR':
job.raise_if_error()
Job phase is COMPLETED
Fetch the results as an astropy table.
assert job.phase == 'COMPLETED'
results = job.fetch_result().to_table()
print(len(results))
219
Option to display the results table.
# results
Plot the light curve for this object, as measured in the visit_images and difference_images.
fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(10, 4))
for filt in filter_names:
fx = np.where(results['band'] == filt)[0]
if len(fx) > 0:
ax1.axhline(results[filt+'_psfFlux'][fx[0]], color=filter_colors[filt],
ls=filter_linestyles[filt], alpha=0.2, label=filt)
ax1.plot(results['expMidptMJD'][fx],
results['psfFlux'][fx],
filter_symbols[filt], ms=5, mew=0, alpha=0.7,
color=filter_colors[filt])
ax2.plot(results['expMidptMJD'][fx],
results['psfDiffFlux'][fx],
filter_symbols[filt], ms=5, mew=0, alpha=0.7,
color=filter_colors[filt], label=filt)
ax1.set_xlabel('MJD')
ax1.set_ylabel('PSF Flux')
ax1.legend(loc='lower left', ncol=2)
ax2.set_xlabel('MJD')
ax2.set_ylabel('PSF Diff Flux')
ax2.legend(loc='lower left', ncol=2)
plt.tight_layout()
plt.show()
Figure 1: At left, the PSF forced photometry light curve on the visit_images for the object. Horizontal lines mark the object's flux in the deep_coadd image. At right, the PSF forced photometry light curve on the difference_images for the object.
Clean up.
job.delete()
del query, results
3.4. Butler¶
TAP is the recommended way to access the ForcedSource table, but the Butler can also be used.
Show that the dimensions for the object_forced_source table are the skymap's tract and patch, and that both are required.
butler.get_dataset_type('object_forced_source')
DatasetType('object_forced_source', {skymap, tract, patch}, ArrowAstropy)
butler.get_dataset_type('object_forced_source').dimensions.required
{skymap, tract, patch}
3.4.1. Demo query¶
Retrieve all dataset_refs for object_forced_source tables for patches that overlap the coordinates that were also used in Section 3.2.
ra = 94.9226329830
dec = -25.2318482104
refs = butler.query_datasets("object_forced_source",
where="patch.region OVERLAPS POINT(:ra, :dec)",
bind={"ra": ra, "dec": dec})
Print the number of dataset references returned.
print(len(refs))
3
Print the dataIds of the refs.
for ref in refs:
print(ref.dataId)
{skymap: 'lsst_cells_v1', tract: 5306, patch: 97}
{skymap: 'lsst_cells_v1', tract: 5525, patch: 0}
{skymap: 'lsst_cells_v1', tract: 5526, patch: 9}
Define the columns to retrieve.
col_list = ['objectId', 'coord_ra', 'coord_dec',
'band', 'psfFlux', 'psfFluxErr']
Get the data from the Butler for the third patch in the list of returned dataset refs.
results = butler.get(refs[2],
parameters={'columns': col_list})
print(len(results))
357619
Option to display the results.
# results
Without a small spatial constraint on the region, as in Section 3.2.1, data for many more unique objects are returned.
values = np.unique(results['objectId'])
print(len(values))
del values
1830
Check that the variable star object is in the results table.
tx = np.where(results['objectId'] == 614435753623041404)[0]
print(len(tx))
219
Print the same table of psfFlux statistics as in Section 3.2.1.
for filt in filter_names:
fx = np.where(results['band'][tx] == filt)[0]
print('%s %3i %8.1f %8.1f %8.1f %8.1f' %
(filt, len(fx),
np.min(results['psfFlux'][tx[fx]]),
np.mean(results['psfFlux'][tx[fx]]),
np.max(results['psfFlux'][tx[fx]]),
np.std(results['psfFlux'][tx[fx]])))
u 20 43299.7 49598.4 74063.9 7876.5
g 64 363.7 109666.3 162934.2 29583.2
r 68 485.2 114955.3 147707.7 18085.7
i 17 111610.2 122977.3 133661.2 8460.0
z 42 -361.3 100899.1 133390.0 24700.9
y 8 111474.4 114524.8 116633.3 1416.7
del ra, dec, refs, col_list
del results, tx, fx