102.5. Rubin data access with LSDB¶
For the Rubin Science Platform at data.lsst.cloud.
Data Release: Data Preview 1
Container Size: Large
LSST Science Pipelines version: Release r29.2.0
Last verified to run: 2025-09-19
Repository: github.com/lsst/tutorial-notebooks
Learning objective: How to access Rubin data in LSDB format.
LSST data products: Object, DiaObject, DiaSource, ForcedSource, and ForcedSourceOnDiaObject
Packages: lsdb
Credit: Originally developed by Andrés A. Plazas Malagón, Melissa Graham, and the Rubin Community Science team with input from Neven Caplar and Tianqing Zhang. Please consider acknowledging them if this notebook is used for the preparation of journal articles, software releases, or other notebooks.
Get Support: Everyone is encouraged to ask questions or raise issues in the Support Category of the Rubin Community Forum. Rubin staff will respond to all questions posted there.
1. Introduction¶
LSDB (Large Survey DataBase) is an open-source Python framework that enables fast all-sky cross-matching, bulk application of user-defined functions, and simplified analysis of time-domain (light curve) data. It is built on top of Dask and operates on catalogs stored in the HATS (Hierarchical Adaptive Tiling Scheme) format, a spatially indexed layout of HEALPix-sharded Parquet files that makes spatial operations efficient.
LSDB also hosts many other datasets stored in the HATS format, beyond the DP1 catalogs shown in this tutorial, but only the DP1 dataset is available in the Rubin Science Platform (RSP). Find the full list of LSDB-hosted datasets at data.lsdb.io.
Note: This notebook is intended only as a simple tutorial on LSDB DP1 catalogs. For more detailed examples and advanced use cases, see the full set of LSDB tutorials.
References:
- Descriptions of LSDB-formatted Data Preview 1 (DP1) data: https://data.lsdb.io/
- LSDB documentation: docs.lsdb.io
- Working with Rubin Data using LSDB
- LSDB hackathon at the Rubin Community Workshop 2025
Related tutorials: The 200-level tutorials on the Object, DiaObject, DiaSource, ForcedSource, and ForcedSourceOnDiaObject catalogs. The 300-level tutorial on how to access photometric redshifts in LSDB-formatted files.
1.1. Import packages¶
Import the LSDB package to work with LSDB-formatted files, upath for handling local and remote file paths uniformly, and matplotlib.pyplot for visualization.
import lsdb
import astropy.units as u
from astropy.coordinates import SkyCoord
from upath import UPath
import matplotlib.pyplot as plt
from lsst.utils.plotting import (get_multiband_plot_colors,
                                 get_multiband_plot_symbols)
Define the filter names, colors, and symbols to use when plotting.
filter_names = ['u', 'g', 'r', 'i', 'z', 'y']
filter_colors = get_multiband_plot_colors()
filter_symbols = get_multiband_plot_symbols()
Set the base path to the LSDB-formatted DP1 data in the RSP.
base_path = UPath("/rubin/lsdb_data")
2. Access the LSDB DP1 catalogs¶
The four LSDB DP1 read-only catalogs available in the Rubin Science Platform at data.lsst.cloud are located in the directory /rubin/lsdb_data, and their names are:
- object_collection: the Object table
- object_collection_lite: a limited number of columns from the Object table
- dia_object_collection: the DiaObject table
- object_photoz: photometric redshift (photo-z) estimates for galaxies
Note that the object_photoz catalog was not part of the DP1 release (see Section 2.4, below).
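As a minimal sketch of how the catalog locations are built from the base directory, the block below joins each catalog name to the base path with the "/" operator. It uses the standard-library PurePosixPath as a stand-in for UPath (an assumption made only so the sketch runs without the upath package or access to the RSP file system); the variable names catalog_names and catalog_paths are illustrative.

```python
from pathlib import PurePosixPath

# Stand-in for UPath("/rubin/lsdb_data"); PurePosixPath is used here only so
# the sketch runs without the upath package or access to the RSP file system.
base_path = PurePosixPath("/rubin/lsdb_data")

# Catalog directory names as listed above.
catalog_names = [
    "object_collection",
    "object_collection_lite",
    "dia_object_collection",
    "object_photoz",
]

# The "/" operator joins path components.
catalog_paths = {name: base_path / name for name in catalog_names}

for name, path in catalog_paths.items():
    print(f"{name:24s} -> {path}")
```

In the notebook itself, the same "/" expression is passed directly to lsdb.open_catalog.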
2.1. object_collection¶
This LSDB-formatted file is the same as the DP1 Object table but with additional columns, <f>_psfMag and <f>_psfMagErr, which are the corresponding <f>_psfFlux columns converted to magnitudes (for each filter, <f>, in $ugrizy$).
Schema browser for the DP1 Object table.
Nested columns also have additional columns such as psfMag and psfMagErr (see Section 2.1.5 below).
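The flux-to-magnitude conversion can be sketched as follows. This assumes the fluxes are in nanojanskys and that the standard AB relation with a zero point of 31.4 mag applies; the exact procedure used to build the LSDB catalog is not stated in this tutorial, and the function names are illustrative. The input values are the g-band flux and flux error of the first row displayed in Section 2.1.3.

```python
import math

AB_ZEROPOINT_NJY = 31.4  # assumed AB magnitude of a 1 nJy source


def flux_to_mag(flux_njy):
    """Convert a positive flux in nJy to an AB magnitude."""
    return -2.5 * math.log10(flux_njy) + AB_ZEROPOINT_NJY


def flux_err_to_mag_err(flux_njy, flux_err_njy):
    """First-order propagation of a flux error to a magnitude error."""
    return (2.5 / math.log(10)) * (flux_err_njy / flux_njy)


# g-band values from the first row displayed in Section 2.1.3.
g_psfFlux, g_psfFluxErr = 1744349.5, 228.920868
g_psfMag = flux_to_mag(g_psfFlux)
g_psfMagErr = flux_err_to_mag_err(g_psfFlux, g_psfFluxErr)
print(f"{g_psfMag:.6f} {g_psfMagErr:.6f}")
```

Under these assumptions the results agree with the tabulated g_psfMag and g_psfMagErr for that row (15.795917 and 0.000142) to within rounding.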
2.1.1. Load and display the catalog¶
object_cat = lsdb.open_catalog(base_path / "object_collection")
object_cat
coord_dec | coord_decErr | coord_ra | coord_raErr | g_psfFlux | g_psfFluxErr | g_psfMag | g_psfMagErr | i_psfFlux | i_psfFluxErr | i_psfMag | i_psfMagErr | objectId | patch | r_psfFlux | r_psfFluxErr | r_psfMag | r_psfMagErr | refBand | refFwhm | shape_flag | shape_xx | shape_xy | shape_yy | tract | u_psfFlux | u_psfFluxErr | u_psfMag | u_psfMagErr | x | xErr | y | y_psfFlux | y_psfFluxErr | y_psfMag | y_psfMagErr | yErr | z_psfFlux | z_psfFluxErr | z_psfMag | z_psfMagErr | objectForcedSource | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
npartitions=389 | ||||||||||||||||||||||||||||||||||||||||||
Order: 6, Pixel: 130 | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | int64[pyarrow] | int64[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | string[pyarrow] | float[pyarrow] | bool[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | int64[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | nested<coord_ra: [double], coord_dec: [double]... |
Order: 8, Pixel: 2176 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Order: 9, Pixel: 2302101 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Order: 7, Pixel: 143884 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Lazily loaded catalogs: note the message under the displayed table above, that all of the columns have been loaded "lazily". This is always the default for LSDB catalogs, and it means that only the metadata is loaded at first. This way, LSDB can plan how tasks will be executed in the future without actually doing any computation.
Order is the HEALPix resolution, and Pixel is the HEALPix index of the specific sky patch.
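To give a feel for these two numbers, the sketch below applies the standard HEALPix relations (nside = 2^order, 12 × nside² equal-area pixels over the full sky) to the partitions displayed above. The helper names are illustrative, and the code only does arithmetic; it does not read the catalog.

```python
import math

FULL_SKY_DEG2 = 4 * math.pi * (180 / math.pi) ** 2  # about 41253 square degrees


def healpix_npix(order):
    """Number of equal-area HEALPix pixels at a given order (nside = 2**order)."""
    nside = 2 ** order
    return 12 * nside ** 2


def healpix_pixel_area_deg2(order):
    """Area of one HEALPix pixel, in square degrees."""
    return FULL_SKY_DEG2 / healpix_npix(order)


# (Order, Pixel) pairs copied from the partition index displayed above.
partitions = [(6, 130), (8, 2176), (9, 2302101), (7, 143884)]

for order, pixel in partitions:
    npix = healpix_npix(order)
    area = healpix_pixel_area_deg2(order)
    print(f"Order {order}: pixel {pixel} of {npix}, about {area:.4f} deg^2 each")
```

Each step up in order splits a pixel into four, so the mix of orders in one catalog reflects partitions of different sky area.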
Optional: uncomment the following cell and press "tab" to browse the available methods on an LSDB catalog like object_cat.
# object_cat.
2.1.2. Show column names¶
Display the subset of 42 columns that are lazily loaded by default.
object_cat.columns
Index(['coord_dec', 'coord_decErr', 'coord_ra', 'coord_raErr', 'g_psfFlux', 'g_psfFluxErr', 'g_psfMag', 'g_psfMagErr', 'i_psfFlux', 'i_psfFluxErr', 'i_psfMag', 'i_psfMagErr', 'objectId', 'patch', 'r_psfFlux', 'r_psfFluxErr', 'r_psfMag', 'r_psfMagErr', 'refBand', 'refFwhm', 'shape_flag', 'shape_xx', 'shape_xy', 'shape_yy', 'tract', 'u_psfFlux', 'u_psfFluxErr', 'u_psfMag', 'u_psfMagErr', 'x', 'xErr', 'y', 'y_psfFlux', 'y_psfFluxErr', 'y_psfMag', 'y_psfMagErr', 'yErr', 'z_psfFlux', 'z_psfFluxErr', 'z_psfMag', 'z_psfMagErr', 'objectForcedSource'], dtype='object')
Optional: uncomment the cell below to display the names of a larger subset of the 1304 columns from the Object catalog.
# object_cat.all_columns
Search for column names that contain a string, such as psfMag (i.e., the columns that contain the PSF fluxes converted to magnitudes).
search_string = 'psfMag'
for col in object_cat.all_columns:
    # "in" also matches names that start with the search string
    if search_string in col:
        print(col)
u_psfMag u_psfMagErr g_psfMag g_psfMagErr r_psfMag r_psfMagErr i_psfMag i_psfMagErr z_psfMag z_psfMagErr y_psfMag y_psfMagErr
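The same kind of name search can be written as a list comprehension, and a wildcard pattern narrows it further. The sketch below runs on a short hard-coded list of names taken from the default columns in this section, so it does not need the catalog; fnmatchcase comes from the Python standard library.

```python
from fnmatch import fnmatchcase

# A few of the default column names listed in this section.
columns = ['coord_dec', 'coord_ra', 'g_psfFlux', 'g_psfFluxErr',
           'g_psfMag', 'g_psfMagErr', 'r_psfFlux', 'r_psfMag',
           'r_psfMagErr', 'objectId', 'objectForcedSource']

# Substring match: every name that contains 'psfMag' anywhere.
mag_columns = [col for col in columns if 'psfMag' in col]

# Case-sensitive wildcard match: only the magnitude error columns.
mag_err_columns = [col for col in columns if fnmatchcase(col, '*_psfMagErr')]

print(mag_columns)
print(mag_err_columns)
```

A list built this way could then be passed as the columns argument when opening a catalog.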
Load only selected columns by passing a list of column names to the columns argument.
use_columns = ['coord_dec', 'coord_decErr', 'coord_ra', 'coord_raErr',
               'g_psfFlux', 'g_psfFluxErr', 'g_psfMag', 'g_psfMagErr']
object_cat_selected_columns = lsdb.open_catalog(base_path / "object_collection",
                                                columns=use_columns)
object_cat_selected_columns.columns
Index(['coord_dec', 'coord_decErr', 'coord_ra', 'coord_raErr', 'g_psfFlux', 'g_psfFluxErr', 'g_psfMag', 'g_psfMagErr'], dtype='object')
2.1.3. Execute a cone search¶
Cone searches are supported and defined by a center (ra, dec), in degrees, and a radius r, in arcseconds.
Execute a cone search on the object catalog using the coordinates (in degrees) of the Extended Chandra Deep Field South DP1 target field, with a radius of 0.1 deg.
ra_ecdfs = 53.16
dec_ecdfs = -28.10
object_cat_ecdfs = object_cat.cone_search(ra=ra_ecdfs, dec=dec_ecdfs,
                                          radius_arcsec=0.1 * 3600.0)
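What the cone search tests for each object can be sketched with a plain great-circle separation. This is an illustration of the geometry only, not LSDB's implementation; the function name is hypothetical. The test position is the first object returned by the cone search in this section.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2)
    sin_dra = math.sin((ra2 - ra1) / 2)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2 * math.asin(math.sqrt(a))) * 3600.0


ra_ecdfs, dec_ecdfs = 53.16, -28.10
radius_arcsec = 0.1 * 3600.0  # 0.1 deg expressed in arcseconds

# Coordinates of the first object shown by object_cat_ecdfs.head(3).
obj_ra, obj_dec = 53.187210, -28.196205

sep = angular_separation_arcsec(ra_ecdfs, dec_ecdfs, obj_ra, obj_dec)
inside = sep <= radius_arcsec
print(f"separation = {sep:.1f} arcsec, inside cone: {inside}")
```

The object lies just inside the 360-arcsecond radius, as expected for a row returned by the search.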
This table contains only 8 partitions, compared to the 389 described in Section 2.1.1, due to the 0.1-degree spatial restriction.
object_cat_ecdfs
coord_dec | coord_decErr | coord_ra | coord_raErr | g_psfFlux | g_psfFluxErr | g_psfMag | g_psfMagErr | i_psfFlux | i_psfFluxErr | i_psfMag | i_psfMagErr | objectId | patch | r_psfFlux | r_psfFluxErr | r_psfMag | r_psfMagErr | refBand | refFwhm | shape_flag | shape_xx | shape_xy | shape_yy | tract | u_psfFlux | u_psfFluxErr | u_psfMag | u_psfMagErr | x | xErr | y | y_psfFlux | y_psfFluxErr | y_psfMag | y_psfMagErr | yErr | z_psfFlux | z_psfFluxErr | z_psfMag | z_psfMagErr | objectForcedSource | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
npartitions=8 | ||||||||||||||||||||||||||||||||||||||||||
Order: 9, Pixel: 2299851 | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | int64[pyarrow] | int64[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | string[pyarrow] | float[pyarrow] | bool[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | int64[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | nested<coord_ra: [double], coord_dec: [double]... |
Order: 9, Pixel: 2299854 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Order: 9, Pixel: 2299876 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Order: 9, Pixel: 2299878 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
object_cat_ecdfs.head(3)
coord_dec | coord_decErr | coord_ra | coord_raErr | g_psfFlux | g_psfFluxErr | g_psfMag | g_psfMagErr | i_psfFlux | i_psfFluxErr | i_psfMag | i_psfMagErr | objectId | patch | r_psfFlux | r_psfFluxErr | r_psfMag | r_psfMagErr | refBand | refFwhm | shape_flag | shape_xx | shape_xy | shape_yy | tract | u_psfFlux | u_psfFluxErr | u_psfMag | u_psfMagErr | x | xErr | y | y_psfFlux | y_psfFluxErr | y_psfMag | y_psfMagErr | yErr | z_psfFlux | z_psfFluxErr | z_psfMag | z_psfMagErr | objectForcedSource | ||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2528713666757650643 | -28.196205 | 0.000000 | 53.187210 | 0.000000 | 1744349.500000 | 228.920868 | 15.795917 | 0.000142 | 1685627.000000 | 1071.237427 | 15.833097 | 0.000690 | 611253698252788430 | 4 | 2062061.750000 | 894.229187 | 15.614245 | 0.000471 | i | 0.807527 | True | 8.869386 | 0.395710 | 9.041536 | 5063 | 923507.875000 | 272.120880 | 16.486399 | 0.000320 | 13418.000000 | 0.005167 | 2838.000000 | 2806465.750000 | 591.650208 | 15.279601 | 0.000229 | 0.007965 | 2272363.500000 | 496.028137 | 15.508805 | 0.000237 |
|
|||||||||||||||
2528713667097101852 | -28.194650 | 0.001008 | 53.187209 | 0.000140 | 1898.243652 | 8.923483 | 23.204121 | 0.005104 | 2675.489746 | 19.397362 | 22.831491 | 0.007872 | 611253698252788459 | 4 | 2538.645508 | 11.373549 | 22.888494 | 0.004864 | i | 0.805709 | True | 5063 | 748.425598 | 61.777092 | 24.214628 | 0.089824 | 13418.000000 | 2.865658 | 2866.000000 | 15297.465820 | 276.856079 | 20.938452 | 0.019652 | 18.149639 | 3072.677002 | 40.690876 | 22.681208 | 0.014379 |
|
||||||||||||||||||
2528713668080668800 | -28.192582 | 0.000041 | 53.190902 | 0.000027 | 182.774277 | 8.643060 | 25.745213 | 0.051381 | 218.402191 | 18.898153 | 25.551857 | 0.094183 | 611253698252788655 | 4 | 214.599091 | 11.055686 | 25.570930 | 0.055984 | i | 0.807146 | True | 5063 | 394.112671 | 62.504913 | 24.910950 | 0.173660 | 13359.366271 | 0.553583 | 2903.177309 | 233.969193 | 264.759735 | 25.477104 | 0.732595 | 262.392517 | 40.650154 | 25.352621 | 0.169569 |
|
object_cat_ecdfs.columns
Index(['coord_dec', 'coord_decErr', 'coord_ra', 'coord_raErr', 'g_psfFlux', 'g_psfFluxErr', 'g_psfMag', 'g_psfMagErr', 'i_psfFlux', 'i_psfFluxErr', 'i_psfMag', 'i_psfMagErr', 'objectId', 'patch', 'r_psfFlux', 'r_psfFluxErr', 'r_psfMag', 'r_psfMagErr', 'refBand', 'refFwhm', 'shape_flag', 'shape_xx', 'shape_xy', 'shape_yy', 'tract', 'u_psfFlux', 'u_psfFluxErr', 'u_psfMag', 'u_psfMagErr', 'x', 'xErr', 'y', 'y_psfFlux', 'y_psfFluxErr', 'y_psfMag', 'y_psfMagErr', 'yErr', 'z_psfFlux', 'z_psfFluxErr', 'z_psfMag', 'z_psfMagErr', 'objectForcedSource'], dtype='object')
Visualize the object distribution in the region.
plt.figure(figsize=(7, 5))
plt.hist2d(object_cat_ecdfs['coord_ra'], object_cat_ecdfs['coord_dec'],
           bins=200, cmap='viridis')
plt.colorbar(label="Number of Objects")
plt.xlabel("Right Ascension [deg]")
plt.ylabel("Declination [deg]")
plt.title("Sky Distribution of Objects")
plt.show()
Figure 1: A 2-dimensional distribution (heatmap) of the number of objects across the sky, as returned by the cone search centered on the ECDFS field.
2.1.4. Execute a query on column values¶
It is possible to filter LSDB catalogs using the .query() method.
The query expression is written as a string and follows the same syntax as Pandas .query(), which supports a subset of Python expressions for filtering DataFrames.
Select only objects with an $r$-band PSF magnitude between 16 and 24 mag.
object_cat_mag_range = object_cat.query("r_psfMag > 16 and r_psfMag < 24")
object_cat_mag_range
coord_dec | coord_decErr | coord_ra | coord_raErr | g_psfFlux | g_psfFluxErr | g_psfMag | g_psfMagErr | i_psfFlux | i_psfFluxErr | i_psfMag | i_psfMagErr | objectId | patch | r_psfFlux | r_psfFluxErr | r_psfMag | r_psfMagErr | refBand | refFwhm | shape_flag | shape_xx | shape_xy | shape_yy | tract | u_psfFlux | u_psfFluxErr | u_psfMag | u_psfMagErr | x | xErr | y | y_psfFlux | y_psfFluxErr | y_psfMag | y_psfMagErr | yErr | z_psfFlux | z_psfFluxErr | z_psfMag | z_psfMagErr | objectForcedSource | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
npartitions=389 | ||||||||||||||||||||||||||||||||||||||||||
Order: 6, Pixel: 130 | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | int64[pyarrow] | int64[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | string[pyarrow] | float[pyarrow] | bool[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | int64[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | nested<coord_ra: [double], coord_dec: [double]... |
Order: 8, Pixel: 2176 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Order: 9, Pixel: 2302101 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Order: 7, Pixel: 143884 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Use the .head() method to quickly inspect a few rows to check that the query worked as expected.
object_cat_mag_range.head(3)
coord_dec | coord_decErr | coord_ra | coord_raErr | g_psfFlux | g_psfFluxErr | g_psfMag | g_psfMagErr | i_psfFlux | i_psfFluxErr | i_psfMag | i_psfMagErr | objectId | patch | r_psfFlux | r_psfFluxErr | r_psfMag | r_psfMagErr | refBand | refFwhm | shape_flag | shape_xx | shape_xy | shape_yy | tract | u_psfFlux | u_psfFluxErr | u_psfMag | u_psfMagErr | x | xErr | y | y_psfFlux | y_psfFluxErr | y_psfMag | y_psfMagErr | yErr | z_psfFlux | z_psfFluxErr | z_psfMag | z_psfMagErr | objectForcedSource | ||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
9195875808926578 | 6.055018 | 0.000014 | 38.112638 | 0.000027 | 1274.516846 | 97.626991 | 23.636637 | 0.083330 | 648370118430033584 | 19 | 2462.434326 | 179.295883 | 22.921589 | 0.079195 | r | 0.843185 | False | 6.325832 | 0.783537 | 2.350819 | 10464 | 27987.290525 | 0.492506 | 3501.649349 | 0.253183 |
|
|||||||||||||||||||||||||||||||
9195883532105479 | 6.060315 | 0.000014 | 38.113882 | 0.000016 | 89.467979 | 75.252007 | 26.520830 | 1.329962 | 648370118430033588 | 19 | 1707.087524 | 174.774734 | 23.319361 | 0.111551 | r | 0.842531 | False | 16.577915 | 8.706131 | 22.551805 | 10464 | 27964.873429 | 0.288372 | 3596.985200 | 0.253343 |
|
|||||||||||||||||||||||||||||||
9195884147047980 | 6.059977 | 0.000077 | 38.110133 | 0.000113 | 550.610413 | 76.316589 | 24.547890 | 0.151462 | 648370118430033589 | 19 | 1112.894287 | 175.488525 | 23.783865 | 0.172647 | r | 0.843951 | True | 7.729987 | 6.766598 | 17.503637 | 10464 | 28032.000000 | 2.040944 | 3591.000000 | 1.393253 |
|
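As a rough illustration of what the magnitude cut in this section selects, the sketch below converts the two magnitude limits to fluxes and applies the same condition to a few values in plain Python. It assumes fluxes in nanojanskys and an AB zero point of 31.4 mag; the helper names are hypothetical, and the catalog's own handling of missing values is not verified here.

```python
import math

AB_ZEROPOINT_NJY = 31.4  # assumed zero point for fluxes in nJy


def mag_to_flux(mag):
    """Convert an AB magnitude to a flux in nJy."""
    return 10 ** ((AB_ZEROPOINT_NJY - mag) / 2.5)


def in_mag_range(mag, lo=16.0, hi=24.0):
    """Same condition as the query string: r_psfMag > 16 and r_psfMag < 24."""
    return mag > lo and mag < hi


# Brighter objects have smaller magnitudes, so the bounds swap in flux space.
flux_min, flux_max = mag_to_flux(24.0), mag_to_flux(16.0)

# r_psfMag values from rows displayed in this tutorial, plus a missing value.
r_mags = [22.921589, 15.614245, float('nan')]
selected = [m for m in r_mags if in_mag_range(m)]
print(f"flux range: {flux_min:.0f} to {flux_max:.0f} nJy; selected {selected}")
```

In plain Python a NaN fails both comparisons, so an object without an r-band magnitude is dropped by this condition.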
2.1.5. Access nested light curves¶
Some LSDB catalogs have "nested" columns. These are columns which, instead of containing an array of data, contain a table.
The LSDB documentation contains more information on working with nested columns and time series data in LSDB format.
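The idea of a nested column can be pictured with plain Python containers: one row of the catalog holds ordinary scalar values plus one entry that is itself a small table. The sketch below uses made-up values and a dict of equal-length lists as a stand-in for the nested table; it is a conceptual illustration, not how LSDB stores the data. The field names band, midpointMjdTai, and psfMag are the ones used later in this section.

```python
# Hypothetical, made-up values: one catalog row whose "objectForcedSource"
# entry is itself a small table (stored here as a dict of equal-length lists).
row = {
    'objectId': 1,
    'coord_ra': 53.0,
    'coord_dec': -28.0,
    'objectForcedSource': {
        'band': ['g', 'r', 'g', 'i'],
        'midpointMjdTai': [60623.1, 60623.2, 60625.1, 60626.3],
        'psfMag': [21.3, 20.9, 21.4, 20.7],
    },
}

light_curve = row['objectForcedSource']

# Select the epochs in a single band, as a boolean mask would.
g_epochs = [
    (mjd, mag)
    for band, mjd, mag in zip(light_curve['band'],
                              light_curve['midpointMjdTai'],
                              light_curve['psfMag'])
    if band == 'g'
]
print(g_epochs)
```

Keeping the light curve inside the object's row means the per-epoch measurements travel with the object through cone searches and queries.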
In the object_collection catalog, forced photometry is available in a nested column named objectForcedSource.
The fields in objectForcedSource are a subset of the DP1 ForcedSource table columns, plus two additional columns: psfMag and psfMagErr, the psfFlux and psfFluxErr columns converted to magnitudes.
Schema browser for the DP1 ForcedSource table.
Discover which columns are nested.
object_cat.nested_columns
['objectForcedSource']
Option to display the fields in the nested column.
# object_cat["objectForcedSource"].nest.fields
Extract and plot a light curve for a random object¶
Select a random object from the ECDFS field by its objectId.
The ECDFS field had many visits over multiple weeks, so the forced photometry light curve for any random object in ECDFS will have many light curve points.
random_object = object_cat.query("objectId == 611253698252788430")
Option to show the row of the object_cat for this random object.
# random_object.head(1)
Extract just the objectForcedSource for this random object, use the compute method to convert it to a Pandas DataFrame (df), and extract the light curve (lc).
random_object_fs = random_object['objectForcedSource']
random_object_df = random_object_fs.compute()
random_object_lc = random_object_df.iloc[0]
Option to display the light curve as a table.
# random_object_lc
Plot the forced photometry light curve for this random object.
fig = plt.figure(figsize=(6, 4))
for filt in filter_names:
    # boolean mask selecting the epochs observed in this band
    tx = (random_object_lc['band'] == filt)
    plt.plot(random_object_lc['midpointMjdTai'][tx], random_object_lc['psfMag'][tx],
             filter_symbols[filt], ms=5, mew=0, alpha=0.5, color=filter_colors[filt], label=filt)
plt.ylim([20, 14])
plt.legend(loc='lower left', ncol=3)
plt.xlabel('MJD')
plt.ylabel('PSF Magnitude')
plt.title('Nested Forced Photometry Light Curve')
plt.show()
Figure 2: The forced photometry light curve of a random object in the ECDFS field, extracted from a nested column.
Clean up.
del object_cat_selected_columns, object_cat_ecdfs, object_cat_mag_range
del random_object, random_object_fs, random_object_df, random_object_lc
2.2. object_collection_lite¶
The object_collection_lite LSDB catalog is a reduced version of the Object catalog in object_collection. It contains 74 commonly used columns that provide basic object properties, including object identifiers, sky coordinates with uncertainties, basic shape measurements, flags, and PSF- and Kron-based fluxes and magnitudes (with uncertainties) across the six Legacy Survey of Space and Time (LSST) bands ($ugrizy$).
Get the catalog.
object_cat_lite = lsdb.open_catalog(base_path / "object_collection_lite")
The same 42 default columns are loaded lazily for the object_collection_lite catalog as for the object_collection catalog.
object_cat_lite
coord_dec | coord_decErr | coord_ra | coord_raErr | g_psfFlux | g_psfFluxErr | g_psfMag | g_psfMagErr | i_psfFlux | i_psfFluxErr | i_psfMag | i_psfMagErr | objectId | patch | r_psfFlux | r_psfFluxErr | r_psfMag | r_psfMagErr | refBand | refFwhm | shape_flag | shape_xx | shape_xy | shape_yy | tract | u_psfFlux | u_psfFluxErr | u_psfMag | u_psfMagErr | x | xErr | y | y_psfFlux | y_psfFluxErr | y_psfMag | y_psfMagErr | yErr | z_psfFlux | z_psfFluxErr | z_psfMag | z_psfMagErr | objectForcedSource | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
npartitions=389 | ||||||||||||||||||||||||||||||||||||||||||
Order: 6, Pixel: 130 | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | int64[pyarrow] | int64[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | string[pyarrow] | float[pyarrow] | bool[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | int64[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | float[pyarrow] | nested<band: [string], coord_dec: [double], co... |
Order: 8, Pixel: 2176 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Order: 9, Pixel: 2302101 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Order: 7, Pixel: 143884 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Option to display all 74 of the columns.
# object_cat_lite.all_columns
Clean up.
del object_cat_lite
2.3. dia_object_collection¶
This LSDB-formatted file is the same as the DP1 DiaObject table.
dia_object_cat = lsdb.open_catalog(base_path / "dia_object_collection")
2.3.1. Access nested light curves¶
Check which columns are nested.
dia_object_cat.nested_columns
['diaObjectForcedSource', 'diaSource']
The fields in these nested columns are mostly the same as those of the DP1 DiaSource and ForcedSourceOnDiaObject tables.
Schema browser for the DP1 DiaSource table.
Schema browser for the DP1 ForcedSourceOnDiaObject table.
There are four additional fields:
- psfMag and psfMagErr (in both nested columns)
- scienceMag and scienceMagErr (in the diaSource nested column only)
Warning: For both nested columns, the psfMag column has been calculated from the psfFlux column, but psfFlux does not mean the same thing in the two tables: in the DiaSource table psfFlux is the PSF fit flux in the difference image, but in the ForcedSourceOnDiaObject table psfFlux is the PSF forced photometry flux in the direct (or science) image. Fluxes measured on a difference image will be negative when the object is brighter in the template image than in the direct (science) image, and so typically, difference-image fluxes are not converted to magnitudes.
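The reason negative fluxes cannot be converted is that the magnitude formula takes a logarithm of the flux. The sketch below makes this explicit with a small helper that returns NaN for non-positive fluxes; the helper name, the 31.4 mag zero point for nanojansky fluxes, and the flux values are all assumptions for illustration.

```python
import math


def diff_flux_to_mag(flux_njy, zeropoint=31.4):
    """Return an AB magnitude, or NaN when the flux is zero or negative."""
    if flux_njy <= 0:
        return float('nan')  # log10 is undefined here
    return -2.5 * math.log10(flux_njy) + zeropoint


# Hypothetical difference-image fluxes in nJy around a transient's light curve.
diff_fluxes = [-850.0, -40.0, 0.0, 1200.0, 5400.0]
mags = [diff_flux_to_mag(f) for f in diff_fluxes]

n_lost = sum(math.isnan(m) for m in mags)
print(f"{n_lost} of {len(mags)} epochs have no magnitude")
```

Working in flux keeps every epoch, including those where the source is fainter than in the template.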
Option to display the fields in the nested columns.
# dia_object_cat["diaSource"].nest.fields
# dia_object_cat["diaObjectForcedSource"].nest.fields
Extract and plot a light curve for a random object¶
A known supernova (SN) occurred in the ECDFS field.
Select this known SN by its diaObjectId.
known_sn = dia_object_cat.query("diaObjectId == 611255759837069401")
Extract the light curve from the diaSource and diaObjectForcedSource columns.
known_sn_ds = known_sn['diaSource']
known_sn_ds_df = known_sn_ds.compute()
known_sn_ds_lc = known_sn_ds_df.iloc[0]
known_sn_fs = known_sn['diaObjectForcedSource']
known_sn_fs_df = known_sn_fs.compute()
known_sn_fs_lc = known_sn_fs_df.iloc[0]
Compare light curves that use the forced difference-image flux, the detected difference-image flux, and the detected difference-image flux converted to a magnitude.
This will illustrate why it is recommended to use forced photometry fluxes for difference-image light curves.
fig, (ax1, ax2, ax3) = plt.subplots(3, 1, figsize=(6, 8))
for filt in filter_names:
    # boolean masks selecting the epochs observed in this band
    tx1 = (known_sn_fs_lc['band'] == filt)
    tx2 = (known_sn_ds_lc['band'] == filt)
    ax1.plot(known_sn_fs_lc['midpointMjdTai'][tx1]-60000, known_sn_fs_lc['psfDiffFlux'][tx1],
             filter_symbols[filt], ms=5, mew=0, alpha=0.5, color=filter_colors[filt], label=filt)
    ax2.plot(known_sn_ds_lc['midpointMjdTai'][tx2]-60000, known_sn_ds_lc['psfFlux'][tx2],
             filter_symbols[filt], ms=5, mew=0, alpha=0.5, color=filter_colors[filt], label=filt)
    ax3.plot(known_sn_ds_lc['midpointMjdTai'][tx2]-60000, known_sn_ds_lc['psfMag'][tx2],
             filter_symbols[filt], ms=5, mew=0, alpha=0.5, color=filter_colors[filt], label=filt)
    del tx1, tx2
ax1.set_xlim([620, 660])
ax2.set_xlim([620, 660])
ax3.set_xlim([620, 660])
ax1.set_ylim([-10000, 10000])
ax2.set_ylim([-10000, 10000])
ax3.set_ylim([23.7, 21.3])
ax1.set_xlabel('MJD-60000')
ax2.set_xlabel('MJD-60000')
ax3.set_xlabel('MJD-60000')
ax1.set_ylabel('forced PSF Diff Flux')
ax2.set_ylabel('diaSource PSF Flux')
ax3.set_ylabel('diaSource PSF Mag')
ax2.legend(loc='upper right', ncol=2)
plt.tight_layout()
plt.show()
Figure 3: Top: the forced PSF photometry on the difference image has light curve points from every observation. Middle: the diaSource PSF-fit photometry only has light curve points when the SN was detected with a signal-to-noise ratio $>$ 5, positive or negative, in the difference image. Some epochs are missing, when the difference flux was near 0. Bottom: converting difference-image fluxes to magnitudes means the observations where the difference-image flux is negative are lost.
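The three panels can be read as three successive selections on the same set of epochs. The sketch below mimics that with made-up (flux, error) pairs: forced photometry keeps every epoch, detections keep only those with an absolute signal-to-noise ratio above 5, and magnitudes exist only for detections with positive flux. The function name and all numbers are hypothetical.

```python
def is_detected(flux, flux_err, snr_threshold=5.0):
    """True when |flux| / flux_err exceeds the threshold (either sign)."""
    return abs(flux) / flux_err > snr_threshold


# Hypothetical (difference flux, flux error) pairs in nJy, one per epoch.
epochs = [(-900.0, 150.0), (120.0, 150.0), (-300.0, 150.0),
          (2400.0, 160.0), (5100.0, 170.0)]

forced = epochs                                    # forced photometry keeps all
detected = [e for e in epochs if is_detected(*e)]  # detections only
with_mag = [e for e in detected if e[0] > 0]       # magnitude needs flux > 0

print(len(forced), len(detected), len(with_mag))
```

Each step discards epochs, which is why the forced difference-image fluxes give the most complete light curve.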
Clean up.
del dia_object_cat, known_sn
del known_sn_ds, known_sn_ds_df, known_sn_ds_lc
del known_sn_fs, known_sn_fs_df, known_sn_fs_lc
2.4. object_photoz¶
As documented in the SIT-Com tech note "Initial studies of photometric redshifts with LSSTComCam from DP1" (SITCOMTN-154), members of the Rubin Commissioning Science Unit for photometric redshifts have generated photo-z estimates for every galaxy in DP1.
The photo-z columns in the object_photoz table follow a naming pattern of {pz_algorithm_name}_z_{point_estimate_type}, where:
- pz_algorithm_name ∈ ['fzboost', 'knn', 'gpz', 'bpz', 'cmnn', 'dnf', 'tpz', 'lephare']
- point_estimate_type ∈ ['mode', 'mean', 'median', 'err68_high', 'err68_low', 'err95_high', 'err95_low']
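The naming pattern can be expanded into candidate column names, or reversed to split a name into its parts. The sketch below does both with the standard library; the suffixes are spelled with an underscore (for example err95_low) to match the column names shown in the catalog display in this section, and the helper split_pz_column is hypothetical.

```python
from itertools import product

pz_algorithm_names = ['fzboost', 'knn', 'gpz', 'bpz', 'cmnn', 'dnf', 'tpz', 'lephare']
# Suffixes spelled as in the displayed column names, e.g. lephare_z_err95_low.
point_estimate_types = ['mode', 'mean', 'median',
                        'err68_high', 'err68_low', 'err95_high', 'err95_low']

candidate_columns = [f"{alg}_z_{est}"
                     for alg, est in product(pz_algorithm_names, point_estimate_types)]


def split_pz_column(name):
    """Split '<algorithm>_z_<estimate>' into (algorithm, estimate)."""
    algorithm, _, estimate = name.partition('_z_')
    return algorithm, estimate


print(len(candidate_columns))
print(split_pz_column('lephare_z_err95_low'))
```

Not every combination necessarily exists in the catalog (the display in this section shows no knn_z_mean column, for instance), so candidate names are best checked against the catalog's column list before use.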
As mentioned in Section 1, use of the object_photoz table is demonstrated in more detail in the 300-level tutorials.
Load the photo-z table.
pz_cat = lsdb.open_catalog(base_path / "object_photoz")
Display the results.
pz_cat
coord_dec | coord_ra | g_cModelMag | g_cModelMagErr | g_gaap1p0Mag | g_gaap1p0MagErr | g_gaap3p0Mag | g_gaap3p0MagErr | g_kronMag | g_kronMagErr | g_psfMag | g_psfMagErr | g_sersicMag | g_sersicMagErr | i_cModelMag | i_cModelMagErr | i_gaap1p0Mag | i_gaap1p0MagErr | i_gaap3p0Mag | i_gaap3p0MagErr | i_kronMag | i_kronMagErr | i_psfMag | i_psfMagErr | i_sersicMag | i_sersicMagErr | objectId | r_cModelMag | r_cModelMagErr | r_gaap1p0Mag | r_gaap1p0MagErr | r_gaap3p0Mag | r_gaap3p0MagErr | r_kronMag | r_kronMagErr | r_psfMag | r_psfMagErr | r_sersicMag | r_sersicMagErr | u_cModelMag | u_cModelMagErr | u_gaap1p0Mag | u_gaap1p0MagErr | u_gaap3p0Mag | u_gaap3p0MagErr | u_kronMag | u_kronMagErr | u_psfMag | u_psfMagErr | u_sersicMag | u_sersicMagErr | y_cModelMag | y_cModelMagErr | y_gaap1p0Mag | y_gaap1p0MagErr | y_gaap3p0Mag | y_gaap3p0MagErr | y_kronMag | y_kronMagErr | y_psfMag | y_psfMagErr | y_sersicMag | y_sersicMagErr | z_cModelMag | z_cModelMagErr | z_gaap1p0Mag | z_gaap1p0MagErr | z_gaap3p0Mag | z_gaap3p0MagErr | z_kronMag | z_kronMagErr | z_psfMag | z_psfMagErr | z_sersicMag | z_sersicMagErr | lephare_z_median | lephare_z_mean | lephare_z_mode | lephare_z_err95_low | lephare_z_err95_high | lephare_z_err68_low | lephare_z_err68_high | knn_z_median | knn_z_mode | knn_z_err95_low | knn_z_err95_high | knn_z_err68_low | knn_z_err68_high | tpz_z_median | tpz_z_mean | tpz_z_mode | tpz_z_err95_low | tpz_z_err95_high | tpz_z_err68_low | tpz_z_err68_high | cmnn_z_median | cmnn_z_mean | cmnn_z_mode | cmnn_z_err95_low | cmnn_z_err95_high | cmnn_z_err68_low | cmnn_z_err68_high | gpz_z_median | gpz_z_mean | gpz_z_mode | gpz_z_err95_low | gpz_z_err95_high | gpz_z_err68_low | gpz_z_err68_high | bpz_z_median | bpz_z_mean | bpz_z_mode | bpz_z_err95_low | bpz_z_err95_high | bpz_z_err68_low | bpz_z_err68_high | dnf_z_median | dnf_z_mean | dnf_z_mode | dnf_z_err95_low | dnf_z_err95_high | dnf_z_err68_low | dnf_z_err68_high | fzboost_z_median | fzboost_z_mean | fzboost_z_mode | fzboost_z_err95_low | fzboost_z_err95_high | fzboost_z_err68_low | fzboost_z_err68_high | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
npartitions=4 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Order: 3, Pixel: 2 | double[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | int64[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | float[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | 
double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] | double[pyarrow] |
Order: 5, Pixel: 4471 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Order: 2, Pixel: 80 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Order: 0, Pixel: 8 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Clean up.
del pz_cat
3. Visualize the LSDB sky partitions¶
LSDB catalogs are divided into partitions, which reflect how the LSDB-formatted files are stored. Each partition contains approximately the same number of objects, so partitions are not equal-area regions of the sky. The HATS partitioning scheme assigns smaller partitions to dense regions (for example, the Galactic bulge) and larger partitions to sparse regions, ensuring balanced row counts across files.
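The idea of density-adaptive partitioning can be illustrated with a toy example in plain Python. This is a simplified sketch, not the procedure used to build the DP1 HATS catalogs: the function name `adaptive_partitions`, the row threshold of 50, and the input pixel indices are all hypothetical. It relies only on a standard property of NESTED HEALPix indexing, namely that pixel `p` at one order has children `4p` to `4p + 3` at the next order.

```python
from collections import Counter


def adaptive_partitions(deep_pixels, deep_order, threshold):
    """Toy top-down tiling (illustrative only, not the HATS import code).

    deep_pixels: NESTED HEALPix index of each row at `deep_order`.
    A pixel is split into its four children while it holds more than
    `threshold` rows and has not yet reached `deep_order`.
    Returns {(order, pixel): row_count} for the final partitions.
    """
    counts = Counter(deep_pixels)
    partitions = {}
    # Start from the 12 base pixels at order 0.
    stack = [(0, p) for p in range(12)]
    while stack:
        order, pixel = stack.pop()
        # A pixel at `order` covers a contiguous block of
        # 4**(deep_order - order) NESTED pixels at `deep_order`.
        shift = 2 * (deep_order - order)
        n_rows = sum(c for p, c in counts.items() if p >> shift == pixel)
        if n_rows == 0:
            continue  # empty pixels produce no partition
        if n_rows > threshold and order < deep_order:
            stack.extend((order + 1, 4 * pixel + k) for k in range(4))
        else:
            partitions[(order, pixel)] = n_rows
    return partitions


# Hypothetical rows: a dense clump plus a few sparse rows, indexed at order 3.
rows = [130] * 50 + [131] * 40 + [135] * 5 + [700] * 3
parts = adaptive_partitions(rows, deep_order=3, threshold=50)
for (order, pixel), n_rows in sorted(parts.items()):
    print(f"Order: {order}, Pixel: {pixel} -> {n_rows} rows")
```

In this toy input the dense clump ends up in two small order-3 pixels, while the three isolated rows stay in a single large order-0 pixel, which mirrors the mix of orders in a partition listing.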
The plot_pixels method of a catalog object visualizes these partitions.
The result is not a science-driven sky coverage map but a display of the polygonal partition boundaries.
Pixel colors represent pixel sizes, with smaller pixels corresponding to regions of higher source density.
Warning: The following code cell produces a UserWarning about small HEALPix pixels, which is safe to ignore. The seven DP1 fields are small on the sky and their partitions are smaller still, but the locations of all seven fields remain visible.
fig = object_cat.plot_pixels(plot_title="Object Sky Partition Map")
/opt/lsst/software/stack/conda/envs/lsst-scipipe-10.1.0/lib/python3.12/site-packages/hats/inspection/visualize_catalog.py:303: UserWarning: This plot contains HEALPix pixels smaller than a pixel of the plot. Some values may be lost
  warnings.warn(
Figure 4: An all-sky map showing the HEALPix partitions for the LSDB-formatted object_collection catalog.
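To get a feel for why small partitions are hard to see in an all-sky figure, the area of a single HEALPix pixel can be computed from its order, since the sky is divided into 12 × 4^order equal-area pixels. The sketch below is illustrative only: the helper name `healpix_pixel_area_sq_deg` is hypothetical, and the orders 0, 2, 3, and 5 are taken from the pz_cat partition listing above, so the orders used by the Object catalog may differ.

```python
import math

# Area of the full sky in square degrees: 4*pi steradians converted to deg^2.
FULL_SKY_SQ_DEG = 4 * math.pi * (180 / math.pi) ** 2


def healpix_pixel_area_sq_deg(order):
    """Area of one HEALPix pixel at `order` (pixels at one order are equal-area)."""
    n_pixels = 12 * 4 ** order
    return FULL_SKY_SQ_DEG / n_pixels


for order in (0, 2, 3, 5):
    n_pixels = 12 * 4 ** order
    area = healpix_pixel_area_sq_deg(order)
    print(f"order {order}: {n_pixels:6d} pixels, {area:10.3f} sq. deg each")
```

Each step up in order divides the pixel area by four, so an order-5 pixel covers roughly a thousandth of the area of an order-0 pixel.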
Define a field of view (fov) and a center to zoom in on the partitions of the object_collection catalog, and re-create the plot.
Warning: The same UserWarning about small HEALPix pixels will appear.
fov = (100 * u.deg, 120 * u.deg)
center = SkyCoord(70 * u.deg, -30 * u.deg)
fig = object_cat.plot_pixels(fov=fov, center=center,
                             plot_title="Object Sky Partition Map")
/opt/lsst/software/stack/conda/envs/lsst-scipipe-10.1.0/lib/python3.12/site-packages/hats/inspection/visualize_catalog.py:303: UserWarning: This plot contains HEALPix pixels smaller than a pixel of the plot. Some values may be lost
  warnings.warn(
Figure 5: A zoomed-in version of Figure 4.
Clean up.
del object_cat
4. Learn more about LSDB¶
As mentioned in Section 1, this notebook is intended only as a simple tutorial on LSDB DP1 catalogs. For more detailed examples and advanced use cases, see the full set of LSDB tutorials.