FITS to Zarr (Low-Level)
The ovro_lwa_portal.fits_to_zarr_xradio module provides the low-level
functions for converting OVRO-LWA FITS image files to Zarr format using xradio.
Prefer the high-level API for most use cases
The FITSToZarrConverter
class in the ingest module wraps these functions with FileLock-based concurrency
protection, progress callbacks, and a simpler interface. Use the low-level
functions here only when you need fine-grained control over the conversion
process.
Quick Reference
from pathlib import Path

from ovro_lwa_portal.fits_to_zarr_xradio import (
    convert_fits_dir_to_zarr,
    fix_fits_headers,
)

# Fix headers first (optional; convert_fits_dir_to_zarr can do this on demand)
fits_files = sorted(Path("/data/fits").glob("*.fits"))
fixed = fix_fits_headers(fits_files, Path("/data/fixed_fits"))

# Convert to Zarr
result = convert_fits_dir_to_zarr(
    input_dir="/data/fits",
    out_dir="/data/output",
    fixed_dir="/data/fixed_fits",
    fix_headers_on_demand=False,  # already fixed above
)
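The call above also accepts a progress_callback, documented in the API reference as a callable taking (stage, current, total, message). The following is a minimal sketch of such a callback; the helper name make_progress_logger, the log format, and the stage name used in the demonstration are illustrative assumptions, not part of the module.

```python
from typing import Callable, List

ProgressCallback = Callable[[str, int, int, str], None]


def make_progress_logger(lines: List[str]) -> ProgressCallback:
    """Return a callback that records one formatted line per progress event.

    Hypothetical helper: only the (stage, current, total, message) argument
    order is taken from the documented signature.
    """

    def on_progress(stage: str, current: int, total: int, message: str) -> None:
        # Guard against total == 0 so the percentage never divides by zero.
        percent = 100.0 * current / total if total else 0.0
        lines.append(f"[{stage}] {current}/{total} ({percent:.0f}%) {message}")

    return on_progress


log: List[str] = []
callback = make_progress_logger(log)
# Simulate one event; "convert" is a made-up stage name for this demo.
callback("convert", 1, 4, "first time step")
print(log[0])
```

To use it with a real conversion, pass progress_callback=callback to convert_fits_dir_to_zarr.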
API Reference
convert_fits_dir_to_zarr
convert_fits_dir_to_zarr(input_dir, out_dir, zarr_name='ovro_lwa_full_lm_only.zarr', fixed_dir='fixed_fits', chunk_lm=1024, rebuild=False, resume=False, fix_headers_on_demand=True, cleanup_fixed_fits=False, progress_callback=None, duplicate_resolver=None, discovery_freq_bin_hz=_DISCOVERY_FREQ_BIN_HZ, time_keys_only=None, lm_reference_ds=None, lm_reference_target_size=None, group_metadata_source='fits')
Convert all matching FITS in a directory into a single LM-only Zarr store.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_dir | str \| Path | Directory containing input FITS files. | required |
| out_dir | str \| Path | Directory where the Zarr store will be written. | required |
| zarr_name | str | Name of the Zarr store directory (under | 'ovro_lwa_full_lm_only.zarr' |
| fixed_dir | str \| Path | Directory to place generated | 'fixed_fits' |
| chunk_lm | int | Optional LM chunk size for the in-memory xarray datasets (0 disables). | 1024 |
| rebuild | bool | If True, overwrite any existing Zarr; otherwise append to it. | False |
| fix_headers_on_demand | bool | If True, fix FITS headers on-demand during conversion if they don't exist. If False, assume headers are already fixed using :func: | True |
| cleanup_fixed_fits | bool | If True (and | False |
| progress_callback | Optional[Callable[[str, int, int, str], None]] | Optional callback function for progress reporting. Should accept (stage: str, current: int, total: int, message: str). | None |
| duplicate_resolver | Optional[Callable[[str, float, List[Path]], Path]] | Optional callback to resolve duplicate files that map to the same time/frequency group. Signature: | None |
| discovery_freq_bin_hz | float | Bin width in Hz for treating header frequencies as the same subband during discovery (default 23 kHz). Must be positive. | _DISCOVERY_FREQ_BIN_HZ |
| time_keys_only | Optional[Sequence[str]] | If set, only time keys in this collection are converted (after discovery). Used for incremental pipelines (e.g. dewarp one time step, then append Zarr). | None |
| lm_reference_ds | Optional[Dataset] | If provided, skip the global LM reference scan and use this dataset instead. Must match the grid chosen for the full run (typically built once from the same input layout before dewarping). Callers should pass a deep-copied dataset if the same object might be mutated elsewhere. | None |
| lm_reference_target_size | int \| None | When building the global LM reference ( | None |
| group_metadata_source | Literal['fits', 'filename'] | | 'fits' |
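The duplicate_resolver parameter is typed as a callable taking (str, float, List[Path]) and returning a Path. A resolver along those lines might look like the sketch below. The argument names (a time key, a frequency, and the competing files) and the "newest file wins" policy are assumptions made for illustration; only the types come from the table above.

```python
import os
import tempfile
from pathlib import Path
from typing import List


def newest_file_resolver(time_key: str, freq_hz: float, candidates: List[Path]) -> Path:
    """Pick the most recently modified candidate.

    Assumed argument meaning: a time key, a frequency, and the competing
    files. The selection policy is illustrative, not the module's default.
    """
    if not candidates:
        raise ValueError(f"no candidates for {time_key} at {freq_hz} Hz")
    # Sort key: modification time first, then file name as a stable tie-break.
    return max(candidates, key=lambda p: (p.stat().st_mtime, p.name))


# Self-contained demonstration with two throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    older = Path(tmp) / "a.fits"
    newer = Path(tmp) / "b.fits"
    for path, mtime in ((older, 1_000_000_000), (newer, 1_000_000_100)):
        path.write_bytes(b"")
        os.utime(path, (mtime, mtime))  # set access and modification times
    chosen_name = newest_file_resolver("t0", 41e6, [older, newer]).name

print(chosen_name)
```

Such a function would be passed as duplicate_resolver=newest_file_resolver. A resolver that raises instead of choosing is another reasonable policy when duplicates indicate an upstream problem.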
Returns:
| Type | Description |
|---|---|
| Path | Path to the resulting Zarr store directory. |
Raises:
| Type | Description |
|---|---|
| FileNotFoundError | If no matching FITS files are found. |
| RuntimeError | If LM grids differ across time steps. |
Source code in src/ovro_lwa_portal/fits_to_zarr_xradio.py
fix_fits_headers
fix_fits_headers(files, fixed_dir, *, skip_existing=True, group_metadata_source='fits')
Fix FITS headers for a list of files, creating *_fixed.fits files.
This function processes FITS files to ensure they have the necessary
headers for xradio conversion. It can be run ahead of time before
calling convert_fits_dir_to_zarr to separate the header
fixing step from the conversion process.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| files | List[Path] | List of FITS file paths to process. | required |
| fixed_dir | Path | Directory where | required |
| skip_existing | bool | If True, skip files that already have corresponding fixed versions. Default is True. | True |
| group_metadata_source | Literal['fits', 'filename'] | Frequency sort order for processing files (see :func: | 'fits' |
Returns:
| Type | Description |
|---|---|
| List[Path] | List of paths to the fixed FITS files. |
Notes
- Files already ending with _fixed.fits are considered already fixed and are returned as-is.
- The _fix_headers function applies BSCALE/BZERO and adds minimal WCS/spectral keywords required by xradio.
- Files whose primary header lacks a real BMAJ/BMIN (missing or non-positive) raise InvalidBeamError inside _fix_headers; they are logged at WARNING level, omitted from the returned list, and any partially written *_fixed.fits is removed so downstream consumers see only files with a real synthesized beam.
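The beam rule in the last note (BMAJ and BMIN must be present and positive) can be checked ahead of time to predict which files will be skipped. The sketch below works on any dict-like header; has_valid_beam is a hypothetical helper, and the actual check inside _fix_headers may differ in detail (for example, in how it treats non-finite values).

```python
import math
from typing import Any, Mapping


def has_valid_beam(header: Mapping[str, Any]) -> bool:
    """Return True if both BMAJ and BMIN are present, finite, and positive.

    Mirrors the rule stated in the notes (missing or non-positive values are
    invalid). Rejecting NaN/inf is an extra assumption of this sketch.
    """
    for key in ("BMAJ", "BMIN"):
        value = header.get(key)
        if value is None:
            return False
        try:
            number = float(value)
        except (TypeError, ValueError):
            # Non-numeric header values are treated as invalid here.
            return False
        if not math.isfinite(number) or number <= 0.0:
            return False
    return True


print(has_valid_beam({"BMAJ": 0.05, "BMIN": 0.04}))
print(has_valid_beam({"BMAJ": 0.0, "BMIN": 0.04}))
```

With astropy installed, the header returned by astropy.io.fits.getheader(path) offers a compatible get method, so it can be passed in place of the dicts used here.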
Examples:
>>> from pathlib import Path
>>> from ovro_lwa_portal.fits_to_zarr_xradio import fix_fits_headers
>>> input_files = list(Path("input").glob("*.fits"))
>>> fixed_dir = Path("fixed_fits")
>>> fixed_dir.mkdir(exist_ok=True)
>>> fixed_paths = fix_fits_headers(input_files, fixed_dir)
>>> print(f"Fixed {len(fixed_paths)} files")
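Because files with an invalid beam are omitted from the returned list, the result can be shorter than the input. A caller may want to know which inputs were dropped. The sketch below compares the two lists by name; it assumes that x.fits maps to x_fixed.fits, which is inferred from the *_fixed.fits pattern and is not confirmed by the reference above. Both helper names are hypothetical.

```python
from pathlib import Path
from typing import Iterable, List


def expected_fixed_name(path: Path) -> str:
    """Assumed naming rule: 'x.fits' -> 'x_fixed.fits'; fixed files keep their name."""
    if path.name.endswith("_fixed.fits"):
        return path.name
    return f"{path.stem}_fixed.fits"


def find_omitted(inputs: Iterable[Path], fixed_paths: Iterable[Path]) -> List[Path]:
    """Return the inputs that have no counterpart among the fixed paths."""
    fixed_names = {p.name for p in fixed_paths}
    return [p for p in inputs if expected_fixed_name(p) not in fixed_names]


# Stand-in data; in real use `fixed` would be the list returned by fix_fits_headers.
inputs = [Path("input/a.fits"), Path("input/b.fits"), Path("input/c_fixed.fits")]
fixed = [Path("fixed_fits/a_fixed.fits"), Path("input/c_fixed.fits")]
print([p.name for p in find_omitted(inputs, fixed)])
```

The WARNING-level log entries remain the authoritative record of why a file was skipped; this comparison only identifies which ones are missing.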
Source code in src/ovro_lwa_portal/fits_to_zarr_xradio.py