The Gaia Focal Plane
106 CCDs, 938 million pixels, 2800 cm².
Figure: focal plane layout, 104.26 cm by 42.35 cm. Labels: Sky Mapper CCDs, Astrometric Field CCDs, Photometric CCDs, Radial Velocity Spectrometric CCDs, image motion.
Readout in TDI (Time Delayed Integration) mode, synchronized with the rotation speed.
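
To make the TDI idea concrete, here is a toy sketch of a single along-scan CCD column. It is not Gaia's readout code: the function name `tdi_scan`, the parameter `clock_every`, and the one-pixel-per-step image motion are all assumptions chosen for illustration. The point it shows is that when the charge is clocked at the same rate as the image moves, all the light from a source piles up in one charge packet; when the clock is out of step, the signal is smeared over several samples.

```python
def tdi_scan(n_stages, flux, clock_every=1):
    """Toy model of one CCD column read out in TDI mode (illustrative only).

    The image of a point source moves one pixel per time step along the
    column. The charge is shifted one pixel toward the readout register
    every `clock_every` steps; clock_every=1 means the clocking is
    synchronized with the image motion.
    """
    wells = [0.0] * n_stages      # charge currently held in each pixel
    samples = []                  # values that leave the readout register

    # Run long enough for the source to cross and all charge to drain out.
    for t in range(n_stages * (1 + clock_every)):
        # Exposure: the image sits over pixel t while it crosses the column.
        if t < n_stages:
            wells[t] += flux
        # Clocking: shift every packet one pixel; the last pixel is read out.
        if (t + 1) % clock_every == 0:
            samples.append(wells.pop())
            wells.insert(0, 0.0)
    return samples


synchronized = tdi_scan(n_stages=4, flux=1.0)
too_slow = tdi_scan(n_stages=4, flux=1.0, clock_every=2)
print("synchronized:", synchronized)
print("clock too slow:", too_slow)
```

In the synchronized case the whole signal arrives in a single sample, which is why the readout rate has to follow the rotation speed.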

The Gaia Data
Astrometric data: 1D and 2D samples (windows), depending on the brightness of the object.
Expected number of samples: 1 billion stars x 80 observations x 10 readouts ≈ 1 x 10^12 samples. (At 1 millisecond of processing time per sample, that is more than 30 years in total.)
In addition: photometric data and radial velocity spectrometric data.
Expected maximum downlink: 42 GByte per day.
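
The sample-count estimate can be checked with a few lines of arithmetic. The sketch below uses only the figures from the slide; the constant and function names are made up for illustration, and a year is taken as 365.25 days. Note that the unrounded product is 8 x 10^11, which gives roughly 25 years of serial processing, while the rounded figure of 10^12 samples gives the "more than 30 years" quoted above.

```python
STARS = 1_000_000_000             # 1 billion stars
OBSERVATIONS_PER_STAR = 80
READOUTS_PER_OBSERVATION = 10
SECONDS_PER_SAMPLE = 1e-3         # 1 millisecond processing time per sample
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def serial_years(n_samples, seconds_per_sample=SECONDS_PER_SAMPLE):
    """Years needed to process n_samples one after another on one core."""
    return n_samples * seconds_per_sample / SECONDS_PER_YEAR


exact_samples = STARS * OBSERVATIONS_PER_STAR * READOUTS_PER_OBSERVATION
rounded_samples = 1e12            # the order-of-magnitude figure on the slide

print(f"samples (exact product): {exact_samples:.1e}")
print(f"serial time, exact:   {serial_years(exact_samples):.1f} years")
print(f"serial time, rounded: {serial_years(rounded_samples):.1f} years")
```

Either way the conclusion is the same: processing sample by sample on one processor is not an option, which motivates the distributed processing described later.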

Gaia Data Processing and Analysis Consortium (DPAC)
Many European institutes and observatories are involved.
More than 300 people.
Organized in 9 Coordination Units (CUs).
6 Data Processing Centres (DPCs): Madrid (ESAC), Barcelona, Toulouse, Cambridge, Geneva, Torino.

Core Data Processing
ESAC, Madrid:
  Initial Data Treatment (centroiding, crossmatching).
  First Look (determine quality of the data).
  Astrometric Global Iterative Solution (determine attitude and calibration parameters).
Intermediate Data Update (UB, Barcelona).
Astrometric Verification (OATA, Torino).

Non-core Data Processing
Object Processing (CNES, Toulouse).
Photometry (IoA, Cambridge).
Spectrometry (CNES, Toulouse).
Variability (ISDC, Geneva).
Astrophysical Parameters (CNES, Toulouse).
Simulation (UB, Barcelona).

Large Scale Distributed Processing
Independent Data Processing Centres (DPCs), to reduce complexity.
Database oriented. Main Database at ESAC.
Iterative processing in 6-month Data Reduction Cycles.
Data transferred to and from the DPCs using the Gaia Transfer System (protocol not yet known, might be GridFTP).

Data Flow

Small Scale Distributed Processing
A lot of data: efficient I/O is necessary.
Distributed processing on a dedicated cluster.
Task scheduling is separated from the scientific algorithms.
Simple but efficient scheduling mechanism: a whiteboard with jobs, and data trains on the worker nodes.
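
The whiteboard mechanism can be sketched in a few lines. This is a minimal illustration, not the ESAC implementation: the names `Whiteboard`, `run_data_train` and `process_all` are hypothetical, threads stand in for worker nodes, and a "data train" is read here as a worker loop that takes a job from the whiteboard and drives its data through an algorithm. What it does show is the separation named on the slide: the scheduling code knows nothing about the science, and the algorithm knows nothing about scheduling.

```python
import queue
import threading


class Whiteboard:
    """Hypothetical in-memory job board: workers pull jobs, nobody pushes."""

    _EMPTY = object()  # sentinel returned when no jobs are left

    def __init__(self, jobs):
        self._jobs = queue.Queue()
        for job in jobs:
            self._jobs.put(job)

    def take(self):
        """Return the next open job, or Whiteboard._EMPTY if none remain."""
        try:
            return self._jobs.get_nowait()
        except queue.Empty:
            return Whiteboard._EMPTY


def run_data_train(whiteboard, algorithm, results, lock):
    """Worker loop: keep taking jobs until the whiteboard is empty."""
    while True:
        job = whiteboard.take()
        if job is Whiteboard._EMPTY:
            return
        output = algorithm(job)       # the science code, unaware of scheduling
        with lock:
            results[job] = output


def process_all(jobs, algorithm, n_workers=4):
    """Post all jobs on a whiteboard and let n_workers drain it."""
    whiteboard = Whiteboard(jobs)
    results, lock = {}, threading.Lock()
    workers = [
        threading.Thread(target=run_data_train,
                         args=(whiteboard, algorithm, results, lock))
        for _ in range(n_workers)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results


# Usage: job identifiers are plain integers, the "algorithm" is a stand-in.
squares = process_all(range(10), lambda job_id: job_id * job_id)
print(squares[7])
```

Because workers pull jobs themselves, a fast node simply takes more jobs than a slow one, and no central component has to balance the load.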

Optimized Processing at ESAC
Database optimized for the required processing:
  use of index-organized tables;
  data stored in groups as serialized objects;
  data that will not be used is removed.
Auxiliary data kept in memory if possible (e.g. calibration, attitude, ephemeris data).
Many threads, to use nodes, CPUs and cores as efficiently as possible.
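
Two of these optimizations are easy to illustrate in isolation. The sketch below is an assumption-laden toy, not the ESAC code: the field names, `pack_group`, `unpack_group` and `calibration_for` are all hypothetical, a Python dict keyed by group id stands in for an index-organized table, and `pickle` stands in for whatever object serialization is really used. It shows (a) a group of records stripped of unused fields and stored as one serialized blob, so one key lookup fetches the whole group, and (b) auxiliary data loaded once and then served from memory.

```python
import pickle
from functools import lru_cache

KEEP_FIELDS = ("source_id", "time", "flux")   # hypothetical fields still needed


def pack_group(records):
    """Drop unused fields and serialize a whole group of records as one blob."""
    slim = [{name: rec[name] for name in KEEP_FIELDS} for rec in records]
    return pickle.dumps(slim, protocol=pickle.HIGHEST_PROTOCOL)


def unpack_group(blob):
    """Restore the list of records from a blob written by pack_group."""
    return pickle.loads(blob)


LOADS = {"calibration": 0}    # counts how often the slow load really happens


@lru_cache(maxsize=None)
def calibration_for(ccd_id):
    """Stand-in for an expensive load; the result stays cached in memory."""
    LOADS["calibration"] += 1
    return {"ccd_id": ccd_id, "gain": 1.0}    # placeholder values


# A dict keyed by group id plays the role of the table: one lookup, one blob.
raw = [
    {"source_id": 1, "time": 0.1, "flux": 10.0, "debug": "not needed"},
    {"source_id": 1, "time": 0.2, "flux": 11.0, "debug": "not needed"},
]
table = {("source", 1): pack_group(raw)}
group = unpack_group(table[("source", 1)])

for _ in group:
    calibration_for(3)        # loaded on the first call, cached afterwards

print(len(group), "records,", LOADS["calibration"], "calibration load")
```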

Concluding remarks
Gaia Data Processing is not using the Grid (yet).
Grid use is not foreseen for the daily core processing at ESAC.
CNES has a study running to see whether Grid computing might be useful for other DPCs.

