With rapid advances in fluorescent dye research, accurate
prediction of optical properties and efficient retrieval of dye-related
data are essential for effective dye design. However, tools for
comprehensive data integration and convenient data retrieval are lacking.
Moreover, existing prediction models mainly focus on a single property
of fluorescent dyes and do not systematically account for the diversity of
fluorophores and solutions. To address these gaps, we propose
Fluor-predictor, a multitask prediction model for fluorophores. This
study integrates multiple dye databases and develops an interpretable
graph neural network-based multitask regression model to predict four
key optical properties of fluorescent dyes. We thoroughly examined
the impact of factors such as data quality and the number of solvents
on model performance. By leveraging atomic weight contributions, the
model not only predicts these properties but also provides insights
to guide structural modifications. In addition, we compiled and built
a comprehensive database containing 36,756 records of fluorescence
properties. To address the limited accuracy of existing models on
xanthene and cyanine dyes, we further compiled 1,148 xanthene
dye records and 1,496 cyanine dye records from the literature and compared
direct training with transfer learning approaches. The model achieved
mean absolute errors (MAEs) of 11.70 nm, 15.37 nm, 0.096, and 0.091
for predicting absorption wavelength (λ_abs), emission wavelength (λ_em),
quantum yield (Φ), and the logarithm of the molar extinction coefficient
(log ε), respectively. We integrated
this work into a tool, Fluor-predictor, which supports comprehensive
retrieval methods and multiproperty prediction. Fluor-predictor will
facilitate data retrieval, prescreening, and structural modification
of dyes.
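
As an illustration of how per-property MAEs like those reported above can be computed for a multitask model, the following is a minimal sketch in plain Python. It assumes that some records lack a measured value for some properties, so each property is averaged only over the records where it was measured. The property keys, the function name `per_task_mae`, and the numerical values are hypothetical and are not taken from Fluor-predictor or its database.

```python
import math

# Hypothetical keys for the four optical properties (not the tool's real schema).
PROPERTIES = ("abs_nm", "em_nm", "quantum_yield", "log_epsilon")


def per_task_mae(y_true, y_pred):
    """Return {property: MAE}, skipping records whose measured value is missing."""
    totals = {p: 0.0 for p in PROPERTIES}
    counts = {p: 0 for p in PROPERTIES}
    for truth, pred in zip(y_true, y_pred):
        for p in PROPERTIES:
            t = truth.get(p)
            # A missing label is encoded as None or NaN and is not scored.
            if t is None or (isinstance(t, float) and math.isnan(t)):
                continue
            totals[p] += abs(t - pred[p])
            counts[p] += 1
    # A property with no measured records has an undefined MAE (NaN).
    return {
        p: (totals[p] / counts[p] if counts[p] else float("nan"))
        for p in PROPERTIES
    }


# Made-up measurements and predictions for two dye/solvent records.
measured = [
    {"abs_nm": 500.0, "em_nm": 520.0, "quantum_yield": 0.80, "log_epsilon": None},
    {"abs_nm": 650.0, "em_nm": None, "quantum_yield": 0.25, "log_epsilon": 5.0},
]
predicted = [
    {"abs_nm": 510.0, "em_nm": 535.0, "quantum_yield": 0.75, "log_epsilon": 4.5},
    {"abs_nm": 645.0, "em_nm": 660.0, "quantum_yield": 0.50, "log_epsilon": 4.75},
]

result = per_task_mae(measured, predicted)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Reporting one MAE per property, rather than a single pooled error, keeps quantities on different scales (nanometres versus dimensionless values) from being mixed in one number.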