Using machine learning and linear regression to forecast the water quality in Al-Sain lake
Abstract
Effective management of water quantity and quality requires accurate assessment of pollution levels in surface water and groundwater. This study assesses the effectiveness of multiple linear regression (MLR) and 19 machine learning (ML) models built on various algorithm families, such as regression, boosting, and decision trees. These models include linear regression (Lr), least angle regression (Lar), Bayesian ridge chain (Br), ridge regression (Ridge), k-nearest neighbors regression (K-nn), extra tree regression (Et), and extreme gradient boosting (XGBoost), among others. The study uses these models to predict the surface water quality of Al-Sain lake in Latakia city.
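As a rough illustration of what a multiple linear regression model does, the sketch below fits WQI as a linear function of two input parameters by ordinary least squares. It is a minimal sketch only: the function names `fit_mlr` and `predict`, the two feature columns, and all numbers are hypothetical and are not taken from the study's data or code.

```python
def fit_mlr(X, y):
    """Fit y ~ b0 + b1*x1 + ... + bp*xp by ordinary least squares."""
    n = len(X)
    A = [[1.0] + list(row) for row in X]  # prepend the intercept column
    p = len(A[0])
    # Augmented normal equations [A^T A | A^T y]
    M = [
        [sum(A[k][i] * A[k][j] for k in range(n)) for j in range(p)]
        + [sum(A[k][i] * y[k] for k in range(n))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix (collinear features)")
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        for r in range(p):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [M[i][p] for i in range(p)]  # [intercept, b1, ..., bp]


def predict(coefs, X):
    """Apply the fitted coefficients to new rows of features."""
    return [coefs[0] + sum(c * x for c, x in zip(coefs[1:], row)) for row in X]


# Hypothetical samples: two made-up water quality parameters per row.
X = [[7.1, 2.0], [7.4, 3.5], [6.9, 1.0], [8.0, 4.0], [7.6, 2.5]]
# Synthetic target built from a known linear rule, so the fit can be checked.
y = [10.0 + 2.0 * a + 3.0 * b for a, b in X]

coefs = fit_mlr(X, y)
print([round(c, 6) for c in coefs])
```

Because the synthetic target is exactly linear in the inputs, the fit recovers the generating coefficients. Real measurements contain noise, and a library implementation would normally be preferred over hand-written normal equations.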
To define the water quality index (WQI), data from the lake's drinking water intake for the years 2021-2022 were analyzed. The effectiveness of the MLR and ML models was assessed using the coefficient of determination (R2) and the root mean square error (RMSE).
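The two accuracy measures named above can be sketched in a few lines using their standard definitions. The observed and predicted WQI values below are hypothetical placeholders, not results from the study, and the function names are chosen here for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared residual."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted WQI values (illustration only).
observed = [62.0, 70.5, 58.3, 75.1, 66.4]
predicted = [61.8, 70.9, 58.0, 75.3, 66.1]

print("RMSE:", round(rmse(observed, predicted), 4))
print("R2:  ", round(r2_score(observed, predicted), 4))
```

An RMSE near 0 and an R2 near 1 indicate predictions that closely track the observed values. RMSE is expressed in the same units as the WQI, while R2 is unitless.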
The results indicated that the MLR model and three of the ML models, namely linear regression (Lr), least angle regression (Lar), and Bayesian ridge chain (Br), predicted the WQI with high accuracy: R2 = 0.99 and RMSE = 0.15 for the MLR model, and R2 = 1.0 and RMSE ≈ 0.0 for the three ML models. These results support the use of multiple linear regression and machine learning models for predicting the WQI with very high accuracy, which will contribute to improving water quality management.
License
Copyright (c) 2023 https://creativecommons.org/licenses/by-nc-sa/4.0/

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
The authors retain the copyright and grant the journal the right of first publication, with the commercial right transferred to Tishreen University Journal for Research and Scientific Studies - Engineering Sciences Series, under a CC BY-NC-SA 4.0 license that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal. Authors can use a copy of their articles in their scientific activity and on their scientific websites, provided that the place of publication is indicated as Tishreen University Journal for Research and Scientific Studies - Engineering Sciences Series. Readers have the right to send, print, and subscribe to the initial version of the article, and the title of Tishreen University Journal for Research and Scientific Studies - Engineering Sciences Series Publisher
The journal uses a CC BY-NC-SA license, which means:
You are free to:
- Share — copy and redistribute the material in any medium or format
- Adapt — remix, transform, and build upon the material
The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms:
- Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- NonCommercial — You may not use the material for commercial purposes.
- ShareAlike — If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
- No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.