Abstract
Wind power forecasting supports wholesale electricity market operation and power system operation planning when a significant installed capacity of wind power plants is integrated. Outlier detection methods are widely used to improve wind power forecasting efficiency. The purpose of this paper is to study approaches to improving wind power forecasting efficiency through preprocessing of the initial data with outlier identification techniques for wind power curves. The paper studies outlier identification techniques based on analysis of the wind power curves of individual wind turbines, turbine groups, and the whole plant. An existing technique based on the interquartile range is considered. A modification of the existing technique based on density-based spatial clustering of applications with noise (DBSCAN) is also proposed, in which the maximum neighborhood distance is determined automatically. A new outlier detection technique is proposed based on the global-local outlier score estimated using hierarchical density-based spatial clustering. The proposed technique, combined with a wind turbine group forecasting strategy, reduced the mean square error of the wind power forecast by 15.9% compared to wind-farm-level forecasting without outlier detection. The proposed techniques were tested on a set of real data from a wind power plant with an installed capacity of 98.8 MW. The developed methods can be used both by wind power plant owners, to reduce potential losses when actual generation deviates from planned generation, and by the System Operator, for more accurate short-term power system planning.
Keywords
wind power forecasting, outlier detection, Global-Local Outlier Score from Hierarchies (GLOSH), Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN*), machine learning, wind power plant, renewable energy sources
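The interquartile range approach named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the common variant in which the power curve is split into wind speed bins and Tukey's fences are applied to the power values in each bin. The function name `iqr_outlier_mask`, the bin width of 0.5 m/s, and the fence multiplier of 1.5 are illustrative assumptions, not values taken from the paper.

```python
import numpy as np


def iqr_outlier_mask(wind_speed, power, bin_width=0.5, k=1.5):
    """Flag power curve points outside Tukey's fences within each wind speed bin.

    Hypothetical helper: bin_width (m/s) and k are assumed defaults,
    not parameters reported in the paper.
    Returns a boolean array that is True for points flagged as outliers.
    """
    wind_speed = np.asarray(wind_speed, dtype=float)
    power = np.asarray(power, dtype=float)

    # Assign every observation to a wind speed bin of fixed width.
    bins = np.floor(wind_speed / bin_width).astype(int)

    mask = np.zeros(power.shape, dtype=bool)
    for b in np.unique(bins):
        in_bin = bins == b
        # Quartiles of the power values observed in this bin.
        q1, q3 = np.percentile(power[in_bin], [25, 75])
        iqr = q3 - q1
        # A point is an outlier if it lies outside [q1 - k*iqr, q3 + k*iqr].
        mask[in_bin] = (power[in_bin] < q1 - k * iqr) | (power[in_bin] > q3 + k * iqr)
    return mask


if __name__ == "__main__":
    # Synthetic example: six readings near 5 m/s, one with zero output.
    ws = [5.0, 5.1, 5.2, 5.3, 5.4, 5.1]
    p = [100.0, 102.0, 98.0, 101.0, 99.0, 0.0]
    print(iqr_outlier_mask(ws, p))
```

In this synthetic example only the zero-power reading falls outside the fences. Bins with very few points give unreliable quartiles, so a practical version would likely need a minimum sample count per bin.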
Snegirev D.A., Pazderin A.V., Samoylenko V.O., Bartolomey P.I. Improving Efficiency of Short-Term Wind Power Forecasting Using Wind Power Curve Outlier Detection. Elektrotekhnicheskie sistemy i kompleksy [Electrotechnical Systems and Complexes], 2025, no. 2(67), pp. 25-34. (In Russian). https://doi.org/10.18503/2311-8318-2025-2(67)-25-34