They then adjusted the hyperparameters to obtain a
model that did not overfit to predict the yield of six
local crops (Saadio et al, 2022). Junxiu collected satellite
meteorological data from 98 districts and counties in
Jilin and Shanxi provinces. Based on these data, he
used random forest and support vector machine
regression algorithms, as well as deep learning
neural network models, to predict local corn yields.
Each model's predictive performance was then assessed
using the mean square error, mean
absolute percentage error, and root mean square error.
The study found the neural network model to be
the most effective and demonstrated the
viability of machine learning techniques in the
agricultural sector (Junxiu, 2021). There is not much
research on machine learning classification
algorithms in agriculture, although such algorithms can
use satellite or aerial images to
classify different crops in farmland, which helps to
monitor the health of the field and to manage
irrigation and fertilization promptly. In this study,
a machine learning classification model is
designed to judge which crops
can produce high yields in a given soil environment.
Such a model can
improve the efficiency of land use and increase
food production, which in turn can help
alleviate food shortages.
This study uses three machine learning
methods: multiple logistic regression,
support vector machine classification, and random
forest. First, a model is trained for each
algorithm on the data. Second, the
optimal model of each algorithm is obtained by
selecting hyperparameters through cross-validation. Finally,
the overall accuracy, precision, recall, and other
indicators of the best model of each algorithm are
compared to identify the best model in this study. This
paper aims to investigate machine learning-based
intelligent planting optimization for a given soil environment.
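The final comparison step rests on overall accuracy, precision, and recall. The following is a minimal sketch of how these indicators could be computed for a multi-class problem. It assumes macro-averaging over classes, which the text does not specify; the function name `classification_report` and the toy labels (including the crops "maize" and "jute") are hypothetical and are not taken from the data set.

```python
from collections import Counter


def classification_report(y_true, y_pred):
    """Return overall accuracy plus macro-averaged precision and recall."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    labels = sorted(set(y_true) | set(y_pred))
    true_count = Counter(y_true)   # samples that really belong to each class
    pred_count = Counter(y_pred)   # samples predicted as each class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(true_pos.values()) / len(y_true)
    # A class that is never predicted (or never occurs) contributes 0.0.
    precisions = [true_pos[c] / pred_count[c] if pred_count[c] else 0.0
                  for c in labels]
    recalls = [true_pos[c] / true_count[c] if true_count[c] else 0.0
               for c in labels]

    return {
        "accuracy": accuracy,
        "macro_precision": sum(precisions) / len(labels),
        "macro_recall": sum(recalls) / len(labels),
    }


# Hypothetical predictions for six samples, for illustration only.
y_true = ["rice", "rice", "rice", "maize", "maize", "jute"]
y_pred = ["rice", "rice", "maize", "maize", "maize", "rice"]
report = classification_report(y_true, y_pred)
```

Computing the same report for the best model of each algorithm on a shared held-out set would give one directly comparable row per algorithm.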
2 METHOD
2.1 Data Set Description
Table 1 shows some of the features and labels in
the data set. The features are soil nitrogen
content, phosphorus content, potassium content,
temperature, humidity, soil pH, and rainfall; the
label is the high-yield crop category. To simplify
subsequent analysis, all values in the table are
rounded to three decimal places. The data come from
Kaggle, an online platform for data
scientists, machine learning engineers, and other
professionals in data-related fields.
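As an illustration of the data layout described above, the sketch below splits each record into a feature vector and a crop label and rounds the numeric values to three decimal places. It assumes a comma-separated file whose column names match the header of Table 1; the two sample rows are copied from Table 1, while `load_rows` and `FEATURE_COLUMNS` are hypothetical names.

```python
import csv
import io

# Two rows copied from Table 1; a real run would read the downloaded file.
# The comma-separated layout is an assumption.
SAMPLE_CSV = """Nitrogen,Phosphorus,Potassium,temperature,humidity,ph,rainfall,label
90,42,43,20.879,82.002,6.502,202.935,rice
85,58,41,21.770,80.319,7.038,226.655,rice
"""

FEATURE_COLUMNS = ["Nitrogen", "Phosphorus", "Potassium",
                   "temperature", "humidity", "ph", "rainfall"]


def load_rows(text, decimals=3):
    """Split each record into a rounded feature vector and its crop label."""
    features, labels = [], []
    for row in csv.DictReader(io.StringIO(text)):
        features.append([round(float(row[name]), decimals)
                         for name in FEATURE_COLUMNS])
        labels.append(row["label"])
    return features, labels


X, y = load_rows(SAMPLE_CSV)
```

Here `X` holds one list of seven numbers per sample and `y` the matching crop names, which is the shape most classification libraries expect.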
Nitrogen: One of the basic elements of plants and a
major component of chlorophyll, the
compound that enables plants to photosynthesize
(Yanxiao et al, 2022).
Phosphorus: Phosphorus is essential for both the
growth of new tissues and cell division. In plants,
phosphorus is also involved in the intricate process of
energy conversion. When phosphorus is added to
soils with low levels of accessible phosphorus, it can
encourage tillering, strengthen roots against cold
stress, and frequently speed up ripening (Miaomiao et
al, 2023).
Potassium: A nutrient obtained by plants from soil
and fertilizers. Potassium can improve disease
resistance, stem strength, drought tolerance, and
winter survival of plants (Hashim & Mohammed,
2023).
Soil temperature: The soil temperature range
favorable to biological activity is between 50 and 75 degrees
Fahrenheit (about 10 to 24 degrees Celsius).
These values are conducive to the normal life
functions of the Earth's biota, such as decomposing
organic matter, mineralizing nitrogen, absorbing
soluble substances, and metabolism (Huiyong et al,
2008).
pH: The pH value indicates the acidity or
alkalinity of the soil. The pH range of 5.5 to 6.5 is
ideal for plant growth, as this range guarantees the
availability of nutrients (Chenxiao, 2022).
Rainfall: Rainfall affects how fast crops grow
from seed and when they can be harvested.
Balanced rainfall and proper irrigation can accelerate
plant growth, which can reduce germination times
and the time between planting and harvesting
(Podzikowski et al, 2023).
Table 1: Partial data on factors affecting crop growth and corresponding high-yielding plants.
   Nitrogen  Phosphorus  Potassium  temperature  humidity  ph     rainfall  label
0  90        42          43         20.879       82.002    6.502  202.935   rice
1  85        58          41         21.770       80.319    7.038  226.655   rice
2  60        55          44         23.004       82.320    7.840  263.964   rice
3  74        35          40         26.491       80.158    6.980  242.864   rice
4  78        42          42         20.130       81.604    7.628  262.717   rice